Paper Group ANR 181
Prudence When Assuming Normality: an advice for machine learning practitioners. Shoestring: Graph-Based Semi-Supervised Learning with Severely Limited Labeled Data. Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings. Bayesian Optimized 1-Bit CNNs. STEFANN: Scene Text Editor using Font Adaptive Neural Network …
Prudence When Assuming Normality: an advice for machine learning practitioners
Title | Prudence When Assuming Normality: an advice for machine learning practitioners |
Authors | Waleed A. Yousef |
Abstract | In a binary classification problem the feature vector (predictor) is the input to a scoring function that produces a decision value (score), which is compared to a particular chosen threshold to provide a final class prediction (output). Although the normality assumption on the scoring function is important in many applications, it is sometimes severely violated even under the simple multinormal assumption on the feature vector. This article proves this result mathematically with a counterexample, advising practitioners to avoid blind assumptions of normality. On the other hand, the article provides a set of experiments that illustrate some of the expected and well-behaved results of the Area Under the ROC curve (AUC) under the multinormal assumption on the feature vector. Therefore, the message of the article is not to avoid the normal assumption of either the input feature vector or the output scoring function; rather, prudence is needed when adopting either of them. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12852v2 |
https://arxiv.org/pdf/1907.12852v2.pdf | |
PWC | https://paperswithcode.com/paper/prudence-when-assuming-normality-an-advice |
Repo | |
Framework | |
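A quick way to see the paper's warning in practice: even when the feature vector is exactly multinormal, a nonlinear scoring function can produce decidedly non-normal scores, while the AUC can still be estimated empirically. The NumPy/SciPy sketch below is illustrative only; the quadratic score and the class parameters are arbitrary choices, not the paper's counterexample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setup: two classes with exactly multinormal feature vectors.
n = 5000
cov = [[1.0, 0.3], [0.3, 1.0]]
x0 = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
x1 = rng.multivariate_normal(mean=[1.0, 1.0], cov=cov, size=n)

def score(x):
    # A quadratic scoring function: its output need not be normal
    # even though its inputs are.
    return x[:, 0] * x[:, 1] + x[:, 0] ** 2

s0, s1 = score(x0), score(x1)

# Normality check on the scores (D'Agostino-Pearson test).
print("normality p-value, class 0:", stats.normaltest(s0).pvalue)
print("normality p-value, class 1:", stats.normaltest(s1).pvalue)

# Empirical AUC via the Mann-Whitney U statistic: P(score_1 > score_0).
u = stats.mannwhitneyu(s1, s0, alternative="greater").statistic
print("empirical AUC:", u / (len(s0) * len(s1)))
```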
Shoestring: Graph-Based Semi-Supervised Learning with Severely Limited Labeled Data
Title | Shoestring: Graph-Based Semi-Supervised Learning with Severely Limited Labeled Data |
Authors | Wanyu Lin, Zhaolin Gao, Baochun Li |
Abstract | Graph-based semi-supervised learning has been shown to be one of the most effective approaches for classification tasks from a wide range of domains, such as image classification and text classification, as it can exploit the connectivity patterns between labeled and unlabeled samples to improve learning performance. In this work, we advance this effective learning paradigm towards a scenario where labeled data are severely limited. More specifically, we address the problem of graph-based semi-supervised learning in the presence of severely limited labeled samples, and propose a new framework, called {\em Shoestring}, that improves the learning performance through semantic transfer from these very few labeled samples to large numbers of unlabeled samples. In particular, our framework learns a metric space in which classification can be performed by computing the similarity to the centroid embedding of each class. {\em Shoestring} is trained in an end-to-end fashion to learn to leverage the semantic knowledge of the limited labeled samples as well as their connectivity patterns with large numbers of unlabeled samples simultaneously. By combining {\em Shoestring} with graph convolutional networks, label propagation and their recent label-efficient variations (IGCN and GLP), we are able to achieve state-of-the-art node classification performance in the presence of very few labeled samples. In addition, we demonstrate the effectiveness of our framework on image classification tasks in the few-shot learning regime, with significant gains on miniImageNet ($2.57\%\sim3.59\%$) and tieredImageNet ($1.05\%\sim2.70\%$). |
Tasks | Few-Shot Learning, Image Classification, Node Classification, Text Classification |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12976v1 |
https://arxiv.org/pdf/1910.12976v1.pdf | |
PWC | https://paperswithcode.com/paper/shoestring-graph-based-semi-supervised |
Repo | |
Framework | |
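The classification step described in the abstract, similarity to per-class centroid embeddings, is easy to picture in isolation. The sketch below assumes node embeddings already produced by some encoder (GCN/IGCN/GLP or otherwise) and uses cosine similarity; it is not the authors' implementation.

```python
import numpy as np

def centroid_classify(emb, labeled_idx, labels, num_classes):
    """Assign each node to the class whose centroid embedding is most similar.

    emb         : (N, d) node embeddings, e.g. from a graph encoder
    labeled_idx : indices of the few labeled nodes
    labels      : class ids of those labeled nodes
    """
    # L2-normalise so the dot product is cosine similarity.
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)

    # Class centroids from the (severely limited) labeled embeddings.
    centroids = np.stack([
        emb[labeled_idx[labels == c]].mean(axis=0) for c in range(num_classes)
    ])
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-12

    return (emb @ centroids.T).argmax(axis=1)   # predicted class per node

# Toy usage with random embeddings standing in for a trained encoder.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
labeled_idx = np.array([0, 1, 2, 3])
labels = np.array([0, 0, 1, 1])
print(centroid_classify(emb, labeled_idx, labels, num_classes=2)[:10])
```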
Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings
Title | Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings |
Authors | Hwiyeol Jo, Ceyda Cinarel |
Abstract | We propose a novel and simple method for semi-supervised text classification. The method stems from the hypothesis that a classifier with pretrained word embeddings always outperforms the same classifier with randomly initialized word embeddings, as empirically observed in NLP tasks. Our method first builds two sets of classifiers as a form of model ensemble, and then initializes their word embeddings differently: one using random, the other using pretrained word embeddings. We focus on the differing predictions between the two classifiers on unlabeled data while following the self-training framework. We also use early stopping at the meta-epoch level to improve the performance of our method. Our method, Delta-training, outperforms the self-training and co-training frameworks on 4 different text classification datasets, showing robustness against error accumulation. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2019-01-22 |
URL | https://arxiv.org/abs/1901.07651v3 |
https://arxiv.org/pdf/1901.07651v3.pdf | |
PWC | https://paperswithcode.com/paper/delta-training-simple-semi-supervised-text |
Repo | |
Framework | |
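A minimal sketch of the disagreement-driven self-training step the abstract describes: pseudo-label unlabeled examples on which the two classifiers differ, trusting the pretrained-embedding side when it is confident. The `delta_training_round` helper, the confidence threshold, and the stand-in classifiers are assumptions for illustration, not the authors' code.

```python
import numpy as np

def delta_training_round(clf_rand, clf_pre, X_unlab, threshold=0.9):
    """One disagreement-based self-training round (illustrative).

    clf_rand, clf_pre : fitted scikit-learn-style classifiers exposing
                        predict() and predict_proba()
    Returns indices of unlabeled samples to pseudo-label and their labels.
    """
    pred_rand = clf_rand.predict(X_unlab)
    proba_pre = clf_pre.predict_proba(X_unlab)
    pred_pre = proba_pre.argmax(axis=1)
    conf_pre = proba_pre.max(axis=1)

    # Delta set: the two classifiers disagree and the pretrained one is confident.
    delta = (pred_rand != pred_pre) & (conf_pre >= threshold)
    return np.where(delta)[0], pred_pre[delta]

# Toy usage with stand-in classifiers (any objects with the same interface work).
class _Dummy:
    def __init__(self, preds, probas):
        self.preds, self.probas = np.asarray(preds), np.asarray(probas)
    def predict(self, X):
        return self.preds
    def predict_proba(self, X):
        return self.probas

X_unlab = np.zeros((4, 5))
clf_rand = _Dummy([0, 1, 1, 0], [[0.6, 0.4]] * 4)
clf_pre = _Dummy([0, 0, 1, 1],
                 [[0.95, 0.05], [0.97, 0.03], [0.2, 0.8], [0.1, 0.9]])
idx, pseudo = delta_training_round(clf_rand, clf_pre, X_unlab)
print(idx, pseudo)   # samples where they disagree and the pretrained side is confident
```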
Bayesian Optimized 1-Bit CNNs
Title | Bayesian Optimized 1-Bit CNNs |
Authors | Jiaxin Gu, Junhe Zhao, Xiaolong Jiang, Baochang Zhang, Jianzhuang Liu, Guodong Guo, Rongrong Ji |
Abstract | Deep convolutional neural networks (DCNNs) have dominated recent developments in computer vision, producing a series of record-breaking models. However, it remains a great challenge to deploy powerful DCNNs in resource-limited environments such as embedded devices and smartphones. Researchers have realized that 1-bit CNNs can be one feasible solution to this issue; however, their performance remains inferior to that of full-precision DCNNs. In this paper, we propose a novel approach, called Bayesian optimized 1-bit CNNs (denoted as BONNs), that takes advantage of Bayesian learning, a well-established strategy for hard problems, to significantly improve the performance of extreme 1-bit CNNs. We incorporate the prior distributions of full-precision kernels and features into the Bayesian framework to construct 1-bit CNNs in an end-to-end manner, which has not been considered in any previous related methods. The Bayesian losses are derived with theoretical support to optimize the network simultaneously in both continuous and discrete spaces, aggregating different losses jointly to improve the model capacity. Extensive experiments on the ImageNet and CIFAR datasets show that BONNs achieve the best classification performance compared to state-of-the-art 1-bit CNNs. |
Tasks | |
Published | 2019-08-17 |
URL | https://arxiv.org/abs/1908.06314v1 |
https://arxiv.org/pdf/1908.06314v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-optimized-1-bit-cnns |
Repo | |
Framework | |
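BONNs add Bayesian priors and losses on top of a 1-bit CNN; the PyTorch sketch below shows only the generic binarized-convolution building block (sign binarization with a straight-through estimator and a learnable scale), so the Bayesian part of the method is deliberately omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through gradient estimator."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass gradients through only where |w| <= 1 (hard-tanh clip).
        return grad_out * (w.abs() <= 1).to(grad_out.dtype)

class BinaryConv2d(nn.Conv2d):
    """Conv layer whose kernels are binarized on the forward pass; a learnable
    per-layer scale keeps the output magnitude close to full precision."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight) * self.scale
        return F.conv2d(x, w_bin, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

# Toy forward/backward pass.
layer = BinaryConv2d(3, 8, kernel_size=3, padding=1)
out = layer(torch.randn(2, 3, 32, 32))
out.mean().backward()
```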
STEFANN: Scene Text Editor using Font Adaptive Neural Network
Title | STEFANN: Scene Text Editor using Font Adaptive Neural Network |
Authors | Prasun Roy, Saumik Bhattacharya, Subhankar Ghosh, Umapada Pal |
Abstract | Textual information in a captured scene plays an important role in scene interpretation and decision making. Dedicated research efforts are underway to detect and recognize textual data accurately in images. Though there exist methods that can successfully detect complex text regions present in a scene, to the best of our knowledge there is no work that modifies the textual information in an image. This paper deals with a simple text editor that can edit/modify the textual part of an image. Apart from correcting errors in the text part of the image, this work can directly and drastically increase the reusability of images. In this work, we first focus on the problem of generating unobserved characters with a font and color similar to those of an observed text character present in a natural scene, with minimum user intervention. To generate the characters, we propose a multi-input neural network that adapts the font characteristics of given characters (source) and generates desired characters (target) with similar font features. We also propose a network that transfers color from the source to the target character without any visible distortion. Next, we place the generated character in a word for its modification, maintaining visual consistency with the other characters in the word. The proposed method is a unified platform that can work like a simple text editor and edit texts in images. We tested our methodology on the popular ICDAR 2011 and ICDAR 2013 datasets, and results are reported here. |
Tasks | Decision Making |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01192v1 |
http://arxiv.org/pdf/1903.01192v1.pdf | |
PWC | https://paperswithcode.com/paper/stefann-scene-text-editor-using-font-adaptive |
Repo | |
Framework | |
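A toy PyTorch sketch of the interface the abstract describes: a generator that takes an observed source glyph plus the identity of the desired character and synthesizes the target glyph in the source font. All layer sizes and the 64x64 glyph resolution are illustrative assumptions, not the published STEFANN architecture.

```python
import torch
import torch.nn as nn

class FontAdaptiveGenerator(nn.Module):
    """Toy two-input generator: a source-glyph image branch and a target
    character one-hot branch, fused and decoded into a 64x64 target glyph."""
    def __init__(self, num_chars=26):
        super().__init__()
        self.img_enc = nn.Sequential(              # encodes font style from the source glyph
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 16 * 16, 128),
        )
        self.char_enc = nn.Linear(num_chars, 128)  # encodes which character to draw
        self.dec = nn.Sequential(
            nn.Linear(256, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, src_glyph, target_onehot):
        z = torch.cat([self.img_enc(src_glyph), self.char_enc(target_onehot)], dim=1)
        return self.dec(z)

gen = FontAdaptiveGenerator()
src = torch.rand(4, 1, 64, 64)                     # observed character crops
tgt = torch.eye(26)[torch.randint(0, 26, (4,))]    # desired characters, one-hot
print(gen(src, tgt).shape)                         # torch.Size([4, 1, 64, 64])
```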
Fast CNN-Based Object Tracking Using Localization Layers and Deep Features Interpolation
Title | Fast CNN-Based Object Tracking Using Localization Layers and Deep Features Interpolation |
Authors | Al-Hussein A. El-Shafie, Mohamed Zaki, Serag El-Din Habib |
Abstract | Object trackers based on Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance on recent tracking benchmarks, but they suffer from slow computational speed. The high computational load arises from the extraction of the feature maps of the candidate and training patches in every video frame. The candidate and training patches are typically placed randomly around the previous target location and the estimated target location, respectively. In this paper, we propose novel schemes to speed up the processing of CNN-based trackers. We input the whole region of interest once to the CNN to eliminate the redundant computations of the random candidate patches. In addition to classifying each candidate patch as object or background, we adapt the CNN to classify the target location inside the object patches as a coarse localization step, and we employ bilinear interpolation of the CNN feature maps as a fine localization step. Moreover, bilinear interpolation is exploited to generate CNN feature maps of the training patches without actually forwarding the training patches through the network, which achieves a significant reduction in the required computations. Our tracker does not rely on offline video training. It achieves competitive performance on the OTB benchmark with an 8x speed improvement compared to the equivalent tracker. |
Tasks | Object Tracking |
Published | 2019-01-09 |
URL | http://arxiv.org/abs/1901.02620v1 |
http://arxiv.org/pdf/1901.02620v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-cnn-based-object-tracking-using |
Repo | |
Framework | |
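The central speed-up is computing the backbone features once for the whole region of interest and reading off each candidate patch's features by bilinear interpolation instead of a per-patch forward pass. The sketch below uses torchvision's `roi_align`, which performs exactly that bilinear sampling; the stand-in backbone and box coordinates are placeholders.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

# Stand-in backbone (overall stride 4); any conv feature extractor would do here.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)

frame = torch.randn(1, 3, 480, 640)
feat = backbone(frame)                          # computed once per frame

# Candidate patches around the previous target location, in frame pixels:
# each row is (batch_index, x1, y1, x2, y2).
boxes = torch.tensor([[0, 300.0, 200.0, 364.0, 264.0],
                      [0, 310.0, 195.0, 374.0, 259.0]])

# Fixed-size candidate features read off the shared map by bilinear
# interpolation (no per-patch forward passes through the CNN).
cand_feats = roi_align(feat, boxes, output_size=(7, 7), spatial_scale=1.0 / 4)
print(cand_feats.shape)                         # torch.Size([2, 64, 7, 7])
```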
Short and Wide Network Paths
Title | Short and Wide Network Paths |
Authors | Lavanya Marla, Lav R. Varshney, Devavrat Shah, Nirmal A. Prakash, Michael E. Gale |
Abstract | Network flow is a powerful mathematical framework to systematically explore the relationship between structure and function in biological, social, and technological networks. We introduce a new pipelining model of flow through networks where commodities must be transported over single paths rather than split over several paths and recombined. We show this notion of pipelined network flow is optimized using network paths that are both short and wide, and develop efficient algorithms to compute such paths for given pairs of nodes and for all-pairs. Short and wide paths are characterized for many real-world networks. To further demonstrate the utility of this network characterization, we develop novel information-theoretic lower bounds on computation speed in nervous systems due to limitations from anatomical connectivity and physical noise. For the nematode Caenorhabditis elegans, we find these bounds are predictive of biological timescales of behavior. Further, we find the particular C. elegans connectome is globally less efficient for information flow than random networks, but the hub-and-spoke architecture of functional subcircuits is optimal under constraint on number of synapses. This suggests functional subcircuits are a primary organizational principle of this small invertebrate nervous system. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00344v1 |
https://arxiv.org/pdf/1911.00344v1.pdf | |
PWC | https://paperswithcode.com/paper/short-and-wide-network-paths |
Repo | |
Framework | |
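Under the pipelining model, sending an amount D of a commodity along a single path costs roughly the path length plus D divided by the path's bottleneck width, so the best path trades hops against width. A hedged NetworkX sketch of that trade-off (the objective and toy graph are illustrative, not the paper's exact formulation or algorithms):

```python
import networkx as nx

def short_wide_path(G, s, t, demand):
    """Pick a single s-t path minimizing (hop length + demand / bottleneck width),
    a simple stand-in for a pipelined-flow objective.

    G : networkx graph whose edges carry a 'width' attribute (capacity).
    """
    widths = sorted({d["width"] for _, _, d in G.edges(data=True)}, reverse=True)
    best = None
    for w in widths:
        # Keep only edges at least this wide, then take the fewest-hop path.
        wide_edges = [(u, v) for u, v, d in G.edges(data=True) if d["width"] >= w]
        H = G.edge_subgraph(wide_edges)
        if s not in H or t not in H or not nx.has_path(H, s, t):
            continue
        path = nx.shortest_path(H, s, t)
        bottleneck = min(G[u][v]["width"] for u, v in zip(path, path[1:]))
        cost = (len(path) - 1) + demand / bottleneck
        if best is None or cost < best[0]:
            best = (cost, path, bottleneck)
    return best   # (cost, path, bottleneck width) or None

# Toy network: a 2-hop narrow route vs. a 3-hop wide route between s and t.
G = nx.Graph()
G.add_edge("s", "a", width=1)
G.add_edge("a", "t", width=1)
G.add_edge("s", "b", width=10)
G.add_edge("b", "c", width=10)
G.add_edge("c", "t", width=10)
print(short_wide_path(G, "s", "t", demand=1))    # small demand favors the short path
print(short_wide_path(G, "s", "t", demand=100))  # large demand favors the wide path
```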
Deterministic Completion of Rectangular Matrices Using Ramanujan Bigraphs – I: Error Bounds and Exact Recovery
Title | Deterministic Completion of Rectangular Matrices Using Ramanujan Bigraphs – I: Error Bounds and Exact Recovery |
Authors | Shantanu Prasad Burnwal, Mathukumalli Vidyasagar |
Abstract | In this paper we study the matrix completion problem: Suppose $X \in {\mathbb R}^{n_r \times n_c}$ is unknown except for an upper bound $r$ on its rank. By measuring a small number $m \ll n_r n_c$ of the elements of $X$, is it possible to recover $X$ exactly, or at least, to construct a reasonable approximation of $X$? At present there are two approaches to choosing the sample set, namely probabilistic and deterministic. Probabilistic methods can guarantee the exact recovery of the unknown matrix, but only with high probability. At present there are very few deterministic methods, and they mostly apply only to square matrices. The focus in the present paper is on deterministic methods that work for rectangular as well as square matrices, and where possible, can guarantee exact recovery of the unknown matrix. We achieve this by choosing the elements to be sampled as the edge set of an asymmetric Ramanujan graph or Ramanujan bigraph. For such a measurement matrix, we (i) derive bounds on the error between a scaled version of the sampled matrix and unknown matrix; (ii) derive bounds on the recovery error when max norm minimization is used, and (iii) present suitable conditions under which the unknown matrix can be recovered exactly via nuclear norm minimization. In the process we streamline some existing proofs and improve upon them, and also make the results applicable to rectangular matrices. This raises two questions: (i) How can Ramanujan bigraphs be constructed? (ii) How close are the sufficient conditions derived in this paper to being necessary? Both questions are studied in a companion paper. |
Tasks | Matrix Completion |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.00963v2 |
https://arxiv.org/pdf/1908.00963v2.pdf | |
PWC | https://paperswithcode.com/paper/deterministic-completion-of-rectangular |
Repo | |
Framework | |
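The exact-recovery route in the abstract is nuclear norm minimization subject to agreement with the sampled entries. The cvxpy sketch below uses a random mask purely as a placeholder where the paper prescribes the biadjacency pattern of a Ramanujan bigraph as the deterministic sample set.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)

# Rank-2 ground truth, n_r x n_c (rectangular on purpose).
n_r, n_c, r = 30, 20, 2
M = rng.normal(size=(n_r, r)) @ rng.normal(size=(r, n_c))

# Sample mask: random here; the paper instead uses the edge set of an
# asymmetric Ramanujan graph / Ramanujan bigraph as a deterministic pattern.
mask = (rng.random((n_r, n_c)) < 0.5).astype(float)

# Nuclear norm minimization consistent with the observed entries.
X = cp.Variable((n_r, n_c))
prob = cp.Problem(cp.Minimize(cp.normNuc(X)),
                  [cp.multiply(mask, X - M) == 0])
prob.solve()

print("relative recovery error:",
      np.linalg.norm(X.value - M) / np.linalg.norm(M))
```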
Deep Multi-Facial patches Aggregation Network for Expression Classification from Face Images
Title | Deep Multi-Facial patches Aggregation Network for Expression Classification from Face Images |
Authors | Amine Djerghri, Ahmed Rachid Hazourli, Alice Othmani |
Abstract | Emotional intelligence in Human-Computer Interaction has attracted increasing attention from researchers in multidisciplinary research fields including psychology, computer vision, neuroscience, artificial intelligence, and related disciplines. Humans naturally tend to interact with computers face-to-face, and human expressions are an important key to better linking humans and computers. Thus, designing interfaces able to understand human expressions and emotions can improve Human-Computer Interaction (HCI) for better communication. In this paper, we investigate HCI via a deep multi-facial patches aggregation network for Facial Expression Recognition (FER). Deep features are extracted from facial parts and aggregated for expression classification. Several problems may affect the performance of the proposed framework, such as the small size of FER datasets and the high number of parameters to learn. To address this, two data augmentation techniques are proposed for facial expression generation to expand the labeled training set. The proposed framework is evaluated on the extended Cohn-Kanade dataset (CK+), and promising results are achieved. |
Tasks | Data Augmentation |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10305v2 |
https://arxiv.org/pdf/1909.10305v2.pdf | |
PWC | https://paperswithcode.com/paper/190910305 |
Repo | |
Framework | |
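A minimal PyTorch sketch of the overall pattern, encoding each facial patch with a shared CNN and aggregating the features for expression classification; the patch count, backbone, and feature sizes are placeholders rather than the authors' architecture.

```python
import torch
import torch.nn as nn

class PatchAggregationFER(nn.Module):
    """Extract deep features per facial patch and aggregate them for
    expression classification (illustrative sketch)."""
    def __init__(self, num_patches=4, num_classes=7, feat_dim=64):
        super().__init__()
        self.patch_net = nn.Sequential(               # shared per-patch encoder
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(num_patches * feat_dim, num_classes)

    def forward(self, patches):                       # patches: (B, P, 1, H, W)
        b, p = patches.shape[:2]
        feats = self.patch_net(patches.flatten(0, 1)) # encode every patch
        feats = feats.view(b, -1)                     # concatenate (aggregate) per face
        return self.classifier(feats)

# Toy usage: 4 patches (e.g. eye, nose, and mouth regions) cropped per face.
model = PatchAggregationFER()
patches = torch.rand(8, 4, 1, 32, 32)
print(model(patches).shape)                           # torch.Size([8, 7])
```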
Word Embeddings: A Survey
Title | Word Embeddings: A Survey |
Authors | Felipe Almeida, Geraldo Xexéo |
Abstract | This work lists and describes the main recent strategies for building fixed-length, dense and distributed representations for words, based on the distributional hypothesis. These representations are now commonly called word embeddings and, in addition to encoding surprisingly good syntactic and semantic information, have been proven useful as extra features in many downstream NLP tasks. |
Tasks | Word Embeddings |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.09069v1 |
http://arxiv.org/pdf/1901.09069v1.pdf | |
PWC | https://paperswithcode.com/paper/word-embeddings-a-survey |
Repo | |
Framework | |
S3: A Spectral-Spatial Structure Loss for Pan-Sharpening Networks
Title | S3: A Spectral-Spatial Structure Loss for Pan-Sharpening Networks |
Authors | Jae-Seok Choi, Yongwoo Kim, Munchurl Kim |
Abstract | Recently, many deep-learning-based pan-sharpening methods have been proposed for generating high-quality pan-sharpened (PS) satellite images. These methods focused on various types of convolutional neural network (CNN) structures, which were trained by simply minimizing a spectral loss between network outputs and the corresponding high-resolution multi-spectral (MS) target images. However, due to different sensor characteristics and acquisition times, high-resolution panchromatic (PAN) and low-resolution MS image pairs tend to have large pixel misalignments, especially for moving objects in the images. Conventional CNNs trained with only the spectral loss with these satellite image datasets often produce PS images of low visual quality including double-edge artifacts along strong edges and ghosting artifacts on moving objects. In this letter, we propose a novel loss function, called a spectral-spatial structure (S3) loss, based on the correlation maps between MS targets and PAN inputs. Our proposed S3 loss can be very effectively utilized for pan-sharpening with various types of CNN structures, resulting in significant visual improvements on PS images with suppressed artifacts. |
Tasks | |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05480v2 |
https://arxiv.org/pdf/1906.05480v2.pdf | |
PWC | https://paperswithcode.com/paper/s3-a-spectral-spatial-structure-loss-for-pan |
Repo | |
Framework | |
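A hedged PyTorch sketch of the general idea behind the loss: weight a per-pixel spectral term by a local correlation map between the MS target and the PAN input, so misaligned or moving-object pixels contribute less. The window size, luminance proxy, and exact weighting are assumptions, not the published S3 formulation.

```python
import torch
import torch.nn.functional as F

def local_correlation(a, b, win=9):
    """Local (per-window) Pearson correlation between two single-channel
    images, computed with box filters. Shapes: (B, 1, H, W)."""
    pad = win // 2
    mean = lambda x: F.avg_pool2d(x, win, stride=1, padding=pad,
                                  count_include_pad=False)
    ma, mb = mean(a), mean(b)
    cov = mean(a * b) - ma * mb
    var_a = mean(a * a) - ma ** 2
    var_b = mean(b * b) - mb ** 2
    return cov / (var_a.clamp(min=1e-6).sqrt() * var_b.clamp(min=1e-6).sqrt())

def s3_style_loss(ps, ms_target, pan, win=9):
    """Correlation-weighted spectral loss (illustrative): pixels where the MS
    target and the PAN input agree structurally get full weight; misaligned
    or moving-object pixels are down-weighted."""
    ms_gray = ms_target.mean(dim=1, keepdim=True)      # crude luminance proxy
    w = local_correlation(ms_gray, pan, win).clamp(min=0.0)
    return (w * (ps - ms_target).abs()).mean()

# Toy shapes: 4-band MS target, single-band PAN, network output `ps`.
ps = torch.rand(2, 4, 64, 64, requires_grad=True)
ms = torch.rand(2, 4, 64, 64)
pan = torch.rand(2, 1, 64, 64)
s3_style_loss(ps, ms, pan).backward()
```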
A Unified Framework for Data Poisoning Attack to Graph-based Semi-supervised Learning
Title | A Unified Framework for Data Poisoning Attack to Graph-based Semi-supervised Learning |
Authors | Xuanqing Liu, Si Si, Xiaojin Zhu, Yang Li, Cho-Jui Hsieh |
Abstract | In this paper, we propose a general framework for data poisoning attacks on graph-based semi-supervised learning (G-SSL). In this framework, we first unify different tasks, goals, and constraints into a single formula for data poisoning attacks in G-SSL, then we propose two specialized algorithms to efficiently solve two important cases: poisoning regression tasks under an $\ell_2$-norm constraint and classification tasks under an $\ell_0$-norm constraint. In the former case, we transform it into a non-convex trust region problem and show that our gradient-based algorithm with delicate initialization and update scheme finds the (globally) optimal perturbation. For the latter case, although it is an NP-hard integer programming problem, we propose a probabilistic solver that works much better than the classical greedy method. Lastly, we test our framework on real datasets and evaluate the robustness of G-SSL algorithms. For instance, on the MNIST binary classification problem (50,000 training samples with 50 labeled), flipping two labels is enough to make the model perform like a random guess (around 50% error). |
Tasks | data poisoning |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14147v1 |
https://arxiv.org/pdf/1910.14147v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-for-data-poisoning-attack |
Repo | |
Framework | |
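For the $\ell_0$-constrained classification case, the attack amounts to flipping a small number of training labels. The sketch below implements the classical greedy baseline (which the paper's probabilistic solver is reported to beat) against a simple closed-form label-spreading model; the helper names and the toy graph are illustrative, not the paper's code.

```python
import numpy as np

def propagate(W, y_labeled, labeled_idx, n, alpha=0.9):
    """Closed-form label spreading f = (I - alpha*S)^(-1) Y, with S the
    symmetrically normalized adjacency."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d) + 1e-12)
    Y = np.zeros(n)
    Y[labeled_idx] = y_labeled
    return np.linalg.solve(np.eye(n) - alpha * S, Y)

def greedy_flip_attack(W, y_labeled, labeled_idx, n, budget):
    """Greedy baseline for the l0-constrained attack: repeatedly flip the
    labeled point whose flip most changes the propagated predictions."""
    y = y_labeled.copy()
    clean = np.sign(propagate(W, y_labeled, labeled_idx, n))
    for _ in range(budget):
        scores = []
        for i in range(len(y)):
            y_try = y.copy()
            y_try[i] *= -1
            f = np.sign(propagate(W, y_try, labeled_idx, n))
            scores.append((f != clean).mean())        # induced disagreement
        y[int(np.argmax(scores))] *= -1
    return y

# Toy graph: two 10-node clusters weakly connected, one labeled node in each.
n = 20
W = np.zeros((n, n))
W[:10, :10] = W[10:, 10:] = 1.0
W[9, 10] = W[10, 9] = 0.5
np.fill_diagonal(W, 0.0)
labeled_idx = np.array([0, 19])
y_labeled = np.array([1.0, -1.0])
print(greedy_flip_attack(W, y_labeled, labeled_idx, n, budget=1))
```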
Natural Question Generation with Reinforcement Learning Based Graph-to-Sequence Model
Title | Natural Question Generation with Reinforcement Learning Based Graph-to-Sequence Model |
Authors | Yu Chen, Lingfei Wu, Mohammed J. Zaki |
Abstract | Natural question generation (QG) aims to generate questions from a passage and an answer. In this paper, we propose a novel reinforcement learning (RL) based graph-to-sequence (Graph2Seq) model for QG. Our model consists of a Graph2Seq generator where a novel Bidirectional Gated Graph Neural Network is proposed to embed the passage, and a hybrid evaluator with a mixed objective combining both cross-entropy and RL losses to ensure the generation of syntactically and semantically valid text. The proposed model outperforms previous state-of-the-art methods by a large margin on the SQuAD dataset. |
Tasks | Graph-to-Sequence, Question Generation |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08832v1 |
https://arxiv.org/pdf/1910.08832v1.pdf | |
PWC | https://paperswithcode.com/paper/natural-question-generation-with |
Repo | |
Framework | |
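A minimal sketch of a mixed objective of the kind the abstract describes: a cross-entropy term plus a self-critical REINFORCE term combined with a mixing coefficient. The reward function, mixing weight, and tensor shapes are placeholders, not the paper's exact hybrid evaluator.

```python
import torch
import torch.nn.functional as F

def mixed_generation_loss(logits, targets, sampled_ids, sampled_logprobs,
                          reward_fn, gamma=0.98, pad_id=0):
    """loss = (1 - gamma) * cross-entropy + gamma * self-critical RL term.

    logits           : (B, T, V) decoder outputs under teacher forcing
    targets          : (B, T) gold question tokens
    sampled_ids      : (B, T) tokens sampled from the decoder
    sampled_logprobs : (B,)   sum of log-probs of each sampled sequence
    reward_fn        : maps a token-id list to a scalar reward (e.g. BLEU)
    """
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                         targets.reshape(-1), ignore_index=pad_id)

    with torch.no_grad():
        greedy_ids = logits.argmax(dim=-1)
        # Self-critical baseline: reward of the greedy decode.
        advantage = torch.tensor([reward_fn(s.tolist()) - reward_fn(g.tolist())
                                  for s, g in zip(sampled_ids, greedy_ids)])

    rl = -(advantage * sampled_logprobs).mean()
    return (1.0 - gamma) * ce + gamma * rl

# Toy usage with a dummy reward (fraction of non-pad tokens).
B, T, V = 2, 5, 11
logits = torch.randn(B, T, V, requires_grad=True)
targets = torch.randint(1, V, (B, T))
sampled = torch.randint(1, V, (B, T))
logprobs = torch.randn(B, requires_grad=True)
reward = lambda ids: sum(i != 0 for i in ids) / len(ids)
mixed_generation_loss(logits, targets, sampled, logprobs, reward).backward()
```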
DynGraph2Seq: Dynamic-Graph-to-Sequence Interpretable Learning for Health Stage Prediction in Online Health Forums
Title | DynGraph2Seq: Dynamic-Graph-to-Sequence Interpretable Learning for Health Stage Prediction in Online Health Forums |
Authors | Yuyang Gao, Lingfei Wu, Houman Homayoun, Liang Zhao |
Abstract | Online health communities such as the online breast cancer forum enable patients (i.e., users) to interact and help each other within various subforums, which are subsections of the main forum devoted to specific health topics. The changing nature of the users’ activities in different subforums can be strong indicators of their health status changes. This additional information could allow health-care organizations to respond promptly and provide additional help for the patient. However, modeling complex transitions of an individual user’s activities among different subforums over time and learning how these correspond to his/her health stage are extremely challenging. In this paper, we first formulate the transition of user activities as a dynamic graph with multi-attributed nodes, then formalize the health stage inference task as a dynamic graph-to-sequence learning problem, and hence propose a novel dynamic graph-to-sequence neural networks architecture (DynGraph2Seq) to address all the challenges. Our proposed DynGraph2Seq model consists of a novel dynamic graph encoder and an interpretable sequence decoder that learn the mapping between a sequence of time-evolving user activity graphs and a sequence of target health stages. We go on to propose dynamic graph hierarchical attention mechanisms to facilitate the necessary multi-level interpretability. A comprehensive experimental analysis of its use for a health stage prediction task demonstrates both the effectiveness and the interpretability of the proposed models. |
Tasks | Graph-to-Sequence |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08497v1 |
https://arxiv.org/pdf/1908.08497v1.pdf | |
PWC | https://paperswithcode.com/paper/dyngraph2seq-dynamic-graph-to-sequence |
Repo | |
Framework | |
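A toy PyTorch sketch of the dynamic-graph-to-sequence shape of the task: encode each user-activity snapshot with one round of adjacency-based message passing, run a GRU over the snapshot embeddings, and predict a health stage per step. Everything here, including the omission of the hierarchical attention, is illustrative rather than the published DynGraph2Seq model.

```python
import torch
import torch.nn as nn

class DynGraphEncoder(nn.Module):
    """Encode a sequence of graph snapshots: one round of adjacency-based
    message passing per snapshot, mean-pooled, then a GRU over time."""
    def __init__(self, node_dim, hidden):
        super().__init__()
        self.msg = nn.Linear(node_dim, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, feats, adjs):
        # feats: (T, N, node_dim) node attributes per snapshot
        # adjs : (T, N, N) snapshot adjacency matrices
        snap = torch.relu(adjs @ self.msg(feats)).mean(dim=1)   # (T, hidden)
        out, h = self.gru(snap.unsqueeze(0))                    # add batch dim
        return out, h

class StageDecoder(nn.Module):
    """Predict a health stage for each time step from the encoder outputs."""
    def __init__(self, hidden, num_stages):
        super().__init__()
        self.out = nn.Linear(hidden, num_stages)

    def forward(self, enc_out):
        return self.out(enc_out)              # (1, T, num_stages) stage logits

T, N, D, H = 6, 12, 8, 32
feats = torch.rand(T, N, D)
adjs = (torch.rand(T, N, N) > 0.7).float()
enc_out, _ = DynGraphEncoder(D, H)(feats, adjs)
print(StageDecoder(H, num_stages=4)(enc_out).shape)   # torch.Size([1, 6, 4])
```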
On-demand teleradiology using smartphone photographs as proxies for DICOM images
Title | On-demand teleradiology using smartphone photographs as proxies for DICOM images |
Authors | Christine Podilchuk, Siddhartha Pachhai, Robert Warfsman, Richard Mammone |
Abstract | The use of photographs of the screen of displayed medical images is explored to circumvent the challenges involved in transferring images between sites. The photographs can be conveniently taken with a smartphone and analyzed remotely by either human or AI experts. An autoencoder preprocessor is shown to improve the performance for human experts. The AI performance provided by photographs is shown to be statistically equivalent to using the original DICOM images. The autoencoder preprocessor increases the PSNR by 15 dB or greater and provides an AUC that is statistically equivalent to using the original DICOM images. The photo approach is an alternative to IHE-based teleradiology applications while avoiding the problems inherent in navigating the proprietary and security barriers that limit DICOM communication between PACS in practice. |
Tasks | |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.05669v2 |
https://arxiv.org/pdf/1909.05669v2.pdf | |
PWC | https://paperswithcode.com/paper/on-demand-teleradiology-using-smartphone |
Repo | |
Framework | |
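PSNR is the figure of merit quoted for the autoencoder preprocessor. For reference, a minimal NumPy implementation of PSNR between a reference image and a degraded or restored version; the images below are synthetic placeholders, not DICOM data or screen photographs.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and a test
    image (e.g. a screen photograph before/after a preprocessor)."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: a clean image vs. two noisy versions; less noise gives higher
# PSNR, and a ~15 dB gain corresponds to roughly a 5-6x drop in RMS error.
rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(256, 256)).astype(np.float64)
photo = np.clip(clean + rng.normal(0, 20, clean.shape), 0, 255)
restored = np.clip(clean + rng.normal(0, 4, clean.shape), 0, 255)
print(f"photo PSNR    : {psnr(clean, photo):.1f} dB")
print(f"restored PSNR : {psnr(clean, restored):.1f} dB")
```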