October 19, 2019

3288 words 16 mins read

Paper Group ANR 196



Real-to-Virtual Domain Unification for End-to-End Autonomous Driving

Title Real-to-Virtual Domain Unification for End-to-End Autonomous Driving
Authors Luona Yang, Xiaodan Liang, Tairui Wang, Eric Xing
Abstract In the spectrum of vision-based autonomous driving, vanilla end-to-end models are not interpretable and suboptimal in performance, while mediated perception models require additional intermediate representations such as segmentation masks or detection bounding boxes, whose annotation can be prohibitively expensive as we move to a larger scale. More critically, all prior works fail to deal with the notorious domain shift if we were to merge data collected from different sources, which greatly hinders the model generalization ability. In this work, we address the above limitations by taking advantage of virtual data collected from driving simulators, and present DU-drive, an unsupervised real-to-virtual domain unification framework for end-to-end autonomous driving. It first transforms real driving data to its less complex counterpart in the virtual domain and then predicts vehicle control commands from the generated virtual image. Our framework has three unique advantages: 1) it maps driving data collected from a variety of source distributions into a unified domain, effectively eliminating domain shift; 2) the learned virtual representation is simpler than the input real image and closer in form to the “minimum sufficient statistic” for the prediction task, which relieves the burden of the compression phase while optimizing the information bottleneck tradeoff and leads to superior prediction performance; 3) it takes advantage of annotated virtual data which is unlimited and free to obtain. Extensive experiments on two public driving datasets and two driving simulators demonstrate the performance superiority and interpretive capability of DU-drive.
Tasks Autonomous Driving
Published 2018-01-10
URL http://arxiv.org/abs/1801.03458v2
PDF http://arxiv.org/pdf/1801.03458v2.pdf
PWC https://paperswithcode.com/paper/real-to-virtual-domain-unification-for-end-to
Repo
Framework
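
As a rough illustration of the two-stage pipeline the abstract describes, here is a minimal PyTorch sketch: a generator maps a real camera frame into the simpler virtual domain, and a predictor regresses a control command from the generated image. All layer sizes and module names are hypothetical; the paper's actual generator (trained against virtual data) and predictor are more elaborate.

```python
# Minimal sketch of the DU-drive two-stage idea (hypothetical layer sizes;
# the paper's actual generator/predictor architectures differ).
import torch
import torch.nn as nn

class RealToVirtualGenerator(nn.Module):
    """Maps a real camera frame to its simpler virtual-domain counterpart."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class ControlPredictor(nn.Module):
    """Regresses a steering command from the generated virtual image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(36, 1)
    def forward(self, v):
        return self.head(self.features(v).flatten(1))

real_frame = torch.randn(1, 3, 64, 64)          # stand-in for a real image
virtual = RealToVirtualGenerator()(real_frame)  # unify domains
steering = ControlPredictor()(virtual)          # predict control command
```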

Reconstruction of training samples from loss functions

Title Reconstruction of training samples from loss functions
Authors Akiyoshi Sannai
Abstract This paper presents a new mathematical framework to analyze the loss functions of deep neural networks with ReLU functions. Furthermore, as an application of this theory, we prove that the loss functions can reconstruct the inputs of the training samples up to scalar multiplication (as vectors) and can provide the number of layers and nodes of the deep neural network. Namely, if we have all inputs and outputs of a loss function (or equivalently, all possible learning processes), then for the input of each training sample $x_i \in \mathbb{R}^n$, we can obtain a vector $x'_i\in \mathbb{R}^n$ satisfying $x_i=c_ix'_i$ for some $c_i \neq 0$. To prove the theorem, we introduce the notion of virtual polynomials, which are polynomials written as the output of a node in a deep neural network. Using virtual polynomials, we find an algebraic structure for the loss surfaces, called semi-algebraic sets. We analyze these loss surfaces from the algebro-geometric point of view. Factorization of polynomials is one of the most standard ideas in algebra. Hence, we express the factorization of the virtual polynomials in terms of their active paths. This framework can be applied to the leakage problem in the training of deep neural networks. The main theorem in this paper indicates that there are many risks associated with the training of deep neural networks. For example, if we have N + 1 (where N is the dimension of the weight space) nonsmooth points on the loss surface, which are sufficiently close to each other, we can obtain the input of a training sample up to scalar multiplication. We also point out that the structures of the loss surfaces depend on the shape of the deep neural network and not on the training samples.
Tasks
Published 2018-05-18
URL http://arxiv.org/abs/1805.07337v1
PDF http://arxiv.org/pdf/1805.07337v1.pdf
PWC https://paperswithcode.com/paper/reconstruction-of-training-samples-from-loss
Repo
Framework
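
To see why nonsmooth points of the loss can leak training inputs up to scale, consider this illustrative one-neuron example (our own, not from the paper):

```latex
% One ReLU neuron f(x) = ReLU(w^T x) with squared loss on a sample (x_i, y_i):
\[
  L(w) = \bigl(\mathrm{ReLU}(w^{\top} x_i) - y_i\bigr)^2
       = \begin{cases}
           (w^{\top} x_i - y_i)^2 & \text{if } w^{\top} x_i > 0,\\[2pt]
           y_i^2                  & \text{if } w^{\top} x_i \le 0.
         \end{cases}
\]
% L is a piecewise polynomial ("virtual polynomial") in w, nonsmooth (for
% y_i \neq 0) exactly on the hyperplane \{w : w^{\top} x_i = 0\}, whose
% normal direction is x_i. Recovering this hyperplane from nearby nonsmooth
% points of the loss surface recovers x_i up to a nonzero scalar c_i.
```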

Meta-Gradient Reinforcement Learning

Title Meta-Gradient Reinforcement Learning
Authors Zhongwen Xu, Hado van Hasselt, David Silver
Abstract The goal of reinforcement learning algorithms is to estimate and/or optimise the value function. However, unlike supervised learning, no teacher or oracle is available to provide the true value function. Instead, the majority of reinforcement learning algorithms estimate and/or optimise a proxy for the value function. This proxy is typically based on a sampled and bootstrapped approximation to the true value function, known as a return. The particular choice of return is one of the chief components determining the nature of the algorithm: the rate at which future rewards are discounted; when and how values should be bootstrapped; or even the nature of the rewards themselves. It is well-known that these decisions are crucial to the overall success of RL algorithms. We discuss a gradient-based meta-learning algorithm that is able to adapt the nature of the return, online, whilst interacting and learning from the environment. When applied to 57 games on the Atari 2600 environment over 200 million frames, our algorithm achieved a new state-of-the-art performance.
Tasks Meta-Learning
Published 2018-05-24
URL http://arxiv.org/abs/1805.09801v1
PDF http://arxiv.org/pdf/1805.09801v1.pdf
PWC https://paperswithcode.com/paper/meta-gradient-reinforcement-learning
Repo
Framework
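
The core mechanism can be sketched in a few lines: perform an ordinary TD update, then differentiate a meta-objective through that update with respect to the return's parameters (here just the discount gamma). The toy chain MDP, the fixed reference return, and all constants below are hypothetical; the paper adapts gamma and lambda online inside a large-scale actor-critic agent.

```python
# A minimal sketch of the meta-gradient mechanism on a toy 5-state chain
# (all constants hypothetical).
import torch

torch.manual_seed(0)
N, ALPHA, META_LR = 5, 0.5, 0.05
theta = torch.zeros(N, requires_grad=True)      # tabular state values
gamma = torch.tensor(0.5, requires_grad=True)   # meta-parameter to adapt

def reference_return(s, gamma_ref=0.99):
    # Monte Carlo return of the deterministic chain: reward 1 is received
    # on entering the terminal state N-1, zero elsewhere.
    return gamma_ref ** (N - 2 - s)

for step in range(200):
    s = torch.randint(0, N - 1, (1,)).item()
    s_next = s + 1
    r = 1.0 if s_next == N - 1 else 0.0

    # Inner (ordinary) TD(0) update, kept differentiable w.r.t. gamma.
    bootstrap = 0.0 if s_next == N - 1 else theta[s_next].detach()
    td_loss = (r + gamma * bootstrap - theta[s]) ** 2
    g = torch.autograd.grad(td_loss, theta, create_graph=True)[0]
    theta_new = theta - ALPHA * g

    # Meta-objective: quality of the *updated* values under a fixed
    # reference return; differentiate through the update w.r.t. gamma.
    meta_loss = (theta_new[s] - reference_return(s)) ** 2
    gamma_grad, = torch.autograd.grad(meta_loss, gamma)

    with torch.no_grad():
        theta -= ALPHA * g             # commit the inner update
        gamma -= META_LR * gamma_grad  # adapt the return's discount
        gamma.clamp_(0.0, 1.0)

print(f"adapted gamma = {gamma.item():.3f}")
```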

Deep Object-Centric Policies for Autonomous Driving

Title Deep Object-Centric Policies for Autonomous Driving
Authors Dequan Wang, Coline Devin, Qi-Zhi Cai, Fisher Yu, Trevor Darrell
Abstract While learning visuomotor skills in an end-to-end manner is appealing, deep neural networks are often uninterpretable and fail in surprising ways. For robotics tasks, such as autonomous driving, models that explicitly represent objects may be more robust to new scenes and provide intuitive visualizations. We describe a taxonomy of “object-centric” models which leverage both object instances and end-to-end learning. In the Grand Theft Auto V simulator, we show that object-centric models outperform object-agnostic methods in scenes with other vehicles and pedestrians, even with an imperfect detector. We also demonstrate that our architectures perform well on real-world environments by evaluating on the Berkeley DeepDrive Video dataset, where an object-centric model outperforms object-agnostic models in the low-data regimes.
Tasks Autonomous Driving
Published 2018-11-13
URL http://arxiv.org/abs/1811.05432v2
PDF http://arxiv.org/pdf/1811.05432v2.pdf
PWC https://paperswithcode.com/paper/deep-object-centric-policies-for-autonomous
Repo
Framework
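
A minimal sketch of what an object-centric policy head can look like (all dimensions hypothetical; the paper describes a taxonomy of variants, e.g., attention over objects rather than mean pooling): per-object features pooled from a detector's boxes are aggregated and combined with a global image feature before predicting the action.

```python
# Sketch of an object-centric policy head (hypothetical dimensions).
import torch
import torch.nn as nn

class ObjectCentricPolicy(nn.Module):
    def __init__(self, obj_dim=64, glob_dim=128, n_actions=3):
        super().__init__()
        self.obj_mlp = nn.Sequential(nn.Linear(obj_dim, 64), nn.ReLU())
        self.head = nn.Linear(glob_dim + 64, n_actions)

    def forward(self, global_feat, object_feats):
        # object_feats: (num_objects, obj_dim), pooled from detector boxes
        obj = self.obj_mlp(object_feats).mean(dim=0)  # permutation-invariant pool
        return self.head(torch.cat([global_feat, obj]))

policy = ObjectCentricPolicy()
action_logits = policy(torch.randn(128), torch.randn(5, 64))  # 5 detected objects
```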

Studying the History of the Arabic Language: Language Technology and a Large-Scale Historical Corpus

Title Studying the History of the Arabic Language: Language Technology and a Large-Scale Historical Corpus
Authors Yonatan Belinkov, Alexander Magidow, Alberto Barrón-Cedeño, Avi Shmidman, Maxim Romanov
Abstract Arabic is a widely-spoken language with a long and rich history, but existing corpora and language technology focus mostly on modern Arabic and its varieties. Therefore, studying the history of the language has so far been mostly limited to manual analyses on a small scale. In this work, we present a large-scale historical corpus of the written Arabic language, spanning 1400 years. We describe our efforts to clean and process this corpus using Arabic NLP tools, including the identification of reused text. We study the history of the Arabic language using a novel automatic periodization algorithm, as well as other techniques. Our findings confirm the established division of written Arabic into Modern Standard and Classical Arabic, and confirm other established periodizations, while suggesting that written Arabic may be divisible into still further periods of development.
Tasks
Published 2018-09-11
URL http://arxiv.org/abs/1809.03891v1
PDF http://arxiv.org/pdf/1809.03891v1.pdf
PWC https://paperswithcode.com/paper/studying-the-history-of-the-arabic-language
Repo
Framework

Graph Refinement based Tree Extraction using Mean-Field Networks and Graph Neural Networks

Title Graph Refinement based Tree Extraction using Mean-Field Networks and Graph Neural Networks
Authors Raghavendra Selvan, Thomas Kipf, Max Welling, Jesper H Pedersen, Jens Petersen, Marleen de Bruijne
Abstract Graph refinement, or the task of obtaining subgraphs of interest from over-complete graphs, can have many varied applications. In this work, we extract tree structures from image data by first deriving a graph-based representation of the volumetric data and then posing tree extraction as a graph refinement task. We present two methods to perform graph refinement. First, we use mean-field approximation (MFA) to approximate the posterior density over the subgraphs, from which the optimal subgraph of interest can be estimated. Mean-field networks (MFNs) are used for inference, based on the interpretation that iterations of MFA can be seen as feed-forward operations in a neural network. This allows us to learn the model parameters using gradient descent. Second, we present a supervised learning approach using graph neural networks (GNNs), which can be seen as generalisations of MFNs. Subgraphs are obtained by jointly training a GNN-based encoder-decoder pair, wherein the encoder learns useful edge embeddings from which the edge probabilities are predicted using a simple decoder. We discuss connections between the two classes of methods and compare them on the task of extracting airways from 3D, low-dose, chest CT data. Both the MFN and GNN models show significant improvement in detecting more branches when compared to a baseline method similar to a top-performing method in the EXACT’09 Challenge.
Tasks
Published 2018-11-21
URL http://arxiv.org/abs/1811.08674v1
PDF http://arxiv.org/pdf/1811.08674v1.pdf
PWC https://paperswithcode.com/paper/graph-refinement-based-tree-extraction-using
Repo
Framework
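
To make the encoder-decoder idea concrete, here is a minimal sketch (hypothetical dimensions and a single round of message passing; the paper's MFN and GNN formulations are richer): node embeddings are updated from their neighbors, and a simple decoder turns each candidate edge of the over-complete graph into a keep probability.

```python
# Minimal sketch of GNN-based graph refinement (hypothetical dimensions).
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.msg = nn.Linear(dim, dim)    # one round of message passing
        self.dec = nn.Linear(2 * dim, 1)  # simple edge decoder

    def forward(self, x, edges):
        # x: (n_nodes, dim); edges: (n_edges, 2) candidate edges of the
        # over-complete graph
        agg = torch.zeros_like(x)
        agg.index_add_(0, edges[:, 0], self.msg(x[edges[:, 1]]))
        h = torch.relu(x + agg)           # updated node embeddings
        pair = torch.cat([h[edges[:, 0]], h[edges[:, 1]]], dim=1)
        return torch.sigmoid(self.dec(pair)).squeeze(1)  # keep-probability per edge

x = torch.randn(6, 16)                    # 6 candidate nodes
edges = torch.tensor([[0, 1], [1, 2], [2, 3], [3, 4], [4, 5], [1, 4]])
keep_prob = EdgeScorer()(x, edges)        # threshold to obtain the refined tree
```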

Image Ordinal Classification and Understanding: Grid Dropout with Masking Label

Title Image Ordinal Classification and Understanding: Grid Dropout with Masking Label
Authors Chao Zhang, Ce Zhu, Jimin Xiao, Xun Xu, Yipeng Liu
Abstract Image ordinal classification refers to predicting a discrete target value which carries an ordering correlation among image categories. The limited size of labeled ordinal data renders modern deep learning approaches prone to overfitting. To tackle this issue, neuron dropout and data augmentation have been proposed, which, however, still suffer from over-parameterization and from breaking spatial structure, respectively. To address these issues, we first propose a grid dropout method that randomly drops out (blacks out) some areas of the training image. We then combine the objective of predicting the blacked-out patches with classification, to take advantage of the spatial information. Finally, we demonstrate the effectiveness of both approaches by visualizing the Class Activation Map (CAM), and find that grid dropout is more aware of the whole facial area and more robust than neuron dropout for small training datasets. Experiments are conducted on a challenging age estimation benchmark, the Adience dataset, with very competitive results compared with state-of-the-art methods.
Tasks Age Estimation, Data Augmentation
Published 2018-05-08
URL http://arxiv.org/abs/1805.02901v1
PDF http://arxiv.org/pdf/1805.02901v1.pdf
PWC https://paperswithcode.com/paper/image-ordinal-classification-and
Repo
Framework
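
A minimal sketch of grid dropout as the abstract describes it (grid size and drop probability are hypothetical choices): rectangular cells of the training image are randomly blacked out, which preserves spatial structure better than per-pixel neuron dropout, and the blacked-out patches can serve as targets for the auxiliary prediction objective.

```python
# Minimal grid-dropout augmentation (hypothetical parameter values).
import numpy as np

def grid_dropout(image, grid=4, drop_prob=0.3, rng=np.random):
    """Blackout randomly chosen cells of a grid x grid partition of the image."""
    out = image.copy()
    h, w = image.shape[:2]
    ch, cw = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            if rng.rand() < drop_prob:
                out[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw] = 0
    return out

face = np.random.rand(64, 64, 3)   # stand-in for a training image
augmented = grid_dropout(face)      # feed to the classifier; optionally also
                                    # predict which patches were blacked out
```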

Encoding Longer-term Contextual Multi-modal Information in a Predictive Coding Model

Title Encoding Longer-term Contextual Multi-modal Information in a Predictive Coding Model
Authors Junpei Zhong, Tetsuya Ogata, Angelo Cangelosi
Abstract Studies suggest that, within a hierarchical architecture, the topologically higher levels may represent conscious categories of the current sensory events through more slowly changing activities, and attempt to predict the activities on the lower levels by relaying the predicted information downward. Conversely, the incoming sensory information corrects such higher-level predictions of the events via novel or surprising signals. We propose a predictive hierarchical artificial neural network model that examines this hypothesis on neurorobotic platforms, based on the AFA-PredNet model. In this model, predictions at different temporal scales exist on different levels of the hierarchical predictive coding, defined by the temporal parameters of the neurons. Moreover, both the fast- and the slow-changing neural activities are modulated by the active motor activities. A neurorobotic experiment based on the architecture was also conducted, using data collected from the VRep simulator.
Tasks
Published 2018-04-17
URL http://arxiv.org/abs/1804.06774v1
PDF http://arxiv.org/pdf/1804.06774v1.pdf
PWC https://paperswithcode.com/paper/encoding-longer-term-contextual-multi-modal
Repo
Framework
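
The multiple-timescale mechanism can be sketched with leaky, continuous-time-style units (sizes, weights, and time constants below are hypothetical stand-ins, in the spirit of multiple-timescale RNNs): a larger time constant tau makes the higher level's activity change more slowly, letting it carry longer-term context that is relayed down to the faster lower level.

```python
# Two-level, two-timescale sketch (hypothetical sizes and time constants).
import numpy as np

rng = np.random.default_rng(0)
W_low = rng.normal(size=(8, 8)) * 0.1       # lower-level recurrence
W_high = rng.normal(size=(4, 4)) * 0.1      # higher-level recurrence
W_top_down = rng.normal(size=(8, 4)) * 0.1  # prediction relayed downward
tau_low, tau_high = 2.0, 16.0               # fast vs. slow dynamics

h_low, h_high = np.zeros(8), np.zeros(4)
for t in range(100):
    sensory = rng.normal(size=8)            # stand-in multimodal input
    # Leaky updates: a larger tau yields slower-changing activity.
    h_high += (-h_high + np.tanh(W_high @ h_high)) / tau_high
    h_low += (-h_low + np.tanh(W_low @ h_low
                               + W_top_down @ h_high + sensory)) / tau_low
```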

Pathologies of Neural Models Make Interpretations Difficult

Title Pathologies of Neural Models Make Interpretations Difficult
Authors Shi Feng, Eric Wallace, Alvin Grissom II, Mohit Iyyer, Pedro Rodriguez, Jordan Boyd-Graber
Abstract One way to interpret neural model predictions is to highlight the most important input features—for example, a heatmap visualization over the words in an input sentence. In existing interpretation methods for NLP, a word’s importance is determined by either input perturbation—measuring the decrease in model confidence when that word is removed—or by the gradient with respect to that word. To understand the limitations of these methods, we use input reduction, which iteratively removes the least important word from the input. This exposes pathological behaviors of neural models: the remaining words appear nonsensical to humans and are not the ones determined as important by interpretation methods. As we confirm with human experiments, the reduced examples lack information to support the prediction of any label, but models still make the same predictions with high confidence. To explain these counterintuitive results, we draw connections to adversarial examples and confidence calibration: pathological behaviors reveal difficulties in interpreting neural models trained with maximum likelihood. To mitigate their deficiencies, we fine-tune the models by encouraging high entropy outputs on reduced examples. Fine-tuned models become more interpretable under input reduction without accuracy loss on regular examples.
Tasks Calibration
Published 2018-04-20
URL http://arxiv.org/abs/1804.07781v3
PDF http://arxiv.org/pdf/1804.07781v3.pdf
PWC https://paperswithcode.com/paper/pathologies-of-neural-models-make
Repo
Framework
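
Input reduction itself is a simple loop, sketched below with stand-in model and importance functions (the paper uses gradient-based importance and real NLP models): repeatedly delete the least important word as long as the model's prediction is unchanged.

```python
# Input reduction with toy stand-ins for the model and importance measure.
def input_reduction(words, predict, importance):
    """predict(words) -> label; importance(words, i) -> score for word i."""
    label = predict(words)
    while len(words) > 1:
        i = min(range(len(words)), key=lambda k: importance(words, k))
        reduced = words[:i] + words[i + 1:]
        if predict(reduced) != label:   # stop just before the prediction flips
            break
        words = reduced
    return words

# A "model" that keys on a single word, and a leave-one-out importance.
predict = lambda ws: "positive" if "great" in ws else "negative"
importance = lambda ws, i: 1.0 if ws[i] == "great" else 0.0
print(input_reduction("the movie was great fun".split(), predict, importance))
# -> ['great']: the reduced input looks nonsensical yet preserves the prediction
```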

Accelerating likelihood optimization for ICA on real signals

Title Accelerating likelihood optimization for ICA on real signals
Authors Pierre Ablin, Jean-François Cardoso, Alexandre Gramfort
Abstract We study optimization methods for solving the maximum likelihood formulation of independent component analysis (ICA). We consider both the problem constrained to white signals and the unconstrained problem. The Hessian of the objective function is costly to compute, which renders Newton’s method impractical for large data sets. Many algorithms proposed in the literature can be rewritten as quasi-Newton methods, for which the Hessian approximation is cheap to compute. These algorithms are very fast on simulated data where the linear mixture assumption really holds. However, on real signals, we observe that their rate of convergence can be severely impaired. In this paper, we investigate the origins of this behavior, and show that the recently proposed Preconditioned ICA for Real Data (Picard) algorithm overcomes this issue on both constrained and unconstrained problems.
Tasks
Published 2018-06-25
URL http://arxiv.org/abs/1806.09390v1
PDF http://arxiv.org/pdf/1806.09390v1.pdf
PWC https://paperswithcode.com/paper/accelerating-likelihood-optimization-for-ica
Repo
Framework
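
For reference, the maximum-likelihood objective in question can be written as follows (standard ML-ICA notation; $G_i$ is the negative log-density of source $i$):

```latex
% Maximum-likelihood ICA: W is the unmixing matrix, x_t the observed samples.
\[
  \mathcal{L}(W) \;=\; -\log\left|\det W\right|
  \;+\; \frac{1}{T}\sum_{t=1}^{T}\sum_{i=1}^{N} G_i\!\left(w_i^{\top}x_t\right)
\]
% With y_t = W x_t and score functions \psi_i = G_i', the relative gradient is
\[
  \nabla\mathcal{L} \;=\; \frac{1}{T}\sum_{t=1}^{T}\psi(y_t)\,y_t^{\top} \;-\; I,
\]
% which is cheap per iteration, whereas the exact Hessian requires costly
% higher-order statistics over all samples -- motivating the quasi-Newton
% approximations the paper analyzes.
```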

Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval

Title Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval
Authors Jing Yu, Yuhang Lu, Zengchang Qin, Yanbing Liu, Jianlong Tan, Li Guo, Weifeng Zhang
Abstract Cross-modal information retrieval aims to find heterogeneous data of various modalities from a given query of one modality. The main challenge is to map different modalities into a common semantic space, in which distance between concepts in different modalities can be well modeled. For cross-modal information retrieval between images and texts, existing work mostly uses off-the-shelf Convolutional Neural Network (CNN) for image feature extraction. For texts, word-level features such as bag-of-words or word2vec are employed to build deep learning models to represent texts. Besides word-level semantics, the semantic relations between words are also informative but less explored. In this paper, we model texts by graphs using similarity measure based on word2vec. A dual-path neural network model is proposed for coupled feature learning in cross-modal information retrieval. One path utilizes Graph Convolutional Network (GCN) for text modeling based on graph representations. The other path uses a neural network with layers of nonlinearities for image modeling based on off-the-shelf features. The model is trained by a pairwise similarity loss function to maximize the similarity of relevant text-image pairs and minimize the similarity of irrelevant pairs. Experimental results show that the proposed model outperforms the state-of-the-art methods significantly, with 17% improvement on accuracy for the best case.
Tasks Cross-Modal Information Retrieval, Information Retrieval
Published 2018-02-03
URL http://arxiv.org/abs/1802.00985v2
PDF http://arxiv.org/pdf/1802.00985v2.pdf
PWC https://paperswithcode.com/paper/modeling-text-with-graph-convolutional
Repo
Framework
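
A minimal sketch of a pairwise similarity loss of the kind the abstract describes (the margin value and the use of cosine similarity are our assumptions; the paper defines its own similarity in the common space): relevant text-image pairs are pulled together and irrelevant ones pushed apart.

```python
# Pairwise similarity loss sketch (hypothetical margin, cosine similarity).
import torch
import torch.nn.functional as F

def pairwise_similarity_loss(text_emb, image_emb, match, margin=0.2):
    """text_emb, image_emb: (batch, dim); match: (batch,) 1.0 if relevant pair."""
    sim = F.cosine_similarity(text_emb, image_emb)
    pos = (1 - sim) * match                                # pull relevant pairs together
    neg = torch.clamp(sim - margin, min=0) * (1 - match)   # push irrelevant pairs apart
    return (pos + neg).mean()

loss = pairwise_similarity_loss(
    torch.randn(8, 128), torch.randn(8, 128), torch.randint(0, 2, (8,)).float()
)
```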

MilkQA: a Dataset of Consumer Questions for the Task of Answer Selection

Title MilkQA: a Dataset of Consumer Questions for the Task of Answer Selection
Authors Marcelo Criscuolo, Erick Rocha Fonseca, Sandra Maria Aluísio, Ana Carolina Sperança-Criscuolo
Abstract We introduce MilkQA, a question answering dataset from the dairy domain dedicated to the study of consumer questions. The dataset contains 2,657 pairs of questions and answers, written in the Portuguese language and originally collected by the Brazilian Agricultural Research Corporation (Embrapa). All questions were motivated by real situations and written by thousands of authors with very different backgrounds and levels of literacy, while answers were elaborated by specialists from Embrapa’s customer service. Our dataset was filtered and anonymized by three human annotators. Consumer questions are a challenging kind of question, usually posed as a way of seeking information. Although several question answering datasets are available, most such resources are not suitable for research on answer selection models for consumer questions. We aim to fill this gap by making MilkQA publicly available. We study the behavior of four answer selection models on MilkQA: two baseline models and two convolutional neural network architectures. Our results show that MilkQA poses real challenges to computational models, particularly due to the linguistic characteristics of its questions and to their unusually long length. Only one of the models we experimented with gives reasonable results, at the cost of high computational requirements.
Tasks Answer Selection, Question Answering
Published 2018-01-10
URL http://arxiv.org/abs/1801.03460v1
PDF http://arxiv.org/pdf/1801.03460v1.pdf
PWC https://paperswithcode.com/paper/milkqa-a-dataset-of-consumer-questions-for
Repo
Framework

ADSaS: Comprehensive Real-time Anomaly Detection System

Title ADSaS: Comprehensive Real-time Anomaly Detection System
Authors Sooyeon Lee, Huy Kang Kim
Abstract With massive data growth, the need for an autonomous and generic anomaly detection system has increased. However, developing a stand-alone generic anomaly detection system that is accurate and fast is still a challenge. In this paper, we propose conventional time-series analysis approaches, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and Seasonal-Trend decomposition using Loess (STL), to detect complex and various anomalies. Usually, SARIMA and STL are used only for stationary and periodic time series, but we show that, by combining them, anomalies can be detected with high accuracy even for noisy and non-periodic data. We compare the algorithm to Long Short-Term Memory (LSTM), a deep-learning-based algorithm used in anomaly detection systems. We use a total of seven real-world datasets and four artificial datasets with different time-series properties to verify the performance of the proposed algorithm.
Tasks Anomaly Detection, Time Series, Time Series Analysis
Published 2018-11-30
URL http://arxiv.org/abs/1811.12634v1
PDF http://arxiv.org/pdf/1811.12634v1.pdf
PWC https://paperswithcode.com/paper/adsas-comprehensive-real-time-anomaly
Repo
Framework
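
A minimal sketch of combining STL and SARIMA for anomaly detection, in the spirit of the abstract (the SARIMA order, the 4-sigma rule, and the synthetic series below are hypothetical; ADSaS's actual pipeline may differ): decompose the series with STL, model the remainder with SARIMA, and flag points with extreme forecast residuals.

```python
# STL + SARIMA anomaly-detection sketch (hypothetical orders and threshold).
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
t = np.arange(400)
y = np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=400)
y[300] += 3.0                                  # injected anomaly
series = pd.Series(y, index=pd.date_range("2018-01-01", periods=400, freq="H"))

resid = STL(series, period=24).fit().resid     # remove season and trend
model = SARIMAX(resid, order=(1, 0, 1)).fit(disp=False)
errors = resid - model.fittedvalues
z = (errors - errors.mean()) / errors.std()
anomalies = series.index[np.abs(z) > 4]        # hypothetical 4-sigma rule
print(anomalies)
```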

ECO: Egocentric Cognitive Mapping

Title ECO: Egocentric Cognitive Mapping
Authors Jayant Sharma, Zixing Wang, Alberto Speranzon, Vijay Venkataraman, Hyun Soo Park
Abstract We present a new method to localize a camera within a previously unseen environment perceived from an egocentric point of view. Although this is, in general, an ill-posed problem, humans can effortlessly and efficiently determine their relative location and orientation and navigate previously unseen environments, e.g., finding a specific item in a new grocery store. To enable such a capability, we design a new egocentric representation, which we call ECO (Egocentric COgnitive map). ECO is biologically inspired by the cognitive map that enables human navigation, and it encodes the surrounding visual semantics with respect to both distance and orientation. ECO possesses three main properties: (1) reconfigurability: complex semantics and geometry are captured via the synthesis of atomic visual representations (e.g., image patches); (2) robustness: the visual semantics are registered in a geometrically consistent way (e.g., aligning with respect to the gravity vector, frontalizing, and rescaling to canonical depth), thus enabling us to learn meaningful atomic representations; (3) adaptability: a domain adaptation framework is designed to generalize the learned representation without manual calibration. As a proof of concept, we use ECO to localize a camera within real-world scenes (various grocery stores) and demonstrate performance improvements compared to existing semantic localization approaches.
Tasks Calibration, Domain Adaptation
Published 2018-12-02
URL http://arxiv.org/abs/1812.00312v1
PDF http://arxiv.org/pdf/1812.00312v1.pdf
PWC https://paperswithcode.com/paper/eco-egocentric-cognitive-mapping
Repo
Framework

Engineering a Simplified 0-Bit Consistent Weighted Sampling

Title Engineering a Simplified 0-Bit Consistent Weighted Sampling
Authors Edward Raff, Jared Sylvester, Charles Nicholas
Abstract The Min-Hashing approach to sketching has become an important tool in data analysis, information retrieval, and classification. To apply it to real-valued datasets, the ICWS algorithm has become a seminal approach that is widely used, and provides state-of-the-art performance for this problem space. However, ICWS suffers a computational burden as the sketch size K increases. We develop a new Simplified approach to the ICWS algorithm, which enables us to obtain over 20x speedups compared to the standard algorithm. The veracity of our approach is demonstrated empirically on multiple datasets and scenarios, showing that our new Simplified CWS obtains the same quality of results while being an order of magnitude faster.
Tasks
Published 2018-03-30
URL http://arxiv.org/abs/1804.00069v2
PDF http://arxiv.org/pdf/1804.00069v2.pdf
PWC https://paperswithcode.com/paper/engineering-a-simplified-0-bit-consistent
Repo
Framework
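
For context, one hash of 0-bit consistent weighted sampling in the ICWS style can be sketched as below (this follows Ioffe's sampling scheme; the paper's Simplified variant restructures the computation for speed, which is not reproduced here). Repeating with K seeds yields a K-element sketch whose collision rate estimates weighted-set similarity.

```python
# One 0-bit CWS hash in the ICWS style (Ioffe's sampling scheme).
import numpy as np

def zero_bit_cws(weights, seed):
    """One hash of a non-negative weight vector; repeat with K seeds for a sketch."""
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(weights > 0)
    # Draw per-dimension randomness for the full vector so hashes are
    # consistent across inputs with different sparsity patterns.
    r = rng.gamma(2.0, 1.0, size=len(weights))[idx]
    c = rng.gamma(2.0, 1.0, size=len(weights))[idx]
    beta = rng.uniform(0.0, 1.0, size=len(weights))[idx]
    t = np.floor(np.log(weights[idx]) / r + beta)
    ln_y = r * (t - beta)
    ln_a = np.log(c) - ln_y - r
    return idx[np.argmin(ln_a)]        # 0-bit: keep only the argmin index k*

x = np.array([0.0, 2.5, 1.0, 0.0, 4.0])
sketch = [zero_bit_cws(x, seed=s) for s in range(32)]  # K = 32 hashes
```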