October 21, 2019


Paper Group AWR 109


Semi-Analytic Resampling in Lasso


Title	Semi-Analytic Resampling in Lasso
Authors	Tomoyuki Obuchi, Yoshiyuki Kabashima
Abstract	An approximate method for conducting resampling in Lasso, the $\ell_1$ penalized linear regression, in a semi-analytic manner is developed, whereby the average over the resampled datasets is directly computed without repeated numerical sampling, thus enabling an inference free of the statistical fluctuations due to sampling finiteness, as well as a significant reduction of computational time. The proposed method is based on a message passing type algorithm, and its fast convergence is guaranteed by the state evolution analysis, when covariates are provided as zero-mean independently and identically distributed Gaussian random variables. It is employed to implement bootstrapped Lasso (Bolasso) and stability selection, both of which are variable selection methods using resampling in conjunction with Lasso, and resolves their disadvantage regarding computational cost. To examine approximation accuracy and efficiency, numerical experiments were carried out using simulated datasets. Moreover, an application to a real-world dataset, the wine quality dataset, is presented. To process such real-world datasets, an objective criterion for determining the relevance of selected variables is also introduced by the addition of noise variables and resampling.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10254v2
PDF	http://arxiv.org/pdf/1802.10254v2.pdf
PWC	https://paperswithcode.com/paper/semi-analytic-resampling-in-lasso
Repo	https://github.com/T-Obuchi/AMPR_lasso_matlab
Framework	none
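
For context, the resampling procedure that the abstract says is replaced by a semi-analytic average can be sketched naively as follows. This is not the authors' message passing algorithm: it is the brute-force baseline (bootstrap, refit Lasso, count how often each variable is selected) that underlies Bolasso and stability selection. The function names, the coordinate-descent solver, and the default parameters are illustrative assumptions.

```python
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=50):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, with beta starting at zero
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column cannot be selected
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta


def selection_frequencies(X, y, lam, n_resamples=50, seed=0):
    """Fraction of bootstrap resamples in which each variable gets a nonzero weight."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_coordinate_descent([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_resamples for c in counts]
```

Every resample requires a full Lasso fit, which is the computational cost the abstract describes the semi-analytic method as avoiding.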

Referring Relationships


Title	Referring Relationships
Authors	Ranjay Krishna, Ines Chami, Michael Bernstein, Li Fei-Fei
Abstract	Images are not simply sets of objects: each image represents a web of interconnected relationships. These relationships between entities carry semantic meaning and help a viewer differentiate between instances of an entity. For example, in an image of a soccer match, there may be multiple persons present, but each participates in different relationships: one is kicking the ball, and the other is guarding the goal. In this paper, we formulate the task of utilizing these “referring relationships” to disambiguate between entities of the same category. We introduce an iterative model that localizes the two entities in the referring relationship, conditioned on one another. We formulate the cyclic condition between the entities in a relationship by modelling predicates that connect the entities as shifts in attention from one entity to another. We demonstrate that our model can not only outperform existing approaches on three datasets — CLEVR, VRD and Visual Genome — but also that it produces visually meaningful predicate shifts, as an instance of interpretable neural networks. Finally, we show that by modelling predicates as attention shifts, we can even localize entities in the absence of their category, allowing our model to find completely unseen categories.
Tasks
Published	2018-03-28
URL	http://arxiv.org/abs/1803.10362v2
PDF	http://arxiv.org/pdf/1803.10362v2.pdf
PWC	https://paperswithcode.com/paper/referring-relationships
Repo	https://github.com/shikorab/DSG
Framework	tf

Text normalization using memory augmented neural networks


Title	Text normalization using memory augmented neural networks
Authors	Subhojeet Pramanik, Aman Hussain
Abstract	We perform text normalization, i.e. the transformation of words from the written to the spoken form, using a memory augmented neural network. With the addition of a dynamic memory access and storage mechanism, we present a neural architecture that will serve as a language-agnostic text normalization system while avoiding the kind of unacceptable errors made by LSTM-based recurrent neural networks. By successfully reducing the frequency of such mistakes, we show that this novel architecture is indeed a better alternative. Our proposed system requires significantly smaller amounts of data, training time and compute resources. Additionally, we perform data up-sampling, circumventing the data sparsity problem in some semiotic classes, to show that sufficient examples in any particular class can improve the performance of our text normalization system. Although a few occurrences of these errors still remain in certain semiotic classes, we demonstrate that memory augmented networks with meta-learning capabilities can open many doors to a superior text normalization system.
Tasks	Meta-Learning
Published	2018-05-31
URL	http://arxiv.org/abs/1806.00044v3
PDF	http://arxiv.org/pdf/1806.00044v3.pdf
PWC	https://paperswithcode.com/paper/text-normalization-using-memory-augmented
Repo	https://github.com/cognibit/Text-Normalization-Demo
Framework	tf

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation


Title	ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
Authors	Sachin Mehta, Mohammad Rastegari, Anat Caspi, Linda Shapiro, Hannaneh Hajishirzi
Abstract	We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power. ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the state-of-the-art semantic segmentation network PSPNet, while its category-wise accuracy is only 8% less. We evaluated ESPNet on a variety of semantic segmentation datasets including Cityscapes, PASCAL VOC, and a breast biopsy whole slide image dataset. Under the same constraints on memory and computation, ESPNet outperforms all the current efficient CNN networks such as MobileNet, ShuffleNet, and ENet on both standard metrics and our newly introduced performance metrics that measure efficiency on edge devices. Our network can process high resolution images at a rate of 112 and 9 frames per second on a standard GPU and edge device, respectively.
Tasks	Real-Time Semantic Segmentation, Semantic Segmentation
Published	2018-03-19
URL	http://arxiv.org/abs/1803.06815v3
PDF	http://arxiv.org/pdf/1803.06815v3.pdf
PWC	https://paperswithcode.com/paper/espnet-efficient-spatial-pyramid-of-dilated
Repo	https://github.com/adichaloo/EdgeNet
Framework	pytorch
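
The abstract does not spell out the internals of the ESP module, so the following is only a sketch of the building block named in the title: a dilated convolution, plus a simplified "pyramid" that sums branches with different dilation rates. It is one-dimensional for brevity, and the function names, the odd-length kernel, and the summation used to fuse branches are assumptions for illustration.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are spaced `dilation` samples apart."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated cross-correlation with zero padding (odd kernel assumed)."""
    k = len(kernel)
    half = (k // 2) * dilation
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for j in range(k):
            idx = t - half + j * dilation
            if 0 <= idx < len(signal):  # taps outside the signal read as zero
                acc += kernel[j] * signal[idx]
        out.append(acc)
    return out


def pyramid_of_dilated_convs(signal, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilation rates and sum the branches."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]
```

A 3-tap kernel with dilation 4 covers 9 input samples while still using only 3 weights, which is how dilation enlarges the receptive field without adding parameters.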

A Recurrent Graph Neural Network for Multi-Relational Data


Title	A Recurrent Graph Neural Network for Multi-Relational Data
Authors	Vassilis N. Ioannidis, Antonio G. Marques, Georgios B. Giannakis
Abstract	The era of data deluge has sparked the interest in graph-based learning methods in a number of disciplines such as sociology, biology, neuroscience, or engineering. In this paper, we introduce a graph recurrent neural network (GRNN) for scalable semi-supervised learning from multi-relational data. Key aspects of the novel GRNN architecture are the use of multi-relational graphs, the dynamic adaptation to the different relations via learnable weights, and the consideration of graph-based regularizers to promote smoothness and alleviate over-parametrization. Our ultimate goal is to design a powerful learning architecture able to: discover complex and highly non-linear data associations, combine (and select) multiple types of relations, and scale gracefully with respect to the size of the graph. Numerical tests with real data sets corroborate the design goals and illustrate the performance gains relative to competing alternatives.
Tasks
Published	2018-11-05
URL	http://arxiv.org/abs/1811.02061v3
PDF	http://arxiv.org/pdf/1811.02061v3.pdf
PWC	https://paperswithcode.com/paper/a-recurrent-graph-neural-network-for-multi
Repo	https://github.com/bioannidis/adaptive_recurrent_graph_neural_network
Framework	tf

Modular meta-learning


Title	Modular meta-learning
Authors	Ferran Alet, Tomás Lozano-Pérez, Leslie P. Kaelbling
Abstract	Many prediction problems, such as those that arise in the context of robotics, have a simplifying underlying structure that, if known, could accelerate learning. In this paper, we present a strategy for learning a set of neural network modules that can be combined in different ways. We train different modular structures on a set of related tasks and generalize to new tasks by composing the learned modules in new ways. By reusing modules to generalize we achieve combinatorial generalization, akin to the “infinite use of finite means” displayed in language. Finally, we show this improves performance in two robotics-related problems.
Tasks	Meta-Learning
Published	2018-06-26
URL	https://arxiv.org/abs/1806.10166v2
PDF	https://arxiv.org/pdf/1806.10166v2.pdf
PWC	https://paperswithcode.com/paper/modular-meta-learning
Repo	https://github.com/FerranAlet/modular-metalearning
Framework	pytorch
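
The idea of generalizing by recombining learned modules can be illustrated with a toy sketch. Here the modules are fixed scalar functions standing in for trained networks, and a new task is solved by exhaustively searching over chains of modules. The module library, the fixed chain depth, and the exhaustive search are assumptions for illustration; the abstract does not describe the search procedure the authors use.

```python
from itertools import product


def compose(modules):
    """Chain modules left to right: the output of one feeds the next."""
    def composed(x):
        for module in modules:
            x = module(x)
        return x
    return composed


def best_structure(module_library, data, depth=2):
    """Return the module-name sequence with the lowest mean squared error on `data`."""
    best_names, best_loss = None, float("inf")
    for names in product(sorted(module_library), repeat=depth):
        f = compose([module_library[name] for name in names])
        loss = sum((f(x) - y) ** 2 for x, y in data) / len(data)
        if loss < best_loss:
            best_names, best_loss = names, loss
    return best_names, best_loss
```

With a hypothetical library containing "inc" (add one), "double" and "square", the task y = (x + 1)^2 is fit exactly by the chain ("inc", "square"), even though no single module represents it.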

Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks


Title	Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks
Authors	Shivang Agarwal, Jean Ogier Du Terrail, Frédéric Jurie
Abstract	Object detection, the computer vision task dealing with detecting instances of objects of a certain class (e.g., ‘car’, ‘plane’, etc.) in images, attracted a lot of attention from the community during the last 5 years. This strong interest can be explained not only by the importance this task has for many applications but also by the phenomenal advances in this area since the arrival of deep convolutional neural networks (DCNN). This article reviews the recent literature on object detection with deep CNN, in a comprehensive way, and provides an in-depth view of these recent advances. The survey covers not only the typical architectures (SSD, YOLO, Faster-RCNN) but also discusses the challenges currently met by the community and goes on to show how the problem of object detection can be extended. This survey also reviews the public datasets and associated state-of-the-art algorithms.
Tasks	Object Detection
Published	2018-09-10
URL	https://arxiv.org/abs/1809.03193v2
PDF	https://arxiv.org/pdf/1809.03193v2.pdf
PWC	https://paperswithcode.com/paper/recent-advances-in-object-detection-in-the
Repo	https://github.com/atheheath/papers
Framework	none

Conditional BERT Contextual Augmentation


Title	Conditional BERT Contextual Augmentation
Authors	Xing Wu, Shangwen Lv, Liangjun Zang, Jizhong Han, Songlin Hu
Abstract	We propose a novel data augmentation method for labeled sentences called conditional BERT contextual augmentation. Data augmentation methods are often applied to prevent overfitting and improve generalization of deep neural network models. Recently proposed contextual augmentation augments labeled sentences by randomly replacing words with more varied substitutions predicted by a language model. BERT demonstrates that a deep bidirectional language model is more powerful than either a unidirectional language model or the shallow concatenation of a forward and backward model. We retrofit BERT to conditional BERT by introducing a new conditional masked language model task. (The term “conditional masked language model” appears once in the original BERT paper, where it indicates context-conditional and is equivalent to the term “masked language model”. In our paper, “conditional masked language model” indicates that we apply an extra label-conditional constraint to the “masked language model”.) The well-trained conditional BERT can be applied to enhance contextual augmentation. Experiments on six different text classification tasks show that our method can be easily applied to both convolutional and recurrent neural network classifiers to obtain obvious improvement.
Tasks	Data Augmentation, Language Modelling, Text Classification
Published	2018-12-17
URL	http://arxiv.org/abs/1812.06705v1
PDF	http://arxiv.org/pdf/1812.06705v1.pdf
PWC	https://paperswithcode.com/paper/conditional-bert-contextual-augmentation
Repo	https://github.com/Gal1eo/DD2424
Framework	pytorch

Generalizable Adversarial Training via Spectral Normalization


Title	Generalizable Adversarial Training via Spectral Normalization
Authors	Farzan Farnia, Jesse M. Zhang, David Tse
Abstract	Deep neural networks (DNNs) have set benchmarks on a wide array of supervised learning tasks. Trained DNNs, however, often lack robustness to minor adversarial perturbations to the input, which undermines their true practicality. Recent works have increased the robustness of DNNs by fitting networks using adversarially-perturbed training samples, but the improved performance can still be far below the performance seen in non-adversarial settings. A significant portion of this gap can be attributed to the decrease in generalization performance due to adversarial training. In this work, we extend the notion of margin loss to adversarial settings and bound the generalization error for DNNs trained under several well-known gradient-based attack schemes, motivating an effective regularization scheme based on spectral normalization of the DNN’s weight matrices. We also provide a computationally-efficient method for normalizing the spectral norm of convolutional layers with arbitrary stride and padding schemes in deep convolutional networks. We evaluate the power of spectral normalization extensively on combinations of datasets, network architectures, and adversarial training schemes. The code is available at https://github.com/jessemzhang/dl_spectral_normalization.
Tasks
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07457v1
PDF	http://arxiv.org/pdf/1811.07457v1.pdf
PWC	https://paperswithcode.com/paper/generalizable-adversarial-training-via
Repo	https://github.com/jessemzhang/dl_spectral_normalization
Framework	tf
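
As a rough illustration of what normalizing the spectral norm of a weight matrix involves, the sketch below estimates the largest singular value of a dense matrix by power iteration and rescales the matrix so that this value becomes 1. It covers only a plain dense matrix stored as a list of rows, not the convolutional-layer method the abstract mentions. The function names, the iteration count, and the choice to divide by the estimate (rather than cap it at some target) are assumptions; the authors' repository is the reference for their actual scheme.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_transposed(W, u):
    """Compute W.T @ u without building the transpose."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def normalize(v):
    """Scale a nonzero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def spectral_norm(W, n_iters=100, seed=0):
    """Estimate the largest singular value of a nonzero matrix by power iteration."""
    rng = random.Random(seed)
    v = normalize([rng.gauss(0, 1) for _ in W[0]])
    for _ in range(n_iters):
        u = normalize(matvec(W, v))
        v = normalize(matvec_transposed(W, u))
    return math.sqrt(sum(x * x for x in matvec(W, v)))


def spectrally_normalize(W, n_iters=100):
    """Rescale W so that its estimated spectral norm equals 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]
```

In a training loop the estimate is typically refreshed with only a few iterations per step, reusing the previous vector; the fixed 100 iterations here simply keep the sketch accurate in isolation.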

Automatic salt deposits segmentation: A deep learning approach


Title	Automatic salt deposits segmentation: A deep learning approach
Authors	Mikhail Karchevskiy, Insaf Ashrapov, Leonid Kozinkin
Abstract	One of the most important applications of seismic reflection is hydrocarbon exploration, which is closely related to salt deposits analysis. This problem is very important even nowadays due to its non-linear nature. Taking into account the recent developments in deep learning networks, TGS-NOPEC Geophysical Company hosted the Kaggle competition for the salt deposits segmentation problem in seismic image data. In this paper, we demonstrate the great performance of several novel deep learning techniques merged into a single neural network which achieved the 27th place (top 1%) in the mentioned competition. Using a U-Net with ResNeXt-50 encoder pre-trained on ImageNet as our base architecture, we implemented Spatial-Channel Squeeze & Excitation, Lovasz loss, CoordConv and Hypercolumn methods. The source code for our solution is made publicly available at https://github.com/K-Mike/Automatic-salt-deposits-segmentation.
Tasks
Published	2018-11-21
URL	http://arxiv.org/abs/1812.01429v1
PDF	http://arxiv.org/pdf/1812.01429v1.pdf
PWC	https://paperswithcode.com/paper/181201429
Repo	https://github.com/woans0104/sk_project
Framework	none

Lipizzaner: A System That Scales Robust Generative Adversarial Network Training


Title	Lipizzaner: A System That Scales Robust Generative Adversarial Network Training
Authors	Tom Schmiedlechner, Ignavier Ng Zhi Yong, Abdullah Al-Dujaili, Erik Hemberg, Una-May O’Reilly
Abstract	GANs are difficult to train due to convergence pathologies such as mode and discriminator collapse. We introduce Lipizzaner, an open source software system that allows machine learning engineers to train GANs in a distributed and robust way. Lipizzaner distributes a competitive coevolutionary algorithm which, by virtue of dual, adapting, generator and discriminator populations, is robust to collapses. The algorithm is well suited to efficient distribution because it uses a spatial grid abstraction. Training is local to each cell and strong intermediate training results are exchanged among overlapping neighborhoods allowing high performing solutions to propagate and improve with more rounds of training. Experiments on common image datasets overcome critical collapses. Communication overhead scales linearly when increasing the number of compute instances and we observe that increasing scale leads to improved model performance.
Tasks
Published	2018-11-30
URL	http://arxiv.org/abs/1811.12843v1
PDF	http://arxiv.org/pdf/1811.12843v1.pdf
PWC	https://paperswithcode.com/paper/lipizzaner-a-system-that-scales-robust
Repo	https://github.com/ALFA-group/lipizzaner-gan
Framework	pytorch
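
The spatial grid abstraction can be illustrated with a toy sketch in which each cell holds a single score (lower is better) and, in every round, adopts the best score found in its overlapping neighborhood. The wrap-around grid, the five-cell neighborhood, and the use of scalar scores in place of generator and discriminator populations are assumptions for illustration and are not taken from the Lipizzaner code base.

```python
def neighborhood(row, col, n_rows, n_cols):
    """The cell itself plus its four neighbors on a wrap-around grid."""
    return [
        (row, col),
        ((row - 1) % n_rows, col),
        ((row + 1) % n_rows, col),
        (row, (col - 1) % n_cols),
        (row, (col + 1) % n_cols),
    ]


def exchange_best(scores, n_rows, n_cols):
    """One synchronous round: every cell adopts the lowest score in its neighborhood."""
    updated = {}
    for row in range(n_rows):
        for col in range(n_cols):
            cells = neighborhood(row, col, n_rows, n_cols)
            updated[(row, col)] = min(scores[cell] for cell in cells)
    return updated
```

Because neighborhoods overlap, a strong result in one cell reaches distant cells over successive rounds while each cell only ever communicates with its immediate neighbors.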

ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding


Title	ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding
Authors	Ning Liu, Yongchao Long, Changqing Zou, Qun Niu, Li Pan, Hefeng Wu
Abstract	We propose an attention-injective deformable convolutional network called ADCrowdNet for crowd understanding that can address the accuracy degradation problem of highly congested noisy scenes. ADCrowdNet contains two concatenated networks. An attention-aware network called Attention Map Generator (AMG) first detects crowd regions in images and computes the congestion degree of these regions. Based on detected crowd regions and congestion priors, a multi-scale deformable network called Density Map Estimator (DME) then generates high-quality density maps. With the attention-aware training scheme and multi-scale deformable convolutional scheme, the proposed ADCrowdNet captures crowd features more effectively and is more resistant to various kinds of noise. We have evaluated our method on four popular crowd counting datasets (ShanghaiTech, UCF_CC_50, WorldEXPO’10, and UCSD) and an extra vehicle counting dataset TRANCOS, and our approach beats existing state-of-the-art approaches on all of these datasets.
Tasks	Crowd Counting
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11968v5
PDF	http://arxiv.org/pdf/1811.11968v5.pdf
PWC	https://paperswithcode.com/paper/adcrowdnet-an-attention-injective-deformable
Repo	https://github.com/BIGKnight/ADCrowd_pytorch_implementation
Framework	pytorch

Group Anomaly Detection using Deep Generative Models


Title	Group Anomaly Detection using Deep Generative Models
Authors	Raghavendra Chalapathy, Edward Toth, Sanjay Chawla
Abstract	Unlike conventional anomaly detection research that focuses on point anomalies, our goal is to detect anomalous collections of individual data points. In particular, we perform group anomaly detection (GAD) with an emphasis on irregular group distributions (e.g. irregular mixtures of image pixels). GAD is an important task in detecting unusual and anomalous phenomena in real-world applications such as high energy particle physics, social media, and medical imaging. In this paper, we take a generative approach by proposing deep generative models: Adversarial autoencoder (AAE) and variational autoencoder (VAE) for group anomaly detection. Both AAE and VAE detect group anomalies using point-wise input data where group memberships are known a priori. We conduct extensive experiments to evaluate our models on real-world datasets. The empirical results demonstrate that our approach is effective and robust in detecting group anomalies.
Tasks	Anomaly Detection, Group Anomaly Detection
Published	2018-04-13
URL	http://arxiv.org/abs/1804.04876v1
PDF	http://arxiv.org/pdf/1804.04876v1.pdf
PWC	https://paperswithcode.com/paper/group-anomaly-detection-using-deep-generative
Repo	https://github.com/raghavchalapathy/gad
Framework	none

Less is more: sampling chemical space with active learning


Title	Less is more: sampling chemical space with active learning
Authors	Justin S. Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, Adrian E. Roitberg
Abstract	The development of accurate and transferable machine learning (ML) potentials for predicting molecular energetics is a challenging task. The process of data generation to train such ML potentials is a task neither well understood nor researched in detail. In this work, we present a fully automated approach for the generation of datasets with the intent of training universal ML potentials. It is based on the concept of active learning (AL) via Query by Committee (QBC), which uses the disagreement between an ensemble of ML potentials to infer the reliability of the ensemble’s prediction. QBC allows the presented AL algorithm to automatically sample regions of chemical space where the ML potential fails to accurately predict the potential energy. AL improves the overall fitness of ANAKIN-ME (ANI) deep learning potentials in rigorous test cases by mitigating human biases in deciding what new training data to use. AL also reduces the training set size to a fraction of the data required when using naive random sampling techniques. To provide validation of our AL approach we develop the COMP6 benchmark (publicly available on GitHub), which contains a diverse set of organic molecules. Through the AL process, it is shown that the AL-based potentials perform as well as the ANI-1 potential on COMP6 with only 10% of the data, and vastly outperform ANI-1 with 25% of the data. Finally, we show that our proposed AL technique develops a universal ANI potential (ANI-1x) that provides accurate energy and force predictions on the entire COMP6 benchmark. This universal ML potential achieves a level of accuracy on par with the best ML potentials for single molecules or materials, while remaining applicable to the general class of organic molecules composed of the elements CHNO.
Tasks	Active Learning
Published	2018-01-28
URL	http://arxiv.org/abs/1801.09319v2
PDF	http://arxiv.org/pdf/1801.09319v2.pdf
PWC	https://paperswithcode.com/paper/less-is-more-sampling-chemical-space-with
Repo	https://github.com/isayev/ANI1_dataset
Framework	none
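
The Query by Committee step described in the abstract can be sketched as follows: each committee member predicts a value for a candidate input, and candidates on which the members disagree strongly are selected for new training data. Using the population standard deviation as the disagreement measure, a fixed threshold, and plain callables as stand-ins for ML potentials are assumptions for illustration; the abstract does not specify the authors' exact criterion.

```python
import statistics


def committee_disagreement(predictions):
    """Population standard deviation of the committee's predictions for one input."""
    return statistics.pstdev(predictions)


def select_for_labeling(candidates, committee, threshold):
    """Return the candidates on which the committee disagrees by more than `threshold`."""
    selected = []
    for x in candidates:
        predictions = [model(x) for model in committee]
        if committee_disagreement(predictions) > threshold:
            selected.append(x)
    return selected
```

Inputs where all members agree are skipped, so labeling effort goes to the regions where the ensemble is least reliable, which is the mechanism the abstract credits for the smaller training set.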

Solving Atari Games Using Fractals And Entropy


Title	Solving Atari Games Using Fractals And Entropy
Authors	Sergio Hernandez Cerezo, Guillem Duran Ballester, Spiros Baxevanakis
Abstract	In this paper, we introduce a novel MCTS-based approach that is derived from the laws of thermodynamics. The algorithm, coined Fractal Monte Carlo (FMC), allows us to create an agent that takes intelligent actions in both continuous and discrete environments while providing control over every aspect of the agent behavior. Results show that FMC is several orders of magnitude more efficient than similar techniques, such as MCTS, in the Atari games tested.
Tasks	Atari Games
Published	2018-07-03
URL	http://arxiv.org/abs/1807.01081v1
PDF	http://arxiv.org/pdf/1807.01081v1.pdf
PWC	https://paperswithcode.com/paper/solving-atari-games-using-fractals-and
Repo	https://github.com/FragileTheory/FractalAI
Framework	none