October 20, 2019

3177 words 15 mins read

Paper Group AWR 236

Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials. Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding. Deep Anomaly Detection with Outlier Exposure. Polarity Loss for Zero-shot Object Detection. Learning Approximate Inference Networks for Structured Predicti …

Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials

Title Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials
Authors Nicholas R. Waytowich, Vernon Lawhern, Javier O. Garcia, Jennifer Cummings, Josef Faller, Paul Sajda, Jean M. Vettel
Abstract Steady-State Visual Evoked Potentials (SSVEPs) are neural oscillations from the parietal and occipital regions of the brain that are evoked from flickering visual stimuli. SSVEPs are robust signals measurable in the electroencephalogram (EEG) and are commonly used in brain-computer interfaces (BCIs). However, methods for high-accuracy decoding of SSVEPs usually require hand-crafted approaches that leverage domain-specific knowledge of the stimulus signals, such as specific temporal frequencies in the visual stimuli and their relative spatial arrangement. When this knowledge is unavailable, such as when SSVEP signals are acquired asynchronously, such approaches tend to fail. In this paper, we show how a compact convolutional neural network (Compact-CNN), which only requires raw EEG signals for automatic feature extraction, can be used to decode signals from a 12-class SSVEP dataset without the need for any domain-specific knowledge or calibration data. We report across subject mean accuracy of approximately 80% (chance being 8.3%) and show this is substantially better than current state-of-the-art hand-crafted approaches using canonical correlation analysis (CCA) and Combined-CCA. Furthermore, we analyze our Compact-CNN to examine the underlying feature representation, discovering that the deep learner extracts additional phase and amplitude related features associated with the structure of the dataset. We discuss how our Compact-CNN shows promise for BCI applications that allow users to freely gaze/attend to any stimulus at any time (e.g., asynchronous BCI) as well as provides a method for analyzing SSVEP signals in a way that might augment our understanding about the basic processing in the visual cortex.
Tasks Calibration, EEG
Published 2018-03-12
URL http://arxiv.org/abs/1803.04566v2
PDF http://arxiv.org/pdf/1803.04566v2.pdf
PWC https://paperswithcode.com/paper/compact-convolutional-neural-networks-for
Repo https://github.com/vlawhern/arl-eegmodels
Framework tf
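The linked repository (arl-eegmodels) provides Keras implementations of compact EEG CNNs of this kind. Below is a minimal sketch of such a compact architecture, assuming raw multi-channel EEG input; the filter counts, kernel lengths, and pooling sizes are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of an EEGNet-style compact CNN for 12-class SSVEP decoding.
# Layer sizes and hyperparameters are illustrative assumptions, not the paper's values.
import tensorflow as tf
from tensorflow.keras import layers, models

def compact_cnn(n_channels=8, n_samples=1024, n_classes=12):
    inp = layers.Input(shape=(n_channels, n_samples, 1))
    # Temporal convolution learns frequency-selective filters from raw EEG
    x = layers.Conv2D(8, (1, 64), padding='same', use_bias=False)(inp)
    x = layers.BatchNormalization()(x)
    # Depthwise spatial convolution learns per-filter spatial patterns across electrodes
    x = layers.DepthwiseConv2D((n_channels, 1), depth_multiplier=2, use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('elu')(x)
    x = layers.AveragePooling2D((1, 4))(x)
    x = layers.Dropout(0.5)(x)
    # Separable convolution mixes feature maps cheaply
    x = layers.SeparableConv2D(16, (1, 16), padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('elu')(x)
    x = layers.AveragePooling2D((1, 8))(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Flatten()(x)
    out = layers.Dense(n_classes, activation='softmax')(x)
    return models.Model(inp, out)

model = compact_cnn()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```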

Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding

Title Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding
Authors Kyle Hundman, Valentino Constantinou, Christopher Laporte, Ian Colwell, Tom Soderstrom
Abstract As spacecraft send back increasing amounts of telemetry data, improved anomaly detection systems are needed to lessen the monitoring burden placed on operations engineers and reduce operational risk. Current spacecraft monitoring systems only target a subset of anomaly types and often require costly expert knowledge to develop and maintain due to challenges involving scale and complexity. We demonstrate the effectiveness of Long Short-Term Memory (LSTM) networks, a type of Recurrent Neural Network (RNN), in overcoming these issues using expert-labeled telemetry anomaly data from the Soil Moisture Active Passive (SMAP) satellite and the Mars Science Laboratory (MSL) rover, Curiosity. We also propose a complementary unsupervised and nonparametric anomaly thresholding approach developed during a pilot implementation of an anomaly detection system for SMAP, and offer false positive mitigation strategies along with other key improvements and lessons learned during development.
Tasks Anomaly Detection
Published 2018-02-13
URL http://arxiv.org/abs/1802.04431v3
PDF http://arxiv.org/pdf/1802.04431v3.pdf
PWC https://paperswithcode.com/paper/detecting-spacecraft-anomalies-using-lstms
Repo https://github.com/PKUZHOU/anomaly_det
Framework tf
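As a rough illustration of the unsupervised thresholding idea (LSTM prediction errors, candidate thresholds of the form mean + z*std, and a score that rewards removing a large share of the error mass while penalizing how many points get flagged), here is a simplified NumPy sketch. The scoring term is a simplification of the paper's criterion, which also accounts for contiguous anomalous sequences.

```python
# Hedged sketch of the paper's unsupervised thresholding idea (not the authors' exact code):
# smooth LSTM prediction errors, then pick a threshold eps = mean + z*std that maximizes
# the drop in mean/std of the remaining errors, penalized by how many points it flags.
import numpy as np

def dynamic_threshold(errors, z_candidates=np.arange(2.0, 10.0, 0.5)):
    mu, sigma = errors.mean(), errors.std()
    best_eps, best_score = None, -np.inf
    for z in z_candidates:
        eps = mu + z * sigma
        below = errors[errors < eps]
        n_above = int((errors >= eps).sum())
        if n_above == 0 or len(below) == 0:
            continue
        delta_mu = mu - below.mean()
        delta_sigma = sigma - below.std()
        # Reward thresholds that remove much of the mean/std, penalize many flagged points
        score = (delta_mu / mu + delta_sigma / sigma) / (n_above + n_above ** 2)
        if score > best_score:
            best_score, best_eps = score, eps
    return best_eps

# Usage: errors would be exponentially smoothed |y_true - y_pred| from the telemetry LSTM
errors = np.abs(np.random.randn(1000)) * 0.1
errors[::200] += 2.0  # injected spikes for illustration
print(dynamic_threshold(errors))
```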

Deep Anomaly Detection with Outlier Exposure

Title Deep Anomaly Detection with Outlier Exposure
Authors Dan Hendrycks, Mantas Mazeika, Thomas Dietterich
Abstract It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
Tasks Anomaly Detection
Published 2018-12-11
URL http://arxiv.org/abs/1812.04606v3
PDF http://arxiv.org/pdf/1812.04606v3.pdf
PWC https://paperswithcode.com/paper/deep-anomaly-detection-with-outlier-exposure
Repo https://github.com/hendrycks/outlier-exposure
Framework pytorch
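The OE objective adds, to the standard cross-entropy on in-distribution data, a term that pushes predictions on auxiliary outliers toward the uniform distribution. A minimal PyTorch sketch of that objective for a softmax classifier (the weight lambda_oe=0.5 matches the paper's vision default, but treat the exact setup here as an assumption):

```python
# Hedged sketch of the Outlier Exposure objective for a softmax classifier (PyTorch).
import torch
import torch.nn.functional as F

def outlier_exposure_loss(logits_in, targets_in, logits_out, lambda_oe=0.5):
    # Standard cross-entropy on in-distribution samples
    loss_in = F.cross_entropy(logits_in, targets_in)
    # Cross-entropy to the uniform distribution on auxiliary outliers:
    # -(1/K) * sum_k log softmax(logits_out)_k, averaged over the outlier batch
    log_probs_out = F.log_softmax(logits_out, dim=1)
    loss_out = -log_probs_out.mean(dim=1).mean()
    return loss_in + lambda_oe * loss_out

# Usage with a batch of in-distribution and outlier logits from the same network
logits_in = torch.randn(32, 10)
targets_in = torch.randint(0, 10, (32,))
logits_out = torch.randn(32, 10)
print(outlier_exposure_loss(logits_in, targets_in, logits_out))
```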

Polarity Loss for Zero-shot Object Detection

Title Polarity Loss for Zero-shot Object Detection
Authors Shafin Rahman, Salman Khan, Nick Barnes
Abstract Zero-shot object detection is an emerging research topic that aims to recognize and localize previously ‘unseen’ objects. This setting gives rise to several unique challenges, e.g., highly imbalanced positive vs. negative instance ratio, ambiguity between background and unseen classes and the proper alignment between visual and semantic concepts. Here, we propose an end-to-end deep learning framework underpinned by a novel loss function that seeks to properly align the visual and semantic cues for improved zero-shot learning. We call our objective the ‘Polarity loss’ because it explicitly maximizes the gap between positive and negative predictions. Such a margin maximizing formulation is not only important for visual-semantic alignment but it also resolves the ambiguity between background and unseen objects. Our approach is inspired by the embodiment theories in cognitive science, that claim human semantic understanding to be grounded in past experiences (seen objects), related linguistic concepts (word dictionary) and the perception of the physical world (visual imagery). To this end, we learn to attend to a dictionary of related semantic concepts that eventually refines the noisy semantic embeddings and helps establish a better synergy between visual and semantic domains. Our extensive results on MS-COCO and Pascal VOC datasets show as high as 14x mAP improvement over state of the art.
Tasks Object Detection, Zero-Shot Learning, Zero-Shot Object Detection
Published 2018-11-22
URL http://arxiv.org/abs/1811.08982v2
PDF http://arxiv.org/pdf/1811.08982v2.pdf
PWC https://paperswithcode.com/paper/polarity-loss-for-zero-shot-object-detection
Repo https://github.com/salman-h-khan/PL-ZSD_Release
Framework tf
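The following PyTorch sketch only illustrates the margin-maximizing intuition: a base classification loss is re-weighted by a penalty that grows when the gap between the positive score and the hardest negative score is small. It is not the paper's exact Polarity loss, which is built on a focal-style detection loss together with a learned vocabulary attention module.

```python
# Simplified sketch of the margin-maximizing idea behind the Polarity loss (PyTorch).
# NOT the paper's exact formulation; for illustration of the re-weighting idea only.
import torch
import torch.nn.functional as F

def polarity_style_loss(scores, target, beta=5.0):
    # scores: (batch, num_classes) raw class scores; target: (batch,) ground-truth class ids
    probs = torch.sigmoid(scores)
    pos = probs.gather(1, target.unsqueeze(1)).squeeze(1)                 # positive-class probability
    neg = probs.scatter(1, target.unsqueeze(1), 0.0).max(dim=1).values    # hardest negative
    gap = pos - neg                                                       # we want this gap to be large
    # Monotone penalty: weight grows as the gap shrinks or becomes negative
    polarity_weight = torch.sigmoid(-beta * gap)
    base = F.cross_entropy(scores, target, reduction='none')
    return (polarity_weight * base).mean()

scores = torch.randn(8, 20, requires_grad=True)
target = torch.randint(0, 20, (8,))
polarity_style_loss(scores, target).backward()
```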

Learning Approximate Inference Networks for Structured Prediction

Title Learning Approximate Inference Networks for Structured Prediction
Authors Lifu Tu, Kevin Gimpel
Abstract Structured prediction energy networks (SPENs; Belanger & McCallum 2016) use neural network architectures to define energy functions that can capture arbitrary dependencies among parts of structured outputs. Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them. We replace this use of gradient descent with a neural network trained to approximate structured argmax inference. This “inference network” outputs continuous values that we treat as the output structure. We develop large-margin training criteria for joint training of the structured energy function and inference network. On multi-label classification we report speed-ups of 10-60x compared to Belanger et al. (2017) while also improving accuracy. For sequence labeling with simple structured energies, our approach performs comparably to exact inference while being much faster at test time. We then demonstrate improved accuracy by augmenting the energy with a “label language model” that scores entire output label sequences, showing it can improve handling of long-distance dependencies in part-of-speech tagging. Finally, we show how inference networks can replace dynamic programming for test-time inference in conditional random fields, suggesting their general use for fast inference in structured settings.
Tasks Language Modelling, Multi-Label Classification, Part-Of-Speech Tagging, Structured Prediction
Published 2018-03-09
URL http://arxiv.org/abs/1803.03376v1
PDF http://arxiv.org/pdf/1803.03376v1.pdf
PWC https://paperswithcode.com/paper/learning-approximate-inference-networks-for
Repo https://github.com/lifu-tu/INFNET
Framework none
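A toy sketch of the joint large-margin training loop for multi-label classification, assuming a simple feed-forward energy and inference network (the architectures, squared-error cost, and update schedule are illustrative assumptions): the inference network is updated to increase the margin violation, playing the role of cost-augmented inference, while the energy is updated to decrease it.

```python
# Hedged sketch of joint large-margin training of an energy function and an inference network.
import torch
import torch.nn as nn

num_feats, num_labels = 100, 20
energy = nn.Sequential(nn.Linear(num_feats + num_labels, 64), nn.ReLU(), nn.Linear(64, 1))
infnet = nn.Sequential(nn.Linear(num_feats, 64), nn.ReLU(), nn.Linear(64, num_labels), nn.Sigmoid())
opt_e = torch.optim.Adam(energy.parameters(), lr=1e-3)
opt_i = torch.optim.Adam(infnet.parameters(), lr=1e-3)

def E(x, y):
    return energy(torch.cat([x, y], dim=1)).squeeze(1)

for step in range(100):
    x = torch.randn(32, num_feats)                       # toy features
    y = (torch.rand(32, num_labels) > 0.8).float()       # toy gold label vectors

    def margin_violation():
        y_hat = infnet(x)                                # relaxed predicted structure
        delta = ((y_hat - y) ** 2).sum(dim=1)            # task cost Delta(y_hat, y)
        return torch.clamp(delta - E(x, y_hat) + E(x, y), min=0).mean()

    # Inference-network step: increase the violation (approximate cost-augmented argmax)
    opt_i.zero_grad(); (-margin_violation()).backward(); opt_i.step()
    # Energy step: decrease the violation (structured large-margin objective)
    opt_e.zero_grad(); margin_violation().backward(); opt_e.step()
```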

A fast minimal solver for absolute camera pose with unknown focal length and radial distortion from four planar points

Title A fast minimal solver for absolute camera pose with unknown focal length and radial distortion from four planar points
Authors Magnus Oskarsson
Abstract In this paper we present a fast minimal solver for absolute camera pose estimation from four known points that lie in a plane. We assume a perspective camera model with unknown focal length and unknown radial distortion. The radial distortion is modelled using the division model with one parameter. We show that the solutions to this problem can be found from a univariate six-degree polynomial. This results in a very fast and numerically stable solver.
Tasks Pose Estimation
Published 2018-05-27
URL http://arxiv.org/abs/1805.10705v2
PDF http://arxiv.org/pdf/1805.10705v2.pdf
PWC https://paperswithcode.com/paper/a-fast-minimal-solver-for-absolute-camera
Repo https://github.com/hamburgerlady/fast_planar_camera_pose
Framework none

Fast Adjustable Threshold For Uniform Neural Network Quantization (Winning solution of LPIRC-II)

Title Fast Adjustable Threshold For Uniform Neural Network Quantization (Winning solution of LPIRC-II)
Authors Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev
Abstract Neural network quantization is a necessary step for porting neural networks to mobile devices. Quantization accelerates inference and reduces memory consumption and model size. It can be performed without fine-tuning using a calibration procedure (calculation of the parameters necessary for quantization), or the network can be trained with quantization from scratch. Training with quantization from scratch on labeled data is a rather long and resource-consuming procedure. Quantization without fine-tuning leads to an accuracy drop because of outliers which appear during calibration. In this article we suggest simplifying the quantization procedure significantly by introducing trained scale factors for the quantization thresholds. This speeds up quantization with fine-tuning to as few as 8 epochs and reduces the requirements on the set of training images. To our knowledge, the proposed method allowed us to obtain the first publicly available quantized version of MNAS without significant accuracy reduction - 74.8% vs 75.3% for the original full-precision network. Model and code are ready for use and available at: https://github.com/agoncharenko1992/FAT-fast_adjustable_threshold.
Tasks Calibration, Quantization
Published 2018-12-19
URL https://arxiv.org/abs/1812.07872v3
PDF https://arxiv.org/pdf/1812.07872v3.pdf
PWC https://paperswithcode.com/paper/fast-adjustable-threshold-for-uniform-neural
Repo https://github.com/agoncharenko1992/FAT-fast_adjustable_threshold
Framework tf
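A hedged PyTorch sketch of the core idea, a trainable scale factor on the quantization threshold with a straight-through estimator so the threshold can be fine-tuned by gradient descent. The authors' implementation is in TensorFlow; the class name, initialization, and bit width below are assumptions.

```python
# Hedged sketch of uniform fake-quantization with a trainable threshold scale (PyTorch).
import torch
import torch.nn as nn

class FakeQuant(nn.Module):
    def __init__(self, init_threshold=1.0, bits=8):
        super().__init__()
        # Trainable scale factor applied to an initial calibration threshold (assumed value)
        self.scale = nn.Parameter(torch.tensor(1.0))
        self.init_threshold = init_threshold
        self.levels = 2 ** (bits - 1) - 1

    def forward(self, x):
        t = self.scale * self.init_threshold          # adjusted clipping threshold
        step = t / self.levels
        x_clipped = torch.clamp(x, -t, t)
        x_q = torch.round(x_clipped / step) * step
        # Straight-through estimator: quantized values forward, gradients flow via x_clipped
        return x_clipped + (x_q - x_clipped).detach()

fq = FakeQuant()
x = torch.randn(4, 16, requires_grad=True)
fq(x).sum().backward()
print(fq.scale.grad)   # the threshold scale receives a gradient and can be fine-tuned
```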

Deep Directional Statistics: Pose Estimation with Uncertainty Quantification

Title Deep Directional Statistics: Pose Estimation with Uncertainty Quantification
Authors Sergey Prokudin, Peter Gehler, Sebastian Nowozin
Abstract Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions such as on low-resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Whereas a single von Mises distribution makes strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state-of-the-art.
Tasks Pose Estimation
Published 2018-05-09
URL http://arxiv.org/abs/1805.03430v1
PDF http://arxiv.org/pdf/1805.03430v1.pdf
PWC https://paperswithcode.com/paper/deep-directional-statistics-pose-estimation
Repo https://github.com/sergeyprokudin/deep_direct_stat
Framework tf
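For a single-component head, the model reduces to predicting a mean angle mu and a concentration kappa and minimizing the von Mises negative log-likelihood. A minimal PyTorch sketch of that loss (the finite and infinite mixture extensions from the paper are not shown):

```python
# Von Mises negative log-likelihood for angular regression:
# -log p(theta | mu, kappa) = -kappa*cos(theta - mu) + log(2*pi*I0(kappa))
import math
import torch
import torch.nn.functional as F

def von_mises_nll(theta, mu, kappa):
    # torch.special.i0 is the modified Bessel function of the first kind, order 0
    return -(kappa * torch.cos(theta - mu)) + torch.log(2 * math.pi * torch.special.i0(kappa))

theta = torch.tensor([0.1, 1.5, -2.0])                     # observed pose angles (radians)
mu = torch.tensor([0.0, 1.4, 3.0])                         # predicted means
kappa = F.softplus(torch.tensor([1.0, 2.0, 0.5]))          # predicted positive concentrations
print(von_mises_nll(theta, mu, kappa).mean())
```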

Structured Bayesian Gaussian process latent variable model: applications to data-driven dimensionality reduction and high-dimensional inversion

Title Structured Bayesian Gaussian process latent variable model: applications to data-driven dimensionality reduction and high-dimensional inversion
Authors Steven Atkinson, Nicholas Zabaras
Abstract We introduce a methodology for nonlinear inverse problems using a variational Bayesian approach where the unknown quantity is a spatial field. A structured Bayesian Gaussian process latent variable model is used both to construct a low-dimensional generative model of the sample-based stochastic prior as well as a surrogate for the forward evaluation. Its Bayesian formulation captures epistemic uncertainty introduced by the limited number of input and output examples, automatically selects an appropriate dimensionality for the learned latent representation of the data, and rigorously propagates the uncertainty of the data-driven dimensionality reduction of the stochastic space through the forward model surrogate. The structured Gaussian process model explicitly leverages spatial information for an informative generative prior to improve sample efficiency while achieving computational tractability through Kronecker product decompositions of the relevant kernel matrices. Importantly, the Bayesian inversion is carried out by solving a variational optimization problem, replacing traditional computationally-expensive Monte Carlo sampling. The methodology is demonstrated on an elliptic PDE and is shown to return well-calibrated posteriors and is tractable with latent spaces with over 100 dimensions.
Tasks Dimensionality Reduction
Published 2018-07-11
URL http://arxiv.org/abs/1807.04302v1
PDF http://arxiv.org/pdf/1807.04302v1.pdf
PWC https://paperswithcode.com/paper/structured-bayesian-gaussian-process-latent
Repo https://github.com/cics-nd/sgplvm-inverse
Framework tf
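The computational tractability rests on Kronecker algebra: when the grid covariance factorizes as K = K1 ⊗ K2, linear solves never require forming the full matrix. A small NumPy illustration of that identity alone (toy kernels and sizes are assumptions, and this is not the model itself):

```python
# Kronecker-structured kernel algebra: (K1 kron K2)^-1 vec(Y) = vec(K2^-1 Y K1^-T)
import numpy as np

n1, n2 = 30, 40
d1 = np.subtract.outer(np.arange(n1), np.arange(n1)) / 3.0
d2 = np.subtract.outer(np.arange(n2), np.arange(n2)) / 3.0
K1 = np.exp(-0.5 * d1 ** 2) + 1e-3 * np.eye(n1)   # squared-exponential factor + jitter
K2 = np.exp(-0.5 * d2 ** 2) + 1e-3 * np.eye(n2)
y = np.random.randn(n1 * n2)

# Naive route: form the full (n1*n2 x n1*n2) matrix
alpha_full = np.linalg.solve(np.kron(K1, K2), y)

# Kronecker route: work only with the small factors
Y = y.reshape(n1, n2).T                            # column-stacked layout, shape (n2, n1)
alpha_kron = np.linalg.solve(K2, Y) @ np.linalg.inv(K1).T
print(np.allclose(alpha_full, alpha_kron.T.reshape(-1)))   # True (up to numerical error)
```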

Assessing Generative Models via Precision and Recall

Title Assessing Generative Models via Precision and Recall
Authors Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly
Abstract Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison. Commonly used evaluation methods, such as the Frechet Inception Distance (FID), correlate well with the perceived quality of samples and are sensitive to mode dropping. However, these metrics are unable to distinguish between different failure cases since they only yield one-dimensional scores. We propose a novel definition of precision and recall for distributions which disentangles the divergence into two separate dimensions. The proposed notion is intuitive, retains desirable properties, and naturally leads to an efficient algorithm that can be used to evaluate generative models. We relate this notion to total variation as well as to recent evaluation metrics such as Inception Score and FID. To demonstrate the practical utility of the proposed approach we perform an empirical study on several variants of Generative Adversarial Networks and Variational Autoencoders. In an extensive set of experiments we show that the proposed metric is able to disentangle the quality of generated samples from the coverage of the target distribution.
Tasks
Published 2018-05-31
URL http://arxiv.org/abs/1806.00035v2
PDF http://arxiv.org/pdf/1806.00035v2.pdf
PWC https://paperswithcode.com/paper/assessing-generative-models-via-precision-and
Repo https://github.com/raahii/evan
Framework pytorch
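Given histograms p (real data) and q (generated data) over shared bins, e.g. k-means clusters of embeddings, the precision-recall curve is traced by alpha(lambda) = sum_i min(lambda*p_i, q_i) and beta(lambda) = alpha(lambda)/lambda. A simplified NumPy sketch of the curve computation (bin construction is omitted):

```python
# Simplified precision-recall curve for distributions (PRD) over shared histogram bins.
import numpy as np

def prd_curve(p, q, num_angles=201):
    p = p / p.sum()                                  # real-data bin frequencies
    q = q / q.sum()                                  # generated-data bin frequencies
    angles = np.linspace(1e-6, np.pi / 2 - 1e-6, num_angles)
    lambdas = np.tan(angles)
    precision = np.array([np.minimum(lam * p, q).sum() for lam in lambdas])   # alpha(lambda)
    recall = np.array([np.minimum(p, q / lam).sum() for lam in lambdas])      # beta(lambda)
    return precision, recall

# Toy example: the generated distribution drops half of the real modes
p = np.array([0.25, 0.25, 0.25, 0.25])
q = np.array([0.45, 0.45, 0.05, 0.05])
precision, recall = prd_curve(p, q)
# Covering most of the real distribution is only attainable at low precision here:
print(precision[recall > 0.95].max())
```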

Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning

Title Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning
Authors Hock Hung Chieng, Noorhaniza Wahid, Pauline Ong, Sai Raj Kishore Perla
Abstract Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has become the default activation function across the deep learning community since 2012. Although ReLU is popular, its hard-zero property prevents negative values from propagating through the network, so deep neural networks have not benefited from negative representations. In this work, an activation function called Flatten-T Swish (FTS) that leverages the benefit of negative values is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained on the MNIST dataset using five different deep fully connected neural networks (DFNNs) with depths varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T=-0.20 has the best overall performance. Compared with ReLU, FTS (T=-0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs respectively. Apart from this, the study also notes that FTS converges twice as fast as ReLU. Although other existing activation functions are also evaluated, this study elects ReLU as the baseline activation function.
Tasks Image Classification
Published 2018-12-15
URL http://arxiv.org/abs/1812.06247v1
PDF http://arxiv.org/pdf/1812.06247v1.pdf
PWC https://paperswithcode.com/paper/flatten-t-swish-a-thresholded-relu-swish-like
Repo https://github.com/lessw2020/FTSwishPlus
Framework none
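Based on the paper's description, FTS applies the Swish form x*sigmoid(x) to non-negative inputs and flattens negative inputs to a constant threshold T. A minimal PyTorch sketch (treat the exact functional form as an assumption drawn from the abstract and the linked repository):

```python
# Hedged sketch of the Flatten-T Swish activation with the paper's best threshold T = -0.20.
import torch

def flatten_t_swish(x, T=-0.20):
    # Non-negative inputs follow x * sigmoid(x) shifted by T; negative inputs are flattened to T
    return torch.where(x >= 0, x * torch.sigmoid(x) + T, torch.full_like(x, T))

x = torch.linspace(-3, 3, 7)
print(flatten_t_swish(x))
```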

Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models

Title Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models
Authors Herman Kamper
Abstract We investigate unsupervised models that can map a variable-duration speech segment to a fixed-dimensional representation. In settings where unlabelled speech is the only available resource, such acoustic word embeddings can form the basis for “zero-resource” speech search, discovery and indexing systems. Most existing unsupervised embedding methods still use some supervision, such as word or phoneme boundaries. Here we propose the encoder-decoder correspondence autoencoder (EncDec-CAE), which, instead of true word segments, uses automatically discovered segments: an unsupervised term discovery system finds pairs of words of the same unknown type, and the EncDec-CAE is trained to reconstruct one word given the other as input. We compare it to a standard encoder-decoder autoencoder (AE), a variational AE with a prior over its latent embedding, and downsampling. EncDec-CAE outperforms its closest competitor by 24% relative in average precision on two languages in a word discrimination task.
Tasks Word Embeddings
Published 2018-11-01
URL http://arxiv.org/abs/1811.00403v2
PDF http://arxiv.org/pdf/1811.00403v2.pdf
PWC https://paperswithcode.com/paper/truly-unsupervised-acoustic-word-embeddings
Repo https://github.com/kamperh/recipe_bucktsong_awe
Framework tf
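A simplified PyTorch sketch of the correspondence-autoencoder training signal: encode one discovered word segment into a fixed-dimensional embedding and train the decoder to reconstruct the other segment of the discovered pair. The author's implementation is in TensorFlow; the architecture, feature dimensions, and training details here are assumptions.

```python
# Hedged sketch of an encoder-decoder correspondence autoencoder for acoustic word embeddings.
import torch
import torch.nn as nn

class EncDecCAE(nn.Module):
    def __init__(self, feat_dim=13, hidden=128, embed=64):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.to_embed = nn.Linear(hidden, embed)      # fixed-dimensional acoustic word embedding
        self.decoder = nn.GRU(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat_dim)

    def forward(self, x, target_len):
        _, h = self.encoder(x)                        # h: (1, batch, hidden)
        z = self.to_embed(h[-1])                      # (batch, embed)
        dec_in = z.unsqueeze(1).repeat(1, target_len, 1)   # condition every decoder step on z
        dec_out, _ = self.decoder(dec_in)
        return self.out(dec_out), z

model = EncDecCAE()
seg_a = torch.randn(8, 60, 13)     # one word segment (e.g. 60 frames of MFCCs)
seg_b = torch.randn(8, 55, 13)     # the other segment of the discovered pair
recon, embedding = model(seg_a, target_len=seg_b.shape[1])
loss = ((recon - seg_b) ** 2).mean()   # reconstruct the correspondence target, not the input
```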

Imitating Latent Policies from Observation

Title Imitating Latent Policies from Observation
Authors Ashley D. Edwards, Himanshu Sahni, Yannick Schroecker, Charles L. Isbell
Abstract In this paper, we describe a novel approach to imitation learning that infers latent policies directly from state observations. We introduce a method that characterizes the causal effects of latent actions on observations while simultaneously predicting their likelihood. We then outline an action alignment procedure that leverages a small amount of environment interactions to determine a mapping between the latent and real-world actions. We show that this corrected labeling can be used for imitating the observed behavior, even though no expert actions are given. We evaluate our approach within classic control environments and a platform game and demonstrate that it performs better than standard approaches. Code for this work is available at https://github.com/ashedwards/ILPO.
Tasks Imitation Learning
Published 2018-05-21
URL https://arxiv.org/abs/1805.07914v3
PDF https://arxiv.org/pdf/1805.07914v3.pdf
PWC https://paperswithcode.com/paper/imitating-latent-policies-from-observation
Repo https://github.com/ashedwards/ILPO
Framework tf
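A highly simplified sketch of the latent-policy stage, assuming a small discrete set of latent actions: a forward model predicts one candidate next observation per latent action, trained with a min-over-actions error, and the latent policy learns to place probability on whichever latent action best explains each observed transition. The later action-alignment step (mapping latent to real actions from a few environment interactions) is not shown, and the details below are assumptions rather than the authors' exact method.

```python
# Hedged, simplified sketch of learning a latent policy from state-only transitions.
import torch
import torch.nn as nn

obs_dim, n_latent_actions = 8, 4
forward_model = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                              nn.Linear(64, n_latent_actions * obs_dim))
latent_policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                              nn.Linear(64, n_latent_actions), nn.Softmax(dim=1))
opt = torch.optim.Adam(list(forward_model.parameters()) + list(latent_policy.parameters()), lr=1e-3)

s, s_next = torch.randn(32, obs_dim), torch.randn(32, obs_dim)    # state-only expert transitions
pred = forward_model(s).view(-1, n_latent_actions, obs_dim)        # one next-state prediction per latent action
per_action_err = ((pred - s_next.unsqueeze(1)) ** 2).mean(dim=2)   # (batch, n_latent_actions)
# Min-over-actions loss lets each latent action specialize in one kind of transition,
# while the latent policy is trained to put mass on whichever latent action fits best.
loss = per_action_err.min(dim=1).values.mean() \
     + (latent_policy(s) * per_action_err.detach()).sum(dim=1).mean()
opt.zero_grad(); loss.backward(); opt.step()
```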

3D-PSRNet: Part Segmented 3D Point Cloud Reconstruction From a Single Image

Title 3D-PSRNet: Part Segmented 3D Point Cloud Reconstruction From a Single Image
Authors Priyanka Mandikal, Navaneet K L, R. Venkatesh Babu
Abstract We propose a mechanism to reconstruct part annotated 3D point clouds of objects given just a single input image. We demonstrate that jointly training for both reconstruction and segmentation leads to improved performance in both the tasks, when compared to training for each task individually. The key idea is to propagate information from each task so as to aid the other during the training procedure. Towards this end, we introduce a location-aware segmentation loss in the training regime. We empirically show the effectiveness of the proposed loss in generating more faithful part reconstructions while also improving segmentation accuracy. We thoroughly evaluate the proposed approach on different object categories from the ShapeNet dataset to obtain improved results in reconstruction as well as segmentation.
Tasks
Published 2018-09-30
URL http://arxiv.org/abs/1810.00461v1
PDF http://arxiv.org/pdf/1810.00461v1.pdf
PWC https://paperswithcode.com/paper/3d-psrnet-part-segmented-3d-point-cloud
Repo https://github.com/val-iisc/3d-psrnet
Framework tf
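A rough PyTorch sketch of jointly supervising locations and labels: a Chamfer term on point coordinates plus a segmentation cross-entropy on the predicted points matched to ground-truth points, in the spirit of the location-aware segmentation loss. The matching scheme and loss weighting are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a joint reconstruction + location-aware segmentation loss.
import torch
import torch.nn.functional as F

def joint_loss(pred_xyz, pred_logits, gt_xyz, gt_labels, seg_weight=1.0):
    # pred_xyz: (N, 3) predicted points; pred_logits: (N, P) per-point part logits
    # gt_xyz: (M, 3) ground-truth points; gt_labels: (M,) part ids
    d = torch.cdist(pred_xyz, gt_xyz)                    # (N, M) pairwise distances
    chamfer = d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
    nearest_pred = d.argmin(dim=0)                       # predicted point closest to each GT point
    seg = F.cross_entropy(pred_logits[nearest_pred], gt_labels)
    return chamfer + seg_weight * seg

pred_xyz = torch.rand(1024, 3, requires_grad=True)
pred_logits = torch.randn(1024, 4, requires_grad=True)
gt_xyz, gt_labels = torch.rand(1024, 3), torch.randint(0, 4, (1024,))
joint_loss(pred_xyz, pred_logits, gt_xyz, gt_labels).backward()
```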

Morse Code Datasets for Machine Learning

Title Morse Code Datasets for Machine Learning
Authors Sourya Dey, Keith M. Chugg, Peter A. Beerel
Abstract We present an algorithm to generate synthetic datasets of tunable difficulty on classification of Morse code symbols for supervised machine learning problems, in particular, neural networks. The datasets are spatially one-dimensional and have a small number of input features, leading to high density of input information content. This makes them particularly challenging when implementing network complexity reduction methods. We explore how network performance is affected by deliberately adding various forms of noise and expanding the feature set and dataset size. Finally, we establish several metrics to indicate the difficulty of a dataset, and evaluate their merits. The algorithm and datasets are open-source.
Tasks
Published 2018-07-11
URL http://arxiv.org/abs/1807.04239v2
PDF http://arxiv.org/pdf/1807.04239v2.pdf
PWC https://paperswithcode.com/paper/morse-code-datasets-for-machine-learning
Repo https://github.com/souryadey/morse-dataset
Framework none
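A toy NumPy sketch of the general recipe: render each symbol's dot/dash pattern into a fixed-length one-dimensional vector, with jitter and additive noise controlling difficulty. The rendering scheme, lengths, and noise model below are illustrative assumptions, not the authors' exact algorithm.

```python
# Hedged sketch of generating a 1-D synthetic Morse-code classification dataset.
import numpy as np

MORSE = {'A': '.-', 'B': '-...', 'C': '-.-.', 'D': '-..', 'E': '.'}  # small subset for illustration

def render_symbol(code, length=64, noise=0.1, rng=np.random):
    x = np.zeros(length)
    pos = rng.randint(0, 4)                      # random start offset (jitter)
    for mark in code:
        width = 1 if mark == '.' else 3          # dashes are three units long
        x[pos:pos + width] = 1.0
        pos += width + 1                         # one unit of silence between marks
    return x + noise * rng.randn(length)         # additive noise controls difficulty

def make_dataset(n_per_class=100, **kwargs):
    X, y = [], []
    for label, (sym, code) in enumerate(sorted(MORSE.items())):
        X += [render_symbol(code, **kwargs) for _ in range(n_per_class)]
        y += [label] * n_per_class
    return np.stack(X), np.array(y)

X, y = make_dataset()
print(X.shape, y.shape)   # (500, 64) (500,)
```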