Paper Group AWR 236
Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials
Title | Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials |
Authors | Nicholas R. Waytowich, Vernon Lawhern, Javier O. Garcia, Jennifer Cummings, Josef Faller, Paul Sajda, Jean M. Vettel |
Abstract | Steady-State Visual Evoked Potentials (SSVEPs) are neural oscillations from the parietal and occipital regions of the brain that are evoked from flickering visual stimuli. SSVEPs are robust signals measurable in the electroencephalogram (EEG) and are commonly used in brain-computer interfaces (BCIs). However, methods for high-accuracy decoding of SSVEPs usually require hand-crafted approaches that leverage domain-specific knowledge of the stimulus signals, such as specific temporal frequencies in the visual stimuli and their relative spatial arrangement. When this knowledge is unavailable, such as when SSVEP signals are acquired asynchronously, such approaches tend to fail. In this paper, we show how a compact convolutional neural network (Compact-CNN), which only requires raw EEG signals for automatic feature extraction, can be used to decode signals from a 12-class SSVEP dataset without the need for any domain-specific knowledge or calibration data. We report an across-subject mean accuracy of approximately 80% (chance being 8.3%) and show this is substantially better than current state-of-the-art hand-crafted approaches using canonical correlation analysis (CCA) and Combined-CCA. Furthermore, we analyze our Compact-CNN to examine the underlying feature representation, discovering that the deep learner extracts additional phase- and amplitude-related features associated with the structure of the dataset. We discuss how our Compact-CNN shows promise for BCI applications that allow users to freely gaze/attend to any stimulus at any time (e.g., asynchronous BCI) and how it provides a method for analyzing SSVEP signals in a way that might augment our understanding of the basic processing in the visual cortex. |
Tasks | Calibration, EEG |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04566v2 |
PDF | http://arxiv.org/pdf/1803.04566v2.pdf |
PWC | https://paperswithcode.com/paper/compact-convolutional-neural-networks-for |
Repo | https://github.com/vlawhern/arl-eegmodels |
Framework | tf |
Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding
Title | Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding |
Authors | Kyle Hundman, Valentino Constantinou, Christopher Laporte, Ian Colwell, Tom Soderstrom |
Abstract | As spacecraft send back increasing amounts of telemetry data, improved anomaly detection systems are needed to lessen the monitoring burden placed on operations engineers and reduce operational risk. Current spacecraft monitoring systems only target a subset of anomaly types and often require costly expert knowledge to develop and maintain due to challenges involving scale and complexity. We demonstrate the effectiveness of Long Short-Term Memory (LSTM) networks, a type of Recurrent Neural Network (RNN), in overcoming these issues using expert-labeled telemetry anomaly data from the Soil Moisture Active Passive (SMAP) satellite and the Mars Science Laboratory (MSL) rover, Curiosity. We also propose a complementary unsupervised and nonparametric anomaly thresholding approach developed during a pilot implementation of an anomaly detection system for SMAP, and offer false positive mitigation strategies along with other key improvements and lessons learned during development. |
Tasks | Anomaly Detection |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04431v3 |
PDF | http://arxiv.org/pdf/1802.04431v3.pdf |
PWC | https://paperswithcode.com/paper/detecting-spacecraft-anomalies-using-lstms |
Repo | https://github.com/PKUZHOU/anomaly_det |
Framework | tf |
Deep Anomaly Detection with Outlier Exposure
Title | Deep Anomaly Detection with Outlier Exposure |
Authors | Dan Hendrycks, Mantas Mazeika, Thomas Dietterich |
Abstract | It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance. |
Tasks | Anomaly Detection |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04606v3 |
PDF | http://arxiv.org/pdf/1812.04606v3.pdf |
PWC | https://paperswithcode.com/paper/deep-anomaly-detection-with-outlier-exposure |
Repo | https://github.com/hendrycks/outlier-exposure |
Framework | pytorch |
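The abstract describes training a detector against an auxiliary outlier dataset but does not spell out the objective. The sketch below shows one formulation that fits that description for a classifier: the usual cross-entropy on in-distribution examples plus a term that pushes predictions on auxiliary outliers toward the uniform distribution. The function names and the weight `lam` are assumptions made for illustration, not values taken from the listing above.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def cross_entropy(logits, label):
    """Standard cross-entropy for one in-distribution example."""
    return -log_softmax(logits)[label]


def uniform_cross_entropy(logits):
    """Cross-entropy from the uniform distribution to the softmax output.

    It is smallest (log K for K classes) when the prediction is uniform.
    """
    log_probs = log_softmax(logits)
    return -sum(log_probs) / len(log_probs)


def outlier_exposure_loss(in_logits, in_labels, out_logits, lam=0.5):
    """Mean in-distribution loss plus lam times the mean outlier term.

    lam is a hypothetical weighting chosen for this sketch.
    """
    ce = sum(cross_entropy(l, y) for l, y in zip(in_logits, in_labels)) / len(in_logits)
    oe = sum(uniform_cross_entropy(l) for l in out_logits) / len(out_logits)
    return ce + lam * oe
```

At test time, a detector of this kind would typically score an input by how confident the softmax output is, so inputs with near-uniform predictions are flagged as anomalous.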
Polarity Loss for Zero-shot Object Detection
Title | Polarity Loss for Zero-shot Object Detection |
Authors | Shafin Rahman, Salman Khan, Nick Barnes |
Abstract | Zero-shot object detection is an emerging research topic that aims to recognize and localize previously ‘unseen’ objects. This setting gives rise to several unique challenges, e.g., highly imbalanced positive vs. negative instance ratio, ambiguity between background and unseen classes and the proper alignment between visual and semantic concepts. Here, we propose an end-to-end deep learning framework underpinned by a novel loss function that seeks to properly align the visual and semantic cues for improved zero-shot learning. We call our objective the ‘Polarity loss’ because it explicitly maximizes the gap between positive and negative predictions. Such a margin maximizing formulation is not only important for visual-semantic alignment but it also resolves the ambiguity between background and unseen objects. Our approach is inspired by the embodiment theories in cognitive science, that claim human semantic understanding to be grounded in past experiences (seen objects), related linguistic concepts (word dictionary) and the perception of the physical world (visual imagery). To this end, we learn to attend to a dictionary of related semantic concepts that eventually refines the noisy semantic embeddings and helps establish a better synergy between visual and semantic domains. Our extensive results on MS-COCO and Pascal VOC datasets show as high as 14x mAP improvement over state of the art. |
Tasks | Object Detection, Zero-Shot Learning, Zero-Shot Object Detection |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.08982v2 |
PDF | http://arxiv.org/pdf/1811.08982v2.pdf |
PWC | https://paperswithcode.com/paper/polarity-loss-for-zero-shot-object-detection |
Repo | https://github.com/salman-h-khan/PL-ZSD_Release |
Framework | tf |
Learning Approximate Inference Networks for Structured Prediction
Title | Learning Approximate Inference Networks for Structured Prediction |
Authors | Lifu Tu, Kevin Gimpel |
Abstract | Structured prediction energy networks (SPENs; Belanger & McCallum 2016) use neural network architectures to define energy functions that can capture arbitrary dependencies among parts of structured outputs. Prior work used gradient descent for inference, relaxing the structured output to a set of continuous variables and then optimizing the energy with respect to them. We replace this use of gradient descent with a neural network trained to approximate structured argmax inference. This “inference network” outputs continuous values that we treat as the output structure. We develop large-margin training criteria for joint training of the structured energy function and inference network. On multi-label classification we report speed-ups of 10-60x compared to (Belanger et al, 2017) while also improving accuracy. For sequence labeling with simple structured energies, our approach performs comparably to exact inference while being much faster at test time. We then demonstrate improved accuracy by augmenting the energy with a “label language model” that scores entire output label sequences, showing it can improve handling of long-distance dependencies in part-of-speech tagging. Finally, we show how inference networks can replace dynamic programming for test-time inference in conditional random fields, suggestive for their general use for fast inference in structured settings. |
Tasks | Language Modelling, Multi-Label Classification, Part-Of-Speech Tagging, Structured Prediction |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03376v1 |
PDF | http://arxiv.org/pdf/1803.03376v1.pdf |
PWC | https://paperswithcode.com/paper/learning-approximate-inference-networks-for |
Repo | https://github.com/lifu-tu/INFNET |
Framework | none |
A fast minimal solver for absolute camera pose with unknown focal length and radial distortion from four planar points
Title | A fast minimal solver for absolute camera pose with unknown focal length and radial distortion from four planar points |
Authors | Magnus Oskarsson |
Abstract | In this paper we present a fast minimal solver for absolute camera pose estimation from four known points that lie in a plane. We assume a perspective camera model with unknown focal length and unknown radial distortion. The radial distortion is modelled using the division model with one parameter. We show that the solutions to this problem can be found from a univariate six-degree polynomial. This results in a very fast and numerically stable solver. |
Tasks | Pose Estimation |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10705v2 |
PDF | http://arxiv.org/pdf/1805.10705v2.pdf |
PWC | https://paperswithcode.com/paper/a-fast-minimal-solver-for-absolute-camera |
Repo | https://github.com/hamburgerlady/fast_planar_camera_pose |
Framework | none |
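The abstract above refers to the one-parameter division model for radial distortion. As a minimal sketch of that model only (not of the minimal solver itself), the code below maps a distorted point to its undistorted position and back. It assumes coordinates are expressed relative to the distortion center, and the names `lam`, `undistort_division` and `distort_division` are illustrative.

```python
import math


def undistort_division(x_d, y_d, lam):
    """One-parameter division model: p_u = p_d / (1 + lam * r_d^2)."""
    r2 = x_d * x_d + y_d * y_d
    scale = 1.0 + lam * r2
    return x_d / scale, y_d / scale


def distort_division(x_u, y_u, lam):
    """Inverse mapping, from solving r_u = r_d / (1 + lam * r_d^2) for r_d.

    Uses the root that tends to r_u as lam tends to 0, written in a form
    that stays finite at lam == 0. Requires 1 - 4 * lam * r_u^2 >= 0.
    """
    r2 = x_u * x_u + y_u * y_u
    disc = 1.0 - 4.0 * lam * r2
    if disc < 0.0:
        raise ValueError("point is outside the invertible range of the model")
    scale = 2.0 / (1.0 + math.sqrt(disc))
    return x_u * scale, y_u * scale
```

With `lam = 0` both functions reduce to the identity, which corresponds to a distortion-free pinhole camera.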
Fast Adjustable Threshold For Uniform Neural Network Quantization (Winning solution of LPIRC-II)
Title | Fast Adjustable Threshold For Uniform Neural Network Quantization (Winning solution of LPIRC-II) |
Authors | Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev |
Abstract | Neural network quantization is a necessary step for porting neural networks to mobile devices. Quantization accelerates inference and reduces memory consumption and model size. It can be performed without fine-tuning using a calibration procedure (calculation of the parameters necessary for quantization), or the network can be trained with quantization from scratch. Training with quantization from scratch on labeled data is a rather long and resource-consuming procedure. Quantization of a network without fine-tuning leads to an accuracy drop because of outliers which appear during calibration. In this article we propose to simplify the quantization procedure significantly by introducing trained scale factors for quantization thresholds. This speeds up the process of quantization with fine-tuning to at most 8 epochs and reduces the requirements on the set of training images. To our knowledge, the proposed method allowed us to obtain the first publicly available quantized version of MNAS without significant accuracy reduction: 74.8% vs 75.3% for the original full-precision network. Model and code are ready for use and available at: https://github.com/agoncharenko1992/FAT-fast_adjustable_threshold. |
Tasks | Calibration, Quantization |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.07872v3 |
PDF | https://arxiv.org/pdf/1812.07872v3.pdf |
PWC | https://paperswithcode.com/paper/fast-adjustable-threshold-for-uniform-neural |
Repo | https://github.com/agoncharenko1992/FAT-fast_adjustable_threshold |
Framework | tf |
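To illustrate why a scale factor on the quantization threshold can help with calibration outliers, here is a rough sketch of symmetric uniform quantization with an adjustable threshold. It is not the authors' implementation: the max-based calibration, the 8-bit symmetric grid, and the hand-picked scale factor are all assumptions, and in the paper's setting the scale factor would be learned during fine-tuning rather than set manually.

```python
def calibrated_threshold(calibration_values):
    """Assumed calibration rule: threshold is the largest absolute value seen."""
    return max(abs(v) for v in calibration_values)


def fake_quantize(values, threshold, bits=8):
    """Clip to [-threshold, threshold], then round onto a symmetric uniform grid."""
    levels = 2 ** (bits - 1) - 1      # 127 positive levels for 8 bits
    step = threshold / levels
    quantized = []
    for v in values:
        clipped = max(-threshold, min(threshold, v))
        quantized.append(round(clipped / step) * step)
    return quantized


def mean_abs_error(reference, approx):
    """Average absolute difference between two equally long lists."""
    return sum(abs(a - b) for a, b in zip(reference, approx)) / len(reference)
```

If a single outlier of 10.0 sets the calibrated threshold while most values lie in [-0.5, 0.5], multiplying the threshold by a factor such as 0.05 shrinks the grid step and reduces the error on the bulk of the values, at the cost of clipping the outlier.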
Deep Directional Statistics: Pose Estimation with Uncertainty Quantification
Title | Deep Directional Statistics: Pose Estimation with Uncertainty Quantification |
Authors | Sergey Prokudin, Peter Gehler, Sebastian Nowozin |
Abstract | Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions such as on low-resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Whereas a single von Mises distribution is making strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state-of-the-art. |
Tasks | Pose Estimation |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03430v1 |
PDF | http://arxiv.org/pdf/1805.03430v1.pdf |
PWC | https://paperswithcode.com/paper/deep-directional-statistics-pose-estimation |
Repo | https://github.com/sergeyprokudin/deep_direct_stat |
Framework | tf |
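The likelihood-based training mentioned in the abstract rests on the von Mises density, whose log-density for an angle θ with mean μ and concentration κ is κ·cos(θ − μ) − log(2π·I₀(κ)). The sketch below evaluates that quantity and a finite mixture of such components using only the standard library. The series-based Bessel function and all function names are illustrative choices, and the network that would predict μ, κ and the mixture weights is not shown.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series.

    Adequate for moderate concentrations; very large x would need more terms
    or an asymptotic expansion.
    """
    q = (x / 2.0) ** 2
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total


def von_mises_log_pdf(theta, mu, kappa):
    """Log-density of a von Mises distribution on the circle (angles in radians)."""
    return kappa * math.cos(theta - mu) - math.log(2.0 * math.pi * bessel_i0(kappa))


def mixture_log_pdf(theta, weights, mus, kappas):
    """Log-density of a finite von Mises mixture; weights must be positive and sum to 1."""
    logs = [math.log(w) + von_mises_log_pdf(theta, m, k)
            for w, m, k in zip(weights, mus, kappas)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))
```

Training by maximum likelihood would then minimize the negative of `mixture_log_pdf` averaged over the ground-truth angles.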
Structured Bayesian Gaussian process latent variable model: applications to data-driven dimensionality reduction and high-dimensional inversion
Title | Structured Bayesian Gaussian process latent variable model: applications to data-driven dimensionality reduction and high-dimensional inversion |
Authors | Steven Atkinson, Nicholas Zabaras |
Abstract | We introduce a methodology for nonlinear inverse problems using a variational Bayesian approach where the unknown quantity is a spatial field. A structured Bayesian Gaussian process latent variable model is used both to construct a low-dimensional generative model of the sample-based stochastic prior as well as a surrogate for the forward evaluation. Its Bayesian formulation captures epistemic uncertainty introduced by the limited number of input and output examples, automatically selects an appropriate dimensionality for the learned latent representation of the data, and rigorously propagates the uncertainty of the data-driven dimensionality reduction of the stochastic space through the forward model surrogate. The structured Gaussian process model explicitly leverages spatial information for an informative generative prior to improve sample efficiency while achieving computational tractability through Kronecker product decompositions of the relevant kernel matrices. Importantly, the Bayesian inversion is carried out by solving a variational optimization problem, replacing traditional computationally-expensive Monte Carlo sampling. The methodology is demonstrated on an elliptic PDE and is shown to return well-calibrated posteriors and is tractable with latent spaces with over 100 dimensions. |
Tasks | Dimensionality Reduction |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04302v1 |
PDF | http://arxiv.org/pdf/1807.04302v1.pdf |
PWC | https://paperswithcode.com/paper/structured-bayesian-gaussian-process-latent |
Repo | https://github.com/cics-nd/sgplvm-inverse |
Framework | tf |
Assessing Generative Models via Precision and Recall
Title | Assessing Generative Models via Precision and Recall |
Authors | Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly |
Abstract | Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison. Commonly used evaluation methods, such as the Frechet Inception Distance (FID), correlate well with the perceived quality of samples and are sensitive to mode dropping. However, these metrics are unable to distinguish between different failure cases since they only yield one-dimensional scores. We propose a novel definition of precision and recall for distributions which disentangles the divergence into two separate dimensions. The proposed notion is intuitive, retains desirable properties, and naturally leads to an efficient algorithm that can be used to evaluate generative models. We relate this notion to total variation as well as to recent evaluation metrics such as Inception Score and FID. To demonstrate the practical utility of the proposed approach we perform an empirical study on several variants of Generative Adversarial Networks and Variational Autoencoders. In an extensive set of experiments we show that the proposed metric is able to disentangle the quality of generated samples from the coverage of the target distribution. |
Tasks | |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1806.00035v2 |
PDF | http://arxiv.org/pdf/1806.00035v2.pdf |
PWC | https://paperswithcode.com/paper/assessing-generative-models-via-precision-and |
Repo | https://github.com/raahii/evan |
Framework | pytorch |
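As a sketch of how precision and recall between two distributions can be computed, the code below assumes both distributions are already given as histograms over the same finite set of bins (for example, cluster assignments of embedded samples) and sweeps a trade-off parameter λ. For each λ it takes precision α(λ) = Σ min(λ·P, Q) and recall β(λ) = α(λ)/λ, with P the reference and Q the model distribution. The histogram inputs, the angular grid for λ and the function name are assumptions made for this example.

```python
import math


def prd_curve(p_ref, q_model, num_angles=101):
    """Return a list of (precision, recall) pairs for two discrete distributions.

    p_ref and q_model are equally long lists of probabilities that each sum to 1.
    lam = tan(angle) is swept over angles strictly between 0 and pi/2.
    """
    curve = []
    for i in range(1, num_angles + 1):
        lam = math.tan((math.pi / 2.0) * i / (num_angles + 1))
        precision = sum(min(lam * p, q) for p, q in zip(p_ref, q_model))
        recall = precision / lam
        curve.append((precision, recall))
    return curve
```

For a model that places all its mass on one of two equally likely reference modes, the curve reaches a precision of 1 but its recall never exceeds 0.5, which is the kind of separation between sample quality and coverage that a single scalar score cannot express.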
Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning
Title | Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning |
Authors | Hock Hung Chieng, Noorhaniza Wahid, Pauline Ong, Sai Raj Kishore Perla |
Abstract | Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has become the default activation function across the deep learning community since 2012. Although ReLU has been popular, its hard zero property heavily hinders negative values from propagating through the network. Consequently, deep neural networks have not benefited from negative representations. In this work, an activation function called Flatten-T Swish (FTS) that leverages the benefit of negative values is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained using the MNIST dataset on five different deep fully connected neural networks (DFNNs) with depths varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T=-0.20 has the best overall performance. Compared with ReLU, FTS (T=-0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on the wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs respectively. Apart from this, the study also found that FTS converges twice as fast as ReLU. Although other existing activation functions are also evaluated, this study elects ReLU as the baseline activation function. |
Tasks | Image Classification |
Published | 2018-12-15 |
URL | http://arxiv.org/abs/1812.06247v1 |
PDF | http://arxiv.org/pdf/1812.06247v1.pdf |
PWC | https://paperswithcode.com/paper/flatten-t-swish-a-thresholded-relu-swish-like |
Repo | https://github.com/lessw2020/FTSwishPlus |
Framework | none |
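The abstract names the activation and its threshold T but does not give the formula. The sketch below assumes the piecewise form suggested by the description "thresholded ReLU-Swish-like": a Swish-shaped response shifted by T for non-negative inputs, and the constant T for negative inputs. Treat the exact form as an assumption to be checked against the paper; only the default T = -0.20 comes from the abstract.

```python
import math


def flatten_t_swish(x, threshold=-0.20):
    """Assumed Flatten-T Swish: x * sigmoid(x) + T for x >= 0, and T otherwise."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x)) + threshold
    return threshold


def relu(x):
    """ReLU baseline for comparison: negative inputs map to exactly zero."""
    return max(0.0, x)
```

Under this form, negative inputs produce the constant T instead of zero, so with a negative T the layer still emits a negative value where ReLU would emit nothing.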
Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models
Title | Truly unsupervised acoustic word embeddings using weak top-down constraints in encoder-decoder models |
Authors | Herman Kamper |
Abstract | We investigate unsupervised models that can map a variable-duration speech segment to a fixed-dimensional representation. In settings where unlabelled speech is the only available resource, such acoustic word embeddings can form the basis for “zero-resource” speech search, discovery and indexing systems. Most existing unsupervised embedding methods still use some supervision, such as word or phoneme boundaries. Here we propose the encoder-decoder correspondence autoencoder (EncDec-CAE), which, instead of true word segments, uses automatically discovered segments: an unsupervised term discovery system finds pairs of words of the same unknown type, and the EncDec-CAE is trained to reconstruct one word given the other as input. We compare it to a standard encoder-decoder autoencoder (AE), a variational AE with a prior over its latent embedding, and downsampling. EncDec-CAE outperforms its closest competitor by 24% relative in average precision on two languages in a word discrimination task. |
Tasks | Word Embeddings |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00403v2 |
PDF | http://arxiv.org/pdf/1811.00403v2.pdf |
PWC | https://paperswithcode.com/paper/truly-unsupervised-acoustic-word-embeddings |
Repo | https://github.com/kamperh/recipe_bucktsong_awe |
Framework | tf |
Imitating Latent Policies from Observation
Title | Imitating Latent Policies from Observation |
Authors | Ashley D. Edwards, Himanshu Sahni, Yannick Schroecker, Charles L. Isbell |
Abstract | In this paper, we describe a novel approach to imitation learning that infers latent policies directly from state observations. We introduce a method that characterizes the causal effects of latent actions on observations while simultaneously predicting their likelihood. We then outline an action alignment procedure that leverages a small amount of environment interactions to determine a mapping between the latent and real-world actions. We show that this corrected labeling can be used for imitating the observed behavior, even though no expert actions are given. We evaluate our approach within classic control environments and a platform game and demonstrate that it performs better than standard approaches. Code for this work is available at https://github.com/ashedwards/ILPO. |
Tasks | Imitation Learning |
Published | 2018-05-21 |
URL | https://arxiv.org/abs/1805.07914v3 |
PDF | https://arxiv.org/pdf/1805.07914v3.pdf |
PWC | https://paperswithcode.com/paper/imitating-latent-policies-from-observation |
Repo | https://github.com/ashedwards/ILPO |
Framework | tf |
3D-PSRNet: Part Segmented 3D Point Cloud Reconstruction From a Single Image
Title | 3D-PSRNet: Part Segmented 3D Point Cloud Reconstruction From a Single Image |
Authors | Priyanka Mandikal, Navaneet K L, R. Venkatesh Babu |
Abstract | We propose a mechanism to reconstruct part annotated 3D point clouds of objects given just a single input image. We demonstrate that jointly training for both reconstruction and segmentation leads to improved performance in both the tasks, when compared to training for each task individually. The key idea is to propagate information from each task so as to aid the other during the training procedure. Towards this end, we introduce a location-aware segmentation loss in the training regime. We empirically show the effectiveness of the proposed loss in generating more faithful part reconstructions while also improving segmentation accuracy. We thoroughly evaluate the proposed approach on different object categories from the ShapeNet dataset to obtain improved results in reconstruction as well as segmentation. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00461v1 |
PDF | http://arxiv.org/pdf/1810.00461v1.pdf |
PWC | https://paperswithcode.com/paper/3d-psrnet-part-segmented-3d-point-cloud |
Repo | https://github.com/val-iisc/3d-psrnet |
Framework | tf |
Morse Code Datasets for Machine Learning
Title | Morse Code Datasets for Machine Learning |
Authors | Sourya Dey, Keith M. Chugg, Peter A. Beerel |
Abstract | We present an algorithm to generate synthetic datasets of tunable difficulty on classification of Morse code symbols for supervised machine learning problems, in particular, neural networks. The datasets are spatially one-dimensional and have a small number of input features, leading to high density of input information content. This makes them particularly challenging when implementing network complexity reduction methods. We explore how network performance is affected by deliberately adding various forms of noise and expanding the feature set and dataset size. Finally, we establish several metrics to indicate the difficulty of a dataset, and evaluate their merits. The algorithm and datasets are open-source. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04239v2 |
PDF | http://arxiv.org/pdf/1807.04239v2.pdf |
PWC | https://paperswithcode.com/paper/morse-code-datasets-for-machine-learning |
Repo | https://github.com/souryadey/morse-dataset |
Framework | none |