April 2, 2020

Paper Group ANR 304

Robust Multi-channel Speech Recognition using Frequency Aligned Network

Title Robust Multi-channel Speech Recognition using Frequency Aligned Network
Authors Taejin Park, Kenichi Kumatani, Minhua Wu, Shiva Sundaram
Abstract Conventional speech enhancement techniques such as beamforming have known benefits for far-field speech recognition. Our own work in frequency-domain multi-channel acoustic modeling has shown additional improvements by training a spatial filtering layer jointly within an acoustic model. In this paper, we further develop this idea and use a frequency aligned network for robust multi-channel automatic speech recognition (ASR). Unlike an affine layer in the frequency domain, the proposed frequency aligned component prevents one frequency bin from influencing other frequency bins. We show that this modification not only reduces the number of parameters in the model but also significantly improves the ASR performance. We investigate the effects of the frequency aligned network through ASR experiments on real-world far-field data where users are interacting with an ASR system in uncontrolled acoustic environments. We show that our multi-channel acoustic model with a frequency aligned network achieves up to an 18% relative reduction in word error rate.
Tasks Speech Enhancement, Speech Recognition
Published 2020-02-06
URL https://arxiv.org/abs/2002.02520v1
PDF https://arxiv.org/pdf/2002.02520v1.pdf
PWC https://paperswithcode.com/paper/robust-multi-channel-speech-recognition-using
Repo
Framework
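
The contrast the abstract draws between an affine layer and a frequency aligned component can be illustrated with a small sketch. This is not the authors' layer: the function names, the weight layout `w[f][d][c]`, and the example sizes are assumptions chosen for illustration. The idea shown is only that each frequency bin gets its own small set of weights over the input channels, so no bin mixes with another and the parameter count drops.

```python
def frequency_aligned_layer(x, w):
    """Apply per-bin weights so that bins never mix (illustrative sketch).

    x[c][f]    : complex spectrum value for input channel c, frequency bin f
    w[f][d][c] : hypothetical weight for bin f, output d, input channel c
    Returns y[d][f] = sum over c of w[f][d][c] * x[c][f].
    """
    n_bins = len(w)
    n_out = len(w[0])
    n_ch = len(x)
    return [
        [sum(w[f][d][c] * x[c][f] for c in range(n_ch)) for f in range(n_bins)]
        for d in range(n_out)
    ]


def param_counts(n_ch, n_bins, n_out):
    """Weight counts (biases ignored) for a full affine layer vs. per-bin weights."""
    affine = (n_ch * n_bins) * (n_out * n_bins)  # every input bin feeds every output bin
    aligned = n_bins * n_out * n_ch              # each bin only sees its own channels
    return affine, aligned
```

For example, `param_counts(2, 4, 3)` compares a layer with 2 channels, 4 bins and 3 outputs per bin: the affine variant grows with the square of the number of bins, while the aligned variant grows linearly.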

On Layer Normalization in the Transformer Architecture

Title On Layer Normalization in the Transformer Architecture
Authors Ruibin Xiong, Yunchang Yang, Di He, Kai Zheng, Shuxin Zheng, Chen Xing, Huishuai Zhang, Yanyan Lan, Liwei Wang, Tie-Yan Liu
Abstract The Transformer is widely used in natural language processing tasks. To train a Transformer, however, one usually needs a carefully designed learning rate warm-up stage, which is shown to be crucial to the final performance but slows down the optimization and requires more hyper-parameter tuning. In this paper, we first study theoretically why the learning rate warm-up stage is essential and show that the location of layer normalization matters. Specifically, we prove with mean field theory that at initialization, for the originally designed Post-LN Transformer, which places the layer normalization between the residual blocks, the expected gradients of the parameters near the output layer are large. Therefore, using a large learning rate on those gradients makes the training unstable. The warm-up stage is practically helpful for avoiding this problem. On the other hand, our theory also shows that if the layer normalization is put inside the residual blocks (recently proposed as Pre-LN Transformer), the gradients are well-behaved at initialization. This motivates us to remove the warm-up stage for the training of Pre-LN Transformers. We show in our experiments that Pre-LN Transformers without the warm-up stage can reach comparable results with baselines while requiring significantly less training time and hyper-parameter tuning on a wide range of applications.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.04745v1
PDF https://arxiv.org/pdf/2002.04745v1.pdf
PWC https://paperswithcode.com/paper/on-layer-normalization-in-the-transformer-1
Repo
Framework
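
The two placements of layer normalization described in the abstract can be written down in a few lines. The sketch below is a simplification under stated assumptions: vectors are plain Python lists, `sublayer` stands in for attention or the feed-forward network, and the learnable gain and bias of layer normalization are omitted.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no gain or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_ln_block(x, sublayer):
    """Post-LN: normalize after the residual addition, LN(x + sublayer(x))."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


def pre_ln_block(x, sublayer):
    """Pre-LN: normalize inside the residual branch, x + sublayer(LN(x))."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]
```

In the Pre-LN form the input `x` reaches the output through an unnormalized identity path, which is the structural difference the paper's analysis of gradients at initialization is concerned with.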

Reinforcement Learning in FlipIt

Title Reinforcement Learning in FlipIt
Authors Laura Greige, Peter Chin
Abstract Reinforcement learning has shown much success in games such as chess, backgammon and Go. However, in most of these games, agents have full knowledge of the environment at all times. In this paper, we describe a deep learning model that successfully optimizes its score using reinforcement learning in a game with incomplete and imperfect information. We apply our model to FlipIt, a two-player game in which both players, the attacker and the defender, compete for ownership of a shared resource and only receive information on the current state (such as the current owner of the resource, or the time since the opponent last moved, etc.) upon making a move. Our model is a deep neural network combined with Q-learning and is trained to maximize the defender’s time of ownership of the resource. Despite the imperfect observations, our model successfully learns an optimal cost-effective counter-strategy and shows the advantages of the use of deep reinforcement learning in game theoretic scenarios. Our results show that it outperforms the Greedy strategy against distributions such as periodic and exponential distributions without any prior knowledge of the opponent’s strategy, and we generalize the model to $n$-player games.
Tasks Q-Learning
Published 2020-02-28
URL https://arxiv.org/abs/2002.12909v1
PDF https://arxiv.org/pdf/2002.12909v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-in-flipit
Repo
Framework
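
The abstract combines a deep neural network with Q-learning. As a much simpler stand-in, the sketch below shows the standard tabular Q-learning update that such a network approximates. The action names, the state encoding, and the values of `alpha` and `gamma` are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical action set for a FlipIt-style defender.
ACTIONS = ("wait", "flip")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: states could be, say, bucketed time since the defender's last move.
q_table = defaultdict(float)
q_update(q_table, state=0, action="wait", reward=1.0, next_state=1)
```

A deep variant replaces the table `q` with a network that maps an observation to one value per action, trained toward the same target.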

Corrupted Multidimensional Binary Search: Learning in the Presence of Irrational Agents

Title Corrupted Multidimensional Binary Search: Learning in the Presence of Irrational Agents
Authors Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata
Abstract Standard game-theoretic formulations for settings like contextual pricing and security games assume that agents act in accordance with a specific behavioral model. In practice, however, some agents may not subscribe to the dominant behavioral model or may act in ways that are arbitrarily inconsistent. Existing algorithms heavily depend on the model being (approximately) accurate for all agents and have poor performance in the presence of even a few such arbitrarily irrational agents. How do we design learning algorithms that are robust to the presence of arbitrarily irrational agents? We address this question for a number of canonical game-theoretic applications by designing a robust algorithm for the fundamental problem of multidimensional binary search. The performance of our algorithm degrades gracefully with the number of corrupted rounds, which correspond to irrational agents and need not be known in advance. As binary search is the key primitive in algorithms for contextual pricing, Stackelberg Security Games, and other game-theoretic applications, we immediately obtain robust algorithms for these settings. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis, and may be of independent algorithmic interest.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11650v2
PDF https://arxiv.org/pdf/2002.11650v2.pdf
PWC https://paperswithcode.com/paper/corrupted-multidimensional-binary-search
Repo
Framework

Lattice protein design using Bayesian learning

Title Lattice protein design using Bayesian learning
Authors Tomoei Takahashi, George Chikenji, Kei Tokita
Abstract A novel protein design method using Bayesian learning is proposed in this work. We consider a posterior probability of amino acid sequences by taking into account water and assuming a prior of sequences. For some instances of a target conformation of a two-dimensional (2D) lattice Hydrophobic-Polar (HP) model, our method successfully finds an amino acid sequence for which the target conformation has a unique ground state. However, the performance was not as good for 3D lattice HP models compared with 2D models. Furthermore, we find a strong linearity between the chemical potential of water and the number of surface residues, thereby revealing the relationship between protein structure and the effect of water molecules. The advantage of our method is that it greatly reduces computation time, because it does not require long calculations for the partition function corresponding to an exhaustive conformational search. As our method uses a general form of Bayesian learning and statistical mechanics and is not limited to lattice HP proteins, the results presented here elucidate some heuristics used successfully in previous protein design methods.
Tasks
Published 2020-03-14
URL https://arxiv.org/abs/2003.06601v4
PDF https://arxiv.org/pdf/2003.06601v4.pdf
PWC https://paperswithcode.com/paper/lattice-protein-design-using-bayesian
Repo
Framework

FastDTW is approximate and Generally Slower than the Algorithm it Approximates

Title FastDTW is approximate and Generally Slower than the Algorithm it Approximates
Authors Renjie Wu, Eamonn J. Keogh
Abstract Many time series data mining problems can be solved with repeated use of a distance measure. Examples of such tasks include similarity search, clustering, classification, anomaly detection and segmentation. For over two decades it has been known that the Dynamic Time Warping (DTW) distance measure is the best measure to use for most tasks, in most domains. Because the classic DTW algorithm has quadratic time complexity, many ideas have been introduced to reduce its amortized time, or to quickly approximate it. One of the most cited approximate approaches is FastDTW. The FastDTW algorithm has well over a thousand citations and has been explicitly used in several hundred research efforts. In this work, we make a surprising claim. In any realistic data mining application, the approximate FastDTW is much slower than the exact DTW. This fact clearly has implications for the community that uses this algorithm: allowing it to address much larger datasets, get exact results, and do so in less time. Our observation also has a more sobering lesson for the community. This work may serve as a reminder to the community to exercise more caution in uncritically accepting published results.
Tasks Anomaly Detection, Time Series
Published 2020-03-25
URL https://arxiv.org/abs/2003.11246v1
PDF https://arxiv.org/pdf/2003.11246v1.pdf
PWC https://paperswithcode.com/paper/fastdtw-is-approximate-and-generally-slower
Repo
Framework
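
For readers unfamiliar with the algorithm that FastDTW approximates, here is a minimal sketch of exact DTW by dynamic programming. It assumes one-dimensional sequences and absolute difference as the local cost (squared difference is another common choice), and the optional `window` argument is a Sakoe-Chiba style band added here for illustration. It is not the code used in the paper.

```python
import math


def dtw_distance(a, b, window=None):
    """Exact DTW between sequences a and b with quadratic time and memory.

    window: optional maximum |i - j| allowed along the warping path;
            None means the path is unconstrained.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be wide enough to reach the final cell.
    window = max(window, abs(n - m))

    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

With a narrow band, only a strip of the cost matrix around the diagonal is filled in, so the work drops well below the full quadratic table.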

Event Probability Mask (EPM) and Event Denoising Convolutional Neural Network (EDnCNN) for Neuromorphic Cameras

Title Event Probability Mask (EPM) and Event Denoising Convolutional Neural Network (EDnCNN) for Neuromorphic Cameras
Authors R. Wes Baldwin, Mohammed Almatrafi, Vijayan Asari, Keigo Hirakawa
Abstract This paper presents a novel method for labeling real-world neuromorphic camera sensor data by calculating the likelihood of generating an event at each pixel within a short time window, which we refer to as “event probability mask” or EPM. Its applications include (i) objective benchmarking of event denoising performance, (ii) training convolutional neural networks for noise removal called “event denoising convolutional neural network” (EDnCNN), and (iii) estimating internal neuromorphic camera parameters. We provide the first dataset (DVSNOISE20) of real-world labeled neuromorphic camera events for noise removal.
Tasks Denoising
Published 2020-03-18
URL https://arxiv.org/abs/2003.08282v2
PDF https://arxiv.org/pdf/2003.08282v2.pdf
PWC https://paperswithcode.com/paper/event-probability-mask-epm-and-event
Repo
Framework

3D U-Net for Segmentation of Plant Root MRI Images in Super-Resolution

Title 3D U-Net for Segmentation of Plant Root MRI Images in Super-Resolution
Authors Yi Zhao, Nils Wandel, Magdalena Landl, Andrea Schnepf, Sven Behnke
Abstract Magnetic resonance imaging (MRI) enables plant scientists to non-invasively study root system development and root-soil interaction. However, challenging recording conditions, such as low resolution and a high level of noise, hamper the performance of traditional root extraction algorithms. We propose to increase signal-to-noise ratio and resolution by segmenting the scanned volumes into root and soil in super-resolution using a 3D U-Net. Tests on real data show that the trained network is capable of detecting most roots successfully and even finds roots that were missed by human annotators. Our experiments show that the segmentation performance can be further improved with modifications of the loss function.
Tasks Super-Resolution
Published 2020-02-21
URL https://arxiv.org/abs/2002.09317v1
PDF https://arxiv.org/pdf/2002.09317v1.pdf
PWC https://paperswithcode.com/paper/3d-u-net-for-segmentation-of-plant-root-mri
Repo
Framework

Memory Augmented Generative Adversarial Networks for Anomaly Detection

Title Memory Augmented Generative Adversarial Networks for Anomaly Detection
Authors Ziyi Yang, Teng Zhang, Iman Soltani Bozchalooi, Eric Darve
Abstract In this paper, we present a memory-augmented algorithm for anomaly detection. Classical anomaly detection algorithms focus on learning to model and generate normal data, but typically guarantees for detecting anomalous data are weak. The proposed Memory Augmented Generative Adversarial Networks (MEMGAN) interacts with a memory module for both the encoding and generation processes. Our algorithm is such that most of the encoded normal data are inside the convex hull of the memory units, while the abnormal data are isolated outside. Such a remarkable property leads to good (resp. poor) reconstruction for normal (resp. abnormal) data and therefore provides a strong guarantee for anomaly detection. Decoded memory units in MEMGAN are more interpretable and disentangled than previous methods, which further demonstrates the effectiveness of the memory mechanism. Experimental results on twenty anomaly detection datasets of CIFAR-10 and MNIST show that MEMGAN demonstrates significant improvements over previous anomaly detection methods.
Tasks Anomaly Detection
Published 2020-02-07
URL https://arxiv.org/abs/2002.02669v1
PDF https://arxiv.org/pdf/2002.02669v1.pdf
PWC https://paperswithcode.com/paper/memory-augmented-generative-adversarial
Repo
Framework

Anomaly Detection using Deep Autoencoders for in-situ Wastewater Systems Monitoring Data

Title Anomaly Detection using Deep Autoencoders for in-situ Wastewater Systems Monitoring Data
Authors Stefania Russo, Andy Disch, Frank Blumensaat, Kris Villez
Abstract Due to the growing amount of data from in-situ sensors in wastewater systems, it becomes necessary to automatically identify abnormal behaviours and ensure high data quality. This paper proposes an anomaly detection method based on a deep autoencoder for in-situ wastewater systems monitoring data. The autoencoder architecture is based on 1D Convolutional Neural Network (CNN) layers where the convolutions are performed over the inputs across the temporal axis of the data. Anomaly detection is then performed based on the reconstruction error of the decoding stage. The approach is validated on multivariate time series from in-sewer process monitoring data. We discuss the results and the challenge of labelling anomalies in complex time series. We suggest that our proposed approach can support the domain experts in the identification of anomalies.
Tasks Anomaly Detection, Time Series
Published 2020-02-07
URL https://arxiv.org/abs/2002.03843v3
PDF https://arxiv.org/pdf/2002.03843v3.pdf
PWC https://paperswithcode.com/paper/anomaly-detection-using-deep-autoencoders-for
Repo
Framework
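
The detection step in the abstract, scoring by reconstruction error, can be sketched independently of the autoencoder itself. The snippet below assumes the reconstructions have already been produced by some trained model. The mean squared error score and the fixed `threshold` are illustrative assumptions, since the abstract does not say how the error is thresholded.

```python
def reconstruction_errors(windows, reconstructions):
    """Mean squared error between each input window and its reconstruction."""
    return [
        sum((x - r) ** 2 for x, r in zip(window, recon)) / len(window)
        for window, recon in zip(windows, reconstructions)
    ]


def flag_anomalies(errors, threshold):
    """Mark windows whose reconstruction error exceeds the chosen threshold."""
    return [e > threshold for e in errors]
```

In practice the threshold would have to be chosen from data, for example from the distribution of errors on windows believed to be normal, which ties in with the labelling challenge the abstract mentions.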

Acoustic anomaly detection via latent regularized gaussian mixture generative adversarial networks

Title Acoustic anomaly detection via latent regularized gaussian mixture generative adversarial networks
Authors Chengwei Chen, Pan Chen, Lingyu Yang, Jinyuan Mo, Haichuan Song, Yuan Xie, Lizhuang Ma
Abstract Acoustic anomaly detection aims at distinguishing abnormal acoustic signals from normal ones. It suffers from class imbalance and a lack of abnormal instances. In addition, collecting all kinds of abnormal or unknown samples for training purposes is impractical and time-consuming. In this paper, a novel Gaussian Mixture Generative Adversarial Network (GMGAN) is proposed under a semi-supervised learning framework, in which the underlying structure of the training data is not only captured in the spectrogram reconstruction space but is also further restricted in the space of latent representations in a discriminant manner. Experiments show that our model has clear superiority over previous methods and achieves state-of-the-art results on the DCASE dataset.
Tasks Anomaly Detection
Published 2020-02-04
URL https://arxiv.org/abs/2002.01107v2
PDF https://arxiv.org/pdf/2002.01107v2.pdf
PWC https://paperswithcode.com/paper/acoustic-anomaly-detection-via-latent
Repo
Framework

A Geometric Perspective on Visual Imitation Learning

Title A Geometric Perspective on Visual Imitation Learning
Authors Jun Jin, Laura Petrich, Masood Dehghan, Martin Jagersand
Abstract We consider the problem of visual imitation learning without human supervision (e.g. kinesthetic teaching or teleoperation), nor access to an interactive reinforcement learning (RL) training environment. We present a geometric perspective to derive solutions to this problem. Specifically, we propose VGS-IL (Visual Geometric Skill Imitation Learning), an end-to-end geometry-parameterized task concept inference method, to infer globally consistent geometric feature association rules from human demonstration video frames. We show that, instead of learning actions from image pixels, learning a geometry-parameterized task concept provides an explainable and invariant representation across demonstrator to imitator under various environmental settings. Moreover, such a task concept representation provides a direct link with geometric vision based controllers (e.g. visual servoing), allowing for efficient mapping of high-level task concepts to low-level robot actions.
Tasks Imitation Learning
Published 2020-03-05
URL https://arxiv.org/abs/2003.02768v1
PDF https://arxiv.org/pdf/2003.02768v1.pdf
PWC https://paperswithcode.com/paper/a-geometric-perspective-on-visual-imitation
Repo
Framework

Comprehensive Analysis of Time Series Forecasting Using Neural Networks

Title Comprehensive Analysis of Time Series Forecasting Using Neural Networks
Authors Manie Tadayon, Yumi Iwashita
Abstract Time series forecasting has gained a lot of attention recently because many real-world phenomena can be modeled as time series. The massive volume of data and recent advancements in the processing power of computers enable researchers to develop more sophisticated machine learning algorithms, such as neural networks, to forecast time series data. In this paper, we propose various neural network architectures to forecast time series data using the dynamic measurements; moreover, we introduce various architectures that combine static and dynamic measurements for forecasting. We also investigate how techniques such as anomaly detection and clustering affect forecasting accuracy. Our results indicate that clustering can improve the overall prediction time as well as the forecasting performance of the neural network. Furthermore, we show that feature-based clustering can outperform distance-based clustering in terms of speed and efficiency. Finally, our results indicate that adding more predictors to forecast the target variable will not necessarily improve the forecasting accuracy.
Tasks Anomaly Detection, Time Series, Time Series Forecasting
Published 2020-01-27
URL https://arxiv.org/abs/2001.09547v1
PDF https://arxiv.org/pdf/2001.09547v1.pdf
PWC https://paperswithcode.com/paper/comprehensive-analysis-of-time-series
Repo
Framework

One-Shot Informed Robotic Visual Search in the Wild

Title One-Shot Informed Robotic Visual Search in the Wild
Authors Karim Koreitem, Florian Shkurti, Travis Manderson, Wei-Di Chang, Juan Camilo Gamboa Higuera, Gregory Dudek
Abstract We consider the task of underwater robot navigation for the purpose of collecting scientifically-relevant video data for environmental monitoring. The majority of field robots that currently perform monitoring tasks in unstructured natural environments navigate via path-tracking a pre-specified sequence of waypoints. Although this navigation method is often necessary, it is limiting because the robot does not have a model of what the scientist deems to be relevant visual observations. Thus, the robot can neither visually search for particular types of objects, nor focus its attention on parts of the scene that might be more relevant than the pre-specified waypoints and viewpoints. In this paper we propose a method that enables informed visual navigation via a learned visual similarity operator that guides the robot’s visual search towards parts of the scene that look like an exemplar image, which is given by the user as a high-level specification for data collection. We propose and evaluate a weakly-supervised video representation learning method that outperforms ImageNet embeddings for similarity tasks in the underwater domain. We also demonstrate the deployment of this similarity operator during informed visual navigation in collaborative environmental monitoring scenarios, in large-scale field trials, where the robot and a human scientist jointly search for relevant visual content.
Tasks Representation Learning, Robot Navigation, Visual Navigation
Published 2020-03-22
URL https://arxiv.org/abs/2003.10010v1
PDF https://arxiv.org/pdf/2003.10010v1.pdf
PWC https://paperswithcode.com/paper/one-shot-informed-robotic-visual-search-in
Repo
Framework

Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning

Title Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning
Authors Roei Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov
Abstract Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word “meaning” in the sense that distances between words’ vectors correspond to their semantic proximity. This enables transfer learning of semantics for a variety of natural language processing tasks. Word embeddings are typically trained on large public corpora such as Wikipedia or Twitter. We demonstrate that an attacker who can modify the corpus on which the embedding is trained can control the “meaning” of new and existing words by changing their locations in the embedding space. We develop an explicit expression over corpus features that serves as a proxy for distance between words and establish a causative relationship between its values and embedding distances. We then show how to use this relationship for two adversarial objectives: (1) make a word a top-ranked neighbor of another word, and (2) move a word from one semantic cluster to another. An attack on the embedding can affect diverse downstream tasks, demonstrating for the first time the power of data poisoning in transfer learning scenarios. We use this attack to manipulate query expansion in information retrieval systems such as resume search, make certain names more or less visible to named entity recognition models, and cause new words to be translated to a particular target word regardless of the language. Finally, we show how the attacker can generate linguistically likely corpus modifications, thus fooling defenses that attempt to filter implausible sentences from the corpus using a language model.
Tasks Data Poisoning, Information Retrieval, Language Modelling, Named Entity Recognition, Transfer Learning, Word Embeddings
Published 2020-01-14
URL https://arxiv.org/abs/2001.04935v1
PDF https://arxiv.org/pdf/2001.04935v1.pdf
PWC https://paperswithcode.com/paper/humpty-dumpty-controlling-word-meanings-via
Repo
Framework