January 30, 2020

2987 words 15 mins read

Paper Group ANR 290

Incremental Online Spoken Language Understanding. Iterative Delexicalization for Improved Spoken Language Understanding. A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification. A Taxonomy of Channel Pruning Signals in CNNs. Energy-Aware Analog Aggregation for Federated Learning with Redundant Data. LivDet in Action - Fi …

Incremental Online Spoken Language Understanding

Title Incremental Online Spoken Language Understanding
Authors Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan
Abstract Spoken Language Understanding (SLU) typically comprises an automatic speech recognition (ASR) module followed by a natural language understanding (NLU) module. The two modules process signals in a blocking sequential fashion, i.e., the NLU often has to wait for the ASR to finish processing on an utterance basis, potentially leading to high latencies that render the spoken interaction less natural. In this paper, we propose recurrent neural network (RNN) based incremental processing towards the SLU task of intent detection. The proposed methodology offers lower latencies than a typical SLU system, without any significant reduction in system accuracy. We introduce and analyze different recurrent neural network architectures for incremental and online processing of the ASR transcripts and compare them to existing offline systems. A lexical End-of-Sentence (EOS) detector is proposed for segmenting the stream of transcripts into sentences for intent classification. Intent detection experiments are conducted on the benchmark ATIS dataset, modified to emulate a continuous incremental stream of words with no utterance demarcation. We also analyze the prospects of early intent detection, before EOS, with our proposed system.
Tasks Intent Classification, Intent Detection, Speech Recognition, Spoken Language Understanding
Published 2019-10-23
URL https://arxiv.org/abs/1910.10287v1
PDF https://arxiv.org/pdf/1910.10287v1.pdf
PWC https://paperswithcode.com/paper/incremental-online-spoken-language
Repo
Framework
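
A minimal sketch of the incremental idea described in the abstract above: a unidirectional RNN consumes the ASR word stream token by token and emits an intent distribution and an End-of-Sentence score at every step, so an intent can be committed as soon as the EOS detector fires. The architecture, sizes, and threshold below are illustrative assumptions, not the paper's exact model.

```python
# Hypothetical sketch: a GRU that consumes a word stream one token at a time and
# emits, at every step, an intent distribution and an end-of-sentence (EOS) score.
import torch
import torch.nn as nn

class IncrementalIntentModel(nn.Module):
    def __init__(self, vocab_size, num_intents, emb_dim=100, hidden_dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.intent_head = nn.Linear(hidden_dim, num_intents)
        self.eos_head = nn.Linear(hidden_dim, 1)           # lexical EOS detector

    def step(self, token_id, hidden=None):
        """Process one incoming word; return intent logits, EOS logit, new state."""
        x = self.emb(token_id.view(1, 1))                  # (batch=1, seq=1, emb)
        out, hidden = self.gru(x, hidden)
        return self.intent_head(out[:, -1]), self.eos_head(out[:, -1]), hidden

# Streaming usage: predict an intent as soon as the EOS detector fires.
model = IncrementalIntentModel(vocab_size=10000, num_intents=26)
hidden = None
for token_id in torch.tensor([12, 57, 3, 891]):            # incoming ASR word IDs
    intent_logits, eos_logit, hidden = model.step(token_id, hidden)
    if torch.sigmoid(eos_logit) > 0.5:                     # sentence boundary detected
        intent = intent_logits.argmax(-1)
        hidden = None                                      # reset for the next sentence
```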

Iterative Delexicalization for Improved Spoken Language Understanding

Title Iterative Delexicalization for Improved Spoken Language Understanding
Authors Avik Ray, Yilin Shen, Hongxia Jin
Abstract Recurrent neural network (RNN) based joint intent classification and slot tagging models have achieved tremendous success in recent years for building spoken language understanding and dialog systems. However, these models suffer from poor performance for slots which often encounter large semantic variability in slot values after deployment (e.g. message texts, partial movie/artist names). While greedy delexicalization of slots in the input utterance via substring matching can partly improve performance, it often produces incorrect input. Moreover, such techniques cannot delexicalize slots with out-of-vocabulary slot values not seen at training. In this paper, we propose a novel iterative delexicalization algorithm, which can accurately delexicalize the input, even with out-of-vocabulary slot values. Based on model confidence of the current delexicalized input, our algorithm improves delexicalization in every iteration to converge to the best input having the highest confidence. We show on benchmark and in-house datasets that our algorithm can greatly improve parsing performance for RNN based models, especially for out-of-distribution slot values.
Tasks Intent Classification, Spoken Language Understanding
Published 2019-10-15
URL https://arxiv.org/abs/1910.07060v1
PDF https://arxiv.org/pdf/1910.07060v1.pdf
PWC https://paperswithcode.com/paper/iterative-delexicalization-for-improved
Repo
Framework
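
A hedged sketch of the confidence-driven loop the abstract describes: repeatedly propose delexicalized variants of the input and keep the one the tagging model parses with the highest confidence. `model_confidence` and `candidate_delexicalizations` are hypothetical helpers standing in for the model score and the span-substitution step; this is not the authors' implementation.

```python
# Hypothetical sketch of confidence-driven iterative delexicalization.

def iterative_delexicalize(tokens, model_confidence, candidate_delexicalizations,
                           max_iters=5):
    """Repeatedly replace suspected slot spans with placeholders, keeping the
    hypothesis that the tagging model parses with the highest confidence."""
    best = list(tokens)
    best_conf = model_confidence(best)
    for _ in range(max_iters):
        improved = False
        for candidate in candidate_delexicalizations(best):
            conf = model_confidence(candidate)
            if conf > best_conf:            # accept only confidence-increasing edits
                best, best_conf, improved = candidate, conf, True
        if not improved:                    # converged: no edit raises confidence
            break
    return best, best_conf
```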

A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification

Title A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification
Authors Varun Kumar, Hadrien Glaude, Cyprien de Lichy, William Campbell
Abstract New conversation topics and functionalities are constantly being added to conversational AI agents like Amazon Alexa and Apple Siri. As data collection and annotation is not scalable and is often costly, only a handful of examples for the new functionalities are available, which results in poor generalization performance. We formulate this as a Few-Shot Integration (FSI) problem where a few examples are used to introduce a new intent. In this paper, we study six feature space data augmentation methods to improve classification performance in the FSI setting in combination with both supervised and unsupervised representation learning methods such as BERT. Through realistic experiments on two public conversational datasets, SNIPS and the Facebook Dialog corpus, we show that data augmentation in feature space provides an effective way to improve intent classification performance in the few-shot setting beyond traditional transfer learning approaches. In particular, we show that (a) upsampling in latent space is a competitive baseline for feature space augmentation, and (b) adding the difference between two examples to a new example is a simple yet effective data augmentation method.
Tasks Data Augmentation, Intent Classification, Representation Learning, Transfer Learning, Unsupervised Representation Learning
Published 2019-10-09
URL https://arxiv.org/abs/1910.04176v1
PDF https://arxiv.org/pdf/1910.04176v1.pdf
PWC https://paperswithcode.com/paper/a-closer-look-at-feature-space-data
Repo
Framework
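
The two augmentation strategies highlighted in the abstract are easy to state concretely in feature space. The sketch below, under the assumption that utterances have already been encoded (e.g. by BERT) into fixed-size vectors, shows latent-space upsampling and the "delta" method of adding the difference between two examples to a new example; names and sampling choices are illustrative.

```python
# Feature-space augmentation sketches over precomputed utterance encodings.
import numpy as np

def upsample(new_intent_feats, n_aug, rng=None):
    """Latent-space upsampling baseline: resample the few available vectors."""
    rng = np.random.default_rng(rng)
    return new_intent_feats[rng.integers(len(new_intent_feats), size=n_aug)]

def delta_augment(new_intent_feats, known_intent_feats, n_aug, rng=None):
    """Add the difference between two examples of existing intents to a
    feature vector of the new (few-shot) intent."""
    rng = np.random.default_rng(rng)
    base = new_intent_feats[rng.integers(len(new_intent_feats), size=n_aug)]
    a = known_intent_feats[rng.integers(len(known_intent_feats), size=n_aug)]
    b = known_intent_feats[rng.integers(len(known_intent_feats), size=n_aug)]
    return base + (a - b)          # synthetic feature vectors for the new intent
```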

A Taxonomy of Channel Pruning Signals in CNNs

Title A Taxonomy of Channel Pruning Signals in CNNs
Authors Kaveena Persand, Andrew Anderson, David Gregg
Abstract Convolutional neural networks (CNNs) are widely used for classification problems. However, they often require large amounts of computation and memory which are not readily available in resource-constrained systems. Pruning unimportant parameters from CNNs to reduce these requirements has been a subject of intensive research in recent years. However, novel approaches in pruning signals are sometimes difficult to compare against each other. We propose a taxonomy that classifies pruning signals based on four mostly-orthogonal components of the signal. We also empirically evaluate 396 pruning signals, including existing ones and new signals constructed from the components of existing signals. We find that some of our newly constructed signals outperform the best existing pruning signals.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04675v1
PDF https://arxiv.org/pdf/1906.04675v1.pdf
PWC https://paperswithcode.com/paper/a-taxonomy-of-channel-pruning-signals-in-cnns
Repo
Framework
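
For concreteness, one common example of a channel pruning signal (a single point the taxonomy would classify, not the paper's full set of 396 signals) is the L1 norm of each output channel's filter weights; a sketch:

```python
# One channel-pruning signal: per-channel L1 norm of a convolution's weights.
import torch
import torch.nn as nn

def l1_channel_signal(conv: nn.Conv2d) -> torch.Tensor:
    # weight shape: (out_channels, in_channels, kH, kW) -> one score per output channel
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

conv = nn.Conv2d(64, 128, kernel_size=3)
scores = l1_channel_signal(conv)
prune_ratio = 0.25
k = int(prune_ratio * scores.numel())
channels_to_prune = torch.argsort(scores)[:k]   # lowest-saliency channels first
```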

Energy-Aware Analog Aggregation for Federated Learning with Redundant Data

Title Energy-Aware Analog Aggregation for Federated Learning with Redundant Data
Authors Yuxuan Sun, Sheng Zhou, Deniz Gündüz
Abstract Federated learning (FL) enables workers to learn a model collaboratively by using their local data, with the help of a parameter server (PS) for global model aggregation. The high communication cost for periodic model updates and the non-independent and identically distributed (non-i.i.d.) data become major bottlenecks for FL. In this work, we consider analog aggregation to scale down the communication cost with respect to the number of workers, and introduce data redundancy to the system to deal with non-i.i.d. data. We propose an online energy-aware dynamic worker scheduling policy, which maximizes the average number of workers scheduled for gradient update at each iteration under a long-term energy constraint, and analyze its performance based on Lyapunov optimization. Experiments using the MNIST dataset show that, for non-i.i.d. data, doubling data storage can improve the accuracy by 9.8% under a stringent energy budget, while the proposed policy can achieve close-to-optimal accuracy without violating the energy constraint.
Tasks
Published 2019-11-01
URL https://arxiv.org/abs/1911.00188v1
PDF https://arxiv.org/pdf/1911.00188v1.pdf
PWC https://paperswithcode.com/paper/energy-aware-analog-aggregation-for-federated
Repo
Framework
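
A rough sketch of a Lyapunov-style (drift-plus-penalty) scheduler of the kind the abstract alludes to: a virtual queue tracks the cumulative energy deficit, and a worker is scheduled whenever the reward for scheduling it outweighs the queue-weighted energy cost. The per-worker decision rule, the parameter V, and the cost model are illustrative assumptions, not the paper's exact policy.

```python
# Illustrative drift-plus-penalty scheduler with a virtual energy-deficit queue.
import numpy as np

def schedule_round(energy_costs, queue, energy_budget, V=10.0):
    """energy_costs: per-worker energy to transmit this round.
    Schedule worker i iff the penalty-weighted reward V outweighs Q * E_i."""
    scheduled = V >= queue * energy_costs
    spent = energy_costs[scheduled].sum()
    # The virtual queue grows when spending exceeds the long-term per-round budget.
    queue = max(queue + spent - energy_budget, 0.0)
    return scheduled, queue

rng = np.random.default_rng(0)
queue = 0.0
for t in range(5):
    costs = rng.uniform(0.5, 2.0, size=8)          # 8 workers, random per-round cost
    scheduled, queue = schedule_round(costs, queue, energy_budget=6.0)
```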

LivDet in Action - Fingerprint Liveness Detection Competition 2019

Title LivDet in Action - Fingerprint Liveness Detection Competition 2019
Authors Giulia Orrù, Roberto Casula, Pierluigi Tuveri, Carlotta Bazzoni, Giovanna Dessalvi, Marco Micheletto, Luca Ghiani, Gian Luca Marcialis
Abstract The International Fingerprint Liveness Detection Competition (LivDet) is an open and well-acknowledged meeting point of academia and private companies that deal with the problem of distinguishing images of fingerprint reproductions made of artificial materials from images of real fingerprints. In this edition of LivDet we invited the competitors to propose algorithms integrated with matching systems. The goal was to investigate to what extent this integration impacts overall performance. Twelve algorithms were submitted to the competition, eight of which worked on integrated systems.
Tasks
Published 2019-05-02
URL http://arxiv.org/abs/1905.00639v1
PDF http://arxiv.org/pdf/1905.00639v1.pdf
PWC https://paperswithcode.com/paper/livdet-in-action-fingerprint-liveness
Repo
Framework

Power analysis of knockoff filters for correlated designs

Title Power analysis of knockoff filters for correlated designs
Authors Jingbo Liu, Philippe Rigollet
Abstract The knockoff filter introduced by Barber and Candès (2016) is an elegant framework for controlling the false discovery rate in variable selection. While empirical results indicate that this methodology is not too conservative, there is no conclusive theoretical result on its power. When the predictors are i.i.d. Gaussian, it is known that as the signal-to-noise ratio tends to infinity, the knockoff filter is consistent in the sense that one can make FDR go to 0 and power go to 1 simultaneously. In this work we study the case where the predictors have a general covariance matrix $\Sigma$. We introduce a simple functional called effective signal deficiency (ESD) of the covariance matrix $\Sigma$ that predicts consistency of various variable selection methods. In particular, ESD reveals that the structure of the precision matrix $\Sigma^{-1}$ plays a central role in consistency and therefore, so does the conditional independence structure of the predictors. To leverage this connection, we introduce Conditional Independence knockoff, a simple procedure that is able to compete with the more sophisticated knockoff filters and that is defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse). Our theoretical results are supported by numerical evidence on synthetic data.
Tasks
Published 2019-10-28
URL https://arxiv.org/abs/1910.12428v3
PDF https://arxiv.org/pdf/1910.12428v3.pdf
PWC https://paperswithcode.com/paper/power-analysis-of-knockoff-filters-for
Repo
Framework
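
As background for readers unfamiliar with knockoffs, the sketch below constructs standard model-X Gaussian knockoffs with the equicorrelated choice of $s$, assuming standardized predictors and a unit-diagonal $\Sigma$. It is not the paper's Conditional Independence knockoff, which additionally exploits the tree structure of the precision matrix.

```python
# Standard model-X Gaussian knockoff construction (equicorrelated s), for background.
import numpy as np

def equicorrelated_gaussian_knockoffs(X, Sigma, rng=None):
    """X: (n, p) with rows ~ N(0, Sigma); Sigma assumed to have unit diagonal."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    lam_min = np.linalg.eigvalsh(Sigma).min()
    s = np.full(p, min(2.0 * lam_min, 1.0)) * 0.999        # keep 2*Sigma - diag(s) PD
    Sigma_inv_S = np.linalg.solve(Sigma, np.diag(s))        # Sigma^{-1} diag(s)
    mean = X - X @ Sigma_inv_S                              # E[X_knockoff | X]
    cov = 2.0 * np.diag(s) - np.diag(s) @ Sigma_inv_S       # Cov[X_knockoff | X]
    L = np.linalg.cholesky(cov + 1e-10 * np.eye(p))
    return mean + rng.standard_normal((n, p)) @ L.T
```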

Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks

Title Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks
Authors Felipe A. Mejia, Paul Gamble, Zigfried Hampel-Arias, Michael Lomnitz, Nina Lopatina, Lucas Tindall, Maria Alejandra Barrios
Abstract Adversarial training was introduced as a way to improve the robustness of deep learning models to adversarial attacks. This training method improves robustness against adversarial attacks, but increases the model's vulnerability to privacy attacks. In this work we demonstrate how model inversion attacks, which extract training data directly from the model and were previously thought to be intractable, become feasible when attacking a robustly trained model. The input space for a traditionally trained model is dominated by adversarial examples - data points that strongly activate a certain class but lack semantic meaning - which makes it difficult to successfully conduct model inversion attacks. We demonstrate this effect using the CIFAR-10 dataset under three different model inversion attacks: a vanilla gradient descent method, a gradient-based method at different scales, and a generative adversarial network based attack.
Tasks
Published 2019-06-15
URL https://arxiv.org/abs/1906.06449v1
PDF https://arxiv.org/pdf/1906.06449v1.pdf
PWC https://paperswithcode.com/paper/robust-or-private-adversarial-training-makes
Repo
Framework
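
A minimal sketch of the "vanilla gradient descent" inversion attack the abstract lists: optimize an input image so that a trained classifier assigns it the target class with high confidence. Here `model` is a placeholder for any trained (e.g. adversarially trained) CIFAR-10 network; the step count and learning rate are illustrative.

```python
# Sketch of a vanilla gradient-based model inversion attack.
import torch

def invert_class(model, target_class, steps=500, lr=0.1):
    model.eval()
    x = torch.zeros(1, 3, 32, 32, requires_grad=True)      # start from a constant image
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = model(x)
        loss = -logits[0, target_class]                     # push the target logit up
        loss.backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)                              # stay in a valid pixel range
    return x.detach()                                       # reconstructed class prototype
```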

Deep Learning and MARS: A Connection

Title Deep Learning and MARS: A Connection
Authors Michael Kohler, Adam Krzyzak, Sophie Langer
Abstract We consider least squares regression estimates using deep neural networks. We show that these estimates satisfy an oracle inequality, which implies that (up to a logarithmic factor) the error of these estimates is at least as small as the best possible error bound one would expect for MARS if that procedure worked in the optimal way. As a result we show that our neural networks are able to achieve a dimensionality reduction when the regression function locally has low dimensionality. This assumption seems to be realistic in real-world applications, since selected high-dimensional data are often confined to locally-low-dimensional distributions. In our simulation study we provide numerical experiments to support our theoretical results and to compare our estimate with other conventional nonparametric regression estimates, especially with MARS. The use of our estimates is illustrated through a real data analysis.
Tasks Dimensionality Reduction
Published 2019-08-29
URL https://arxiv.org/abs/1908.11140v2
PDF https://arxiv.org/pdf/1908.11140v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-and-mars-a-connection
Repo
Framework

The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification

Title The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification
Authors Jimmy Lin
Abstract Motivated by recent commentary that has questioned today’s pursuit of ever-more complex models and mathematical formalisms in applied machine learning and whether meaningful empirical progress is actually being made, this paper tries to tackle the decades-old problem of pseudo-relevance feedback with “the simplest thing that can possibly work”. I present a technique based on training a document relevance classifier for each information need using pseudo-labels from an initial ranked list and then applying the classifier to rerank the retrieved documents. Experiments demonstrate significant improvements across a number of newswire collections, with initial rankings supplied by “bag of words” BM25 as well as from a well-tuned query expansion model. While this simple technique draws elements from several well-known threads in the literature, to my knowledge this exact combination has not previously been proposed and evaluated.
Tasks Text Classification
Published 2019-04-18
URL http://arxiv.org/abs/1904.08861v1
PDF http://arxiv.org/pdf/1904.08861v1.pdf
PWC https://paperswithcode.com/paper/the-simplest-thing-that-can-possibly-work
Repo
Framework
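
A sketch of the technique as described: treat the top of an initial ranking as pseudo-positive documents and lower-ranked documents as pseudo-negatives, train a per-query classifier, and rerank by interpolating its score with the retrieval score. The scikit-learn pipeline, cutoffs, and interpolation weight below are illustrative assumptions, not the paper's tuned setup.

```python
# Pseudo-relevance feedback as per-query text classification, then reranking.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def prf_rerank(ranked_docs, n_pos=10, n_neg=100, alpha=0.5):
    """ranked_docs: list of (doc_text, retrieval_score), best first.
    Assumes the list is longer than n_pos so both classes are present."""
    texts = [d for d, _ in ranked_docs]
    labels = [1] * n_pos + [0] * min(n_neg, len(texts) - n_pos)
    vec = TfidfVectorizer()
    X = vec.fit_transform(texts)
    clf = LogisticRegression(max_iter=1000).fit(X[: len(labels)], labels)
    clf_scores = clf.predict_proba(X)[:, 1]                 # relevance probability
    fused = [alpha * s + (1 - alpha) * c                    # naive score interpolation
             for (_, s), c in zip(ranked_docs, clf_scores)]
    order = sorted(range(len(texts)), key=lambda i: fused[i], reverse=True)
    return [ranked_docs[i] for i in order]
```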

FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving

Title FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving
Authors Hazem Rashed, Mohamed Ramzy, Victor Vaquero, Ahmad El Sallab, Ganesh Sistu, Senthil Yogamani
Abstract Moving object detection is a critical task for autonomous vehicles. As dynamic objects represent higher collision risk than static ones, our own ego-trajectories have to be planned attending to the future states of the moving elements of the scene. Motion can be perceived using temporal information such as optical flow. Conventional optical flow computation is based on camera sensors only, which makes it prone to failure in conditions with low illumination. On the other hand, LiDAR sensors are independent of illumination, as they measure the time-of-flight of their own emitted lasers. In this work, we propose a robust and real-time CNN architecture for Moving Object Detection (MOD) under low-light conditions by capturing motion information from both camera and LiDAR sensors. We demonstrate the impact of our algorithm on the KITTI dataset, where we simulate a low-light environment to create a novel dataset, “Dark KITTI”. We obtain a 10.1% relative improvement on Dark-KITTI, and a 4.25% improvement on standard KITTI relative to our baselines. The proposed algorithm runs at 18 fps on a standard desktop GPU using $256\times1224$ resolution images.
Tasks Autonomous Driving, Autonomous Vehicles, Object Detection, Optical Flow Estimation
Published 2019-10-11
URL https://arxiv.org/abs/1910.05395v3
PDF https://arxiv.org/pdf/1910.05395v3.pdf
PWC https://paperswithcode.com/paper/fusemodnet-real-time-camera-and-lidar-based
Repo
Framework
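
A minimal two-stream fusion sketch in the spirit of the abstract: encode camera-flow and LiDAR-motion inputs separately, concatenate the feature maps, and decode a per-pixel moving-object mask. Channel counts, input encodings, and depth are placeholders, not FuseMODNet's actual architecture.

```python
# Minimal two-stream fusion for moving object detection (illustrative only).
import torch
import torch.nn as nn

def encoder(in_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    )

class TwoStreamMOD(nn.Module):
    def __init__(self):
        super().__init__()
        self.cam_enc = encoder(2)     # camera optical flow (u, v)
        self.lidar_enc = encoder(2)   # LiDAR-derived motion image
        self.head = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1),                       # per-pixel motion logit
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )

    def forward(self, cam_flow, lidar_flow):
        fused = torch.cat([self.cam_enc(cam_flow), self.lidar_enc(lidar_flow)], dim=1)
        return self.head(fused)

mask_logits = TwoStreamMOD()(torch.randn(1, 2, 256, 1224), torch.randn(1, 2, 256, 1224))
```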

Stable Backward Diffusion Models that Minimise Convex Energies

Title Stable Backward Diffusion Models that Minimise Convex Energies
Authors Leif Bergerhoff, Marcelo Cárdenas, Joachim Weickert, Martin Welk
Abstract Backward diffusion processes appear naturally in image enhancement and deblurring applications. However, the inverse problem of backward diffusion is known to be ill-posed and straightforward numerical algorithms are unstable. So far, existing stabilisation strategies in the literature require sophisticated numerics to solve the underlying initial value problem. Therefore, it is desirable to establish a backward diffusion model which implements a smart stabilisation approach that can be used in combination with a simple numerical scheme. We derive a class of space-discrete one-dimensional backward diffusion processes as gradient descent of energies where we gain stability by imposing range constraints. Interestingly, these energies are even convex. Furthermore, we establish a comprehensive theory for the time-continuous evolution and we show that stability carries over to a simple explicit time discretisation of our model. Finally, we confirm the stability and usefulness of our technique in experiments in which we enhance the contrast of digital greyscale and colour images.
Tasks Deblurring, Image Enhancement
Published 2019-03-08
URL http://arxiv.org/abs/1903.03491v1
PDF http://arxiv.org/pdf/1903.03491v1.pdf
PWC https://paperswithcode.com/paper/stable-backward-diffusion-models-that
Repo
Framework
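
A naive one-dimensional illustration of the stabilisation idea, assuming a simple explicit scheme: take an inverse-diffusion (contrast-enhancing) step, which is unstable on its own, and project the result back onto the admissible grey-value range so the evolution stays bounded. This is only a sketch of range-constrained backward diffusion, not the paper's convex-energy formulation.

```python
# Naive 1-D backward diffusion step with projection onto the admissible range.
import numpy as np

def backward_diffusion_step(u, tau=0.1, lo=0.0, hi=255.0):
    lap = np.zeros_like(u)
    lap[1:-1] = u[:-2] - 2.0 * u[1:-1] + u[2:]     # discrete Laplacian (interior points)
    u_new = u - tau * lap                          # minus sign: backward (inverse) diffusion
    return np.clip(u_new, lo, hi)                  # range constraint keeps values bounded

signal = np.array([100.0, 105.0, 110.0, 140.0, 145.0, 150.0])
for _ in range(50):
    signal = backward_diffusion_step(signal)       # contrast across the edge increases
```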

CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots

Title CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots
Authors Arshit Gupta, Peng Zhang, Garima Lalwani, Mona Diab
Abstract Natural Language Understanding (NLU) is a core component of dialog systems. It typically involves two tasks - intent classification (IC) and slot labeling (SL), which are then followed by a dialogue management (DM) component. Such NLU systems cater to utterances in isolation, thus pushing the problem of context management to DM. However, contextual information is critical to the correct prediction of intents and slots in a conversation. Prior work on contextual NLU has been limited in terms of the types of contextual signals used and the understanding of their impact on the model. In this work, we propose a context-aware self-attentive NLU (CASA-NLU) model that uses multiple signals, such as previous intents, slots, dialog acts and utterances over a variable context window, in addition to the current user utterance. CASA-NLU outperforms a recurrent contextual NLU baseline on two conversational datasets, yielding a gain of up to 7% on the IC task for one of the datasets. Moreover, a non-contextual variant of CASA-NLU achieves state-of-the-art performance for IC task on standard public datasets - Snips and ATIS.
Tasks Dialogue Management, Intent Classification
Published 2019-09-18
URL https://arxiv.org/abs/1909.08705v1
PDF https://arxiv.org/pdf/1909.08705v1.pdf
PWC https://paperswithcode.com/paper/casa-nlu-context-aware-self-attentive-natural
Repo
Framework
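
A compact sketch of a context-aware self-attentive NLU model: embeddings of contextual signals (here just the previous intent and dialog act) are prepended to the current utterance tokens before a Transformer encoder, whose output is pooled for intent classification and read per token for slot labeling. The set of signals, dimensions, and pooling are assumptions, not CASA-NLU's exact design.

```python
# Illustrative context-aware self-attentive joint IC/SL model.
import torch
import torch.nn as nn

class ContextAwareNLU(nn.Module):
    def __init__(self, vocab, n_intents, n_slots, n_acts, d=128):
        super().__init__()
        self.tok = nn.Embedding(vocab, d)
        self.prev_intent = nn.Embedding(n_intents + 1, d)   # +1 for "no history"
        self.prev_act = nn.Embedding(n_acts + 1, d)
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.ic_head = nn.Linear(d, n_intents)
        self.sl_head = nn.Linear(d, n_slots)

    def forward(self, tokens, prev_intent, prev_act):
        ctx = torch.stack([self.prev_intent(prev_intent),
                           self.prev_act(prev_act)], dim=1)   # (B, 2, d) context signals
        seq = torch.cat([ctx, self.tok(tokens)], dim=1)       # context + current utterance
        enc = self.encoder(seq)
        intent_logits = self.ic_head(enc.mean(dim=1))         # pooled for IC
        slot_logits = self.sl_head(enc[:, 2:])                # per token for SL
        return intent_logits, slot_logits
```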

Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework

Title Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework
Authors Deepan Das, Noor Mohammed Ghouse, Shashank Verma, Yin Li
Abstract We introduce a novel deep neural network architecture that links visual regions to corresponding textual segments including phrases and words. To accomplish this task, our architecture makes use of the rich semantic information available in a joint embedding space of multi-modal data. From this joint embedding space, we extract the associative localization maps that develop naturally, without explicitly providing supervision during training for the localization task. The joint space is learned using a bidirectional ranking objective that is optimized using an $N$-Pair loss formulation. This training mechanism demonstrates the idea that localization information is learned inherently while optimizing a Bidirectional Retrieval objective. The model’s retrieval and localization performance is evaluated on MSCOCO and Flickr30K Entities datasets. This architecture outperforms state-of-the-art results in the semi-supervised phrase localization setting.
Tasks Image Retrieval
Published 2019-08-08
URL https://arxiv.org/abs/1908.02950v1
PDF https://arxiv.org/pdf/1908.02950v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-phrase-localization-in-a
Repo
Framework
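
A common way to write a bidirectional ranking objective with an N-pair-style loss, which may help make the training mechanism concrete: for a batch of matched (image, caption) embeddings, every other batch entry serves as a negative in both retrieval directions. This is a generic formulation, not necessarily the paper's exact loss.

```python
# Generic bidirectional N-pair / ranking loss over a joint embedding space.
import torch
import torch.nn.functional as F

def bidirectional_npair_loss(img_emb, cap_emb, temperature=0.1):
    """img_emb, cap_emb: (B, d) embeddings of matched pairs (row i matches row i)."""
    img_emb = F.normalize(img_emb, dim=-1)
    cap_emb = F.normalize(cap_emb, dim=-1)
    sim = img_emb @ cap_emb.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(sim.size(0))                # diagonal entries are positives
    loss_i2c = F.cross_entropy(sim, targets)           # image -> caption ranking
    loss_c2i = F.cross_entropy(sim.t(), targets)       # caption -> image ranking
    return 0.5 * (loss_i2c + loss_c2i)
```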

Data-Driven Design for Fourier Ptychographic Microscopy

Title Data-Driven Design for Fourier Ptychographic Microscopy
Authors Michael Kellman, Emrah Bostan, Michael Chen, Laura Waller
Abstract Fourier Ptychographic Microscopy (FPM) is a computational imaging method that is able to super-resolve features beyond the diffraction-limit set by the objective lens of a traditional microscope. This is accomplished by using synthetic aperture and phase retrieval algorithms to combine many measurements captured by an LED array microscope with programmable source patterns. FPM provides simultaneous large field-of-view and high resolution imaging, but at the cost of reduced temporal resolution, thereby limiting live cell applications. In this work, we learn LED source pattern designs that compress the many required measurements into only a few, with negligible loss in reconstruction quality or resolution. This is accomplished by recasting the super-resolution reconstruction as a Physics-based Neural Network and learning the experimental design to optimize the network’s overall performance. Specifically, we learn LED patterns for different applications (e.g. amplitude contrast and quantitative phase imaging) and show that the designs we learn through simulation generalize well in the experimental setting. Further, we discuss a context-specific loss function, practical memory limitations, and interpretability of our learned designs.
Tasks Super-Resolution
Published 2019-04-08
URL http://arxiv.org/abs/1904.04175v1
PDF http://arxiv.org/pdf/1904.04175v1.pdf
PWC https://paperswithcode.com/paper/data-driven-design-for-fourier-ptychographic
Repo
Framework