January 30, 2020


Paper Group ANR 290


Incremental Online Spoken Language Understanding


Title	Incremental Online Spoken Language Understanding
Authors	Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan
Abstract	Spoken Language Understanding (SLU) typically comprises an automatic speech recognition (ASR) module followed by a natural language understanding (NLU) module. The two modules process signals in a blocking sequential fashion, i.e., the NLU often has to wait for the ASR to finish processing on an utterance basis, potentially leading to high latencies that render the spoken interaction less natural. In this paper, we propose recurrent neural network (RNN) based incremental processing for the SLU task of intent detection. The proposed methodology offers lower latencies than a typical SLU system, without any significant reduction in system accuracy. We introduce and analyze different recurrent neural network architectures for incremental and online processing of the ASR transcripts and compare them to the existing offline systems. A lexical End-of-Sentence (EOS) detector is proposed for segmenting the stream of transcript into sentences for intent classification. Intent detection experiments are conducted on the benchmark ATIS dataset, modified to emulate a continuous incremental stream of words with no utterance demarcation. We also analyze the prospects of early intent detection, before EOS, with our proposed system.
Tasks	Intent Classification, Intent Detection, Speech Recognition, Spoken Language Understanding
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10287v1
PDF	https://arxiv.org/pdf/1910.10287v1.pdf
PWC	https://paperswithcode.com/paper/incremental-online-spoken-language
Repo
Framework

Iterative Delexicalization for Improved Spoken Language Understanding


Title	Iterative Delexicalization for Improved Spoken Language Understanding
Authors	Avik Ray, Yilin Shen, Hongxia Jin
Abstract	Recurrent neural network (RNN) based joint intent classification and slot tagging models have achieved tremendous success in recent years for building spoken language understanding and dialog systems. However, these models suffer from poor performance for slots which often encounter large semantic variability in slot values after deployment (e.g. message texts, partial movie/artist names). While greedy delexicalization of slots in the input utterance via substring matching can partly improve performance, it often produces incorrect input. Moreover, such techniques cannot delexicalize slots with out-of-vocabulary slot values not seen at training. In this paper, we propose a novel iterative delexicalization algorithm, which can accurately delexicalize the input, even with out-of-vocabulary slot values. Based on model confidence of the current delexicalized input, our algorithm improves delexicalization in every iteration to converge to the best input having the highest confidence. We show on benchmark and in-house datasets that our algorithm can greatly improve parsing performance for RNN based models, especially for out-of-distribution slot values.
Tasks	Intent Classification, Spoken Language Understanding
Published	2019-10-15
URL	https://arxiv.org/abs/1910.07060v1
PDF	https://arxiv.org/pdf/1910.07060v1.pdf
PWC	https://paperswithcode.com/paper/iterative-delexicalization-for-improved
Repo
Framework
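
To make the greedy baseline mentioned in the abstract concrete, here is a minimal sketch of delexicalization by substring matching, including one way it can corrupt the input. The slot dictionary, the `<slot>` placeholder format, and the function name `greedy_delexicalize` are illustrative assumptions and are not taken from the paper; the iterative algorithm the authors propose is not shown.

```python
def greedy_delexicalize(utterance, slot_values):
    """Replace every known slot value with a <slot> placeholder.

    Hypothetical baseline: longest values are matched first, using plain
    substring matching with no word-boundary or context check.
    """
    pairs = [(value, slot) for slot, values in slot_values.items() for value in values]
    pairs.sort(key=lambda p: len(p[0]), reverse=True)
    result = utterance
    for value, slot in pairs:
        result = result.replace(value, f"<{slot}>")
    return result


# Hypothetical slot dictionary collected from training data.
SLOTS = {"artist": ["queen"], "contact": ["mom"]}

print(greedy_delexicalize("play queen", SLOTS))     # the intended case
print(greedy_delexicalize("wait a moment", SLOTS))  # "mom" also matches inside "moment"
```

The second call shows the failure mode: a value that happens to occur inside an unrelated word is replaced anyway, and a value never seen in training would not be replaced at all.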

A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification


Title	A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification
Authors	Varun Kumar, Hadrien Glaude, Cyprien de Lichy, William Campbell
Abstract	New conversation topics and functionalities are constantly being added to conversational AI agents like Amazon Alexa and Apple Siri. As data collection and annotation is not scalable and is often costly, only a handful of examples for the new functionalities are available, which results in poor generalization performance. We formulate it as a Few-Shot Integration (FSI) problem where a few examples are used to introduce a new intent. In this paper, we study six feature space data augmentation methods to improve classification performance in the FSI setting in combination with both supervised and unsupervised representation learning methods such as BERT. Through realistic experiments on two public conversational datasets, SNIPS and the Facebook Dialog corpus, we show that data augmentation in feature space provides an effective way to improve intent classification performance in the few-shot setting beyond traditional transfer learning approaches. In particular, we show that (a) upsampling in latent space is a competitive baseline for feature space augmentation and (b) adding the difference between two examples to a new example is a simple yet effective data augmentation method.
Tasks	Data Augmentation, Intent Classification, Representation Learning, Transfer Learning, Unsupervised Representation Learning
Published	2019-10-09
URL	https://arxiv.org/abs/1910.04176v1
PDF	https://arxiv.org/pdf/1910.04176v1.pdf
PWC	https://paperswithcode.com/paper/a-closer-look-at-feature-space-data
Repo
Framework
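
The two findings at the end of the abstract can be sketched in a few lines. The code below is a rough illustration, assuming feature vectors are plain lists of floats and that all three sampled examples belong to the same new intent; the function names and the toy vectors are hypothetical, and the paper's exact sampling scheme may differ.

```python
import random


def upsample(features, n_new, rng):
    """Baseline: draw existing feature vectors with replacement."""
    return [list(rng.choice(features)) for _ in range(n_new)]


def delta_augment(features, n_new, rng):
    """Synthesize x_k + (x_i - x_j) from three distinct examples."""
    synthetic = []
    for _ in range(n_new):
        x_i, x_j, x_k = rng.sample(features, 3)
        synthetic.append([k + (i - j) for i, j, k in zip(x_i, x_j, x_k)])
    return synthetic


rng = random.Random(0)
# Toy 2-d "sentence embeddings" for a new intent (made-up values).
few_shot = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
augmented = few_shot + delta_augment(few_shot, n_new=6, rng=rng)
print(len(augmented))  # 4 originals plus 6 synthetic vectors
```

A classifier would then be trained on `augmented` instead of `few_shot`; `delta_augment` needs at least three examples per intent.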

A Taxonomy of Channel Pruning Signals in CNNs


Title	A Taxonomy of Channel Pruning Signals in CNNs
Authors	Kaveena Persand, Andrew Anderson, David Gregg
Abstract	Convolutional neural networks (CNNs) are widely used for classification problems. However, they often require large amounts of computation and memory which are not readily available in resource constrained systems. Pruning unimportant parameters from CNNs to reduce these requirements has been a subject of intensive research in recent years. However, novel approaches in pruning signals are sometimes difficult to compare against each other. We propose a taxonomy that classifies pruning signals based on four mostly-orthogonal components of the signal. We also empirically evaluate 396 pruning signals including existing ones, and new signals constructed from the components of existing signals. We find that some of our newly constructed signals outperform the best existing pruning signals.
Tasks
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04675v1
PDF	https://arxiv.org/pdf/1906.04675v1.pdf
PWC	https://paperswithcode.com/paper/a-taxonomy-of-channel-pruning-signals-in-cnns
Repo
Framework

Energy-Aware Analog Aggregation for Federated Learning with Redundant Data


Title	Energy-Aware Analog Aggregation for Federated Learning with Redundant Data
Authors	Yuxuan Sun, Sheng Zhou, Deniz Gündüz
Abstract	Federated learning (FL) enables workers to learn a model collaboratively by using their local data, with the help of a parameter server (PS) for global model aggregation. The high communication cost for periodic model updates and the non-independent and identically distributed (i.i.d.) data become major bottlenecks for FL. In this work, we consider analog aggregation to scale down the communication cost with respect to the number of workers, and introduce data redundancy to the system to deal with non-i.i.d. data. We propose an online energy-aware dynamic worker scheduling policy, which maximizes the average number of workers scheduled for gradient update at each iteration under a long-term energy constraint, and analyze its performance based on Lyapunov optimization. Experiments using MNIST dataset show that, for non-i.i.d. data, doubling data storage can improve the accuracy by 9.8% under a stringent energy budget, while the proposed policy can achieve close-to-optimal accuracy without violating the energy constraint.
Tasks
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00188v1
PDF	https://arxiv.org/pdf/1911.00188v1.pdf
PWC	https://paperswithcode.com/paper/energy-aware-analog-aggregation-for-federated
Repo
Framework

LivDet in Action - Fingerprint Liveness Detection Competition 2019


Title	LivDet in Action - Fingerprint Liveness Detection Competition 2019
Authors	Giulia Orrù, Roberto Casula, Pierluigi Tuveri, Carlotta Bazzoni, Giovanna Dessalvi, Marco Micheletto, Luca Ghiani, Gian Luca Marcialis
Abstract	The International Fingerprint Liveness Detection Competition (LivDet) is an open and well-acknowledged meeting point for academia and private companies that deal with the problem of distinguishing images of fingerprint reproductions made of artificial materials from images of real fingerprints. In this edition of LivDet we invited the competitors to propose algorithms integrated with matching systems. The goal was to investigate to what extent this integration impacts overall performance. Twelve algorithms were submitted to the competition, eight of which worked on integrated systems.
Tasks
Published	2019-05-02
URL	http://arxiv.org/abs/1905.00639v1
PDF	http://arxiv.org/pdf/1905.00639v1.pdf
PWC	https://paperswithcode.com/paper/livdet-in-action-fingerprint-liveness
Repo
Framework

Power analysis of knockoff filters for correlated designs


Title	Power analysis of knockoff filters for correlated designs
Authors	Jingbo Liu, Philippe Rigollet
Abstract	The knockoff filter introduced by Barber and Candès 2016 is an elegant framework for controlling the false discovery rate in variable selection. While empirical results indicate that this methodology is not too conservative, there is no conclusive theoretical result on its power. When the predictors are i.i.d. Gaussian, it is known that as the signal to noise ratio tends to infinity, the knockoff filter is consistent in the sense that one can make FDR go to 0 and power go to 1 simultaneously. In this work we study the case where the predictors have a general covariance matrix $\Sigma$. We introduce a simple functional called effective signal deficiency (ESD) of the covariance matrix $\Sigma$ that predicts consistency of various variable selection methods. In particular, ESD reveals that the structure of the precision matrix $\Sigma^{-1}$ plays a central role in consistency and therefore, so does the conditional independence structure of the predictors. To leverage this connection, we introduce Conditional Independence knockoff, a simple procedure that is able to compete with the more sophisticated knockoff filters and that is defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse). Our theoretical results are supported by numerical evidence on synthetic data.
Tasks
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12428v3
PDF	https://arxiv.org/pdf/1910.12428v3.pdf
PWC	https://paperswithcode.com/paper/power-analysis-of-knockoff-filters-for
Repo
Framework
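
For readers unfamiliar with how a knockoff filter turns statistics into a selection, the sketch below shows the data-dependent threshold of the knockoff+ rule: pick the smallest $t$ such that $(1 + \#\{j : W_j \le -t\}) / \max(1, \#\{j : W_j \ge t\}) \le q$, then select all $j$ with $W_j \ge t$. It assumes the statistics $W_j$ have already been computed (large positive values favouring the real variable over its knockoff); the function names and the example values are made up, and the construction of the knockoffs themselves, which is what the paper studies, is not shown.

```python
def knockoff_plus_threshold(W, q):
    """Smallest t among the |W_j| whose estimated FDP is at most q."""
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)
        positives = sum(1 for w in W if w >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")  # no threshold qualifies: select nothing


def knockoff_select(W, q):
    """Indices of variables whose statistic clears the threshold."""
    t = knockoff_plus_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]


# Made-up statistics for ten variables, target FDR q = 0.2.
W = [5.0, 4.0, 3.0, 2.5, -1.0, 2.0, 1.5, 0.5, -0.2, 3.5]
print(knockoff_plus_threshold(W, 0.2))
print(knockoff_select(W, 0.2))
```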

Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks


Title	Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks
Authors	Felipe A. Mejia, Paul Gamble, Zigfried Hampel-Arias, Michael Lomnitz, Nina Lopatina, Lucas Tindall, Maria Alejandra Barrios
Abstract	Adversarial training was introduced as a way to improve the robustness of deep learning models to adversarial attacks. This training method improves robustness against adversarial attacks, but increases the model's vulnerability to privacy attacks. In this work we demonstrate how model inversion attacks, which extract training data directly from the model and were previously thought to be intractable, become feasible when attacking a robustly trained model. The input space for a traditionally trained model is dominated by adversarial examples (data points that strongly activate a certain class but lack semantic meaning), which makes it difficult to successfully conduct model inversion attacks. We demonstrate this effect using the CIFAR-10 dataset under three different model inversion attacks: a vanilla gradient descent method, a gradient based method at different scales, and a generative adversarial network based attack.
Tasks
Published	2019-06-15
URL	https://arxiv.org/abs/1906.06449v1
PDF	https://arxiv.org/pdf/1906.06449v1.pdf
PWC	https://paperswithcode.com/paper/robust-or-private-adversarial-training-makes
Repo
Framework

Deep Learning and MARS: A Connection


Title	Deep Learning and MARS: A Connection
Authors	Michael Kohler, Adam Krzyzak, Sophie Langer
Abstract	We consider least squares regression estimates using deep neural networks. We show that these estimates satisfy an oracle inequality, which implies that (up to a logarithmic factor) the error of these estimates is at least as small as the optimal possible error bound which one would expect for MARS in case that this procedure would work in the optimal way. As a result we show that our neural networks are able to achieve a dimensionality reduction in case that the regression function locally has low dimensionality. This assumption seems to be realistic in real-world applications, since selected high-dimensional data are often confined to locally-low-dimensional distributions. In our simulation study we provide numerical experiments to support our theoretical results and to compare our estimate with other conventional nonparametric regression estimates, especially with MARS. The use of our estimates is illustrated through a real data analysis.
Tasks	Dimensionality Reduction
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11140v2
PDF	https://arxiv.org/pdf/1908.11140v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-and-mars-a-connection
Repo
Framework

The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification


Title	The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification
Authors	Jimmy Lin
Abstract	Motivated by recent commentary that has questioned today’s pursuit of ever-more complex models and mathematical formalisms in applied machine learning and whether meaningful empirical progress is actually being made, this paper tries to tackle the decades-old problem of pseudo-relevance feedback with “the simplest thing that can possibly work”. I present a technique based on training a document relevance classifier for each information need using pseudo-labels from an initial ranked list and then applying the classifier to rerank the retrieved documents. Experiments demonstrate significant improvements across a number of newswire collections, with initial rankings supplied by “bag of words” BM25 as well as from a well-tuned query expansion model. While this simple technique draws elements from several well-known threads in the literature, to my knowledge this exact combination has not previously been proposed and evaluated.
Tasks	Text Classification
Published	2019-04-18
URL	http://arxiv.org/abs/1904.08861v1
PDF	http://arxiv.org/pdf/1904.08861v1.pdf
PWC	https://paperswithcode.com/paper/the-simplest-thing-that-can-possibly-work
Repo
Framework
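
The technique in the abstract (train a per-query relevance classifier on pseudo-labels from an initial ranking, then rerank with it) can be sketched as follows. This is a toy illustration under stated assumptions: the classifier is a small bag-of-words logistic regression written from scratch, the top `k_pos` documents serve as positives and the bottom `k_neg` as negatives, and reranking uses the classifier score alone. None of these choices, names, or example documents come from the paper, and how the classifier score is combined with the original retrieval score is left out.

```python
import math


def tokens(text):
    return text.lower().split()


def train_logreg(examples, epochs=50, lr=0.5):
    """Fit bag-of-words logistic regression with plain SGD.

    examples: list of (text, label) pairs, label 1 = pseudo-relevant, 0 = not.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            toks = tokens(text)
            z = bias + sum(weights.get(t, 0.0) for t in toks)
            z = max(-30.0, min(30.0, z))  # keep exp() well-behaved
            grad = 1.0 / (1.0 + math.exp(-z)) - label
            bias -= lr * grad
            for t in toks:
                weights[t] = weights.get(t, 0.0) - lr * grad
    return weights, bias


def rerank(ranked_docs, k_pos=2, k_neg=2):
    """Pseudo-label the top and bottom of the list, train, and re-sort."""
    examples = [(d, 1) for d in ranked_docs[:k_pos]]
    examples += [(d, 0) for d in ranked_docs[-k_neg:]]
    weights, bias = train_logreg(examples)

    def score(doc):
        return bias + sum(weights.get(t, 0.0) for t in tokens(doc))

    return sorted(ranked_docs, key=score, reverse=True)


# Hypothetical initial ranking, e.g. from BM25, best first.
initial = [
    "neural ranking model",
    "neural retrieval model",
    "pasta cooking tips",
    "neural ranking retrieval",
    "pasta sauce recipe",
    "cooking pasta recipe",
]
print(rerank(initial))
```

In this toy ranking the off-topic document in third place shares vocabulary with the pseudo-negatives and drops below the on-topic document that was originally ranked fourth.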

FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving


Title	FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving
Authors	Hazem Rashed, Mohamed Ramzy, Victor Vaquero, Ahmad El Sallab, Ganesh Sistu, Senthil Yogamani
Abstract	Moving object detection is a critical task for autonomous vehicles. As dynamic objects represent higher collision risk than static ones, our own ego-trajectories have to be planned attending to the future states of the moving elements of the scene. Motion can be perceived using temporal information such as optical flow. Conventional optical flow computation is based on camera sensors only, which makes it prone to failure in conditions with low illumination. On the other hand, LiDAR sensors are independent of illumination, as they measure the time-of-flight of their own emitted lasers. In this work, we propose a robust and real-time CNN architecture for Moving Object Detection (MOD) under low-light conditions by capturing motion information from both camera and LiDAR sensors. We demonstrate the impact of our algorithm on the KITTI dataset, where we simulate a low-light environment, creating a novel dataset, “Dark-KITTI”. We obtain a 10.1% relative improvement on Dark-KITTI, and a 4.25% improvement on standard KITTI relative to our baselines. The proposed algorithm runs at 18 fps on a standard desktop GPU using $256\times1224$ resolution images.
Tasks	Autonomous Driving, Autonomous Vehicles, Object Detection, Optical Flow Estimation
Published	2019-10-11
URL	https://arxiv.org/abs/1910.05395v3
PDF	https://arxiv.org/pdf/1910.05395v3.pdf
PWC	https://paperswithcode.com/paper/fusemodnet-real-time-camera-and-lidar-based
Repo
Framework

Stable Backward Diffusion Models that Minimise Convex Energies


Title	Stable Backward Diffusion Models that Minimise Convex Energies
Authors	Leif Bergerhoff, Marcelo Cárdenas, Joachim Weickert, Martin Welk
Abstract	Backward diffusion processes appear naturally in image enhancement and deblurring applications. However, the inverse problem of backward diffusion is known to be ill-posed and straightforward numerical algorithms are unstable. So far, existing stabilisation strategies in the literature require sophisticated numerics to solve the underlying initial value problem. Therefore, it is desirable to establish a backward diffusion model which implements a smart stabilisation approach that can be used in combination with a simple numerical scheme. We derive a class of space-discrete one-dimensional backward diffusion as gradient descent of energies where we gain stability by imposing range constraints. Interestingly, these energies are even convex. Furthermore, we establish a comprehensive theory for the time-continuous evolution and we show that stability carries over to a simple explicit time discretisation of our model. Finally, we confirm the stability and usefulness of our technique in experiments in which we enhance the contrast of digital greyscale and colour images.
Tasks	Deblurring, Image Enhancement
Published	2019-03-08
URL	http://arxiv.org/abs/1903.03491v1
PDF	http://arxiv.org/pdf/1903.03491v1.pdf
PWC	https://paperswithcode.com/paper/stable-backward-diffusion-models-that
Repo
Framework

CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots


Title	CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots
Authors	Arshit Gupta, Peng Zhang, Garima Lalwani, Mona Diab
Abstract	Natural Language Understanding (NLU) is a core component of dialog systems. It typically involves two tasks, intent classification (IC) and slot labeling (SL), which are then followed by a dialogue management (DM) component. Such NLU systems cater to utterances in isolation, thus pushing the problem of context management to DM. However, contextual information is critical to the correct prediction of intents and slots in a conversation. Prior work on contextual NLU has been limited in terms of the types of contextual signals used and the understanding of their impact on the model. In this work, we propose a context-aware self-attentive NLU (CASA-NLU) model that uses multiple signals, such as previous intents, slots, dialog acts and utterances over a variable context window, in addition to the current user utterance. CASA-NLU outperforms a recurrent contextual NLU baseline on two conversational datasets, yielding a gain of up to 7% on the IC task for one of the datasets. Moreover, a non-contextual variant of CASA-NLU achieves state-of-the-art performance for the IC task on two standard public datasets, Snips and ATIS.
Tasks	Dialogue Management, Intent Classification
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08705v1
PDF	https://arxiv.org/pdf/1909.08705v1.pdf
PWC	https://paperswithcode.com/paper/casa-nlu-context-aware-self-attentive-natural
Repo
Framework

Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework


Title	Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework
Authors	Deepan Das, Noor Mohammed Ghouse, Shashank Verma, Yin Li
Abstract	We introduce a novel deep neural network architecture that links visual regions to corresponding textual segments including phrases and words. To accomplish this task, our architecture makes use of the rich semantic information available in a joint embedding space of multi-modal data. From this joint embedding space, we extract the associative localization maps that develop naturally, without explicitly providing supervision during training for the localization task. The joint space is learned using a bidirectional ranking objective that is optimized using a $N$-Pair loss formulation. This training mechanism demonstrates the idea that localization information is learned inherently while optimizing a Bidirectional Retrieval objective. The model’s retrieval and localization performance is evaluated on MSCOCO and Flickr30K Entities datasets. This architecture outperforms the state of the art results in the semi-supervised phrase localization setting.
Tasks	Image Retrieval
Published	2019-08-08
URL	https://arxiv.org/abs/1908.02950v1
PDF	https://arxiv.org/pdf/1908.02950v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-phrase-localization-in-a
Repo
Framework

Data-Driven Design for Fourier Ptychographic Microscopy


Title	Data-Driven Design for Fourier Ptychographic Microscopy
Authors	Michael Kellman, Emrah Bostan, Michael Chen, Laura Waller
Abstract	Fourier Ptychographic Microscopy (FPM) is a computational imaging method that is able to super-resolve features beyond the diffraction-limit set by the objective lens of a traditional microscope. This is accomplished by using synthetic aperture and phase retrieval algorithms to combine many measurements captured by an LED array microscope with programmable source patterns. FPM provides simultaneous large field-of-view and high resolution imaging, but at the cost of reduced temporal resolution, thereby limiting live cell applications. In this work, we learn LED source pattern designs that compress the many required measurements into only a few, with negligible loss in reconstruction quality or resolution. This is accomplished by recasting the super-resolution reconstruction as a Physics-based Neural Network and learning the experimental design to optimize the network’s overall performance. Specifically, we learn LED patterns for different applications (e.g. amplitude contrast and quantitative phase imaging) and show that the designs we learn through simulation generalize well in the experimental setting. Further, we discuss a context-specific loss function, practical memory limitations, and interpretability of our learned designs.
Tasks	Super-Resolution
Published	2019-04-08
URL	http://arxiv.org/abs/1904.04175v1
PDF	http://arxiv.org/pdf/1904.04175v1.pdf
PWC	https://paperswithcode.com/paper/data-driven-design-for-fourier-ptychographic
Repo
Framework