Paper Group ANR 290
Incremental Online Spoken Language Understanding. Iterative Delexicalization for Improved Spoken Language Understanding. A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification. A Taxonomy of Channel Pruning Signals in CNNs. Energy-Aware Analog Aggregation for Federated Learning with Redundant Data. LivDet in Action - Fi …
Incremental Online Spoken Language Understanding
Title | Incremental Online Spoken Language Understanding |
Authors | Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan |
Abstract | Spoken Language Understanding (SLU) typically comprises an automatic speech recognition (ASR) module followed by a natural language understanding (NLU) module. The two modules process signals in a blocking sequential fashion, i.e., the NLU often has to wait for the ASR to finish processing on an utterance basis, potentially leading to high latencies that render the spoken interaction less natural. In this paper, we propose recurrent neural network (RNN) based incremental processing towards the SLU task of intent detection. The proposed methodology offers lower latencies than a typical SLU system, without any significant reduction in system accuracy. We introduce and analyze different recurrent neural network architectures for incremental and online processing of the ASR transcripts and compare them to the existing offline systems. A lexical End-of-Sentence (EOS) detector is proposed for segmenting the stream of transcript into sentences for intent classification. Intent detection experiments are conducted on the benchmark ATIS dataset modified to emulate a continuous incremental stream of words with no utterance demarcation. We also analyze the prospects of early intent detection, before EOS, with our proposed system. |
Tasks | Intent Classification, Intent Detection, Speech Recognition, Spoken Language Understanding |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10287v1 |
https://arxiv.org/pdf/1910.10287v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-online-spoken-language |
Repo | |
Framework | |
Iterative Delexicalization for Improved Spoken Language Understanding
Title | Iterative Delexicalization for Improved Spoken Language Understanding |
Authors | Avik Ray, Yilin Shen, Hongxia Jin |
Abstract | Recurrent neural network (RNN) based joint intent classification and slot tagging models have achieved tremendous success in recent years for building spoken language understanding and dialog systems. However, these models suffer from poor performance for slots which often encounter large semantic variability in slot values after deployment (e.g. message texts, partial movie/artist names). While greedy delexicalization of slots in the input utterance via substring matching can partly improve performance, it often produces incorrect input. Moreover, such techniques cannot delexicalize slots with out-of-vocabulary slot values not seen at training. In this paper, we propose a novel iterative delexicalization algorithm, which can accurately delexicalize the input, even with out-of-vocabulary slot values. Based on model confidence of the current delexicalized input, our algorithm improves delexicalization in every iteration to converge to the best input having the highest confidence. We show on benchmark and in-house datasets that our algorithm can greatly improve parsing performance for RNN based models, especially for out-of-distribution slot values. |
Tasks | Intent Classification, Spoken Language Understanding |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.07060v1 |
https://arxiv.org/pdf/1910.07060v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-delexicalization-for-improved |
Repo | |
Framework | |
A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification
Title | A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification |
Authors | Varun Kumar, Hadrien Glaude, Cyprien de Lichy, William Campbell |
Abstract | New conversation topics and functionalities are constantly being added to conversational AI agents like Amazon Alexa and Apple Siri. As data collection and annotation is not scalable and is often costly, only a handful of examples for the new functionalities are available, which results in poor generalization performance. We formulate it as a Few-Shot Integration (FSI) problem where a few examples are used to introduce a new intent. In this paper, we study six feature space data augmentation methods to improve classification performance in the FSI setting in combination with both supervised and unsupervised representation learning methods such as BERT. Through realistic experiments on two public conversational datasets, SNIPS and the Facebook Dialog corpus, we show that data augmentation in feature space provides an effective way to improve intent classification performance in the few-shot setting beyond traditional transfer learning approaches. In particular, we show that (a) upsampling in latent space is a competitive baseline for feature space augmentation and (b) adding the difference between two examples to a new example is a simple yet effective data augmentation method. |
Tasks | Data Augmentation, Intent Classification, Representation Learning, Transfer Learning, Unsupervised Representation Learning |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04176v1 |
https://arxiv.org/pdf/1910.04176v1.pdf | |
PWC | https://paperswithcode.com/paper/a-closer-look-at-feature-space-data |
Repo | |
Framework | |
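Finding (b) in the abstract above, adding the difference between two examples to a new example, can be illustrated with a short sketch. This is not the authors' code: the function name `delta_augment`, the use of plain Python lists as feature vectors, and the choice to draw each pair from a single existing intent are all assumptions made for illustration.

```python
import random


def delta_augment(new_examples, base_examples, n_samples, seed=0):
    """Generate synthetic feature vectors for a few-shot intent.

    Each synthetic vector is x + (a - b), where x is a feature vector of the
    new intent and (a, b) is a pair of feature vectors from an existing intent.
    Assumption: pairs are drawn uniformly from one base intent; the paper may
    sample pairs differently.
    """
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_samples):
        x = rng.choice(new_examples)
        a, b = rng.sample(base_examples, 2)  # two distinct base examples
        augmented.append([xi + (ai - bi) for xi, ai, bi in zip(x, a, b)])
    return augmented


# Hypothetical 2-dimensional features (e.g. from a sentence encoder).
new_intent = [[1.0, 0.0], [0.9, 0.1]]
base_intent = [[0.0, 1.0], [0.2, 0.8], [0.1, 1.1]]
synthetic = delta_augment(new_intent, base_intent, n_samples=4)
```

The synthetic vectors would then be added to the few real examples of the new intent before training the classifier.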
A Taxonomy of Channel Pruning Signals in CNNs
Title | A Taxonomy of Channel Pruning Signals in CNNs |
Authors | Kaveena Persand, Andrew Anderson, David Gregg |
Abstract | Convolutional neural networks (CNNs) are widely used for classification problems. However, they often require large amounts of computation and memory which are not readily available in resource constrained systems. Pruning unimportant parameters from CNNs to reduce these requirements has been a subject of intensive research in recent years. However, novel approaches in pruning signals are sometimes difficult to compare against each other. We propose a taxonomy that classifies pruning signals based on four mostly-orthogonal components of the signal. We also empirically evaluate 396 pruning signals including existing ones, and new signals constructed from the components of existing signals. We find that some of our newly constructed signals outperform the best existing pruning signals. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04675v1 |
https://arxiv.org/pdf/1906.04675v1.pdf | |
PWC | https://paperswithcode.com/paper/a-taxonomy-of-channel-pruning-signals-in-cnns |
Repo | |
Framework | |
Energy-Aware Analog Aggregation for Federated Learning with Redundant Data
Title | Energy-Aware Analog Aggregation for Federated Learning with Redundant Data |
Authors | Yuxuan Sun, Sheng Zhou, Deniz Gündüz |
Abstract | Federated learning (FL) enables workers to learn a model collaboratively by using their local data, with the help of a parameter server (PS) for global model aggregation. The high communication cost for periodic model updates and the non-independent and identically distributed (i.i.d.) data become major bottlenecks for FL. In this work, we consider analog aggregation to scale down the communication cost with respect to the number of workers, and introduce data redundancy to the system to deal with non-i.i.d. data. We propose an online energy-aware dynamic worker scheduling policy, which maximizes the average number of workers scheduled for gradient update at each iteration under a long-term energy constraint, and analyze its performance based on Lyapunov optimization. Experiments using the MNIST dataset show that, for non-i.i.d. data, doubling data storage can improve the accuracy by 9.8% under a stringent energy budget, while the proposed policy can achieve close-to-optimal accuracy without violating the energy constraint. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00188v1 |
https://arxiv.org/pdf/1911.00188v1.pdf | |
PWC | https://paperswithcode.com/paper/energy-aware-analog-aggregation-for-federated |
Repo | |
Framework | |
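The abstract above mentions scheduling workers under a long-term energy constraint using Lyapunov optimization. The sketch below shows the generic drift-plus-penalty pattern with per-worker virtual queues. It is an illustration of that general technique only, not the paper's policy: the function names, the per-worker decision rule, and the parameter `v` are assumptions.

```python
def schedule_round(queues, energy_costs, v):
    """Pick workers for one round with a generic drift-plus-penalty rule.

    queues: per-worker virtual energy queues (accumulated overspend).
    energy_costs: energy each worker would spend if scheduled this round.
    v: trade-off weight; larger values favour scheduling over saving energy.
    A worker is scheduled when the reward v exceeds its queue-weighted cost.
    """
    return [v > q * e for q, e in zip(queues, energy_costs)]


def update_queues(queues, energy_costs, scheduled, budget):
    """Virtual queue update: Q <- max(Q + energy spent - budget, 0)."""
    return [
        max(q + (e if s else 0.0) - budget, 0.0)
        for q, e, s in zip(queues, energy_costs, scheduled)
    ]


# Hypothetical numbers: two workers, the second has already overspent.
queues = [0.0, 4.0]
costs = [2.0, 2.0]
picked = schedule_round(queues, costs, v=5.0)
queues = update_queues(queues, costs, picked, budget=1.0)
```

A worker that has overspent accumulates a large queue value and is skipped until the queue drains, which is how the long-term constraint is enforced on average.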
LivDet in Action - Fingerprint Liveness Detection Competition 2019
Title | LivDet in Action - Fingerprint Liveness Detection Competition 2019 |
Authors | Giulia Orrù, Roberto Casula, Pierluigi Tuveri, Carlotta Bazzoni, Giovanna Dessalvi, Marco Micheletto, Luca Ghiani, Gian Luca Marcialis |
Abstract | The International Fingerprint Liveness Detection Competition (LivDet) is an open and well-acknowledged meeting point for academic institutions and private companies that deal with the problem of distinguishing images of fingerprint reproductions made of artificial materials from images of real fingerprints. In this edition of LivDet we invited the competitors to propose algorithms integrated with matching systems. The goal was to investigate to what extent this integration affects overall performance. Twelve algorithms were submitted to the competition, eight of which worked on integrated systems. |
Tasks | |
Published | 2019-05-02 |
URL | http://arxiv.org/abs/1905.00639v1 |
http://arxiv.org/pdf/1905.00639v1.pdf | |
PWC | https://paperswithcode.com/paper/livdet-in-action-fingerprint-liveness |
Repo | |
Framework | |
Power analysis of knockoff filters for correlated designs
Title | Power analysis of knockoff filters for correlated designs |
Authors | Jingbo Liu, Philippe Rigollet |
Abstract | The knockoff filter introduced by Barber and Candès 2016 is an elegant framework for controlling the false discovery rate in variable selection. While empirical results indicate that this methodology is not too conservative, there is no conclusive theoretical result on its power. When the predictors are i.i.d. Gaussian, it is known that as the signal-to-noise ratio tends to infinity, the knockoff filter is consistent in the sense that one can make FDR go to 0 and power go to 1 simultaneously. In this work we study the case where the predictors have a general covariance matrix $\Sigma$. We introduce a simple functional called effective signal deficiency (ESD) of the covariance matrix $\Sigma$ that predicts consistency of various variable selection methods. In particular, ESD reveals that the structure of the precision matrix $\Sigma^{-1}$ plays a central role in consistency and therefore, so does the conditional independence structure of the predictors. To leverage this connection, we introduce Conditional Independence knockoff, a simple procedure that is able to compete with the more sophisticated knockoff filters and that is defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse). Our theoretical results are supported by numerical evidence on synthetic data. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12428v3 |
https://arxiv.org/pdf/1910.12428v3.pdf | |
PWC | https://paperswithcode.com/paper/power-analysis-of-knockoff-filters-for |
Repo | |
Framework | |
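For readers unfamiliar with how the knockoff filter controls the false discovery rate, the selection step of the original filter can be sketched as below. The sketch assumes the feature statistics W_j have already been computed from the real and knockoff variables; building the knockoffs themselves, and the Conditional Independence knockoff proposed in this paper, are not shown. The function names are hypothetical.

```python
def knockoff_threshold(w, q, offset=1):
    """Data-dependent threshold for the knockoff selection step.

    w: feature statistics W_j (large positive values suggest a true signal,
       negative values suggest the knockoff looked more important).
    q: target false discovery rate.
    offset: 1 gives the knockoff+ variant, 0 the plain knockoff variant.
    Returns float('inf') when no threshold meets the target.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # Estimated false discovery proportion at threshold t.
        if (offset + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w, q, offset=1):
    """Indices of the variables selected at target FDR q."""
    t = knockoff_threshold(w, q, offset)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical statistics for ten variables.
w_stats = [5.0, 4.0, 3.0, 2.5, 2.0, -1.0, 1.5, 0.5, -0.3, 3.5]
selected = knockoff_select(w_stats, q=0.2)
```

The count of large negative statistics acts as an estimate of how many large positive ones are false discoveries, which is why the sign symmetry of null statistics matters.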
Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks
Title | Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks |
Authors | Felipe A. Mejia, Paul Gamble, Zigfried Hampel-Arias, Michael Lomnitz, Nina Lopatina, Lucas Tindall, Maria Alejandra Barrios |
Abstract | Adversarial training was introduced as a way to improve the robustness of deep learning models to adversarial attacks. This training method improves robustness against adversarial attacks, but increases the model's vulnerability to privacy attacks. In this work we demonstrate how model inversion attacks, which extract training data directly from the model and were previously thought to be intractable, become feasible when attacking a robustly trained model. The input space for a traditionally trained model is dominated by adversarial examples - data points that strongly activate a certain class but lack semantic meaning - which makes it difficult to successfully conduct model inversion attacks. We demonstrate this effect using the CIFAR-10 dataset under three different model inversion attacks: a vanilla gradient descent method, a gradient-based method at different scales, and a generative adversarial network based attack. |
Tasks | |
Published | 2019-06-15 |
URL | https://arxiv.org/abs/1906.06449v1 |
https://arxiv.org/pdf/1906.06449v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-or-private-adversarial-training-makes |
Repo | |
Framework | |
Deep Learning and MARS: A Connection
Title | Deep Learning and MARS: A Connection |
Authors | Michael Kohler, Adam Krzyzak, Sophie Langer |
Abstract | We consider least squares regression estimates using deep neural networks. We show that these estimates satisfy an oracle inequality, which implies that (up to a logarithmic factor) the error of these estimates is at least as small as the optimal possible error bound which one would expect for MARS if that procedure worked in the optimal way. As a result we show that our neural networks are able to achieve a dimensionality reduction when the regression function locally has low dimensionality. This assumption seems to be realistic in real-world applications, since selected high-dimensional data are often confined to locally-low-dimensional distributions. In our simulation study we provide numerical experiments to support our theoretical results and to compare our estimate with other conventional nonparametric regression estimates, especially with MARS. The use of our estimates is illustrated through a real data analysis. |
Tasks | Dimensionality Reduction |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11140v2 |
https://arxiv.org/pdf/1908.11140v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-and-mars-a-connection |
Repo | |
Framework | |
The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification
Title | The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification |
Authors | Jimmy Lin |
Abstract | Motivated by recent commentary that has questioned today’s pursuit of ever-more complex models and mathematical formalisms in applied machine learning and whether meaningful empirical progress is actually being made, this paper tries to tackle the decades-old problem of pseudo-relevance feedback with “the simplest thing that can possibly work”. I present a technique based on training a document relevance classifier for each information need using pseudo-labels from an initial ranked list and then applying the classifier to rerank the retrieved documents. Experiments demonstrate significant improvements across a number of newswire collections, with initial rankings supplied by “bag of words” BM25 as well as from a well-tuned query expansion model. While this simple technique draws elements from several well-known threads in the literature, to my knowledge this exact combination has not previously been proposed and evaluated. |
Tasks | Text Classification |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08861v1 |
http://arxiv.org/pdf/1904.08861v1.pdf | |
PWC | https://paperswithcode.com/paper/the-simplest-thing-that-can-possibly-work |
Repo | |
Framework | |
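The technique described in the abstract above, training a relevance classifier per query on pseudo-labels from an initial ranking and then reranking with it, can be sketched in a few lines. This is only an illustration under stated assumptions: treating the top k documents as positives and the bottom k as negatives, the bag-of-words features, and the hand-rolled logistic regression are choices made here, not details taken from the paper.

```python
import math
from collections import Counter


def featurize(text):
    """Bag-of-words term counts (a stand-in for a real feature extractor)."""
    return Counter(text.lower().split())


def train_logreg(examples, labels, epochs=200, lr=0.5):
    """Fit a tiny logistic regression with plain stochastic gradient descent."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for feats, y in zip(examples, labels):
            z = bias + sum(weights[t] * v for t, v in feats.items())
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            grad = 1.0 / (1.0 + math.exp(-z)) - y
            bias -= lr * grad
            for t, v in feats.items():
                weights[t] -= lr * grad * v
    return weights, bias


def rerank(ranked_docs, k=2):
    """Rerank an initial list with a classifier trained on pseudo-labels.

    ranked_docs: document texts, best first, from an initial retrieval run.
    Assumption: the top k are pseudo-relevant, the bottom k pseudo-non-relevant.
    """
    feats = [featurize(d) for d in ranked_docs]
    weights, bias = train_logreg(feats[:k] + feats[-k:], [1] * k + [0] * k)

    def score(f):
        return bias + sum(weights[t] * v for t, v in f.items())

    order = sorted(range(len(ranked_docs)), key=lambda i: -score(feats[i]))
    return [ranked_docs[i] for i in order]


# Hypothetical initial ranking for a query about solar panels.
docs = [
    "solar panel efficiency gains",
    "solar cell efficiency record",
    "football match results",
    "solar panel installation cost",
    "celebrity gossip news",
    "football transfer news",
]
reranked = rerank(docs, k=2)
```

In this toy run the off-topic document at rank 3 should drop below the on-topic document at rank 4, since it shares a term with the pseudo-negatives while the other shares terms with the pseudo-positives.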
FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving
Title | FuseMODNet: Real-Time Camera and LiDAR based Moving Object Detection for robust low-light Autonomous Driving |
Authors | Hazem Rashed, Mohamed Ramzy, Victor Vaquero, Ahmad El Sallab, Ganesh Sistu, Senthil Yogamani |
Abstract | Moving object detection is a critical task for autonomous vehicles. As dynamic objects represent higher collision risk than static ones, our own ego-trajectories have to be planned attending to the future states of the moving elements of the scene. Motion can be perceived using temporal information such as optical flow. Conventional optical flow computation is based on camera sensors only, which makes it prone to failure in conditions with low illumination. On the other hand, LiDAR sensors are independent of illumination, as they measure the time-of-flight of their own emitted lasers. In this work, we propose a robust and real-time CNN architecture for Moving Object Detection (MOD) under low-light conditions by capturing motion information from both camera and LiDAR sensors. We demonstrate the impact of our algorithm on the KITTI dataset, where we simulate a low-light environment to create a novel dataset, “Dark KITTI”. We obtain a 10.1% relative improvement on Dark-KITTI, and a 4.25% improvement on standard KITTI relative to our baselines. The proposed algorithm runs at 18 fps on a standard desktop GPU using $256\times1224$ resolution images. |
Tasks | Autonomous Driving, Autonomous Vehicles, Object Detection, Optical Flow Estimation |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05395v3 |
https://arxiv.org/pdf/1910.05395v3.pdf | |
PWC | https://paperswithcode.com/paper/fusemodnet-real-time-camera-and-lidar-based |
Repo | |
Framework | |
Stable Backward Diffusion Models that Minimise Convex Energies
Title | Stable Backward Diffusion Models that Minimise Convex Energies |
Authors | Leif Bergerhoff, Marcelo Cárdenas, Joachim Weickert, Martin Welk |
Abstract | Backward diffusion processes appear naturally in image enhancement and deblurring applications. However, the inverse problem of backward diffusion is known to be ill-posed and straightforward numerical algorithms are unstable. So far, existing stabilisation strategies in the literature require sophisticated numerics to solve the underlying initial value problem. Therefore, it is desirable to establish a backward diffusion model which implements a smart stabilisation approach that can be used in combination with a simple numerical scheme. We derive a class of space-discrete one-dimensional backward diffusion as gradient descent of energies where we gain stability by imposing range constraints. Interestingly, these energies are even convex. Furthermore, we establish a comprehensive theory for the time-continuous evolution and we show that stability carries over to a simple explicit time discretisation of our model. Finally, we confirm the stability and usefulness of our technique in experiments in which we enhance the contrast of digital greyscale and colour images. |
Tasks | Deblurring, Image Enhancement |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03491v1 |
http://arxiv.org/pdf/1903.03491v1.pdf | |
PWC | https://paperswithcode.com/paper/stable-backward-diffusion-models-that |
Repo | |
Framework | |
CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots
Title | CASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots |
Authors | Arshit Gupta, Peng Zhang, Garima Lalwani, Mona Diab |
Abstract | Natural Language Understanding (NLU) is a core component of dialog systems. It typically involves two tasks - intent classification (IC) and slot labeling (SL), which are then followed by a dialogue management (DM) component. Such NLU systems cater to utterances in isolation, thus pushing the problem of context management to DM. However, contextual information is critical to the correct prediction of intents and slots in a conversation. Prior work on contextual NLU has been limited in terms of the types of contextual signals used and the understanding of their impact on the model. In this work, we propose a context-aware self-attentive NLU (CASA-NLU) model that uses multiple signals, such as previous intents, slots, dialog acts and utterances over a variable context window, in addition to the current user utterance. CASA-NLU outperforms a recurrent contextual NLU baseline on two conversational datasets, yielding a gain of up to 7% on the IC task for one of the datasets. Moreover, a non-contextual variant of CASA-NLU achieves state-of-the-art performance for IC task on standard public datasets - Snips and ATIS. |
Tasks | Dialogue Management, Intent Classification |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08705v1 |
https://arxiv.org/pdf/1909.08705v1.pdf | |
PWC | https://paperswithcode.com/paper/casa-nlu-context-aware-self-attentive-natural |
Repo | |
Framework | |
Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework
Title | Semi Supervised Phrase Localization in a Bidirectional Caption-Image Retrieval Framework |
Authors | Deepan Das, Noor Mohammed Ghouse, Shashank Verma, Yin Li |
Abstract | We introduce a novel deep neural network architecture that links visual regions to corresponding textual segments including phrases and words. To accomplish this task, our architecture makes use of the rich semantic information available in a joint embedding space of multi-modal data. From this joint embedding space, we extract the associative localization maps that develop naturally, without explicitly providing supervision during training for the localization task. The joint space is learned using a bidirectional ranking objective that is optimized using an $N$-pair loss formulation. This training mechanism demonstrates the idea that localization information is learned inherently while optimizing a bidirectional retrieval objective. The model’s retrieval and localization performance is evaluated on the MSCOCO and Flickr30K Entities datasets. This architecture outperforms state-of-the-art results in the semi-supervised phrase localization setting. |
Tasks | Image Retrieval |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.02950v1 |
https://arxiv.org/pdf/1908.02950v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-phrase-localization-in-a |
Repo | |
Framework | |
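The bidirectional ranking objective with an N-pair loss mentioned in the abstract above can be illustrated on a precomputed similarity matrix. This is a minimal sketch, assuming a batch of matched image-caption pairs where entry (i, j) holds the similarity of image i to caption j. How the paper computes similarities and builds batches is not stated in the abstract, and the function names here are hypothetical.

```python
import math


def n_pair_loss(sim):
    """Mean N-pair loss over the rows of a square similarity matrix.

    sim[i][j] is the similarity of query i to candidate j; the matching
    candidate for query i sits on the diagonal.
    """
    total = 0.0
    for i, row in enumerate(sim):
        total += math.log(
            1.0 + sum(math.exp(s - row[i]) for j, s in enumerate(row) if j != i)
        )
    return total / len(sim)


def bidirectional_loss(sim):
    """Sum of the image-to-caption and caption-to-image N-pair losses."""
    transposed = [list(col) for col in zip(*sim)]
    return n_pair_loss(sim) + n_pair_loss(transposed)


# Hypothetical similarities: matched pairs score high on the diagonal.
aligned = [[5.0, 0.0], [0.0, 5.0]]
misaligned = [[0.0, 5.0], [5.0, 0.0]]
loss_aligned = bidirectional_loss(aligned)
loss_misaligned = bidirectional_loss(misaligned)
```

Minimising this quantity pushes each matched pair above all mismatched pairs in both retrieval directions.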
Data-Driven Design for Fourier Ptychographic Microscopy
Title | Data-Driven Design for Fourier Ptychographic Microscopy |
Authors | Michael Kellman, Emrah Bostan, Michael Chen, Laura Waller |
Abstract | Fourier Ptychographic Microscopy (FPM) is a computational imaging method that is able to super-resolve features beyond the diffraction-limit set by the objective lens of a traditional microscope. This is accomplished by using synthetic aperture and phase retrieval algorithms to combine many measurements captured by an LED array microscope with programmable source patterns. FPM provides simultaneous large field-of-view and high resolution imaging, but at the cost of reduced temporal resolution, thereby limiting live cell applications. In this work, we learn LED source pattern designs that compress the many required measurements into only a few, with negligible loss in reconstruction quality or resolution. This is accomplished by recasting the super-resolution reconstruction as a Physics-based Neural Network and learning the experimental design to optimize the network’s overall performance. Specifically, we learn LED patterns for different applications (e.g. amplitude contrast and quantitative phase imaging) and show that the designs we learn through simulation generalize well in the experimental setting. Further, we discuss a context-specific loss function, practical memory limitations, and interpretability of our learned designs. |
Tasks | Super-Resolution |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.04175v1 |
http://arxiv.org/pdf/1904.04175v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-design-for-fourier-ptychographic |
Repo | |
Framework | |