October 17, 2019


Paper Group ANR 704

Towards Dialogue-based Navigation with Multivariate Adaptation driven by Intention and Politeness for Social Robots. Neural Network Compression using Transform Coding and Clustering. Asynchronous Stochastic Composition Optimization with Variance Reduction. Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization B …


Title	Towards Dialogue-based Navigation with Multivariate Adaptation driven by Intention and Politeness for Social Robots
Authors	Chandrakant Bothe, Fernando Garcia, Arturo Cruz Maya, Amit Kumar Pandey, Stefan Wermter
Abstract	Service robots need to show appropriate social behaviour in order to be deployed in social environments such as healthcare, education, retail, etc. Some of the main capabilities that robots should have are navigation and conversational skills. If the person is impatient, the person might want a robot to navigate faster and vice versa. Linguistic features that indicate politeness can provide social cues about a person’s patient and impatient behaviour. The novelty presented in this paper is to dynamically incorporate politeness in robotic dialogue systems for navigation. Understanding the politeness in users’ speech can be used to modulate the robot behaviour and responses. Therefore, we developed a dialogue system to navigate in an indoor environment, which produces different robot behaviours and responses based on users’ intention and degree of politeness. We deploy and test our system with the Pepper robot that adapts to the changes in user’s politeness.
Tasks
Published	2018-09-19
URL	http://arxiv.org/abs/1809.07269v2
PDF	http://arxiv.org/pdf/1809.07269v2.pdf
PWC	https://paperswithcode.com/paper/towards-dialogue-based-navigation-with
Repo
Framework

Neural Network Compression using Transform Coding and Clustering


Title	Neural Network Compression using Transform Coding and Clustering
Authors	Thorsten Laude, Yannick Richter, Jörn Ostermann
Abstract	With the deployment of neural networks on mobile devices and the necessity of transmitting neural networks over limited or expensive channels, the file size of the trained model was identified as a bottleneck. In this paper, we propose a codec for the compression of neural networks which is based on transform coding for convolutional and dense layers and on clustering for biases and normalizations. By using this codec, we achieve average compression factors between 7.9 and 9.3, while the accuracy of the compressed networks for image classification decreases by only 1% to 2%, respectively.
Tasks	Image Classification, Neural Network Compression
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07258v1
PDF	http://arxiv.org/pdf/1805.07258v1.pdf
PWC	https://paperswithcode.com/paper/neural-network-compression-using-transform
Repo
Framework
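
The abstract mentions clustering for biases and normalizations but gives no details of the codec. As a rough illustration of the general idea only, the sketch below quantizes a list of scalar parameters with plain 1-D k-means (Lloyd's algorithm), so that each value is stored as a small index into a codebook. The function name `kmeans_1d`, the initialization, and the sample values are assumptions made for this example, not the authors' method.

```python
def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm on scalars; returns (centroids, assignments).

    Hypothetical helper for illustration; not taken from the paper.
    """
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly over the value range (deterministic).
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    assign = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        assign = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, assign


# Made-up bias values that fall into three obvious groups.
biases = [0.01, 0.02, 0.03, 0.98, 1.01, 1.02, -0.51, -0.49]
codebook, indices = kmeans_1d(biases, k=3)

# Decoding looks each index up in the codebook.
reconstructed = [codebook[i] for i in indices]
max_error = max(abs(a - b) for a, b in zip(biases, reconstructed))
```

With a codebook of `k` entries, each parameter needs only about log2(k) bits for its index plus a share of the codebook itself, which is where the size reduction would come from in a scheme of this kind.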

Asynchronous Stochastic Composition Optimization with Variance Reduction


Title	Asynchronous Stochastic Composition Optimization with Variance Reduction
Authors	Shuheng Shen, Linli Xu, Jingchang Liu, Junliang Guo, Qing Ling
Abstract	Composition optimization has drawn a lot of attention in a wide variety of machine learning domains from risk management to reinforcement learning. Existing methods solving the composition optimization problem often work in a sequential and single-machine manner, which limits their applications in large-scale problems. To address this issue, this paper proposes two asynchronous parallel variance reduced stochastic compositional gradient (AsyVRSC) algorithms that are suitable to handle large-scale data sets. The two algorithms are AsyVRSC-Shared for the shared-memory architecture and AsyVRSC-Distributed for the master-worker architecture. The embedded variance reduction techniques enable the algorithms to achieve linear convergence rates. Furthermore, AsyVRSC-Shared and AsyVRSC-Distributed enjoy provable linear speedup, when the time delays are bounded by the data dimensionality or the sparsity ratio of the partial gradients, respectively. Extensive experiments are conducted to verify the effectiveness of the proposed algorithms.
Tasks
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06396v1
PDF	http://arxiv.org/pdf/1811.06396v1.pdf
PWC	https://paperswithcode.com/paper/asynchronous-stochastic-composition
Repo
Framework

Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds


Title	Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
Authors	Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus
Abstract	We present an efficient coresets-based neural network compression algorithm that sparsifies the parameters of a trained fully-connected neural network in a manner that provably approximates the network’s output. Our approach is based on an importance sampling scheme that judiciously defines a sampling distribution over the neural network parameters, and as a result, retains parameters of high importance while discarding redundant ones. We leverage a novel, empirical notion of sensitivity and extend traditional coreset constructions to the application of compressing parameters. Our theoretical analysis establishes guarantees on the size and accuracy of the resulting compressed network and gives rise to generalization bounds that may provide new insights into the generalization properties of neural networks. We demonstrate the practical effectiveness of our algorithm on a variety of neural network configurations and real-world data sets.
Tasks	Neural Network Compression
Published	2018-04-15
URL	https://arxiv.org/abs/1804.05345v6
PDF	https://arxiv.org/pdf/1804.05345v6.pdf
PWC	https://paperswithcode.com/paper/data-dependent-coresets-for-compressing
Repo
Framework

Distribution-based Prediction of the Degree of Grammaticalization for German Prepositions


Title	Distribution-based Prediction of the Degree of Grammaticalization for German Prepositions
Authors	Dominik Schlechtweg, Sabine Schulte im Walde
Abstract	We test the hypothesis that the degree of grammaticalization of German prepositions correlates with their corpus-based contextual dispersion measured by word entropy. We find that there is indeed a moderate correlation for entropy, but a stronger correlation for frequency and number of context types.
Tasks
Published	2018-04-14
URL	http://arxiv.org/abs/1804.06719v1
PDF	http://arxiv.org/pdf/1804.06719v1.pdf
PWC	https://paperswithcode.com/paper/distribution-based-prediction-of-the-degree
Repo
Framework
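
To make "contextual dispersion measured by word entropy" concrete, here is a minimal sketch that computes the Shannon entropy of the distribution of words occurring within a fixed window around a target word. The window size, the tokenization, the function name `context_entropy`, and the toy sentence are all assumptions for illustration; the abstract does not specify the exact definition used in the paper.

```python
import math
from collections import Counter


def context_entropy(tokens, target, window=1):
    """Shannon entropy (in bits) of the context-word distribution of `target`.

    Illustrative definition only; the paper's exact measure may differ.
    """
    contexts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                contexts[tokens[j]] += 1
    total = sum(contexts.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in contexts.values())


# Toy corpus (made up): "in" appears in more varied contexts than "wegen".
tokens = "er geht in die Stadt und sie bleibt in der Stadt wegen des Regens".split()
entropy_in = context_entropy(tokens, "in")        # 4 equally likely contexts -> 2.0 bits
entropy_wegen = context_entropy(tokens, "wegen")  # 2 equally likely contexts -> 1.0 bit
```

A preposition that occurs with many different neighbours gets a higher entropy than one that occurs in few, fixed contexts, which is the kind of dispersion the hypothesis relates to grammaticalization.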

Training Convolutional Networks with Web Images


Title	Training Convolutional Networks with Web Images
Authors	Nizar Massouh
Abstract	In this thesis we investigate the effect of using web images to build a large scale database to be used alongside a deep learning method for a classification task. We replicate the ImageNet large scale database (ILSVRC-2012) from images collected from the web using 4 different download strategies, varying the search engine, the query and the image resolution. As the deep learning method, we choose AlexNet, a Convolutional Neural Network that has been very successful on recognition tasks.
Tasks
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08416v1
PDF	http://arxiv.org/pdf/1805.08416v1.pdf
PWC	https://paperswithcode.com/paper/training-convolutional-networks-with-web
Repo
Framework

High Dimensional Linear Regression using Lattice Basis Reduction


Title	High Dimensional Linear Regression using Lattice Basis Reduction
Authors	David Gamarnik, Ilias Zadik
Abstract	We consider a high dimensional linear regression problem where the goal is to efficiently recover an unknown vector $\beta^*$ from $n$ noisy linear observations $Y=X\beta^*+W \in \mathbb{R}^n$, for known $X \in \mathbb{R}^{n \times p}$ and unknown $W \in \mathbb{R}^n$. Unlike most of the literature on this model we make no sparsity assumption on $\beta^*$. Instead we adopt a regularization based on assuming that the underlying vectors $\beta^*$ have rational entries with the same denominator $Q \in \mathbb{Z}_{>0}$. We call this $Q$-rationality assumption. We propose a new polynomial-time algorithm for this task which is based on the seminal Lenstra-Lenstra-Lovasz (LLL) lattice basis reduction algorithm. We establish that under the $Q$-rationality assumption, our algorithm recovers exactly the vector $\beta^*$ for a large class of distributions for the iid entries of $X$ and non-zero noise $W$. We prove that it is successful under small noise, even when the learner has access to only one observation ($n=1$). Furthermore, we prove that in the case of the Gaussian white noise for $W$, $n=o\left(p/\log p\right)$ and $Q$ sufficiently large, our algorithm tolerates a nearly optimal information-theoretic level of the noise.
Tasks
Published	2018-03-18
URL	http://arxiv.org/abs/1803.06716v2
PDF	http://arxiv.org/pdf/1803.06716v2.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-linear-regression-using
Repo
Framework

Radiomic Synthesis Using Deep Convolutional Neural Networks


Title	Radiomic Synthesis Using Deep Convolutional Neural Networks
Authors	Vishwa S. Parekh, Michael A. Jacobs
Abstract	Radiomics is a rapidly growing field that deals with modeling the textural information present in the different tissues of interest for clinical decision support. However, the process of generating radiomic images is computationally very expensive and could take substantial time per radiological image for certain higher order features, such as the gray-level co-occurrence matrix (GLCM), even with high-end GPUs. To that end, we developed RadSynth, a deep convolutional neural network (CNN) model, to efficiently generate radiomic images. RadSynth was tested on a breast cancer patient cohort of twenty-four patients (ten benign, ten malignant and four normal) for computation of GLCM entropy images from post-contrast DCE-MRI. RadSynth produced excellent synthetic entropy images compared to traditional GLCM entropy images. The average percentage difference and correlation between the two techniques were 0.07 $\pm$ 0.06 and 0.97, respectively. In conclusion, RadSynth presents a new powerful tool for fast computation and visualization of the textural information present in the radiological images.
Tasks
Published	2018-10-25
URL	https://arxiv.org/abs/1810.11090v2
PDF	https://arxiv.org/pdf/1810.11090v2.pdf
PWC	https://paperswithcode.com/paper/radiomic-synthesis-using-deep-convolutional
Repo
Framework
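
For readers unfamiliar with the feature that RadSynth is trained to approximate, the sketch below computes a GLCM entropy value for a small grey-level patch: it counts how often each pair of grey levels occurs at a fixed pixel offset and takes the Shannon entropy of the normalized counts. The single offset, the non-symmetric matrix, the base-2 logarithm, and the function name `glcm_entropy` are assumptions of this example; a radiomic entropy image would apply such a computation in a sliding window around every pixel, which is the expensive part.

```python
import math
from collections import Counter


def glcm_entropy(image, dx=1, dy=0):
    """Entropy (bits) of the grey-level co-occurrence distribution of a patch.

    `image` is a list of rows of integer grey levels; (dx, dy) is the offset
    between the two pixels of each counted pair. Illustrative sketch only.
    """
    pairs = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                pairs[(image[r][c], image[r2][c2])] += 1
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    return -sum(n / total * math.log2(n / total) for n in pairs.values())


flat_patch = [[3, 3, 3], [3, 3, 3]]   # a single co-occurring pair type
checker_patch = [[0, 1], [1, 0]]      # two equally frequent pair types

flat_entropy = glcm_entropy(flat_patch)
checker_entropy = glcm_entropy(checker_patch)
```

A homogeneous patch yields zero entropy, while a patch with varied texture yields a higher value.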

Improving Grey-Box Fuzzing by Modeling Program Behavior


Title	Improving Grey-Box Fuzzing by Modeling Program Behavior
Authors	Siddharth Karamcheti, Gideon Mann, David Rosenberg
Abstract	Grey-box fuzzers such as American Fuzzy Lop (AFL) are popular tools for finding bugs and potential vulnerabilities in programs. While these fuzzers have been able to find vulnerabilities in many widely used programs, they are not efficient; of the millions of inputs executed by AFL in a typical fuzzing run, only a handful discover unseen behavior or trigger a crash. The remaining inputs are redundant, exhibiting behavior that has already been observed. Here, we present an approach to increase the efficiency of fuzzers like AFL by applying machine learning to directly model how programs behave. We learn a forward prediction model that maps program inputs to execution traces, training on the thousands of inputs collected during standard fuzzing. This learned model guides exploration by focusing on fuzzing inputs on which our model is the most uncertain (measured via the entropy of the predicted execution trace distribution). By focusing on executing inputs our learned model is unsure about, and ignoring any input whose behavior our model is certain about, we show that we can significantly limit wasteful execution. Through testing our approach on a set of binaries released as part of the DARPA Cyber Grand Challenge, we show that our approach is able to find a set of inputs that result in more code coverage and discovered crashes than baseline fuzzers with significantly fewer executions.
Tasks
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08973v1
PDF	http://arxiv.org/pdf/1811.08973v1.pdf
PWC	https://paperswithcode.com/paper/improving-grey-box-fuzzing-by-modeling
Repo
Framework
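
The selection step described in the abstract, executing only inputs on which the learned model is most uncertain, can be sketched as ranking candidates by the entropy of their predicted distribution. In the sketch below the model is replaced by a hard-coded dictionary of made-up probabilities over three hypothetical trace classes; the names `select_uncertain`, `budget`, and the input labels are assumptions for illustration and do not come from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_uncertain(predictions, budget):
    """Return the `budget` candidate inputs with the highest predictive entropy.

    `predictions` maps a candidate input name to the model's predicted
    distribution over (hypothetical) execution-trace classes.
    """
    ranked = sorted(predictions, key=lambda name: entropy(predictions[name]), reverse=True)
    return ranked[:budget]


# Made-up predictions standing in for the output of a learned forward model.
predictions = {
    "input_a": [0.98, 0.01, 0.01],  # model is nearly certain: skip
    "input_b": [0.40, 0.30, 0.30],  # model is unsure: worth executing
    "input_c": [0.70, 0.20, 0.10],
}
to_execute = select_uncertain(predictions, budget=2)
```

Inputs with near-deterministic predictions are skipped, so the execution budget is spent where an actual run is most likely to reveal behaviour the model has not yet seen.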

Embedding Individual Table Columns for Resilient SQL Chatbots


Title	Embedding Individual Table Columns for Resilient SQL Chatbots
Authors	Bojan Petrovski, Ignacio Aguado, Andreea Hossmann, Michael Baeriswyl, Claudiu Musat
Abstract	Most of the world’s data is stored in relational databases. Accessing these requires specialized knowledge of the Structured Query Language (SQL), putting them out of the reach of many people. A recent research thread in Natural Language Processing (NLP) aims to alleviate this problem by automatically translating natural language questions into SQL queries. While the proposed solutions are a great start, they lack robustness and do not easily generalize: the methods require high quality descriptions of the database table columns, and the most widely used training dataset, WikiSQL, is heavily biased towards using those descriptions as part of the questions. In this work, we propose solutions to both problems: we entirely eliminate the need for column descriptions, by relying solely on their contents, and we augment the WikiSQL dataset by paraphrasing column names to reduce bias. We show that the accuracy of existing methods drops when trained on our augmented, column-agnostic dataset, and that our own method reaches state of the art accuracy, while relying on column contents only.
Tasks	SQL Chatbots
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00633v1
PDF	http://arxiv.org/pdf/1811.00633v1.pdf
PWC	https://paperswithcode.com/paper/embedding-individual-table-columns-for
Repo
Framework

Magnitude Bounded Matrix Factorisation for Recommender Systems


Title	Magnitude Bounded Matrix Factorisation for Recommender Systems
Authors	Shuai Jiang, Kan Li, Richard Yi Da Xu
Abstract	Low rank matrix factorisation is often used in recommender systems as a way of extracting latent features. When dealing with large and sparse datasets, traditional recommendation algorithms face the problem of acquiring large, unrestrained, fluctuating values over predictions especially for users/items with very few corresponding observations. Although the problem has been somewhat solved by imposing bounding constraints over its objectives, and/or over all entries to be within a fixed range, in terms of gaining better recommendations, these approaches have two major shortcomings that we aim to mitigate in this work: one is they can only deal with one pair of fixed bounds for all entries, and the other one is they are very time-consuming when applied on large scale recommender systems. In this paper, we propose a novel algorithm named Magnitude Bounded Matrix Factorisation (MBMF), which allows different bounds for individual users/items and performs very fast on large scale datasets. The key idea of our algorithm is to construct a model by constraining the magnitudes of each individual user/item feature vector. We achieve this by converting from the Cartesian to Spherical coordinate system with radii set as the corresponding magnitudes, which allows the above constrained optimisation problem to become an unconstrained one. The Stochastic Gradient Descent (SGD) method is then applied to solve the unconstrained task efficiently. Experiments on synthetic and real datasets demonstrate that in most cases the proposed MBMF is superior over all existing algorithms in terms of accuracy and time complexity.
Tasks	Recommendation Systems
Published	2018-07-15
URL	http://arxiv.org/abs/1807.05515v1
PDF	http://arxiv.org/pdf/1807.05515v1.pdf
PWC	https://paperswithcode.com/paper/magnitude-bounded-matrix-factorisation-for
Repo
Framework
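
The conversion from Cartesian to spherical coordinates is what turns the magnitude constraint into an unconstrained problem: if a feature vector is parameterized by a fixed radius and free angles, its Euclidean norm equals the radius for any choice of angles. The sketch below shows the standard hyperspherical mapping in plain Python; the function name and the sample values are assumptions for this example, and the abstract does not say which exact parameterization or radii the authors use.

```python
import math


def spherical_to_cartesian(radius, angles):
    """Map a radius and n-1 angles to an n-dimensional Cartesian vector.

    Standard hyperspherical coordinates:
      x_1 = r cos(a_1), x_2 = r sin(a_1) cos(a_2), ..., x_n = r sin(a_1)...sin(a_{n-1})
    The resulting vector always has Euclidean norm equal to `radius`.
    """
    coords = []
    sin_prod = 1.0
    for a in angles:
        coords.append(radius * sin_prod * math.cos(a))
        sin_prod *= math.sin(a)
    coords.append(radius * sin_prod)
    return coords


# Arbitrary angles: the optimizer may move them freely without leaving the sphere.
vector = spherical_to_cartesian(2.5, [0.3, 1.1, 2.0])
norm = math.sqrt(sum(x * x for x in vector))
```

Because the norm is fixed by construction, a gradient method such as SGD can update the angles without any projection or clipping step.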

Implicit Regularization of Stochastic Gradient Descent in Natural Language Processing: Observations and Implications


Title	Implicit Regularization of Stochastic Gradient Descent in Natural Language Processing: Observations and Implications
Authors	Deren Lei, Zichen Sun, Yijun Xiao, William Yang Wang
Abstract	Deep neural networks with remarkably strong generalization performances are usually over-parameterized. Although explicit regularization strategies are used by practitioners to avoid over-fitting, the impacts are often small. Some theoretical studies have analyzed the implicit regularization effect of stochastic gradient descent (SGD) on simple machine learning models with certain assumptions. However, how it behaves practically in state-of-the-art models and real-world datasets is still unknown. To bridge this gap, we study the role of SGD implicit regularization in deep learning systems. We show pure SGD tends to converge to minima that have better generalization performances in multiple natural language processing (NLP) tasks. This phenomenon coexists with dropout, an explicit regularizer. In addition, a neural network’s finite learning capability does not impact the intrinsic nature of SGD’s implicit regularization effect. Specifically, under limited training samples or with certain corrupted labels, the implicit regularization effect remains strong. We further analyze the stability by varying the weight initialization range. We corroborate these experimental findings with a decision boundary visualization using a 3-layer neural network for interpretation. Altogether, our work enables a deepened understanding of how implicit regularization affects the deep learning model and sheds light on the future study of the over-parameterized model’s generalization ability.
Tasks
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00659v1
PDF	http://arxiv.org/pdf/1811.00659v1.pdf
PWC	https://paperswithcode.com/paper/implicit-regularization-of-stochastic
Repo
Framework

A Structural Correlation Filter Combined with A Multi-task Gaussian Particle Filter for Visual Tracking


Title	A Structural Correlation Filter Combined with A Multi-task Gaussian Particle Filter for Visual Tracking
Authors	Manna Dai, Shuying Cheng, Xiangjian He, Dadong Wang
Abstract	In this paper, we propose a novel structural correlation filter combined with a multi-task Gaussian particle filter (KCF-GPF) model for robust visual tracking. We first present an assemble structure where several KCF trackers as weak experts provide a preliminary decision for a Gaussian particle filter to make a final decision. The proposed method is designed to exploit and complement the strength of a KCF and a Gaussian particle filter. Compared with the existing tracking methods based on correlation filters or particle filters, the proposed tracker has several advantages. First, it can detect the tracked target in a large-scale search scope via weak KCF trackers and evaluate the reliability of weak trackers' decisions for a Gaussian particle filter to make a strong decision, and hence it can tackle fast motions, appearance variations, occlusions and re-detections. Second, it can effectively handle large-scale variations via a Gaussian particle filter. Third, it can be amenable to fully parallel implementation using importance sampling without resampling, thereby it is convenient for VLSI implementation and can lower the computational costs. Extensive experiments on the OTB-2013 dataset containing 50 challenging sequences demonstrate that the proposed algorithm performs favourably against 16 state-of-the-art trackers.
Tasks	Visual Tracking
Published	2018-03-03
URL	http://arxiv.org/abs/1803.05845v1
PDF	http://arxiv.org/pdf/1803.05845v1.pdf
PWC	https://paperswithcode.com/paper/a-structural-correlation-filter-combined-with
Repo
Framework

Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots


Title	Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots
Authors	Shaojie Jiang, Maarten de Rijke
Abstract	Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.
Tasks	Information Retrieval
Published	2018-09-06
URL	http://arxiv.org/abs/1809.01941v1
PDF	http://arxiv.org/pdf/1809.01941v1.pdf
PWC	https://paperswithcode.com/paper/why-are-sequence-to-sequence-models-so-dull
Repo
Framework
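
The two remedies named at the end of the abstract, confidence penalties and label smoothing, are both small changes to the per-token training loss. The sketch below writes them out for a single prediction in plain Python, using the common formulations (uniform label smoothing, and negative log-likelihood minus a scaled entropy term). The function names, the values of `eps` and `beta`, and the sample distribution are assumptions for illustration; the abstract does not give the exact variants the authors have in mind.

```python
import math


def smoothed_cross_entropy(probs, target, eps=0.1):
    """Cross-entropy against a label-smoothed target distribution.

    The one-hot target is mixed with a uniform distribution:
    q_i = (1 - eps) * [i == target] + eps / K. Assumes all probs > 0.
    """
    k = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        q = eps / k + ((1.0 - eps) if i == target else 0.0)
        loss -= q * math.log(p)
    return loss


def confidence_penalty_loss(probs, target, beta=0.1):
    """Negative log-likelihood minus beta times the entropy of the prediction.

    Low-entropy (over-confident) predictions receive a smaller entropy bonus.
    """
    nll = -math.log(probs[target])
    pred_entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return nll - beta * pred_entropy


# A made-up, over-confident prediction over a 3-word vocabulary.
overconfident = [0.98, 0.01, 0.01]
plain_nll = -math.log(overconfident[0])
smoothed = smoothed_cross_entropy(overconfident, target=0)
penalized = confidence_penalty_loss(overconfident, target=0)
```

Under label smoothing, a prediction that puts almost all of its mass on the correct token still pays a noticeable loss, because the smoothed target assigns some probability to every other token. The confidence penalty instead rewards predictions that keep some entropy.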

Method for Hybrid Precision Convolutional Neural Network Representation


Title	Method for Hybrid Precision Convolutional Neural Network Representation
Authors	Mo’taz Al-Hami, Marcin Pietron, Rishi Kumar, Raul A. Casas, Samer L. Hijazi, Chris Rowen
Abstract	This invention addresses fixed-point representations of convolutional neural networks (CNN) in integrated circuits. When quantizing a CNN for a practical implementation there is a trade-off between the precision used for operations between coefficients and data and the accuracy of the system. A homogenous representation may not be sufficient to achieve the best level of performance at a reasonable cost in implementation complexity or power consumption. Parsimonious ways of representing data and coefficients are needed to improve power efficiency and throughput while maintaining accuracy of a CNN.
Tasks
Published	2018-07-24
URL	http://arxiv.org/abs/1807.09760v1
PDF	http://arxiv.org/pdf/1807.09760v1.pdf
PWC	https://paperswithcode.com/paper/method-for-hybrid-precision-convolutional
Repo
Framework
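
The precision trade-off described in the abstract can be illustrated with a generic signed fixed-point quantizer: the same coefficient is represented with different total and fractional bit widths, and the representation error grows as bits are removed. This is a textbook sketch, not the representation claimed in the paper, which the abstract does not describe; the function name `to_fixed_point` and the bit widths are assumptions for this example.

```python
def to_fixed_point(x, total_bits, frac_bits):
    """Quantize x to signed fixed point and return the value it represents.

    The integer code is round(x * 2**frac_bits), saturated to the range of a
    two's-complement integer with `total_bits` bits.
    """
    scale = 1 << frac_bits
    code = round(x * scale)
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    code = max(lo, min(hi, code))  # saturate instead of wrapping around
    return code / scale


weight = 0.3
high_precision = to_fixed_point(weight, total_bits=8, frac_bits=6)  # 0.296875
low_precision = to_fixed_point(weight, total_bits=4, frac_bits=2)   # 0.25
saturated = to_fixed_point(10.0, total_bits=8, frac_bits=6)         # 1.984375
```

A hybrid scheme would pick `total_bits` and `frac_bits` separately for different layers or for data versus coefficients, rather than using one homogeneous setting for the whole network.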