October 19, 2019

Paper Group ANR 143

WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations

Title WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations
Authors Mohammad Taher Pilehvar, Jose Camacho-Collados
Abstract By design, word embeddings are unable to model the dynamic nature of words’ semantics, i.e., the property of words to correspond to potentially different meanings. To address this limitation, dozens of specialized meaning representation techniques such as sense or contextualized embeddings have been proposed. However, despite the popularity of research on this topic, very few evaluation benchmarks exist that specifically focus on the dynamic semantics of words. In this paper we show that existing models have surpassed the performance ceiling of the standard evaluation dataset for the purpose, i.e., Stanford Contextual Word Similarity, and highlight its shortcomings. To address the lack of a suitable benchmark, we put forward a large-scale Word in Context dataset, called WiC, based on annotations curated by experts, for generic evaluation of context-sensitive representations. WiC is released at https://pilehvar.github.io/wic/.
Tasks Word Embeddings
Published 2018-08-28
URL http://arxiv.org/abs/1808.09121v3
PDF http://arxiv.org/pdf/1808.09121v3.pdf
PWC https://paperswithcode.com/paper/wic-10000-example-pairs-for-evaluating
Repo
Framework

MVG Mechanism: Differential Privacy under Matrix-Valued Query

Title MVG Mechanism: Differential Privacy under Matrix-Valued Query
Authors Thee Chanyaswad, Alex Dytso, H. Vincent Poor, Prateek Mittal
Abstract Differential privacy mechanism design has traditionally been tailored for a scalar-valued query function. Although many mechanisms such as the Laplace and Gaussian mechanisms can be extended to a matrix-valued query function by adding i.i.d. noise to each element of the matrix, this method is often suboptimal as it forfeits an opportunity to exploit the structural characteristics typically associated with matrix analysis. To address this challenge, we propose a novel differential privacy mechanism called the Matrix-Variate Gaussian (MVG) mechanism, which adds a matrix-valued noise drawn from a matrix-variate Gaussian distribution, and we rigorously prove that the MVG mechanism preserves $(\epsilon,\delta)$-differential privacy. Furthermore, we introduce the concept of directional noise made possible by the design of the MVG mechanism. Directional noise allows the impact of the noise on the utility of the matrix-valued query function to be moderated. Finally, we experimentally demonstrate the performance of our mechanism using three matrix-valued queries on three privacy-sensitive datasets. We find that the MVG mechanism notably outperforms four previous state-of-the-art approaches, and provides comparable utility to the non-private baseline.
Tasks
Published 2018-01-02
URL http://arxiv.org/abs/1801.00823v3
PDF http://arxiv.org/pdf/1801.00823v3.pdf
PWC https://paperswithcode.com/paper/mvg-mechanism-differential-privacy-under
Repo
Framework
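
The noise-adding step described in the MVG abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the covariance matrices are hypothetical, and the covariances are NOT calibrated to any $(\epsilon,\delta)$ budget (the paper's sufficient conditions are not reproduced here). It only shows how a matrix-variate Gaussian sample with row covariance `row_cov` and column covariance `col_cov` can be drawn and added to a matrix-valued query result.

```python
import numpy as np

def sample_matrix_variate_gaussian(row_cov, col_cov, rng):
    """Draw Z ~ MN(0, row_cov, col_cov), shape (m, n).

    Uses the standard construction Z = A G B^T, where row_cov = A A^T,
    col_cov = B B^T and G has i.i.d. N(0, 1) entries.
    """
    m, n = row_cov.shape[0], col_cov.shape[0]
    A = np.linalg.cholesky(row_cov)
    B = np.linalg.cholesky(col_cov)
    G = rng.standard_normal((m, n))
    return A @ G @ B.T

def noisy_matrix_query(query_result, row_cov, col_cov, rng):
    """Add matrix-variate Gaussian noise to a matrix-valued query result.

    Assumption: the caller has already chosen covariances that satisfy the
    privacy conditions; this sketch does not check or derive them.
    """
    noise = sample_matrix_variate_gaussian(row_cov, col_cov, rng)
    return query_result + noise
```

Choosing non-identity covariances is what makes the noise "directional": more variance can be placed along rows or columns that matter less for utility, whereas i.i.d. noise corresponds to both covariances being scaled identity matrices.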

Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization

Title Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization
Authors Aryan Mokhtari, Hamed Hassani, Amin Karbasi
Abstract This paper considers stochastic optimization problems for a large class of objective functions, including convex and continuous submodular. Stochastic proximal gradient methods have been widely used to solve such problems; however, their applicability remains limited when the problem dimension is large and the projection onto a convex set is costly. Instead, stochastic conditional gradient methods are proposed as an alternative solution relying on (i) approximating gradients via a simple averaging technique requiring a single stochastic gradient evaluation per iteration; (ii) solving a linear program to compute the descent/ascent direction. The averaging technique reduces the noise of gradient approximations as time progresses, and replacing the projection step in proximal methods by a linear program lowers the computational complexity of each iteration. We show that under convexity and smoothness assumptions, our proposed method converges to the optimal objective function value at a sublinear rate of $O(1/t^{1/3})$. Further, for a monotone and continuous DR-submodular function and subject to a general convex body constraint, we prove that our proposed method achieves a $((1-1/e)OPT-\epsilon)$ guarantee with $O(1/\epsilon^3)$ stochastic gradient computations. This guarantee matches the known hardness results and closes the gap between deterministic and stochastic continuous submodular maximization. Additionally, we obtain a $((1/e)OPT -\epsilon)$ guarantee after using $O(1/\epsilon^3)$ stochastic gradients for the case that the objective function is continuous DR-submodular but non-monotone and the constraint set is down-closed. By using stochastic continuous optimization as an interface, we provide the first $(1-1/e)$ tight approximation guarantee for maximizing a monotone but stochastic submodular set function subject to a matroid constraint and $(1/e)$ approximation guarantee for the non-monotone case.
Tasks Stochastic Optimization
Published 2018-04-24
URL http://arxiv.org/abs/1804.09554v2
PDF http://arxiv.org/pdf/1804.09554v2.pdf
PWC https://paperswithcode.com/paper/stochastic-conditional-gradient-methods-from
Repo
Framework
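
The two ingredients named in the abstract above (an averaged gradient estimate built from one stochastic gradient per iteration, and a linear minimization step in place of a projection) can be sketched for the convex case as below. This is a hedged sketch, not the authors' code: the function names are hypothetical, the schedules for `rho` and `gamma` are assumptions chosen for illustration, and the constraint set is taken to be the probability simplex so that the linear program has a closed-form solution.

```python
import numpy as np

def simplex_lmo(direction):
    """Linear minimization oracle over the probability simplex.

    argmin_v <v, direction> is attained at the vertex with the smallest
    coordinate of `direction`.
    """
    v = np.zeros_like(direction)
    v[np.argmin(direction)] = 1.0
    return v

def stochastic_conditional_gradient(stoch_grad, lmo, x0, num_iters, rng):
    """Projection-free stochastic minimization with gradient averaging."""
    x = x0.copy()
    d = np.zeros_like(x0)  # running (averaged) gradient estimate
    for t in range(1, num_iters + 1):
        rho = 4.0 / (t + 8) ** (2.0 / 3.0)  # averaging weight (assumed schedule)
        gamma = 2.0 / (t + 8)               # step size (assumed schedule)
        g = stoch_grad(x, rng)              # single stochastic gradient
        d = (1.0 - rho) * d + rho * g       # average to reduce noise over time
        v = lmo(d)                          # linear program, no projection
        x = (1.0 - gamma) * x + gamma * v   # convex combination stays feasible
    return x
```

Because every iterate is a convex combination of feasible points, feasibility is maintained without ever projecting, which is the source of the per-iteration savings the abstract refers to.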

PieAPP: Perceptual Image-Error Assessment through Pairwise Preference

Title PieAPP: Perceptual Image-Error Assessment through Pairwise Preference
Authors Ekta Prashnani, Hong Cai, Yasamin Mostofi, Pradeep Sen
Abstract The ability to estimate the perceptual error between images is an important problem in computer vision with many applications. Although it has been studied extensively, no method currently exists that can robustly predict visual differences like humans. Some previous approaches used hand-coded models, but they fail to model the complexity of the human visual system. Others used machine learning to train models on human-labeled datasets, but creating large, high-quality datasets is difficult because people are unable to assign consistent error labels to distorted images. In this paper, we present a new learning-based method that is the first to predict perceptual image error like human observers. Since it is much easier for people to compare two given images and identify the one more similar to a reference than to assign quality scores to each, we propose a new, large-scale dataset labeled with the probability that humans will prefer one image over another. We then train a deep-learning model using a novel, pairwise-learning framework to predict the preference of one distorted image over the other. Our key observation is that our trained network can then be used separately with only one distorted image and a reference to predict its perceptual error, without ever being trained on explicit human perceptual-error labels. The perceptual error estimated by our new metric, PieAPP, is well-correlated with human opinion. Furthermore, it significantly outperforms existing algorithms, beating the state-of-the-art by almost 3x on our test set in terms of binary error rate, while also generalizing to new kinds of distortions, unlike previous learning-based methods.
Tasks
Published 2018-06-06
URL http://arxiv.org/abs/1806.02067v1
PDF http://arxiv.org/pdf/1806.02067v1.pdf
PWC https://paperswithcode.com/paper/pieapp-perceptual-image-error-assessment
Repo
Framework
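
The pairwise-learning idea in the PieAPP abstract (train on preference probabilities, then use a single-image error score at test time) can be illustrated with a small sketch. The abstract does not state how per-image error scores are turned into a preference probability; the logistic (Bradley-Terry style) link below is an assumption, and the function names are hypothetical.

```python
import math

def preference_probability(error_a, error_b):
    """Probability that image A is preferred over image B.

    Assumed logistic link: the image with the lower predicted perceptual
    error is the more likely to be preferred.
    """
    return 1.0 / (1.0 + math.exp(error_a - error_b))

def pairwise_loss(error_a, error_b, human_pref_a):
    """Cross-entropy between the predicted and the human preference probability.

    `human_pref_a` is the labeled probability that people prefer A over B.
    """
    p = preference_probability(error_a, error_b)
    eps = 1e-12  # guard against log(0)
    return -(human_pref_a * math.log(p + eps)
             + (1.0 - human_pref_a) * math.log(1.0 - p + eps))
```

Under this setup only the difference of the two scores is supervised, yet each score is produced from one distorted image and the reference, which is why the score function can be used on its own after training.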

Multi-variable LSTM neural network for autoregressive exogenous model

Title Multi-variable LSTM neural network for autoregressive exogenous model
Authors Tian Guo, Tao Lin
Abstract In this paper, we propose multi-variable LSTM capable of accurate forecasting and variable importance interpretation for time series with exogenous variables. Current attention mechanism in recurrent neural networks mostly focuses on the temporal aspect of data and falls short of characterizing variable importance. To this end, the multi-variable LSTM equipped with tensorized hidden states is developed to learn hidden states for individual variables, which give rise to our mixture temporal and variable attention. Based on such attention mechanism, we infer and quantify variable importance. Extensive experiments using real datasets with Granger-causality test and the synthetic dataset with ground truth demonstrate the prediction performance and interpretability of multi-variable LSTM in comparison to a variety of baselines. It exhibits the prospect of multi-variable LSTM as an end-to-end framework for both forecasting and knowledge discovery.
Tasks Time Series
Published 2018-06-17
URL http://arxiv.org/abs/1806.06384v1
PDF http://arxiv.org/pdf/1806.06384v1.pdf
PWC https://paperswithcode.com/paper/multi-variable-lstm-neural-network-for
Repo
Framework

Quantum algorithms for training Gaussian Processes

Title Quantum algorithms for training Gaussian Processes
Authors Zhikuan Zhao, Jack K. Fitzsimons, Michael A. Osborne, Stephen J. Roberts, Joseph F. Fitzsimons
Abstract Gaussian processes (GPs) are important models in supervised machine learning. Training in Gaussian processes refers to selecting the covariance functions and the associated parameters in order to improve the outcome of predictions, the core of which amounts to evaluating the logarithm of the marginal likelihood (LML) of a given model. LML gives a concrete measure of the quality of prediction that a GP model is expected to achieve. The classical computation of LML typically carries a polynomial time overhead with respect to the input size. We propose a quantum algorithm that computes the logarithm of the determinant of a Hermitian matrix, which runs in logarithmic time for sparse matrices. This is applied in conjunction with a variant of the quantum linear system algorithm that allows for logarithmic time computation of the form $\mathbf{y}^TA^{-1}\mathbf{y}$, where $\mathbf{y}$ is a dense vector and $A$ is the covariance matrix. We hence show that quantum computing can be used to estimate the LML of a GP with exponentially improved efficiency under certain conditions.
Tasks Gaussian Processes
Published 2018-03-28
URL http://arxiv.org/abs/1803.10520v1
PDF http://arxiv.org/pdf/1803.10520v1.pdf
PWC https://paperswithcode.com/paper/quantum-algorithms-for-training-gaussian
Repo
Framework
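
For reference, the quantity the quantum algorithm above targets can be computed classically as sketched below. This is the textbook log marginal likelihood of a zero-mean GP, $-\tfrac{1}{2}\mathbf{y}^T A^{-1}\mathbf{y} - \tfrac{1}{2}\log\det A - \tfrac{n}{2}\log 2\pi$, evaluated via a Cholesky factorization; it is a classical baseline, not the paper's quantum procedure, and the function name is hypothetical. The matrix `K` is assumed to be the full covariance matrix, including any noise term.

```python
import numpy as np

def log_marginal_likelihood(K, y):
    """Classical LML of a zero-mean GP with covariance matrix K.

    Costs O(n^3) for a dense n x n matrix because of the Cholesky step,
    which is the polynomial overhead the abstract refers to.
    """
    n = y.shape[0]
    L = np.linalg.cholesky(K)                             # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # alpha = K^{-1} y
    quad_form = y @ alpha                                 # y^T K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log det K
    return -0.5 * quad_form - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)
```

The two data-dependent terms, the quadratic form and the log-determinant, are exactly the pieces the abstract says are estimated by the quantum linear system variant and the log-determinant algorithm respectively.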

Neural Approaches to Conversational AI

Title Neural Approaches to Conversational AI
Authors Jianfeng Gao, Michel Galley, Lihong Li
Abstract The present paper surveys neural approaches to conversational AI that have been developed in the last few years. We group conversational systems into three categories: (1) question answering agents, (2) task-oriented dialogue agents, and (3) chatbots. For each category, we present a review of state-of-the-art neural approaches, draw the connection between them and traditional approaches, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies.
Tasks Question Answering
Published 2018-09-21
URL https://arxiv.org/abs/1809.08267v3
PDF https://arxiv.org/pdf/1809.08267v3.pdf
PWC https://paperswithcode.com/paper/neural-approaches-to-conversational-ai
Repo
Framework

VoxSegNet: Volumetric CNNs for Semantic Part Segmentation of 3D Shapes

Title VoxSegNet: Volumetric CNNs for Semantic Part Segmentation of 3D Shapes
Authors Zongji Wang, Feng Lu
Abstract Voxel is an important format to represent geometric data, which has been widely used for 3D deep learning in shape analysis due to its generalization ability and regular data format. However, fine-grained tasks like part segmentation require detailed structural information, which increases voxel resolution and thus causes other issues such as the exhaustion of computational resources. In this paper, we propose a novel volumetric convolutional neural network, which could extract discriminative features encoding detailed information from voxelized 3D data under a limited resolution. To this purpose, a spatial dense extraction (SDE) module is designed to preserve the spatial resolution during the feature extraction procedure, alleviating the loss of detail caused by sub-sampling operations such as max-pooling. An attention feature aggregation (AFA) module is also introduced to adaptively select informative features from different abstraction scales, leading to segmentation with both semantic consistency and high accuracy of details. Experiment results on the large-scale dataset demonstrate the effectiveness of our method in 3D shape part segmentation.
Tasks
Published 2018-09-01
URL http://arxiv.org/abs/1809.00226v1
PDF http://arxiv.org/pdf/1809.00226v1.pdf
PWC https://paperswithcode.com/paper/voxsegnet-volumetric-cnns-for-semantic-part
Repo
Framework

Multi-Domain Neural Machine Translation

Title Multi-Domain Neural Machine Translation
Authors Sander Tars, Mark Fishel
Abstract We present an approach to neural machine translation (NMT) that supports multiple domains in a single model and allows switching between the domains when translating. The core idea is to treat text domains as distinct languages and use multilingual NMT methods to create multi-domain translation systems; we show that this approach results in significant translation quality gains over fine-tuning. We also explore whether knowledge of pre-specified text domains is necessary: it turns out that it is, but quite high translation quality can still be reached when the domain is not known.
Tasks Machine Translation
Published 2018-05-06
URL http://arxiv.org/abs/1805.02282v1
PDF http://arxiv.org/pdf/1805.02282v1.pdf
PWC https://paperswithcode.com/paper/multi-domain-neural-machine-translation-1
Repo
Framework
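
One common way to "treat text domains as distinct languages" in a multilingual NMT setup is to prepend a special token to each source sentence; a minimal sketch of that preprocessing step follows. The abstract does not say which multilingual method the authors use, so the tagging scheme, the tag format and the function names here are all assumptions made for illustration.

```python
def tag_with_domain(source_sentence, domain):
    """Prepend a domain token so one model can switch domains at translation time.

    The "[domain=...]" format is a hypothetical convention; any token that is
    reserved in the vocabulary would serve the same purpose.
    """
    return f"[domain={domain}] {source_sentence}"

def tag_corpus(sentences, domains):
    """Tag a parallel list of source sentences and their domain labels."""
    if len(sentences) != len(domains):
        raise ValueError("sentences and domains must have the same length")
    return [tag_with_domain(s, d) for s, d in zip(sentences, domains)]
```

At translation time the same function selects the domain: `tag_with_domain("the patient is stable", "medical")` yields `[domain=medical] the patient is stable`.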

Space-efficient Feature Maps for String Alignment Kernels

Title Space-efficient Feature Maps for String Alignment Kernels
Authors Yasuo Tabei, Yoshihiro Yamanishi, Rasmus Pagh
Abstract String kernels are attractive data analysis tools for analyzing string data. Among them, alignment kernels are known for their high prediction accuracies in string classifications when tested in combination with SVM in various applications. However, alignment kernels have a crucial drawback in that they scale poorly due to their quadratic computation complexity in the number of input strings, which limits large-scale applications in practice. We address this need by presenting the first approximation for string alignment kernels, which we call space-efficient feature maps for edit distance with moves (SFMEDM), by leveraging a metric embedding named edit sensitive parsing (ESP) and feature maps (FMs) of random Fourier features (RFFs) for large-scale string analyses. The original FMs for RFFs consume a huge amount of memory proportional to the dimension d of input vectors and the dimension D of output vectors, which prohibits its large-scale applications. We present novel space-efficient feature maps (SFMs) of RFFs for a space reduction from O(dD) of the original FMs to O(d) of SFMs with a theoretical guarantee with respect to concentration bounds. We experimentally test SFMEDM on its ability to learn SVM for large-scale string classifications with various massive string data, and we demonstrate the superior performance of SFMEDM with respect to prediction accuracy, scalability and computation efficiency.
Tasks
Published 2018-02-18
URL https://arxiv.org/abs/1802.06382v9
PDF https://arxiv.org/pdf/1802.06382v9.pdf
PWC https://paperswithcode.com/paper/scalable-alignment-kernels-via-space
Repo
Framework
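
To make the $O(dD)$ memory cost mentioned in the abstract above concrete, here is a sketch of the standard random Fourier feature map for a Gaussian kernel. It is the baseline construction, not the paper's space-efficient variant (SFM) and not the ESP string embedding; the function names and the `bandwidth` parameter are assumptions for illustration.

```python
import numpy as np

def make_rff_map(input_dim, output_dim, bandwidth, rng):
    """Standard random Fourier features for k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).

    The projection matrix W has output_dim * input_dim entries, i.e. O(dD)
    memory, which is the cost the abstract identifies as the bottleneck.
    """
    W = rng.standard_normal((output_dim, input_dim)) / bandwidth
    b = rng.uniform(0.0, 2.0 * np.pi, size=output_dim)

    def feature_map(x):
        # z(x) . z(y) approximates k(x, y) in expectation
        return np.sqrt(2.0 / output_dim) * np.cos(W @ x + b)

    return feature_map

def gaussian_kernel(x, y, bandwidth):
    """Exact kernel value, for comparison with the feature-map approximation."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))
```

A linear SVM trained on `feature_map(x)` then stands in for a kernel SVM, avoiding the quadratic cost of evaluating the kernel on every pair of inputs.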

Collaboratively Learning the Best Option, Using Bounded Memory

Title Collaboratively Learning the Best Option, Using Bounded Memory
Authors Lili Su, Martin Zubeldia, Nancy Lynch
Abstract We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $t \to \infty$) it pulls only the arm with the highest average reward. While this goal is provably impossible for an isolated individual, we show that, in social groups, this goal can be achieved easily with the aid of social persuasion, i.e., communication. Specifically, we study the learning dynamics wherein an individual sequentially decides on which arm to pull next based on not only its private reward feedback but also the suggestions provided by randomly chosen peers. Our learning dynamics are hard to analyze via explicit probabilistic calculations due to the stochastic dependency induced by social interaction. Instead, we employ the mean-field approximation method from statistical physics and we show: (1) With probability $\to 1$ as the social group size $N \to \infty $, every individual in the social group learns the best option. (2) Over an arbitrary finite time horizon $[0, T]$, with high probability (in $N$), the fraction of individuals that prefer the best option grows to 1 exponentially fast as $t$ increases ($t\in [0, T]$). A major innovation of our mean-field analysis is a simple yet powerful technique to deal with absorbing states in the interchange of limits $N \to \infty$ and $t \to \infty $. The mean-field approximation method allows us to approximate the probabilistic sample paths of our learning dynamics by a deterministic and smooth trajectory that corresponds to the unique solution of a well-behaved system of ordinary differential equations (ODEs). Such an approximation is desired because the analysis of a system of ODEs is relatively easier than that of the original stochastic system.
Tasks
Published 2018-02-22
URL http://arxiv.org/abs/1802.08159v3
PDF http://arxiv.org/pdf/1802.08159v3.pdf
PWC https://paperswithcode.com/paper/collaboratively-learning-the-best-option
Repo
Framework

Deep Watershed Detector for Music Object Recognition

Title Deep Watershed Detector for Music Object Recognition
Authors Lukas Tuggener, Ismail Elezi, Jurgen Schmidhuber, Thilo Stadelmann
Abstract Optical Music Recognition (OMR) is an important and challenging area within music information retrieval; the accurate detection of music symbols in digital images is a core functionality of any OMR pipeline. In this paper, we introduce a novel object detection method, based on synthetic energy maps and the watershed transform, called Deep Watershed Detector (DWD). Our method is specifically tailored to deal with high resolution images that contain a large number of very small objects and is therefore able to process full pages of written music. We present state-of-the-art detection results of common music symbols and show DWD’s ability to work with synthetic scores as well as with handwritten music.
Tasks Information Retrieval, Music Information Retrieval, Object Detection, Object Recognition
Published 2018-05-26
URL http://arxiv.org/abs/1805.10548v1
PDF http://arxiv.org/pdf/1805.10548v1.pdf
PWC https://paperswithcode.com/paper/deep-watershed-detector-for-music-object
Repo
Framework

Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion

Title Exploiting Task-Oriented Resources to Learn Word Embeddings for Clinical Abbreviation Expansion
Authors Yue Liu, Tao Ge, Kusum S. Mathews, Heng Ji, Deborah L. McGuinness
Abstract In the medical domain, identifying and expanding abbreviations in clinical texts is a vital task for both better human and machine understanding. It is a challenging task because many abbreviations are ambiguous especially for intensive care medicine texts, in which phrase abbreviations are frequently used. Besides the fact that there is no universal dictionary of clinical abbreviations and no universal rules for abbreviation writing, such texts are difficult to acquire, expensive to annotate and even sometimes, confusing to domain experts. This paper proposes a novel and effective approach - exploiting task-oriented resources to learn word embeddings for expanding abbreviations in clinical notes. We achieved 82.27% accuracy, close to expert human performance.
Tasks Word Embeddings
Published 2018-04-11
URL http://arxiv.org/abs/1804.04225v1
PDF http://arxiv.org/pdf/1804.04225v1.pdf
PWC https://paperswithcode.com/paper/exploiting-task-oriented-resources-to-learn
Repo
Framework

Text-based Sentiment Analysis and Music Emotion Recognition

Title Text-based Sentiment Analysis and Music Emotion Recognition
Authors Erion Çano
Abstract Sentiment polarity of tweets, blog posts or product reviews has become highly attractive and is utilized in recommender systems, market predictions, business intelligence and more. Deep learning techniques are becoming top performers on analyzing such texts. There are however several problems that need to be solved for efficient use of deep neural networks on text mining and text polarity analysis. First, deep neural networks need to be fed with data sets that are big in size as well as properly labeled. Second, there are various uncertainties regarding the use of word embedding vectors: should they be generated from the same data set that is used to train the model, or is it better to source them from big and popular collections? Third, to simplify model creation it is convenient to have generic neural network architectures that are effective and can adapt to various texts, encapsulating much of the design complexity. This thesis addresses the above problems to provide methodological and practical insights for utilizing neural networks on sentiment analysis of texts and achieving state of the art results. Regarding the first problem, the effectiveness of various crowdsourcing alternatives is explored and two medium-sized and emotion-labeled song data sets are created utilizing social tags. To address the second problem, a series of experiments with large text collections of various contents and domains were conducted, trying word embeddings of various parameters. Regarding the third problem, a series of experiments involving convolution and max-pooling neural layers were conducted. Combining convolutions of words, bigrams, and trigrams with regional max-pooling layers in a couple of stacks produced the best results. The derived architecture achieves competitive performance on sentiment polarity analysis of movie, business and product reviews.
Tasks Emotion Recognition, Music Emotion Recognition, Recommendation Systems, Sentiment Analysis, Word Embeddings
Published 2018-10-06
URL http://arxiv.org/abs/1810.03031v1
PDF http://arxiv.org/pdf/1810.03031v1.pdf
PWC https://paperswithcode.com/paper/text-based-sentiment-analysis-and-music
Repo
Framework

A Two-Stage Subspace Trust Region Approach for Deep Neural Network Training

Title A Two-Stage Subspace Trust Region Approach for Deep Neural Network Training
Authors Viacheslav Dudar, Giovanni Chierchia, Emilie Chouzenoux, Jean-Christophe Pesquet, Vladimir Semenov
Abstract In this paper, we develop a novel second-order method for training feed-forward neural nets. At each iteration, we construct a quadratic approximation to the cost function in a low-dimensional subspace. We minimize this approximation inside a trust region through a two-stage procedure: first inside the embedded positive curvature subspace, followed by a gradient descent step. This approach leads to a fast objective function decay, prevents convergence to saddle points, and alleviates the need for manually tuning parameters. We show the good performance of the proposed algorithm on benchmark datasets.
Tasks
Published 2018-05-23
URL http://arxiv.org/abs/1805.09430v1
PDF http://arxiv.org/pdf/1805.09430v1.pdf
PWC https://paperswithcode.com/paper/a-two-stage-subspace-trust-region-approach
Repo
Framework