Paper Group ANR 1120
Novelty Detection Meets Collider Physics
Title | Novelty Detection Meets Collider Physics |
Authors | Jan Hajer, Ying-Ying Li, Tao Liu, He Wang |
Abstract | Novelty detection is the machine learning task of recognizing data that belong to an unknown pattern. Complementary to supervised learning, it allows data to be analyzed model-independently. We demonstrate the potential role of novelty detection in collider physics, using an autoencoder-based deep neural network. Explicitly, we develop a set of density-based novelty evaluators, which are sensitive to the clustering of unknown-pattern testing data or new-physics signal events, for the design of detection algorithms. We also explore the influence of the known-pattern data fluctuations, arising from non-signal regions, on detection sensitivity. Strategies to address it are proposed. The algorithms are applied to detecting fermionic di-top partner and resonant di-top productions at the LHC, and exotic Higgs decays of two specific modes at a future $e^+e^-$ collider. With parton-level analysis, we conclude that the new-physics benchmarks can potentially be recognized with high efficiency. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10261v2 |
http://arxiv.org/pdf/1807.10261v2.pdf | |
PWC | https://paperswithcode.com/paper/novelty-detection-meets-collider-physics |
Repo | |
Framework | |
Phrase Table as Recommendation Memory for Neural Machine Translation
Title | Phrase Table as Recommendation Memory for Neural Machine Translation |
Authors | Yang Zhao, Yining Wang, Jiajun Zhang, Chengqing Zong |
Abstract | Neural Machine Translation (NMT) has recently drawn much attention due to its promising translation performance. However, several studies indicate that NMT often generates fluent but unfaithful translations. In this paper, we propose a method to alleviate this problem by using a phrase table as recommendation memory. The main idea is to add a bonus to words worthy of recommendation, so that NMT can make correct predictions. Specifically, we first derive a prefix tree to accommodate all the candidate target phrases by searching the phrase translation table according to the source sentence. Then, we construct a recommendation word set by matching between candidate target phrases and previously translated target words by NMT. After that, we determine the specific bonus value for each recommendable word by using the attention vector and phrase translation probability. Finally, we integrate this bonus value into NMT to improve the translation results. Extensive experiments demonstrate that the proposed methods obtain remarkable improvements over the strong attention-based NMT. |
Tasks | Machine Translation |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.09960v1 |
http://arxiv.org/pdf/1805.09960v1.pdf | |
PWC | https://paperswithcode.com/paper/phrase-table-as-recommendation-memory-for |
Repo | |
Framework | |
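The first step described in the abstract above (collecting candidate target phrases into a prefix tree and looking up which words could come next) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy phrase table, the whitespace tokenization, and the names `candidate_phrases`, `build_trie` and `next_words` are all assumptions, and the bonus computation from the attention vector and phrase translation probability is not shown.

```python
from typing import Dict, List

# Hypothetical toy phrase table: source phrase -> candidate target phrases.
PHRASE_TABLE: Dict[str, List[str]] = {
    "machine translation": ["machine translation", "automatic translation"],
    "neural": ["neural"],
}

END = "<end>"  # marker stored at the node where a complete target phrase ends


def candidate_phrases(source: str, table: Dict[str, List[str]]) -> List[str]:
    """Collect target phrases whose source side occurs in the source sentence."""
    words = source.split()
    found: List[str] = []
    for i in range(len(words)):
        for j in range(i + 1, len(words) + 1):
            found.extend(table.get(" ".join(words[i:j]), []))
    return found


def build_trie(phrases: List[str]) -> dict:
    """Store the candidate target phrases word by word in a nested-dict prefix tree."""
    root: dict = {}
    for phrase in phrases:
        node = root
        for word in phrase.split():
            node = node.setdefault(word, {})
        node[END] = {}
    return root


def next_words(trie: dict, prefix: List[str]) -> List[str]:
    """Words that can follow `prefix` (already translated words) in some candidate phrase."""
    node = trie
    for word in prefix:
        if word not in node:
            return []
        node = node[word]
    return sorted(w for w in node if w != END)
```

For example, after building the tree for the source sentence "neural machine translation", calling `next_words(trie, ["machine"])` returns the words that would be candidates for a bonus at the next decoding step under these assumptions.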
Population Anomaly Detection through Deep Gaussianization
Title | Population Anomaly Detection through Deep Gaussianization |
Authors | David Tolpin |
Abstract | We introduce an algorithmic method for population anomaly detection based on gaussianization through an adversarial autoencoder. This method is applicable to detection of 'soft' anomalies in arbitrarily distributed high-dimensional data. A soft, or population, anomaly is characterized by a shift in the distribution of the data set, where certain elements appear with higher probability than anticipated. Such anomalies must be detected by considering a sufficiently large sample set rather than a single sample. Applications include, but are not limited to, payment fraud trends, data exfiltration, disease clusters and epidemics, and social unrest. We evaluate the method on several domains and obtain both quantitative results and qualitative insights. |
Tasks | Anomaly Detection |
Published | 2018-05-05 |
URL | http://arxiv.org/abs/1805.02123v1 |
http://arxiv.org/pdf/1805.02123v1.pdf | |
PWC | https://paperswithcode.com/paper/population-anomaly-detection-through-deep |
Repo | |
Framework | |
Piano Genie
Title | Piano Genie |
Authors | Chris Donahue, Ian Simon, Sander Dieleman |
Abstract | We present Piano Genie, an intelligent controller which allows non-musicians to improvise on the piano. With Piano Genie, a user performs on a simple interface with eight buttons, and their performance is decoded into the space of plausible piano music in real time. To learn a suitable mapping procedure for this problem, we train recurrent neural network autoencoders with discrete bottlenecks: an encoder learns an appropriate sequence of buttons corresponding to a piano piece, and a decoder learns to map this sequence back to the original piece. During performance, we substitute a user’s input for the encoder output, and play the decoder’s prediction each time the user presses a button. To improve the intuitiveness of Piano Genie’s performance behavior, we impose musically meaningful constraints over the encoder’s outputs. |
Tasks | |
Published | 2018-10-11 |
URL | http://arxiv.org/abs/1810.05246v2 |
http://arxiv.org/pdf/1810.05246v2.pdf | |
PWC | https://paperswithcode.com/paper/piano-genie |
Repo | |
Framework | |
Annotation Artifacts in Natural Language Inference Data
Title | Annotation Artifacts in Natural Language Inference Data |
Authors | Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith |
Abstract | Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. Specifically, we show that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI (Bowman et al., 2015) and 53% of MultiNLI (Williams et al., 2017). Our analysis reveals that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes. Our findings suggest that the success of natural language inference models to date has been overestimated, and that the task remains a hard open problem. |
Tasks | Natural Language Inference, Text Categorization |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02324v2 |
http://arxiv.org/pdf/1803.02324v2.pdf | |
PWC | https://paperswithcode.com/paper/annotation-artifacts-in-natural-language |
Repo | |
Framework | |
Towards Machine Learning Induction
Title | Towards Machine Learning Induction |
Authors | Yutaka Nagashima |
Abstract | Induction lies at the heart of mathematics and computer science. However, automated theorem proving of inductive problems is still limited in its power. In this abstract, we first summarize our progress in automating inductive theorem proving for Isabelle/HOL. Then, we present MeLoId, our approach to suggesting promising applications of induction without completing a proof search. |
Tasks | Automated Theorem Proving |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.04088v2 |
http://arxiv.org/pdf/1812.04088v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-machine-learning-mathematical |
Repo | |
Framework | |
Deep contextualized word representations for detecting sarcasm and irony
Title | Deep contextualized word representations for detecting sarcasm and irony |
Authors | Suzana Ilić, Edison Marrese-Taylor, Jorge A. Balazs, Yutaka Matsuo |
Abstract | Predicting context-dependent and non-literal utterances like sarcastic and ironic expressions still remains a challenging task in NLP, as it goes beyond linguistic patterns, encompassing common sense and shared knowledge as crucial components. To capture complex morpho-syntactic features that can usually serve as indicators for irony or sarcasm across dynamic contexts, we propose a model that uses character-level vector representations of words, based on ELMo. We test our model on 7 different datasets derived from 3 different data sources, providing state-of-the-art performance in 6 of them, and otherwise offering competitive results. |
Tasks | Common Sense Reasoning |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.09795v1 |
http://arxiv.org/pdf/1809.09795v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-contextualized-word-representations-for |
Repo | |
Framework | |
Exploiting the Value of the Center-dark Channel Prior for Salient Object Detection
Title | Exploiting the Value of the Center-dark Channel Prior for Salient Object Detection |
Authors | Chunbiao Zhu, Wenhao Zhang, Thomas H. Li, Ge Li |
Abstract | Saliency detection aims to detect the most attractive objects in images and is widely used as a foundation for various applications. In this paper, we propose a novel salient object detection algorithm for RGB-D images using center-dark channel priors. First, we generate an initial saliency map based on a color saliency map and a depth saliency map of a given RGB-D image. Then, we generate a center-dark channel map based on center saliency and dark channel priors. Finally, we fuse the initial saliency map with the center dark channel map to generate the final saliency map. Extensive evaluations over four benchmark datasets demonstrate that our proposed method performs favorably against most of the state-of-the-art approaches. Besides, we further discuss the application of the proposed algorithm in small target detection and demonstrate the universal value of center-dark channel priors in the field of object detection. |
Tasks | Object Detection, Saliency Detection, Salient Object Detection |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05132v1 |
http://arxiv.org/pdf/1805.05132v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-the-value-of-the-center-dark |
Repo | |
Framework | |
Exact Passive-Aggressive Algorithms for Learning to Rank Using Interval Labels
Title | Exact Passive-Aggressive Algorithms for Learning to Rank Using Interval Labels |
Authors | Naresh Manwani, Mohit Chandra |
Abstract | In this paper, we propose exact passive-aggressive (PA) online algorithms for learning to rank. The proposed algorithms can be used even when we have interval labels instead of actual labels for examples. The proposed algorithms solve a convex optimization problem at every trial. We find exact solutions to those optimization problems to determine the updated parameters. We propose a support class algorithm (SCA) which finds the active constraints using the KKT conditions of the optimization problems. These active constraints form the support set, which determines the set of thresholds that need to be updated. We derive update rules for PA, PA-I and PA-II. We show that the proposed algorithms maintain the ordering of the thresholds after every trial. We provide the mistake bounds of the proposed algorithms in both ideal and general settings. We also show experimentally that the proposed algorithms successfully learn accurate classifiers using interval labels as well as exact labels. The proposed algorithms also do well compared to other approaches. |
Tasks | Learning-To-Rank |
Published | 2018-08-18 |
URL | http://arxiv.org/abs/1808.06107v1 |
http://arxiv.org/pdf/1808.06107v1.pdf | |
PWC | https://paperswithcode.com/paper/exact-passive-aggressive-algorithms-for |
Repo | |
Framework | |
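The abstract above names three update variants: PA, PA-I and PA-II. For orientation, the classic forms of these updates for binary classification with hinge loss (as introduced by Crammer et al. for online passive-aggressive learning) can be sketched as below. This is deliberately not the paper's ranking algorithm: interval labels, thresholds and the support class algorithm are not modelled, and the function name `pa_update` and the parameter `C` as an aggressiveness constant are illustrative assumptions.

```python
from typing import List


def pa_update(w: List[float], x: List[float], y: int,
              C: float = 1.0, variant: str = "PA") -> List[float]:
    """One passive-aggressive step for binary labels y in {-1, +1}.

    Classic binary-classification form only; a stand-in for, not a copy of,
    the ranking updates with interval labels described in the abstract.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)          # hinge loss on this example
    sq_norm = sum(xi * xi for xi in x)
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)                     # passive: constraint already satisfied
    if variant == "PA":
        tau = loss / sq_norm               # exact step that zeroes the hinge loss
    elif variant == "PA-I":
        tau = min(C, loss / sq_norm)       # step size capped by C
    elif variant == "PA-II":
        tau = loss / (sq_norm + 1.0 / (2.0 * C))  # step damped by a quadratic slack penalty
    else:
        raise ValueError(f"unknown variant: {variant}")
    return [wi + tau * y * xi for wi, xi in zip(w, x)]  # aggressive: move toward the example
```

With `w = [0, 0]`, `x = [1, 2]` and `y = 1`, the plain PA step moves `w` just far enough that the margin on this example becomes exactly 1, while PA-I and PA-II take smaller steps controlled by `C`.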
Learning-based attacks in cyber-physical systems
Title | Learning-based attacks in cyber-physical systems |
Authors | Mohammad Javad Khojasteh, Anatoly Khina, Massimo Franceschetti, Tara Javidi |
Abstract | We introduce the problem of learning-based attacks in a simple abstraction of cyber-physical systems: the case of a discrete-time, linear, time-invariant plant that may be subject to an attack that overrides the sensor readings and the controller actions. The attacker attempts to learn the dynamics of the plant and subsequently override the controller’s actuation signal, to destroy the plant without being detected. The attacker can feed fictitious sensor readings to the controller using its estimate of the plant dynamics and mimic the legitimate plant operation. The controller, on the other hand, is constantly on the lookout for an attack; once the controller detects an attack, it immediately shuts the plant off. In the case of scalar plants, we derive an upper bound on the attacker’s deception probability for any measurable control policy when the attacker uses an arbitrary learning algorithm to estimate the system dynamics. We then derive lower bounds for the attacker’s deception probability for both scalar and vector plants by assuming a specific authentication test that inspects the empirical variance of the system disturbance. We also show how the controller can improve the security of the system by superimposing a carefully crafted privacy-enhancing signal on top of the “nominal control policy.” Finally, for nonlinear scalar dynamics that belong to the Reproducing Kernel Hilbert Space (RKHS), we investigate the performance of attacks based on nonlinear Gaussian-process (GP) learning algorithms. |
Tasks | Gaussian Processes |
Published | 2018-09-17 |
URL | https://arxiv.org/abs/1809.06023v6 |
https://arxiv.org/pdf/1809.06023v6.pdf | |
PWC | https://paperswithcode.com/paper/authentication-of-cyber-physical-systems |
Repo | |
Framework | |
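The authentication test mentioned in the abstract above, which inspects the empirical variance of the system disturbance, can be illustrated for a scalar plant as follows. This is a rough sketch under assumptions that are not taken from the paper: the plant model x_{k+1} = a*x_k + u_k + w_k, the two-sided threshold `delta`, and the function names are hypothetical, and the paper's exact test statistic may differ.

```python
from typing import List


def simulate(a: float, noise: List[float], controls: List[float],
             x0: float = 0.0) -> List[float]:
    """Generate states x_{k+1} = a*x_k + u_k + w_k for an assumed scalar linear plant."""
    xs = [x0]
    for u, w in zip(controls, noise):
        xs.append(a * xs[-1] + u + w)
    return xs


def empirical_variance(xs: List[float], controls: List[float], a: float) -> float:
    """Mean squared residual x_{k+1} - a*x_k - u_k, computed with the true gain a."""
    residuals = [x_next - a * x - u
                 for x, x_next, u in zip(xs, xs[1:], controls)]
    return sum(r * r for r in residuals) / len(residuals)


def passes_variance_test(xs: List[float], controls: List[float], a: float,
                         sigma2: float, delta: float) -> bool:
    """Accept the readings if the empirical variance is within delta of sigma2."""
    return abs(empirical_variance(xs, controls, a) - sigma2) <= delta
```

In this toy setting, readings generated with the true gain reproduce the disturbance variance, whereas fictitious readings generated by an attacker with a wrong estimate of the gain leave residuals whose empirical variance drifts away from it, which is what the controller checks for.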
Diffeomorphic brain shape modelling using Gauss-Newton optimisation
Title | Diffeomorphic brain shape modelling using Gauss-Newton optimisation |
Authors | Yaël Balbastre, Mikael Brudfors, Kevin Bronik, John Ashburner |
Abstract | Shape modelling describes methods aimed at capturing the natural variability of shapes and commonly relies on probabilistic interpretations of dimensionality reduction techniques such as principal component analysis. Due to their computational complexity when dealing with dense deformation models such as diffeomorphisms, previous attempts have focused on explicitly reducing their dimension, diminishing de facto their flexibility and ability to model complex shapes such as brains. In this paper, we present a generative model of shape that allows the covariance structure of deformations to be captured without squashing their domain, resulting in better normalisation. An efficient inference scheme based on Gauss-Newton optimisation is used, which enables processing of 3D neuroimaging data. We trained this algorithm on segmented brains from the OASIS database, generating physiologically meaningful deformation trajectories. To prove the model’s robustness, we applied it to unseen data, which resulted in equivalent fitting scores. |
Tasks | Dimensionality Reduction |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07109v1 |
http://arxiv.org/pdf/1806.07109v1.pdf | |
PWC | https://paperswithcode.com/paper/diffeomorphic-brain-shape-modelling-using |
Repo | |
Framework | |
Deep learning at the shallow end: Malware classification for non-domain experts
Title | Deep learning at the shallow end: Malware classification for non-domain experts |
Authors | Quan Le, Oisín Boydell, Brian Mac Namee, Mark Scanlon |
Abstract | Current malware detection and classification approaches generally rely on time consuming and knowledge intensive processes to extract patterns (signatures) and behaviors from malware, which are then used for identification. Moreover, these signatures are often limited to local, contiguous sequences within the data whilst ignoring their context in relation to each other and throughout the malware file as a whole. We present a Deep Learning based malware classification approach that requires no expert domain knowledge and is based on a purely data driven approach for complex pattern and feature identification. |
Tasks | Malware Classification, Malware Detection |
Published | 2018-07-22 |
URL | http://arxiv.org/abs/1807.08265v1 |
http://arxiv.org/pdf/1807.08265v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-at-the-shallow-end-malware |
Repo | |
Framework | |
Multi-Cell Multi-Task Convolutional Neural Networks for Diabetic Retinopathy Grading
Title | Multi-Cell Multi-Task Convolutional Neural Networks for Diabetic Retinopathy Grading |
Authors | Kang Zhou, Zaiwang Gu, Wen Liu, Weixin Luo, Jun Cheng, Shenghua Gao, Jiang Liu |
Abstract | Diabetic Retinopathy (DR) is a non-negligible eye disease among patients with Diabetes Mellitus, and automatic retinal image analysis algorithms for DR screening are in high demand. Retinal images have very high resolution: small pathological tissues can be detected only in high-resolution images, and large local receptive fields are required to identify late-stage disease. However, directly training a very deep neural network on high-resolution images is both computationally expensive and difficult because of the vanishing/exploding gradient problem. We therefore propose a Multi-Cell architecture which gradually increases the depth of the deep neural network and the resolution of the input image, which both speeds up training and improves the classification accuracy. Further, the different stages of DR progress gradually, which means the labels of different stages are related. To account for the relationships between images at different stages, we propose a Multi-Task learning strategy which predicts the label with both classification and regression. Experimental results on the Kaggle dataset show that our method achieves a Kappa of 0.841 on the test set, which ranks fourth among all state-of-the-art methods. Further, our Multi-Cell Multi-Task Convolutional Neural Networks (M$^2$CNN) solution is a general framework, which can be readily integrated with many other deep neural network architectures. |
Tasks | Multi-Task Learning |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1808.10564v2 |
http://arxiv.org/pdf/1808.10564v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-cell-multi-task-convolutional-neural |
Repo | |
Framework | |
Modeling Language Vagueness in Privacy Policies using Deep Neural Networks
Title | Modeling Language Vagueness in Privacy Policies using Deep Neural Networks |
Authors | Fei Liu, Nicole Lee Fella, Kexin Liao |
Abstract | Website privacy policies are too long to read and difficult to understand. The over-sophisticated language makes privacy notices less effective than they should be. People become even less willing to share their personal information when they perceive the privacy policy as vague. This paper focuses on decoding vagueness from a natural language processing perspective. While thoroughly identifying the vague terms and their linguistic scope remains an elusive challenge, in this work we seek to learn vector representations of words in privacy policies using deep neural networks. The vector representations are fed to an interactive visualization tool (LSTMVis) to test their ability to discover syntactically and semantically related vague terms. The approach holds promise for modeling and understanding language vagueness. |
Tasks | |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10393v1 |
http://arxiv.org/pdf/1805.10393v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-language-vagueness-in-privacy |
Repo | |
Framework | |
Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping
Title | Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping |
Authors | Mihai Cucuringu, Hemant Tyagi |
Abstract | Consider an unknown smooth function $f: [0,1]^d \rightarrow \mathbb{R}$, and say we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + \eta_i) \bmod 1$, for $x_i \in [0,1]^d$, where $\eta_i$ denotes the noise. Given the samples $(x_i,y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) \bmod 1$. We formulate a natural approach for solving this problem, which works with angular embeddings of the noisy mod 1 samples over the unit circle, inspired by the angular synchronization framework. This amounts to solving a smoothness-regularized least-squares problem – a quadratically constrained quadratic program (QCQP) – where the variables are constrained to lie on the unit circle. Our approach is based on solving its relaxation, which is a trust-region sub-problem and hence solvable efficiently. We provide theoretical guarantees demonstrating its robustness to noise for adversarial, and random Gaussian and Bernoulli noise models. To the best of our knowledge, these are the first such theoretical results for this problem. We demonstrate the robustness and efficiency of our approach via extensive numerical simulations on synthetic data, along with a simple least-squares solution for the unwrapping stage, that recovers the original samples of $f$ (up to a global shift). It is shown to perform well at high levels of noise, when taking as input the denoised modulo $1$ samples. Finally, we also consider two other approaches for denoising the modulo 1 samples that leverage tools from Riemannian optimization on manifolds, including a Burer-Monteiro approach for a semidefinite programming relaxation of our formulation. For the two-dimensional version of the problem, which has applications in radar interferometry, we are able to solve instances of real-world data with a million sample points in under 10 seconds, on a personal laptop. |
Tasks | Denoising |
Published | 2018-03-09 |
URL | https://arxiv.org/abs/1803.03669v2 |
https://arxiv.org/pdf/1803.03669v2.pdf | |
PWC | https://paperswithcode.com/paper/provably-robust-estimation-of-modulo-1 |
Repo | |
Framework | |
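The angular embedding referred to in the abstract above maps each mod 1 sample $y_i$ to the point $\exp(2\pi \iota y_i)$ on the unit circle. The sketch below illustrates that embedding for noise-free one-dimensional samples, followed by a simple sequential unwrapping. It is only an illustration under stated assumptions: the function names are hypothetical, the QCQP denoising step is not shown, and the sequential unwrapping is a simpler stand-in for the least-squares unwrapping the abstract describes; it assumes consecutive clean values differ by less than 1/2.

```python
import cmath
import math
from typing import List


def angular_embedding(y_mod1: List[float]) -> List[complex]:
    """Map mod 1 samples to points exp(2*pi*i*y) on the unit circle."""
    return [cmath.exp(2j * math.pi * y) for y in y_mod1]


def to_mod1(z: List[complex]) -> List[float]:
    """Map unit-circle points back to representatives in [0, 1]."""
    return [(cmath.phase(v) / (2.0 * math.pi)) % 1.0 for v in z]


def unwrap_1d(y_mod1: List[float]) -> List[float]:
    """Sequentially unwrap 1D mod 1 samples (simple stand-in, not least squares).

    Assumes consecutive clean values differ by less than 1/2, so each
    increment can be wrapped into [-1/2, 1/2] unambiguously.
    """
    out = [y_mod1[0]]
    for prev, cur in zip(y_mod1, y_mod1[1:]):
        step = cur - prev
        step -= round(step)        # wrap the increment into [-1/2, 1/2]
        out.append(out[-1] + step)
    return out
```

On clean samples of a slowly varying function, `unwrap_1d` recovers the function up to a global integer shift, consistent with the "up to a global shift" caveat in the abstract.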