Paper Group ANR 1294
Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models
Title | Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models |
Authors | Gongbo Tang, Rico Sennrich, Joakim Nivre |
Abstract | In this paper, we try to understand neural machine translation (NMT) by simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that directly attends to the embeddings via attention mechanisms. Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature extractor, (2) that the word embeddings in encoder-free models are competitive with those in conventional models, (3) that non-contextualized source representations lead to a substantial performance drop, and (4) that encoder-free models have different effects on alignment quality for German-English and Chinese-English. |
Tasks | Machine Translation, Word Embeddings |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08158v1 |
PDF | https://arxiv.org/pdf/1907.08158v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-neural-machine-translation-by |
Repo | |
Framework | |
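The encoder-free setup described above is simple enough to sketch. Below is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the source "memory" is just the sum of word and positional embeddings, with no encoder layers at all, and a standard Transformer decoder cross-attends to it. The hyperparameters and the maximum length of 1024 are arbitrary choices for the sketch.

```python
import math
import torch
import torch.nn as nn

class EncoderFreeNMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.pos_emb = nn.Embedding(1024, d_model)  # learned positions, illustrative max length
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, tgt_vocab)
        self.d_model = d_model

    def forward(self, src_ids, tgt_ids):
        # The source is represented without any contextualization:
        # word embedding + positional embedding, nothing else.
        pos = torch.arange(src_ids.size(1), device=src_ids.device)
        memory = self.src_emb(src_ids) * math.sqrt(self.d_model) + self.pos_emb(pos)
        tpos = torch.arange(tgt_ids.size(1), device=tgt_ids.device)
        tgt = self.tgt_emb(tgt_ids) * math.sqrt(self.d_model) + self.pos_emb(tpos)
        T = tgt_ids.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf"), device=tgt_ids.device), diagonal=1)
        # The decoder's cross-attention does all of the feature extraction.
        return self.out(self.decoder(tgt, memory, tgt_mask=causal))

model = EncoderFreeNMT(src_vocab=8000, tgt_vocab=8000)
logits = model(torch.randint(0, 8000, (2, 7)), torch.randint(0, 8000, (2, 9)))  # (2, 9, 8000)
```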
Exploiting Synthetically Generated Data with Semi-Supervised Learning for Small and Imbalanced Datasets
Title | Exploiting Synthetically Generated Data with Semi-Supervised Learning for Small and Imbalanced Datasets |
Authors | Maria Perez-Ortiz, Peter Tino, Rafal Mantiuk, Cesar Hervas-Martinez |
Abstract | Data augmentation is rapidly gaining attention in machine learning. Synthetic data can be generated by simple transformations or by sampling from the data distribution. In the latter case, the main challenge is to estimate the label associated with new synthetic patterns. This paper studies the effect of generating synthetic data by convex combination of patterns and of using these as unsupervised information in a semi-supervised learning framework with support vector machines, thus avoiding the need to label synthetic examples. We perform experiments on a total of 53 binary classification datasets. Our results show that this type of data over-sampling supports the well-known cluster assumption in semi-supervised learning, showing outstanding results for small high-dimensional datasets and imbalanced learning problems. |
Tasks | Data Augmentation |
Published | 2019-03-24 |
URL | http://arxiv.org/abs/1903.10022v1 |
PDF | http://arxiv.org/pdf/1903.10022v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-synthetically-generated-data-with |
Repo | |
Framework | |
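As a toy illustration of the over-sampling step described above, here is a short NumPy sketch. It is an assumption-laden reading of "convex combination of patterns" (mixup-style interpolation, but without labels): synthetic points interpolate random training pairs and would then enter a semi-supervised SVM, e.g. a transductive SVM, as unlabeled data. The semi-supervised learner itself is not shown.

```python
import numpy as np

def convex_combination_oversample(X, n_synthetic, seed=0):
    """Synthetic patterns as convex combinations of random training pairs."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), size=n_synthetic)
    j = rng.integers(0, len(X), size=n_synthetic)
    lam = rng.uniform(0.0, 1.0, size=(n_synthetic, 1))
    # No label is assigned: the points enter the semi-supervised learner as unlabeled data.
    return lam * X[i] + (1.0 - lam) * X[j]

X = np.random.randn(40, 5)                                   # toy labeled inputs
X_unlabeled = convex_combination_oversample(X, n_synthetic=200)
```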
Generative Restricted Kernel Machines
Title | Generative Restricted Kernel Machines |
Authors | Arun Pandey, Joachim Schreurs, Johan A. K. Suykens |
Abstract | We introduce a novel framework for generative models based on Restricted Kernel Machines (RKMs) with multi-view generation and uncorrelated feature learning capabilities, called Gen-RKM. To enable multi-view generation, the mechanism uses a shared representation of data from the various views. The framework is flexible enough to incorporate kernel-based, (deep) neural network, and convolutional models within the same setting. To update the parameters of the network, we propose a novel training procedure which jointly learns the features and the shared subspace representation. The latent variables are given by the eigendecomposition of the kernel matrix, where the mutual orthogonality of eigenvectors represents uncorrelated features. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of generated samples on various standard datasets. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08144v5 |
PDF | https://arxiv.org/pdf/1906.08144v5.pdf |
PWC | https://paperswithcode.com/paper/generative-restricted-kernel-machines |
Repo | |
Framework | |
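The latent-variable step lends itself to a small numerical illustration. The kernel-PCA-style sketch below is an illustration only, not the Gen-RKM training procedure or its generative decoder: it shows how an eigendecomposition of a centered kernel matrix yields mutually orthogonal, hence uncorrelated, latent features.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

X = np.random.randn(100, 8)                          # toy single-view data
K = rbf_kernel(X)
K = K - K.mean(0) - K.mean(1)[:, None] + K.mean()    # center the kernel matrix
eigvals, eigvecs = np.linalg.eigh(K)                 # symmetric eigendecomposition
H = eigvecs[:, -10:]                                 # top-10 eigenvectors as latent codes
print(np.allclose(H.T @ H, np.eye(10)))              # orthonormal, i.e. uncorrelated features
```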
Latent Retrieval for Weakly Supervised Open Domain Question Answering
Title | Latent Retrieval for Weakly Supervised Open Domain Question Answering |
Authors | Kenton Lee, Ming-Wei Chang, Kristina Toutanova |
Abstract | Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a blackbox information retrieval (IR) system to retrieve evidence candidates. We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from IR. We show for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs and without any IR system. In this setting, evidence retrieval from all of Wikipedia is treated as a latent variable. Since this is impractical to learn from scratch, we pre-train the retriever with an Inverse Cloze Task. We evaluate on open versions of five QA datasets. On datasets where the questioner already knows the answer, a traditional IR system such as BM25 is sufficient. On datasets where a user is genuinely seeking an answer, we show that learned retrieval is crucial, outperforming BM25 by up to 19 points in exact match. |
Tasks | Information Retrieval, Open-Domain Question Answering, Question Answering |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00300v3 |
PDF | https://arxiv.org/pdf/1906.00300v3.pdf |
PWC | https://paperswithcode.com/paper/190600300 |
Repo | |
Framework | |
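The Inverse Cloze Task pre-training data is easy to sketch. In the simplified version below (an assumption; the paper's exact recipe, e.g. occasionally keeping the sentence inside its context, is not reproduced), a random sentence from each passage becomes a pseudo-query and the remaining sentences become its positive evidence; other passages in a batch would serve as negatives for the retriever.

```python
import random

def ict_pairs(passages, seed=0):
    """passages: list of passages, each given as a list of sentence strings."""
    rng = random.Random(seed)
    pairs = []
    for sentences in passages:
        if len(sentences) < 2:
            continue
        k = rng.randrange(len(sentences))
        query = sentences[k]                                   # pseudo-query
        context = " ".join(sentences[:k] + sentences[k + 1:])  # its positive evidence
        pairs.append((query, context))
    return pairs

passages = [["Paris is the capital of France.",
             "It hosts the Louvre museum.",
             "The Seine flows through it."]]
print(ict_pairs(passages))
```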
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
Title | Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning |
Authors | Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing |
Abstract | Diverse and accurate vision+language modeling is an important goal to retain creative freedom and maintain user engagement. However, adequately capturing the intricacies of diversity in language models is challenging. Recent works commonly resort to latent variable models augmented with more or less supervision from object detectors or part-of-speech tags. Common to all those methods is the fact that the latent variable either only initializes the sentence generation process or is identical across the steps of generation. Neither approach offers fine-grained control. To address this concern, we propose Seq-CVAE, which learns a latent space for every word position. We encourage this temporal latent space to capture the ‘intention’ about how to complete the sentence by mimicking a representation which summarizes the future. We illustrate the efficacy of the proposed approach to anticipate the sentence continuation on the challenging MSCOCO dataset, significantly improving diversity metrics compared to baselines while performing on par w.r.t. sentence quality. |
Tasks | Image Captioning, Language Modelling, Latent Variable Models |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08529v1 |
PDF | https://arxiv.org/pdf/1908.08529v1.pdf |
PWC | https://paperswithcode.com/paper/sequential-latent-spaces-for-modeling-the |
Repo | |
Framework | |
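To make the "latent space for every word position" concrete, here is a heavily reduced, hypothetical PyTorch sketch: the decoder draws a fresh latent variable at every step and conditions generation on it. The paper's data-dependent posterior, its training objective, and the "future summary" the latents are trained to mimic are all omitted; sampling from a standard normal prior is only a stand-in.

```python
import torch
import torch.nn as nn

class PerStepLatentDecoder(nn.Module):
    def __init__(self, vocab, d_emb=128, d_hid=256, d_z=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_emb)
        self.rnn = nn.LSTMCell(d_emb + d_z, d_hid)
        self.out = nn.Linear(d_hid, vocab)
        self.d_hid, self.d_z = d_hid, d_z

    def forward(self, tokens):                       # tokens: (B, T) word ids
        B, T = tokens.shape
        h = torch.zeros(B, self.d_hid)
        c = torch.zeros(B, self.d_hid)
        logits = []
        for t in range(T):
            z_t = torch.randn(B, self.d_z)           # a fresh latent at every word position
            x = torch.cat([self.emb(tokens[:, t]), z_t], dim=-1)
            h, c = self.rnn(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)            # (B, T, vocab)

dec = PerStepLatentDecoder(vocab=1000)
logits = dec(torch.randint(0, 1000, (4, 12)))        # (4, 12, 1000)
```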
Precision Medicine Informatics: Principles, Prospects, and Challenges
Title | Precision Medicine Informatics: Principles, Prospects, and Challenges |
Authors | Muhammad Afzal, S. M. Riazul Islam, Maqbool Hussain, Sungyoung Lee |
Abstract | Precision Medicine (PM) is an emerging approach poised to change the existing paradigm of medical practice. Recent advances in technological innovations and genetics, together with the growing availability of health data, have set a new pace for research and impose a set of new requirements on different stakeholders. To date, several studies have discussed different aspects of PM. Nevertheless, a holistic representation of those aspects from a technological perspective, in relation to applications and challenges, is mostly missing. In this context, this paper surveys advances in PM from an informatics viewpoint and reviews the enabling tools and techniques in a categorized manner. In addition, the study discusses how other technological paradigms, including big data, artificial intelligence, and the Internet of Things, can be exploited to advance the potential of PM. Furthermore, the paper provides some guidelines for future research toward the seamless implementation and wide-scale deployment of PM, based on the identified open issues and associated challenges. To this end, the paper proposes an integrated, holistic framework for PM, motivating informatics researchers to design their research works in an appropriate context. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01014v1 |
PDF | https://arxiv.org/pdf/1911.01014v1.pdf |
PWC | https://paperswithcode.com/paper/precision-medicine-informatics-principles |
Repo | |
Framework | |
Simple and Effective Noisy Channel Modeling for Neural Machine Translation
Title | Simple and Effective Noisy Channel Modeling for Neural Machine Translation |
Authors | Kyra Yee, Nathan Ng, Yann N. Dauphin, Michael Auli |
Abstract | Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentences. This makes decoding decisions based on partial source prefixes even though the full source is available. We pursue an alternative approach based on standard sequence-to-sequence models which utilize the entire source. These models perform remarkably well as channel models, even though they have neither been trained on, nor designed to factor over, incomplete target sentences. Experiments with neural language models trained on billions of words show that noisy channel models can outperform a direct model by up to 3.2 BLEU on WMT’17 German-English translation. We evaluate on four language pairs, and our channel models consistently outperform strong alternatives such as right-to-left reranking models and ensembles of direct models. |
Tasks | Latent Variable Models, Machine Translation |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05731v1 |
PDF | https://arxiv.org/pdf/1908.05731v1.pdf |
PWC | https://paperswithcode.com/paper/simple-and-effective-noisy-channel-modeling |
Repo | |
Framework | |
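The reranking arithmetic behind a noisy channel model is simple to sketch. In the hedged snippet below, candidate translations y of a source x are scored by combining the direct model log p(y|x), the channel model log p(x|y) (a standard seq2seq run in the reverse direction), and a language model log p(y); the weights and the length normalization are illustrative stand-ins that only loosely follow the paper.

```python
def noisy_channel_score(log_p_y_given_x, log_p_x_given_y, log_p_y,
                        target_len, lam1=1.0, lam2=1.0):
    # Weighted, length-normalized combination of direct, channel, and LM scores.
    return (log_p_y_given_x + lam1 * log_p_x_given_y + lam2 * log_p_y) / target_len

def rerank(candidates):
    """candidates: dicts holding precomputed log-probs and the target length."""
    return max(candidates, key=lambda c: noisy_channel_score(
        c["log_p_y_given_x"], c["log_p_x_given_y"], c["log_p_y"], c["target_len"]))
```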
A Notion of Harmonic Clustering in Simplicial Complexes
Title | A Notion of Harmonic Clustering in Simplicial Complexes |
Authors | Stefania Ebli, Gard Spreemann |
Abstract | We outline a novel clustering scheme for simplicial complexes that produces clusters of simplices in a way that is sensitive to the homology of the complex. The method is inspired by, and can be seen as a higher-dimensional version of, graph spectral clustering. The algorithm involves only sparse eigenproblems, and is therefore computationally efficient. We believe that it has broad application as a way to extract features from simplicial complexes that often arise in topological data analysis. |
Tasks | Topological Data Analysis |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07247v1 |
PDF | https://arxiv.org/pdf/1910.07247v1.pdf |
PWC | https://paperswithcode.com/paper/a-notion-of-harmonic-clustering-in-simplicial |
Repo | |
Framework | |
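The sparse eigenproblems at the heart of the method involve combinatorial (Hodge-style) Laplacians. The sketch below builds the 1-Laplacian L_1 = B_1^T B_1 + B_2 B_2^T of a toy complex from its boundary matrices and inspects its spectrum; how the paper turns (near-)harmonic eigenvectors into clusters of simplices is not reproduced here.

```python
import numpy as np
from scipy.sparse import csr_matrix

def hodge_laplacian(Bk, Bk1):
    """Bk: boundary map from k- to (k-1)-simplices; Bk1: from (k+1)- to k-simplices."""
    return (Bk.T @ Bk + Bk1 @ Bk1.T).toarray()

# Toy complex: a hollow triangle (three vertices, three edges, no 2-simplices).
B1 = csr_matrix(np.array([[-1., -1.,  0.],
                          [ 1.,  0., -1.],
                          [ 0.,  1.,  1.]]))   # rows: vertices, columns: edges
B2 = csr_matrix((3, 0))                        # no triangles in this complex
L1 = hodge_laplacian(B1, B2)
vals, vecs = np.linalg.eigh(L1)                # swap in scipy.sparse.linalg.eigsh at scale
print(np.round(vals, 6))                       # one ~0 eigenvalue: the harmonic class of the hole
```

The zero eigenvalue reflects the 1-dimensional hole of the hollow triangle, which is exactly the homological sensitivity the clustering scheme exploits.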
ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning
Title | ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning |
Authors | Martin Royer, Frédéric Chazal, Clément Levrard, Yuichi Ike, Yuhei Umeda |
Abstract | Robust topological information commonly comes in the form of a set of persistence diagrams, finite measures that are by nature difficult to affix to generic machine learning frameworks. We introduce a learnt, unsupervised measure vectorisation method and use it to reflect underlying changes in topological behaviour in machine learning contexts. Relying on optimal measure quantisation results, the method is tailored to efficiently discriminate important plane regions where meaningful differences arise. We showcase the strength and robustness of our approach on a number of applications, from competitive, modern graph collections, where the method reaches state-of-the-art performance, to a synthetic geometric problem on dynamical orbits. The proposed methodology comes with only high-level tuning parameters, such as the total measure encoding budget, and we provide completely open-access software. |
Tasks | Time Series, Topological Data Analysis |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13472v2 |
PDF | https://arxiv.org/pdf/1909.13472v2.pdf |
PWC | https://paperswithcode.com/paper/atol-automatic-topologically-oriented |
Repo | |
Framework | |
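A rough reading of the vectorisation pipeline can be sketched in a few lines: pool all persistence-diagram points, quantise them into a small codebook (the "total measure encoding budget"), and represent each diagram by aggregating a kernel against the codebook. The per-center scale below is one simple choice made for the sketch; this is not the reference implementation (a version of ATOL ships with the GUDHI library).

```python
import numpy as np
from sklearn.cluster import KMeans

def atol_vectorise(diagrams, budget=8, seed=0):
    pts = np.vstack(diagrams)                            # pool all (birth, death) points
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed).fit(pts)
    centers = km.cluster_centers_
    # Per-center scale: distance to the nearest other center (one simple choice).
    d = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    scale = d.min(axis=1)
    feats = []
    for dgm in diagrams:
        dist = np.linalg.norm(dgm[:, None] - centers[None, :], axis=-1)
        feats.append(np.exp(-dist / scale).sum(axis=0))  # Laplacian-kernel aggregation
    return np.array(feats)

dgms = [np.random.rand(30, 2), np.random.rand(50, 2)]    # toy persistence diagrams
print(atol_vectorise(dgms).shape)                        # (2, 8): one fixed-size vector each
```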
Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topology-aware Matchings
Title | Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topology-aware Matchings |
Authors | Maxime Soler, Martin Petitfrere, Gilles Darche, Melanie Plainchault, Bruno Conche, Julien Tierny |
Abstract | This application paper presents a novel framework based on topological data analysis for the automatic evaluation and ranking of viscous finger simulation runs in an ensemble with respect to a reference acquisition. Individual fingers in a given time-step are associated with critical point pairs in the distance field to the injection point, forming persistence diagrams. We introduce different metrics, based on optimal transport, for comparing time-varying persistence diagrams in this specific applicative setting. We evaluate the relevance of the rankings obtained with these metrics, both qualitatively thanks to a lightweight web visual interface, and quantitatively by studying the deviation from a reference ranking suggested by experts. Extensive experiments show the quantitative superiority of our approach compared to traditional alternatives. Our web interface allows experts to conveniently explore the produced rankings. We show a complete viscous fingering case study demonstrating the utility of our approach in the context of porous media fluid flow, where our framework can be used to automatically discard physically irrelevant simulation runs from the ensemble and rank the most plausible ones. We document an in-situ implementation to lighten I/O and performance constraints arising in the context of parametric studies. |
Tasks | Topological Data Analysis |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07841v1 |
PDF | https://arxiv.org/pdf/1908.07841v1.pdf |
PWC | https://paperswithcode.com/paper/190807841 |
Repo | |
Framework | |
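At a high level, the ranking reduces to a per-member deviation score. The sketch below assumes each ensemble member and the reference acquisition are summarized as one persistence diagram per time-step, and ranks members by the summed per-time-step distance to the reference; the paper's specific optimal-transport metrics are abstracted into a user-supplied `diagram_distance` callable.

```python
def rank_ensemble(members, reference, diagram_distance):
    """members: {name: [diagram per time-step]}; reference: [diagram per time-step]."""
    scores = {
        name: sum(diagram_distance(d, r) for d, r in zip(diags, reference))
        for name, diags in members.items()
    }
    # Most plausible first: smallest total deviation from the acquired ground truth.
    return sorted(scores, key=scores.get)
```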
Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoder
Title | Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoder |
Authors | Mostafa Sadeghi, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud |
Abstract | Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data. VAEs have been successfully used to learn a probabilistic prior over speech signals, which is then used to perform speech enhancement. One advantage of this generative approach is that it does not require pairs of clean and noisy speech signals at training time. In this paper, we propose audio-visual variants of VAEs for single-channel and speaker-independent speech enhancement. We develop a conditional VAE (CVAE) where the audio speech generative process is conditioned on visual information of the lip region. At test time, the audio-visual speech generative model is combined with a noise model based on nonnegative matrix factorization, and speech enhancement relies on a Monte Carlo expectation-maximization algorithm. Experiments are conducted with the recently published NTCD-TIMIT dataset as well as the GRID corpus. The results confirm that the proposed audio-visual CVAE effectively fuses audio and visual information, and it improves the speech enhancement performance compared with the audio-only VAE model, especially when the speech signal is highly corrupted by noise. We also show that the proposed unsupervised audio-visual speech enhancement approach outperforms a state-of-the-art supervised deep learning method. |
Tasks | Latent Variable Models, Speech Enhancement |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02590v2 |
PDF | https://arxiv.org/pdf/1908.02590v2.pdf |
PWC | https://paperswithcode.com/paper/audio-visual-speech-enhancement-using-1 |
Repo | |
Framework | |
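The conditioning mechanism, stripped of everything else, fits in a short hypothetical PyTorch sketch: both the encoder and the decoder of the VAE receive a visual embedding v of the lip region alongside the audio features s. The NMF noise model and the Monte Carlo EM enhancement stage are beyond this snippet, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class AVCVAE(nn.Module):
    def __init__(self, d_audio=513, d_visual=64, d_z=32, d_hid=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_audio + d_visual, d_hid), nn.Tanh())
        self.mu = nn.Linear(d_hid, d_z)
        self.logvar = nn.Linear(d_hid, d_z)
        self.dec = nn.Sequential(nn.Linear(d_z + d_visual, d_hid), nn.Tanh(),
                                 nn.Linear(d_hid, d_audio))

    def forward(self, s, v):                         # s: audio features, v: lip embedding
        h = self.enc(torch.cat([s, v], dim=-1))      # encoder sees audio AND video
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        s_hat = self.dec(torch.cat([z, v], dim=-1))  # decoder is also conditioned on video
        return s_hat, mu, logvar
```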
A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data
Title | A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data |
Authors | Gerhard Wohlgenannt, Dmitry Mouromtsev, Dmitry Pavlov, Yury Emelyanov, Alexey Morozov |
Abstract | With the growing number and size of Linked Data datasets, it is crucial to make the data accessible and useful for users without knowledge of formal query languages. Two approaches towards this goal are knowledge graph visualization and natural language interfaces. Here, we specifically investigate question answering (QA) over Linked Data by comparing a diagrammatic visual approach with existing natural language-based systems. Given a QA benchmark (QALD-7), we evaluate a visual method, based on iteratively creating diagrams until the answer is found, against four QA systems that take natural language queries as input. Besides other benefits, the visual approach offers higher performance, but also requires more manual input. The results indicate that the methods can be used in a complementary fashion, and that such a combination has a large positive impact on QA performance while also facilitating additional features such as data exploration. |
Tasks | Question Answering |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08501v1 |
PDF | https://arxiv.org/pdf/1907.08501v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparative-evaluation-of-visual-and |
Repo | |
Framework | |
Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering
Title | Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering |
Authors | Wei Yang, Yuqing Xie, Luchen Tan, Kun Xiong, Ming Li, Jimmy Lin |
Abstract | Recently, a simple combination of passage retrieval using off-the-shelf IR techniques and a BERT reader was found to be very effective for question answering directly on Wikipedia, yielding a large improvement over the previous state of the art on a standard benchmark dataset. In this paper, we present a data augmentation technique using distant supervision that exploits positive as well as negative examples. We apply a stage-wise approach to fine-tuning BERT on multiple datasets, starting with data that is “furthest” from the test data and ending with the “closest”. Experimental results show large gains in effectiveness over previous approaches on English QA datasets, and we establish new baselines on two recent Chinese QA datasets. |
Tasks | Data Augmentation, Open-Domain Question Answering, Question Answering |
Published | 2019-04-14 |
URL | http://arxiv.org/abs/1904.06652v1 |
PDF | http://arxiv.org/pdf/1904.06652v1.pdf |
PWC | https://paperswithcode.com/paper/data-augmentation-for-bert-fine-tuning-in |
Repo | |
Framework | |
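The stage-wise recipe itself is just an ordered fine-tuning loop, sketched below; `train_one_stage` is a hypothetical stand-in for any standard BERT QA fine-tuning routine, not an API from the paper.

```python
def stagewise_finetune(model, datasets_far_to_close, train_one_stage):
    """datasets_far_to_close: datasets ordered from least to most similar to the test data."""
    for dataset in datasets_far_to_close:
        # Each stage warm-starts from the weights produced by the previous stage,
        # so the data "closest" to the test distribution has the final say.
        model = train_one_stage(model, dataset)
    return model
```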
Sequential Classification with Empirically Observed Statistics
Title | Sequential Classification with Empirically Observed Statistics |
Authors | Mahdi Haghifam, Vincent Y. F. Tan, Ashish Khisti |
Abstract | Motivated by real-world machine learning applications, we consider a statistical classification task in a sequential setting where test samples arrive sequentially. In addition, the generating distributions are unknown, and only a set of empirically sampled sequences is available to a decision maker. The decision maker is tasked to classify a test sequence which is known to be generated according to one of the two distributions. In particular, for the binary case, the decision maker wishes to perform the classification task with the minimum number of test samples; at each step, she either declares that hypothesis 1 is true, declares that hypothesis 2 is true, or requests an additional test sample. We propose a classifier and analyze the type-I and type-II error probabilities. We demonstrate the significant advantage of our sequential scheme compared to an existing non-sequential classifier proposed by Gutman. Finally, we extend our setup and results to the multi-class classification scenario, and again demonstrate that the variable-length nature of the problem affords significant advantages, as one can achieve the same set of exponents as Gutman’s fixed-length setting but without the rejection option. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01170v1 |
PDF | https://arxiv.org/pdf/1912.01170v1.pdf |
PWC | https://paperswithcode.com/paper/sequential-classification-with-empirically |
Repo | |
Framework | |
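A heavily simplified sketch of the sequential protocol follows: empirical distributions ("types") are formed from the two training sequences, and as test samples arrive the decision maker either declares a hypothesis once a log-likelihood-ratio-style statistic clears a threshold, or requests another sample. The statistic and threshold below are illustrative simplifications, not the paper's exact test.

```python
import numpy as np

def empirical_type(seq, alphabet_size):
    counts = np.bincount(seq, minlength=alphabet_size).astype(float)
    return counts / counts.sum()

def sequential_classify(test_stream, train1, train2, alphabet_size, threshold=0.5):
    p1 = empirical_type(train1, alphabet_size)
    p2 = empirical_type(train2, alphabet_size)
    observed, eps = [], 1e-12
    for x in test_stream:
        observed.append(x)                     # request and incorporate one more sample
        q = empirical_type(np.array(observed), alphabet_size)
        stat = (q * np.log((p1 + eps) / (p2 + eps))).sum()   # mean log-likelihood ratio
        if stat > threshold:
            return 1, len(observed)            # declare hypothesis 1
        if stat < -threshold:
            return 2, len(observed)            # declare hypothesis 2
    return 0, len(observed)                    # undecided when the stream is exhausted
```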
Anytime Online-to-Batch Conversions, Optimism, and Acceleration
Title | Anytime Online-to-Batch Conversions, Optimism, and Acceleration |
Authors | Ashok Cutkosky |
Abstract | A standard way to obtain convergence guarantees in stochastic convex optimization is to run an online learning algorithm and then output the average of its iterates: the actual iterates of the online learning algorithm do not come with individual guarantees. We close this gap by introducing a black-box modification to any online learning algorithm whose iterates converge to the optimum in stochastic scenarios. We then consider the case of smooth losses, and show that combining our approach with optimistic online learning algorithms immediately yields a fast convergence rate of $O(L/T^{3/2}+\sigma/\sqrt{T})$ on $L$-smooth problems with $\sigma^2$ variance in the gradients. Finally, we provide a reduction that converts any adaptive online algorithm into one that obtains the optimal accelerated rate of $\tilde O(L/T^2 + \sigma/\sqrt{T})$, while still maintaining $\tilde O(1/\sqrt{T})$ convergence in the non-smooth setting. Importantly, our algorithms adapt to $L$ and $\sigma$ automatically: they do not need to know either to obtain these rates. |
Tasks | |
Published | 2019-03-03 |
URL | http://arxiv.org/abs/1903.00974v1 |
PDF | http://arxiv.org/pdf/1903.00974v1.pdf |
PWC | https://paperswithcode.com/paper/anytime-online-to-batch-conversions-optimism |
Repo | |
Framework | |
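The black-box modification admits a very small sketch: run any online learner as usual, but evaluate each stochastic gradient at the running average of its iterates; the average is then the point carrying the convergence guarantee at every step. Below, online gradient descent on a toy quadratic plays the base learner; the step sizes and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(5)            # the online learner's iterate
x_avg = np.zeros(5)        # the averaged point, where gradients are evaluated
lr = 0.1
for t in range(1, 1001):
    x_avg += (w - x_avg) / t                    # x_t = (1/t) * sum of w_1..w_t
    # Noisy gradient of 0.5 * ||x - 1||^2, queried at the average, not at w:
    grad = (x_avg - np.ones(5)) + 0.1 * rng.standard_normal(5)
    w -= lr / np.sqrt(t) * grad                 # any online algorithm works; OGD here
print(x_avg)   # converges toward the optimum (the all-ones vector)
```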