Paper Group ANR 1294
Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models
Title | Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models |
Authors | Gongbo Tang, Rico Sennrich, Joakim Nivre |
Abstract | In this paper, we try to understand neural machine translation (NMT) by simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that directly attends to the embeddings via attention mechanisms. Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature extractor, (2) that the word embeddings in encoder-free models are competitive with those in conventional models, (3) that non-contextualized source representations lead to a substantial performance drop, and (4) that encoder-free models have different effects on alignment quality for German-English and Chinese-English. |
Tasks | Machine Translation, Word Embeddings |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08158v1 |
PDF | https://arxiv.org/pdf/1907.08158v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-neural-machine-translation-by |
Repo | |
Framework | |
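The encoder-free setup described above is simple enough to sketch. Below is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the source "memory" is just the sum of word and positional embeddings, with no encoder layers at all, and a standard Transformer decoder cross-attends to it. The hyperparameters and the maximum length of 1024 are arbitrary choices for the sketch.

```python
import math
import torch
import torch.nn as nn

class EncoderFreeNMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.pos_emb = nn.Embedding(1024, d_model)  # learned positions, illustrative max length
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, tgt_vocab)
        self.d_model = d_model

    def forward(self, src_ids, tgt_ids):
        # The source is represented without any contextualization:
        # word embedding + positional embedding, nothing else.
        pos = torch.arange(src_ids.size(1), device=src_ids.device)
        memory = self.src_emb(src_ids) * math.sqrt(self.d_model) + self.pos_emb(pos)
        tpos = torch.arange(tgt_ids.size(1), device=tgt_ids.device)
        tgt = self.tgt_emb(tgt_ids) * math.sqrt(self.d_model) + self.pos_emb(tpos)
        T = tgt_ids.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf"), device=tgt_ids.device), diagonal=1)
        # The decoder's cross-attention does all of the feature extraction.
        return self.out(self.decoder(tgt, memory, tgt_mask=causal))

model = EncoderFreeNMT(src_vocab=8000, tgt_vocab=8000)
logits = model(torch.randint(0, 8000, (2, 7)), torch.randint(0, 8000, (2, 9)))  # (2, 9, 8000)
```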
Exploiting Synthetically Generated Data with Semi-Supervised Learning for Small and Imbalanced Datasets
Title | Exploiting Synthetically Generated Data with Semi-Supervised Learning for Small and Imbalanced Datasets |
Authors | Maria Perez-Ortiz, Peter Tino, Rafal Mantiuk, Cesar Hervas-Martinez |
Abstract | Data augmentation is rapidly gaining attention in machine learning. Synthetic data can be generated by simple transformations or by sampling from the data distribution. In the latter case, the main challenge is to estimate the label associated with new synthetic patterns. This paper studies the effect of generating synthetic data by convex combination of patterns and of using these as unsupervised information in a semi-supervised learning framework with support vector machines, thus avoiding the need to label synthetic examples. We perform experiments on a total of 53 binary classification datasets. Our results show that this type of data over-sampling supports the well-known cluster assumption in semi-supervised learning, showing outstanding results for small high-dimensional datasets and imbalanced learning problems. |
Tasks | Data Augmentation |
Published | 2019-03-24 |
URL | http://arxiv.org/abs/1903.10022v1 |
PDF | http://arxiv.org/pdf/1903.10022v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-synthetically-generated-data-with |
Repo | |
Framework | |
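As a toy illustration of the over-sampling step described above, here is a short NumPy sketch. It is an assumption-laden reading of "convex combination of patterns" (mixup-style interpolation, but without labels): synthetic points interpolate random training pairs and would then enter a semi-supervised SVM, e.g. a transductive SVM, as unlabeled data. The semi-supervised learner itself is not shown.

```python
import numpy as np

def convex_combination_oversample(X, n_synthetic, seed=0):
    """Synthetic patterns as convex combinations of random training pairs."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), size=n_synthetic)
    j = rng.integers(0, len(X), size=n_synthetic)
    lam = rng.uniform(0.0, 1.0, size=(n_synthetic, 1))
    # No label is assigned: the points enter the semi-supervised learner as unlabeled data.
    return lam * X[i] + (1.0 - lam) * X[j]

X = np.random.randn(40, 5)                                   # toy labeled inputs
X_unlabeled = convex_combination_oversample(X, n_synthetic=200)
```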
Generative Restricted Kernel Machines
Title | Generative Restricted Kernel Machines |
Authors | Arun Pandey, Joachim Schreurs, Johan A. K. Suykens |
Abstract | We introduce a novel framework for generative models based on Restricted Kernel Machines (RKMs) with multi-view generation and uncorrelated feature learning capabilities, called Gen-RKM. To enable multi-view generation, the mechanism uses a shared representation of data from the various views. The framework is flexible enough to incorporate kernel-based, (deep) neural network, and convolutional models within the same setting. To update the parameters of the network, we propose a novel training procedure which jointly learns the features and the shared subspace representation. The latent variables are given by the eigendecomposition of the kernel matrix, where the mutual orthogonality of eigenvectors represents uncorrelated features. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of generated samples on various standard datasets. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08144v5 |
PDF | https://arxiv.org/pdf/1906.08144v5.pdf |
PWC | https://paperswithcode.com/paper/generative-restricted-kernel-machines |
Repo | |
Framework | |
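The latent-variable step lends itself to a small numerical illustration. The kernel-PCA-style sketch below is an illustration only, not the Gen-RKM training procedure or its generative decoder: it shows how an eigendecomposition of a centered kernel matrix yields mutually orthogonal, hence uncorrelated, latent features.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

X = np.random.randn(100, 8)                          # toy single-view data
K = rbf_kernel(X)
K = K - K.mean(0) - K.mean(1)[:, None] + K.mean()    # center the kernel matrix
eigvals, eigvecs = np.linalg.eigh(K)                 # symmetric eigendecomposition
H = eigvecs[:, -10:]                                 # top-10 eigenvectors as latent codes
print(np.allclose(H.T @ H, np.eye(10)))              # orthonormal, i.e. uncorrelated features
```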
Latent Retrieval for Weakly Supervised Open Domain Question Answering
Title | Latent Retrieval for Weakly Supervised Open Domain Question Answering |
Authors | Kenton Lee, Ming-Wei Chang, Kristina Toutanova |
Abstract | Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a blackbox information retrieval (IR) system to retrieve evidence candidates. We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from IR. We show for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs and without any IR system. In this setting, evidence retrieval from all of Wikipedia is treated as a latent variable. Since this is impractical to learn from scratch, we pre-train the retriever with an Inverse Cloze Task. We evaluate on open versions of five QA datasets. On datasets where the questioner already knows the answer, a traditional IR system such as BM25 is sufficient. On datasets where a user is genuinely seeking an answer, we show that learned retrieval is crucial, outperforming BM25 by up to 19 points in exact match. |
Tasks | Information Retrieval, Open-Domain Question Answering, Question Answering |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00300v3 |
PDF | https://arxiv.org/pdf/1906.00300v3.pdf |
PWC | https://paperswithcode.com/paper/190600300 |
Repo | |
Framework | |
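The Inverse Cloze Task pre-training data is easy to sketch. In the simplified version below (an assumption; the paper's exact recipe, e.g. occasionally keeping the sentence inside its context, is not reproduced), a random sentence from each passage becomes a pseudo-query and the remaining sentences become its positive evidence; other passages in a batch would serve as negatives for the retriever.

```python
import random

def ict_pairs(passages, seed=0):
    """passages: list of passages, each given as a list of sentence strings."""
    rng = random.Random(seed)
    pairs = []
    for sentences in passages:
        if len(sentences) < 2:
            continue
        k = rng.randrange(len(sentences))
        query = sentences[k]                                   # pseudo-query
        context = " ".join(sentences[:k] + sentences[k + 1:])  # its positive evidence
        pairs.append((query, context))
    return pairs

passages = [["Paris is the capital of France.",
             "It hosts the Louvre museum.",
             "The Seine flows through it."]]
print(ict_pairs(passages))
```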
Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning
Title | Sequential Latent Spaces for Modeling the Intention During Diverse Image Captioning |
Authors | Jyoti Aneja, Harsh Agrawal, Dhruv Batra, Alexander Schwing |
Abstract | Diverse and accurate vision+language modeling is an important goal to retain creative freedom and maintain user engagement. However, adequately capturing the intricacies of diversity in language models is challenging. Recent works commonly resort to latent variable models augmented with more or less supervision from object detectors or part-of-speech tags. Common to all those methods is the fact that the latent variable either only initializes the sentence generation process or is identical across the steps of generation. Neither approach offers fine-grained control. To address this concern, we propose Seq-CVAE, which learns a latent space for every word position. We encourage this temporal latent space to capture the ‘intention’ about how to complete the sentence by mimicking a representation which summarizes the future. We illustrate the efficacy of the proposed approach to anticipate the sentence continuation on the challenging MSCOCO dataset, significantly improving diversity metrics compared to baselines while performing on par w.r.t. sentence quality. |
Tasks | Image Captioning, Language Modelling, Latent Variable Models |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08529v1 |
PDF | https://arxiv.org/pdf/1908.08529v1.pdf |
PWC | https://paperswithcode.com/paper/sequential-latent-spaces-for-modeling-the |
Repo | |
Framework | |
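To make the "latent space for every word position" concrete, here is a heavily reduced, hypothetical PyTorch sketch: the decoder draws a fresh latent variable at every step and conditions generation on it. The paper's data-dependent posterior, its training objective, and the "future summary" the latents are trained to mimic are all omitted; sampling from a standard normal prior is only a stand-in.

```python
import torch
import torch.nn as nn

class PerStepLatentDecoder(nn.Module):
    def __init__(self, vocab, d_emb=128, d_hid=256, d_z=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_emb)
        self.rnn = nn.LSTMCell(d_emb + d_z, d_hid)
        self.out = nn.Linear(d_hid, vocab)
        self.d_hid, self.d_z = d_hid, d_z

    def forward(self, tokens):                       # tokens: (B, T) word ids
        B, T = tokens.shape
        h = torch.zeros(B, self.d_hid)
        c = torch.zeros(B, self.d_hid)
        logits = []
        for t in range(T):
            z_t = torch.randn(B, self.d_z)           # a fresh latent at every word position
            x = torch.cat([self.emb(tokens[:, t]), z_t], dim=-1)
            h, c = self.rnn(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)            # (B, T, vocab)

dec = PerStepLatentDecoder(vocab=1000)
logits = dec(torch.randint(0, 1000, (4, 12)))        # (4, 12, 1000)
```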
Precision Medicine Informatics: Principles, Prospects, and Challenges
Title | Precision Medicine Informatics: Principles, Prospects, and Challenges |
Authors | Muhammad Afzal, S. M. Riazul Islam, Maqbool Hussain, Sungyoung Lee |
Abstract | Precision Medicine (PM) is an emerging approach poised to change the existing paradigm of medical practice. Recent advances in technological innovations and genetics, together with the growing availability of health data, have set a new pace for research and impose a set of new requirements on different stakeholders. To date, several studies have discussed different aspects of PM. Nevertheless, a holistic representation of those aspects from a technological perspective, in relation to applications and challenges, is mostly missing. In this context, this paper surveys advances in PM from an informatics viewpoint and reviews the enabling tools and techniques in a categorized manner. In addition, the study discusses how other technological paradigms, including big data, artificial intelligence, and the Internet of Things, can be exploited to advance the potential of PM. Furthermore, the paper provides some guidelines for future research toward the seamless implementation and wide-scale deployment of PM, based on the identified open issues and associated challenges. To this end, the paper proposes an integrated, holistic framework for PM, motivating informatics researchers to design their research works in an appropriate context. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01014v1 |
PDF | https://arxiv.org/pdf/1911.01014v1.pdf |
PWC | https://paperswithcode.com/paper/precision-medicine-informatics-principles |
Repo | |
Framework | |
Simple and Effective Noisy Channel Modeling for Neural Machine Translation
Title | Simple and Effective Noisy Channel Modeling for Neural Machine Translation |
Authors | Kyra Yee, Nathan Ng, Yann N. Dauphin, Michael Auli |
Abstract | Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentences. This makes decoding decisions based on partial source prefixes even though the full source is available. We pursue an alternative approach based on standard sequence-to-sequence models which utilize the entire source. These models perform remarkably well as channel models, even though they have neither been trained on, nor designed to factor over, incomplete target sentences. Experiments with neural language models trained on billions of words show that noisy channel models can outperform a direct model by up to 3.2 BLEU on WMT’17 German-English translation. We evaluate on four language pairs, and our channel models consistently outperform strong alternatives such as right-to-left reranking models and ensembles of direct models. |
Tasks | Latent Variable Models, Machine Translation |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05731v1 |
PDF | https://arxiv.org/pdf/1908.05731v1.pdf |
PWC | https://paperswithcode.com/paper/simple-and-effective-noisy-channel-modeling |
Repo | |
Framework | |
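The reranking arithmetic behind a noisy channel model is simple to sketch. In the hedged snippet below, candidate translations y of a source x are scored by combining the direct model log p(y|x), the channel model log p(x|y) (a standard seq2seq run in the reverse direction), and a language model log p(y); the weights and the length normalization are illustrative stand-ins that only loosely follow the paper.

```python
def noisy_channel_score(log_p_y_given_x, log_p_x_given_y, log_p_y,
                        target_len, lam1=1.0, lam2=1.0):
    # Weighted, length-normalized combination of direct, channel, and LM scores.
    return (log_p_y_given_x + lam1 * log_p_x_given_y + lam2 * log_p_y) / target_len

def rerank(candidates):
    """candidates: dicts holding precomputed log-probs and the target length."""
    return max(candidates, key=lambda c: noisy_channel_score(
        c["log_p_y_given_x"], c["log_p_x_given_y"], c["log_p_y"], c["target_len"]))
```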
A Notion of Harmonic Clustering in Simplicial Complexes
Title | A Notion of Harmonic Clustering in Simplicial Complexes |
Authors | Stefania Ebli, Gard Spreemann |
Abstract | We outline a novel clustering scheme for simplicial complexes that produces clusters of simplices in a way that is sensitive to the homology of the complex. The method is inspired by, and can be seen as a higher-dimensional version of, graph spectral clustering. The algorithm involves only sparse eigenproblems, and is therefore computationally efficient. We believe that it has broad application as a way to extract features from simplicial complexes that often arise in topological data analysis. |
Tasks | Topological Data Analysis |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07247v1 |
PDF | https://arxiv.org/pdf/1910.07247v1.pdf |
PWC | https://paperswithcode.com/paper/a-notion-of-harmonic-clustering-in-simplicial |
Repo | |
Framework | |
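The sparse eigenproblems at the heart of the method involve combinatorial (Hodge-style) Laplacians. The sketch below builds the 1-Laplacian L_1 = B_1^T B_1 + B_2 B_2^T of a toy complex from its boundary matrices and inspects its spectrum; how the paper turns (near-)harmonic eigenvectors into clusters of simplices is not reproduced here.

```python
import numpy as np
from scipy.sparse import csr_matrix

def hodge_laplacian(Bk, Bk1):
    """Bk: boundary map from k- to (k-1)-simplices; Bk1: from (k+1)- to k-simplices."""
    return (Bk.T @ Bk + Bk1 @ Bk1.T).toarray()

# Toy complex: a hollow triangle (three vertices, three edges, no 2-simplices).
B1 = csr_matrix(np.array([[-1., -1.,  0.],
                          [ 1.,  0., -1.],
                          [ 0.,  1.,  1.]]))   # rows: vertices, columns: edges
B2 = csr_matrix((3, 0))                        # no triangles in this complex
L1 = hodge_laplacian(B1, B2)
vals, vecs = np.linalg.eigh(L1)                # swap in scipy.sparse.linalg.eigsh at scale
print(np.round(vals, 6))                       # one ~0 eigenvalue: the harmonic class of the hole
```

The zero eigenvalue reflects the 1-dimensional hole of the hollow triangle, which is exactly the homological sensitivity the clustering scheme exploits.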
ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning
Title | ATOL: Measure Vectorisation for Automatic Topologically-Oriented Learning |
Authors | Martin Royer, Frédéric Chazal, Clément Levrard, Yuichi Ike, Yuhei Umeda |
Abstract | Robust topological information commonly comes in the form of a set of persistence diagrams, finite measures that are by nature difficult to affix to generic machine learning frameworks. We introduce a learnt, unsupervised measure vectorisation method and use it to reflect underlying changes in topological behaviour in machine learning contexts. Relying on optimal measure quantisation results, the method is tailored to efficiently discriminate important plane regions where meaningful differences arise. We showcase the strength and robustness of our approach on a number of applications, from competitive, modern graph collections, where the method reaches state-of-the-art performance, to a synthetic geometric problem on dynamical orbits. The proposed methodology comes with only high-level tuning parameters, such as the total measure encoding budget, and we provide completely open-access software. |
Tasks | Time Series, Topological Data Analysis |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13472v2 |
PDF | https://arxiv.org/pdf/1909.13472v2.pdf |
PWC | https://paperswithcode.com/paper/atol-automatic-topologically-oriented |
Repo | |
Framework | |
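A rough reading of the vectorisation pipeline can be sketched in a few lines: pool all persistence-diagram points, quantise them into a small codebook (the "total measure encoding budget"), and represent each diagram by aggregating a kernel against the codebook. The per-center scale below is one simple choice made for the sketch; this is not the reference implementation (a version of ATOL ships with the GUDHI library).

```python
import numpy as np
from sklearn.cluster import KMeans

def atol_vectorise(diagrams, budget=8, seed=0):
    pts = np.vstack(diagrams)                            # pool all (birth, death) points
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed).fit(pts)
    centers = km.cluster_centers_
    # Per-center scale: distance to the nearest other center (one simple choice).
    d = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    scale = d.min(axis=1)
    feats = []
    for dgm in diagrams:
        dist = np.linalg.norm(dgm[:, None] - centers[None, :], axis=-1)
        feats.append(np.exp(-dist / scale).sum(axis=0))  # Laplacian-kernel aggregation
    return np.array(feats)

dgms = [np.random.rand(30, 2), np.random.rand(50, 2)]    # toy persistence diagrams
print(atol_vectorise(dgms).shape)                        # (2, 8): one fixed-size vector each
```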
Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topology-aware Matchings
Title | Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topology-aware Matchings |
Authors | Maxime Soler, Martin Petitfrere, Gilles Darche, Melanie Plainchault, Bruno Conche, Julien Tierny |
Abstract | This application paper presents a novel framework based on topological data analysis for the automatic evaluation and ranking of viscous finger simulation runs in an ensemble with respect to a reference acquisition. Individual fingers in a given time-step are associated with critical point pairs in the distance field to the injection point, forming persistence diagrams. We introduce different metrics, based on optimal transport, for comparing time-varying persistence diagrams in this specific applicative setting. We evaluate the relevance of the rankings obtained with these metrics, both qualitatively thanks to a lightweight web visual interface, and quantitatively by studying the deviation from a reference ranking suggested by experts. Extensive experiments show the quantitative superiority of our approach compared to traditional alternatives. Our web interface allows experts to conveniently explore the produced rankings. We show a complete viscous fingering case study demonstrating the utility of our approach in the context of porous media fluid flow, where our framework can be used to automatically discard physically irrelevant simulation runs from the ensemble and rank the most plausible ones. We document an in-situ implementation to lighten I/O and performance constraints arising in the context of parametric studies. |
Tasks | Topological Data Analysis |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07841v1 |
PDF | https://arxiv.org/pdf/1908.07841v1.pdf |
PWC | https://paperswithcode.com/paper/190807841 |
Repo | |
Framework | |
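At a high level, the ranking reduces to a per-member deviation score. The sketch below assumes each ensemble member and the reference acquisition are summarized as one persistence diagram per time-step, and ranks members by the summed per-time-step distance to the reference; the paper's specific optimal-transport metrics are abstracted into a user-supplied `diagram_distance` callable.

```python
def rank_ensemble(members, reference, diagram_distance):
    """members: {name: [diagram per time-step]}; reference: [diagram per time-step]."""
    scores = {
        name: sum(diagram_distance(d, r) for d, r in zip(diags, reference))
        for name, diags in members.items()
    }
    # Most plausible first: smallest total deviation from the acquired ground truth.
    return sorted(scores, key=scores.get)
```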
Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoder
Title | Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoder |
Authors | Mostafa Sadeghi, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud |
Abstract | Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data. VAEs have been successfully used to learn a probabilistic prior over speech signals, which is then used to perform speech enhancement. One advantage of this generative approach is that it does not require pairs of clean and noisy speech signals at training time. In this paper, we propose audio-visual variants of VAEs for single-channel and speaker-independent speech enhancement. We develop a conditional VAE (CVAE) where the audio speech generative process is conditioned on visual information of the lip region. At test time, the audio-visual speech generative model is combined with a noise model based on nonnegative matrix factorization, and speech enhancement relies on a Monte Carlo expectation-maximization algorithm. Experiments are conducted with the recently published NTCD-TIMIT dataset as well as the GRID corpus. The results confirm that the proposed audio-visual CVAE effectively fuses audio and visual information, and it improves the speech enhancement performance compared with the audio-only VAE model, especially when the speech signal is highly corrupted by noise. We also show that the proposed unsupervised audio-visual speech enhancement approach outperforms a state-of-the-art supervised deep learning method. |
Tasks | Latent Variable Models, Speech Enhancement |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02590v2 |
PDF | https://arxiv.org/pdf/1908.02590v2.pdf |
PWC | https://paperswithcode.com/paper/audio-visual-speech-enhancement-using-1 |
Repo | |
Framework | |
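The conditioning mechanism, stripped of everything else, fits in a short hypothetical PyTorch sketch: both the encoder and the decoder of the VAE receive a visual embedding v of the lip region alongside the audio features s. The NMF noise model and the Monte Carlo EM enhancement stage are beyond this snippet, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class AVCVAE(nn.Module):
    def __init__(self, d_audio=513, d_visual=64, d_z=32, d_hid=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_audio + d_visual, d_hid), nn.Tanh())
        self.mu = nn.Linear(d_hid, d_z)
        self.logvar = nn.Linear(d_hid, d_z)
        self.dec = nn.Sequential(nn.Linear(d_z + d_visual, d_hid), nn.Tanh(),
                                 nn.Linear(d_hid, d_audio))

    def forward(self, s, v):                         # s: audio features, v: lip embedding
        h = self.enc(torch.cat([s, v], dim=-1))      # encoder sees audio AND video
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        s_hat = self.dec(torch.cat([z, v], dim=-1))  # decoder is also conditioned on video
        return s_hat, mu, logvar
```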
A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data
Title | A Comparative Evaluation of Visual and Natural Language Question Answering Over Linked Data |
Authors | Gerhard Wohlgenannt, Dmitry Mouromtsev, Dmitry Pavlov, Yury Emelyanov, Alexey Morozov |
Abstract | With the growing number and size of Linked Data datasets, it is crucial to make the data accessible and useful for users without knowledge of formal query languages. Two approaches towards this goal are knowledge graph visualization and natural language interfaces. Here, we specifically investigate question answering (QA) over Linked Data by comparing a diagrammatic visual approach with existing natural language-based systems. Given a QA benchmark (QALD-7), we evaluate a visual method, based on iteratively creating diagrams until the answer is found, against four QA systems that take natural language queries as input. Besides other benefits, the visual approach offers higher performance, but also requires more manual input. The results indicate that the methods can be used in a complementary fashion, and that such a combination has a large positive impact on QA performance while also facilitating additional features such as data exploration. |
Tasks | Question Answering |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08501v1 |
PDF | https://arxiv.org/pdf/1907.08501v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparative-evaluation-of-visual-and |
Repo | |
Framework | |
Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering
Title | Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering |
Authors | Wei Yang, Yuqing Xie, Luchen Tan, Kun Xiong, Ming Li, Jimmy Lin |
Abstract | Recently, a simple combination of passage retrieval using off-the-shelf IR techniques and a BERT reader was found to be very effective for question answering directly on Wikipedia, yielding a large improvement over the previous state of the art on a standard benchmark dataset. In this paper, we present a data augmentation technique using distant supervision that exploits positive as well as negative examples. We apply a stage-wise approach to fine-tuning BERT on multiple datasets, starting with data that is “furthest” from the test data and ending with the “closest”. Experimental results show large gains in effectiveness over previous approaches on English QA datasets, and we establish new baselines on two recent Chinese QA datasets. |
Tasks | Data Augmentation, Open-Domain Question Answering, Question Answering |
Published | 2019-04-14 |
URL | http://arxiv.org/abs/1904.06652v1 |
PDF | http://arxiv.org/pdf/1904.06652v1.pdf |
PWC | https://paperswithcode.com/paper/data-augmentation-for-bert-fine-tuning-in |
Repo | |
Framework | |
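The stage-wise recipe itself is just an ordered fine-tuning loop, sketched below; `train_one_stage` is a hypothetical stand-in for any standard BERT QA fine-tuning routine, not an API from the paper.

```python
def stagewise_finetune(model, datasets_far_to_close, train_one_stage):
    """datasets_far_to_close: datasets ordered from least to most similar to the test data."""
    for dataset in datasets_far_to_close:
        # Each stage warm-starts from the weights produced by the previous stage,
        # so the data "closest" to the test distribution has the final say.
        model = train_one_stage(model, dataset)
    return model
```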
Sequential Classification with Empirically Observed Statistics
Title | Sequential Classification with Empirically Observed Statistics |
Authors | Mahdi Haghifam, Vincent Y. F. Tan, Ashish Khisti |
Abstract | Motivated by real-world machine learning applications, we consider a statistical classification task in a sequential setting where test samples arrive sequentially. In addition, the generating distributions are unknown, and only a set of empirically sampled sequences is available to a decision maker. The decision maker is tasked to classify a test sequence which is known to be generated according to one of the two distributions. In particular, for the binary case, the decision maker wishes to perform the classification task with the minimum number of test samples; at each step, she either declares that hypothesis 1 is true, declares that hypothesis 2 is true, or requests an additional test sample. We propose a classifier and analyze the type-I and type-II error probabilities. We demonstrate the significant advantage of our sequential scheme compared to an existing non-sequential classifier proposed by Gutman. Finally, we extend our setup and results to the multi-class classification scenario, and again demonstrate that the variable-length nature of the problem affords significant advantages, as one can achieve the same set of exponents as Gutman’s fixed-length setting but without the rejection option. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01170v1 |
PDF | https://arxiv.org/pdf/1912.01170v1.pdf |
PWC | https://paperswithcode.com/paper/sequential-classification-with-empirically |
Repo | |
Framework | |
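A heavily simplified sketch of the sequential protocol follows: empirical distributions ("types") are formed from the two training sequences, and as test samples arrive the decision maker either declares a hypothesis once a log-likelihood-ratio-style statistic clears a threshold, or requests another sample. The statistic and threshold below are illustrative simplifications, not the paper's exact test.

```python
import numpy as np

def empirical_type(seq, alphabet_size):
    counts = np.bincount(seq, minlength=alphabet_size).astype(float)
    return counts / counts.sum()

def sequential_classify(test_stream, train1, train2, alphabet_size, threshold=0.5):
    p1 = empirical_type(train1, alphabet_size)
    p2 = empirical_type(train2, alphabet_size)
    observed, eps = [], 1e-12
    for x in test_stream:
        observed.append(x)                     # request and incorporate one more sample
        q = empirical_type(np.array(observed), alphabet_size)
        stat = (q * np.log((p1 + eps) / (p2 + eps))).sum()   # mean log-likelihood ratio
        if stat > threshold:
            return 1, len(observed)            # declare hypothesis 1
        if stat < -threshold:
            return 2, len(observed)            # declare hypothesis 2
    return 0, len(observed)                    # undecided when the stream is exhausted
```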
Anytime Online-to-Batch Conversions, Optimism, and Acceleration
Title | Anytime Online-to-Batch Conversions, Optimism, and Acceleration |
Authors | Ashok Cutkosky |
Abstract | A standard way to obtain convergence guarantees in stochastic convex optimization is to run an online learning algorithm and then output the average of its iterates: the actual iterates of the online learning algorithm do not come with individual guarantees. We close this gap by introducing a black-box modification to any online learning algorithm whose iterates converge to the optimum in stochastic scenarios. We then consider the case of smooth losses, and show that combining our approach with optimistic online learning algorithms immediately yields a fast convergence rate of $O(L/T^{3/2}+\sigma/\sqrt{T})$ on $L$-smooth problems with $\sigma^2$ variance in the gradients. Finally, we provide a reduction that converts any adaptive online algorithm into one that obtains the optimal accelerated rate of $\tilde O(L/T^2 + \sigma/\sqrt{T})$, while still maintaining $\tilde O(1/\sqrt{T})$ convergence in the non-smooth setting. Importantly, our algorithms adapt to $L$ and $\sigma$ automatically: they do not need to know either to obtain these rates. |
Tasks | |
Published | 2019-03-03 |
URL | http://arxiv.org/abs/1903.00974v1 |
PDF | http://arxiv.org/pdf/1903.00974v1.pdf |
PWC | https://paperswithcode.com/paper/anytime-online-to-batch-conversions-optimism |
Repo | |
Framework | |
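The black-box modification admits a very small sketch: run any online learner as usual, but evaluate each stochastic gradient at the running average of its iterates; the average is then the point carrying the convergence guarantee at every step. Below, online gradient descent on a toy quadratic plays the base learner; the step sizes and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(5)            # the online learner's iterate
x_avg = np.zeros(5)        # the averaged point, where gradients are evaluated
lr = 0.1
for t in range(1, 1001):
    x_avg += (w - x_avg) / t                    # x_t = (1/t) * sum of w_1..w_t
    # Noisy gradient of 0.5 * ||x - 1||^2, queried at the average, not at w:
    grad = (x_avg - np.ones(5)) + 0.1 * rng.standard_normal(5)
    w -= lr / np.sqrt(t) * grad                 # any online algorithm works; OGD here
print(x_avg)   # converges toward the optimum (the all-ones vector)
```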