January 31, 2020

Paper Group ANR 154

REVE: Regularizing Deep Learning with Variational Entropy Bound

Title REVE: Regularizing Deep Learning with Variational Entropy Bound
Authors Antoine Saporta, Yifu Chen, Michael Blot, Matthieu Cord
Abstract Studies on generalization performance of machine learning algorithms under the scope of information theory suggest that compressed representations can guarantee good generalization, inspiring many compression-based regularization methods. In this paper, we introduce REVE, a new regularization scheme. Noting that compressing the representation can be sub-optimal, our first contribution is to identify a variable that is directly responsible for the final prediction. Our method aims at compressing the class conditioned entropy of this latter variable. Second, we introduce a variational upper bound on this conditional entropy term. Finally, we propose a scheme to instantiate a tractable loss that is integrated within the training procedure of the neural network and demonstrate its efficiency on different neural networks and datasets.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06816v1
PDF https://arxiv.org/pdf/1910.06816v1.pdf
PWC https://paperswithcode.com/paper/reve-regularizing-deep-learning-with
Repo
Framework

Rule based Approach for Word Normalization by resolving Transcription Ambiguity in Transliterated Search Queries

Title Rule based Approach for Word Normalization by resolving Transcription Ambiguity in Transliterated Search Queries
Authors Varsha Pathak, Manish Joshi
Abstract Matching query terms against document terms is the basic function of any best-effort Information Retrieval model, such as the Vector Space Model. In our problem of SMS-based Information Systems, we expect common people to participate in information search. Our system allows mobile users to formulate their queries in their own words, with their own transliteration style and spelling. To achieve this flexibility, we have resolved the term-level ambiguity caused by inherent transcription noise in user query terms. We have developed a rule-based approach that selects the closest relevant standard term for each noisy term in the user query. We have used four different versions of the rule-based algorithm, each with a variation in the rule set. The rule set incorporates the basic Levenshtein minimum edit distance algorithm for term matching. This paper presents the experiments and corresponding results for a Marathi and Hindi literature information system. We have experimented on Marathi and Hindi literature, which includes songs, gazals, powadas, bharud, and other types, in a standard transliteration form such as ITRANS.
Tasks Information Retrieval, Transliteration
Published 2019-10-16
URL https://arxiv.org/abs/1910.07233v1
PDF https://arxiv.org/pdf/1910.07233v1.pdf
PWC https://paperswithcode.com/paper/rule-based-approach-for-word-normalization-by
Repo
Framework
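
The abstract above names the Levenshtein minimum edit distance as one ingredient of its rule set. As a minimal sketch of that single ingredient only (not the authors' rule-based algorithm), the following computes the distance and picks the closest standard term for a noisy query term. The function names and the sample vocabulary are hypothetical and chosen for illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


def closest_term(noisy, vocabulary):
    """Return the vocabulary term with the smallest edit distance
    to the noisy term; the first such term wins ties."""
    return min(vocabulary, key=lambda term: levenshtein(noisy, term))


# Hypothetical transliterated vocabulary, for illustration only.
vocabulary = ["prema", "pyaar", "dil"]
best = closest_term("prem", vocabulary)
```

Plain edit distance treats every character substitution alike; a rule set such as the one the abstract describes would presumably add transliteration-specific rules on top of it.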

Memory-Efficient Adaptive Optimization

Title Memory-Efficient Adaptive Optimization
Authors Rohan Anil, Vineet Gupta, Tomer Koren, Yoram Singer
Abstract Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for achieving state-of-the-art performance in machine translation and language modeling. However, these methods maintain second-order statistics for each parameter, thus introducing significant memory overheads that restrict the size of the model being used as well as the number of examples in a mini-batch. We describe an effective and flexible adaptive optimization method with greatly reduced memory overhead. Our method retains the benefits of per-parameter adaptivity while allowing significantly larger models and batch sizes. We give convergence guarantees for our method, and demonstrate its effectiveness in training very large translation and language models with up to 2-fold speedups compared to the state-of-the-art.
Tasks Language Modelling, Machine Translation
Published 2019-01-30
URL https://arxiv.org/abs/1901.11150v2
PDF https://arxiv.org/pdf/1901.11150v2.pdf
PWC https://paperswithcode.com/paper/memory-efficient-adaptive-optimization-for
Repo
Framework

Improving Semantic Parsing with Neural Generator-Reranker Architecture

Title Improving Semantic Parsing with Neural Generator-Reranker Architecture
Authors Huseyin A. Inan, Gaurav Singh Tomar, Huapu Pan
Abstract Semantic parsing is the problem of deriving machine interpretable meaning representations from natural language utterances. Neural models with encoder-decoder architectures have recently achieved substantial improvements over traditional methods. Although neural semantic parsers appear to have relatively high recall using large beam sizes, there is room for improvement with respect to one-best precision. In this work, we propose a generator-reranker architecture for semantic parsing. The generator produces a list of potential candidates and the reranker, which consists of a pre-processing step for the candidates followed by a novel critic network, reranks these candidates based on the similarity between each candidate and the input sentence. We show the advantages of this approach along with how it improves the parsing performance through extensive analysis. We evaluate our model on three semantic parsing datasets (GEO, ATIS, and OVERNIGHT). The overall architecture achieves state-of-the-art results on all three datasets.
Tasks Semantic Parsing
Published 2019-09-27
URL https://arxiv.org/abs/1909.12764v1
PDF https://arxiv.org/pdf/1909.12764v1.pdf
PWC https://paperswithcode.com/paper/improving-semantic-parsing-with-neural
Repo
Framework

How to Evaluate the Next System: Automatic Dialogue Evaluation from the Perspective of Continual Learning

Title How to Evaluate the Next System: Automatic Dialogue Evaluation from the Perspective of Continual Learning
Authors Lu Li, Zhongheng He, Xiangyang Zhou, Dianhai Yu
Abstract Automatic dialogue evaluation plays a crucial role in open-domain dialogue research. Previous works train neural networks with limited annotation for conducting automatic dialogue evaluation, which naturally affects evaluation fairness, because dialogue systems close to the scope of the training corpus are favored over the others. In this paper, we study alleviating this problem from the perspective of continual learning: given an existing neural dialogue evaluator and the next system to be evaluated, we fine-tune the learned neural evaluator by selectively forgetting/updating its parameters, to jointly fit dialogue systems that have been and will be evaluated. Our motivation is to seek a lifelong and low-cost automatic evaluation for dialogue systems, rather than to reconstruct the evaluator over and over again. Experimental results show that our continual evaluator achieves comparable performance to reconstructing new evaluators, while requiring significantly lower resources.
Tasks Continual Learning
Published 2019-12-10
URL https://arxiv.org/abs/1912.04664v1
PDF https://arxiv.org/pdf/1912.04664v1.pdf
PWC https://paperswithcode.com/paper/how-to-evaluate-the-next-system-automatic
Repo
Framework

Marpa, A practical general parser: the recognizer

Title Marpa, A practical general parser: the recognizer
Authors Jeffrey Kegler
Abstract The Marpa recognizer is described. Marpa is a practical and fully implemented algorithm for the recognition, parsing and evaluation of context-free grammars. The Marpa recognizer is the first to unite the improvements to Earley’s algorithm found in Joop Leo’s 1991 paper with those in Aycock and Horspool’s 2002 paper. Marpa tracks the full state of the parse, as it proceeds, in a form convenient for the application. This greatly improves error detection and enables event-driven parsing. One such technique is “Ruby Slippers” parsing, in which the input is altered in response to the parser’s expectations.
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.08129v1
PDF https://arxiv.org/pdf/1910.08129v1.pdf
PWC https://paperswithcode.com/paper/marpa-a-practical-general-parser-the
Repo
Framework

ExpertoCoder: Capturing Divergent Brain Regions Using Mixture of Regression Experts

Title ExpertoCoder: Capturing Divergent Brain Regions Using Mixture of Regression Experts
Authors Subba Reddy Oota, Naresh Manwani, Raju S. Bapi
Abstract fMRI semantic category understanding using linguistic encoding models attempts to learn a forward mapping that relates stimuli to the corresponding brain activation. Classical encoding models use linear multivariate methods to predict brain activation (all the voxels) given the stimulus. However, these methods mainly assume multiple regions as one vast uniform region or several independent regions, ignoring connections among them. In this paper, we present a mixture of experts model for predicting brain activity patterns. Given a new stimulus, the model predicts the entire brain activation as a weighted linear combination of activation of multiple experts. We argue that each expert captures activity patterns related to a particular region of interest (ROI) in the human brain. Thus, the utility of the proposed model is twofold. It not only accurately predicts the brain activation for a given stimulus, but it also reveals the level of activation of individual brain regions. Results of our experiments highlight the importance of the proposed model for predicting brain activation. This study also helps in understanding which of the brain regions get activated together, given a certain kind of stimulus. Importantly, we suggest that the mixture of regression experts (MoRE) framework successfully combines the two principles of organization of function in the brain, namely that of specialization and integration.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.12299v1
PDF https://arxiv.org/pdf/1909.12299v1.pdf
PWC https://paperswithcode.com/paper/expertocoder-capturing-divergent-brain
Repo
Framework
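
The ExpertoCoder abstract describes the prediction as a weighted linear combination of the activations of multiple experts. A minimal sketch of that combination follows, assuming linear regression experts and a softmax gate over a linear score; both choices, along with every name and the toy numbers, are illustrative assumptions and not details taken from the paper.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def predict(stimulus, expert_weights, gate_weights):
    """Combine K linear experts into one activation vector.

    expert_weights: K matrices, each of shape (voxels x features).
    gate_weights:   one matrix of shape (K x features), an assumed gate.
    Returns the predicted activation and the per-expert gate weights.
    """
    gates = softmax(matvec(gate_weights, stimulus))
    outputs = [matvec(weights, stimulus) for weights in expert_weights]
    n_voxels = len(outputs[0])
    prediction = [
        sum(g * out[v] for g, out in zip(gates, outputs))
        for v in range(n_voxels)
    ]
    return prediction, gates


# Toy setup: two experts, two features, two "voxels".
experts = [[[1.0, 0.0], [0.0, 1.0]], [[3.0, 0.0], [0.0, 3.0]]]
gate = [[0.0, 0.0], [0.0, 0.0]]  # zero scores give equal gate weights
activation, gates = predict([1.0, 2.0], experts, gate)
```

In a sketch like this, the returned gate weights are what one would inspect to see how strongly each expert contributes for a given stimulus.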

Back-Projection based Fidelity Term for Ill-Posed Linear Inverse Problems

Title Back-Projection based Fidelity Term for Ill-Posed Linear Inverse Problems
Authors Tom Tirer, Raja Giryes
Abstract Ill-posed linear inverse problems appear in many image processing applications, such as deblurring, super-resolution and compressed sensing. Many restoration strategies involve minimizing a cost function, which is composed of fidelity and prior terms, balanced by a regularization parameter. While a vast amount of research has been focused on different prior models, the fidelity term is almost always chosen to be the least squares (LS) objective, that encourages fitting the linearly transformed optimization variable to the observations. In this paper, we examine a different fidelity term, which has been implicitly used by the recently proposed iterative denoising and backward projections (IDBP) framework. This term encourages agreement between the projection of the optimization variable onto the row space of the linear operator and the pseudo-inverse of the linear operator (“back-projection”) applied on the observations. We analytically examine the difference between the two fidelity terms for Tikhonov regularization and identify cases (such as a badly conditioned linear operator) where the new term has an advantage over the standard LS one. Moreover, we demonstrate empirically that the behavior of the two induced cost functions for sophisticated convex and non-convex priors, such as total-variation, BM3D, and deep generative models, correlates with the obtained theoretical analysis.
Tasks Deblurring, Denoising, Super-Resolution
Published 2019-06-16
URL https://arxiv.org/abs/1906.06794v2
PDF https://arxiv.org/pdf/1906.06794v2.pdf
PWC https://paperswithcode.com/paper/back-projection-based-fidelity-term-for-ill
Repo
Framework
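
The two fidelity terms contrasted in the back-projection abstract can be written side by side. The notation here is an assumption for illustration: y denotes the observations, A the linear operator, x the optimization variable, and A^† the pseudo-inverse of A; the factor 1/2 is a common convention, not something stated in the abstract.

```latex
\ell_{\mathrm{LS}}(x) = \tfrac{1}{2}\,\lVert y - A x \rVert_2^2,
\qquad
\ell_{\mathrm{BP}}(x) = \tfrac{1}{2}\,\lVert A^{\dagger} y - A^{\dagger} A x \rVert_2^2
                      = \tfrac{1}{2}\,\lVert A^{\dagger} (y - A x) \rVert_2^2 .
```

Here A^†A is the orthogonal projector onto the row space of A, which matches the abstract's description of comparing the projected variable with the back-projected observations. Under the additional assumption that A has full row rank, A^† = A^T (A A^T)^{-1}, and the back-projection term reduces to (1/2) (y - Ax)^T (A A^T)^{-1} (y - Ax), that is, a least squares residual reweighted by (A A^T)^{-1}.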

Label Dependent Deep Variational Paraphrase Generation

Title Label Dependent Deep Variational Paraphrase Generation
Authors Siamak Shakeri, Abhinav Sethy
Abstract Generating paraphrases that are lexically similar but semantically different is a challenging task. Paraphrases of this form can be used to augment data sets for various NLP tasks such as machine reading comprehension and question answering with non-trivial negative examples. In this article, we propose a deep variational model to generate paraphrases conditioned on a label that specifies whether the paraphrases are semantically related or not. We also present new training recipes and KL regularization techniques that improve the performance of variational paraphrasing models. Our proposed model demonstrates promising results in enhancing the generative power of the model by employing label-dependent generation on paraphrasing datasets.
Tasks Machine Reading Comprehension, Paraphrase Generation, Question Answering, Reading Comprehension
Published 2019-11-27
URL https://arxiv.org/abs/1911.11952v1
PDF https://arxiv.org/pdf/1911.11952v1.pdf
PWC https://paperswithcode.com/paper/label-dependent-deep-variational-paraphrase
Repo
Framework

Decentralized Multi-Agent Actor-Critic with Generative Inference

Title Decentralized Multi-Agent Actor-Critic with Generative Inference
Authors Kevin Corder, Manuel M. Vindiola, Keith Decker
Abstract Recent multi-agent actor-critic methods have utilized centralized training with decentralized execution to address the non-stationarity of co-adapting agents. This training paradigm constrains learning to the centralized phase such that only pre-learned policies may be used during the decentralized phase, which performs poorly when agent communications are delayed, noisy, or disrupted. In this work, we propose a new system that can gracefully handle partially-observable information due to communication disruptions during decentralized execution. Our approach augments the multi-agent actor-critic method’s centralized training phase with generative modeling so that agents may infer other agents’ observations when provided with locally available context. Our method is evaluated on three tasks that require agents to combine local and remote observations communicated by other agents. We evaluate our approach by introducing partial observability during decentralized execution, and show that decentralized training on inferred observations performs as well or better than existing actor-critic methods.
Tasks
Published 2019-10-07
URL https://arxiv.org/abs/1910.03058v1
PDF https://arxiv.org/pdf/1910.03058v1.pdf
PWC https://paperswithcode.com/paper/decentralized-multi-agent-actor-critic-with
Repo
Framework

Winning Isn’t Everything: Enhancing Game Development with Intelligent Agents

Title Winning Isn’t Everything: Enhancing Game Development with Intelligent Agents
Authors Yunqi Zhao, Igor Borovikov, Fernando de Mesentier Silva, Ahmad Beirami, Jason Rupert, Caedmon Somers, Jesse Harder, John Kolen, Jervis Pinto, Reza Pourabolghasem, James Pestrak, Harold Chaput, Mohsen Sardari, Long Lin, Sundeep Narravula, Navid Aghdaie, Kazi Zaman
Abstract Recently, there have been several high-profile achievements of agents learning to play games against humans and beat them. In this paper, we study the problem of training intelligent agents in service of game development. Unlike the agents built to “beat the game”, our agents aim to produce human-like behavior to help with game evaluation and balancing. We discuss two fundamental metrics based on which we measure the human-likeness of agents, namely skill and style, which are multi-faceted concepts with practical implications outlined in this paper. We report four case studies in which the style and skill requirements inform the choice of algorithms and metrics used to train agents, ranging from A* search to state-of-the-art deep reinforcement learning. We further show that the learning potential of state-of-the-art deep RL models does not seamlessly transfer from the benchmark environments to target ones without heavily tuning their hyperparameters, leading to linear scaling of the engineering efforts and computational cost with the number of target domains.
Tasks
Published 2019-03-25
URL https://arxiv.org/abs/1903.10545v3
PDF https://arxiv.org/pdf/1903.10545v3.pdf
PWC https://paperswithcode.com/paper/winning-isnt-everything-training-human-like
Repo
Framework

Understanding Important Features of Deep Learning Models for Transmission Electron Microscopy Image Segmentation

Title Understanding Important Features of Deep Learning Models for Transmission Electron Microscopy Image Segmentation
Authors James P. Horwath, Dmitri N. Zakharov, Remi Megret, Eric A. Stach
Abstract Cutting-edge deep learning techniques allow for image segmentation with great speed and accuracy. However, application to problems in materials science is often difficult since these complex models may have difficulty learning physical parameters. In situ electron microscopy provides a clear platform for utilizing automated image analysis. In this work we consider the case of studying coarsening dynamics in supported nanoparticles, which is important for understanding e.g. the degradation of industrial catalysts. By systematically studying dataset preparation, neural network architecture, and accuracy evaluation, we describe important considerations in applying deep learning to physical applications, where generalizable and convincing models are required.
Tasks Electron Microscopy Image Segmentation, Semantic Segmentation
Published 2019-12-12
URL https://arxiv.org/abs/1912.06077v1
PDF https://arxiv.org/pdf/1912.06077v1.pdf
PWC https://paperswithcode.com/paper/understanding-important-features-of-deep
Repo
Framework

Distributionally Robust Optimization: A Review

Title Distributionally Robust Optimization: A Review
Authors Hamed Rahimian, Sanjay Mehrotra
Abstract The concepts of risk-aversion, chance-constrained optimization, and robust optimization have developed significantly over the last decade. The statistical learning community has also witnessed rapid theoretical and applied growth by relying on these concepts. A modeling framework, called distributionally robust optimization (DRO), has recently received significant attention in both the operations research and statistical learning communities. This paper surveys the main concepts and contributions to DRO, and its relationships with robust optimization, risk-aversion, chance-constrained optimization, and function regularization.
Tasks
Published 2019-08-13
URL https://arxiv.org/abs/1908.05659v1
PDF https://arxiv.org/pdf/1908.05659v1.pdf
PWC https://paperswithcode.com/paper/distributionally-robust-optimization-a-review
Repo
Framework

Learning Local Forward Models on Unforgiving Games

Title Learning Local Forward Models on Unforgiving Games
Authors Alexander Dockhorn, Simon M. Lucas, Vanessa Volz, Ivan Bravi, Raluca D. Gaina, Diego Perez-Liebana
Abstract This paper examines learning approaches for forward models based on local cell transition functions. We provide a formal definition of local forward models for which we propose two basic learning approaches. Our analysis is based on the game Sokoban, where a wrong action can lead to an unsolvable game state. Therefore, an accurate prediction of an action’s resulting state is necessary to avoid this scenario. In contrast to learning the complete state transition function, local forward models allow extracting multiple training examples from a single state transition. In this way, the Hash Set model, as well as the Decision Tree model, quickly learn to predict upcoming state transitions of both the training and the test set. Applying the model using a statistical forward planner showed that the best models can be used to a satisfying degree even in cases in which the test levels have not yet been seen. Our evaluation includes an analysis of various local neighbourhood patterns and sizes to test the learners’ capabilities in case too few or too many attributes are extracted, of which the latter has been shown to degrade the performance of the model learner.
Tasks
Published 2019-09-01
URL https://arxiv.org/abs/1909.00442v1
PDF https://arxiv.org/pdf/1909.00442v1.pdf
PWC https://paperswithcode.com/paper/learning-local-forward-models-on-unforgiving
Repo
Framework
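
The abstract's point that a local forward model extracts multiple training examples from a single state transition can be sketched on a tiny grid. This is a rough illustration under stated assumptions, not the paper's implementation: the 3x3 neighbourhood, the wall padding, the dictionary lookup (a loose stand-in for the idea behind the Hash Set model), the fallback to an unchanged cell, and all function names are hypothetical.

```python
def neighbourhood(grid, row, col, pad="#"):
    """Return the 3x3 pattern around (row, col) as a tuple,
    treating cells outside the grid as walls."""
    cells = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            cells.append(grid[r][c] if inside else pad)
    return tuple(cells)


def extract_examples(state, action, next_state):
    """Turn one state transition into one example per cell:
    (local pattern, action) -> value of that cell in the next state."""
    return [
        ((neighbourhood(state, r, c), action), next_state[r][c])
        for r in range(len(state))
        for c in range(len(state[0]))
    ]


def train(transitions):
    """Store every observed local transition in a lookup table."""
    model = {}
    for state, action, next_state in transitions:
        model.update(extract_examples(state, action, next_state))
    return model


def predict(model, state, action):
    """Predict the next state cell by cell; a pattern that was never
    observed leaves its cell unchanged (an assumed fallback)."""
    return [
        "".join(
            model.get((neighbourhood(state, r, c), action), state[r][c])
            for c in range(len(state[0]))
        )
        for r in range(len(state))
    ]


# One observed transition: the player '@' steps right in a corridor.
before = ["######", "#.@..#", "######"]
after = ["######", "#..@.#", "######"]
examples = extract_examples(before, "right", after)
model = train([(before, "right", after)])

# A longer corridor that was never seen during training.
unseen = ["#######", "#..@..#", "#######"]
predicted = predict(model, unseen, "right")
```

The single 3x6 transition yields 18 local examples, and the local patterns around the player recur in the longer corridor, so the lookup table moves the player one step right there as well. A 3x3 window is too small to capture every Sokoban rule (pushing a box depends on cells further away), which is one reason neighbourhood shape and size matter.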

Slanted Stixels: A way to represent steep streets

Title Slanted Stixels: A way to represent steep streets
Authors Daniel Hernandez-Juarez, Lukas Schneider, Pau Cebrian, Antonio Espinosa, David Vazquez, Antonio M. Lopez, Uwe Franke, Marc Pollefeys, Juan C. Moure
Abstract This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm, and then achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discarding the unlikely Stixel cuts, and apply the algorithm only on the remaining Stixel cuts. This work presents a novel over-segmentation strategy based on a Fully Convolutional Network (FCN), which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed methods in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset.
Tasks
Published 2019-10-02
URL https://arxiv.org/abs/1910.01466v1
PDF https://arxiv.org/pdf/1910.01466v1.pdf
PWC https://paperswithcode.com/paper/slanted-stixels-a-way-to-represent-steep
Repo
Framework