Paper Group NANR 206
On Acceleration with Noise-Corrupted Gradients
Title | On Acceleration with Noise-Corrupted Gradients |
Authors | Michael Cohen, Jelena Diakonikolas, Lorenzo Orecchia |
Abstract | Accelerated algorithms have broad applications in large-scale optimization, due to their generality and fast convergence. However, their stability in the practical setting of noise-corrupted gradient oracles is not well understood. This paper provides two main technical contributions: (i) a new accelerated method, AGDP, that generalizes Nesterov’s AGD and improves on the recent method AXGD (Diakonikolas & Orecchia, 2018), and (ii) a theoretical study of accelerated algorithms under noisy and inexact gradient oracles, which is supported by numerical experiments. This study leverages the simplicity of AGDP and its analysis to clarify the interaction between noise and acceleration and to suggest modifications to the algorithm that reduce the mean and variance of the error incurred due to the gradient noise. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2173 |
http://proceedings.mlr.press/v80/cohen18a/cohen18a.pdf | |
PWC | https://paperswithcode.com/paper/on-acceleration-with-noise-corrupted |
Repo | |
Framework | |
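The setting in the abstract above, an accelerated method that only sees noise-corrupted gradients, can be illustrated with a minimal sketch. It uses textbook Nesterov AGD with constant momentum on a strongly convex quadratic; it is not the paper's AGDP. The objective, the additive Gaussian noise model, the step size 1/L, and all function names are illustrative assumptions.

```python
import math
import random


def noisy_grad(x, diag, sigma, rng):
    """Gradient of f(x) = 0.5 * sum(d_i * x_i^2) plus additive Gaussian noise (assumed noise model)."""
    return [d * xi + rng.gauss(0.0, sigma) for d, xi in zip(diag, x)]


def objective(x, diag):
    """Value of the quadratic f(x) = 0.5 * sum(d_i * x_i^2)."""
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))


def nesterov_agd(x0, diag, steps, sigma=0.0, seed=0):
    """Textbook Nesterov AGD for an L-smooth, mu-strongly convex quadratic (not AGDP)."""
    rng = random.Random(seed)
    smooth, strong = max(diag), min(diag)  # L and mu of the quadratic
    root_kappa = math.sqrt(smooth / strong)
    beta = (root_kappa - 1.0) / (root_kappa + 1.0)  # constant momentum
    x = list(x0)
    y = list(x0)
    for _ in range(steps):
        g = noisy_grad(y, diag, sigma, rng)  # oracle is queried at the extrapolated point
        x_next = [yi - gi / smooth for yi, gi in zip(y, g)]  # gradient step, step size 1/L
        y = [xn + beta * (xn - xi) for xn, xi in zip(x_next, x)]  # momentum extrapolation
        x = x_next
    return x
```

Running `nesterov_agd` once with `sigma=0.0` and once with `sigma > 0` on the same problem gives a simple way to see how gradient noise changes the final objective value.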
Unsupervised Class-Specific Deblurring
Title | Unsupervised Class-Specific Deblurring |
Authors | Thekke Madam Nimisha, Kumar Sunil, A. N. Rajagopalan |
Abstract | In this paper, we present an end-to-end deblurring network designed specifically for a class of data. Unlike the prior supervised deep-learning works that extensively rely on large sets of paired data, which are demanding and challenging to obtain, we propose an unsupervised training scheme with unpaired data to achieve the same. Our model consists of a Generative Adversarial Network (GAN) that learns a strong prior on the clean image domain using adversarial loss and maps the blurred image to its clean equivalent. To improve the stability of the GAN and to preserve the image correspondence, we introduce an additional CNN module that reblurs the generated GAN output to match the blurred input. Along with these two modules, we also make use of the blurred image itself to self-guide the network to constrain the solution space of generated clean images. This self-guidance is achieved by imposing a scale-space gradient error with an additional gradient module. We train our model on different classes and observe that adding the reblur and gradient modules leads to better convergence. Extensive experiments demonstrate that our method performs favorably against the state-of-the-art supervised methods on both synthetic and real-world images even in the absence of any supervision. |
Tasks | Deblurring |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Nimisha_T_M_Unsupervised_Class-Specific_Deblurring_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Nimisha_T_M_Unsupervised_Class-Specific_Deblurring_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-class-specific-deblurring |
Repo | |
Framework | |
Low-resource Cross-lingual Event Type Detection via Distant Supervision with Minimal Effort
Title | Low-resource Cross-lingual Event Type Detection via Distant Supervision with Minimal Effort |
Authors | Aldrian Obaja Muis, Naoki Otani, Nidhi Vyas, Ruochen Xu, Yiming Yang, Teruko Mitamura, Eduard Hovy |
Abstract | |
Tasks | Domain Adaptation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/papers/C18-1007/c18-1007 |
https://www.aclweb.org/anthology/C18-1007v2 | |
PWC | https://paperswithcode.com/paper/low-resource-cross-lingual-event-type |
Repo | |
Framework | |
Points, Paths, and Playscapes: Large-scale Spatial Language Understanding Tasks Set in the Real World
Title | Points, Paths, and Playscapes: Large-scale Spatial Language Understanding Tasks Set in the Real World |
Authors | Jason Baldridge, Tania Bedrax-Weiss, Daphne Luong, Srini Narayanan, Bo Pang, Fernando Pereira, Radu Soricut, Michael Tseng, Yuan Zhang |
Abstract | |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/papers/W18-1406/w18-1406 |
https://www.aclweb.org/anthology/W18-1406v2 | |
PWC | https://paperswithcode.com/paper/points-paths-and-playscapes-large-scale |
Repo | |
Framework | |
Robust and Scalable Models of Microbiome Dynamics
Title | Robust and Scalable Models of Microbiome Dynamics |
Authors | Travis Gibson, Georg Gerber |
Abstract | Microbes are everywhere, including in and on our bodies, and have been shown to play key roles in a variety of prevalent human diseases. Consequently, there has been intense interest in the design of bacteriotherapies or “bugs as drugs,” which are communities of bacteria administered to patients for specific therapeutic applications. Central to the design of such therapeutics is an understanding of the causal microbial interaction network and the population dynamics of the organisms. In this work we present a Bayesian nonparametric model and associated efficient inference algorithm that address the key conceptual and practical challenges of learning microbial dynamics from time series microbe abundance data. These challenges include high-dimensional (300+ strains of bacteria in the gut) but temporally sparse and non-uniformly sampled data; high measurement noise; and nonlinear and physically non-negative dynamics. Our contributions include a new type of dynamical systems model for microbial dynamics based on what we term interaction modules, or learned clusters of latent variables with redundant interaction structure (reducing the expected number of interaction coefficients from O(n^2) to O((log n)^2)); a fully Bayesian formulation of the stochastic dynamical systems model that propagates measurement and latent state uncertainty throughout the model; and the introduction of a temporally varying auxiliary variable technique to enable efficient inference by relaxing the hard non-negativity constraint on states. We apply our method to simulated and real data, and demonstrate the utility of our technique for system identification from limited data and for gaining new biological insights into bacteriotherapy design. |
Tasks | Time Series |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2382 |
http://proceedings.mlr.press/v80/gibson18a/gibson18a.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-scalable-models-of-microbiome |
Repo | |
Framework | |
Learning what to learn in a neural program
Title | Learning what to learn in a neural program |
Authors | Richard Shin, Dawn Song |
Abstract | Learning programs with neural networks is a challenging task, addressed by a long line of existing work. It is difficult to learn neural networks which will generalize to problem instances that are much larger than those used during training. Furthermore, even when the learned neural program empirically works on all test inputs, we cannot verify that it will work on every possible input. Recent work has shown that it is possible to address these issues by using recursion in the Neural Programmer-Interpreter, but this technique requires a verification set which is difficult to construct without knowledge of the internals of the oracle used to generate training data. In this work, we show how to automatically build such a verification set, which can also be directly used for training. By interactively querying an oracle, we can construct this set with minimal additional knowledge about the oracle. We empirically demonstrate that our method allows automated learning and verification of a recursive NPI program with provably perfect generalization. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BJ4prNx0W |
https://openreview.net/pdf?id=BJ4prNx0W | |
PWC | https://paperswithcode.com/paper/learning-what-to-learn-in-a-neural-program |
Repo | |
Framework | |
Learning Dynamic State Abstractions for Model-Based Reinforcement Learning
Title | Learning Dynamic State Abstractions for Model-Based Reinforcement Learning |
Authors | Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra |
Abstract | A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models. We show that carefully designed models that learn predictive and compact state representations, also called state-space models, substantially reduce the computational costs for predicting outcomes of sequences of actions. Extensive experiments establish that state-space models accurately capture the dynamics of Atari games from the Arcade Learning Environment (ALE) from raw pixels. Furthermore, RL agents that use Monte-Carlo rollouts of these models as features for decision making outperform strong model-free baselines on the game MS_PACMAN, demonstrating the benefits of planning using learned dynamic state abstractions. |
Tasks | Atari Games, Decision Making |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJw8fAgA- |
https://openreview.net/pdf?id=HJw8fAgA- | |
PWC | https://paperswithcode.com/paper/learning-dynamic-state-abstractions-for-model |
Repo | |
Framework | |
Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos
Title | Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos |
Authors | Bingbin Liu, Serena Yeung, Edward Chou, De-An Huang, Li Fei-Fei, Juan Carlos Niebles |
Abstract | A major challenge in computer vision is scaling activity understanding to the long tail of complex activities without requiring the collection of large quantities of data for new actions. The task of video retrieval using natural language descriptions seeks to address this through rich, unconstrained supervision about complex activities. However, while this formulation offers hope of leveraging underlying compositional structure in activity descriptions, existing approaches typically do not explicitly model compositional reasoning. In this work, we introduce an approach for explicitly and dynamically reasoning about compositional natural language descriptions of activity in videos. We take a modular neural network approach that, given a natural language query, extracts the semantic structure to assemble a compositional neural network layout and corresponding network modules. We show that this approach is able to achieve state-of-the-art results on the DiDeMo video retrieval dataset. |
Tasks | Video Retrieval |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Bingbin_Liu_Temporal_Modular_Networks_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Bingbin_Liu_Temporal_Modular_Networks_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/temporal-modular-networks-for-retrieving |
Repo | |
Framework | |
Literality and cognitive effort: Japanese and Spanish
Title | Literality and cognitive effort: Japanese and Spanish |
Authors | Isabel Lacruz, Michael Carl, Masaru Yamada |
Abstract | |
Tasks | Speech Synthesis |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1603/ |
https://www.aclweb.org/anthology/L18-1603 | |
PWC | https://paperswithcode.com/paper/literality-and-cognitive-effort-japanese-and |
Repo | |
Framework | |
Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks
Title | Nearest Neighbour Radial Basis Function Solvers for Deep Neural Networks |
Authors | Benjamin J. Meyer, Ben Harwood, Tom Drummond |
Abstract | We present a radial basis function solver for convolutional neural networks that can be directly applied to both distance metric learning and classification problems. Our method treats all training features from a deep neural network as radial basis function centres and computes loss by summing the influence of a feature’s nearby centres in the embedding space. Having a radial basis function centred on each training feature is made scalable by treating it as an approximate nearest neighbour search problem. End-to-end learning of the network and solver is carried out, mapping high dimensional features into clusters of the same class. This results in a well formed embedding space, where semantically related instances are likely to be located near one another, regardless of whether or not the network was trained on those classes. The same loss function is used for both the metric learning and classification problems. We show that our radial basis function solver outperforms state-of-the-art embedding approaches on the Stanford Cars196 and CUB-200-2011 datasets. Additionally, we show that when used as a classifier, our method outperforms a conventional softmax classifier on the CUB-200-2011, Stanford Cars196, Oxford 102 Flowers and Leafsnap fine-grained classification datasets. |
Tasks | Metric Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SkFEGHx0Z |
https://openreview.net/pdf?id=SkFEGHx0Z | |
PWC | https://paperswithcode.com/paper/nearest-neighbour-radial-basis-function |
Repo | |
Framework | |
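The idea in the abstract above, computing a loss by summing the influence of a feature's nearby radial basis function centres, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' solver: the Gaussian kernel, the normalisation by total neighbour influence, the brute-force neighbour search (the abstract describes an approximate nearest neighbour search), and the names `rbf_neighbour_loss`, `k` and `sigma` are all hypothetical.

```python
import math


def rbf_neighbour_loss(query, label, centres, centre_labels, k=3, sigma=1.0):
    """Negative log of the share of RBF influence that comes from same-class centres."""
    # Brute-force search over all centres stands in for an approximate
    # nearest-neighbour index; only the k closest centres contribute.
    scored = sorted(
        (sum((q - c) ** 2 for q, c in zip(query, centre)), lab)
        for centre, lab in zip(centres, centre_labels)
    )[:k]
    # Gaussian RBF influence of each neighbouring centre (assumed kernel).
    influence = [(math.exp(-d2 / (2.0 * sigma ** 2)), lab) for d2, lab in scored]
    total = sum(w for w, _ in influence)
    same = sum(w for w, lab in influence if lab == label)
    eps = 1e-12  # avoids log(0) when no same-class centre is nearby
    return -math.log((same + eps) / (total + eps))
```

The loss is near zero when all neighbouring centres share the query's class and grows as other classes dominate the neighbourhood, which is one way to encourage same-class features to cluster.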
A fine-grained error analysis of NMT, SMT and RBMT output for English-to-Dutch
Title | A fine-grained error analysis of NMT, SMT and RBMT output for English-to-Dutch |
Authors | Laura Van Brussel, Arda Tezcan, Lieve Macken |
Abstract | |
Tasks | Machine Translation, Word Sense Disambiguation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1600/ |
https://www.aclweb.org/anthology/L18-1600 | |
PWC | https://paperswithcode.com/paper/a-fine-grained-error-analysis-of-nmt-smt-and |
Repo | |
Framework | |
Consensus Maximization for Semantic Region Correspondences
Title | Consensus Maximization for Semantic Region Correspondences |
Authors | Pablo Speciale, Danda P. Paudel, Martin R. Oswald, Hayko Riemenschneider, Luc Van Gool, Marc Pollefeys |
Abstract | We propose a novel method for the geometric registration of semantically labeled regions. We approximate semantic regions by ellipsoids, and leverage their convexity to formulate the correspondence search as a constrained optimization problem that maximizes the number of matched regions, which we solve to global optimality in a branch-and-bound fashion. To this end, we derive suitable linear matrix inequality constraints which describe ellipsoid-to-ellipsoid assignment conditions. Our approach is robust to large percentages of outliers and thus applicable to difficult correspondence search problems. In multiple experiments we demonstrate the flexibility and robustness of our approach on a number of challenging vision problems. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Speciale_Consensus_Maximization_for_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Speciale_Consensus_Maximization_for_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/consensus-maximization-for-semantic-region |
Repo | |
Framework | |
Katyusha X: Simple Momentum Method for Stochastic Sum-of-Nonconvex Optimization
Title | Katyusha X: Simple Momentum Method for Stochastic Sum-of-Nonconvex Optimization |
Authors | Zeyuan Allen-Zhu |
Abstract | The problem of minimizing sum-of-nonconvex functions (i.e., convex functions that are averages of non-convex ones) is becoming increasingly important in machine learning, and is the core machinery for PCA, SVD, regularized Newton’s method, accelerated non-convex optimization, and more. We show how to provably obtain an accelerated stochastic algorithm for minimizing sum-of-nonconvex functions, by adding one additional line to the well-known SVRG method. This line corresponds to momentum, and shows how to directly apply momentum to the finite-sum stochastic minimization of sum-of-nonconvex functions. As a side result, our method enjoys linear parallel speed-up using mini-batch. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2017 |
http://proceedings.mlr.press/v80/allen-zhu18a/allen-zhu18a.pdf | |
PWC | https://paperswithcode.com/paper/katyusha-x-simple-momentum-method-for |
Repo | |
Framework | |
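For reference, the SVRG method that the abstract above builds on can be sketched as below. This is plain SVRG only: the additional momentum line that defines Katyusha X is not specified in the abstract and is deliberately not reproduced. The function signature, the use of the last inner iterate as the next snapshot, and the parameter names are illustrative assumptions.

```python
import random


def svrg(grad_i, n, w0, step, epochs, inner_steps, seed=0):
    """Plain SVRG for minimizing (1/n) * sum_i f_i(w); w is a list of floats."""
    rng = random.Random(seed)
    w_snap = list(w0)
    for _ in range(epochs):
        # Full gradient of the average objective at the snapshot point.
        full = [0.0] * len(w_snap)
        for i in range(n):
            full = [f + g / n for f, g in zip(full, grad_i(i, w_snap))]
        w = list(w_snap)
        for _ in range(inner_steps):
            i = rng.randrange(n)
            g_w = grad_i(i, w)
            g_s = grad_i(i, w_snap)
            # Variance-reduced estimate: unbiased for the full gradient at w.
            v = [a - b + c for a, b, c in zip(g_w, g_s, full)]
            w = [wi - step * vi for wi, vi in zip(w, v)]
        w_snap = w  # assumed choice: the last inner iterate becomes the new snapshot
    return w_snap
```

Here `grad_i(i, w)` is a caller-supplied function returning the gradient of the i-th component function at `w`, for example the gradient of a single least-squares residual.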
A dataset and baselines for sequential open-domain question answering
Title | A dataset and baselines for sequential open-domain question answering |
Authors | Ahmed Elgohary, Chen Zhao, Jordan Boyd-Graber |
Abstract | Previous work on question-answering systems mainly focuses on answering individual questions, assuming they are independent and devoid of context. Instead, we investigate sequential question answering, asking multiple related questions. We present QBLink, a new dataset of fully human-authored questions. We extend existing strong question answering frameworks to include previous questions to improve the overall question-answering accuracy in open-domain question answering. The dataset is publicly available at http://sequential.qanta.org. |
Tasks | Information Retrieval, Open-Domain Question Answering, Question Answering, Reading Comprehension |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1134/ |
https://www.aclweb.org/anthology/D18-1134 | |
PWC | https://paperswithcode.com/paper/a-dataset-and-baselines-for-sequential-open |
Repo | |
Framework | |
Learning temporal evolution of probability distribution with Recurrent Neural Network
Title | Learning temporal evolution of probability distribution with Recurrent Neural Network |
Authors | Kyongmin Yeo, Igor Melnyk, Nam Nguyen, Eun Kyung Lee |
Abstract | We propose to tackle a time series regression problem by computing the temporal evolution of a probability density function to provide a probabilistic forecast. A Recurrent Neural Network (RNN) based model is employed to learn a nonlinear operator for the temporal evolution of a probability density function. We use a softmax layer for a numerical discretization of a smooth probability density function, which transforms a function approximation problem into a classification task. Explicit and implicit regularization strategies are introduced to impose a smoothness condition on the estimated probability distribution. A Monte Carlo procedure to compute the temporal evolution of the distribution for a multiple-step forecast is presented. The evaluation of the proposed algorithm on three synthetic and two real data sets shows an advantage over the compared baselines. |
Tasks | Time Series |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BkDB51WR- |
https://openreview.net/pdf?id=BkDB51WR- | |
PWC | https://paperswithcode.com/paper/learning-temporal-evolution-of-probability |
Repo | |
Framework | |
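The discretization step described in the abstract above, where a softmax layer turns density estimation into classification, can be sketched as follows. This is a minimal illustration assuming equal-width bins on a fixed interval; the bin layout and the helper names `target_to_class`, `softmax` and `probs_to_density` are hypothetical and not taken from the paper.

```python
import math


def target_to_class(y, lo, hi, num_bins):
    """Map a continuous target y to the index of its equal-width bin on [lo, hi]."""
    width = (hi - lo) / num_bins
    k = int((y - lo) / width)
    return min(max(k, 0), num_bins - 1)  # clamp values on or beyond the edges


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probs_to_density(probs, lo, hi):
    """Piecewise-constant density: each bin probability divided by the bin width."""
    width = (hi - lo) / len(probs)
    return [p / width for p in probs]
```

Training then reduces to a cross-entropy loss between the softmax output and the bin index from `target_to_class`, and `probs_to_density` recovers a density that integrates to one over `[lo, hi]`.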