October 16, 2019

3073 words 15 mins read

Paper Group ANR 1055

Video Representation Learning Using Discriminative Pooling. 2D/3D Megavoltage Image Registration Using Convolutional Neural Networks. On Evaluating the Generalization of LSTM Models in Formal Languages. Introduction to the SP theory of intelligence. Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning. How to Read …

Video Representation Learning Using Discriminative Pooling

Title Video Representation Learning Using Discriminative Pooling
Authors Jue Wang, Anoop Cherian, Fatih Porikli, Stephen Gould
Abstract Popular deep models for action recognition in videos generate independent predictions for short clips, which are then pooled heuristically to assign an action label to the full video segment. As not all frames may characterize the underlying action—indeed, many are common across multiple actions—pooling schemes that impose equal importance on all frames might be unfavorable. In an attempt to tackle this problem, we propose discriminative pooling, based on the notion that among the deep features generated on all short clips, there is at least one that characterizes the action. To this end, we learn a (nonlinear) hyperplane that separates this unknown, yet discriminative, feature from the rest. Applying multiple instance learning in a large-margin setup, we use the parameters of this separating hyperplane as a descriptor for the full video segment. Since these parameters are directly related to the support vectors in a max-margin framework, they serve as robust representations for pooling of the features. We formulate a joint objective and an efficient solver that learns these hyperplanes per video and the corresponding action classifiers over the hyperplanes. Our pooling scheme is end-to-end trainable within a deep framework. We report results from experiments on three benchmark datasets spanning a variety of challenges and demonstrate state-of-the-art performance across these tasks.
Tasks Action Recognition In Videos, Multiple Instance Learning, Representation Learning, Temporal Action Localization
Published 2018-03-26
URL http://arxiv.org/abs/1803.10628v2
PDF http://arxiv.org/pdf/1803.10628v2.pdf
PWC https://paperswithcode.com/paper/video-representation-learning-using
Repo
Framework
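
The pooling idea above is concrete enough to sketch. Below is a minimal, illustrative version assuming clip-level features have already been extracted and using a fixed bag of generic features as negatives; the paper's end-to-end, multiple-instance formulation is richer than this, and the function name and dimensions here are illustrative, not the authors' code.

```python
import numpy as np
from sklearn.svm import LinearSVC

def discriminative_pool(clip_feats, neg_feats, C=1.0):
    """Pool one video's clip features into a single descriptor:
    fit a linear SVM separating this video's clips (positives) from
    a shared bag of generic features (negatives), then use the
    separating hyperplane's parameters as the video descriptor."""
    X = np.vstack([clip_feats, neg_feats])
    y = np.hstack([np.ones(len(clip_feats)), -np.ones(len(neg_feats))])
    return LinearSVC(C=C).fit(X, y).coef_.ravel()

# toy usage: 30 clips of one video, 100 generic negatives, 512-d features
rng = np.random.default_rng(0)
desc = discriminative_pool(rng.normal(size=(30, 512)),
                           rng.normal(size=(100, 512)))
print(desc.shape)  # (512,) -- hyperplane parameters as the video descriptor
```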

2D/3D Megavoltage Image Registration Using Convolutional Neural Networks

Title 2D/3D Megavoltage Image Registration Using Convolutional Neural Networks
Authors Hector N. B. Pinheiro, Tsang Ing Ren, Stefan Scheib, Armel Rosselet, Stefan Thieme-Marti
Abstract We present a 2D/3D MV image registration method based on a Convolutional Neural Network. Most traditional image registration methods are intensity-based, using optimization algorithms to maximize the similarity between two images. Although these methods can achieve good results for kilovoltage images, the same does not hold for megavoltage images due to their lower image quality. These methods also often have a limited capture range. To address this problem, we propose the use of a Convolutional Neural Network. The experiments were performed on a dataset of 50 brain images. The results are promising compared to traditional image registration methods.
Tasks Image Registration
Published 2018-11-28
URL http://arxiv.org/abs/1811.11816v1
PDF http://arxiv.org/pdf/1811.11816v1.pdf
PWC https://paperswithcode.com/paper/2d3d-megavoltage-image-registration-using
Repo
Framework
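
The abstract leaves the network design open; a common pattern for CNN-based 2D/3D registration is to regress rigid-transform parameters from the megavoltage projection stacked with a digitally reconstructed radiograph (DRR) of the CT. The toy model below sketches that pattern under those assumptions and is not the authors' architecture.

```python
import torch
import torch.nn as nn

class RegistrationCNN(nn.Module):
    """Toy CNN that regresses 6 rigid-transform parameters
    (3 rotations, 3 translations) from a 2-channel input: the MV
    projection stacked with a DRR rendered from the planning CT.
    This is an assumed design, not the paper's architecture."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 6)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# toy usage: a batch of 4 image pairs at 128x128
net = RegistrationCNN()
params = net(torch.randn(4, 2, 128, 128))
print(params.shape)  # torch.Size([4, 6])
```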

On Evaluating the Generalization of LSTM Models in Formal Languages

Title On Evaluating the Generalization of LSTM Models in Formal Languages
Authors Mirac Suzgun, Yonatan Belinkov, Stuart M. Shieber
Abstract Recurrent Neural Networks (RNNs) are theoretically Turing-complete and have established themselves as a dominant model for language processing. Yet uncertainty remains regarding their language-learning capabilities. In this paper, we empirically evaluate the ability of Long Short-Term Memory networks, a popular extension of simple RNNs, to inductively learn simple formal languages, in particular $a^nb^n$, $a^nb^nc^n$, and $a^nb^nc^nd^n$. We investigate the influence of various aspects of learning, such as training data regimes and model capacity, on generalization to unobserved samples. We find striking differences in model performance under different training settings and highlight the need for careful analysis and assessment when making claims about the learning capabilities of neural network models.
Tasks
Published 2018-11-02
URL http://arxiv.org/abs/1811.01001v1
PDF http://arxiv.org/pdf/1811.01001v1.pdf
PWC https://paperswithcode.com/paper/on-evaluating-the-generalization-of-lstm
Repo
Framework
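
The setup lends itself to a short sketch: train on next-symbol prediction over $a^nb^n$ strings with small $n$, then test on larger, unseen $n$. The PyTorch toy below assumes the standard next-symbol-prediction objective used in this line of work; hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# Vocabulary: 0 = 'a', 1 = 'b', 2 = end-of-sequence marker.
def anbn(n):
    return [0] * n + [1] * n + [2]

class CharLSTM(nn.Module):
    def __init__(self, vocab=3, hidden=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h)

model = CharLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train on n in [1, 20]; generalization is tested on larger, unseen n.
for step in range(2000):
    n = torch.randint(1, 21, (1,)).item()
    seq = torch.tensor(anbn(n))
    logits = model(seq[:-1].unsqueeze(0)).squeeze(0)
    loss = loss_fn(logits, seq[1:])
    opt.zero_grad(); loss.backward(); opt.step()

# After the first 'b', the rest of the string is deterministic, so
# next-symbol accuracy there measures whether the network "counted".
with torch.no_grad():
    seq = torch.tensor(anbn(50))  # longer than anything seen in training
    pred = model(seq[:-1].unsqueeze(0)).argmax(-1).squeeze(0)
    print((pred[50:] == seq[1:][50:]).float().mean().item())
```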

Introduction to the SP theory of intelligence

Title Introduction to the SP theory of intelligence
Authors J Gerard Wolff
Abstract This article provides a brief introduction to the “Theory of Intelligence” and its realisation in the “SP Computer Model”. The overall goal of the SP programme of research, in accordance with long-established principles in science, has been the simplification and integration of observations and concepts across artificial intelligence, mainstream computing, mathematics, and human learning, perception, and cognition. In broad terms, the SP system is a brain-like system that takes in “New” information through its senses and stores some or all of it as “Old” information. A central idea in the system is the powerful concept of “SP-multiple-alignment”, borrowed and adapted from bioinformatics. This is the key to the system’s versatility in aspects of intelligence, in the representation of diverse kinds of knowledge, and in the seamless integration of diverse aspects of intelligence and diverse kinds of knowledge, in any combination. There are many potential benefits and applications of the SP system. It is envisaged that the system will be developed as the “SP Machine”, which will initially be a software virtual machine hosted on a high-performance computer, serving as a vehicle for further research and a step towards the development of an industrial-strength SP Machine.
Tasks
Published 2018-02-24
URL http://arxiv.org/abs/1802.09924v1
PDF http://arxiv.org/pdf/1802.09924v1.pdf
PWC https://paperswithcode.com/paper/introduction-to-the-sp-theory-of-intelligence
Repo
Framework

Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning

Title Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning
Authors Can Karakus, Yifan Sun, Suhas Diggavi, Wotao Yin
Abstract Performance of distributed optimization and learning systems is bottlenecked by “straggler” nodes and slow communication links, which significantly delay computation. We propose a distributed optimization framework where the dataset is “encoded” to have an over-complete representation with built-in redundancy, and the straggling nodes are dynamically left out of the computation at every iteration, with their loss compensated by the embedded redundancy. We show that the oblivious application of several popular optimization algorithms to encoded data, including gradient descent, L-BFGS, proximal gradient under data parallelism, and coordinate descent under model parallelism, converges to either approximate or exact solutions of the original problem when stragglers are treated as erasures. These convergence results are deterministic, i.e., they establish sample-path convergence for arbitrary sequences of delay patterns or distributions on the nodes, and are independent of the tail behavior of the delay distribution. We demonstrate that equiangular tight frames have desirable properties as encoding matrices, and propose efficient mechanisms for encoding large-scale data. We implement the proposed technique on Amazon EC2 clusters, demonstrate its performance on several learning problems, including matrix factorization, LASSO, ridge regression and logistic regression, and compare the proposed method with uncoded, asynchronous, and data replication strategies.
Tasks Distributed Optimization
Published 2018-03-14
URL http://arxiv.org/abs/1803.05397v1
PDF http://arxiv.org/pdf/1803.05397v1.pdf
PWC https://paperswithcode.com/paper/redundancy-techniques-for-straggler
Repo
Framework
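
A toy rendition of the encoding idea for least squares, with a random Gaussian encoding matrix standing in for the equiangular tight frames the paper recommends, and stragglers simulated as rows dropped at each iteration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 200, 10, 400           # samples, features, encoded rows (2x redundancy)
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Encode: S gives an over-complete representation of the data. The paper
# advocates equiangular tight frames; a random matrix stands in here.
S = rng.normal(size=(m, n)) / np.sqrt(m)
SA, Sb = S @ A, S @ b

workers = np.array_split(np.arange(m), 8)   # data-parallel over 8 workers
x = np.zeros(d)
for it in range(500):
    alive = rng.permutation(8)[:6]          # 2 stragglers erased per iteration
    rows = np.concatenate([workers[w] for w in alive])
    grad = SA[rows].T @ (SA[rows] @ x - Sb[rows]) / len(rows)
    x -= 0.5 * grad

x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x - x_exact))  # small: redundancy compensates the erasures
```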

How to Read Paintings: Semantic Art Understanding with Multi-Modal Retrieval

Title How to Read Paintings: Semantic Art Understanding with Multi-Modal Retrieval
Authors Noa Garcia, George Vogiatzis
Abstract Automatic art analysis has mostly focused on classifying artworks into different artistic styles. However, understanding an artistic representation involves more complex processes, such as identifying the elements in the scene or recognizing author influences. We present SemArt, a multi-modal dataset for semantic art understanding. SemArt is a collection of fine-art painting images in which each image is associated with a number of attributes and a textual artistic comment, such as those that appear in art catalogues or museum collections. To evaluate semantic art understanding, we envisage the Text2Art challenge, a multi-modal retrieval task where relevant paintings are retrieved according to an artistic text, and vice versa. We also propose several models for encoding visual and textual artistic representations into a common semantic space. Our best approach finds the correct image within the top 10 ranked images for 45.5% of the test samples. Moreover, our models show remarkable levels of art understanding when compared against human evaluation.
Tasks Art Analysis
Published 2018-10-23
URL http://arxiv.org/abs/1810.09617v1
PDF http://arxiv.org/pdf/1810.09617v1.pdf
PWC https://paperswithcode.com/paper/how-to-read-paintings-semantic-art
Repo
Framework
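
A minimal sketch of a common-semantic-space model of the kind described: linear projections for the two modalities trained with a bidirectional max-margin ranking loss over in-batch negatives. The loss choice, dimensions, and training loop are assumptions, not the paper's exact models.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    """Projects visual and textual features into a shared space where
    matching painting/comment pairs score high cosine similarity."""
    def __init__(self, dim_img=2048, dim_txt=300, dim_joint=128):
        super().__init__()
        self.img_proj = nn.Linear(dim_img, dim_joint)
        self.txt_proj = nn.Linear(dim_txt, dim_joint)

    def forward(self, img, txt):
        return (F.normalize(self.img_proj(img), dim=-1),
                F.normalize(self.txt_proj(txt), dim=-1))

def ranking_loss(v, t, margin=0.2):
    """Bidirectional max-margin ranking loss over in-batch negatives."""
    sim = v @ t.T                               # unit vectors, so cosine
    pos = sim.diag()
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    i2t = (margin + sim - pos.unsqueeze(1)).clamp(min=0).masked_fill(mask, 0)
    t2i = (margin + sim - pos.unsqueeze(0)).clamp(min=0).masked_fill(mask, 0)
    return i2t.mean() + t2i.mean()

# toy usage: one training step, then text-to-image retrieval by cosine
model = JointEmbedding()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
img, txt = torch.randn(32, 2048), torch.randn(32, 300)
v, t = model(img, txt)
loss = ranking_loss(v, t)
opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    v, t = model(img, txt)
    ranks = (t @ v.T).argsort(dim=1, descending=True)  # ranked images per text
    hit10 = (ranks[:, :10] == torch.arange(32).unsqueeze(1)).any(1)
    print(hit10.float().mean())  # the Text2Art-style Recall@10 metric
```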

Findings of the Second Workshop on Neural Machine Translation and Generation

Title Findings of the Second Workshop on Neural Machine Translation and Generation
Authors Alexandra Birch, Andrew Finch, Minh-Thang Luong, Graham Neubig, Yusuke Oda
Abstract This document describes the findings of the Second Workshop on Neural Machine Translation and Generation, held in concert with the annual conference of the Association for Computational Linguistics (ACL 2018). First, we summarize the research trends of papers presented in the proceedings, and note that there is particular interest in linguistic structure, domain adaptation, data augmentation, handling inadequate resources, and analysis of models. Second, we describe the results of the workshop’s shared task on efficient neural machine translation, where participants were tasked with creating MT systems that are both accurate and efficient.
Tasks Data Augmentation, Domain Adaptation, Machine Translation
Published 2018-06-08
URL http://arxiv.org/abs/1806.02940v3
PDF http://arxiv.org/pdf/1806.02940v3.pdf
PWC https://paperswithcode.com/paper/findings-of-the-second-workshop-on-neural
Repo
Framework

Learning to Optimize via Wasserstein Deep Inverse Optimal Control

Title Learning to Optimize via Wasserstein Deep Inverse Optimal Control
Authors Yichen Wang, Le Song, Hongyuan Zha
Abstract We study the inverse optimal control problem in the social sciences: we aim to learn a user’s true cost function from observed temporal behavior. In contrast to traditional phenomenological works that learn a generative model to fit the behavioral data, we propose a novel variational principle and treat the user as a reinforcement learning agent who acts by optimizing a cost function. We first propose a unified KL framework that generalizes existing maximum-entropy inverse optimal control methods. We further propose a two-step Wasserstein inverse optimal control framework. In the first step, we compute the optimal measure with a novel mass transport equation. In the second step, we formulate the learning problem as a generative adversarial network. In two real-world experiments, on recommender systems and social networks, we show that our framework obtains significant performance gains over both existing inverse optimal control methods and point-process-based generative models.
Tasks Recommendation Systems
Published 2018-05-22
URL http://arxiv.org/abs/1805.08395v1
PDF http://arxiv.org/pdf/1805.08395v1.pdf
PWC https://paperswithcode.com/paper/learning-to-optimize-via-wasserstein-deep
Repo
Framework
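
The abstract only names the unified KL framework, but the variational identity underlying maximum-entropy inverse optimal control in general gives a feel for the closed forms involved (a generic result, not necessarily the paper's exact formulation). For a prior behavior measure $p$ and cost $c$,

$$\min_{q}\;\mathbb{E}_{q}[c(x)] + \mathrm{KL}(q\,\|\,p) \quad\Longrightarrow\quad q^*(x) = \frac{p(x)\,e^{-c(x)}}{\int p(x')\,e^{-c(x')}\,dx'},$$

that is, the optimal behavior measure exponentially tilts the prior by the negative cost.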

Distributed Adaptive Sampling for Kernel Matrix Approximation

Title Distributed Adaptive Sampling for Kernel Matrix Approximation
Authors Daniele Calandriello, Alessandro Lazaric, Michal Valko
Abstract Most kernel-based methods, such as kernel or Gaussian process regression, kernel PCA, ICA, or $k$-means clustering, do not scale to large datasets, because constructing and storing the kernel matrix $\mathbf{K}_n$ requires at least $\mathcal{O}(n^2)$ time and space for $n$ samples. Recent works show that sampling points with replacement according to their ridge leverage scores (RLS) generates small dictionaries of relevant points with strong spectral approximation guarantees for $\mathbf{K}_n$. The drawback of RLS-based methods is that computing exact RLS requires constructing and storing the whole kernel matrix. In this paper, we introduce SQUEAK, a new algorithm for kernel approximation based on RLS sampling that sequentially processes the dataset, storing a dictionary which creates accurate kernel matrix approximations with a number of points that only depends on the effective dimension $d_{\mathrm{eff}}(\gamma)$ of the dataset. Moreover, since all the RLS estimations are efficiently performed using only the small dictionary, SQUEAK is the first RLS sampling algorithm that never constructs the whole matrix $\mathbf{K}_n$, runs in linear time $\widetilde{\mathcal{O}}(n\,d_{\mathrm{eff}}(\gamma)^3)$ w.r.t. $n$, and requires only a single pass over the dataset. We also propose a parallel and distributed version of SQUEAK that linearly scales across multiple machines, achieving similar accuracy in as little as $\widetilde{\mathcal{O}}(\log(n)\,d_{\mathrm{eff}}(\gamma)^3)$ time.
Tasks
Published 2018-03-27
URL http://arxiv.org/abs/1803.10172v1
PDF http://arxiv.org/pdf/1803.10172v1.pdf
PWC https://paperswithcode.com/paper/distributed-adaptive-sampling-for-kernel
Repo
Framework
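
Exact ridge leverage scores have a simple closed form, which also makes plain why naive RLS sampling needs the full kernel matrix, the very bottleneck SQUEAK removes. A toy sketch (the $\gamma n$ normalization is one common convention):

```python
import numpy as np

def ridge_leverage_scores(K, gamma):
    """Exact RLS: tau_i = (K (K + gamma*n*I)^{-1})_ii.  Costs O(n^3)
    on the full kernel matrix -- the step SQUEAK's dictionary-based
    estimator avoids."""
    n = K.shape[0]
    return np.diag(K @ np.linalg.inv(K + gamma * n * np.eye(n)))

# toy Gaussian kernel on 1-D data
rng = np.random.default_rng(0)
x = rng.normal(size=300)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

tau = ridge_leverage_scores(K, gamma=1e-3)
dictionary = rng.choice(len(x), size=40, replace=True, p=tau / tau.sum())
print(tau.sum())  # roughly the effective dimension d_eff(gamma)
```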

Dissociable neural representations of adversarially perturbed images in deep neural networks and the human brain

Title Dissociable neural representations of adversarially perturbed images in deep neural networks and the human brain
Authors Chi Zhang, Xiaohan Duan, Linyuan Wang, Yongli Li, Bin Yan, Guoen Hu, Ruyuan Zhang, Li Tong
Abstract Despite the remarkable similarities between deep neural networks (DNN) and the human brain shown in previous studies, the fact that DNNs still fall behind humans in many visual tasks suggests that considerable differences remain between the two systems. To probe their dissimilarities, we leverage adversarial noise (AN) and adversarial interference (AI) images that yield distinct recognition performance in a prototypical DNN (AlexNet) and human vision. The activity evoked by regular (RE) and adversarial images in both systems is thoroughly compared. We find that the representational similarity between RE and adversarial images in the human brain resembles their perceptual similarity. However, this representation-perception association is disrupted in the DNN. In particular, the representational similarity between RE and AN images idiosyncratically increases from low- to high-level layers. Furthermore, forward encoding modeling reveals that the DNN-brain hierarchical correspondence proposed in previous studies only holds when the two systems process RE and AI images, but not AN images. These results might be due to the deterministic modeling approach of current DNNs. Taken together, our results provide a complementary perspective on the comparison between DNNs and the human brain, and highlight the need to characterize their differences to further bridge artificial and human intelligence research.
Tasks
Published 2018-12-22
URL http://arxiv.org/abs/1812.09431v1
PDF http://arxiv.org/pdf/1812.09431v1.pdf
PWC https://paperswithcode.com/paper/dissociable-neural-representations-of
Repo
Framework
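
Comparisons like these are typically run as representational similarity analysis (RSA); the sketch below assumes that standard recipe ($1-$correlation dissimilarity matrices compared by Spearman correlation) rather than the authors' exact pipeline.

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(acts):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the activation patterns of every pair of stimuli."""
    return 1.0 - np.corrcoef(acts)

def rsa(acts_a, acts_b):
    """Spearman correlation between the upper triangles of two RDMs,
    the usual summary of representational similarity."""
    iu = np.triu_indices(acts_a.shape[0], k=1)
    rho, _ = spearmanr(rdm(acts_a)[iu], rdm(acts_b)[iu])
    return rho

# toy: 40 stimuli, a DNN layer (512 units) vs. a brain ROI (200 voxels)
rng = np.random.default_rng(0)
dnn = rng.normal(size=(40, 512))
brain = 0.5 * dnn[:, :200] + rng.normal(size=(40, 200))  # shared structure
print(rsa(dnn, brain))
```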

Adversarial Balancing for Causal Inference

Title Adversarial Balancing for Causal Inference
Authors Michal Ozery-Flato, Pierre Thodoroff, Matan Ninio, Michal Rosen-Zvi, Tal El-Hay
Abstract Biases in observational data of treatments pose a major challenge to estimating expected treatment outcomes in different populations. An important technique that accounts for these biases is reweighting samples to minimize the discrepancy between treatment groups. We present a novel reweighting approach that uses bi-level optimization to alternately train a discriminator to minimize classification error, and a balancing weights generator that uses exponentiated gradient descent to maximize this error. This approach borrows principles from generative adversarial networks (GANs) to exploit the power of classifiers for measuring two-sample divergence. We provide theoretical results for conditions in which the estimation error is bounded by two factors: (i) the discrepancy measure induced by the discriminator; and (ii) the weights variability. Experimental results on several benchmarks comparing to previous state-of-the-art reweighting methods demonstrate the effectiveness of this approach in estimating causal effects.
Tasks Causal Inference
Published 2018-10-17
URL http://arxiv.org/abs/1810.07406v2
PDF http://arxiv.org/pdf/1810.07406v2.pdf
PWC https://paperswithcode.com/paper/adversarial-balancing-for-causal-inference
Repo
Framework
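
A toy rendition of the alternating scheme, with a logistic-regression discriminator standing in for the paper's classifier and an exponentiated-gradient step reweighting control units to maximize its error; the step size and update form are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy observational data: treated units have shifted covariates.
X_c = rng.normal(0.0, 1.0, size=(500, 3))   # controls
X_t = rng.normal(0.7, 1.0, size=(300, 3))   # treated
X = np.vstack([X_c, X_t])
y = np.hstack([np.zeros(500), np.ones(300)])

w = np.ones(500) / 500                       # balancing weights on controls
eta = 0.5                                    # exponentiated-gradient step size
for it in range(50):
    sw = np.hstack([w * 500, np.ones(300)])
    # Discriminator: minimize weighted classification error.
    clf = LogisticRegression().fit(X, y, sample_weight=sw)
    # Generator: up-weight controls the discriminator thinks look treated,
    # pushing it toward its base rate (maximizing its error).
    w = w * np.exp(eta * clf.predict_proba(X_c)[:, 1])
    w /= w.sum()

# Weighted accuracy should fall toward the base rate as the groups balance.
print(clf.score(X, y, sample_weight=np.hstack([w * 500, np.ones(300)])))
```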

Specialized Interior Point Algorithm for Stable Nonlinear System Identification

Title Specialized Interior Point Algorithm for Stable Nonlinear System Identification
Authors Jack Umenberger, Ian R. Manchester
Abstract Estimation of nonlinear dynamic models from data poses many challenges, including model instability and the non-convexity of long-term simulation fidelity. Recently, Lagrangian relaxation has been proposed as a method to approximate simulation fidelity and guarantee stability via semidefinite programming (SDP); however, the resulting SDPs are of large dimension, limiting their utility in practical problems. In this paper we develop a path-following interior point algorithm that takes advantage of special structure in the problem and reduces computational complexity from cubic to linear growth with the length of the data set. The new algorithm enables empirical comparisons to established methods, including Nonlinear ARX, and we demonstrate superior generalization to new data. We also explore the “regularizing” effect of stability constraints as an alternative to regressor subset selection.
Tasks
Published 2018-03-02
URL http://arxiv.org/abs/1803.01066v1
PDF http://arxiv.org/pdf/1803.01066v1.pdf
PWC https://paperswithcode.com/paper/specialized-interior-point-algorithm-for
Repo
Framework

Regularization Effect of Fast Gradient Sign Method and its Generalization

Title Regularization Effect of Fast Gradient Sign Method and its Generalization
Authors Chandler Zuo
Abstract Fast Gradient Sign Method (FGSM) is a popular method to generate adversarial examples that make neural network models robust against perturbations. Despite its empirical success, its theoretical properties are not well understood. This paper develops theory to explain the regularization effect of Generalized FGSM, a class of methods to generate adversarial examples. Motivated by the relationship between FGSM and the LASSO penalty, the asymptotic properties of Generalized FGSM are derived in the Generalized Linear Model setting, which is essentially the 1-layer neural network setting with certain activation functions. In such simple neural network models, I prove that Generalized FGSM estimation is root-$n$-consistent and weakly oracle under proper conditions. The asymptotic results are also highly similar to those of penalized likelihood estimation. Nevertheless, Generalized FGSM introduces additional bias when data sampling is not sign neutral, a concept I introduce to describe the balancedness of the noise signs. Although the theory in this paper is developed under simple neural network settings, I argue that it may give insights and justification for FGSM in deep neural network settings as well.
Tasks
Published 2018-10-27
URL http://arxiv.org/abs/1810.11711v2
PDF http://arxiv.org/pdf/1810.11711v2.pdf
PWC https://paperswithcode.com/paper/regularization-effect-of-fast-gradient-sign
Repo
Framework
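
FGSM itself is a one-liner, $x_{\text{adv}} = x + \epsilon\,\mathrm{sign}(\nabla_x L)$; the sketch below pairs it with a linear model to mirror the one-layer GLM setting the paper analyzes.

```python
import torch
import torch.nn as nn

def fgsm_example(model, x, y, eps=0.05):
    """Fast Gradient Sign Method: perturb x by eps in the direction
    that maximally increases the loss, x_adv = x + eps * sign(grad_x L)."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()

# toy usage: a linear "1-layer network", matching the GLM setting the
# paper analyzes; training on FGSM examples then behaves like a
# LASSO-style penalty on the weights, per the paper's theory.
model = nn.Linear(20, 2)
x, y = torch.randn(8, 20), torch.randint(0, 2, (8,))
x_adv = fgsm_example(model, x, y)
print((x_adv - x).abs().max())  # every coordinate moved by exactly eps
```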

Discrete Structural Planning for Neural Machine Translation

Title Discrete Structural Planning for Neural Machine Translation
Authors Raphael Shu, Hideki Nakayama
Abstract Structural planning is important for producing long sentences and is a missing part of current language generation models. In this work, we add a planning phase to neural machine translation to control the coarse structure of output sentences. The model first generates some planner codes, then predicts the real output words conditioned on them. The codes are learned to capture the coarse structure of the target sentence. To obtain the codes, we design an end-to-end neural network with a discretization bottleneck, which predicts the simplified part-of-speech tags of target sentences. Experiments show that translation performance is generally improved by planning ahead. We also find that translations with different structures can be obtained by manipulating the planner codes.
Tasks Machine Translation, Text Generation
Published 2018-08-14
URL http://arxiv.org/abs/1808.04525v1
PDF http://arxiv.org/pdf/1808.04525v1.pdf
PWC https://paperswithcode.com/paper/discrete-structural-planning-for-neural
Repo
Framework
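
The paper designs its own discretization bottleneck; Gumbel-softmax is one standard way to keep such a bottleneck differentiable end to end, and it stands in for the paper's mechanism below. Code sizes and shapes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscretizationBottleneck(nn.Module):
    """Maps an encoder state to a short sequence of discrete planner
    codes via straight-through Gumbel-softmax, so the whole model
    remains trainable by backpropagation."""
    def __init__(self, dim=256, n_codes=64, code_len=4):
        super().__init__()
        self.n_codes, self.code_len = n_codes, code_len
        self.logits = nn.Linear(dim, n_codes * code_len)
        self.emb = nn.Linear(n_codes, dim)   # embed codes for the decoder

    def forward(self, h, tau=1.0):
        logits = self.logits(h).view(-1, self.code_len, self.n_codes)
        codes = F.gumbel_softmax(logits, tau=tau, hard=True)  # one-hot codes
        return self.emb(codes).sum(dim=1), codes.argmax(-1)

# toy usage: 2 sentences, 256-d encoder states -> 4 planner codes each
bottleneck = DiscretizationBottleneck()
ctx, codes = bottleneck(torch.randn(2, 256))
print(codes.shape)  # torch.Size([2, 4]): discrete plan per sentence
```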

Detect, Quantify, and Incorporate Dataset Bias: A Neuroimaging Analysis on 12,207 Individuals

Title Detect, Quantify, and Incorporate Dataset Bias: A Neuroimaging Analysis on 12,207 Individuals
Authors Christian Wachinger, Benjamin Gutierrez Becker, Anna Rieckmann
Abstract Neuroimaging datasets keep growing in size to address increasingly complex medical questions. However, even the largest datasets today are alone too small for training complex models or for finding genome-wide associations. A solution is to grow the sample size by merging data across several datasets. However, bias in datasets complicates this approach, introducing additional sources of variation into the data instead. In this work, we combine 15 large neuroimaging datasets to study bias. First, we detect bias by demonstrating that scans can be correctly assigned to their dataset with 73.3% accuracy. Next, we introduce metrics to quantify the compatibility across datasets and to create embeddings of neuroimaging sites. Finally, we incorporate the presence of bias into the selection of a training set for predicting autism. For the quantification of dataset bias, we introduce two metrics: the Bhattacharyya distance between datasets and the age prediction error. The presented embedding of neuroimaging sites provides an interesting new visualization of the similarity of different sites, which could be used to guide the merging of data sources while limiting the introduction of unwanted variation. Finally, we demonstrate a clear performance increase when incorporating dataset bias for training set selection in autism prediction. Overall, we believe that the growing amount of neuroimaging data necessitates incorporating data-driven methods for quantifying dataset bias in future analyses.
Tasks
Published 2018-04-28
URL http://arxiv.org/abs/1804.10764v1
PDF http://arxiv.org/pdf/1804.10764v1.pdf
PWC https://paperswithcode.com/paper/detect-quantify-and-incorporate-dataset-bias
Repo
Framework
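
Of the two proposed bias metrics, the Bhattacharyya distance has a convenient closed form when each dataset is summarized by a feature mean and covariance; whether the paper uses exactly this Gaussian form is an assumption of this sketch.

```python
import numpy as np

def bhattacharyya_gaussian(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussians: a Mahalanobis-style
    mean term plus a covariance-mismatch term."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

# toy: two "datasets" of volumetric features with a site shift
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 5))
b = rng.normal(0.3, 1.2, size=(500, 5))
d = bhattacharyya_gaussian(a.mean(0), np.cov(a.T), b.mean(0), np.cov(b.T))
print(d)  # larger values = less compatible datasets
```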