October 19, 2019

Paper Group ANR 166

Attention based Sentence Extraction from Scientific Articles using Pseudo-Labeled data. An Instability in Variational Inference for Topic Models. Semisupervised Learning on Heterogeneous Graphs and its Applications to Facebook News Feed. DCFNet: Deep Neural Network with Decomposed Convolutional Filters. Topic Modeling on Health Journals with Regula …

Attention based Sentence Extraction from Scientific Articles using Pseudo-Labeled data

Title Attention based Sentence Extraction from Scientific Articles using Pseudo-Labeled data
Authors Parth Mehta, Gaurav Arora, Prasenjit Majumder
Abstract In this work, we present a weakly supervised sentence extraction technique for identifying important sentences in scientific papers that are worthy of inclusion in the abstract. We propose a new attention-based deep learning architecture that jointly learns to identify important content as well as the cue phrases that indicate summary-worthy sentences. We propose a new context embedding technique that uses topic models to determine the focus of a given paper, and we use it jointly with an LSTM-based sequence encoder to learn attention weights across the words of a sentence. We use a collection of articles publicly available through the ACL Anthology for our experiments. Our system outperforms several state-of-the-art extractive techniques on several ROUGE metrics. It also generates more coherent summaries and preserves the overall structure of the document.
Tasks Topic Models
Published 2018-02-13
URL http://arxiv.org/abs/1802.04675v1
PDF http://arxiv.org/pdf/1802.04675v1.pdf
PWC https://paperswithcode.com/paper/attention-based-sentence-extraction-from
Repo
Framework

An Instability in Variational Inference for Topic Models

Title An Instability in Variational Inference for Topic Models
Authors Behrooz Ghorbani, Hamid Javadi, Andrea Montanari
Abstract Topic models are Bayesian models that are frequently used to capture the latent structure of certain corpora of documents or images. Each data element in such a corpus (for instance each item in a collection of scientific articles) is regarded as a convex combination of a small number of vectors corresponding to 'topics' or 'components'. The weights are assumed to have a Dirichlet prior distribution. The standard approach towards approximating the posterior is to use variational inference algorithms, and in particular a mean field approximation. We show that this approach suffers from an instability that can produce misleading conclusions. Namely, for certain regimes of the model parameters, variational inference outputs a non-trivial decomposition into topics. However, for the same parameter values, the data contain no actual information about the true decomposition, and hence the output of the algorithm is uncorrelated with the true topic decomposition. Among other consequences, the estimated posterior mean is significantly wrong, and estimated Bayesian credible regions do not achieve the nominal coverage. We discuss how this instability is remedied by more accurate mean field approximations.
Tasks Topic Models
Published 2018-02-02
URL http://arxiv.org/abs/1802.00568v1
PDF http://arxiv.org/pdf/1802.00568v1.pdf
PWC https://paperswithcode.com/paper/an-instability-in-variational-inference-for
Repo
Framework
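
To make the modeling assumption in this abstract concrete, here is a minimal sketch of a data element formed as a convex combination of topic vectors with Dirichlet-distributed weights. The topic vectors, the concentration values, and the function names are hypothetical illustrations, and observation noise is left out; this is not the authors' exact model or their inference algorithm.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def mix_topics(weights, topics):
    """Return the convex combination sum_k weights[k] * topics[k]."""
    dim = len(topics[0])
    return [sum(w * t[j] for w, t in zip(weights, topics)) for j in range(dim)]


rng = random.Random(0)
# Hypothetical topic vectors: each row is a distribution over a 3-word vocabulary.
topics = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
# Hypothetical Dirichlet concentration parameters, one per topic.
weights = sample_dirichlet([0.5, 0.5], rng)
document = mix_topics(weights, topics)
```

Because the weights are non-negative and sum to one, the resulting `document` vector is itself a distribution over the vocabulary. Inference has to recover `topics` and `weights` from many such vectors, which is the setting in which the abstract reports the instability.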

Semisupervised Learning on Heterogeneous Graphs and its Applications to Facebook News Feed

Title Semisupervised Learning on Heterogeneous Graphs and its Applications to Facebook News Feed
Authors Cheng Ju, James Li, Bram Wasti, Shengbo Guo
Abstract Graph-based semi-supervised learning is a fundamental machine learning problem, and has been well studied. Most studies focus on homogeneous networks (e.g. citation networks, friend networks). In the present paper, we propose the Heterogeneous Embedding Label Propagation (HELP) algorithm, a graph-based semi-supervised deep learning algorithm, for graphs that are characterized by heterogeneous node types. Empirically, we demonstrate the effectiveness of this method in domain classification tasks on the Facebook user-domain interaction graph, and compare the performance of the proposed HELP algorithm with state-of-the-art algorithms. We show that the HELP algorithm improves predictive performance across multiple tasks and produces semantically meaningful embeddings that are discriminative for downstream classification or regression tasks.
Tasks
Published 2018-05-18
URL http://arxiv.org/abs/1805.07479v2
PDF http://arxiv.org/pdf/1805.07479v2.pdf
PWC https://paperswithcode.com/paper/semisupervised-learning-on-heterogeneous
Repo
Framework

DCFNet: Deep Neural Network with Decomposed Convolutional Filters

Title DCFNet: Deep Neural Network with Decomposed Convolutional Filters
Authors Qiang Qiu, Xiuyuan Cheng, Robert Calderbank, Guillermo Sapiro
Abstract Filters in a Convolutional Neural Network (CNN) contain model parameters learned from enormous amounts of data. In this paper, we propose decomposing the convolutional filters in a CNN as a truncated expansion with pre-fixed bases, namely the Decomposed Convolutional Filters network (DCFNet), where the expansion coefficients remain learned from data. Such a structure not only reduces the number of trainable parameters and computation, but also imposes filter regularity through basis truncation. Through extensive experiments, we consistently observe that DCFNet maintains accuracy for image classification tasks with a significant reduction of model parameters, particularly with Fourier-Bessel (FB) bases, and even with random bases. Theoretically, we analyze the representation stability of DCFNet with respect to input variations, and prove representation stability under generic assumptions on the expansion coefficients. The analysis is consistent with the empirical observations.
Tasks Image Classification
Published 2018-02-12
URL http://arxiv.org/abs/1802.04145v3
PDF http://arxiv.org/pdf/1802.04145v3.pdf
PWC https://paperswithcode.com/paper/dcfnet-deep-neural-network-with-decomposed
Repo
Framework
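
The decomposition described in this abstract can be sketched as follows: a filter is written as a weighted sum of a few fixed basis filters, and only the weights would be learned. The three 3x3 bases below are hypothetical stand-ins chosen for readability (they are not Fourier-Bessel bases), and `compose_filter` is an illustrative helper name, not part of any DCFNet code.

```python
def compose_filter(coeffs, bases):
    """Build one filter as sum_k coeffs[k] * bases[k] (elementwise)."""
    size = len(bases[0])
    return [
        [sum(c * b[i][j] for c, b in zip(coeffs, bases)) for j in range(size)]
        for i in range(size)
    ]


# Hypothetical pre-fixed bases: constant, horizontal gradient, vertical gradient.
bases = [
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
]
# The expansion coefficients are the only trainable quantities in this sketch.
coeffs = [0.5, 1.0, -1.0]
kernel = compose_filter(coeffs, bases)

full_params = len(kernel) * len(kernel[0])  # 9 free values in an unconstrained 3x3 filter
decomposed_params = len(coeffs)             # 3 coefficients after truncating to 3 bases
```

In this toy case the truncation cuts the per-filter parameter count from 9 to 3, and every filter the model can express is restricted to the span of the chosen bases, which is the source of the regularity the abstract mentions.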

Topic Modeling on Health Journals with Regularized Variational Inference

Title Topic Modeling on Health Journals with Regularized Variational Inference
Authors Robert Giaquinto, Arindam Banerjee
Abstract Topic modeling enables exploration and compact representation of a corpus. The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis. Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys. To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors. The novelty of the DAP model lies in its representation of authors by a persona, where personas capture the propensity to write about certain topics over time. Further, we present a regularized variational inference algorithm, which we use to encourage the DAP model's personas to be distinct. Our results show significant improvements over competing topic models, particularly after regularization, and highlight the DAP model's unique ability to capture common journeys shared by different authors.
Tasks Topic Models
Published 2018-01-15
URL http://arxiv.org/abs/1801.04958v1
PDF http://arxiv.org/pdf/1801.04958v1.pdf
PWC https://paperswithcode.com/paper/topic-modeling-on-health-journals-with
Repo
Framework

Perceptual Visual Interactive Learning

Title Perceptual Visual Interactive Learning
Authors Shenglan Liu, Xiang Liu, Yang Liu, Lin Feng, Hong Qiao, Jian Zhou, Yang Wang
Abstract Supervised learning methods are widely used in machine learning. However, the lack of labels in existing data limits the application of these technologies. Compared with purely computer-based labeling, visual interactive learning (VIL) can avoid the semantic gap and solve the labeling problem for small label quantity (SLQ) samples in a groundbreaking way. To fully understand the importance of VIL to the interaction process, we re-summarize the algorithms related to interactive learning (e.g. clustering, classification, retrieval) from the perspective of VIL. Perception and cognition are the two main visual processes of VIL. On this basis, we propose a perceptual visual interactive learning (PVIL) framework, which adopts the gestalt principle to design the interaction strategy and multi-dimensionality reduction (MDR) to optimize the visualization process. The advantage of the PVIL framework is that it combines the computer's sensitivity to detailed features with the human's overall understanding of global tasks. Experimental results show that the framework is superior to traditional computer labeling methods (such as label propagation) in both accuracy and efficiency, and that it achieves significant classification results on datasets with dense distributions and sparse classes.
Tasks Dimensionality Reduction
Published 2018-10-25
URL http://arxiv.org/abs/1810.10789v1
PDF http://arxiv.org/pdf/1810.10789v1.pdf
PWC https://paperswithcode.com/paper/perceptual-visual-interactive-learning
Repo
Framework

Real-time Prediction of Intermediate-Horizon Automotive Collision Risk

Title Real-time Prediction of Intermediate-Horizon Automotive Collision Risk
Authors Blake Wulfe, Sunil Chintakindi, Sou-Cheng T. Choi, Rory Hartong-Redden, Anuradha Kodali, Mykel J. Kochenderfer
Abstract Advanced collision avoidance and driver hand-off systems can benefit from the ability to accurately predict, in real time, the probability a vehicle will be involved in a collision within an intermediate horizon of 10 to 20 seconds. The rarity of collisions in real-world data poses a significant challenge to developing this capability because, as we demonstrate empirically, intermediate-horizon risk prediction depends heavily on high-dimensional driver behavioral features. As a result, a large amount of data is required to fit an effective predictive model. In this paper, we assess whether simulated data can help alleviate this issue. Focusing on highway driving, we present a three-step approach for generating data and fitting a predictive model capable of real-time prediction. First, high-risk automotive scenes are generated using importance sampling on a learned Bayesian network scene model. Second, collision risk is estimated through Monte Carlo simulation. Third, a neural network domain adaptation model is trained on real and simulated data to address discrepancies between the two domains. Experiments indicate that simulated data can mitigate issues resulting from collision rarity, thereby improving risk prediction in real-world data.
Tasks Domain Adaptation
Published 2018-02-05
URL http://arxiv.org/abs/1802.01532v1
PDF http://arxiv.org/pdf/1802.01532v1.pdf
PWC https://paperswithcode.com/paper/real-time-prediction-of-intermediate-horizon
Repo
Framework
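
The abstract relies on importance sampling because collisions are rare under the nominal distribution. As a generic illustration of that idea only (not the authors' Bayesian network scene model), the sketch below estimates a rare tail probability of a standard normal by sampling from a proposal centered on the rare region and reweighting each sample by the likelihood ratio. The threshold, sample count, and function name are assumptions made for this example.

```python
import math
import random


def rare_event_probability(threshold, num_samples, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) using a N(threshold, 1) proposal."""
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(threshold, 1.0)  # sample from the proposal, where the event is common
        if x > threshold:
            # Likelihood ratio p(x) / q(x) of the target N(0, 1) to the proposal.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / num_samples


rng = random.Random(0)
estimate = rare_event_probability(4.0, 20000, rng)
# Closed-form tail probability of the standard normal, for comparison.
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
```

Sampling directly from N(0, 1) would see the event only about three times in 100,000 draws, whereas roughly half of the proposal's draws land in the rare region and the weights correct for the shift.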

Distance-based Kernels for Surrogate Model-based Neuroevolution

Title Distance-based Kernels for Surrogate Model-based Neuroevolution
Authors Jörg Stork, Martin Zaefferer, Thomas Bartz-Beielstein
Abstract The topology optimization of artificial neural networks can be particularly difficult if the fitness evaluations require expensive experiments or simulations. For that reason, the optimization methods may need to be supported by surrogate models. We propose different distances for a suitable surrogate model, and compare them in a simple numerical test scenario.
Tasks
Published 2018-07-20
URL http://arxiv.org/abs/1807.07839v1
PDF http://arxiv.org/pdf/1807.07839v1.pdf
PWC https://paperswithcode.com/paper/distance-based-kernels-for-surrogate-model
Repo
Framework

Stochastic Gradient Descent Learns State Equations with Nonlinear Activations

Title Stochastic Gradient Descent Learns State Equations with Nonlinear Activations
Authors Samet Oymak
Abstract We study discrete-time dynamical systems governed by the state equation $h_{t+1}=\phi(Ah_t+Bu_t)$. Here $A,B$ are weight matrices, $\phi$ is an activation function, and $u_t$ is the input data. This relation is the backbone of recurrent neural networks (e.g. LSTMs), which have broad applications in sequential learning tasks. We utilize stochastic gradient descent to learn the weight matrices from a finite input/state trajectory $(u_t,h_t)_{t=0}^N$. We prove that the SGD estimate converges linearly to the ground truth weights while using a near-optimal sample size. Our results apply to increasing activations whose derivatives are bounded away from zero. The analysis is based on i) a novel SGD convergence result with nonlinear activations and ii) careful statistical characterization of the state vector. Numerical experiments verify the fast convergence of SGD on ReLU and leaky ReLU, consistent with our theory.
Tasks
Published 2018-09-09
URL http://arxiv.org/abs/1809.03019v1
PDF http://arxiv.org/pdf/1809.03019v1.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-descent-learns-state
Repo
Framework
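
A scalar toy version of the setup in this abstract may help: simulate $h_{t+1}=\phi(a h_t + b u_t)$ with a leaky ReLU, then run one pass of SGD on the squared one-step prediction error to recover $a$ and $b$. The scalar weights, the leaky slope of 0.5, the step size, and the trajectory length are all assumptions chosen for illustration; the paper's analysis covers matrices and its own step-size choices.

```python
import random


def leaky_relu(z, slope=0.5):
    """Increasing activation whose derivative is bounded away from zero."""
    return z if z > 0 else slope * z


def leaky_relu_grad(z, slope=0.5):
    return 1.0 if z > 0 else slope


def simulate(a, b, steps, rng):
    """Generate one trajectory (u_t, h_t) of h_{t+1} = phi(a * h_t + b * u_t)."""
    inputs, states = [], [0.0]
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)
        inputs.append(u)
        states.append(leaky_relu(a * states[-1] + b * u))
    return inputs, states


def sgd_fit(inputs, states, lr=0.02):
    """One SGD pass over the transitions, minimizing 0.5 * (phi(z) - h_{t+1})^2."""
    a_hat, b_hat = 0.0, 0.0
    for t, u in enumerate(inputs):
        h, target = states[t], states[t + 1]
        z = a_hat * h + b_hat * u
        scale = (leaky_relu(z) - target) * leaky_relu_grad(z)
        a_hat -= lr * scale * h
        b_hat -= lr * scale * u
    return a_hat, b_hat


rng = random.Random(0)
true_a, true_b = 0.5, 1.0  # hypothetical ground-truth weights
inputs, states = simulate(true_a, true_b, 20000, rng)
a_hat, b_hat = sgd_fit(inputs, states)
```

Since the simulated trajectory is noise-free, the estimates `a_hat` and `b_hat` approach `true_a` and `true_b` closely by the end of the pass, which mirrors the convergence behavior the abstract describes for leaky ReLU.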

Universal approximations of invariant maps by neural networks

Title Universal approximations of invariant maps by neural networks
Authors Dmitry Yarotsky
Abstract We describe generalizations of the universal approximation theorem for neural networks to maps invariant or equivariant with respect to linear representations of groups. Our goal is to establish network-like computational models that are both invariant/equivariant and provably complete in the sense of their ability to approximate any continuous invariant/equivariant map. Our contribution is three-fold. First, in the general case of compact groups we propose a construction of a complete invariant/equivariant network using an intermediate polynomial layer. We invoke classical theorems of Hilbert and Weyl to justify and simplify this construction; in particular, we describe an explicit complete ansatz for approximation of permutation-invariant maps. Second, we consider groups of translations and prove several versions of the universal approximation theorem for convolutional networks in the limit of continuous signals on Euclidean spaces. Finally, we consider 2D signal transformations equivariant with respect to the group SE(2) of rigid Euclidean motions. In this case we introduce the "charge-conserving convnet", a convnet-like computational model based on the decomposition of the feature space into isotypic representations of SO(2). We prove this model to be a universal approximator for continuous SE(2)-equivariant signal transformations.
Tasks
Published 2018-04-26
URL http://arxiv.org/abs/1804.10306v1
PDF http://arxiv.org/pdf/1804.10306v1.pdf
PWC https://paperswithcode.com/paper/universal-approximations-of-invariant-maps-by
Repo
Framework
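
As a small illustration of what a polynomial invariant layer can look like in the permutation case, the sketch below computes power sums $p_k(x)=\sum_i x_i^k$, which are classical polynomial invariants of the symmetric group. This is an assumed example chosen for familiarity, not the specific ansatz constructed in the paper, and `power_sums` is a hypothetical helper name.

```python
def power_sums(values, max_degree):
    """Return [p_1, ..., p_max_degree] where p_k = sum_i values[i] ** k."""
    return [sum(v ** k for v in values) for k in range(1, max_degree + 1)]


x = [1, 2, 3]
x_permuted = [3, 1, 2]

features = power_sums(x, 3)
features_permuted = power_sums(x_permuted, 3)
```

Both calls return the same feature vector, so any function applied on top of these features is automatically invariant to reordering the inputs.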

Multi-Label Transfer Learning for Multi-Relational Semantic Similarity

Title Multi-Label Transfer Learning for Multi-Relational Semantic Similarity
Authors Li Zhang, Steven R. Wilson, Rada Mihalcea
Abstract Multi-relational semantic similarity datasets define the semantic relations between two short texts in multiple ways, e.g., similarity, relatedness, and so on. Yet, all the systems to date designed to capture such relations target one relation at a time. We propose a multi-label transfer learning approach based on LSTM to make predictions for several relations simultaneously and aggregate the losses to update the parameters. This multi-label regression approach jointly learns the information provided by the multiple relations, rather than treating them as separate tasks. Not only does this approach outperform the single-task approach and the traditional multi-task learning approach, but it also achieves state-of-the-art performance on all but one relation of the Human Activity Phrase dataset.
Tasks Multi-Task Learning, Semantic Similarity, Semantic Textual Similarity, Transfer Learning
Published 2018-05-31
URL http://arxiv.org/abs/1805.12501v2
PDF http://arxiv.org/pdf/1805.12501v2.pdf
PWC https://paperswithcode.com/paper/multi-label-transfer-learning-for-semantic
Repo
Framework
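
The loss aggregation mentioned in this abstract can be sketched in a few lines: a single model emits one score per relation, and the per-relation regression losses are summed into one training objective. The relation names, the squared-error loss, and the function name below are assumptions for illustration; the abstract does not specify the exact loss used.

```python
def multi_label_loss(predictions, targets):
    """Sum the squared errors over all annotated relations for one text pair."""
    return sum((predictions[name] - targets[name]) ** 2 for name in targets)


# Hypothetical model outputs and gold annotations for a single pair of short texts.
predictions = {"similarity": 3.0, "relatedness": 4.0}
targets = {"similarity": 2.5, "relatedness": 4.5}
loss = multi_label_loss(predictions, targets)
```

A single backward pass on this combined loss updates the shared parameters using signal from every relation at once, in contrast to training a separate model per relation.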

Machine learning and evolutionary techniques in interplanetary trajectory design

Title Machine learning and evolutionary techniques in interplanetary trajectory design
Authors Dario Izzo, Christopher Sprague, Dharmesh Tailor
Abstract After providing a brief historical overview on the synergies between artificial intelligence research, in the areas of evolutionary computations and machine learning, and the optimal design of interplanetary trajectories, we propose and study the use of deep artificial neural networks to represent, on-board, the optimal guidance profile of an interplanetary mission. The results, limited to the chosen test case of an Earth-Mars orbital transfer, extend the findings made previously for landing scenarios and quadcopter dynamics, opening a new research area in interplanetary trajectory planning.
Tasks
Published 2018-02-01
URL http://arxiv.org/abs/1802.00180v2
PDF http://arxiv.org/pdf/1802.00180v2.pdf
PWC https://paperswithcode.com/paper/machine-learning-and-evolutionary-techniques
Repo
Framework

Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions

Title Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
Authors Qing Li, Jianlong Fu, Dongfei Yu, Tao Mei, Jiebo Luo
Abstract Visual Question Answering (VQA) has attracted attention from both computer vision and natural language processing communities. Most existing approaches adopt the pipeline of representing an image via pre-trained CNNs, and then using the uninterpretable CNN features in conjunction with the question to predict the answer. Although such end-to-end models might report promising performance, they rarely provide any insight, apart from the answer, into the VQA process. In this work, we propose to break up the end-to-end VQA into two steps: explaining and reasoning, in an attempt towards a more explainable VQA by shedding light on the intermediate results between these two steps. To that end, we first extract attributes and generate descriptions as explanations for an image using pre-trained attribute detectors and image captioning models, respectively. Next, a reasoning module utilizes these explanations in place of the image to infer an answer to the question. The advantages of such a breakdown include: (1) the attributes and captions can reflect what the system extracts from the image, thus can provide some explanations for the predicted answer; (2) these intermediate results can help us identify the inabilities of both the image understanding part and the answer inference part when the predicted answer is wrong. We conduct extensive experiments on a popular VQA dataset and dissect all results according to several measurements of the explanation quality. Our system achieves comparable performance with the state-of-the-art, yet with added benefits of explainability and the inherent ability to further improve with higher quality explanations.
Tasks Image Captioning, Question Answering, Visual Question Answering
Published 2018-01-27
URL http://arxiv.org/abs/1801.09041v1
PDF http://arxiv.org/pdf/1801.09041v1.pdf
PWC https://paperswithcode.com/paper/tell-and-answer-towards-explainable-visual
Repo
Framework

Fast Stochastic Algorithms for Low-rank and Nonsmooth Matrix Problems

Title Fast Stochastic Algorithms for Low-rank and Nonsmooth Matrix Problems
Authors Dan Garber, Atara Kaplan
Abstract Composite convex optimization problems which include both a nonsmooth term and a low-rank promoting term have important applications in machine learning and signal processing, such as when one wishes to recover an unknown matrix that is simultaneously low-rank and sparse. However, such problems are highly challenging to solve at large scale: the low-rank promoting term prohibits efficient implementations of proximal methods for composite optimization and even simple subgradient methods. On the other hand, methods which are tailored for low-rank optimization, such as conditional gradient-type methods, which are often applied to a smooth approximation of the nonsmooth objective, are slow since their runtime scales with both the large Lipschitz parameter of the smoothed gradient vector and with $1/\epsilon$. In this paper we develop efficient algorithms for stochastic optimization of a strongly-convex objective which includes both a nonsmooth term and a low-rank promoting term. In particular, to the best of our knowledge, we present the first algorithm that enjoys all of the following critical properties for large-scale problems: i) (nearly) optimal sample complexity, ii) each iteration requires only a single low-rank SVD computation, and iii) the overall number of thin-SVD computations scales only with $\log(1/\epsilon)$ (as opposed to $\textrm{poly}(1/\epsilon)$ in previous methods). We also give an algorithm for the closely-related finite-sum setting. At the heart of our results lies a novel combination of a variance-reduction technique and the use of a weak-proximal oracle, which is key to obtaining all three of the above properties simultaneously.
Tasks Stochastic Optimization
Published 2018-09-27
URL http://arxiv.org/abs/1809.10477v1
PDF http://arxiv.org/pdf/1809.10477v1.pdf
PWC https://paperswithcode.com/paper/fast-stochastic-algorithms-for-low-rank-and
Repo
Framework

Implicit Modeling with Uncertainty Estimation for Intravoxel Incoherent Motion Imaging

Title Implicit Modeling with Uncertainty Estimation for Intravoxel Incoherent Motion Imaging
Authors Lin Zhang, Valery Vishnevskiy, Andras Jakab, Orcun Goksel
Abstract Intravoxel incoherent motion (IVIM) imaging allows contrast-agent-free in vivo perfusion quantification with magnetic resonance imaging (MRI). However, its use is limited by typically low accuracy due to low signal-to-noise ratio (SNR) at large gradient encoding magnitudes as well as dephasing artefacts caused by subject motion, which is particularly challenging in fetal MRI. To mitigate this problem, we propose an implicit IVIM signal acquisition model with which we learn the full posterior distribution of perfusion parameters using artificial neural networks. This posterior then encapsulates the uncertainty of the inferred parameter estimates, which we validate herein via numerical experiments with rejection-based Bayesian sampling. Compared to the state-of-the-art IVIM estimation method of segmented least-squares fitting, our proposed approach improves parameter estimation accuracy by 65% on synthetic anisotropic perfusion data. On paired rescans of in vivo fetal MRI, our method increases repeatability of parameter estimation in placenta by 46%.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.10358v1
PDF http://arxiv.org/pdf/1810.10358v1.pdf
PWC https://paperswithcode.com/paper/implicit-modeling-with-uncertainty-estimation
Repo
Framework
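
For orientation, the sketch below evaluates the bi-exponential signal model that is commonly used for IVIM, $S(b) = S_0\,(f e^{-bD^*} + (1-f) e^{-bD})$. The abstract does not state this equation, so treating it as the forward model is an assumption, and the parameter values and b-values are hypothetical. The sketch does not reproduce the paper's implicit model or its neural network posterior.

```python
import math


def ivim_signal(b, s0, f, d_star, d):
    """Bi-exponential IVIM signal: perfusion fraction f decays with D*, the rest with D."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))


# Hypothetical b-values in s/mm^2 and hypothetical tissue parameters in mm^2/s.
b_values = [0, 50, 200, 800]
signals = [ivim_signal(b, s0=1.0, f=0.1, d_star=0.05, d=0.001) for b in b_values]
```

The perfusion term decays much faster than the diffusion term, so at large b-values the signal is dominated by the diffusion component and becomes small, which is consistent with the low SNR at large gradient encoding magnitudes noted in the abstract.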