May 6, 2019

3031 words 15 mins read

Paper Group ANR 402

Graph-Structured Representations for Visual Question Answering

Title Graph-Structured Representations for Visual Question Answering
Authors Damien Teney, Lingqiao Liu, Anton van den Hengel
Abstract This paper proposes to improve visual question answering (VQA) with structured representations of both scene contents and questions. A key challenge in VQA is the requirement for joint reasoning over the visual and textual domains. The predominant CNN/LSTM-based approach to VQA is limited by monolithic vector representations that largely ignore structure in the scene and in the form of the question. CNN feature vectors cannot effectively capture situations as simple as multiple object instances, and LSTMs process questions as series of words, which does not reflect the true complexity of language structure. We instead propose to build graphs over the scene objects and over the question words, and we describe a deep neural network that exploits the structure in these representations. This shows significant benefit over the sequential processing of LSTMs. The overall efficacy of our approach is demonstrated by significant improvements over the state of the art, from 71.2% to 74.4% in accuracy on the “abstract scenes” multiple-choice benchmark, and from 34.7% to 39.1% in accuracy over pairs of “balanced” scenes, i.e. images with fine-grained differences and opposite yes/no answers to the same question.
Tasks Question Answering, Visual Question Answering
Published 2016-09-19
URL http://arxiv.org/abs/1609.05600v2
PDF http://arxiv.org/pdf/1609.05600v2.pdf
PWC https://paperswithcode.com/paper/graph-structured-representations-for-visual
Repo
Framework
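The graph-based reasoning described above can be illustrated with a single round of message passing over object features, where each node is updated by an attention-weighted sum of its neighbours. This is a minimal sketch with invented features and a dot-product affinity, not the paper's actual architecture:

```python
import numpy as np

# One round of message passing over a toy "scene graph": each node
# (object feature vector) is updated with a weighted sum of the other
# nodes, with weights from a softmax over pairwise dot products.

def message_pass(features):
    # features: (n_nodes, d) array of object descriptors
    scores = features @ features.T                     # pairwise affinities
    np.fill_diagonal(scores, -np.inf)                  # no self-loops
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # row-wise softmax
    return features + weights @ features               # residual update

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
updated = message_pass(feats)
print(updated.shape)  # (4, 8)
```

Stacking several such rounds lets information propagate beyond immediate neighbours, which is the structural advantage the abstract claims over monolithic vectors.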

Modelling Student Behavior using Granular Large Scale Action Data from a MOOC

Title Modelling Student Behavior using Granular Large Scale Action Data from a MOOC
Authors Steven Tang, Joshua C. Peterson, Zachary A. Pardos
Abstract Digital learning environments generate a precise record of the actions learners take as they interact with learning materials and complete exercises towards comprehension. With this high quantity of sequential data comes the potential to apply time series models to learn about underlying behavioral patterns and trends that characterize successful learning based on the granular record of student actions. There exist several methods for looking at longitudinal, sequential data like those recorded from learning environments. In the field of language modelling, traditional n-gram techniques and modern recurrent neural network (RNN) approaches have been applied to algorithmically find structure in language and predict the next word given the previous words in the sentence or paragraph as input. In this paper, we draw an analogy to this work by treating student sequences of resource views and interactions in a MOOC as the inputs and predicting students’ next interaction as outputs. In this study, we train only on students who received a certificate of completion. In doing so, the model could potentially be used for recommendation of sequences eventually leading to success, as opposed to perpetuating unproductive behavior. Given that the MOOC used in our study had over 3,500 unique resources, predicting the exact resource that a student will interact with next might appear to be a difficult classification problem. We find that simply following the syllabus (built-in structure of the course) gives on average 23% accuracy in making this prediction, followed by the n-gram method with 70.4%, and RNN based methods with 72.2%. This research lays the groundwork for recommendation in a MOOC and other digital learning environments where high volumes of sequential data exist.
Tasks Language Modelling, Time Series
Published 2016-08-16
URL http://arxiv.org/abs/1608.04789v1
PDF http://arxiv.org/pdf/1608.04789v1.pdf
PWC https://paperswithcode.com/paper/modelling-student-behavior-using-granular
Repo
Framework
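The n-gram baseline described above can be sketched as a bigram next-action predictor: count which resource most often follows each resource and predict that. The session data and resource names below are invented for illustration:

```python
from collections import Counter, defaultdict

# Minimal bigram (n-gram with n=2) next-action predictor: predict the
# resource a student visits next given only the current one.

def train_bigram(sequences):
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, current):
    if current not in counts:
        return None
    return counts[current].most_common(1)[0][0]

sessions = [
    ["video1", "quiz1", "video2"],
    ["video1", "quiz1", "forum"],
    ["video1", "quiz1", "video2", "quiz2"],
]
model = train_bigram(sessions)
print(predict_next(model, "quiz1"))  # video2 (seen twice, vs. forum once)
```

An RNN replaces the fixed-length count table with a hidden state summarizing the whole prefix, which is where the reported accuracy gain (70.4% to 72.2%) comes from.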

Toward Socially-Infused Information Extraction: Embedding Authors, Mentions, and Entities

Title Toward Socially-Infused Information Extraction: Embedding Authors, Mentions, and Entities
Authors Yi Yang, Ming-Wei Chang, Jacob Eisenstein
Abstract Entity linking is the task of identifying mentions of entities in text, and linking them to entries in a knowledge base. This task is especially difficult in microblogs, as there is little additional text to provide disambiguating context; rather, authors rely on an implicit common ground of shared knowledge with their readers. In this paper, we attempt to capture some of this implicit context by exploiting the social network structure in microblogs. We build on the theory of homophily, which implies that socially linked individuals share interests, and are therefore likely to mention the same sorts of entities. We implement this idea by encoding authors, mentions, and entities in a continuous vector space, which is constructed so that socially-connected authors have similar vector representations. These vectors are incorporated into a neural structured prediction model, which captures structural constraints that are inherent in the entity linking task. Together, these design decisions yield F1 improvements of 1%-5% on benchmark datasets, as compared to the previous state-of-the-art.
Tasks Entity Linking, Structured Prediction
Published 2016-09-26
URL http://arxiv.org/abs/1609.08084v1
PDF http://arxiv.org/pdf/1609.08084v1.pdf
PWC https://paperswithcode.com/paper/toward-socially-infused-information
Repo
Framework
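The homophily intuition above (socially linked authors should receive similar vectors) can be sketched by smoothing author embeddings along the social graph. The graph, dimensions, and averaging scheme below are invented for illustration and are not the paper's training objective:

```python
import numpy as np

# Smooth author embeddings along social edges so connected authors end
# up with similar vectors: each iteration mixes an author's vector with
# the mean of its neighbours' vectors.

def smooth_embeddings(emb, edges, alpha=0.5, n_iter=20):
    emb = emb.copy()
    neighbours = {i: [] for i in range(len(emb))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    for _ in range(n_iter):
        new = emb.copy()
        for i, ns in neighbours.items():
            if ns:
                new[i] = (1 - alpha) * emb[i] + alpha * emb[ns].mean(axis=0)
        emb = new
    return emb

rng = np.random.default_rng(3)
emb = rng.normal(size=(4, 5))
edges = [(0, 1), (2, 3)]                # two disconnected author pairs
out = smooth_embeddings(emb, edges)
d_linked = np.linalg.norm(out[0] - out[1])
d_unlinked = np.linalg.norm(out[0] - out[2])
print(d_linked < d_unlinked)  # connected authors end up closer
```

In the paper these author vectors then feed a structured prediction model for entity linking; here the sketch shows only the "similar neighbours" property.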

Deterministic and Probabilistic Conditions for Finite Completability of Low-Tucker-Rank Tensor

Title Deterministic and Probabilistic Conditions for Finite Completability of Low-Tucker-Rank Tensor
Authors Morteza Ashraphijuo, Vaneet Aggarwal, Xiaodong Wang
Abstract We investigate the fundamental conditions on the sampling pattern, i.e., locations of the sampled entries, for finite completability of a low-rank tensor given some components of its Tucker rank. In order to find the deterministic necessary and sufficient conditions, we propose an algebraic geometric analysis on the Tucker manifold, which allows us to incorporate multiple rank components in the proposed analysis in contrast with the conventional geometric approaches on the Grassmannian manifold. This analysis characterizes the algebraic independence of a set of polynomials defined based on the sampling pattern, which is closely related to finite completion. Probabilistic conditions are then studied and a lower bound on the sampling probability is given, which guarantees that the proposed deterministic conditions on the sampling patterns for finite completability hold with high probability. Furthermore, using the proposed geometric approach for finite completability, we propose a sufficient condition on the sampling pattern that ensures there exists exactly one completion for the sampled tensor.
Tasks
Published 2016-12-06
URL https://arxiv.org/abs/1612.01597v3
PDF https://arxiv.org/pdf/1612.01597v3.pdf
PWC https://paperswithcode.com/paper/deterministic-and-probabilistic-conditions-1
Repo
Framework

Communication-Efficient Distributed Statistical Inference

Title Communication-Efficient Distributed Statistical Inference
Authors Michael I. Jordan, Jason D. Lee, Yun Yang
Abstract We present a Communication-efficient Surrogate Likelihood (CSL) framework for solving distributed statistical inference problems. CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation and Bayesian inference. For low-dimensional estimation, CSL provably improves upon naive averaging schemes and facilitates the construction of confidence intervals. For high-dimensional regularized estimation, CSL leads to a minimax-optimal estimator with controlled communication cost. For Bayesian inference, CSL can be used to form a communication-efficient quasi-posterior distribution that converges to the true posterior. This quasi-posterior procedure significantly improves the computational efficiency of MCMC algorithms even in a non-distributed setting. We present both theoretical analysis and experiments to explore the properties of the CSL approximation.
Tasks Bayesian Inference
Published 2016-05-25
URL http://arxiv.org/abs/1605.07689v3
PDF http://arxiv.org/pdf/1605.07689v3.pdf
PWC https://paperswithcode.com/paper/communication-efficient-distributed-3
Repo
Framework
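The CSL idea can be sketched on a toy Gaussian-mean problem: machines communicate only gradients at an initial estimate, and one machine minimizes a surrogate that corrects its local loss with the global-minus-local gradient. The data, shard sizes, and quadratic loss below are assumptions chosen so the surrogate minimizer has a closed form:

```python
import numpy as np

# Toy sketch of a communication-efficient surrogate likelihood for
# estimating a Gaussian mean from four equally sized shards.

rng = np.random.default_rng(1)
shards = [rng.normal(loc=3.0, size=50) for _ in range(4)]

def local_grad(shard, theta):
    # gradient of the local quadratic loss 0.5 * mean((theta - x)^2)
    return theta - shard.mean()

theta0 = shards[0].mean()                              # cheap initial estimate
grad_global = np.mean([local_grad(s, theta0) for s in shards])

# Machine 1 minimizes L1(theta) + (grad_global - grad_1(theta0)) * theta;
# for this quadratic loss the minimizer is available in closed form.
theta_csl = shards[0].mean() - (grad_global - local_grad(shards[0], theta0))
theta_full = np.concatenate(shards).mean()             # centralized estimate
print(abs(theta_csl - theta_full) < 1e-9)  # exact here: the loss is quadratic
```

For genuinely non-quadratic likelihoods the surrogate step is only approximate and is iterated, which is where the framework's theory applies.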

Function-Described Graphs for Structural Pattern Recognition

Title Function-Described Graphs for Structural Pattern Recognition
Authors Francesc Serratosa
Abstract We present in this article the model Function-described graph (FDG), which is a type of compact representation of a set of attributed graphs (AGs) that borrows from Random Graphs the capability of probabilistic modelling of structural and attribute information. We define the FDGs, their features and two distance measures between AGs (unclassified patterns) and FDGs (models or classes), and we also explain an efficient matching algorithm. Two applications of FDGs are presented: in the former, FDGs are used for modelling and matching 3D-objects described by multiple views, whereas in the latter, they are used for representing and recognising human faces, also described by several views.
Tasks
Published 2016-05-10
URL http://arxiv.org/abs/1605.02929v1
PDF http://arxiv.org/pdf/1605.02929v1.pdf
PWC https://paperswithcode.com/paper/function-described-graphs-for-structural
Repo
Framework

Mutual Information and Diverse Decoding Improve Neural Machine Translation

Title Mutual Information and Diverse Decoding Improve Neural Machine Translation
Authors Jiwei Li, Dan Jurafsky
Abstract Sequence-to-sequence neural translation models learn semantic and syntactic relations between sentence pairs by optimizing the likelihood of the target given the source, i.e., $p(y|x)$, an objective that ignores other potentially useful sources of information. We introduce an alternative objective function for neural MT that maximizes the mutual information between the source and target sentences, modeling the bi-directional dependency of sources and targets. We implement the model with a simple re-ranking method, and also introduce a decoding algorithm that increases diversity in the N-best list produced by the first pass. Applied to the WMT German/English and French/English tasks, the proposed model offers a consistent performance boost on both standard LSTM and attention-based neural MT architectures.
Tasks Machine Translation
Published 2016-01-04
URL http://arxiv.org/abs/1601.00372v2
PDF http://arxiv.org/pdf/1601.00372v2.pdf
PWC https://paperswithcode.com/paper/mutual-information-and-diverse-decoding
Repo
Framework
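The re-ranking step can be sketched directly: score each N-best candidate by $\log p(y|x) + \lambda \log p(x|y)$, so a backward model penalizes generic targets. The candidates and probabilities below are fabricated stand-ins for real model scores:

```python
import math

# Rerank an N-best list with a bidirectional score: forward likelihood
# plus a weighted backward likelihood p(source | candidate).

def rerank(nbest, lam=0.5):
    # nbest: list of (candidate, log_p_forward, log_p_backward)
    return sorted(nbest, key=lambda c: c[1] + lam * c[2], reverse=True)

candidates = [
    ("i do not know .",  math.log(0.30), math.log(0.01)),  # generic: high fwd, low bwd
    ("the cat sat down", math.log(0.20), math.log(0.25)),
    ("a cat sat",        math.log(0.10), math.log(0.20)),
]
best = rerank(candidates)[0][0]
print(best)  # "the cat sat down" once the backward score is included
```

Under forward score alone the generic candidate wins; the mutual-information-style objective flips the ranking, which is the behaviour the abstract describes.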

Parts for the Whole: The DCT Norm for Extreme Visual Recovery

Title Parts for the Whole: The DCT Norm for Extreme Visual Recovery
Authors Yunhe Wang, Chang Xu, Shan You, Dacheng Tao, Chao Xu
Abstract Here we study the extreme visual recovery problem, in which over 90% of pixel values in a given image are missing. Existing low rank-based algorithms are only effective for recovering data with at most 90% missing values. Thus, we exploit visual data’s smoothness property to help solve this challenging extreme visual recovery problem. Based on the Discrete Cosine Transformation (DCT), we propose a novel DCT norm that involves all pixels and produces smooth estimations in any view. Our theoretical analysis shows that the total variation (TV) norm, which only achieves local smoothness, is a special case of the proposed DCT norm. We also develop a new visual recovery algorithm by minimizing the DCT and nuclear norms to achieve a more visually pleasing estimation. Experimental results on a benchmark image dataset demonstrate that the proposed approach is superior to state-of-the-art methods in terms of peak signal-to-noise ratio and structural similarity.
Tasks
Published 2016-04-19
URL http://arxiv.org/abs/1604.05451v1
PDF http://arxiv.org/pdf/1604.05451v1.pdf
PWC https://paperswithcode.com/paper/parts-for-the-whole-the-dct-norm-for-extreme
Repo
Framework
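The idea of a DCT-based smoothness penalty can be sketched on a 1D signal: transform with a DCT-II, then penalize high-frequency coefficients more heavily. The frequency weighting below is an assumption made for the example, not the paper's exact norm:

```python
import math

# Hand-rolled (unnormalized) DCT-II plus a frequency-weighted penalty:
# rough signals concentrate energy in high-index coefficients and pay more.

def dct2(signal):
    n = len(signal)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(signal))
            for k in range(n)]

def dct_penalty(signal):
    coeffs = dct2(signal)
    # weight coefficient k by its frequency index k
    return sum((k * c) ** 2 for k, c in enumerate(coeffs))

smooth = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
rough = [1.0, 2.0, 0.5, 2.5, 0.2, 3.0]
print(dct_penalty(smooth) < dct_penalty(rough))  # True
```

Because every pixel contributes to every DCT coefficient, such a penalty enforces global smoothness, in contrast with the local differences used by the TV norm.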

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs

Title Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs
Authors Anton Osokin, Jean-Baptiste Alayrac, Isabella Lukasewitz, Puneet K. Dokania, Simon Lacoste-Julien
Abstract In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications. The key intuition behind our improvements is that the estimates of block gaps maintained by BCFW reveal the block suboptimality that can be used as an adaptive criterion. First, we sample objects at each iteration of BCFW in an adaptive non-uniform way via gap-based sampling. Second, we incorporate pairwise and away-step variants of Frank-Wolfe into the block-coordinate setting. Third, we cache oracle calls with a cache-hit criterion based on the block gaps. Fourth, we provide the first method to compute an approximate regularization path for SSVM. Finally, we provide an exhaustive empirical evaluation of all our methods on four structured prediction datasets.
Tasks Structured Prediction
Published 2016-05-30
URL http://arxiv.org/abs/1605.09346v1
PDF http://arxiv.org/pdf/1605.09346v1.pdf
PWC https://paperswithcode.com/paper/minding-the-gaps-for-block-frank-wolfe
Repo
Framework
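The gap-based sampling heuristic can be sketched in a few lines: draw each block with probability proportional to its current duality-gap estimate, so more suboptimal blocks are visited more often. The gap values below are invented:

```python
import random

# Sample a block index with probability proportional to its gap estimate.

def sample_block(gaps, rng):
    total = sum(gaps)
    u = rng.random() * total
    acc = 0.0
    for i, g in enumerate(gaps):
        acc += g
        if u <= acc:
            return i
    return len(gaps) - 1

rng = random.Random(0)
gaps = [0.05, 0.90, 0.05]            # block 1 is far from optimal
draws = [sample_block(gaps, rng) for _ in range(1000)]
frac = draws.count(1) / len(draws)
print(frac)                          # roughly 0.9
```

In BCFW the gap estimates are refreshed as blocks are visited, so the sampling distribution adapts as optimization proceeds.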

Perceptual Quality Prediction on Authentically Distorted Images Using a Bag of Features Approach

Title Perceptual Quality Prediction on Authentically Distorted Images Using a Bag of Features Approach
Authors Deepti Ghadiyaram, Alan C. Bovik
Abstract Current top-performing blind perceptual image quality prediction models are generally trained on legacy databases of human quality opinion scores on synthetically distorted images. Therefore they learn image features that effectively predict human visual quality judgments of inauthentic, and usually isolated (single) distortions. However, real-world images usually contain complex, composite mixtures of multiple distortions. We study the perceptually relevant natural scene statistics of such authentically distorted images, in different color spaces and transform domains. We propose a bag of feature-maps approach which avoids assumptions about the type of distortion(s) contained in an image, focusing instead on capturing consistencies, or departures therefrom, of the statistics of real-world images. Using a large database of authentically distorted images, human opinions of them, and bags of features computed on them, we train a regressor to conduct image quality prediction. We demonstrate the competence of the features towards improving automatic perceptual quality prediction by testing a learned algorithm using them on a benchmark legacy database as well as on a newly introduced distortion-realistic resource called the LIVE In the Wild Image Quality Challenge Database. We extensively evaluate the perceptual quality prediction model and algorithm and show that it achieves quality prediction power better than that of other leading models.
Tasks
Published 2016-09-15
URL http://arxiv.org/abs/1609.04757v1
PDF http://arxiv.org/pdf/1609.04757v1.pdf
PWC https://paperswithcode.com/paper/perceptual-quality-prediction-on
Repo
Framework

Maximal Sparsity with Deep Networks?

Title Maximal Sparsity with Deep Networks?
Authors Bo Xin, Yizhou Wang, Wen Gao, David Wipf
Abstract The iterations of many sparse estimation algorithms are comprised of a fixed linear filter cascaded with a thresholding nonlinearity, which collectively resemble a typical neural network layer. Consequently, a lengthy sequence of algorithm iterations can be viewed as a deep network with shared, hand-crafted layer weights. It is therefore quite natural to examine the degree to which a learned network model might act as a viable surrogate for traditional sparse estimation in domains where ample training data is available. While the possibility of a reduced computational budget is readily apparent when a ceiling is imposed on the number of layers, our work primarily focuses on estimation accuracy. In particular, it is well-known that when a signal dictionary has coherent columns, as quantified by a large RIP constant, then most tractable iterative algorithms are unable to find maximally sparse representations. In contrast, we demonstrate both theoretically and empirically the potential for a trained deep network to recover minimal $\ell_0$-norm representations in regimes where existing methods fail. The resulting system is deployed on a practical photometric stereo estimation problem, where the goal is to remove sparse outliers that can disrupt the estimation of surface normals from a 3D scene.
Tasks
Published 2016-05-05
URL http://arxiv.org/abs/1605.01636v2
PDF http://arxiv.org/pdf/1605.01636v2.pdf
PWC https://paperswithcode.com/paper/maximal-sparsity-with-deep-networks
Repo
Framework
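The "algorithm as network" view described above can be sketched with classical ISTA: each iteration is a linear filter followed by a soft-threshold, i.e. one layer with shared weights. A learned variant would train the filter and threshold; here they are fixed, and the dictionary and sparse signal are invented:

```python
import numpy as np

# Unrolled ISTA for the lasso: each loop body is one "layer" of a deep
# network with shared, hand-crafted weights.

def soft_threshold(v, theta):
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def ista(A, y, n_layers=200, theta=0.05):
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):              # each loop body = one "layer"
        x = soft_threshold(x + A.T @ (y - A @ x) / L, theta / L)
    return x

rng = np.random.default_rng(2)
A = rng.normal(size=(20, 40)) / np.sqrt(20)
x_true = np.zeros(40)
x_true[[3, 17]] = [1.5, -2.0]
x_hat = ista(A, A @ x_true)
top2 = set(np.argsort(np.abs(x_hat))[-2:])
print(top2)  # the two largest magnitudes sit at indices 3 and 17
```

The paper's point is that replacing these fixed weights with learned ones can recover maximally sparse solutions even on coherent dictionaries where fixed-weight iterations like this one fail.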

Algorithm-Induced Prior for Image Restoration

Title Algorithm-Induced Prior for Image Restoration
Authors Stanley H. Chan
Abstract This paper studies a type of image priors that are constructed implicitly through the alternating direction method of multipliers (ADMM) algorithm, called the algorithm-induced prior. Different from classical image priors which are defined before running the reconstruction algorithm, algorithm-induced priors are defined by the denoising procedure used to replace one of the two modules in the ADMM algorithm. Since such a prior is not explicitly defined, analyzing the performance has been difficult in the past. Focusing on the class of symmetric smoothing filters, this paper presents an explicit expression of the prior induced by the ADMM algorithm. The new prior is reminiscent of the conventional graph Laplacian but with stronger reconstruction performance. It can also be shown that the overall reconstruction has an efficient closed-form implementation if the associated symmetric smoothing filter is low rank. The results are validated with experiments on image inpainting.
Tasks Denoising, Image Inpainting, Image Restoration
Published 2016-02-01
URL http://arxiv.org/abs/1602.00715v1
PDF http://arxiv.org/pdf/1602.00715v1.pdf
PWC https://paperswithcode.com/paper/algorithm-induced-prior-for-image-restoration
Repo
Framework
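The "denoiser in place of a prior" construction can be sketched with a toy plug-and-play ADMM for 1D inpainting, where the prior step is replaced by a symmetric moving-average filter. Signal, mask, and parameters are invented for the sketch, and the filter stands in for the symmetric smoothing filters the paper analyzes:

```python
import numpy as np

# Plug-and-play ADMM for 1D inpainting: data-fit step on observed
# entries, a smoothing filter as the implicit prior step, and a dual
# update tying the two together.

def moving_average(v):
    pad = np.pad(v, 1, mode="edge")
    return (pad[:-2] + pad[1:-1] + pad[2:]) / 3.0   # symmetric smoothing filter

def pnp_admm_inpaint(y, mask, rho=1.0, n_iter=100):
    x = y.copy()
    v = y.copy()
    u = np.zeros_like(y)
    for _ in range(n_iter):
        # data step: anchor observed entries to y, leave the rest free
        x = np.where(mask, (y + rho * (v - u)) / (1 + rho), v - u)
        v = moving_average(x + u)                   # denoiser replaces the prior
        u = u + x - v
    return x

t = np.linspace(0, 1, 32)
clean = np.sin(2 * np.pi * t)
mask = np.ones_like(clean, dtype=bool)
mask[10:15] = False                                 # missing segment
y = np.where(mask, clean, 0.0)
restored = pnp_admm_inpaint(y, mask)
print(np.abs(restored[~mask] - clean[~mask]).max())  # gap error, down from ~0.9
```

Because the denoiser here is linear, the induced prior is explicit in principle, which is exactly the regime the abstract analyzes.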

Authorship Attribution Based on Life-Like Network Automata

Title Authorship Attribution Based on Life-Like Network Automata
Authors Jeaneth Machicao, Edilson A. Corrêa Jr., Gisele H. B. Miranda, Diego R. Amancio, Odemir M. Bruno
Abstract Authorship attribution is a problem of considerable practical and technical interest. Several methods have been designed to infer the authorship of disputed documents in multiple contexts. While traditional statistical methods based solely on word counts and related measurements have provided a simple, yet effective solution in particular cases, they are prone to manipulation. Recently, texts have been successfully modeled as networks, where words are represented by nodes linked according to textual similarity measurements. Such models are useful to identify informative topological patterns for the authorship recognition task. However, there is no consensus on which measurements should be used. Thus, we propose a novel method to characterize text networks, by considering both topological and dynamical aspects of networks. Using concepts and methods from cellular automata theory, we devised a strategy to grasp informative spatio-temporal patterns from this model. Our experiments revealed improved performance over traditional analysis relying only on topological measurements. Remarkably, we found a dependence of the obtained results on pre-processing steps (such as lemmatization), a feature that has mostly been disregarded in related works. The optimized results obtained here pave the way for a better characterization of textual networks.
Tasks Lemmatization
Published 2016-10-20
URL http://arxiv.org/abs/1610.06498v1
PDF http://arxiv.org/pdf/1610.06498v1.pdf
PWC https://paperswithcode.com/paper/authorship-attribution-based-on-life-like
Repo
Framework

An Analysis of Lemmatization on Topic Models of Morphologically Rich Language

Title An Analysis of Lemmatization on Topic Models of Morphologically Rich Language
Authors Chandler May, Ryan Cotterell, Benjamin Van Durme
Abstract Topic models are typically represented by top-$m$ word lists for human interpretation. The corpus is often pre-processed with lemmatization (or stemming) so that those representations are not undermined by a proliferation of words with similar meanings, but there is little public work on the effects of that pre-processing. Recent work studied the effect of stemming on topic models of English texts and found no supporting evidence for the practice. We study the effect of lemmatization on topic models of Russian Wikipedia articles, finding in one configuration that it significantly improves interpretability according to a word intrusion metric. We conclude that lemmatization may benefit topic models on morphologically rich languages, but that further investigation is needed.
Tasks Lemmatization, Topic Models
Published 2016-08-13
URL https://arxiv.org/abs/1608.03995v2
PDF https://arxiv.org/pdf/1608.03995v2.pdf
PWC https://paperswithcode.com/paper/analysis-of-morphology-in-topic-modeling
Repo
Framework

Robust and Parallel Bayesian Model Selection

Title Robust and Parallel Bayesian Model Selection
Authors Michael Minyi Zhang, Henry Lam, Lizhen Lin
Abstract Effective and accurate model selection is an important problem in modern data analysis. One of the major challenges is the computational burden required to handle large data sets that cannot be stored or processed on one machine. Another challenge one may encounter is the presence of outliers and contaminations that damage the inference quality. The parallel “divide and conquer” model selection strategy divides the observations of the full data set into roughly equal subsets and performs inference and model selection independently on each subset. After local subset inference, this method aggregates the posterior model probabilities or other model/variable selection criteria to obtain a final model by using the notion of the geometric median. This approach leads to improved concentration in finding the “correct” model and model parameters and is also provably robust to outliers and data contamination.
Tasks Model Selection
Published 2016-10-19
URL http://arxiv.org/abs/1610.06194v3
PDF http://arxiv.org/pdf/1610.06194v3.pdf
PWC https://paperswithcode.com/paper/robust-and-parallel-bayesian-model-selection
Repo
Framework
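The robust aggregation step can be sketched with Weiszfeld's algorithm for the geometric median: combining per-subset estimates this way tolerates a contaminated subset that would wreck a plain average. The subset estimates below are fabricated, with one deliberate outlier:

```python
import numpy as np

# Geometric median of subset estimates via Weiszfeld's iterative
# reweighting: points far from the current estimate get down-weighted.

def geometric_median(points, n_iter=100, eps=1e-9):
    m = points.mean(axis=0)
    for _ in range(n_iter):
        d = np.linalg.norm(points - m, axis=1)
        w = 1.0 / np.maximum(d, eps)        # inverse-distance weights
        m = (w[:, None] * points).sum(axis=0) / w.sum()
    return m

subset_estimates = np.array([
    [1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [1.0, 2.05],
    [50.0, -30.0],                          # one machine saw contaminated data
])
med = geometric_median(subset_estimates)
mean = subset_estimates.mean(axis=0)
print(np.linalg.norm(med - [1.0, 2.0]) < 1.0)   # median stays near the cluster
print(np.linalg.norm(mean - [1.0, 2.0]) > 5.0)  # plain average is dragged away
```

This is the mechanism behind the robustness claim: as long as a majority of subsets are clean, the geometric median concentrates near the correct value.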