Paper Group ANR 163
Deep Neural Networks for Estimation and Inference. Top k Memory Candidates in Memory Networks for Common Sense Reasoning. Analysis of Fast Structured Dictionary Learning. Seeded Ising Model and Statistical Natures of Human Iris Templates. Convergence Rates for Projective Splitting. A Corpus for Modeling Word Importance in Spoken Dialogue Transcript …
Deep Neural Networks for Estimation and Inference
Title | Deep Neural Networks for Estimation and Inference |
Authors | Max H. Farrell, Tengyuan Liang, Sanjog Misra |
Abstract | We study deep neural networks and their use in semiparametric inference. We establish novel rates of convergence for deep feedforward neural nets. Our new rates are sufficiently fast (in some cases minimax optimal) to allow us to establish valid second-step inference after first-step estimation with deep learning, a result also new to the literature. Our estimation rates and semiparametric inference results handle the current standard architecture: fully connected feedforward neural networks (multi-layer perceptrons), with the now-common rectified linear unit activation function and a depth explicitly diverging with the sample size. We discuss other architectures as well, including fixed-width, very deep networks. We establish nonasymptotic bounds for these deep nets for a general class of nonparametric regression-type loss functions, which includes as special cases least squares, logistic regression, and other generalized linear models. We then apply our theory to develop semiparametric inference, focusing on causal parameters for concreteness, such as treatment effects, expected welfare, and decomposition effects. Inference in many other semiparametric contexts can be readily obtained. We demonstrate the effectiveness of deep learning with a Monte Carlo analysis and an empirical application to direct mail marketing. |
Tasks | |
Published | 2018-09-26 |
URL | https://arxiv.org/abs/1809.09953v3 |
https://arxiv.org/pdf/1809.09953v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-for-estimation-and |
Repo | |
Framework | |
Top k Memory Candidates in Memory Networks for Common Sense Reasoning
Title | Top k Memory Candidates in Memory Networks for Common Sense Reasoning |
Authors | Vatsal Mahajan |
Abstract | Successful completion of a reasoning task requires the agent to have relevant prior knowledge or some given context of the world dynamics. Usually, the information provided to the system for a reasoning task is just the query or some supporting story, which is often not enough for common reasoning tasks. The goal here is that, if the information provided along with the question is not sufficient to answer the question correctly, the model should choose the k most relevant documents that can aid its inference process. In this work, the model dynamically selects the top k most relevant memory candidates that can be used to successfully solve reasoning tasks. Experiments were conducted on a subset of Winograd Schema Challenge (WSC) problems to show that the proposed model has the potential for commonsense reasoning. The WSC is a test of machine intelligence, designed to be an improvement on the Turing test. |
Tasks | Common Sense Reasoning |
Published | 2018-01-14 |
URL | https://arxiv.org/abs/1801.04622v2 |
https://arxiv.org/pdf/1801.04622v2.pdf | |
PWC | https://paperswithcode.com/paper/top-k-memory-candidates-in-memory-networks |
Repo | |
Framework | |
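The top-k selection step described in the abstract above can be illustrated with a minimal sketch. The abstract does not say how relevance is scored, so the cosine similarity, the toy vectors, and the function names `cosine` and `top_k_memories` are all assumptions made for illustration, not the paper's model.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def top_k_memories(query, memories, k):
    """Return the indices of the k memories most similar to the query, best first."""
    ranked = sorted(range(len(memories)),
                    key=lambda i: cosine(query, memories[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical embedded query and memory candidates.
query = [1.0, 0.0, 1.0]
memories = [
    [1.0, 0.0, 1.0],    # same direction as the query
    [0.0, 1.0, 0.0],    # orthogonal to the query
    [1.0, 0.0, 0.5],    # close to the query
    [-1.0, 0.0, -1.0],  # opposite direction
]
selected = top_k_memories(query, memories, k=2)
print(selected)
```

In a real memory network the scores would come from learned embeddings; only the "rank, then keep k" pattern is the point here.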
Analysis of Fast Structured Dictionary Learning
Title | Analysis of Fast Structured Dictionary Learning |
Authors | Saiprasad Ravishankar, Anna Ma, Deanna Needell |
Abstract | Sparsity-based models and techniques have been exploited in many signal processing and imaging applications. Data-driven methods based on dictionary and sparsifying transform learning enable learning rich image features from data, and can outperform analytical models. In particular, alternating optimization algorithms have been popular for learning such models. In this work, we focus on alternating minimization for a specific structured unitary sparsifying operator learning problem, and provide a convergence analysis. While the algorithm generally converges to the critical points of the problem, our analysis establishes, under mild assumptions, the local linear convergence of the algorithm to the underlying sparsifying model of the data. Analysis and numerical simulations show that our assumptions hold for standard probabilistic data models. In practice, the algorithm is robust to initialization. |
Tasks | Dictionary Learning |
Published | 2018-05-31 |
URL | https://arxiv.org/abs/1805.12529v3 |
https://arxiv.org/pdf/1805.12529v3.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-fast-structured-dictionary |
Repo | |
Framework | |
Seeded Ising Model and Statistical Natures of Human Iris Templates
Title | Seeded Ising Model and Statistical Natures of Human Iris Templates |
Authors | Song-Hwa Kwon, Hyeong In Choi, Sung Jin Lee, Nam-Sook Wee |
Abstract | We propose a variant of the Ising model, called the Seeded Ising Model, to model the probabilistic nature of human iris templates. This model is an Ising model in which the values at certain lattice points are held fixed throughout the Ising model evolution. Using this, we show how to reconstruct the full iris template from partial information, and we show that about 1/6 of the given template is needed to recover almost all the information content of the original one, in the sense that the resulting Hamming distance is well within the range needed to correctly assert the identity of the subject. This leads us to propose the concept of the effective statistical degree of freedom of iris templates and to show that it is about 1/6 of the total number of bits. In particular, for a template of $2048$ bits, its effective statistical degree of freedom is about $342$ bits, which coincides very well with the degree of freedom computed by the completely different method proposed by Daugman. |
Tasks | |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1802.02223v1 |
http://arxiv.org/pdf/1802.02223v1.pdf | |
PWC | https://paperswithcode.com/paper/seeded-ising-model-and-statistical-natures-of |
Repo | |
Framework | |
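The idea of an Ising model with some lattice values held fixed can be sketched as follows. This is a toy illustration only: the Metropolis update rule, the periodic 8 x 8 lattice, the coupling strength, the random "template", and the names `evolve_seeded_ising` and `hamming_fraction` are assumptions, not the authors' model or data. It shows the clamping of seed sites and the Hamming comparison, and makes no claim about reconstruction quality. (For reference, 2048 / 6 is roughly 341.3, in line with the "about 342 bits" quoted in the abstract.)

```python
import math
import random

def hamming_fraction(a, b):
    """Fraction of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def evolve_seeded_ising(spins, seeds, rows, cols, beta=1.0, sweeps=20, rng=None):
    """Metropolis updates on a rows x cols lattice of +/-1 spins (row-major list).

    Sites whose index is in `seeds` are never updated, so they keep their
    initial values for the whole evolution.
    """
    if rng is None:
        rng = random.Random(0)
    spins = list(spins)
    free = [i for i in range(rows * cols) if i not in seeds]
    for _ in range(sweeps):
        for i in free:
            r, c = divmod(i, cols)
            # Sum of the four nearest neighbours, with periodic boundaries.
            nb = (spins[((r - 1) % rows) * cols + c]
                  + spins[((r + 1) % rows) * cols + c]
                  + spins[r * cols + (c - 1) % cols]
                  + spins[r * cols + (c + 1) % cols])
            delta_e = 2 * spins[i] * nb  # energy change if spin i is flipped
            if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                spins[i] = -spins[i]
    return spins

rows, cols = 8, 8
rng = random.Random(42)
template = [rng.choice((-1, 1)) for _ in range(rows * cols)]  # stand-in, not iris data
seeds = set(range(0, rows * cols, 6))                         # roughly one site in six
start = [template[i] if i in seeds else rng.choice((-1, 1))
         for i in range(rows * cols)]
result = evolve_seeded_ising(start, seeds, rows, cols, rng=rng)
seeds_preserved = all(result[i] == template[i] for i in seeds)
print(seeds_preserved, hamming_fraction(result, template))
```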
Convergence Rates for Projective Splitting
Title | Convergence Rates for Projective Splitting |
Authors | Patrick R. Johnstone, Jonathan Eckstein |
Abstract | Projective splitting is a family of methods for solving inclusions involving sums of maximal monotone operators. First introduced by Eckstein and Svaiter in 2008, these methods have enjoyed significant innovation in recent years, becoming one of the most flexible operator splitting frameworks available. While weak convergence of the iterates to a solution has been established, there have been few attempts to study convergence rates of projective splitting. The purpose of this paper is to do so under various assumptions. To this end, there are three main contributions. First, in the context of convex optimization, we establish an $O(1/k)$ ergodic function convergence rate. Second, for strongly monotone inclusions, strong convergence is established as well as an ergodic $O(1/\sqrt{k})$ convergence rate for the distance of the iterates to the solution. Finally, for inclusions featuring strong monotonicity and cocoercivity, linear convergence is established. |
Tasks | |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.03920v3 |
http://arxiv.org/pdf/1806.03920v3.pdf | |
PWC | https://paperswithcode.com/paper/convergence-rates-for-projective-splitting |
Repo | |
Framework | |
A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts
Title | A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts |
Authors | Sushant Kafle, Matt Huenerfauth |
Abstract | Motivated by a project to create a system for people who are deaf or hard-of-hearing that would use automatic speech recognition (ASR) to produce real-time text captions of spoken English during in-person meetings with hearing individuals, we have augmented a transcript of the Switchboard conversational dialogue corpus with an overlay of word-importance annotations, with a numeric score for each word, to indicate its importance to the meaning of each dialogue turn. Further, we demonstrate the utility of this corpus by training an automatic word importance labeling model; our best performing model has an F-score of 0.60 in an ordinal 6-class word-importance classification task with an agreement (concordance correlation coefficient) of 0.839 with the human annotators (agreement score between annotators is 0.89). Finally, we discuss our intended future applications of this resource, particularly for the task of evaluating ASR performance, i.e. creating metrics that predict ASR-output caption text usability for DHH users better than Word Error Rate (WER). |
Tasks | Speech Recognition |
Published | 2018-01-29 |
URL | https://arxiv.org/abs/1801.09746v3 |
https://arxiv.org/pdf/1801.09746v3.pdf | |
PWC | https://paperswithcode.com/paper/a-corpus-for-modeling-word-importance-in |
Repo | |
Framework | |
Conscious enactive computation
Title | Conscious enactive computation |
Authors | Daniel Estrada |
Abstract | This paper looks at recent debates in the enactivist literature on computation and consciousness in order to assess major obstacles to building artificial conscious agents. We consider a proposal from Villalobos and Dewhurst (2018) for enactive computation on the basis of organizational closure. We attempt to improve the argument by reflecting on the closed paths through state space taken by finite state automata. This motivates a defense against Clark’s recent criticisms of “extended consciousness”, and perhaps a new perspective on living with machines. |
Tasks | |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.02578v1 |
http://arxiv.org/pdf/1812.02578v1.pdf | |
PWC | https://paperswithcode.com/paper/conscious-enactive-computation |
Repo | |
Framework | |
Dictionary Learning for Adaptive GPR Landmine Classification
Title | Dictionary Learning for Adaptive GPR Landmine Classification |
Authors | Fabio Giovanneschi, Kumar Vijay Mishra, Maria Antonia Gonzalez-Huici, Yonina C. Eldar, Joachim H. G. Ender |
Abstract | Ground penetrating radar (GPR) target detection and classification is a challenging task. Here, we consider online dictionary learning (DL) methods to obtain sparse representations (SR) of the GPR data to enhance feature extraction for target classification via support vector machines. Online methods are preferred because traditional batch DL like K-SVD is not scalable to high-dimensional training sets and infeasible for real-time operation. We also develop Drop-Off MINi-batch Online Dictionary Learning (DOMINODL) which exploits the fact that a lot of the training data may be correlated. The DOMINODL algorithm iteratively considers elements of the training set in small batches and drops off samples which become less relevant. For the case of abandoned anti-personnel landmine classification, we compare the performance of K-SVD with three online algorithms: classical Online Dictionary Learning, its correlation-based variant, and DOMINODL. Our experiments with real data from L-band GPR show that online DL methods reduce learning time by 36-93% and increase mine detection by 4-28% over K-SVD. Our DOMINODL is the fastest and retains classification performance similar to that of the other two online DL approaches. We use a Kolmogorov-Smirnov test distance and the Dvoretzky-Kiefer-Wolfowitz inequality for the selection of DL input parameters leading to enhanced classification results. To further compare with state-of-the-art classification approaches, we evaluate a convolutional neural network (CNN) classifier which performs worse than the proposed approach. Moreover, when the acquired samples are randomly reduced by 25%, 50% and 75%, sparse decomposition based classification with DL remains robust while the CNN accuracy is drastically compromised. |
Tasks | Dictionary Learning |
Published | 2018-05-24 |
URL | https://arxiv.org/abs/1806.04599v2 |
https://arxiv.org/pdf/1806.04599v2.pdf | |
PWC | https://paperswithcode.com/paper/dictionary-learning-for-adaptive-gpr-target |
Repo | |
Framework | |
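The abstract above names two statistical tools without saying how they enter the parameter selection. The sketch below only shows the two quantities themselves: the two-sample Kolmogorov-Smirnov distance between empirical distributions, and the deviation bound implied by the Dvoretzky-Kiefer-Wolfowitz inequality, P(sup |F_n - F| > eps) <= 2 exp(-2 n eps^2). The function names and sample values are assumptions for illustration.

```python
import bisect
import math

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov distance: sup over x of |F_a(x) - F_b(x)|.

    Both empirical CDFs are right-continuous step functions, so the supremum
    is attained at one of the sample points.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    best = 0.0
    for x in a + b:
        f_a = bisect.bisect_right(a, x) / len(a)
        f_b = bisect.bisect_right(b, x) / len(b)
        best = max(best, abs(f_a - f_b))
    return best

def dkw_epsilon(n, alpha):
    """Smallest eps for which the DKW bound 2 * exp(-2 * n * eps**2) equals alpha."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

print(ks_distance([0.1, 0.4, 0.7], [0.2, 0.5, 0.9, 1.3]))
print(dkw_epsilon(n=1000, alpha=0.05))
```

With probability at least 1 - alpha, an empirical CDF built from n i.i.d. samples stays within `dkw_epsilon(n, alpha)` of the true CDF everywhere.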
Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories
Title | Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories |
Authors | Tharindu Fernando, Sridha Sridharan, Clinton Fookes, Simon Denman |
Abstract | With the explosion in the availability of spatio-temporal tracking data in modern sports, there is an enormous opportunity to better analyse, learn and predict important events in adversarial group environments. In this paper, we propose a deep decision tree architecture for discriminative dictionary learning from adversarial multi-agent trajectories. We first build up a hierarchy for the tree structure by adding each layer and performing feature weight based clustering in the forward pass. We then fine-tune the player role weights using back propagation. The hierarchical architecture ensures the interpretability and the integrity of the group representation. The resulting architecture is a decision tree, with leaf-nodes capturing a dictionary of multi-agent group interactions. Due to the ample volume of data available, we focus on soccer tracking data, although our approach can be used in any adversarial multi-agent domain. We present applications of the proposed method for simulating soccer games as well as evaluating and quantifying team strategies. |
Tasks | Dictionary Learning |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05009v1 |
http://arxiv.org/pdf/1805.05009v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-decision-trees-for-discriminative |
Repo | |
Framework | |
OpenTag: Open Attribute Value Extraction from Product Profiles [Deep Learning, Active Learning, Named Entity Recognition]
Title | OpenTag: Open Attribute Value Extraction from Product Profiles [Deep Learning, Active Learning, Named Entity Recognition] |
Authors | Guineng Zheng, Subhabrata Mukherjee, Xin Luna Dong, Feifei Li |
Abstract | Extraction of missing attribute values is to find values describing an attribute of interest from a free text input. Most past related work on extraction of missing attribute values works with a closed world assumption with the possible set of values known beforehand, or uses dictionaries of values and hand-crafted features. How can we discover new attribute values that we have never seen before? Can we do this with limited human annotation or supervision? We study this problem in the context of product catalogs that often have missing values for many attributes of interest. In this work, we leverage product profile information such as titles and descriptions to discover missing values of product attributes. We develop a novel deep tagging model OpenTag for this extraction problem with the following contributions: (1) we formalize the problem as a sequence tagging task, and propose a joint model exploiting recurrent neural networks (specifically, bidirectional LSTM) to capture context and semantics, and Conditional Random Fields (CRF) to enforce tagging consistency, (2) we develop a novel attention mechanism to provide interpretable explanation for our model’s decisions, (3) we propose a novel sampling strategy exploring active learning to reduce the burden of human annotation. OpenTag does not use any dictionary or hand-crafted features as in prior works. Extensive experiments in real-life datasets in different domains show that OpenTag with our active learning strategy discovers new attribute values from as few as 150 annotated samples (a 3.3x reduction in annotation effort) with a high F-score of 83%, outperforming state-of-the-art models. |
Tasks | Active Learning, Named Entity Recognition |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.01264v2 |
http://arxiv.org/pdf/1806.01264v2.pdf | |
PWC | https://paperswithcode.com/paper/opentag-open-attribute-value-extraction-from |
Repo | |
Framework | |
Dictionary Learning and Sparse Coding on Statistical Manifolds
Title | Dictionary Learning and Sparse Coding on Statistical Manifolds |
Authors | Rudrasis Chakraborty, Monami Banerjee, Baba C. Vemuri |
Abstract | In this paper, we propose a novel information theoretic framework for dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the manifold of probability distributions). Unlike the traditional DL and SC framework, our new formulation does not explicitly incorporate any sparsity inducing norm in the cost function being optimized, yet yields sparse codes. Our algorithm approximates the data points on the statistical manifold (which are probability distributions) by the weighted Kullback-Leibler center/mean (KL-center) of the dictionary atoms. The KL-center is defined as the minimizer of the maximum KL-divergence between itself and members of the set whose center is being sought. Further, we prove that the weighted KL-center is a sparse combination of the dictionary atoms. This result also holds for the case when the KL-divergence is replaced by the well known Hellinger distance. From an applications perspective, we present an extension of the aforementioned framework to the manifold of symmetric positive definite matrices (which can be identified with the manifold of zero-mean Gaussian distributions), $\mathcal{P}_n$. We present experiments involving a variety of dictionary-based reconstruction and classification problems in Computer Vision. Performance of the proposed algorithm is demonstrated by comparing it to several state-of-the-art methods in terms of reconstruction and classification accuracy as well as sparsity of the chosen representation. |
Tasks | Dictionary Learning |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.02505v1 |
http://arxiv.org/pdf/1805.02505v1.pdf | |
PWC | https://paperswithcode.com/paper/dictionary-learning-and-sparse-coding-on-1 |
Repo | |
Framework | |
Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection
Title | Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection |
Authors | Yuxing Tang, Josiah Wang, Xiaofang Wang, Boyang Gao, Emmanuel Dellandrea, Robert Gaizauskas, Liming Chen |
Abstract | Deep CNN-based object detection systems have achieved remarkable success on several large-scale object detection benchmarks. However, training such detectors requires a large number of labeled bounding boxes, which are more difficult to obtain than image-level annotations. Previous work addresses this issue by transforming image-level classifiers into object detectors. This is done by modeling the differences between the two on categories with both image-level and bounding box annotations, and transferring this information to convert classifiers to detectors for categories without bounding box annotations. We improve this previous work by incorporating knowledge about object similarities from visual and semantic domains during the transfer process. The intuition behind our proposed method is that visually and semantically similar categories should exhibit more common transferable properties than dissimilar categories, e.g. a better detector would result from transferring the differences between a dog classifier and a dog detector onto the cat class than from transferring those of the violin class. Experimental results on the challenging ILSVRC2013 detection dataset demonstrate that each of our proposed object similarity based knowledge transfer methods outperforms the baseline methods. We found strong evidence that visual similarity and semantic relatedness are complementary for the task, and when combined notably improve detection, achieving state-of-the-art detection performance in a semi-supervised setting. |
Tasks | Object Detection, Transfer Learning |
Published | 2018-01-09 |
URL | http://arxiv.org/abs/1801.03145v2 |
http://arxiv.org/pdf/1801.03145v2.pdf | |
PWC | https://paperswithcode.com/paper/visual-and-semantic-knowledge-transfer-for |
Repo | |
Framework | |
Learning Category-Specific Mesh Reconstruction from Image Collections
Title | Learning Category-Specific Mesh Reconstruction from Image Collections |
Authors | Angjoo Kanazawa, Shubham Tulsiani, Alexei A. Efros, Jitendra Malik |
Abstract | We present a learning framework for recovering the 3D shape, camera, and texture of an object from a single image. The shape is represented as a deformable 3D mesh model of an object category where a shape is parameterized by a learned mean shape and per-instance predicted deformation. Our approach allows leveraging an annotated image collection for training, where the deformable model and the 3D prediction mechanism are learned without relying on ground-truth 3D or multi-view supervision. Our representation enables us to go beyond existing 3D prediction approaches by incorporating texture inference as prediction of an image in a canonical appearance space. Additionally, we show that semantic keypoints can be easily associated with the predicted shapes. We present qualitative and quantitative results of our approach on CUB and PASCAL3D datasets and show that we can learn to predict diverse shapes and textures across objects using only annotated image collections. The project website can be found at https://akanazawa.github.io/cmr/. |
Tasks | |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07549v2 |
http://arxiv.org/pdf/1803.07549v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-category-specific-mesh |
Repo | |
Framework | |
Benchmarking Deep Sequential Models on Volatility Predictions for Financial Time Series
Title | Benchmarking Deep Sequential Models on Volatility Predictions for Financial Time Series |
Authors | Qiang Zhang, Rui Luo, Yaodong Yang, Yuanyuan Liu |
Abstract | Volatility is a quantity of measurement for the price movements of stocks or options which indicates the uncertainty within financial markets. As an indicator of the level of risk or the degree of variation, volatility is important for analysing the financial market, and it is taken into consideration in various decision-making processes in financial activities. On the other hand, recent advancement in deep learning techniques has shown strong capabilities in modelling sequential data, such as speech and natural language. In this paper, we empirically study the applicability of the latest deep structures with respect to the volatility modelling problem, through which we aim to provide empirical guidance for the theoretical analysis of the marriage between deep learning techniques and financial applications in the future. We examine both the traditional approaches and the deep sequential models on the task of volatility prediction, including the most recent variants of convolutional and recurrent networks, such as the dilated architecture. Accordingly, experiments with real-world stock price datasets are performed on a set of 1314 daily stock series for 2018 days of transaction. The evaluation and comparison are based on the negative log likelihood (NLL) of real-world stock price time series. The result shows that the dilated neural models, including dilated CNN and Dilated RNN, produce the most accurate estimation and prediction, outperforming various widely-used deterministic models in the GARCH family and several recently proposed stochastic models. In addition, the high flexibility and rich expressive power are validated in this study. |
Tasks | Decision Making, Time Series |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03711v1 |
http://arxiv.org/pdf/1811.03711v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-deep-sequential-models-on |
Repo | |
Framework | |
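The evaluation metric named in the abstract above, negative log likelihood, can be illustrated on the simplest GARCH-family baseline. This is a sketch under stated assumptions: zero-mean Gaussian returns, a GARCH(1,1) recursion, initialization at the unconditional variance, and made-up parameter and return values. The abstract does not specify the paper's exact baselines or likelihood.

```python
import math

def garch11_nll(returns, omega, alpha, beta):
    """Average Gaussian negative log likelihood of returns under GARCH(1,1).

    Variance recursion: var_t = omega + alpha * r_{t-1}**2 + beta * var_{t-1}.
    Assumes zero-mean returns and alpha + beta < 1, so that the unconditional
    variance omega / (1 - alpha - beta) exists and can serve as the start value.
    """
    var = omega / (1.0 - alpha - beta)
    total = 0.0
    for r in returns:
        # Gaussian NLL of r given the variance predicted from past returns only.
        total += 0.5 * (math.log(2.0 * math.pi * var) + r * r / var)
        var = omega + alpha * r * r + beta * var
    return total / len(returns)

# Hypothetical daily returns and parameters, for illustration only.
daily_returns = [0.01, -0.02, 0.015, -0.005, 0.03, -0.025]
print(garch11_nll(daily_returns, omega=1e-5, alpha=0.1, beta=0.85))
```

A lower value means the predicted variances explain the observed returns better; any model that outputs a predictive variance per time step can be scored the same way.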
3D Traffic Simulation for Autonomous Vehicles in Unity and Python
Title | 3D Traffic Simulation for Autonomous Vehicles in Unity and Python |
Authors | Zhijing Jin, Tristan Swedish, Ramesh Raskar |
Abstract | In recent years, there has been an explosion of studies on autonomous vehicles. Many collected large amounts of data from human drivers. However, compared to the tedious data collection approach, building a virtual simulation of traffic makes autonomous vehicle research more flexible, time-saving, and scalable. Our work features a 3D simulation that takes in real-time position information parsed from street cameras. The simulation can easily switch between a global bird's-eye view of the traffic and a local perspective of a car. It can also filter out certain objects in its customized camera, creating various channels for objects of different categories. This provides alternative supervised or unsupervised ways to train deep neural networks. Another advantage of the 3D simulation is its conformance to physical laws. Its natural handling of acceleration and collisions prepares the system for potential deep reinforcement learning needs. |
Tasks | Autonomous Vehicles |
Published | 2018-10-30 |
URL | https://arxiv.org/abs/1810.12552v2 |
https://arxiv.org/pdf/1810.12552v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-traffic-simulation-for-autonomous-vehicles |
Repo | |
Framework | |