July 29, 2019

Paper Group AWR 137

Blind Image Fusion for Hyperspectral Imaging with the Directional Total Variation

Title Blind Image Fusion for Hyperspectral Imaging with the Directional Total Variation
Authors Leon Bungert, David A. Coomes, Matthias J. Ehrhardt, Jennifer Rasch, Rafael Reisenhofer, Carola-Bibiane Schönlieb
Abstract Hyperspectral imaging is a cutting-edge type of remote sensing used for mapping vegetation properties, rock minerals and other materials. A major drawback of hyperspectral imaging devices is their intrinsic low spatial resolution. In this paper, we propose a method for increasing the spatial resolution of a hyperspectral image by fusing it with an image of higher spatial resolution that was obtained with a different imaging modality. This is accomplished by solving a variational problem in which the regularization functional is the directional total variation. To accommodate possible mis-registrations between the two images, we consider a non-convex blind super-resolution problem where both a fused image and the corresponding convolution kernel are estimated. Using this approach, our model can realign the given images if needed. Our experimental results indicate that the non-convexity is negligible in practice and that reliable solutions can be computed using a variety of different optimization algorithms. Numerical results on real remote sensing data from plant sciences and urban monitoring show the potential of the proposed method and suggest that it is robust with respect to the regularization parameters, mis-registration and the shape of the kernel.
Tasks Super-Resolution
Published 2017-10-04
URL http://arxiv.org/abs/1710.05705v4
PDF http://arxiv.org/pdf/1710.05705v4.pdf
PWC https://paperswithcode.com/paper/blind-image-fusion-for-hyperspectral-imaging
Repo https://github.com/mehrhardt/blind_remote_sensing
Framework none

Rotational Unit of Memory

Title Rotational Unit of Memory
Authors Rumen Dangovski, Li Jing, Marin Soljacic
Abstract The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNNs) to state-of-the-art performance in a variety of sequential tasks. However, RNNs still have a limited capacity to manipulate long-term memory. To bypass this weakness the most successful applications of RNNs use external techniques such as attention mechanisms. In this paper we propose a novel RNN model that unifies the state-of-the-art approaches: Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix, providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradients problem. Moreover, the rotational unit also serves as associative memory. We evaluate our model on synthetic memorization, question answering and language modeling tasks. RUM learns the Copying Memory task completely and improves the state-of-the-art result in the Recall task. RUM’s performance in the bAbI Question Answering task is comparable to that of models with an attention mechanism. We also improve the state-of-the-art result to 1.189 bits-per-character (BPC) loss in the Character Level Penn Treebank (PTB) task, which signifies the applicability of RUM to real-world sequential data. The universality of our construction, at the core of RNNs, establishes RUM as a promising approach to language modeling, speech recognition and machine translation.
Tasks Language Modelling, Machine Translation, Question Answering, Speech Recognition
Published 2017-10-26
URL http://arxiv.org/abs/1710.09537v1
PDF http://arxiv.org/pdf/1710.09537v1.pdf
PWC https://paperswithcode.com/paper/rotational-unit-of-memory
Repo https://github.com/jingli9111/RUM-Tensorflow
Framework tf

Deep and Confident Prediction for Time Series at Uber

Title Deep and Confident Prediction for Time Series at Uber
Authors Lingxue Zhu, Nikolay Laptev
Abstract Reliable uncertainty estimation for time series prediction is critical in many fields, including physics, biology, and manufacturing. At Uber, probabilistic time series forecasting is used for robust prediction of the number of trips during special events, driver incentive allocation, as well as real-time anomaly detection across millions of metrics. Classical time series models are often used in conjunction with a probabilistic formulation for uncertainty estimation. However, such models are hard to tune, scale, and add exogenous variables to. Motivated by the recent resurgence of Long Short Term Memory networks, we propose a novel end-to-end Bayesian deep model that provides time series prediction along with uncertainty estimation. We provide detailed experiments of the proposed solution on completed trips data, and successfully apply it to large-scale time series anomaly detection at Uber.
Tasks Anomaly Detection, Time Series, Time Series Forecasting, Time Series Prediction
Published 2017-09-06
URL http://arxiv.org/abs/1709.01907v1
PDF http://arxiv.org/pdf/1709.01907v1.pdf
PWC https://paperswithcode.com/paper/deep-and-confident-prediction-for-time-series
Repo https://github.com/jsiloto/dengAI
Framework tf

Cynical Selection of Language Model Training Data

Title Cynical Selection of Language Model Training Data
Authors Amittai Axelrod
Abstract The Moore-Lewis method of “intelligent selection of language model training data” is very effective, cheap, efficient… and also has structural problems. (1) The method defines relevance by playing language models trained on the in-domain and the out-of-domain (or data pool) corpora against each other. This powerful idea, which we set out to preserve, treats the two corpora as the opposing ends of a single spectrum. This lack of nuance does not allow for the two corpora to be very similar. In the extreme case where they come from the same distribution, all of the sentences have a Moore-Lewis score of zero, so there is no resulting ranking. (2) The selected sentences are not guaranteed to be able to model the in-domain data, nor to even cover the in-domain data. They are simply well-liked by the in-domain model; this is necessary, but not sufficient. (3) There is no way to tell what the optimal number of sentences to select is, short of picking various thresholds and building the systems. We present a greedy, lazy, approximate, and generally efficient information-theoretic method of accomplishing the same goal using only vocabulary counts. The method has the following properties: (1) Is responsive to the extent to which two corpora differ. (2) Quickly reaches near-optimal vocabulary coverage. (3) Takes into account what has already been selected. (4) Does not involve defining any kind of domain, nor any kind of classifier. (5) Knows approximately when to stop. This method can be used as an inherently-meaningful measure of similarity, as it measures the bits of information to be gained by adding one text to another.
Tasks Language Modelling
Published 2017-09-07
URL http://arxiv.org/abs/1709.02279v1
PDF http://arxiv.org/pdf/1709.02279v1.pdf
PWC https://paperswithcode.com/paper/cynical-selection-of-language-model-training
Repo https://github.com/amittai/cynical
Framework none
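
The Moore-Lewis relevance score that the abstract criticizes can be illustrated with a minimal sketch. This is not the cynical selection method and not the code from the linked repository: it assumes add-one-smoothed unigram models as a stand-in for real language models, and all function names and toy sentences are hypothetical. It scores each pool sentence by the difference in per-word cross-entropy between an in-domain model and a pool model, and shows the degenerate case from the abstract, where both models come from the same data and every score is zero.

```python
import math
from collections import Counter

def unigram_model(sentences, vocab):
    """Add-one-smoothed unigram probabilities (a simplifying assumption)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def cross_entropy(sentence, model):
    """Per-word cross-entropy of a sentence under a unigram model, in bits."""
    words = sentence.split()
    return -sum(math.log2(model[w]) for w in words) / len(words)

def moore_lewis_score(sentence, in_model, pool_model):
    """Cross-entropy difference; lower means more in-domain-like."""
    return cross_entropy(sentence, in_model) - cross_entropy(sentence, pool_model)

# Toy corpora, invented for illustration only.
in_domain = ["the patient received a dose", "the dose was reduced"]
pool = ["the market fell today", "the patient was discharged", "stocks rose"]
vocab = {w for s in in_domain + pool for w in s.split()}

in_model = unigram_model(in_domain, vocab)
pool_model = unigram_model(pool, vocab)

# Rank pool sentences from most to least in-domain-like.
ranked = sorted(pool, key=lambda s: moore_lewis_score(s, in_model, pool_model))

# Degenerate case: both models estimated from the same data.
same_model = unigram_model(in_domain, vocab)
degenerate = [moore_lewis_score(s, in_model, same_model) for s in pool]
```

With identical models the two cross-entropies cancel, so `degenerate` contains only zeros and no ranking is possible, which is the first structural problem the abstract lists.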

Temporal Relational Reasoning in Videos

Title Temporal Relational Reasoning in Videos
Authors Bolei Zhou, Alex Andonian, Aude Oliva, Antonio Torralba
Abstract Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species. In this paper, we introduce an effective and interpretable network module, the Temporal Relation Network (TRN), designed to learn and reason about temporal dependencies between video frames at multiple time scales. We evaluate TRN-equipped networks on activity recognition tasks using three recent video datasets - Something-Something, Jester, and Charades - which fundamentally depend on temporal relational reasoning. Our results demonstrate that the proposed TRN gives convolutional neural networks a remarkable capacity to discover temporal relations in videos. Through only sparsely sampled video frames, TRN-equipped networks can accurately predict human-object interactions in the Something-Something dataset and identify various human gestures on the Jester dataset with very competitive performance. TRN-equipped networks also outperform two-stream networks and 3D convolution networks in recognizing daily activities in the Charades dataset. Further analyses show that the models learn intuitive and interpretable visual common sense knowledge in videos.
Tasks Action Classification, Action Recognition In Videos, Activity Recognition, Common Sense Reasoning, Human-Object Interaction Detection, Relational Reasoning
Published 2017-11-22
URL http://arxiv.org/abs/1711.08496v2
PDF http://arxiv.org/pdf/1711.08496v2.pdf
PWC https://paperswithcode.com/paper/temporal-relational-reasoning-in-videos
Repo https://github.com/okankop/MFF-pytorch
Framework pytorch

End-to-end Neural Coreference Resolution

Title End-to-end Neural Coreference Resolution
Authors Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer
Abstract We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector. The key idea is to directly consider all spans in a document as potential mentions and learn distributions over possible antecedents for each. The model computes span embeddings that combine context-dependent boundary representations with a head-finding attention mechanism. It is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions. Experiments demonstrate state-of-the-art performance, with a gain of 1.5 F1 on the OntoNotes benchmark and of 3.1 F1 using a 5-model ensemble, despite the fact that this is the first approach to be successfully trained with no external resources.
Tasks Coreference Resolution
Published 2017-07-21
URL http://arxiv.org/abs/1707.07045v2
PDF http://arxiv.org/pdf/1707.07045v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-neural-coreference-resolution
Repo https://github.com/kentonl/e2e-coref
Framework tf

Deep Rotation Equivariant Network

Title Deep Rotation Equivariant Network
Authors Junying Li, Zichen Yang, Haifeng Liu, Deng Cai
Abstract Recently, learning equivariant representations has attracted considerable research attention. Dieleman et al. introduce four operations which can be inserted into a convolutional neural network to learn deep representations equivariant to rotation. However, feature maps should be copied and rotated four times in each layer in their approach, which causes considerable running time and memory overhead. In order to address this problem, we propose Deep Rotation Equivariant Network consisting of cycle layers, isotonic layers and decycle layers. Our proposed layers apply rotation transformation on filters rather than feature maps, achieving a speed up of more than 2 times with even less memory overhead. We evaluate DRENs on Rotated MNIST and CIFAR-10 datasets and demonstrate that they can improve the performance of state-of-the-art architectures.
Tasks
Published 2017-05-24
URL http://arxiv.org/abs/1705.08623v2
PDF http://arxiv.org/pdf/1705.08623v2.pdf
PWC https://paperswithcode.com/paper/deep-rotation-equivariant-network
Repo https://github.com/microljy/DREN_Tensorflow
Framework tf
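
The abstract's idea of rotating filters rather than feature maps rests on a simple identity: cross-correlating a rotated image with an equally rotated filter gives the rotated output. The sketch below checks that identity on small integer arrays. It is an illustration under stated assumptions, not the DREN cycle, isotonic, or decycle layers; the helper names and the toy data are hypothetical, and only 90-degree rotations with "valid" (no padding) cross-correlation are considered.

```python
def rot90(m):
    """Rotate a 2-D list-of-lists by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*m)][::-1]

def correlate_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]

# Toy data, invented for illustration only.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
kernel = [
    [1, -1],
    [2, 0],
]

# Rotate the output of the original correlation ...
rotated_output = rot90(correlate_valid(image, kernel))
# ... versus correlating the rotated image with the rotated filter.
output_of_rotated = correlate_valid(rot90(image), rot90(kernel))

equivariant = rotated_output == output_of_rotated
```

Because the two results agree, responses to rotated inputs can be obtained by rotating small filters instead of copying and rotating full feature maps, which is where the reported savings in time and memory would come from.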

Classification with Costly Features using Deep Reinforcement Learning

Title Classification with Costly Features using Deep Reinforcement Learning
Authors Jaromír Janisch, Tomáš Pevný, Viliam Lisý
Abstract We study a classification problem where each feature can be acquired for a cost and the goal is to optimize a trade-off between the expected classification error and the feature cost. We revisit a former approach that has framed the problem as a sequential decision-making problem and solved it by Q-learning with a linear approximation, where individual actions are either requests for feature values or terminate the episode by providing a classification decision. On a set of eight problems, we demonstrate that by replacing the linear approximation with neural networks the approach becomes comparable to the state-of-the-art algorithms developed specifically for this problem. The approach is flexible, as it can be improved with any new reinforcement learning enhancement, it allows inclusion of a pre-trained high-performance classifier, and unlike prior art, its performance is robust across all evaluated datasets.
Tasks Classification with Costly Features, Decision Making, Q-Learning
Published 2017-11-20
URL http://arxiv.org/abs/1711.07364v2
PDF http://arxiv.org/pdf/1711.07364v2.pdf
PWC https://paperswithcode.com/paper/classification-with-costly-features-using
Repo https://github.com/jaromiru/cwcf
Framework pytorch
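
The sequential decision-making framing in the abstract can be sketched as a tiny episode object. This is a minimal illustration, not the code from the linked repository: the class name, the trade-off parameter `lam`, and the exact reward values (a scaled negative cost for each feature request, 0 for a correct and -1 for an incorrect final classification) are assumptions chosen to express the trade-off between classification error and feature cost. No Q-learning is performed here; the sketch only shows the two kinds of actions and the return an agent would be trained to maximize.

```python
class CostlyFeatureEpisode:
    """One sample presented as an episode (hypothetical helper class)."""

    def __init__(self, features, label, costs, lam):
        self.features = features            # true feature values, hidden at start
        self.label = label                  # true class
        self.costs = costs                  # acquisition cost per feature
        self.lam = lam                      # assumed cost/error trade-off weight
        self.observed = [None] * len(features)
        self.done = False

    def acquire(self, i):
        """Action: request feature i. Reward is the negative scaled cost."""
        if self.done:
            raise RuntimeError("episode already terminated")
        if self.observed[i] is not None:
            raise ValueError("feature already acquired")
        self.observed[i] = self.features[i]
        return -self.lam * self.costs[i]

    def classify(self, predicted):
        """Action: terminate with a class decision. Assumed 0/-1 reward."""
        self.done = True
        return 0.0 if predicted == self.label else -1.0


# Toy episode, values invented for illustration only.
episode = CostlyFeatureEpisode(
    features=[0.2, 0.9, 0.5], label=1, costs=[1.0, 2.0, 1.0], lam=0.1
)
total_return = episode.acquire(1) + episode.classify(1)
```

Here the agent pays for one feature and then classifies correctly, so the return reflects only the acquisition cost; a policy that buys every feature or guesses blindly would be penalized on one side of the trade-off or the other.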

A simple neural network module for relational reasoning

Title A simple neural network module for relational reasoning
Authors Adam Santoro, David Raposo, David G. T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap
Abstract Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn. In this paper we describe how to use Relation Networks (RNs) as a simple plug-and-play module to solve problems that fundamentally hinge on relational reasoning. We tested RN-augmented networks on three tasks: visual question answering using a challenging dataset called CLEVR, on which we achieve state-of-the-art, super-human performance; text-based question answering using the bAbI suite of tasks; and complex reasoning about dynamic physical systems. Then, using a curated dataset called Sort-of-CLEVR we show that powerful convolutional networks do not have a general capacity to solve relational questions, but can gain this capacity when augmented with RNs. Our work shows how a deep learning architecture equipped with an RN module can implicitly discover and learn to reason about entities and their relations.
Tasks Question Answering, Relational Reasoning, Visual Question Answering
Published 2017-06-05
URL http://arxiv.org/abs/1706.01427v1
PDF http://arxiv.org/pdf/1706.01427v1.pdf
PWC https://paperswithcode.com/paper/a-simple-neural-network-module-for-relational
Repo https://github.com/moduIo/Relation-Networks
Framework tf
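
The plug-and-play structure of a Relation Network can be sketched as a function applied to every ordered pair of objects, summed, and passed to a second function. The sketch below is an illustration only: in the paper the two functions are learned neural networks, whereas here `g` and `f` are hand-written stand-ins, and the function names and toy objects are hypothetical. It mainly shows that, because the pairwise outputs are summed, the result does not depend on the order of the objects.

```python
from itertools import product

def relation_network(objects, g, f):
    """Compute f(sum over all ordered pairs (a, b) of g(a, b))."""
    pair_outputs = [g(a, b) for a, b in product(objects, repeat=2)]
    # Element-wise sum of the pairwise relation vectors.
    summed = [sum(values) for values in zip(*pair_outputs)]
    return f(summed)

# Hand-written stand-ins for the learned functions (assumptions).
def g(a, b):
    return [abs(a[0] - b[0]), a[1] * b[1]]

def f(v):
    return 2 * v[0] - v[1]

# Toy objects, invented for illustration only.
objects = [(1, 2), (4, 0), (6, 3)]

output = relation_network(objects, g, f)
output_reversed = relation_network(list(reversed(objects)), g, f)
```

Swapping in trainable networks for `g` and `f` would keep the same structure; only the pairwise aggregation is specific to the module.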

MemNet: A Persistent Memory Network for Image Restoration

Title MemNet: A Persistent Memory Network for Image Restoration
Authors Ying Tai, Jian Yang, Xiaoming Liu, Chunyan Xu
Abstract Recently, very deep convolutional neural networks (CNNs) have been attracting considerable attention in image restoration. However, as the depth grows, the long-term dependency problem is rarely realized for these very deep models, which results in the prior states/layers having little influence on the subsequent ones. Motivated by the fact that human thoughts have persistency, we propose a very deep persistent memory network (MemNet) that introduces a memory block, consisting of a recursive unit and a gate unit, to explicitly mine persistent memory through an adaptive learning process. The recursive unit learns multi-level representations of the current state under different receptive fields. The representations and the outputs from the previous memory blocks are concatenated and sent to the gate unit, which adaptively controls how much of the previous states should be reserved, and decides how much of the current state should be stored. We apply MemNet to three image restoration tasks, i.e., image denoising, super-resolution and JPEG deblocking. Comprehensive experiments demonstrate the necessity of the MemNet and its unanimous superiority on all three tasks over the state of the art. Code is available at https://github.com/tyshiwo/MemNet.
Tasks Image Restoration, Image Super-Resolution, Super-Resolution
Published 2017-08-07
URL http://arxiv.org/abs/1708.02209v1
PDF http://arxiv.org/pdf/1708.02209v1.pdf
PWC https://paperswithcode.com/paper/memnet-a-persistent-memory-network-for-image
Repo https://github.com/tyshiwo/MemNet
Framework tf

End-to-End Deep Learning for Steering Autonomous Vehicles Considering Temporal Dependencies

Title End-to-End Deep Learning for Steering Autonomous Vehicles Considering Temporal Dependencies
Authors Hesham M. Eraqi, Mohamed N. Moustafa, Jens Honer
Abstract Steering a car through traffic is a complex task that is difficult to cast into algorithms. Therefore, researchers turn to training artificial neural networks from a front-facing camera data stream along with the associated steering angles. Nevertheless, most existing solutions consider only the visual camera frames as input, thus ignoring the temporal relationship between frames. In this work, we propose a Convolutional Long Short-Term Memory Recurrent Neural Network (C-LSTM), that is end-to-end trainable, to learn both visual and dynamic temporal dependencies of driving. Additionally, we introduce posing the steering angle regression problem as classification while imposing a spatial relationship between the output layer neurons. Such a method is based on learning a sinusoidal function that encodes steering angles. To train and validate our proposed methods, we used the publicly available Comma.ai dataset. Our solution improved steering root mean square error by 35% over recent methods, and led to a more stable steering by 87%.
Tasks Autonomous Vehicles
Published 2017-10-10
URL http://arxiv.org/abs/1710.03804v3
PDF http://arxiv.org/pdf/1710.03804v3.pdf
PWC https://paperswithcode.com/paper/end-to-end-deep-learning-for-steering
Repo https://github.com/Sondreab/TDT4265_final_project
Framework tf

Unbiased Shape Compactness for Segmentation

Title Unbiased Shape Compactness for Segmentation
Authors Jose Dolz, Ismail Ben Ayed, Christian Desrosiers
Abstract We propose to constrain segmentation functionals with a dimensionless, unbiased and position-independent shape compactness prior, which we solve efficiently with an alternating direction method of multipliers (ADMM). Involving a squared sum of pairwise potentials, our prior results in a challenging high-order optimization problem, which involves dense (fully connected) graphs. We split the problem into a sequence of easier sub-problems, each performed efficiently at each iteration: (i) a sparse-matrix inversion based on Woodbury identity, (ii) a closed-form solution of a cubic equation and (iii) a graph-cut update of a sub-modular pairwise sub-problem with a sparse graph. We deploy our prior in an energy minimization, in conjunction with a supervised classifier term based on CNNs and standard regularization constraints. We demonstrate the usefulness of our energy in several medical applications. In particular, we report comprehensive evaluations of our fully automated algorithm over 40 subjects, showing a competitive performance for the challenging task of abdominal aorta segmentation in MRI.
Tasks
Published 2017-04-28
URL http://arxiv.org/abs/1704.08908v2
PDF http://arxiv.org/pdf/1704.08908v2.pdf
PWC https://paperswithcode.com/paper/unbiased-shape-compactness-for-segmentation
Repo https://github.com/josedolz/UnbiasedShapeCompactness
Framework none

Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

Title Ontology-Aware Token Embeddings for Prepositional Phrase Attachment
Authors Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy
Abstract Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language. Instead, we embed semantic concepts (or synsets) as defined in WordNet and represent a word token in a particular context by estimating a distribution over relevant semantic concepts. We use the new, context-sensitive embeddings in a model for predicting prepositional phrase (PP) attachments and jointly learn the concept embeddings and model parameters. We show that using context-sensitive embeddings improves the accuracy of the PP attachment model by 5.4 absolute percentage points, which amounts to a 34.4% relative reduction in errors.
Tasks Prepositional Phrase Attachment, Word Embeddings
Published 2017-05-08
URL http://arxiv.org/abs/1705.02925v1
PDF http://arxiv.org/pdf/1705.02925v1.pdf
PWC https://paperswithcode.com/paper/ontology-aware-token-embeddings-for
Repo https://github.com/pdasigi/onto-lstm
Framework none
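
The two figures quoted in the abstract (an absolute gain of 5.4 points and a 34.4% relative error reduction) together determine the error rates before and after. The short calculation below works this out. The resulting accuracies are derived here from those two numbers and are not stated in the abstract itself; the variable names are illustrative.

```python
# Figures taken from the abstract.
absolute_gain = 5.4                 # accuracy gain, in percentage points
relative_error_reduction = 0.344    # fraction of baseline errors removed

# relative reduction = absolute gain / baseline error, so:
baseline_error = absolute_gain / relative_error_reduction
improved_error = baseline_error - absolute_gain

# Implied accuracies (derived, not quoted from the abstract).
baseline_accuracy = 100 - baseline_error
improved_accuracy = 100 - improved_error
```

Under these two figures the baseline error is roughly 15.7%, so the implied accuracy rises from about 84.3% to about 89.7%.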

Deep learning with convolutional neural networks for EEG decoding and visualization

Title Deep learning with convolutional neural networks for EEG decoding and visualization
Authors Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard, Tonio Ball
Abstract PLEASE READ AND CITE THE REVISED VERSION at Human Brain Mapping: http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/full Code available here: https://github.com/robintibor/braindecode
Tasks EEG, EEG Decoding
Published 2017-03-15
URL http://arxiv.org/abs/1703.05051v5
PDF http://arxiv.org/pdf/1703.05051v5.pdf
PWC https://paperswithcode.com/paper/deep-learning-with-convolutional-neural
Repo https://github.com/rczhen/Movement-Classification-based-on-Electroencephalography-EEG-Signals
Framework none

Machine Learning for Set-Identified Linear Models

Title Machine Learning for Set-Identified Linear Models
Authors Vira Semenova
Abstract This paper provides estimation and inference methods for an identified set where the selection among a very large number of covariates is based on modern machine learning tools. I characterize the boundary of the identified set (i.e., support function) using a semiparametric moment condition. Combining Neyman-orthogonality and sample splitting ideas, I construct a root-N consistent, uniformly asymptotically Gaussian estimator of the support function and propose a weighted bootstrap procedure to conduct inference about the identified set. I provide a general method to construct a Neyman-orthogonal moment condition for the support function. Applying my method to Lee (2008)'s endogenous selection model, I provide the asymptotic theory for the sharp (i.e., the tightest possible) bounds on the Average Treatment Effect in the presence of high-dimensional covariates. Furthermore, I relax the conventional monotonicity assumption and allow the sign of the treatment effect on the selection (e.g., employment) to be determined by covariates. Using the JobCorps data set with very rich baseline characteristics, I substantially tighten the bounds on the JobCorps effect on wages under a weakened monotonicity assumption.
Tasks Model Selection
Published 2017-12-28
URL https://arxiv.org/abs/1712.10024v3
PDF https://arxiv.org/pdf/1712.10024v3.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-set-identified-linear
Repo https://github.com/vsemenova/leebounds
Framework none