October 17, 2019

Paper Group ANR 862

Compressed Sensing Parallel MRI with Adaptive Shrinkage TV Regularization. Controllable Neural Story Plot Generation via Reinforcement Learning. Building Prior Knowledge: A Markov Based Pedestrian Prediction Model Using Urban Environmental Data. Spectral Mixture Kernels with Time and Phase Delay Dependencies. Extreme Relative Pose Estimation for RG …

Compressed Sensing Parallel MRI with Adaptive Shrinkage TV Regularization


Title	Compressed Sensing Parallel MRI with Adaptive Shrinkage TV Regularization
Authors	Raji Susan Mathew, Joseph Suresh Paul
Abstract	Compressed sensing (CS) methods in magnetic resonance imaging (MRI) offer rapid acquisition and improved image quality but require iterative reconstruction schemes with regularization to enforce sparsity. Regardless of the difficulty in obtaining a fast numerical solution, total variation (TV) regularization is a preferred choice due to its edge-preserving and structure recovery capabilities. While many approaches have been proposed to overcome the non-differentiability of the TV cost term, an iterative shrinkage based formulation allows recovering an image through recursive application of linear filtering and soft thresholding. However, providing an optimal setting for the regularization parameter is critical due to its direct impact on the rate of convergence as well as steady state error. In this paper, a regularizer that varies adaptively in the derivative space and follows the generalized discrepancy principle (GDP) is proposed. The implementation proceeds by adaptively reducing the discrepancy level expressed as the absolute difference between TV norms of the consistency error and the sparse approximation error. A criterion based on the absolute difference between TV norms of consistency and sparse approximation errors is used to update the threshold. Application of the adaptive shrinkage TV regularizer to CS recovery of parallel MRI (pMRI) and temporal gradient adaptation in dynamic MRI are shown to result in improved image quality with accelerated convergence. In addition, the adaptive TV-based iterative shrinkage (ATVIS) provides a significant speed advantage over the fast iterative shrinkage-thresholding algorithm (FISTA).
Tasks
Published	2018-09-18
URL	http://arxiv.org/abs/1809.06665v1
PDF	http://arxiv.org/pdf/1809.06665v1.pdf
PWC	https://paperswithcode.com/paper/compressed-sensing-parallel-mri-with-adaptive
Repo
Framework
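
The abstract above describes recovery through repeated linear filtering and soft thresholding in the derivative space. As a minimal sketch of the shrinkage step alone (not the authors' ATVIS method; the helper names, the 1-D signal, and the fixed threshold are illustrative assumptions, whereas the paper adapts the threshold at each iteration):

```python
def finite_differences(signal):
    """First-order differences of a 1-D signal (a simple 'derivative space')."""
    return [b - a for a, b in zip(signal, signal[1:])]


def soft_threshold(values, threshold):
    """Shrink each value toward zero by `threshold`; small values become 0."""
    shrunk = []
    for v in values:
        if v > threshold:
            shrunk.append(v - threshold)
        elif v < -threshold:
            shrunk.append(v + threshold)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical signal: small fluctuations plus one large jump (an "edge").
signal = [0.0, 0.25, 0.0, 2.0, 2.25]
gradients = finite_differences(signal)
sparse_gradients = soft_threshold(gradients, 0.5)
print(sparse_gradients)
```

The small differences are set to zero while the large jump survives in shrunken form, which is the sparsity-promoting behaviour that TV-type regularizers rely on.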

Controllable Neural Story Plot Generation via Reinforcement Learning


Title	Controllable Neural Story Plot Generation via Reinforcement Learning
Authors	Pradyumna Tambwekar, Murtaza Dhuliawala, Lara J. Martin, Animesh Mehta, Brent Harrison, Mark O. Riedl
Abstract	Language-modeling–based approaches to story plot generation attempt to construct a plot by sampling from a language model (LM) to predict the next character, word, or sentence to add to the story. LM techniques lack the ability to receive guidance from the user to achieve a specific goal, resulting in stories that don’t have a clear sense of progression and lack coherence. We present a reward-shaping technique that analyzes a story corpus and produces intermediate rewards that are backpropagated into a pre-trained LM in order to guide the model towards a given goal. Automated evaluations show our technique can create a model that generates story plots which consistently achieve a specified goal. Human-subject studies show that the generated stories have more plausible event ordering than baseline plot generation techniques.
Tasks	Language Modelling
Published	2018-09-27
URL	https://arxiv.org/abs/1809.10736v3
PDF	https://arxiv.org/pdf/1809.10736v3.pdf
PWC	https://paperswithcode.com/paper/controllable-neural-story-plot-generation-via
Repo
Framework

Building Prior Knowledge: A Markov Based Pedestrian Prediction Model Using Urban Environmental Data


Title	Building Prior Knowledge: A Markov Based Pedestrian Prediction Model Using Urban Environmental Data
Authors	Pavan Vasishta, Dominique Vaufreydaz, Anne Spalanzani
Abstract	Autonomous Vehicles navigating in urban areas need to understand and predict future pedestrian behavior for safer navigation. This high level of situational awareness requires observing pedestrian behavior and extrapolating from their current positions to predict future ones. While some work has been done in this field using Hidden Markov Models (HMMs), one of the few observed drawbacks of the method is the need for informed priors for learning behavior. In this work, an extension to the Growing Hidden Markov Model (GHMM) method is proposed to solve some of these drawbacks. This is achieved by building on existing work using potential cost maps and the principle of Natural Vision. As a consequence, the proposed model is able to predict pedestrian positions more precisely over a longer horizon compared to the state of the art. The method is tested over “legal” and “illegal” behavior of pedestrians, having trained the model with sparse observations and partial trajectories. The method, with no training data, is compared against a trained state of the art model. It is observed that the proposed method is robust even in new, previously unseen areas.
Tasks	Autonomous Vehicles
Published	2018-09-17
URL	http://arxiv.org/abs/1809.06045v1
PDF	http://arxiv.org/pdf/1809.06045v1.pdf
PWC	https://paperswithcode.com/paper/building-prior-knowledge-a-markov-based
Repo
Framework

Spectral Mixture Kernels with Time and Phase Delay Dependencies


Title	Spectral Mixture Kernels with Time and Phase Delay Dependencies
Authors	Kai Chen, Perry Groot, Jinsong Chen, Elena Marchiori
Abstract	Spectral Mixture (SM) kernels form a powerful class of kernels for Gaussian processes, capable of discovering patterns, extrapolating, and modeling negative covariances. Being a linear superposition of quasi-periodical Gaussian components, an SM kernel does not explicitly model dependencies between components. In this paper we investigate the benefits of explicitly modeling time and phase delay dependencies between components in an SM kernel. We analyze the presence of statistical dependencies between components using Gaussian conditionals and posterior covariance and use this framework to motivate the proposed SM kernel extension, called Spectral Mixture kernel with time and phase delay Dependencies (SMD). SMD is constructed in two steps: first, time delay and phase delay are incorporated into each base component; next, cross-convolution between a base component and the reversed complex conjugate of another base component is performed, which yields a complex-valued and positive definite kernel representing correlations between base components. The number of hyper-parameters of SMD, except the time and phase delay ones, remains equal to that of the SM kernel. We perform a thorough comparative experimental analysis of SMD on synthetic and real-life data sets. Results indicate the beneficial effect of modeling time and phase delay dependencies between base components, notably for natural phenomena involving little or no influence from human activity.
Tasks	Gaussian Processes
Published	2018-08-01
URL	https://arxiv.org/abs/1808.00560v6
PDF	https://arxiv.org/pdf/1808.00560v6.pdf
PWC	https://paperswithcode.com/paper/spectral-mixture-kernels-with-time-and-phase
Repo
Framework
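
For orientation, the baseline that this paper extends can be written down compactly. The sketch below evaluates the standard one-dimensional SM kernel, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi tau mu_q); it does not implement the SMD extension with time and phase delays, and the function name and example hyper-parameters are illustrative assumptions.

```python
import math


def sm_kernel(tau, weights, means, variances):
    """Standard 1-D spectral mixture kernel evaluated at input distance `tau`.

    Each component q contributes a Gaussian envelope (width set by variances[q])
    multiplied by a cosine at frequency means[q], scaled by weights[q].
    """
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        envelope = math.exp(-2.0 * math.pi ** 2 * tau ** 2 * var)
        total += w * envelope * math.cos(2.0 * math.pi * tau * mu)
    return total


# Hypothetical single-component kernel with frequency 1 and a narrow spectrum.
at_zero = sm_kernel(0.0, [1.0], [1.0], [0.01])
half_period = sm_kernel(0.5, [1.0], [1.0], [0.01])
print(at_zero, half_period)
```

At half a period the cosine term is negative, which illustrates the abstract's point that SM kernels can model negative covariances. The components are simply summed, so no dependency between them is represented.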

Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion


Title	Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion
Authors	Zhenpei Yang, Jeffrey Z. Pan, Linjie Luo, Xiaowei Zhou, Kristen Grauman, Qixing Huang
Abstract	Estimating the relative rigid pose between two RGB-D scans of the same underlying environment is a fundamental problem in computer vision, robotics, and computer graphics. Most existing approaches allow only limited maximum relative pose changes since they require considerable overlap between the input scans. We introduce a novel deep neural network that extends the scope to extreme relative poses, with little or even no overlap between the input scans. The key idea is to infer more complete scene information about the underlying environment and match on the completed scans. In particular, instead of only performing scene completion from each individual scan, our approach alternates between relative pose estimation and scene completion. This allows us to perform scene completion by utilizing information from both input scans at late iterations, resulting in better results for both scene completion and relative pose estimation. Experimental results on benchmark datasets show that our approach leads to considerable improvements over state-of-the-art approaches for relative pose estimation. In particular, our approach provides encouraging relative pose estimates even between non-overlapping scans.
Tasks	Pose Estimation
Published	2018-12-31
URL	http://arxiv.org/abs/1901.00063v2
PDF	http://arxiv.org/pdf/1901.00063v2.pdf
PWC	https://paperswithcode.com/paper/extreme-relative-pose-estimation-for-rgb-d
Repo
Framework

Demonstrating Advantages of Neuromorphic Computation: A Pilot Study


Title	Demonstrating Advantages of Neuromorphic Computation: A Pilot Study
Authors	Timo Wunderlich, Akos F. Kungl, Eric Müller, Andreas Hartel, Yannik Stradmann, Syed Ahmed Aamir, Andreas Grübl, Arthur Heimbrecht, Korbinian Schreiber, David Stöckel, Christian Pehle, Sebastian Billaudelle, Gerd Kiene, Christian Mauch, Johannes Schemmel, Karlheinz Meier, Mihai A. Petrovici
Abstract	Neuromorphic devices represent an attempt to mimic aspects of the brain’s architecture and dynamics with the aim of replicating its hallmark functional capabilities in terms of computational power, robust learning and energy efficiency. We employ a single-chip prototype of the BrainScaleS 2 neuromorphic system to implement a proof-of-concept demonstration of reward-modulated spike-timing-dependent plasticity in a spiking network that learns to play the Pong video game by smooth pursuit. This system combines an electronic mixed-signal substrate for emulating neuron and synapse dynamics with an embedded digital processor for on-chip learning, which in this work also serves to simulate the virtual environment and learning agent. The analog emulation of neuronal membrane dynamics enables a 1000-fold acceleration with respect to biological real-time, with the entire chip operating on a power budget of 57 mW. Compared to an equivalent simulation using state-of-the-art software, the on-chip emulation is at least one order of magnitude faster and three orders of magnitude more energy-efficient. We demonstrate how on-chip learning can mitigate the effects of fixed-pattern noise, which is unavoidable in analog substrates, while making use of temporal variability for action exploration. Learning compensates for imperfections of the physical substrate, as manifested in neuronal parameter variability, by adapting synaptic weights to match the respective excitability of individual neurons.
Tasks
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03618v4
PDF	http://arxiv.org/pdf/1811.03618v4.pdf
PWC	https://paperswithcode.com/paper/demonstrating-advantages-of-neuromorphic
Repo
Framework

Learning to generate filters for convolutional neural networks


Title	Learning to generate filters for convolutional neural networks
Authors	Wei Shen, Rujie Liu
Abstract	Conventionally, convolutional neural networks (CNNs) process different images with the same set of filters. However, the variations in images pose a challenge to this approach. In this paper, we propose to generate sample-specific filters for convolutional layers in the forward pass. Since the filters are generated on-the-fly, the model becomes more flexible and can better fit the training data compared to traditional CNNs. In order to obtain sample-specific features, we extract the intermediate feature maps from an autoencoder. As filters are usually high dimensional, we propose to learn a set of coefficients instead of a set of filters. These coefficients are used to linearly combine the base filters from a filter repository to generate the final filters for a CNN. The proposed method is evaluated on MNIST, MTFL and CIFAR10 datasets. Experimental results demonstrate that the classification accuracy of the baseline model can be improved by using the proposed filter generation method.
Tasks
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01894v1
PDF	http://arxiv.org/pdf/1812.01894v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-generate-filters-for
Repo
Framework
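
The core step in the abstract above, linearly combining base filters from a repository with per-sample coefficients, can be sketched as follows. This is an illustration only, not the authors' code: the function name, the 2x2 filters, and the coefficient values are assumptions, and in the paper the coefficients would be predicted from autoencoder features rather than written by hand.

```python
def combine_filters(coefficients, base_filters):
    """Return the weighted sum of equally sized 2-D base filters."""
    rows = len(base_filters[0])
    cols = len(base_filters[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for coeff, base in zip(coefficients, base_filters):
        for i in range(rows):
            for j in range(cols):
                combined[i][j] += coeff * base[i][j]
    return combined


# Hypothetical repository with two 2x2 base filters.
repository = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# Hypothetical sample-specific coefficients, one per base filter.
sample_coefficients = [2.0, -1.0]
print(combine_filters(sample_coefficients, repository))
```

Only one coefficient per base filter has to be predicted for each sample, which is far fewer values than a full filter and is the motivation the abstract gives for learning coefficients instead of filters.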

Dynamic Graph Modules for Modeling Object-Object Interactions in Activity Recognition


Title	Dynamic Graph Modules for Modeling Object-Object Interactions in Activity Recognition
Authors	Hao Huang, Luowei Zhou, Wei Zhang, Jason J. Corso, Chenliang Xu
Abstract	Video action recognition, a critical problem in video understanding, has been gaining increasing attention. To identify actions induced by complex object-object interactions, we need to consider not only spatial relations among objects in a single frame, but also temporal relations among different or the same objects across multiple frames. However, existing approaches that model video representations and non-local features are either incapable of explicitly modeling relations at the object-object level or unable to handle streaming videos. In this paper, we propose a novel dynamic hidden graph module to model complex object-object interactions in videos, of which two instantiations are considered: a visual graph that captures appearance/motion changes among objects and a location graph that captures relative spatiotemporal position changes among objects. Additionally, the proposed graph module allows us to process streaming videos, setting it apart from existing methods. Experimental results on benchmark datasets, Something-Something and ActivityNet, show the competitive performance of our method.
Tasks	3D Human Action Recognition, Activity Recognition, Temporal Action Localization, Video Understanding
Published	2018-12-13
URL	https://arxiv.org/abs/1812.05637v3
PDF	https://arxiv.org/pdf/1812.05637v3.pdf
PWC	https://paperswithcode.com/paper/dynamic-graph-modules-for-modeling-higher
Repo
Framework

Non-asymptotic Identification of LTI Systems from a Single Trajectory


Title	Non-asymptotic Identification of LTI Systems from a Single Trajectory
Authors	Samet Oymak, Necmiye Ozay
Abstract	We consider the problem of learning a realization for a linear time-invariant (LTI) dynamical system from input/output data. Given a single input/output trajectory, we provide finite time analysis for learning the system’s Markov parameters, from which a balanced realization is obtained using the classical Ho-Kalman algorithm. By proving a stability result for the Ho-Kalman algorithm and combining it with the sample complexity results for Markov parameters, we show how much data is needed to learn a balanced realization of the system up to a desired accuracy with high probability.
Tasks
Published	2018-06-14
URL	http://arxiv.org/abs/1806.05722v2
PDF	http://arxiv.org/pdf/1806.05722v2.pdf
PWC	https://paperswithcode.com/paper/non-asymptotic-identification-of-lti-systems
Repo
Framework
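
To make the objects in the abstract above concrete, the sketch below computes the Markov parameters h_k = C A^(k-1) B of a scalar system x_{t+1} = a x_t + b u_t, y_t = c x_t and arranges them in the Hankel matrix that the Ho-Kalman algorithm factorizes. It stops short of the factorization itself and of any estimation from noisy data; the scalar setting, function names, and numeric values are illustrative assumptions.

```python
def markov_parameters(a, b, c, count):
    """Impulse-response coefficients h_1..h_count of a scalar LTI system."""
    params = []
    state = b  # state one step after a unit impulse applied at rest
    for _ in range(count):
        params.append(c * state)
        state = a * state
    return params


def hankel(params, rows, cols):
    """Hankel matrix whose (i, j) entry is h_{i+j+1} (0-based i, j)."""
    return [[params[i + j] for j in range(cols)] for i in range(rows)]


h = markov_parameters(0.5, 2.0, 3.0, 4)
H = hankel(h, 2, 2)
determinant = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(h, H, determinant)
```

For this first-order system the 2x2 Hankel matrix has zero determinant, so its rank equals the system order. The Ho-Kalman algorithm exploits this low-rank structure to recover a realization from the Markov parameters.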

Supersaliency: A Novel Pipeline for Predicting Smooth Pursuit-Based Attention Improves Generalizability of Video Saliency


Title	Supersaliency: A Novel Pipeline for Predicting Smooth Pursuit-Based Attention Improves Generalizability of Video Saliency
Authors	Mikhail Startsev, Michael Dorr
Abstract	Predicting attention is a popular topic at the intersection of human and computer vision. However, even though most of the available video saliency data sets and models claim to target human observers’ fixations, they fail to differentiate them from smooth pursuit (SP), a major eye movement type that is unique to perception of dynamic scenes. In this work, we highlight the importance of SP and its prediction (which we call supersaliency, due to greater selectivity compared to fixations), and aim to make its distinction from fixations explicit for computational models. To this end, we (i) use algorithmic and manual annotations of SP and fixations for two well-established video saliency data sets, (ii) train Slicing Convolutional Neural Networks for saliency prediction on either fixation- or SP-salient locations, and (iii) evaluate our and 26 publicly available dynamic saliency models on three data sets against traditional saliency and supersaliency ground truth. Overall, our models outperform the state of the art in both the new supersaliency and the traditional saliency problem settings, for which literature models are optimized. Importantly, on two independent data sets, our supersaliency model shows greater generalization ability and outperforms all other models, even for fixation prediction.
Tasks	Saliency Prediction
Published	2018-01-26
URL	http://arxiv.org/abs/1801.08925v3
PDF	http://arxiv.org/pdf/1801.08925v3.pdf
PWC	https://paperswithcode.com/paper/supersaliency-predicting-smooth-pursuit-based
Repo
Framework

Synthesizing Efficient Solutions for Patrolling Problems in the Internet Environment


Title	Synthesizing Efficient Solutions for Patrolling Problems in the Internet Environment
Authors	Tomáš Brázdil, Antonín Kučera, Vojtěch Řehák
Abstract	We propose an algorithm for constructing efficient patrolling strategies in the Internet environment, where the protected targets are nodes connected to the network and the patrollers are software agents capable of detecting/preventing undesirable activities on the nodes. The algorithm is based on a novel compositional principle designed for a special class of strategies, and it can quickly construct (sub)optimal solutions even if the number of targets reaches hundreds of millions.
Tasks
Published	2018-05-08
URL	http://arxiv.org/abs/1805.02861v2
PDF	http://arxiv.org/pdf/1805.02861v2.pdf
PWC	https://paperswithcode.com/paper/synthesizing-efficient-solutions-for
Repo
Framework

A Probabilistic Theory of Supervised Similarity Learning for Pointwise ROC Curve Optimization


Title	A Probabilistic Theory of Supervised Similarity Learning for Pointwise ROC Curve Optimization
Authors	Robin Vogel, Aurélien Bellet, Stéphan Clémençon
Abstract	The performance of many machine learning techniques depends on the choice of an appropriate similarity or distance measure on the input space. Similarity learning (or metric learning) aims at building such a measure from training data so that observations with the same (resp. different) label are as close (resp. far) as possible. In this paper, similarity learning is investigated from the perspective of pairwise bipartite ranking, where the goal is to rank the elements of a database by decreasing order of the probability that they share the same label with some query data point, based on the similarity scores. A natural performance criterion in this setting is pointwise ROC optimization: maximize the true positive rate under a fixed false positive rate. We study this novel perspective on similarity learning through a rigorous probabilistic framework. The empirical version of the problem gives rise to a constrained optimization formulation involving U-statistics, for which we derive universal learning rates as well as faster rates under a noise assumption on the data distribution. We also address the large-scale setting by analyzing the effect of sampling-based approximations. Our theoretical results are supported by illustrative numerical experiments.
Tasks	Metric Learning
Published	2018-07-18
URL	http://arxiv.org/abs/1807.06981v1
PDF	http://arxiv.org/pdf/1807.06981v1.pdf
PWC	https://paperswithcode.com/paper/a-probabilistic-theory-of-supervised
Repo
Framework
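
The pointwise ROC criterion named in the abstract above, the true positive rate at a fixed false positive rate, can be evaluated empirically on a set of similarity scores. The sketch below is a plain evaluation routine under assumed inputs (scores of same-label pairs and of different-label pairs); it does not implement the paper's learning procedure or its U-statistic analysis, and the function name and data are hypothetical.

```python
import math


def tpr_at_fpr(pos_scores, neg_scores, max_fpr):
    """Empirical true positive rate when at most `max_fpr` of negatives pass.

    A pair is predicted 'same label' when its score is strictly above the
    threshold, which is placed at the highest negative score that keeps the
    number of false positives within the allowed budget.
    """
    allowed = math.floor(max_fpr * len(neg_scores))
    ranked_neg = sorted(neg_scores, reverse=True)
    if allowed >= len(ranked_neg):
        threshold = float("-inf")
    else:
        threshold = ranked_neg[allowed]
    true_positives = sum(1 for s in pos_scores if s > threshold)
    return true_positives / len(pos_scores), threshold


# Hypothetical similarity scores for same-label and different-label pairs.
same_label = [0.9, 0.6, 0.3, 0.85]
different_label = [0.1, 0.4, 0.35, 0.8]
print(tpr_at_fpr(same_label, different_label, 0.25))
```

With four negative pairs and a budget of 0.25, one false positive is tolerated, so the threshold lands on the second-highest negative score.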

Learning Large Euclidean Margin for Sketch-based Image Retrieval


Title	Learning Large Euclidean Margin for Sketch-based Image Retrieval
Authors	Peng Lu, Gao Huang, Yanwei Fu, Guodong Guo, Hangyu Lin
Abstract	This paper addresses the problem of Sketch-Based Image Retrieval (SBIR), for which bridging the gap between the data representations of sketch images and photo images is considered the key. Previous works mostly focus on learning a feature space to minimize intra-class distances for both sketches and photos. In contrast, we propose a novel loss function, named Euclidean Margin Softmax (EMS), that not only minimizes intra-class distances but also maximizes inter-class distances. It enables us to learn a feature space with high discriminability, leading to highly accurate retrieval. In addition, this loss function is applied to a conditional network architecture, which can incorporate the prior knowledge of whether a sample is a sketch or a photo. We show that the conditional information can be conveniently incorporated into the recently proposed Squeeze and Excitation (SE) module, leading to a conditional SE (CSE) module. Extensive experiments are conducted on two widely used SBIR benchmark datasets. Our approach, although very simple, achieves a new state of the art on both datasets, surpassing existing methods by a large margin.
Tasks	Image Retrieval, Sketch-Based Image Retrieval
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04275v1
PDF	http://arxiv.org/pdf/1812.04275v1.pdf
PWC	https://paperswithcode.com/paper/learning-large-euclidean-margin-for-sketch
Repo
Framework

Gaussian Process Uncertainty in Age Estimation as a Measure of Brain Abnormality


Title	Gaussian Process Uncertainty in Age Estimation as a Measure of Brain Abnormality
Authors	Benjamin Gutierrez Becker, Tassilo Klein, Christian Wachinger
Abstract	Multivariate regression models for age estimation are a powerful tool for assessing abnormal brain morphology associated with neuropathology. Age prediction models are built on cohorts of healthy subjects and are designed to reflect normal aging patterns. The application of these multivariate models to diseased subjects usually results in high prediction errors, under the hypothesis that neuropathology presents a degenerative pattern similar to that of accelerated aging. In this work, we propose an alternative to the idea that pathology follows a trajectory similar to that of normal aging. Instead, we propose the use of metrics which measure deviations from the mean aging trajectory. We propose to measure these deviations using two different metrics: uncertainty in a Gaussian process regression model and a newly proposed age weighted uncertainty measure. Consequently, our approach assumes that pathologic brain patterns are different from those of normal aging. We present results for subjects with autism, mild cognitive impairment and Alzheimer’s disease to highlight the versatility of the approach to different diseases and age ranges. We evaluate volume, thickness, and VBM features for quantifying brain morphology. Our evaluations are performed on a large number of images obtained from a variety of publicly available neuroimaging databases. Across all features, our uncertainty based measurements yield a better separation between diseased subjects and healthy individuals than the prediction error. Finally, we illustrate differences between the disease pattern and normal aging, supporting the application of uncertainty as a measure of neuropathology.
Tasks	Age Estimation
Published	2018-04-04
URL	http://arxiv.org/abs/1804.01296v1
PDF	http://arxiv.org/pdf/1804.01296v1.pdf
PWC	https://paperswithcode.com/paper/gaussian-process-uncertainty-in-age
Repo
Framework

Sparsemax and Relaxed Wasserstein for Topic Sparsity


Title	Sparsemax and Relaxed Wasserstein for Topic Sparsity
Authors	Tianyi Lin, Zhiyue Hu, Xin Guo
Abstract	Topic sparsity refers to the observation that individual documents usually focus on several salient topics instead of covering a wide variety of topics, and a real topic adopts a narrow range of terms instead of a wide coverage of the vocabulary. Understanding this topic sparsity is especially important for analyzing user-generated web content and social media, which are featured in the form of extremely short posts and discussions. As topic sparsity of individual documents in online social media increases, so does the difficulty of analyzing the online text sources using traditional methods. In this paper, we propose two novel neural models by providing sparse posterior distributions over topics based on the Gaussian sparsemax construction, enabling efficient training by stochastic backpropagation. We construct an inference network conditioned on the input data and infer the variational distribution with the relaxed Wasserstein (RW) divergence. Unlike existing works based on Gaussian softmax construction and Kullback-Leibler (KL) divergence, our approaches can identify latent topic sparsity with training stability, predictive performance, and topic coherence. Experiments on different genres of large text corpora have demonstrated the effectiveness of our models as they outperform both probabilistic and neural methods.
Tasks
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09079v2
PDF	http://arxiv.org/pdf/1810.09079v2.pdf
PWC	https://paperswithcode.com/paper/sparsemax-and-relaxed-wasserstein-for-topic
Repo
Framework
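
The sparsity mechanism named in the abstract above builds on the sparsemax transformation, which projects a score vector onto the probability simplex and can return exact zeros. The sketch below implements only that standard projection via sorting; it is not the paper's Gaussian sparsemax construction or its relaxed Wasserstein training, and the function name and example scores are illustrative assumptions.

```python
def sparsemax(scores):
    """Euclidean projection of `scores` onto the probability simplex."""
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    support_size = 0
    support_sum = 0.0
    # Find the largest j such that 1 + j * z_(j) exceeds the sum of the top j scores.
    for j, z in enumerate(ordered, start=1):
        cumulative += z
        if 1.0 + j * z > cumulative:
            support_size = j
            support_sum = cumulative
    # Threshold that makes the surviving entries sum to one.
    tau = (support_sum - 1.0) / support_size
    return [max(z - tau, 0.0) for z in scores]


# Hypothetical topic scores for one document.
topic_weights = sparsemax([1.0, 0.5, -1.0])
print(topic_weights)
```

Unlike softmax, which assigns every topic a strictly positive weight, the low-scoring topic here receives exactly zero, matching the abstract's notion of documents focusing on a few salient topics.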