July 28, 2019


Paper Group ANR 313

Learning Convex Regularizers for Optimal Bayesian Denoising

Title Learning Convex Regularizers for Optimal Bayesian Denoising
Authors Ha Q. Nguyen, Emrah Bostan, Michael Unser
Abstract We propose a data-driven algorithm for the maximum a posteriori (MAP) estimation of stochastic processes from noisy observations. The primary statistical properties of the sought signal are specified by the penalty function (i.e., the negative logarithm of the prior probability density function). Our alternating direction method of multipliers (ADMM)-based approach translates the estimation task into successive applications of the proximal mapping of the penalty function. Capitalizing on this direct link, we define the proximal operator as a parametric spline curve and optimize the spline coefficients by minimizing the average reconstruction error for a given training set. The key aspects of our learning method are that the associated penalty function is constrained to be convex and the convergence of the ADMM iterations is proven. As a result of these theoretical guarantees, adaptation of the proposed framework to different levels of measurement noise is extremely simple and does not require any retraining. We apply our method to the estimation of both sparse and non-sparse models of Lévy processes for which the minimum mean square error (MMSE) estimators are available. We carry out a single training session and perform comparisons at various signal-to-noise ratio (SNR) values. Simulations illustrate that the performance of our algorithm is practically identical to that of the MMSE estimator irrespective of the noise power.
Tasks Denoising
Published 2017-05-16
URL http://arxiv.org/abs/1705.05591v1
PDF http://arxiv.org/pdf/1705.05591v1.pdf
PWC https://paperswithcode.com/paper/learning-convex-regularizers-for-optimal
Repo
Framework
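The ADMM structure the abstract describes can be sketched in a few lines of numpy. This is an illustrative stand-in, not the authors' code: the paper learns the proximal map as a monotone spline, whereas here a fixed soft-thresholding operator (the prox of the l1 penalty) plays that role to show where a learned prox would plug in.

```python
import numpy as np

def admm_denoise(y, prox, rho=1.0, n_iter=50):
    """Generic ADMM denoiser for min_x 0.5*||y - x||^2 + phi(x),
    where phi enters only through its proximal operator `prox`."""
    x = y.copy()
    z = y.copy()
    u = np.zeros_like(y)
    for _ in range(n_iter):
        # x-update: the quadratic data term has a closed form
        x = (y + rho * (z - u)) / (1.0 + rho)
        # z-update: one application of the (possibly learned) proximal map
        z = prox(x + u, 1.0 / rho)
        # dual variable update
        u = u + x - z
    return z

# Stand-in proximal operator: soft-thresholding (prox of the l1 penalty).
# The paper instead parameterizes this map as a spline whose coefficients
# are trained to minimize the average reconstruction error.
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
```

Because the penalty is constrained to be convex and the ADMM iterations provably converge, only `prox` changes when a different regularizer is learned; the outer loop stays the same.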

Automated sub-cortical brain structure segmentation combining spatial and deep convolutional features

Title Automated sub-cortical brain structure segmentation combining spatial and deep convolutional features
Authors Kaisar Kushibar, Sergi Valverde, Sandra Gonzalez-Villa, Jose Bernal, Mariano Cabezas, Arnau Oliver, Xavier Llado
Abstract Sub-cortical brain structure segmentation in Magnetic Resonance Images (MRI) has attracted the interest of the research community for a long time because morphological changes in these structures are related to different neurodegenerative disorders. However, manual segmentation of these structures can be tedious and prone to variability, highlighting the need for robust automated segmentation methods. In this paper, we present a novel convolutional neural network based approach for accurate segmentation of the sub-cortical brain structures that combines both convolutional and prior spatial features for improving the segmentation accuracy. In order to increase the accuracy of the automated segmentation, we propose to train the network using a restricted sample selection to force the network to learn the most difficult parts of the structures. We evaluate the accuracy of the proposed method on the public MICCAI 2012 challenge and IBSR 18 datasets, comparing it with different available state-of-the-art methods and other recently proposed deep learning approaches. On the MICCAI 2012 dataset, our method shows excellent performance, comparable to the best challenge participant's strategy, while performing significantly better than state-of-the-art techniques such as FreeSurfer and FIRST. On the IBSR 18 dataset, our method not only exhibits a significant increase in performance with respect to FreeSurfer and FIRST, but also achieves results comparable to or better than other recent deep learning approaches. Moreover, our experiments show that both the addition of the spatial priors and the restricted sampling strategy have a significant effect on the accuracy of the proposed method. In order to encourage reproducibility and the use of the proposed method, a public version of our approach is available to download for the neuroimaging community.
Tasks
Published 2017-09-26
URL http://arxiv.org/abs/1709.09075v1
PDF http://arxiv.org/pdf/1709.09075v1.pdf
PWC https://paperswithcode.com/paper/automated-sub-cortical-brain-structure
Repo
Framework

A Generative Model of Group Conversation

Title A Generative Model of Group Conversation
Authors Hannah Morrison, Chris Martens
Abstract Conversations with non-player characters (NPCs) in games are typically confined to dialogue between a human player and a virtual agent, where the conversation is initiated and controlled by the player. To create richer, more believable environments for players, we need conversational behavior to reflect initiative on the part of the NPCs, including conversations that include multiple NPCs who interact with one another as well as the player. We describe a generative computational model of group conversation between agents, an abstract simulation of discussion in a small group setting. We define conversational interactions in terms of rules for turn taking and interruption, as well as belief change, sentiment change, and emotional response, all of which are dependent on agent personality, context, and relationships. We evaluate our model using a parameterized expressive range analysis, observing correlations between simulation parameters and features of the resulting conversations. This analysis confirms, for example, that character personalities will predict how often they speak, and that heterogeneous groups of characters will generate more belief change.
Tasks
Published 2017-06-21
URL http://arxiv.org/abs/1706.06987v1
PDF http://arxiv.org/pdf/1706.06987v1.pdf
PWC https://paperswithcode.com/paper/a-generative-model-of-group-conversation
Repo
Framework
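One rule family the abstract mentions, turn taking conditioned on agent personality, can be illustrated with a toy numpy sketch. This is a schematic reading only: the actual model also covers interruption, belief change, sentiment change, and relationships, none of which appear here.

```python
import numpy as np

def pick_speaker(extraversion, rng):
    """Toy turn-taking rule in the spirit of the model: the chance that an
    agent takes the next turn grows with an extraversion-like personality
    score. Scores are normalized into a probability distribution."""
    p = np.asarray(extraversion, dtype=float)
    p = p / p.sum()
    return rng.choice(len(p), p=p)
```

Over many simulated turns, the most extraverted agent speaks most often, matching the expressive-range finding that personality predicts speaking frequency.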

Sequential Attention: A Context-Aware Alignment Function for Machine Reading

Title Sequential Attention: A Context-Aware Alignment Function for Machine Reading
Authors Sebastian Brarda, Philip Yeres, Samuel R. Bowman
Abstract In this paper we propose a neural network model with a novel Sequential Attention layer that extends soft attention by assigning weights to words in an input sequence in a way that takes into account not just how well that word matches a query, but also how well the surrounding words match. We evaluate this approach on the task of reading comprehension (on the Who did What and CNN datasets) and show that it dramatically improves a strong baseline, the Stanford Reader, and is competitive with the state of the art.
Tasks Reading Comprehension
Published 2017-05-05
URL http://arxiv.org/abs/1705.02269v2
PDF http://arxiv.org/pdf/1705.02269v2.pdf
PWC https://paperswithcode.com/paper/sequential-attention-a-context-aware
Repo
Framework
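The core contrast with plain soft attention can be sketched in numpy. Hedged stand-in: the paper conditions each score on its neighbours via a learned recurrent layer; the moving-average smoothing below is my simplification to show the effect of context-aware scoring.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_attention(q, H):
    # standard soft attention: each token's score depends on that token only
    return softmax(H @ q)

def sequential_attention(q, H, window=1):
    # context-aware variant: a token's score also reflects how well its
    # neighbours match the query (the paper uses an RNN over the raw
    # scores; a moving average illustrates the same idea)
    s = H @ q
    k = 2 * window + 1
    kernel = np.ones(k) / k
    smoothed = np.convolve(s, kernel, mode="same")
    return softmax(smoothed)
```

With smoothing, a matching token inside a run of matching tokens receives more weight than an isolated match of equal strength, which plain soft attention cannot distinguish.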

What are the visual features underlying human versus machine vision?

Title What are the visual features underlying human versus machine vision?
Authors Drew Linsley, Sven Eberhardt, Tarun Sharma, Pankaj Gupta, Thomas Serre
Abstract Although Deep Convolutional Networks (DCNs) are approaching the accuracy of human observers at object recognition, it is unknown whether they leverage similar visual representations to achieve this performance. To address this, we introduce Clicktionary, a web-based game for identifying visual features used by human observers during object recognition. Importance maps derived from the game are consistent across participants and uncorrelated with image saliency measures. These results suggest that Clicktionary identifies image regions that are meaningful and diagnostic for object recognition but different than those driving eye movements. Surprisingly, Clicktionary importance maps are only weakly correlated with relevance maps derived from DCNs trained for object recognition. Our study demonstrates that the narrowing gap between the object recognition accuracy of human observers and DCNs obscures distinct visual strategies used by each to achieve this performance.
Tasks Object Recognition
Published 2017-01-10
URL http://arxiv.org/abs/1701.02704v2
PDF http://arxiv.org/pdf/1701.02704v2.pdf
PWC https://paperswithcode.com/paper/what-are-the-visual-features-underlying-human
Repo
Framework

Multi-Person Brain Activity Recognition via Comprehensive EEG Signal Analysis

Title Multi-Person Brain Activity Recognition via Comprehensive EEG Signal Analysis
Authors Xiang Zhang, Lina Yao, Dalin Zhang, Xianzhi Wang, Quan Z. Sheng, Tao Gu
Abstract Electroencephalography (EEG) based brain activity recognition is a fundamental field of study for a number of significant applications such as intention prediction, appliance control, and neurological disease diagnosis in smart home and smart healthcare domains. Existing techniques mostly focus on binary brain activity recognition for a single person, which limits their deployment in wider and more complex practical scenarios. Therefore, multi-person and multi-class brain activity recognition has gained popularity recently. Another challenge faced by brain activity recognition is the low recognition accuracy due to the massive noise and the low signal-to-noise ratio in EEG signals. Moreover, feature engineering in EEG processing is time-consuming and relies highly on expert experience. In this paper, we attempt to solve the above challenges by proposing an approach with better EEG interpretation ability via raw EEG signal analysis for multi-person and multi-class brain activity recognition. Specifically, we analyze inter-class and inter-person EEG signal characteristics and use them to capture the discrepancy of inter-class EEG data. Then, we adopt an Autoencoder layer to automatically refine the raw EEG signals by eliminating various artifacts. We evaluate our approach on both a public and a local EEG dataset and conduct extensive experiments to explore the effect of several factors (such as normalization methods, training data size, and Autoencoder hidden neuron size) on the recognition results. The experimental results show that our approach achieves high accuracy compared to competitive state-of-the-art methods, indicating its potential in promoting future research on multi-person EEG recognition.
Tasks Activity Recognition, EEG, Feature Engineering
Published 2017-09-26
URL http://arxiv.org/abs/1709.09077v1
PDF http://arxiv.org/pdf/1709.09077v1.pdf
PWC https://paperswithcode.com/paper/multi-person-brain-activity-recognition-via
Repo
Framework
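The autoencoder refinement step the abstract describes can be sketched as a one-hidden-layer autoencoder trained by gradient descent. This is a minimal numpy illustration, not the authors' architecture; layer sizes, activation, and learning rate are my choices.

```python
import numpy as np

class DenoisingAE:
    """One-hidden-layer autoencoder used as a signal-refinement step:
    train it to reconstruct EEG windows; the bottleneck suppresses
    artifacts that do not fit the dominant signal structure."""
    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (dim, hidden))   # encoder weights
        self.W2 = rng.normal(0, 0.1, (hidden, dim))   # decoder weights

    def forward(self, X):
        H = np.tanh(X @ self.W1)
        return H, H @ self.W2                          # codes, reconstruction

    def fit(self, X, lr=0.05, n_iter=300):
        # plain gradient descent on the mean squared reconstruction error
        for _ in range(n_iter):
            H, R = self.forward(X)
            err = R - X
            dW2 = H.T @ err / len(X)
            dH = err @ self.W2.T * (1 - H ** 2)        # tanh backprop
            dW1 = X.T @ dH / len(X)
            self.W1 -= lr * dW1
            self.W2 -= lr * dW2
        return self
```

In the paper's pipeline the refined (reconstructed) windows, rather than the raw signals, would feed the downstream multi-class recognizer.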

Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches

Title Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches
Authors Shaohui Kuang, Deyi Xiong, Weihua Luo, Guodong Zhou
Abstract Sentences in a well-formed text are connected to each other via various links to form the cohesive structure of the text. Current neural machine translation (NMT) systems translate a text in a conventional sentence-by-sentence fashion, ignoring such cross-sentence links and dependencies. This may lead to an incoherent target text for a coherent source text. In order to handle this issue, we propose a cache-based approach to modeling coherence for neural machine translation by capturing contextual information either from recently translated sentences or the entire document. In particular, we explore two types of caches: a dynamic cache, which stores words from the best translation hypotheses of preceding sentences, and a topic cache, which maintains a set of target-side topical words that are semantically related to the document to be translated. On this basis, we build a new layer to score target words in these two caches with a cache-based neural model. The estimated probabilities from the cache-based neural model are combined with the NMT probabilities into the final word prediction probabilities via a gating mechanism. Finally, the proposed cache-based neural model is trained jointly with the NMT system in an end-to-end manner. Experiments and analysis presented in this paper demonstrate that the proposed cache-based model achieves substantial improvements over several state-of-the-art SMT and NMT baselines.
Tasks Machine Translation
Published 2017-11-30
URL http://arxiv.org/abs/1711.11221v3
PDF http://arxiv.org/pdf/1711.11221v3.pdf
PWC https://paperswithcode.com/paper/modeling-coherence-for-neural-machine
Repo
Framework

Statistical Inferences for Polarity Identification in Natural Language

Title Statistical Inferences for Polarity Identification in Natural Language
Authors Nicolas Pröllochs, Stefan Feuerriegel, Dirk Neumann
Abstract Information forms the basis for all human behavior, including the ubiquitous decision-making that people constantly perform in their everyday lives. It is thus the mission of researchers to understand how humans process information to reach decisions. In order to facilitate this task, this work proposes a novel method of studying the reception of granular expressions in natural language. The approach utilizes LASSO regularization as a statistical tool to extract decisive words from textual content and draw statistical inferences based on the correspondence between the occurrences of words and an exogenous response variable. Accordingly, the method immediately suggests significant implications for social sciences and Information Systems research: everyone can now identify text segments and word choices that are statistically relevant to authors or readers and, based on this knowledge, test hypotheses from behavioral research. We demonstrate the contribution of our method by examining how authors communicate subjective information through narrative materials. This allows us to answer the question of which words to choose when communicating negative information. In addition, we show that investors trade not only upon facts in financial disclosures but are distracted by filler words and non-informative language. Practitioners - for example those in the fields of investor communications or marketing - can exploit our insights to enhance their writings based on the true perception of word choice.
Tasks Decision Making
Published 2017-06-21
URL http://arxiv.org/abs/1706.06996v2
PDF http://arxiv.org/pdf/1706.06996v2.pdf
PWC https://paperswithcode.com/paper/statistical-inferences-for-polarity
Repo
Framework
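The LASSO word-selection step can be sketched without any statistics library: given a word-count matrix X and a response y, iterative soft-thresholding (ISTA) yields a sparse coefficient vector whose nonzero entries are the "decisive words". A hedged numpy sketch; the paper's exact solver and regularization path are not specified here.

```python
import numpy as np

def lasso_ista(X, y, lam=0.1, n_iter=1000):
    """LASSO via iterative soft-thresholding:
    min_w (1/2n)||Xw - y||^2 + lam * ||w||_1.
    Nonzero entries of w mark the columns (words) that carry signal."""
    n, d = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n   # Lipschitz constant of the gradient
    lr = 1.0 / L
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = w - lr * grad
        # proximal step: soft-threshold at lr * lam drives weak words to 0
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w
```

Inference on which words matter then reduces to inspecting the surviving coefficients and their signs (positive vs. negative association with the response).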

Generative learning for deep networks

Title Generative learning for deep networks
Authors Boris Flach, Alexander Shekhovtsov, Ondrej Fikar
Abstract Generative learning, which takes into account the full distribution of the data, is not feasible with deep neural networks (DNNs) because they model only the conditional distribution of the outputs given the inputs. Current solutions either rely on joint probability models that face difficult estimation problems or learn two separate networks, mapping inputs to outputs (recognition) and vice versa (generation). We propose an intermediate approach. First, we show that forward computation in DNNs with logistic sigmoid activations corresponds to a simplified approximate Bayesian inference in a directed probabilistic multi-layer model. This connection allows us to interpret the DNN as a probabilistic model of the output and all hidden units given the input. Second, we propose that, in order for the recognition and generation networks to be more consistent with the joint model of the data, the weights of the recognition and generator networks should be related by transposition. We demonstrate in a tentative experiment that such a coupled pair can be learned generatively, modelling the full distribution of the data, and has enough capacity to perform well in both recognition and generation.
Tasks Bayesian Inference
Published 2017-09-25
URL http://arxiv.org/abs/1709.08524v1
PDF http://arxiv.org/pdf/1709.08524v1.pdf
PWC https://paperswithcode.com/paper/generative-learning-for-deep-networks
Repo
Framework

Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

Title Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video
Authors Davide Moltisanti, Michael Wray, Walterio Mayol-Cuevas, Dima Damen
Abstract Manual annotations of temporal bounds for object interactions (i.e. start and end times) are typical training input to recognition, localization and detection algorithms. For three publicly available egocentric datasets, we uncover inconsistencies in ground truth temporal bounds within and across annotators and datasets. We systematically assess the robustness of state-of-the-art approaches to changes in labeled temporal bounds, for object interaction recognition. As boundaries are trespassed, a drop of up to 10% is observed for both Improved Dense Trajectories and Two-Stream Convolutional Neural Network. We demonstrate that such disagreement stems from a limited understanding of the distinct phases of an action, and propose annotating based on the Rubicon Boundaries, inspired by a similarly named cognitive model, for consistent temporal bounds of object interactions. Evaluated on a public dataset, we report a 4% increase in overall accuracy, and an increase in accuracy for 55% of classes when Rubicon Boundaries are used for temporal annotations.
Tasks
Published 2017-03-27
URL http://arxiv.org/abs/1703.09026v2
PDF http://arxiv.org/pdf/1703.09026v2.pdf
PWC https://paperswithcode.com/paper/trespassing-the-boundaries-labeling-temporal
Repo
Framework

Long-Short Range Context Neural Networks for Language Modeling

Title Long-Short Range Context Neural Networks for Language Modeling
Authors Youssef Oualil, Mittul Singh, Clayton Greenberg, Dietrich Klakow
Abstract The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora. This task typically involves the learning of short range dependencies, which generally model the syntactic properties of a language, and/or long range dependencies, which are semantic in nature. We propose in this paper a new multi-span architecture, which separately models the short and long context information while dynamically merging them to perform the language modeling task. This is done through a novel recurrent Long-Short Range Context (LSRC) network, which explicitly models the local (short) and global (long) context using two separate hidden states that evolve in time. This new architecture is an adaptation of the Long-Short Term Memory network (LSTM) to account for these linguistic properties. Extensive experiments conducted on the Penn Treebank (PTB) and the Large Text Compression Benchmark (LTCB) corpora showed a significant reduction in perplexity when compared to state-of-the-art language modeling techniques.
Tasks Language Modelling
Published 2017-08-22
URL http://arxiv.org/abs/1708.06555v1
PDF http://arxiv.org/pdf/1708.06555v1.pdf
PWC https://paperswithcode.com/paper/long-short-range-context-neural-networks-for
Repo
Framework
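The two-state idea, one fast hidden state for local context and one slowly evolving state for global context, merged at each step, can be sketched as a recurrent cell. This is a schematic reading of the abstract with parameter shapes and the slow-update rule chosen by me, not the authors' exact equations.

```python
import numpy as np

class LSRCCell:
    """Two recurrent states: h_s tracks local (short-range) context and
    h_l a slowly-updated global (long-range) context; the cell output
    merges both through a learned projection."""
    def __init__(self, in_dim, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx_s = rng.normal(0, 0.1, (in_dim, dim))
        self.Wh_s = rng.normal(0, 0.1, (dim, dim))
        self.Wx_l = rng.normal(0, 0.1, (in_dim, dim))
        self.Wh_l = rng.normal(0, 0.1, (dim, dim))
        self.Wm = rng.normal(0, 0.1, (2 * dim, dim))
        self.alpha = 0.9  # the long-range state changes slowly

    def step(self, x, h_s, h_l):
        h_s = np.tanh(x @ self.Wx_s + h_s @ self.Wh_s)
        h_l = self.alpha * h_l + (1 - self.alpha) * np.tanh(x @ self.Wx_l + h_l @ self.Wh_l)
        merged = np.tanh(np.concatenate([h_s, h_l]) @ self.Wm)
        return merged, h_s, h_l
```

The merged vector would feed a softmax over the vocabulary, as in a standard recurrent language model.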

Personalized Driver Stress Detection with Multi-task Neural Networks using Physiological Signals

Title Personalized Driver Stress Detection with Multi-task Neural Networks using Physiological Signals
Authors Aaqib Saeed, Stojan Trajanovski
Abstract Stress can be seen as a physiological response to everyday emotional, mental and physical challenges. A long-term exposure to stressful situations can have negative health consequences, such as increased risk of cardiovascular diseases and immune system disorder. Therefore, a timely stress detection can lead to systems for better management and prevention in future circumstances. In this paper, we suggest a multi-task learning based neural network approach (with hard parameter sharing of mutual representation and task-specific layers) for personalized stress recognition using skin conductance and heart rate from wearable devices. The proposed method is tested on multi-modal physiological responses collected during real-world and simulator driving tasks.
Tasks Multi-Task Learning
Published 2017-11-15
URL http://arxiv.org/abs/1711.06116v1
PDF http://arxiv.org/pdf/1711.06116v1.pdf
PWC https://paperswithcode.com/paper/personalized-driver-stress-detection-with
Repo
Framework
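The hard-parameter-sharing layout the abstract names, a shared trunk for the mutual representation plus per-task layers, can be shown in a few lines. Minimal numpy sketch only; the real model's layer sizes, activations, and training loop are not given in the abstract, so everything below is assumed for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class HardSharingNet:
    """Hard parameter sharing: one shared trunk, one small head per task
    (e.g. one head per driver for personalized stress recognition)."""
    def __init__(self, in_dim, hidden, n_tasks, seed=0):
        rng = np.random.default_rng(seed)
        self.W_shared = rng.normal(0, 0.1, (in_dim, hidden))
        self.heads = [rng.normal(0, 0.1, (hidden, 1)) for _ in range(n_tasks)]

    def forward(self, x, task):
        h = relu(x @ self.W_shared)           # mutual representation
        logit = h @ self.heads[task]          # task-specific layer
        return 1.0 / (1.0 + np.exp(-logit))   # stress probability
```

During training, gradients from every task update `W_shared` while each head is updated only by its own task's data, which is what personalizes the predictions.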

Deep Meta Learning for Real-Time Target-Aware Visual Tracking

Title Deep Meta Learning for Real-Time Target-Aware Visual Tracking
Authors Janghoon Choi, Junseok Kwon, Kyoung Mu Lee
Abstract In this paper, we propose a novel on-line visual tracking framework based on a Siamese matching network and a meta-learner network, which runs at real-time speeds. Conventional deep convolutional feature-based discriminative visual tracking algorithms require continuous re-training of classifiers or correlation filters, which involves solving complex optimization tasks to adapt to the new appearance of a target object. To alleviate this complex process, our proposed algorithm incorporates a meta-learner network to provide the matching network with new appearance information of the target objects by adding a target-aware feature space. The parameters for the target-specific feature space are provided instantly from a single forward pass of the meta-learner network. By eliminating the necessity of continuously solving complex optimization tasks in the course of tracking, our algorithm performs at real-time speed while maintaining competitive performance among other state-of-the-art tracking algorithms, as experimental results demonstrate.
Tasks Meta-Learning, Real-Time Visual Tracking, Visual Tracking
Published 2017-12-26
URL https://arxiv.org/abs/1712.09153v3
PDF https://arxiv.org/pdf/1712.09153v3.pdf
PWC https://paperswithcode.com/paper/deep-meta-learning-for-real-time-visual
Repo
Framework

UCT: Learning Unified Convolutional Networks for Real-time Visual Tracking

Title UCT: Learning Unified Convolutional Networks for Real-time Visual Tracking
Authors Zheng Zhu, Guan Huang, Wei Zou, Dalong Du, Chang Huang
Abstract Convolutional neural network (CNN) based tracking approaches have shown favorable performance in recent benchmarks. Nonetheless, the chosen CNN features are always pre-trained on a different task, and the individual components of tracking systems are learned separately, so the achieved tracking performance may be suboptimal. Besides, most of these trackers are not designed for real-time applications because of their time-consuming feature extraction and complex optimization details. In this paper, we propose an end-to-end framework to learn the convolutional features and perform the tracking process simultaneously, namely, a unified convolutional tracker (UCT). Specifically, the UCT treats both the feature extractor and the tracking process as convolution operations and trains them jointly, so that the learned CNN features are tightly coupled to the tracking process. In online tracking, an efficient updating method is proposed by introducing a peak-versus-noise ratio (PNR) criterion, and scale changes are handled efficiently by incorporating a scale branch into the network. The proposed approach achieves superior tracking performance while maintaining real-time speed. The standard UCT and UCT-Lite track generic objects at 41 FPS and 154 FPS, respectively, without further optimization. Experiments are performed on four challenging benchmark tracking datasets: OTB2013, OTB2015, VOT2014 and VOT2015, and our method achieves state-of-the-art results on these benchmarks compared with other real-time trackers.
Tasks Real-Time Visual Tracking, Visual Tracking
Published 2017-11-10
URL http://arxiv.org/abs/1711.04661v1
PDF http://arxiv.org/pdf/1711.04661v1.pdf
PWC https://paperswithcode.com/paper/uct-learning-unified-convolutional-networks
Repo
Framework

DR2-Net: Deep Residual Reconstruction Network for Image Compressive Sensing

Title DR2-Net: Deep Residual Reconstruction Network for Image Compressive Sensing
Authors Hantao Yao, Feng Dai, Dongming Zhang, Yike Ma, Shiliang Zhang, Yongdong Zhang, Qi Tian
Abstract Most traditional algorithms for compressive sensing image reconstruction suffer from intensive computation. Recently, deep learning-based reconstruction algorithms have been reported, which dramatically reduce the time complexity compared to iterative reconstruction algorithms. In this paper, we propose a novel Deep Residual Reconstruction Network (DR2-Net) to reconstruct the image from its Compressively Sensed (CS) measurement. The DR2-Net is based on two observations: 1) linear mapping can reconstruct a high-quality preliminary image, and 2) residual learning can further improve the reconstruction quality. Accordingly, DR2-Net consists of two components, i.e., a linear mapping network and a residual network. Specifically, a fully-connected layer implements the linear mapping network. We then expand the linear mapping network to DR2-Net by adding several residual learning blocks to enhance the preliminary image. Extensive experiments demonstrate that DR2-Net outperforms traditional iterative methods and recent deep learning-based methods by large margins at measurement rates 0.01, 0.04, 0.1, and 0.25. The code of DR2-Net has been released at: https://github.com/coldrainyht/caffe_dr2
Tasks Compressive Sensing, Image Reconstruction
Published 2017-02-19
URL http://arxiv.org/abs/1702.05743v4
PDF http://arxiv.org/pdf/1702.05743v4.pdf
PWC https://paperswithcode.com/paper/dr2-net-deep-residual-reconstruction-network
Repo
Framework
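The two-stage pipeline of the abstract, a linear mapping for a preliminary image followed by additive residual refinement, can be sketched in numpy. This is a structural illustration only: in DR2-Net both the linear mapping and the residual blocks are learned layers, whereas here the blocks are placeholder callables.

```python
import numpy as np

def reconstruct(y, W_lin, residual_blocks):
    """DR2-Net-style pipeline (sketch): a linear mapping produces a
    preliminary reconstruction from the CS measurement y, then each
    residual block predicts an additive correction to it."""
    x = y @ W_lin                # preliminary reconstruction (fully connected)
    for f in residual_blocks:
        x = x + f(x)             # residual learning: refine, don't replace
    return x
```

With an empty block list the function returns just the linear estimate, which is exactly the "observation 1" baseline the residual blocks then improve upon.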