October 16, 2019

Paper Group ANR 981

Similarity Learning with Higher-Order Graph Convolutions for Brain Network Analysis. Maximum Consensus Parameter Estimation by Reweighted $\ell_1$ Methods. Camera-based Image Forgery Localization using Convolutional Neural Networks. A Review of Meta-Reinforcement Learning for Deep Neural Networks Architecture Search. Dual-label Deep LSTM Dereverber …

Similarity Learning with Higher-Order Graph Convolutions for Brain Network Analysis


Title	Similarity Learning with Higher-Order Graph Convolutions for Brain Network Analysis
Authors	Guixiang Ma, Nesreen K. Ahmed, Ted Willke, Dipanjan Sengupta, Michael W. Cole, Nicholas B. Turk-Browne, Philip S. Yu
Abstract	Learning a similarity metric has gained much attention recently, where the goal is to learn a function that maps input patterns to a target space while preserving the semantic distance in the input space. While most related work focused on images, we focus instead on learning a similarity metric for neuroimages, such as fMRI and DTI images. We propose an end-to-end similarity learning framework called Higher-order Siamese GCN for multi-subject fMRI data analysis. The proposed framework learns the brain network representations via a supervised metric-based approach with siamese neural networks using two graph convolutional networks as the twin networks. Our proposed framework performs higher-order convolutions by incorporating higher-order proximity in graph convolutional networks to characterize and learn the community structure in brain connectivity networks. To the best of our knowledge, this is the first community-preserving similarity learning framework for multi-subject brain network analysis. Experimental results on four real fMRI datasets demonstrate the potential use cases of the proposed framework for multi-subject brain analysis in health and neuropsychiatric disorders. Our proposed approach achieves an average AUC gain of 75% compared to PCA, an average AUC gain of 65.5% compared to Spectral Embedding, and an average AUC gain of 24.3% compared to S-GCN across the four datasets, indicating promising application in clinical investigation and brain disease diagnosis.
Tasks	Graph Similarity
Published	2018-11-02
URL	http://arxiv.org/abs/1811.02662v5
PDF	http://arxiv.org/pdf/1811.02662v5.pdf
PWC	https://paperswithcode.com/paper/similarity-learning-with-higher-order
Repo
Framework

Maximum Consensus Parameter Estimation by Reweighted $\ell_1$ Methods


Title	Maximum Consensus Parameter Estimation by Reweighted $\ell_1$ Methods
Authors	Pulak Purkait, Christopher Zach, Anders Eriksson
Abstract	Robust parameter estimation in computer vision is frequently accomplished by solving the maximum consensus (MaxCon) problem. Widely used randomized methods for MaxCon, however, can only produce random approximate solutions, while global methods are too slow to exercise on realistic problem sizes. Here we analyse MaxCon as iterative reweighted algorithms on the data residuals. We propose a smooth surrogate function, the minimization of which leads to an extremely simple iteratively reweighted algorithm for MaxCon. We show that our algorithm is very efficient and in many cases, yields the global solution. This makes it an attractive alternative for randomized methods and global optimizers. The convergence analysis of our method and its fundamental differences from the other iteratively reweighted methods are also presented.
Tasks
Published	2018-03-22
URL	http://arxiv.org/abs/1803.08602v1
PDF	http://arxiv.org/pdf/1803.08602v1.pdf
PWC	https://paperswithcode.com/paper/maximum-consensus-parameter-estimation-by
Repo
Framework
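
The maximum consensus objective mentioned in the abstract can be illustrated with a small sketch: given a candidate model and an inlier threshold, count how many data points have a residual within that threshold. The line model, the sample points, and the names `consensus_size` and `eps` are assumptions made for illustration only; this shows the objective being maximized, not the authors' reweighted $\ell_1$ algorithm.

```python
def consensus_size(theta, points, eps):
    """Count points whose residual |y - (a*x + b)| is at most eps.

    theta  : (a, b) slope and intercept of a hypothetical line model
    points : iterable of (x, y) pairs
    eps    : inlier threshold (assumed, problem dependent)
    """
    a, b = theta
    return sum(1 for x, y in points if abs(y - (a * x + b)) <= eps)


# Four points lie close to y = x; the last one is a gross outlier.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 10.0)]

good = consensus_size((1.0, 0.0), points, eps=0.2)  # line through the inliers
bad = consensus_size((2.5, 0.0), points, eps=0.2)   # line pulled toward the outlier
```

A MaxCon solver searches for the `theta` that makes this count as large as possible; here the first candidate agrees with more points than the second.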

Camera-based Image Forgery Localization using Convolutional Neural Networks


Title	Camera-based Image Forgery Localization using Convolutional Neural Networks
Authors	Davide Cozzolino, Luisa Verdoliva
Abstract	Camera fingerprints are precious tools for a number of image forensics tasks. A well-known example is the photo response non-uniformity (PRNU) noise pattern, a powerful device fingerprint. Here, to address the image forgery localization problem, we rely on noiseprint, a recently proposed CNN-based camera model fingerprint. The CNN is trained to minimize the distance between same-model patches, and maximize the distance otherwise. As a result, the noiseprint accounts for model-related artifacts just like the PRNU accounts for device-related non-uniformities. However, unlike the PRNU, it is only mildly affected by residuals of high-level scene content. The experiments show that the proposed noiseprint-based forgery localization method improves over the PRNU-based reference.
Tasks
Published	2018-08-29
URL	http://arxiv.org/abs/1808.09714v1
PDF	http://arxiv.org/pdf/1808.09714v1.pdf
PWC	https://paperswithcode.com/paper/camera-based-image-forgery-localization-using
Repo
Framework
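
The training goal described in the abstract (small distance between same-model patches, large distance otherwise) is the general idea behind contrastive objectives. The sketch below uses the classic margin-based contrastive loss as a stand-in; the function names, the `margin` value, and the toy embeddings are assumptions, and the loss actually used to train noiseprint may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_model, margin=1.0):
    """Generic margin-based contrastive loss (illustrative, not the paper's loss).

    Same-model pairs are penalized by their squared distance; different-model
    pairs are penalized only when they are closer than `margin`.
    """
    d = euclidean(u, v)
    if same_model:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy two-dimensional "embeddings" of image patches (hypothetical values).
same_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_model=True)
diff_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_model=False)
diff_near = contrastive_loss([0.0, 0.0], [0.0, 0.5], same_model=False)
```

Minimizing this quantity pulls same-model pairs together and pushes different-model pairs apart until they are at least `margin` away from each other.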

A Review of Meta-Reinforcement Learning for Deep Neural Networks Architecture Search


Title	A Review of Meta-Reinforcement Learning for Deep Neural Networks Architecture Search
Authors	Yesmina Jaafra, Jean Luc Laurent, Aline Deruyver, Mohamed Saber Naceur
Abstract	Deep neural networks are efficient and flexible models that perform well for a variety of tasks such as image and speech recognition and natural language understanding. In particular, convolutional neural networks (CNN) generate a keen interest among researchers in computer vision and more specifically in classification tasks. CNN architecture and related hyperparameters are generally correlated to the nature of the processed task as the network extracts complex and relevant characteristics allowing the optimal convergence. Designing such architectures requires significant human expertise and substantial computation time, and does not always lead to the optimal network. The topic of model configuration has been extensively studied in machine learning without leading to a standard automatic method. This survey focuses on reviewing and discussing the current progress in automating CNN architecture search.
Tasks	Neural Architecture Search, Speech Recognition
Published	2018-12-17
URL	http://arxiv.org/abs/1812.07995v1
PDF	http://arxiv.org/pdf/1812.07995v1.pdf
PWC	https://paperswithcode.com/paper/a-review-of-meta-reinforcement-learning-for
Repo
Framework

Dual-label Deep LSTM Dereverberation For Speaker Verification


Title	Dual-label Deep LSTM Dereverberation For Speaker Verification
Authors	Hao Zhang, Stephen Zahorian, Xiao Chen, Peter Guzewich, Xiaoyu Liu
Abstract	In this paper, we present a reverberation removal approach for speaker verification, utilizing dual-label deep neural networks (DNNs). The networks perform feature mapping between the spectral features of reverberant and clean speech. Long short term memory recurrent neural networks (LSTMs) are trained to map corrupted Mel filterbank (MFB) features to two sets of labels: i) the clean MFB features, and ii) either estimated pitch tracks or the fast Fourier transform (FFT) spectrogram of clean speech. The performance of reverberation removal is evaluated by equal error rates (EERs) of speaker verification experiments.
Tasks	Speaker Verification
Published	2018-09-08
URL	http://arxiv.org/abs/1809.03868v1
PDF	http://arxiv.org/pdf/1809.03868v1.pdf
PWC	https://paperswithcode.com/paper/dual-label-deep-lstm-dereverberation-for
Repo
Framework

Spurious Valleys in Two-layer Neural Network Optimization Landscapes


Title	Spurious Valleys in Two-layer Neural Network Optimization Landscapes
Authors	Luca Venturi, Afonso S. Bandeira, Joan Bruna
Abstract	Neural networks provide a rich class of high-dimensional, non-convex optimization problems. Despite their non-convexity, gradient-descent methods often successfully optimize these models. This has motivated a recent spur in research attempting to characterize properties of their loss surface that may explain such success. In this paper, we address this phenomenon by studying a key topological property of the loss: the presence or absence of spurious valleys, defined as connected components of sub-level sets that do not include a global minimum. Focusing on a class of two-layer neural networks defined by smooth (but generally non-linear) activation functions, we identify a notion of intrinsic dimension and show that it provides necessary and sufficient conditions for the absence of spurious valleys. More concretely, finite intrinsic dimension guarantees that for sufficiently overparametrised models no spurious valleys exist, independently of the data distribution. Conversely, infinite intrinsic dimension implies that spurious valleys do exist for certain data distributions, independently of model overparametrisation. Besides these positive and negative results, we show that, although spurious valleys may exist in general, they are confined to low risk levels and avoided with high probability on overparametrised models.
Tasks
Published	2018-02-18
URL	http://arxiv.org/abs/1802.06384v3
PDF	http://arxiv.org/pdf/1802.06384v3.pdf
PWC	https://paperswithcode.com/paper/spurious-valleys-in-two-layer-neural-network
Repo
Framework

Stop Illegal Comments: A Multi-Task Deep Learning Approach


Title	Stop Illegal Comments: A Multi-Task Deep Learning Approach
Authors	Ahmed Elnaggar, Bernhard Waltl, Ingo Glaser, Jörg Landthaler, Elena Scepankova, Florian Matthes
Abstract	Deep learning methods are often difficult to apply in the legal domain because they require large amounts of labeled data. A recent trend in the deep learning community is the application of multi-task models that enable single deep neural networks to perform more than one task at the same time, for example classification and translation tasks. These powerful novel models are capable of transferring knowledge among different tasks or training sets and therefore could open up the legal domain for many deep learning applications. In this paper, we investigate the transfer learning capabilities of such a multi-task model on a classification task on the publicly available Kaggle toxic comment dataset for classifying illegal comments, and we report promising results.
Tasks	Transfer Learning
Published	2018-10-15
URL	http://arxiv.org/abs/1810.06665v1
PDF	http://arxiv.org/pdf/1810.06665v1.pdf
PWC	https://paperswithcode.com/paper/stop-illegal-comments-a-multi-task-deep
Repo
Framework

Asynchronous Online Testing of Multiple Hypotheses


Title	Asynchronous Online Testing of Multiple Hypotheses
Authors	Tijana Zrnic, Aaditya Ramdas, Michael I. Jordan
Abstract	We consider the problem of asynchronous online testing, aimed at providing control of the false discovery rate (FDR) during a continual stream of data collection and testing, where each test may be a sequential test that can start and stop at arbitrary times. This setting increasingly characterizes real-world applications in science and industry, where teams of researchers across large organizations may conduct tests of hypotheses in a decentralized manner. The overlap in time and space also tends to induce dependencies among test statistics, a challenge for classical methodology, which either assumes (overly optimistically) independence or (overly pessimistically) arbitrary dependence between test statistics. We present a general framework that addresses both of these issues via a unified computational abstraction that we refer to as “conflict sets.” We show how this framework yields algorithms with formal FDR guarantees under a more intermediate, local notion of dependence. We illustrate these algorithms in simulation experiments, comparing to existing algorithms for online FDR control.
Tasks
Published	2018-12-12
URL	http://arxiv.org/abs/1812.05068v1
PDF	http://arxiv.org/pdf/1812.05068v1.pdf
PWC	https://paperswithcode.com/paper/asynchronous-online-testing-of-multiple
Repo
Framework
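
The false discovery rate (FDR) that the abstract aims to control is the expected value of the false discovery proportion: the fraction of rejected hypotheses that were in fact true nulls. The sketch below only computes that proportion for one batch of decisions; the function name and the toy labels are assumptions, and it does not implement the asynchronous procedures or conflict sets from the paper.

```python
def false_discovery_proportion(rejected, is_null):
    """Fraction of rejections that are true nulls (0.0 when nothing is rejected).

    rejected : list of bools, True where the hypothesis was rejected
    is_null  : list of bools, True where the null hypothesis actually holds
    """
    discoveries = sum(1 for r in rejected if r)
    false_discoveries = sum(1 for r, n in zip(rejected, is_null) if r and n)
    return false_discoveries / max(discoveries, 1)


# Hypothetical outcome of five tests: three rejections, one of them a true null.
rejected = [True, True, False, True, False]
is_null = [False, True, True, False, False]

fdp = false_discovery_proportion(rejected, is_null)
```

In simulations, where the ground truth `is_null` is known, averaging this proportion over many repetitions gives an estimate of the FDR of a testing procedure.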

AFA-PredNet: The action modulation within predictive coding


Title	AFA-PredNet: The action modulation within predictive coding
Authors	Junpei Zhong, Angelo Cangelosi, Xinzheng Zhang, Tetsuya Ogata
Abstract	Predictive processing (PP) hypothesizes that the predictive inference of our sensorimotor system is encoded implicitly in the regularities between perception and action. We propose a neural architecture in which such regularities of active inference are encoded hierarchically. We further suggest that this encoding emerges during the embodied learning process when the appropriate action is selected to minimize the prediction error in perception. Therefore, this predictive stream in the sensorimotor loop is generated in a top-down manner. Specifically, it is constantly modulated by the motor actions and is updated by the bottom-up prediction error signals. In this way, the top-down prediction originally comes from the prior experience from both perception and action representing the higher levels of this hierarchical cognition. In our proposed embodied model, we extend the PredNet Network, a hierarchical predictive coding network, with the motor action units implemented by a multi-layer perceptron network (MLP) to modulate the network top-down prediction. Two experiments, a minimalistic world experiment and a mobile robot experiment, are conducted to evaluate the proposed model in a qualitative way. In the neural representation, the causal inference of predictive percepts from motor actions can also be observed while the agent is interacting with the environment.
Tasks	Causal Inference
Published	2018-04-11
URL	http://arxiv.org/abs/1804.03826v1
PDF	http://arxiv.org/pdf/1804.03826v1.pdf
PWC	https://paperswithcode.com/paper/afa-prednet-the-action-modulation-within
Repo
Framework

Explaining hyperspectral imaging based plant disease identification: 3D CNN and saliency maps


Title	Explaining hyperspectral imaging based plant disease identification: 3D CNN and saliency maps
Authors	Koushik Nagasubramanian, Sarah Jones, Asheesh K. Singh, Arti Singh, Baskar Ganapathysubramanian, Soumik Sarkar
Abstract	Our overarching goal is to develop an accurate and explainable model for plant disease identification using hyperspectral data. Charcoal rot is a soil-borne fungal disease that affects the yield of soybean crops worldwide. Hyperspectral images were captured at 240 different wavelengths in the range of 383 - 1032 nm. We developed a 3D Convolutional Neural Network model for soybean charcoal rot disease identification. Our model has a classification accuracy of 95.73% and an infected class F1 score of 0.87. We infer the trained model using a saliency map and visualize the most sensitive pixel locations that enable classification. The sensitivity of individual wavelengths for classification was also determined using the saliency map visualization. We identify the most sensitive wavelength as 733 nm using the saliency map visualization. Since the most sensitive wavelength is in the Near Infrared Region (700 - 1000 nm) of the electromagnetic spectrum, which is also the commonly used spectrum region for determining the vegetation health of the plant, we were more confident in the predictions using our model.
Tasks
Published	2018-04-24
URL	http://arxiv.org/abs/1804.08831v1
PDF	http://arxiv.org/pdf/1804.08831v1.pdf
PWC	https://paperswithcode.com/paper/explaining-hyperspectral-imaging-based-plant
Repo
Framework

Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects


Title	Linked Causal Variational Autoencoder for Inferring Paired Spillover Effects
Authors	Vineeth Rakesh, Ruocheng Guo, Raha Moraffah, Nitin Agarwal, Huan Liu
Abstract	Modeling spillover effects from observational data is an important problem in economics, business, and other fields of research. It helps us infer the causality between two seemingly unrelated sets of events. For example, if consumer spending in the United States declines, it has spillover effects on economies that depend on the U.S. as their largest export market. In this paper, we aim to infer the causation that results in spillover effects between pairs of entities (or units); we call this effect paired spillover. To achieve this, we leverage the recent developments in variational inference and deep learning techniques to propose a generative model called Linked Causal Variational Autoencoder (LCVA). Similar to variational autoencoders (VAE), LCVA incorporates an encoder neural network to learn the latent attributes and a decoder network to reconstruct the inputs. However, unlike VAE, LCVA treats the latent attributes as confounders that are assumed to affect both the treatment and the outcome of units. Specifically, given a pair of units $u$ and $\bar{u}$, their individual treatment and outcomes, the encoder network of LCVA samples the confounders by conditioning on the observed covariates of $u$, the treatments of both $u$ and $\bar{u}$ and the outcome of $u$. Once inferred, the latent attributes (or confounders) of $u$ capture the spillover effect of $\bar{u}$ on $u$. Using a network of users from a job training dataset (LaLonde (1986)) and a co-purchase dataset from the Amazon e-commerce domain, we show that LCVA is significantly more robust than existing methods in capturing spillover effects.
Tasks
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03333v4
PDF	http://arxiv.org/pdf/1808.03333v4.pdf
PWC	https://paperswithcode.com/paper/linked-causal-variational-autoencoder-for
Repo
Framework

Graph-Based Blind Image Deblurring From a Single Photograph


Title	Graph-Based Blind Image Deblurring From a Single Photograph
Authors	Yuanchao Bai, Gene Cheung, Xianming Liu, Wen Gao
Abstract	Blind image deblurring, i.e., deblurring without knowledge of the blur kernel, is a highly ill-posed problem. The problem can be solved in two parts: i) estimate a blur kernel from the blurry image, and ii) given estimated blur kernel, de-convolve blurry input to restore the target image. In this paper, we propose a graph-based blind image deblurring algorithm by interpreting an image patch as a signal on a weighted graph. Specifically, we first argue that a skeleton image—a proxy that retains the strong gradients of the target but smooths out the details—can be used to accurately estimate the blur kernel and has a unique bi-modal edge weight distribution. Then, we design a reweighted graph total variation (RGTV) prior that can efficiently promote a bi-modal edge weight distribution given a blurry patch. Further, to analyze RGTV in the graph frequency domain, we introduce a new weight function to represent RGTV as a graph $l_1$-Laplacian regularizer. This leads to a graph spectral filtering interpretation of the prior with desirable properties, including robustness to noise and blur, strong piecewise smooth (PWS) filtering and sharpness promotion. Minimizing a blind image deblurring objective with RGTV results in a non-convex non-differentiable optimization problem. We leverage the new graph spectral interpretation for RGTV to design an efficient algorithm that solves for the skeleton image and the blur kernel alternately. Specifically for Gaussian blur, we propose a further speedup strategy for blind Gaussian deblurring using accelerated graph spectral filtering. Finally, with the computed blur kernel, recent non-blind image deblurring algorithms can be applied to restore the target image. Experimental results demonstrate that our algorithm successfully restores latent sharp images and outperforms state-of-the-art methods quantitatively and qualitatively.
Tasks	Blind Image Deblurring, Deblurring
Published	2018-02-22
URL	http://arxiv.org/abs/1802.07929v1
PDF	http://arxiv.org/pdf/1802.07929v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-blind-image-deblurring-from-a
Repo
Framework
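
To make the idea of a reweighted graph total variation more concrete, the sketch below treats a short 1D signal as values on a path graph and sums weighted absolute differences over the edges, with each edge weight recomputed from the signal itself. The Gaussian weight function, the value of `sigma`, and the name `rgtv` are assumptions chosen for illustration; the exact definition in the paper may differ.

```python
import math


def rgtv(signal, edges, sigma=0.1):
    """Reweighted graph total variation of `signal` over `edges` (illustrative).

    Each edge (i, j) contributes w_ij * |x_i - x_j|, where the weight
    w_ij = exp(-(x_i - x_j)^2 / sigma^2) depends on the signal itself
    (an assumed Gaussian weight).
    """
    total = 0.0
    for i, j in edges:
        diff = abs(signal[i] - signal[j])
        weight = math.exp(-(diff ** 2) / (sigma ** 2))
        total += weight * diff
    return total


# Path graph over four pixels of a hypothetical image row.
edges = [(0, 1), (1, 2), (2, 3)]

sharp = rgtv([0.0, 0.0, 1.0, 1.0], edges)   # one clean step edge
blurry = rgtv([0.0, 0.3, 0.7, 1.0], edges)  # the same step, smeared out
```

Under these assumptions each edge term is small when the difference is either near zero or large, so the sharp step scores lower than its blurred version. That is one way to read the abstract's claim that the prior promotes a bi-modal edge weight distribution.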

RGB Video Based Tennis Action Recognition Using a Deep Historical Long Short-Term Memory


Title	RGB Video Based Tennis Action Recognition Using a Deep Historical Long Short-Term Memory
Authors	Jiaxin Cai, Xin Tang
Abstract	Action recognition from RGB input has attracted increasing attention in computer vision, partly due to potential applications in somatic simulation and sports statistics, such as virtual tennis games and video-based analysis of tennis techniques and tactics. Recently, deep learning based methods have achieved promising performance for action recognition. In this paper, we propose a weighted Long Short-Term Memory adopted with convolutional neural network representations for three-dimensional tennis shot recognition. First, the local two-dimensional convolutional neural network spatial representations are extracted from each video frame individually using a pre-trained Inception network. Then, a weighted Long Short-Term Memory decoder is introduced to take the output state at time t and the historical embedding feature at time t-1 to generate a feature vector using a score weighting scheme. Finally, we use the adopted CNN and weighted LSTM to map the original visual features into a vector space to generate the spatial-temporal semantic description of visual sequences and classify the action video content. Experiments on the benchmark demonstrate that our method using only simple raw RGB video can achieve better performance than the state-of-the-art baselines for tennis shot recognition.
Tasks	Temporal Action Localization
Published	2018-08-02
URL	http://arxiv.org/abs/1808.00845v2
PDF	http://arxiv.org/pdf/1808.00845v2.pdf
PWC	https://paperswithcode.com/paper/rgb-video-based-tennis-action-recognition
Repo
Framework

Bayesian Nonparametric Modeling of Driver Behavior using HDP Split-Merge Sampling Algorithm


Title	Bayesian Nonparametric Modeling of Driver Behavior using HDP Split-Merge Sampling Algorithm
Authors	Vadim Smolyakov, Julian Straub, Sue Zheng, John W. Fisher III
Abstract	Modern vehicles are equipped with increasingly complex sensors. These sensors generate large volumes of data that provide opportunities for modeling and analysis. Here, we are interested in exploiting this data to learn aspects of behaviors and the road network associated with individual drivers. Our dataset is collected on a standard vehicle used to commute to work and for personal trips. A Hidden Markov Model (HMM) trained on the GPS position and orientation data is utilized to compress the large amount of position information into a small amount of road segment states. Each state has a set of observations, i.e. car signals, associated with it that are quantized and modeled as draws from a Hierarchical Dirichlet Process (HDP). The inference for the topic distributions is carried out using HDP split-merge sampling algorithm. The topic distributions over joint quantized car signals characterize the driving situation in the respective road state. In a novel manner, we demonstrate how the sparsity of the personal road network of a driver in conjunction with a hierarchical topic model allows data driven predictions about destinations as well as likely road conditions.
Tasks
Published	2018-01-27
URL	http://arxiv.org/abs/1801.09150v1
PDF	http://arxiv.org/pdf/1801.09150v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-nonparametric-modeling-of-driver
Repo
Framework

A Trio Neural Model for Dynamic Entity Relatedness Ranking


Title	A Trio Neural Model for Dynamic Entity Relatedness Ranking
Authors	Tu Ngoc Nguyen, Tuan Tran, Wolfgang Nejdl
Abstract	Measuring entity relatedness is a fundamental task for many natural language processing and information retrieval applications. Prior work often studies entity relatedness in static settings and in an unsupervised manner. However, entities in the real world are often involved in many different relationships; consequently, entity relations are very dynamic over time. In this work, we propose a neural network-based approach for dynamic entity relatedness, leveraging the collective attention as supervision. Our model is capable of learning rich and different entity representations in a joint framework. Through extensive experiments on large-scale datasets, we demonstrate that our method achieves better results than competitive baselines.
Tasks	Information Retrieval
Published	2018-08-24
URL	http://arxiv.org/abs/1808.08316v3
PDF	http://arxiv.org/pdf/1808.08316v3.pdf
PWC	https://paperswithcode.com/paper/a-trio-neural-model-for-dynamic-entity
Repo
Framework