January 28, 2020


Paper Group ANR 868

High-Level Perceptual Similarity is Enabled by Learning Diverse Tasks. Identity Preserve Transform: Understand What Activity Classification Models Have Learnt. Library network, a possible path to explainable neural networks. UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization. Optimal Exploitation of Clust …

High-Level Perceptual Similarity is Enabled by Learning Diverse Tasks

Title High-Level Perceptual Similarity is Enabled by Learning Diverse Tasks
Authors Amir Rosenfeld, Richard Zemel, John K. Tsotsos
Abstract Predicting human perceptual similarity is a challenging subject of ongoing research. The visual process underlying this aspect of human vision is thought to employ multiple different levels of visual analysis (shapes, objects, texture, layout, color, etc). In this paper, we postulate that the perception of image similarity is not an explicitly learned capability, but rather one that is a byproduct of learning others. This claim is supported by leveraging representations learned from a diverse set of visual tasks and using them jointly to predict perceptual similarity. This is done via simple feature concatenation, without any further learning. Nevertheless, experiments performed on the challenging Totally-Looks-Like (TLL) benchmark significantly surpass recent baselines, closing much of the reported gap towards prediction of human perceptual similarity. We provide an analysis of these results and discuss them in a broader context of emergent visual capabilities and their implications on the course of machine-vision research.
Tasks
Published 2019-03-26
URL http://arxiv.org/abs/1903.10920v1
PDF http://arxiv.org/pdf/1903.10920v1.pdf
PWC https://paperswithcode.com/paper/high-level-perceptual-similarity-is-enabled
Repo
Framework
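
The abstract above describes predicting perceptual similarity by concatenating features learned on different tasks, with no further learning. A minimal sketch of that idea follows. The per-task L2 normalization, the cosine-similarity scoring, and the function names are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero rows)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def concat_features(feature_sets):
    """Concatenate per-task features for the same images.

    feature_sets: list of arrays, each of shape (n_images, d_k).
    Normalizing each block first (an assumption) keeps one task
    from dominating the joint representation.
    """
    return np.concatenate([l2_normalize(f) for f in feature_sets], axis=-1)

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    return l2_normalize(a) @ l2_normalize(b).T

def retrieve(left_feats, right_feats):
    """For each left image, return the index of the most similar right image."""
    left = concat_features(left_feats)
    right = concat_features(right_feats)
    return cosine_similarity_matrix(left, right).argmax(axis=1)
```

Here `left_feats` and `right_feats` would hold the outputs of several pretrained networks on the two sides of each image pair; no parameters are trained at any point.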

Identity Preserve Transform: Understand What Activity Classification Models Have Learnt

Title Identity Preserve Transform: Understand What Activity Classification Models Have Learnt
Authors Jialing Lyu, Weichao Qiu, Xinyue Wei, Yi Zhang, Alan Yuille, Zheng-Jun Zha
Abstract Activity classification has seen great success recently. Performance on small datasets is almost saturated, and the field is moving towards larger datasets. What leads to the performance gain, and what has the model learnt? In this paper we propose the identity preserve transform (IPT) to study this problem. IPT manipulates the nuisance factors (background, viewpoint, etc.) of the data while keeping the factors related to the task (human motion) unchanged. To our surprise, we found that popular models use highly correlated information (background, object) to achieve high classification accuracy, rather than the essential information (human motion). This can explain why an activity classification model usually fails to generalize to datasets it was not trained on. We implement IPT in two forms, i.e. image-space transform and 3D transform, using synthetic images. The tool will be made open-source to help study model and dataset design.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.06314v1
PDF https://arxiv.org/pdf/1912.06314v1.pdf
PWC https://paperswithcode.com/paper/identity-preserve-transform-understand-what
Repo
Framework

Library network, a possible path to explainable neural networks

Title Library network, a possible path to explainable neural networks
Authors Jung Hoon Lee
Abstract Deep neural networks (DNNs) may outperform human brains in complex tasks, but the lack of transparency in their decision-making processes makes us question whether we could fully trust DNNs with high stakes problems. As DNNs’ operations rely on a massive number of both parallel and sequential linear/nonlinear computations, predicting their mistakes is nearly impossible. Also, a line of studies suggests that DNNs can be easily deceived by adversarial attacks, indicating that their decisions can easily be corrupted by unexpected factors. Such vulnerability must be overcome if we intend to take advantage of DNNs’ efficiency in high stakes problems. Here, we propose an algorithm that can help us better understand DNNs’ decision-making processes. Our empirical evaluations suggest that this algorithm can effectively trace DNNs’ decision processes from one layer to another and detect adversarial attacks.
Tasks Decision Making
Published 2019-09-29
URL https://arxiv.org/abs/1909.13360v3
PDF https://arxiv.org/pdf/1909.13360v3.pdf
PWC https://paperswithcode.com/paper/libraries-of-hidden-layer-activity-patterns
Repo
Framework

UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization

Title UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization
Authors Ali Kavis, Kfir Y. Levy, Francis Bach, Volkan Cevher
Abstract We propose a novel adaptive, accelerated algorithm for the stochastic constrained convex optimization setting. Our method, which is inspired by the Mirror-Prox method, simultaneously achieves the optimal rates for smooth/non-smooth problems with either deterministic/stochastic first-order oracles. This is done without any prior knowledge of either the smoothness or the noise properties of the problem. To the best of our knowledge, this is the first adaptive, unified algorithm that achieves the optimal rates in the constrained setting. We demonstrate the practical performance of our framework through extensive numerical experiments.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1910.13857v1
PDF https://arxiv.org/pdf/1910.13857v1.pdf
PWC https://paperswithcode.com/paper/unixgrad-a-universal-adaptive-algorithm-with
Repo
Framework
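
The abstract above cites Mirror-Prox as the inspiration for UniXGrad. With Euclidean distance, Mirror-Prox reduces to the projected extragradient method, which the sketch below illustrates on a constrained problem. This is not UniXGrad itself: the fixed step size, the unit-ball constraint, and the function names are assumptions, and the adaptive step-size rule that gives UniXGrad its guarantees is not reproduced here.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def extragradient(grad, x0, step, n_iters, project):
    """Projected extragradient with a fixed step size (an assumption).

    Each iteration takes a look-ahead half step, then updates the base
    point using the gradient evaluated at the look-ahead point.
    Returns the running average of the look-ahead points.
    """
    x = project(np.asarray(x0, dtype=float))
    avg = np.zeros_like(x)
    for t in range(1, n_iters + 1):
        x_half = project(x - step * grad(x))   # look-ahead step
        x = project(x - step * grad(x_half))   # corrected step
        avg += (x_half - avg) / t              # running mean of look-ahead points
    return avg

# Minimize 0.5 * ||x - c||^2 over the unit ball, with c outside the ball.
c = np.array([2.0, 0.0])
solution = extragradient(lambda x: x - c, np.zeros(2), 0.5, 100, project_ball)
```

In this toy problem the constrained minimizer is the projection of `c` onto the ball, so `solution` should lie close to `(1, 0)`.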

Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit

Title Optimal Exploitation of Clustering and History Information in Multi-Armed Bandit
Authors Djallel Bouneffouf, Srinivasan Parthasarathy, Horst Samulowitz, Martin Wistuba
Abstract We consider the stochastic multi-armed bandit problem and the contextual bandit problem with historical observations and pre-clustered arms. The historical observations can contain any number of instances for each arm, and the pre-clustering information is a fixed clustering of arms provided as part of the input. We develop a variety of algorithms which incorporate this offline information effectively during the online exploration phase and derive their regret bounds. In particular, we develop the META algorithm which effectively hedges between two other algorithms: one which uses both historical observations and clustering, and another which uses only the historical observations. The former outperforms the latter when the clustering quality is good, and vice versa. Extensive experiments on synthetic and real-world datasets on Warfarin drug dosage and web server selection for latency minimization validate our theoretical insights and demonstrate that META is a robust strategy for optimally exploiting the pre-clustering information.
Tasks
Published 2019-05-31
URL https://arxiv.org/abs/1906.03979v1
PDF https://arxiv.org/pdf/1906.03979v1.pdf
PWC https://paperswithcode.com/paper/optimal-exploitation-of-clustering-and
Repo
Framework
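
The abstract above mentions an algorithm that uses only historical observations. One simple way to picture that ingredient is a UCB1-style rule whose counts and reward sums are seeded from the offline data, sketched below. This is an illustration under assumptions (the UCB1 bonus, the `pull` callback, and all names are hypothetical), not the paper's META algorithm, and it ignores the clustering information entirely.

```python
import math

def ucb_with_history(history, pull, n_rounds):
    """UCB1-style selection warm-started from offline observations.

    history: list of lists; history[a] holds past rewards of arm a
             (any length, possibly empty).
    pull:    callable taking an arm index and returning a reward.
    Returns the final per-arm counts and reward sums.
    """
    counts = [len(h) for h in history]
    sums = [float(sum(h)) for h in history]
    n_arms = len(history)

    for _ in range(n_rounds):
        total = sum(counts)

        def index(arm):
            if counts[arm] == 0:
                return math.inf  # arms never observed are tried first
            mean = sums[arm] / counts[arm]
            bonus = math.sqrt(2.0 * math.log(max(total, 1)) / counts[arm])
            return mean + bonus

        arm = max(range(n_arms), key=index)
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward

    return counts, sums
```

Because historical samples shrink an arm's confidence bonus before the first online round, arms that are already well covered offline need less online exploration.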

Safety-Guided Deep Reinforcement Learning via Online Gaussian Process Estimation

Title Safety-Guided Deep Reinforcement Learning via Online Gaussian Process Estimation
Authors Jiameng Fan, Wenchao Li
Abstract An important facet of reinforcement learning (RL) has to do with how the agent goes about exploring the environment. Traditional exploration strategies typically focus on efficiency and ignore safety. However, for practical applications, ensuring safety of the agent during exploration is crucial since performing an unsafe action or reaching an unsafe state could result in irreversible damage to the agent. The main challenge of safe exploration is that characterizing the unsafe states and actions is difficult for large continuous state or action spaces and unknown environments. In this paper, we propose a novel approach to incorporate estimations of safety to guide exploration and policy search in deep reinforcement learning. By using a cost function to capture trajectory-based safety, our key idea is to formulate the state-action value function of this safety cost as a candidate Lyapunov function and extend control-theoretic results to approximate its derivative using online Gaussian Process (GP) estimation. We show how to use these statistical models to guide the agent in unknown environments to obtain high-performance control policies with provable stability certificates.
Tasks Safe Exploration
Published 2019-03-06
URL http://arxiv.org/abs/1903.02526v2
PDF http://arxiv.org/pdf/1903.02526v2.pdf
PWC https://paperswithcode.com/paper/safety-guided-deep-reinforcement-learning-via
Repo
Framework

Cubic-Spline Flows

Title Cubic-Spline Flows
Authors Conor Durkan, Artur Bekasov, Iain Murray, George Papamakarios
Abstract A normalizing flow models a complex probability density as an invertible transformation of a simple density. The invertibility means that we can evaluate densities and generate samples from a flow. In practice, autoregressive flow-based models are slow to invert, making either density estimation or sample generation slow. Flows based on coupling transforms are fast for both tasks, but have previously performed less well at density estimation than autoregressive flows. We stack a new coupling transform, based on monotonic cubic splines, with LU-decomposed linear layers. The resulting cubic-spline flow retains an exact one-pass inverse, can be used to generate high-quality images, and closes the gap with autoregressive flows on a suite of density-estimation tasks.
Tasks Density Estimation
Published 2019-06-05
URL https://arxiv.org/abs/1906.02145v1
PDF https://arxiv.org/pdf/1906.02145v1.pdf
PWC https://paperswithcode.com/paper/cubic-spline-flows
Repo
Framework
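
The abstract above relies on coupling transforms, which are invertible in a single pass. The sketch below shows that property with an affine elementwise transform standing in for the paper's monotonic cubic spline, so it is a simpler substitute, not the cubic-spline transform itself. The linear conditioner and its weights `w_s` and `w_t` are hypothetical placeholders for a neural network.

```python
import numpy as np

def conditioner(x_a, w_s, w_t):
    """Map the untouched half to per-element log-scales and shifts."""
    log_scale = np.tanh(x_a @ w_s)  # bounded for numerical stability
    shift = x_a @ w_t
    return log_scale, shift

def coupling_forward(x, w_s, w_t):
    """Transform the second half of x conditioned on the first half."""
    d = x.shape[-1] // 2
    x_a, x_b = x[..., :d], x[..., d:]
    log_scale, shift = conditioner(x_a, w_s, w_t)
    y_b = x_b * np.exp(log_scale) + shift
    log_det = log_scale.sum(axis=-1)  # log |det Jacobian| of the transform
    return np.concatenate([x_a, y_b], axis=-1), log_det

def coupling_inverse(y, w_s, w_t):
    """Exact inverse: the first half is unchanged, so the conditioner
    can be re-evaluated and the elementwise map undone in one pass."""
    d = y.shape[-1] // 2
    y_a, y_b = y[..., :d], y[..., d:]
    log_scale, shift = conditioner(y_a, w_s, w_t)
    x_b = (y_b - shift) * np.exp(-log_scale)
    return np.concatenate([y_a, x_b], axis=-1)
```

Swapping the affine map for any other monotonic elementwise function keeps the same structure: the inverse only needs that function's own inverse, which is why a monotonic spline fits this role.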

Neural Query Language: A Knowledge Base Query Language for Tensorflow

Title Neural Query Language: A Knowledge Base Query Language for Tensorflow
Authors William W. Cohen, Matthew Siegler, Alex Hofer
Abstract Large knowledge bases (KBs) are useful for many AI tasks, but are difficult to integrate into modern gradient-based learning systems. Here we describe a framework for accessing a soft symbolic database using only differentiable operators. For example, this framework makes it easy to write neural models that adjust confidences associated with facts in a soft KB; incorporate prior knowledge in the form of hand-coded KB access rules; or learn to instantiate query templates using information extracted from text. NQL can work well with KBs with millions of tuples and hundreds of thousands of entities on a single GPU.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1905.06209v1
PDF https://arxiv.org/pdf/1905.06209v1.pdf
PWC https://paperswithcode.com/paper/neural-query-language-a-knowledge-base-query
Repo
Framework

Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings

Title Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings
Authors Sarik Ghazarian, Johnny Tian-Zheng Wei, Aram Galstyan, Nanyun Peng
Abstract Despite advances in open-domain dialogue systems, automatic evaluation of such systems is still a challenging problem. Traditional reference-based metrics such as BLEU are ineffective because there could be many valid responses for a given context that share no common words with reference responses. Recent work proposed the Referenced metric and Unreferenced metric Blended Evaluation Routine (RUBER), which combines a learning-based metric that predicts relatedness between a generated response and a given query with a reference-based metric; it showed high correlation with human judgments. In this paper, we explore using contextualized word embeddings to compute more accurate relatedness scores, and thus better evaluation metrics. Experiments show that our evaluation metrics outperform RUBER, which is trained on static embeddings.
Tasks Word Embeddings
Published 2019-04-24
URL http://arxiv.org/abs/1904.10635v1
PDF http://arxiv.org/pdf/1904.10635v1.pdf
PWC https://paperswithcode.com/paper/better-automatic-evaluation-of-open-domain
Repo
Framework

Extreme Low Resolution Activity Recognition with Spatial-Temporal Attention Transfer

Title Extreme Low Resolution Activity Recognition with Spatial-Temporal Attention Transfer
Authors Yucai Bai, Qiang Dai, Long Chen, Lingxi Li, Zhengming Ding, Qin Zou
Abstract Activity recognition on extreme low-resolution videos, e.g., a resolution of 12 × 16 pixels, plays a vital role in far-view surveillance and privacy-preserving multimedia analysis. Low-resolution videos contain only limited information. Given that the same activity may be represented by videos in both high resolution (HR) and low resolution (LR), it is worth studying how to use the relevant HR data to improve LR activity recognition. In this work, we propose a novel Spatial-Temporal Attention Transfer (STAT) for LR activity recognition. STAT can acquire information from HR data by reducing the attention differences with a transfer-learning strategy. Experimental results on two well-known datasets, i.e., UCF101 and HMDB51, demonstrate that the proposed method can effectively improve the accuracy of LR activity recognition, achieving an accuracy of 58.12% on 12 × 16 videos in HMDB51, a state-of-the-art performance.
Tasks Activity Recognition, Transfer Learning
Published 2019-09-09
URL https://arxiv.org/abs/1909.03580v3
PDF https://arxiv.org/pdf/1909.03580v3.pdf
PWC https://paperswithcode.com/paper/extreme-low-resolution-activity-recognition-1
Repo
Framework

Characterization of citizens using word2vec and latent topic analysis in a large set of tweets

Title Characterization of citizens using word2vec and latent topic analysis in a large set of tweets
Authors Vargas-Calderón Vladimir, Camargo Jorge
Abstract With the increasing use of the Internet and mobile devices, social networks are becoming the most used media to communicate citizens’ ideas and thoughts. This information is very useful to identify communities with common ideas based on what they publish in the network. This paper presents a method to automatically detect city communities based on machine learning techniques applied to a set of tweets from Bogotá’s citizens. An analysis was performed on a collection of 2,634,176 tweets gathered from Twitter over a period of six months. Results show that the proposed method is an interesting tool to characterize a city population based on machine learning methods and text analytics.
Tasks
Published 2019-04-15
URL http://arxiv.org/abs/1904.08926v1
PDF http://arxiv.org/pdf/1904.08926v1.pdf
PWC https://paperswithcode.com/paper/190408926
Repo
Framework

A Calibration Scheme for Non-Line-of-Sight Imaging Setups

Title A Calibration Scheme for Non-Line-of-Sight Imaging Setups
Authors Jonathan Klein, Martin Laurenzis, Matthias B. Hullin, Julian Iseringhausen
Abstract The recent years have given rise to a large number of techniques for “looking around corners”, i.e., for reconstructing occluded objects from time-resolved measurements of indirect light reflections off a wall. While the direct view of cameras is routinely calibrated in computer vision applications, the calibration of non-line-of-sight setups has so far relied on manual measurement of the most important dimensions (device positions, wall position and orientation, etc.). In this paper, we propose a semi-automatic method for calibrating such systems that relies on mirrors as known targets. A roughly determined initialization is refined in order to optimize a spatio-temporal consistency. Our system is general enough to be applicable to a variety of sensing scenarios ranging from single sources/detectors via scanning arrangements to large-scale arrays. It is robust towards bad initialization and the achieved accuracy is proportional to the depth resolution of the camera system. We demonstrate this capability with a real-world setup and despite a large number of dead pixels and very low temporal resolution achieve a result that outperforms a manual calibration.
Tasks Calibration
Published 2019-12-20
URL https://arxiv.org/abs/1912.09923v1
PDF https://arxiv.org/pdf/1912.09923v1.pdf
PWC https://paperswithcode.com/paper/a-calibration-scheme-for-non-line-of-sight
Repo
Framework

Lung Cancer Detection using Co-learning from Chest CT Images and Clinical Demographics

Title Lung Cancer Detection using Co-learning from Chest CT Images and Clinical Demographics
Authors Jiachen Wang, Riqiang Gao, Yuankai Huo, Shunxing Bao, Yunxi Xiong, Sanja L. Antic, Travis J. Osterman, Pierre P. Massion, Bennett A. Landman
Abstract Early detection of lung cancer is essential in reducing mortality. Recent studies have demonstrated the clinical utility of low-dose computed tomography (CT) to detect lung cancer among individuals selected based on very limited clinical information. However, this strategy yields high false positive rates, which can lead to unnecessary and potentially harmful procedures. To address such challenges, we established a pipeline that co-learns from detailed clinical demographics and 3D CT images. Toward this end, we leveraged data from the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions (MCL), which focuses on early detection of lung cancer. A 3D attention-based deep convolutional neural net (DCNN) is proposed to identify lung cancer from the chest CT scan without prior anatomical location of the suspicious nodule. To improve upon the non-invasive discrimination between benign and malignant, we applied a random forest classifier to a dataset integrating clinical information to imaging data. The results show that the AUC obtained from clinical demographics alone was 0.635 while the attention network alone reached an accuracy of 0.687. In contrast when applying our proposed pipeline integrating clinical and imaging variables, we reached an AUC of 0.787 on the testing dataset. The proposed network both efficiently captures anatomical information for classification and also generates attention maps that explain the features that drive performance.
Tasks Computed Tomography (CT)
Published 2019-02-21
URL http://arxiv.org/abs/1902.08236v1
PDF http://arxiv.org/pdf/1902.08236v1.pdf
PWC https://paperswithcode.com/paper/lung-cancer-detection-using-co-learning-from
Repo
Framework

Predicting TED Talk Ratings from Language and Prosody

Title Predicting TED Talk Ratings from Language and Prosody
Authors Md Iftekhar Tanveer, Md Kamrul Hassan, Daniel Gildea, M. Ehsan Hoque
Abstract We use the largest open repository of public speaking, TED Talks, to predict the ratings of the online viewers. Our dataset contains over 2200 TED Talk transcripts (including over 200 thousand sentences), audio features and the associated meta information, including about 5.5 million ratings from spontaneous visitors of the website. We propose three neural network architectures and compare them with statistical machine learning. Our experiments reveal that it is possible to predict all 14 different ratings with an average AUC of 0.83 using the transcripts and prosody features only. The dataset and the complete source code are available for further analysis.
Tasks
Published 2019-05-21
URL https://arxiv.org/abs/1906.03940v1
PDF https://arxiv.org/pdf/1906.03940v1.pdf
PWC https://paperswithcode.com/paper/predicting-ted-talk-ratings-from-language-and
Repo
Framework

Using Depth for Pixel-Wise Detection of Adversarial Attacks in Crowd Counting

Title Using Depth for Pixel-Wise Detection of Adversarial Attacks in Crowd Counting
Authors Weizhe Liu, Mathieu Salzmann, Pascal Fua
Abstract State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density. While effective, deep learning approaches are vulnerable to adversarial attacks, which, in a crowd-counting context, can lead to serious security issues. However, attack and defense mechanisms have been virtually unexplored in regression tasks, let alone for crowd density estimation. In this paper, we investigate the effectiveness of existing attack strategies on crowd-counting networks, and introduce a simple yet effective pixel-wise detection mechanism. It builds on the intuition that, when attacking a multitask network, in our case estimating crowd density and scene depth, both outputs will be perturbed, and thus the second one can be used for detection purposes. We will demonstrate that this significantly outperforms heuristic and uncertainty-based strategies.
Tasks Crowd Counting, Density Estimation
Published 2019-11-26
URL https://arxiv.org/abs/1911.11484v2
PDF https://arxiv.org/pdf/1911.11484v2.pdf
PWC https://paperswithcode.com/paper/using-depth-for-pixel-wise-detection-of
Repo
Framework