January 27, 2020

Paper Group ANR 1093

The FacT: Taming Latent Factor Models for Explainability with Factorization Trees. Semi-Supervised First-Person Activity Recognition in Body-Worn Video. Artificial Intelligence for Prosthetics - challenge solutions. Adversarial Balancing-based Representation Learning for Causal Effect Inference with Observational Data. What do Deep Networks Like to …

The FacT: Taming Latent Factor Models for Explainability with Factorization Trees


Title	The FacT: Taming Latent Factor Models for Explainability with Factorization Trees
Authors	Yiyi Tao, Yiling Jia, Nan Wang, Hongning Wang
Abstract	Latent factor models have achieved great success in personalized recommendations, but they are also notoriously difficult to explain. In this work, we integrate regression trees to guide the learning of latent factor models for recommendation, and use the learnt tree structure to explain the resulting latent factors. Specifically, we build regression trees on users and items respectively with user-generated reviews, and associate a latent profile to each node on the trees to represent users and items. As the regression trees grow, the latent factors are gradually refined under the regularization imposed by the tree structure. As a result, we are able to track the creation of latent profiles by looking into the path of each factor on the regression trees, which thus serves as an explanation for the resulting recommendations. Extensive experiments on two large collections of Amazon and Yelp reviews demonstrate the advantage of our model over several competitive baseline algorithms. In addition, our extensive user study confirms the practical value of the explainable recommendations generated by our model.
Tasks
Published	2019-06-03
URL	https://arxiv.org/abs/1906.02037v1
PDF	https://arxiv.org/pdf/1906.02037v1.pdf
PWC	https://paperswithcode.com/paper/the-fact-taming-latent-factor-models-for
Repo
Framework

Semi-Supervised First-Person Activity Recognition in Body-Worn Video


Title	Semi-Supervised First-Person Activity Recognition in Body-Worn Video
Authors	Honglin Chen, Hao Li, Alexander Song, Matt Haberland, Osman Akar, Adam Dhillon, Tiankuang Zhou, Andrea L. Bertozzi, P. Jeffrey Brantingham
Abstract	Body-worn cameras are now commonly used for logging daily life, sports, and law enforcement activities, creating a large volume of archived footage. This paper studies the problem of classifying frames of footage according to the activity of the camera-wearer with an emphasis on application to real-world police body-worn video. Real-world datasets pose a different set of challenges from existing egocentric vision datasets: the amount of footage of different activities is unbalanced, the data contains personally identifiable information, and in practice it is difficult to provide substantial training footage for a supervised approach. We address these challenges by extracting features based exclusively on motion information then segmenting the video footage using a semi-supervised classification algorithm. On publicly available datasets, our method achieves results comparable to, if not better than, supervised and/or deep learning methods using a fraction of the training data. It also shows promising results on real-world police body-worn video.
Tasks	Activity Recognition
Published	2019-04-19
URL	http://arxiv.org/abs/1904.09062v1
PDF	http://arxiv.org/pdf/1904.09062v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-first-person-activity
Repo
Framework

Artificial Intelligence for Prosthetics - challenge solutions


Title	Artificial Intelligence for Prosthetics - challenge solutions
Authors	Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, Aleksei Shpilman, Ivan Sosin, Oleg Svidchenko, Aleksandra Malysheva, Daniel Kudenko, Lance Rane, Aditya Bhatt, Zhengfei Wang, Penghui Qi, Zeyang Yu, Peng Peng, Quan Yuan, Wenxin Li, Yunsheng Tian, Ruihan Yang, Pingchuan Ma, Shauharda Khadka, Somdeb Majumdar, Zach Dwiel, Yinyin Liu, Evren Tumer, Jeremy Watson, Marcel Salathé, Sergey Levine, Scott Delp
Abstract	In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe their algorithms. In this work, we describe the challenge and present thirteen solutions that used deep reinforcement learning approaches. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each team implemented different modifications of the known algorithms by, for example, dividing the task into subtasks, learning low-level control, or by incorporating expert knowledge and using imitation learning.
Tasks	Imitation Learning
Published	2019-02-07
URL	http://arxiv.org/abs/1902.02441v1
PDF	http://arxiv.org/pdf/1902.02441v1.pdf
PWC	https://paperswithcode.com/paper/artificial-intelligence-for-prosthetics
Repo
Framework

Adversarial Balancing-based Representation Learning for Causal Effect Inference with Observational Data


Title	Adversarial Balancing-based Representation Learning for Causal Effect Inference with Observational Data
Authors	Xin Du, Lei Sun, Wouter Duivesteijn, Alexander Nikolaev, Mykola Pechenizkiy
Abstract	Learning causal effects from observational data greatly benefits a variety of domains such as health care, education and sociology. For instance, one could estimate the impact of a new drug on the survival rate. In this paper, we conduct causal inference with observational studies based on the potential outcome (PO) framework (Rubin, 2005). The central problem for causal effect inference in PO is dealing with the unobserved counterfactuals and treatment selection bias. The state-of-the-art approaches focus on solving these problems by balancing the treatment and control groups (Sun and Nikolaev, 2016). However, during the learning and balancing process, highly predictive information from the original covariate space might be lost. In order to build more robust estimators, we tackle this information loss problem by presenting a method called Adversarial Balancing-based representation learning for Causal Effect Inference (ABCEI), based on the recent advances in representation learning. ABCEI uses adversarial learning to balance the distributions of the treatment and control groups in the latent representation space, without any assumption on the form of the treatment selection/assignment function. ABCEI preserves useful information for predicting causal effects under the regularization of a mutual information estimator. The experimental results show that ABCEI is robust against treatment selection bias, and matches/outperforms the state-of-the-art approaches. Our experiments show promising results on several datasets, representing different health care domains among others.
Tasks	Causal Inference, Representation Learning
Published	2019-04-30
URL	https://arxiv.org/abs/1904.13335v2
PDF	https://arxiv.org/pdf/1904.13335v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-balancing-based-representation
Repo
Framework

What do Deep Networks Like to Read?


Title	What do Deep Networks Like to Read?
Authors	Jonas Pfeiffer, Aishwarya Kamath, Iryna Gurevych, Sebastian Ruder
Abstract	Recent research towards understanding neural networks probes models in a top-down manner, but is only able to identify model tendencies that are known a priori. We propose Susceptibility Identification through Fine-Tuning (SIFT), a novel abstractive method that uncovers a model’s preferences without imposing any prior. By fine-tuning an autoencoder with the gradients from a fixed classifier, we are able to extract propensities that characterize different kinds of classifiers in a bottom-up manner. We further leverage the SIFT architecture to rephrase sentences in order to predict the opposing class of the ground truth label, uncovering potential artifacts encoded in the fixed classification model. We evaluate our method on three diverse tasks with four different models. We contrast the propensities of the models as well as reproduce artifacts reported in the literature.
Tasks
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04547v1
PDF	https://arxiv.org/pdf/1909.04547v1.pdf
PWC	https://paperswithcode.com/paper/what-do-deep-networks-like-to-read
Repo
Framework

The Secret Lives of Names? Name Embeddings from Social Media


Title	The Secret Lives of Names? Name Embeddings from Social Media
Authors	Junting Ye, Steven Skiena
Abstract	Your name tells a lot about you: your gender, ethnicity and so on. It has been shown that name embeddings are more effective in representing names than traditional substring features. However, our previous name embedding model is trained on private email data and is not publicly accessible. In this paper, we explore learning name embeddings from public Twitter data. We argue that Twitter embeddings have two key advantages: (i) they can and will be publicly released to support the research community; (ii) even with a smaller training corpus, Twitter embeddings achieve similar performance on multiple tasks compared to email embeddings. As a test case to show the power of name embeddings, we investigate the modeling of lifespans. We find it interesting that adding name embeddings can further improve the performance of models using demographic features, which are traditionally used for lifespan modeling. Through residual analysis, we observe that fine-grained groups (potentially reflecting socioeconomic status) are the latent contributing factors encoded in name embeddings. These were previously hidden to demographic models, and may help to enhance the predictive power of a wide class of research studies.
Tasks
Published	2019-05-12
URL	https://arxiv.org/abs/1905.04799v1
PDF	https://arxiv.org/pdf/1905.04799v1.pdf
PWC	https://paperswithcode.com/paper/the-secret-lives-of-names-name-embeddings
Repo
Framework

Does the Multisecretary Problem Always Have Bounded Regret?


Title	Does the Multisecretary Problem Always Have Bounded Regret?
Authors	Robert L. Bray
Abstract	Arlotto and Gurvich (2019) showed that the regret in the multisecretary problem is bounded in the number of job openings, $n$, and the number of applicants, $k$, provided that the applicant valuations are drawn from a distribution with finite support. I show that this result does not hold when applicant valuations are drawn from a standard uniform distribution. In this case, the regret is between $\log(n)/16 - 1/4$ and $\log(n+1)/8$ when $k = n/2$ and $n \geq 16$. I establish these bounds by decomposing the regret into a sum of expected myopic regrets. This decomposition also yields a shorter proof of Arlotto and Gurvich’s original result.
Tasks
Published	2019-12-16
URL	https://arxiv.org/abs/1912.08917v1
PDF	https://arxiv.org/pdf/1912.08917v1.pdf
PWC	https://paperswithcode.com/paper/does-the-multisecretary-problem-always-have
Repo
Framework
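
To give a feel for the regret bounds quoted in the abstract above, the following sketch evaluates them for a few values of n. It assumes that "log" means the natural logarithm (the abstract does not state the base), and the function names are hypothetical, chosen only for this illustration.

```python
import math


def regret_lower_bound(n):
    """Lower bound from the abstract: log(n)/16 - 1/4 (natural log assumed)."""
    return math.log(n) / 16 - 0.25


def regret_upper_bound(n):
    """Upper bound from the abstract: log(n+1)/8 (natural log assumed)."""
    return math.log(n + 1) / 8


# The abstract states the bounds for k = n/2 and n >= 16.
for n in (16, 1024, 10**6, 10**12):
    low = regret_lower_bound(n)
    high = regret_upper_bound(n)
    print(f"n={n}: regret between {low:.4f} and {high:.4f}")
```

Both bounds grow with log(n), which is the sense in which the regret is not bounded in n for uniformly distributed valuations.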

Validation of Modulation Transfer Functions and Noise Power Spectra from Natural Scenes


Title	Validation of Modulation Transfer Functions and Noise Power Spectra from Natural Scenes
Authors	Edward W. S. Fry, Sophie Triantaphillidou, Robin B. Jenkin, John R. Jarvis, Ralph E. Jacobson
Abstract	The Modulation Transfer Function (MTF) and the Noise Power Spectrum (NPS) characterize imaging system sharpness/resolution and noise, respectively. Both measures are based on linear system theory but are applied routinely to systems employing non-linear, content-aware image processing. For such systems, MTFs/NPSs are derived inaccurately from traditional test charts containing edges, sinusoids, noise or uniform tone signals, which are unrepresentative of natural scene signals. The dead leaves test chart delivers improved measurements, but still has limitations when describing the performance of scene-dependent systems. In this paper, we validate several novel scene-and-process-dependent MTF (SPD-MTF) and NPS (SPD-NPS) measures that characterize either: i) system performance concerning one scene, or ii) average real-world performance concerning many scenes, or iii) the level of system scene-dependency. We also derive novel SPD-NPS and SPD-MTF measures using the dead leaves chart. We demonstrate that all the proposed measures are robust and preferable to current measures for scene-dependent systems.
Tasks
Published	2019-07-21
URL	https://arxiv.org/abs/1907.08924v1
PDF	https://arxiv.org/pdf/1907.08924v1.pdf
PWC	https://paperswithcode.com/paper/validation-of-modulation-transfer-functions
Repo
Framework

Improved Cardinality Estimation by Learning Queries Containment Rates


Title	Improved Cardinality Estimation by Learning Queries Containment Rates
Authors	Rojeh Hayek, Oded Shmueli
Abstract	The containment rate of query Q1 in query Q2 over database D is the percentage of Q1’s result tuples over D that are also in Q2’s result over D. We directly estimate containment rates between pairs of queries over a specific database. For this, we use a specialized deep learning scheme, CRN, which is tailored to representing pairs of SQL queries. Result-cardinality estimation is a core component of query optimization. We describe a novel approach for estimating query result cardinalities using estimated containment rates among queries. This containment rate estimation may rely on CRN or embed, unchanged, known cardinality estimation methods. Experimentally, our novel approach for estimating cardinalities, using containment rates between queries, on a challenging real-world database, realizes significant improvements over state-of-the-art cardinality estimation methods.
Tasks
Published	2019-08-21
URL	https://arxiv.org/abs/1908.07723v1
PDF	https://arxiv.org/pdf/1908.07723v1.pdf
PWC	https://paperswithcode.com/paper/improved-cardinality-estimation-by-learning
Repo
Framework
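
The containment-rate definition in the abstract above can be made concrete with a small sketch. It computes the rate exactly from two already-materialized result sets, which is not what CRN does (CRN estimates the rate without running the queries). The function names and sample rows are hypothetical, results are assumed to contain distinct tuples, and the rate is returned as a fraction rather than a percentage. The second function shows an identity that follows directly from the definition: if both rates and one cardinality are known, the other cardinality is determined.

```python
def containment_rate(q1_rows, q2_rows):
    """Fraction of q1's result tuples that also appear in q2's result."""
    q1 = set(q1_rows)  # assumption: results hold distinct tuples
    q2 = set(q2_rows)
    if not q1:
        return 0.0  # convention chosen here for an empty q1 result
    return len(q1 & q2) / len(q1)


def cardinality_from_rates(card_q1, rate_q1_in_q2, rate_q2_in_q1):
    """Recover |q2| from |q1| and the two containment rates.

    Since rate_q1_in_q2 = |q1 & q2| / |q1| and rate_q2_in_q1 = |q1 & q2| / |q2|,
    it follows that |q2| = |q1| * rate_q1_in_q2 / rate_q2_in_q1.
    """
    if rate_q2_in_q1 == 0:
        raise ValueError("rate_q2_in_q1 must be non-zero")
    return card_q1 * rate_q1_in_q2 / rate_q2_in_q1


# Hypothetical result sets of two queries over the same database.
q1_result = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
q2_result = [(3, "c"), (4, "d"), (5, "e")]

rate_12 = containment_rate(q1_result, q2_result)
rate_21 = containment_rate(q2_result, q1_result)
print(rate_12, rate_21)
print(cardinality_from_rates(len(q1_result), rate_12, rate_21))
```

In a learned setting the two rates would be model estimates, so the recovered cardinality would only be as accurate as those estimates.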

An Adversarial Super-Resolution Remedy for Radar Design Trade-offs


Title	An Adversarial Super-Resolution Remedy for Radar Design Trade-offs
Authors	Karim Armanious, Sherif Abdulatif, Fady Aziz, Urs Schneider, Bin Yang
Abstract	Radar is of vital importance in many fields, such as autonomous driving, safety and surveillance applications. However, it suffers from stringent constraints on its design parametrization, leading to multiple trade-offs. For example, the bandwidth in FMCW radars is inversely proportional to both the maximum unambiguous range and the range resolution. In this work, we introduce a new method for circumventing radar design trade-offs. We propose the use of recent advances in computer vision, more specifically generative adversarial networks (GANs), to enhance low-resolution radar acquisitions into higher resolution counterparts while maintaining the advantages of the low-resolution parametrization. The capability of the proposed method was evaluated on the velocity resolution and range-azimuth trade-offs in micro-Doppler signatures and FMCW uniform linear array (ULA) radars, respectively.
Tasks	Autonomous Driving, Super-Resolution
Published	2019-03-04
URL	https://arxiv.org/abs/1903.01392v2
PDF	https://arxiv.org/pdf/1903.01392v2.pdf
PWC	https://paperswithcode.com/paper/an-adversarial-super-resolution-remedy-for
Repo
Framework
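
The bandwidth trade-off mentioned in the abstract above can be illustrated with the standard FMCW range-resolution relation, delta_R = c / (2B). The sketch below is not taken from the paper: the maximum-range function rests on the simplifying assumption that the number of usable range bins is fixed, and the names and the value 256 for `n_range_bins` are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_resolution_m(bandwidth_hz):
    """Standard FMCW range resolution: delta_R = c / (2 * B)."""
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz)


def max_range_m(bandwidth_hz, n_range_bins):
    """Maximum range, assuming a fixed number of usable range bins."""
    return n_range_bins * range_resolution_m(bandwidth_hz)


# Doubling the bandwidth halves both the bin size and the covered range.
for bandwidth_hz in (150e6, 300e6, 600e6):
    resolution = range_resolution_m(bandwidth_hz)
    covered = max_range_m(bandwidth_hz, n_range_bins=256)
    print(f"B={bandwidth_hz / 1e6:.0f} MHz: "
          f"resolution={resolution:.3f} m, max range={covered:.1f} m")
```

Under that fixed-bin assumption, a finer range resolution is bought at the cost of a shorter maximum range, which is the kind of trade-off the paper aims to circumvent.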

Topology and dynamics of narratives on Brexit propagated by UK press during 2016 and 2017


Title	Topology and dynamics of narratives on Brexit propagated by UK press during 2016 and 2017
Authors	Jorge Louçã, António Fonseca
Abstract	This article identifies and characterises political narratives regarding Europe that were broadcast in the UK press during 2016 and 2017. A new theoretical and operational framework is proposed for typifying discourse narratives propagated in the public opinion space, based on the social constructivism and structural linguistics approaches, and the mathematical theory of hypernetworks, where elementary units are aggregated into high-level entities. In this line of thought, a narrative is understood as a social construct where a related and coherent aggregate of terms within public discourse is repeated and propagated on media until it can be identified as a communication pattern, embodying meaning in a way that provides individuals some interpretation of their world. An inclusive methodology, with state-of-the-art technologies on natural language processing and network theory, implements this concept of narrative. A corpus from the Observatorium database, including articles from six UK newspapers and incorporating far-right, right-wing, and left-wing narratives, is analysed. The research revealed clear distinctions between narratives along the political spectrum. In 2016 the far-right was particularly focused on emigration and refugees. Namely, during the referendum campaign, Europe was related to attacks on women and children, sexual offences, and terrorism. The right-wing was mainly focused on internal politics, while the left-wing notably mentioned a diversity of non-political topics, such as sports, side by side with economics. During 2017, terrorism was in general mentioned less, and negotiations with the EU, namely regarding economics, finance, and Ireland, became central.
Tasks
Published	2019-02-22
URL	http://arxiv.org/abs/1902.08558v1
PDF	http://arxiv.org/pdf/1902.08558v1.pdf
PWC	https://paperswithcode.com/paper/topology-and-dynamics-of-narratives-on-brexit
Repo
Framework

Modeling and Forecasting Art Movements with CGANs


Title	Modeling and Forecasting Art Movements with CGANs
Authors	Edoardo Lisi, Mohammad Malekzadeh, Hamed Haddadi, F. Din-Houn Lau, Seth Flaxman
Abstract	Conditional Generative Adversarial Networks (CGAN) are a recent and popular method for generating samples from a probability distribution conditioned on latent information. The latent information often comes in the form of a discrete label from a small set. We propose a novel method for training CGANs which allows us to condition on a sequence of continuous latent distributions $f^{(1)}, \ldots, f^{(K)}$. This training allows CGANs to generate samples from a sequence of distributions. We apply our method to paintings from a sequence of artistic movements, where each movement is considered to be its own distribution. Exploiting the temporal aspect of the data, a vector autoregressive (VAR) model is fitted to the means of the latent distributions that we learn, and used for one-step-ahead forecasting, to predict the latent distribution of a future art movement $f^{(K+1)}$. Realisations from this distribution can be used by the CGAN to generate “future” paintings. In experiments, this novel methodology generates accurate predictions of the evolution of art. The training set consists of a large dataset of past paintings. While there is no agreement on exactly what current art period we find ourselves in, we test on plausible candidate sets of present art, and show that the mean distance to our predictions is small.
Tasks
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09230v2
PDF	https://arxiv.org/pdf/1906.09230v2.pdf
PWC	https://paperswithcode.com/paper/modeling-and-forecasting-art-movements-with
Repo
Framework

Audio-Linguistic Embeddings for Spoken Sentences


Title	Audio-Linguistic Embeddings for Spoken Sentences
Authors	Albert Haque, Michelle Guo, Prateek Verma, Li Fei-Fei
Abstract	We propose spoken sentence embeddings which capture both acoustic and linguistic content. While existing works operate at the character, phoneme, or word level, our method learns long-term dependencies by modeling speech at the sentence level. Formulated as an audio-linguistic multitask learning problem, our encoder-decoder model simultaneously reconstructs acoustic and natural language features from audio. Our results show that spoken sentence embeddings outperform phoneme and word-level baselines on speech recognition and emotion recognition tasks. Ablation studies show that our embeddings can better model high-level acoustic concepts while retaining linguistic content. Overall, our work illustrates the viability of generic, multi-modal sentence embeddings for spoken language understanding.
Tasks	Emotion Recognition, Sentence Embeddings, Speech Recognition, Spoken Language Understanding
Published	2019-02-20
URL	http://arxiv.org/abs/1902.07817v1
PDF	http://arxiv.org/pdf/1902.07817v1.pdf
PWC	https://paperswithcode.com/paper/audio-linguistic-embeddings-for-spoken
Repo
Framework

Perspective-Guided Convolution Networks for Crowd Counting


Title	Perspective-Guided Convolution Networks for Crowd Counting
Authors	Zhaoyi Yan, Yuchen Yuan, Wangmeng Zuo, Xiao Tan, Yezhen Wang, Shilei Wen, Errui Ding
Abstract	In this paper, we propose a novel perspective-guided convolution (PGC) for convolutional neural network (CNN) based crowd counting (i.e. PGCNet), which aims to overcome the dramatic intra-scene scale variations of people due to the perspective effect. While most state-of-the-art methods adopt multi-scale or multi-column architectures to address this issue, they generally fail to model continuous scale variations since only discrete representative scales are considered. PGCNet, on the other hand, utilizes perspective information to guide the spatially variant smoothing of feature maps before feeding them to the successive convolutions. An effective perspective estimation branch is also introduced to PGCNet, which can be trained in either a supervised setting or a weakly-supervised setting when the branch has been pre-trained. Our PGCNet is single-column with a moderate increase in computation, and extensive experimental results on four benchmark datasets show the improvements of our method over the state of the art. Additionally, we introduce Crowd Surveillance, a large-scale dataset for crowd counting that contains 13,000+ high-resolution images with challenging scenarios.
Tasks	Crowd Counting
Published	2019-09-16
URL	https://arxiv.org/abs/1909.06966v1
PDF	https://arxiv.org/pdf/1909.06966v1.pdf
PWC	https://paperswithcode.com/paper/perspective-guided-convolution-networks-for
Repo
Framework

Robust Visual Tracking Using Dynamic Classifier Selection with Sparse Representation of Label Noise


Title	Robust Visual Tracking Using Dynamic Classifier Selection with Sparse Representation of Label Noise
Authors	Yuefeng Chen, Qing Wang
Abstract	Recently, a category of tracking methods based on “tracking-by-detection” has been widely used for the visual tracking problem. Most of these methods update the classifier online using the samples generated by the tracker to handle appearance changes. However, the self-updating scheme makes these methods suffer from the drifting problem because of the incorrect labels of weak classifiers in training samples. In this paper, we split the class labels into true labels and noise labels and model them by sparse representation. A novel dynamic classifier selection method, robust to noisy training data, is proposed. Moreover, we apply the proposed classifier selection algorithm to visual tracking by integrating a part-based online boosting framework. We have evaluated our proposed method on 12 challenging sequences involving severe occlusions, significant illumination changes and large pose variations. Both the qualitative and quantitative evaluations demonstrate that our approach tracks objects accurately and robustly and outperforms state-of-the-art trackers.
Tasks	Visual Tracking
Published	2019-03-19
URL	http://arxiv.org/abs/1903.07801v1
PDF	http://arxiv.org/pdf/1903.07801v1.pdf
PWC	https://paperswithcode.com/paper/robust-visual-tracking-using-dynamic
Repo
Framework