February 1, 2020

Paper Group AWR 263

A Scalable Hybrid Research Paper Recommender System for Microsoft Academic


Title	A Scalable Hybrid Research Paper Recommender System for Microsoft Academic
Authors	Anshul Kanakia, Zhihong Shen, Darrin Eide, Kuansan Wang
Abstract	We present the design and methodology for the large scale hybrid paper recommender system used by Microsoft Academic. The system provides recommendations for approximately 160 million English research papers and patents. Our approach handles incomplete citation information while also alleviating the cold-start problem that often affects other recommender systems. We use the Microsoft Academic Graph (MAG), titles, and available abstracts of research papers to build a recommendation list for all documents, thereby combining co-citation and content based approaches. Tuning system parameters also allows for blending and prioritization of each approach which, in turn, allows us to balance paper novelty versus authority in recommendation results. We evaluate the generated recommendations via a user study of 40 participants, with over 2400 recommendation pairs graded and discuss the quality of the results using P@10 and nDCG scores. We see that there is a strong correlation between participant scores and the similarity rankings produced by our system but that additional focus needs to be put towards improving recommender precision, particularly for content based recommendations. The results of the user survey and associated analysis scripts are made available via GitHub and the recommendations produced by our system are available as part of the MAG on Azure to facilitate further research and light up novel research paper recommendation applications.
Tasks	Recommendation Systems
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08880v1
PDF	https://arxiv.org/pdf/1905.08880v1.pdf
PWC	https://paperswithcode.com/paper/a-scalable-hybrid-research-paper-recommender
Repo	https://github.com/akanakia/microsoft-academic-paper-recommender-user-study
Framework	tf
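
The P@10 and nDCG scores mentioned in the abstract can be computed from graded judgments roughly as follows. This is a minimal sketch, not the paper's analysis script: the function names, the relevance threshold, and the choice to derive the ideal ordering by sorting the judged list itself are all assumptions made for illustration.

```python
import math

def precision_at_k(relevances, k=10, threshold=1):
    """Fraction of the top-k positions whose grade meets the (assumed) threshold."""
    top = relevances[:k]
    return sum(1 for r in top if r >= threshold) / k

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: grade at rank i is divided by log2(i + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the same grades sorted in descending order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical grades for one ranked recommendation list (higher is better).
grades = [3, 2, 0, 1, 0]
p_at_5 = precision_at_k(grades, k=5, threshold=2)
ndcg_5 = ndcg_at_k(grades, k=5)
```

P@k only counts how many of the top results are relevant, while nDCG also rewards placing the most relevant results first, which is why the two are usually reported together.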

A Bi-directional Transformer for Musical Chord Recognition


Title	A Bi-directional Transformer for Musical Chord Recognition
Authors	Jonggwon Park, Kyoyun Choi, Sungwook Jeon, Dokyun Kim, Jonghun Park
Abstract	Chord recognition is an important task since chords are highly abstract and descriptive features of music. For effective chord recognition, it is essential to utilize relevant context in the audio sequence. While various machine learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been employed for the task, most of them have limitations in capturing long-term dependencies or require training of an additional model. In this work, we utilize a self-attention mechanism for chord recognition to focus on certain regions of chords. Training of the proposed bi-directional Transformer for chord recognition (BTC) consists of a single phase while showing competitive performance. Through an attention map analysis, we have visualized how attention was performed. It turns out that the model was able to divide segments of chords by utilizing the adaptive receptive field of the attention mechanism. Furthermore, it was observed that the model was able to effectively capture long-term dependencies, making use of essential information regardless of distance.
Tasks	Chord Recognition
Published	2019-07-05
URL	https://arxiv.org/abs/1907.02698v1
PDF	https://arxiv.org/pdf/1907.02698v1.pdf
PWC	https://paperswithcode.com/paper/a-bi-directional-transformer-for-musical
Repo	https://github.com/jayg996/BTC-ISMIR19
Framework	pytorch
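
To illustrate why self-attention can use context "regardless of distance", here is a minimal sketch of unmasked (bi-directional) scaled dot-product self-attention in plain Python. It is not the BTC architecture: the identity query/key/value projections, the toy feature vectors, and the function names are assumptions chosen to keep the example short.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Unmasked scaled dot-product self-attention with identity projections.

    Every frame attends to every other frame, past and future alike.
    Returns (outputs, attention_weights).
    """
    d = len(frames[0])
    outputs, weights = [], []
    for q in frames:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in frames]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, frames))
                        for i in range(d)])
    return outputs, weights

# Hypothetical 2-D frame features: frames 0 and 2 are identical, frame 1 differs.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(frames)
```

In this toy input, frame 0 gives frame 2 the same weight as itself and less weight to the adjacent frame 1, since the score depends on content similarity rather than position.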

Exploring Deep Anomaly Detection Methods Based on Capsule Net


Title	Exploring Deep Anomaly Detection Methods Based on Capsule Net
Authors	Xiaoyan Li, Iluju Kiringa, Tet Yeap, Xiaodan Zhu, Yifeng Li
Abstract	In this paper, we develop and explore deep anomaly detection techniques based on the capsule network (CapsNet) for image data. Being able to encode the intrinsic spatial relationship between parts and a whole, CapsNet has been applied as both a classifier and a deep autoencoder. This inspires us to design prediction-probability-based and reconstruction-error-based normality score functions for evaluating the “outlierness” of unseen images. Our results on three datasets demonstrate that the prediction-probability-based method performs consistently well, while the reconstruction-error-based approach is relatively sensitive to the similarity between labeled and unlabeled images. Furthermore, both of the CapsNet-based methods outperform the principled benchmark methods in many cases.
Tasks	Anomaly Detection
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06312v1
PDF	https://arxiv.org/pdf/1907.06312v1.pdf
PWC	https://paperswithcode.com/paper/exploring-deep-anomaly-detection-methods
Repo	https://github.com/bakirillov/capsules
Framework	pytorch

Topic-Enhanced Memory Networks for Personalised Point-of-Interest Recommendation


Title	Topic-Enhanced Memory Networks for Personalised Point-of-Interest Recommendation
Authors	Xiao Zhou, Cecilia Mascolo, Zhongxiang Zhao
Abstract	Point-of-Interest (POI) recommender systems play a vital role in people’s lives by recommending unexplored POIs to users and have drawn extensive attention from both academia and industry. Despite their value, however, they still suffer from the challenges of capturing complicated user preferences and fine-grained user-POI relationship for spatio-temporal sensitive POI recommendation. Existing recommendation algorithms, including both shallow and deep approaches, usually embed the visiting records of a user into a single latent vector to model user preferences: this has limited power of representation and interpretability. In this paper, we propose a novel topic-enhanced memory network (TEMN), a deep architecture to integrate the topic model and memory network capitalising on the strengths of both the global structure of latent patterns and local neighbourhood-based features in a nonlinear fashion. We further incorporate a geographical module to exploit user-specific spatial preference and POI-specific spatial influence to enhance recommendations. The proposed unified hybrid model is widely applicable to various POI recommendation scenarios. Extensive experiments on real-world WeChat datasets demonstrate its effectiveness (improvement ratio of 3.25% and 29.95% for context-aware and sequential recommendation, respectively). Also, qualitative analysis of the attention weights and topic modeling provides insight into the model’s recommendation process and results.
Tasks	Recommendation Systems
Published	2019-05-19
URL	https://arxiv.org/abs/1905.13127v1
PDF	https://arxiv.org/pdf/1905.13127v1.pdf
PWC	https://paperswithcode.com/paper/190513127
Repo	https://github.com/xiaominglalala/Session_based_Recommendation
Framework	none

More Efficient Policy Learning via Optimal Retargeting


Title	More Efficient Policy Learning via Optimal Retargeting
Authors	Nathan Kallus
Abstract	Policy learning can be used to extract individualized treatment regimes from observational data in healthcare, civics, e-commerce, and beyond. One big hurdle to policy learning is a commonplace lack of overlap in the data for different actions, which can lead to unwieldy policy evaluation and poorly performing learned policies. We study a solution to this problem based on retargeting, that is, changing the population on which policies are optimized. We first argue that at the population level, retargeting may induce little to no bias. We then characterize the optimal reference policy centering and retargeting weights in both binary-action and multi-action settings. We do this in terms of the asymptotic efficient estimation variance of the new learning objective. We further consider bias regularization. Extensive empirical results in a simulation study and a case study of targeted job counseling demonstrate that retargeting is a fairly easy way to significantly improve any policy learning procedure.
Tasks
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08611v1
PDF	https://arxiv.org/pdf/1906.08611v1.pdf
PWC	https://paperswithcode.com/paper/more-efficient-policy-learning-via-optimal
Repo	https://github.com/CausalML/RetargetedPolicyLearning
Framework	none

INFER: INtermediate representations for FuturE pRediction


Title	INFER: INtermediate representations for FuturE pRediction
Authors	Shashank Srikanth, Junaid Ahmed Ansari, Karnik Ram R, Sarthak Sharma, Krishna Murthy J., Madhava Krishna K
Abstract	In urban driving scenarios, forecasting future trajectories of surrounding vehicles is of paramount importance. While several approaches to the problem have been proposed, the best-performing ones tend to require extremely detailed input representations (e.g., image sequences). However, such methods do not generalize to datasets they have not been trained on. We propose intermediate representations that are particularly well-suited for future prediction. As opposed to using texture (color) information, we rely on semantics and train an autoregressive model to accurately predict future trajectories of traffic participants (vehicles) (see fig. above). We demonstrate that using semantics provides a significant boost over techniques that operate over raw pixel intensities/disparities. Uncharacteristic of state-of-the-art approaches, our representations and models generalize to completely different datasets, collected across several cities, and also across countries where people drive on opposite sides of the road (left-handed vs right-handed driving). Additionally, we demonstrate an application of our approach in multi-object tracking (data association). To foster further research in transferable representations and ensure reproducibility, we release all our code and data.
Tasks	Activity Prediction, Future prediction, Multi-Object Tracking, Object Tracking, Trajectory Prediction
Published	2019-03-26
URL	http://arxiv.org/abs/1903.10641v1
PDF	http://arxiv.org/pdf/1903.10641v1.pdf
PWC	https://paperswithcode.com/paper/infer-intermediate-representations-for-future
Repo	https://github.com/talsperre/INFER
Framework	pytorch

A Gentle Introduction to Deep Learning for Graphs


Title	A Gentle Introduction to Deep Learning for Graphs
Authors	Davide Bacciu, Federico Errica, Alessio Micheli, Marco Podda
Abstract	The adaptive processing of graph data is a long-standing research topic which has been lately consolidated as a theme of major interest in the deep learning community. The snap increase in the amount and breadth of related research has come at the price of little systematization of knowledge and attention to earlier literature. This work is designed as a tutorial introduction to the field of deep learning for graphs. It favours a consistent and progressive introduction of the main concepts and architectural aspects over an exposition of the most recent literature, for which the reader is referred to available surveys. The paper takes a top-down view to the problem, introducing a generalized formulation of graph representation learning based on a local and iterative approach to structured information processing. It introduces the basic building blocks that can be combined to design novel and effective neural models for graphs. The methodological exposition is complemented by a discussion of interesting research challenges and applications in the field.
Tasks	Graph Representation Learning, Representation Learning
Published	2019-12-29
URL	https://arxiv.org/abs/1912.12693v1
PDF	https://arxiv.org/pdf/1912.12693v1.pdf
PWC	https://paperswithcode.com/paper/a-gentle-introduction-to-deep-learning-for
Repo	https://github.com/diningphil/gnn-comparison
Framework	pytorch

AlignNet-3D: Fast Point Cloud Registration of Partially Observed Objects


Title	AlignNet-3D: Fast Point Cloud Registration of Partially Observed Objects
Authors	Johannes Groß, Aljosa Osep, Bastian Leibe
Abstract	Methods tackling multi-object tracking need to estimate the number of targets in the sensing area as well as to estimate their continuous state. While the majority of existing methods focus on data association, the precise state (3D pose) is often only coarsely estimated by approximating targets with centroids or (3D) bounding boxes. However, in automotive scenarios, motion perception of surrounding agents is critical and inaccuracies in the vehicle's close range can have catastrophic consequences. In this work, we focus on precise 3D track state estimation and propose a learning-based approach for object-centric relative motion estimation of partially observed objects. Instead of approximating targets with their centroids, our approach is capable of utilizing noisy 3D point segments of objects to estimate their motion. To that end, we propose a simple, yet effective and efficient network, AlignNet-3D, that learns to align point clouds. Our evaluation on two different datasets demonstrates that our method outperforms computationally expensive, global 3D registration methods while being significantly more efficient. We make our data, code, and models available at https://www.vision.rwth-aachen.de/page/alignnet.
Tasks	3D Pose Estimation, Motion Estimation, Multi-Object Tracking, Object Tracking, Point Cloud Registration, Pose Estimation
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04668v1
PDF	https://arxiv.org/pdf/1910.04668v1.pdf
PWC	https://paperswithcode.com/paper/alignnet-3d-fast-point-cloud-registration-of
Repo	https://github.com/grossjohannes/AlignNet-3D
Framework	tf

Communication-based Evaluation for Natural Language Generation


Title	Communication-based Evaluation for Natural Language Generation
Authors	Benjamin Newman, Reuben Cohn-Gordon, Christopher Potts
Abstract	Natural language generation (NLG) systems are commonly evaluated using n-gram overlap measures (e.g. BLEU, ROUGE). These measures do not directly capture semantics or speaker intentions, and so they often turn out to be misaligned with our true goals for NLG. In this work, we argue instead for communication-based evaluations: assuming the purpose of an NLG system is to convey information to a reader/listener, we can directly evaluate its effectiveness at this task using the Rational Speech Acts model of pragmatic language use. We illustrate with a color reference dataset that contains descriptions in pre-defined quality categories, showing that our method better aligns with these quality categories than do any of the prominent n-gram overlap methods.
Tasks	Text Generation
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07290v2
PDF	https://arxiv.org/pdf/1909.07290v2.pdf
PWC	https://paperswithcode.com/paper/communication-based-evaluation-for-natural
Repo	https://github.com/bnewm0609/comm-eval
Framework	pytorch
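
The idea of scoring an utterance by how well a listener recovers the intended referent can be sketched with the standard Rational Speech Acts recursion (literal listener, pragmatic speaker, pragmatic listener). This is an illustrative toy only: the three-colour lexicon, the uniform prior, the rationality parameter `alpha`, and all function names are assumptions and are not taken from the paper's dataset or code.

```python
# Hypothetical truth-conditional lexicon: 1.0 if the utterance is literally
# true of the target colour, 0.0 otherwise.
LEXICON = {
    "blue":      {"dark_blue": 1.0, "light_blue": 1.0, "green": 0.0},
    "dark blue": {"dark_blue": 1.0, "light_blue": 0.0, "green": 0.0},
    "green":     {"dark_blue": 0.0, "light_blue": 0.0, "green": 1.0},
}
TARGETS = ["dark_blue", "light_blue", "green"]
PRIOR = {t: 1.0 / len(TARGETS) for t in TARGETS}  # assumed uniform prior

def normalize(scores):
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total > 0 else dict(scores)

def literal_listener(utterance):
    """L0(target | utterance) is proportional to truth value times prior."""
    return normalize({t: LEXICON[utterance][t] * PRIOR[t] for t in TARGETS})

def pragmatic_speaker(target, alpha=1.0):
    """S1(utterance | target) is proportional to L0(target | utterance) ** alpha."""
    return normalize({u: literal_listener(u)[target] ** alpha for u in LEXICON})

def pragmatic_listener(utterance, alpha=1.0):
    """L1(target | utterance) is proportional to S1(utterance | target) times prior."""
    return normalize({t: pragmatic_speaker(t, alpha)[utterance] * PRIOR[t]
                      for t in TARGETS})

def communicative_success(utterance, target, listener=pragmatic_listener):
    """Score an utterance by the probability the listener picks the target."""
    return listener(utterance)[target]

literal_score = communicative_success("blue", "light_blue", literal_listener)
pragmatic_score = communicative_success("blue", "light_blue")
```

Unlike an n-gram overlap score, this score depends on what the utterance lets a listener infer: the pragmatic listener reasons that a speaker describing the dark patch would more likely have said "dark blue", so "blue" favours the light patch.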

Multi-Garment Net: Learning to Dress 3D People from Images


Title	Multi-Garment Net: Learning to Dress 3D People from Images
Authors	Bharat Lal Bhatnagar, Garvita Tiwari, Christian Theobalt, Gerard Pons-Moll
Abstract	We present Multi-Garment Network (MGN), a method to predict body shape and clothing, layered on top of the SMPL model, from a few frames (1-8) of a video. Several experiments demonstrate that this representation allows a higher level of control when compared to single mesh or voxel representations of shape. Our model can predict garment geometry, relate it to the body shape, and transfer it to new body shapes and poses. To train MGN, we leverage a digital wardrobe containing 712 digital garments in correspondence, obtained with a novel method to register a set of clothing templates to a dataset of real 3D scans of people in different clothing and poses. Garments from the digital wardrobe, or predicted by MGN, can be used to dress any body shape in arbitrary poses. We will make publicly available the digital wardrobe, the MGN model, and code to dress SMPL with the garments.
Tasks
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06903v2
PDF	https://arxiv.org/pdf/1908.06903v2.pdf
PWC	https://paperswithcode.com/paper/multi-garment-net-learning-to-dress-3d-people
Repo	https://github.com/minar09/MultiGarmentNetworkPython3
Framework	tf

Autoregressive Energy Machines


Title	Autoregressive Energy Machines
Authors	Charlie Nash, Conor Durkan
Abstract	Neural density estimators are flexible families of parametric models which have seen widespread use in unsupervised machine learning in recent years. Maximum-likelihood training typically dictates that these models be constrained to specify an explicit density. However, this limitation can be overcome by instead using a neural network to specify an energy function, or unnormalized density, which can subsequently be normalized to obtain a valid distribution. The challenge with this approach lies in accurately estimating the normalizing constant of the high-dimensional energy function. We propose the Autoregressive Energy Machine, an energy-based model which simultaneously learns an unnormalized density and computes an importance-sampling estimate of the normalizing constant for each conditional in an autoregressive decomposition. The Autoregressive Energy Machine achieves state-of-the-art performance on a suite of density-estimation tasks.
Tasks	Density Estimation
Published	2019-04-11
URL	http://arxiv.org/abs/1904.05626v1
PDF	http://arxiv.org/pdf/1904.05626v1.pdf
PWC	https://paperswithcode.com/paper/autoregressive-energy-machines
Repo	https://github.com/conormdurkan/autoregressive-energy-machines
Framework	pytorch
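
The importance-sampling estimate of a normalizing constant mentioned in the abstract can be illustrated on a single one-dimensional energy function. This is a sketch under stated assumptions, not the Autoregressive Energy Machine: the quadratic energy, the fixed Gaussian proposal with an arbitrary scale, and the function names are all hypothetical, whereas the paper applies the idea to each conditional of an autoregressive model.

```python
import math
import random

def energy(x):
    """Toy energy; exp(-energy) is an unnormalized standard Gaussian."""
    return 0.5 * x * x

def proposal_pdf(x, scale):
    """Density of the zero-mean Gaussian proposal with the given std."""
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2 * math.pi))

def estimate_normalizer(energy_fn, num_samples=20000, scale=2.0, seed=0):
    """Estimate Z = integral of exp(-E(x)) dx as the mean of exp(-E(x)) / q(x), x ~ q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, scale)
        total += math.exp(-energy_fn(x)) / proposal_pdf(x, scale)
    return total / num_samples

z_hat = estimate_normalizer(energy)
z_true = math.sqrt(2 * math.pi)  # known closed form for this toy energy
```

For this toy energy the estimate should land close to the closed-form value of sqrt(2π); the proposal is deliberately wider than the target so that the importance weights stay bounded.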

Online Multi-Object Tracking Framework with the GMPHD Filter and Occlusion Group Management


Title	Online Multi-Object Tracking Framework with the GMPHD Filter and Occlusion Group Management
Authors	Young-min Song, Kwangjin Yoon, Young-Chul Yoon, Kin-Choong Yow, Moongu Jeon
Abstract	In this paper, we propose an efficient online multi-object tracking framework based on the GMPHD filter and an occlusion group management scheme, where the GMPHD filter utilizes hierarchical data association to reduce the false negatives caused by missed detections. The hierarchical data association consists of two steps: detection-to-track and track-to-track associations, which can recover the lost tracks and their switched IDs. In addition, the proposed framework is equipped with an object grouping management scheme which handles occlusion problems with two main parts. The first part is “track merging”, which can merge the false positive tracks caused by false positive detections from occlusions, where the false positive tracks are usually occluded with respect to a measure. The measure is the occlusion ratio between visual objects, the sum-of-intersection-over-area (SIOA), which we define as an alternative to the IOU metric. The second part is “occlusion group energy minimization (OGEM)”, which prevents the occluded true positive tracks from false “track merging”. We define each group of the occluded objects as an energy function and find an optimal hypothesis which makes the energy minimal. We evaluate the proposed tracker on benchmark datasets such as MOT15 and MOT17, which are built for multi-person tracking. An ablation study on the training dataset shows not only that “track merging” and “OGEM” complement each other but also that the proposed tracking method is more robust and less sensitive to parameters than baseline methods. Also, SIOA works better than IOU for various sizes of false positives. Experimental results show that the proposed tracker efficiently handles occlusion situations and achieves competitive performance compared to the state-of-the-art methods. In particular, our method shows the best multi-object tracking accuracy among the online and real-time executable methods.
Tasks	Multi-Object Tracking, Object Tracking, Online Multi-Object Tracking
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13347v1
PDF	https://arxiv.org/pdf/1907.13347v1.pdf
PWC	https://paperswithcode.com/paper/online-multi-object-tracking-framework-with
Repo	https://github.com/SonginCV/GMPHD-OGM
Framework	none
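
To make the contrast between IOU and an intersection-over-area style measure concrete, here is a small sketch. The `iou` function is the standard definition. The `sioa_assumed` function is only one plausible reading of the name "sum-of-intersection-over-area" (intersection divided by each box's own area, summed); the paper's exact definition may differ, for example in normalization. Box format and function names are likewise assumptions.

```python
def area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection(a, b):
    """Overlap area of two axis-aligned boxes (0.0 if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def iou(a, b):
    """Standard intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def sioa_assumed(a, b):
    """ASSUMED reading of SIOA: intersection/area(a) + intersection/area(b)."""
    if area(a) == 0 or area(b) == 0:
        return 0.0
    inter = intersection(a, b)
    return inter / area(a) + inter / area(b)

# Hypothetical case: a small false positive box lying fully inside a large box.
large_box = (0.0, 0.0, 10.0, 10.0)
small_box = (4.0, 4.0, 6.0, 6.0)
iou_value = iou(large_box, small_box)
sioa_value = sioa_assumed(large_box, small_box)
```

In this example the IOU is small because the union is dominated by the large box, while the per-box ratio still registers that the small box is entirely covered, which is consistent with the abstract's remark about false positives of various sizes.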

Voronoi-based Efficient Surrogate-assisted Evolutionary Algorithm for Very Expensive Problems


Title	Voronoi-based Efficient Surrogate-assisted Evolutionary Algorithm for Very Expensive Problems
Authors	Hao Tong, Changwu Huang, Jialin Liu, Xin Yao
Abstract	Very expensive problems, in which one fitness evaluation costs several hours or even days, are very common in practical systems. Surrogate-assisted evolutionary algorithms (SAEAs) have been widely used to solve this crucial problem in the past decades. However, most studied SAEAs focus on solving problems with a budget of at least ten times the dimension of the problem, which is unacceptable in many very expensive real-world problems. In this paper, we employ the Voronoi diagram to boost the performance of SAEAs and propose a novel framework named Voronoi-based efficient surrogate-assisted evolutionary algorithm (VESAEA) for very expensive problems, in which the optimization budget, in terms of fitness evaluations, is only 5 times the problem’s dimension. In the proposed framework, the Voronoi diagram divides the whole search space into several subspaces and then the local search is performed in some potentially better subspace. Additionally, in order to trade off exploration and exploitation, the framework involves a global search stage developed by combining leave-one-out cross-validation and a radial basis function surrogate model. A performance selector is designed to switch the search dynamically and automatically between the global and local search stages. The empirical results on a variety of benchmark problems demonstrate that the proposed framework significantly outperforms several state-of-the-art algorithms with extremely limited fitness evaluations. Besides, the efficacy of the Voronoi diagram is further analyzed, and the results show its potential to optimize very expensive problems.
Tasks
Published	2019-01-17
URL	https://arxiv.org/abs/1901.05755v2
PDF	https://arxiv.org/pdf/1901.05755v2.pdf
PWC	https://paperswithcode.com/paper/voronoi-based-efficient-surrogate-assisted
Repo	https://github.com/HawkTom/VESAEA
Framework	tf
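
The partition of the search space by a Voronoi diagram can be sketched without building the diagram explicitly, because a point belongs to the cell of whichever evaluated sample is nearest to it. The code below is an assumption-laden illustration, not VESAEA itself: the rejection-sampling strategy, the rule of searching only the best sample's cell, the toy fitness values, and the function names are all hypothetical.

```python
import math
import random

def nearest_site(point, sites):
    """Index of the Voronoi cell containing `point` (nearest evaluated sample)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def sample_in_cell(sites, cell_index, bounds, num_candidates=200, seed=0):
    """Draw uniform candidates in `bounds` and keep those inside the chosen cell."""
    rng = random.Random(seed)
    kept = []
    for _ in range(num_candidates):
        p = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        if nearest_site(p, sites) == cell_index:
            kept.append(p)
    return kept

# Hypothetical evaluated samples and their (minimized) fitness values.
sites = [(0.2, 0.2), (0.8, 0.8), (0.2, 0.8)]
fitness = [3.0, 0.5, 2.0]
best_cell = min(range(len(sites)), key=lambda i: fitness[i])

# Candidates for a local search restricted to the most promising cell.
candidates = sample_in_cell(sites, best_cell, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

Because cell membership only needs distance comparisons, this kind of partition adds no fitness evaluations, which matters when the evaluation budget is a small multiple of the problem dimension.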

GASL: Guided Attention for Sparsity Learning in Deep Neural Networks


Title	GASL: Guided Attention for Sparsity Learning in Deep Neural Networks
Authors	Amirsina Torfi, Rouzbeh A. Shirvani, Sobhan Soleymani, Naser M. Nasrabadi
Abstract	The main goal of network pruning is imposing sparsity on the neural network by increasing the number of parameters with zero value in order to reduce the architecture size and achieve computational speedup. In most previous research works, sparsity is imposed stochastically without considering any prior knowledge of the weight distribution or other internal network characteristics. Enforcing too much sparsity may induce an accuracy drop because many important elements might have been eliminated. In this paper, we propose Guided Attention for Sparsity Learning (GASL) to (1) achieve model compression, with fewer elements and a speed-up; (2) prevent the accuracy drop by supervising the sparsity operation via a guided attention mechanism; and (3) introduce a generic mechanism that can be adapted for any type of architecture. Our work is aimed at providing a framework based on interpretable attention mechanisms for imposing structured and non-structured sparsity in deep neural networks. For CIFAR-100 experiments, we achieved the state-of-the-art sparsity level and 2.91x speedup with competitive accuracy compared to the best method. For MNIST and the LeNet architecture, we also achieved the highest sparsity and speedup level.
Tasks	Model Compression, Network Pruning
Published	2019-01-07
URL	http://arxiv.org/abs/1901.01939v2
PDF	http://arxiv.org/pdf/1901.01939v2.pdf
PWC	https://paperswithcode.com/paper/gasl-guided-attention-for-sparsity-learning
Repo	https://github.com/astorfi/attention-guided-sparsity
Framework	tf

Efficient Global Multi-object Tracking Under Minimum-cost Circulation Framework


Title	Efficient Global Multi-object Tracking Under Minimum-cost Circulation Framework
Authors	Congchao Wang, Yizhi Wang, Guoqiang Yu
Abstract	We developed a minimum-cost circulation framework for solving the global data association problem, which plays a key role in the tracking-by-detection paradigm of multi-object tracking. The global data association problem was extensively studied under the minimum-cost flow framework, which is theoretically attractive as being flexible and globally solvable. However, the high computational burden has been a long-standing obstacle to its wide adoption in practice. While enjoying the same theoretical advantages and maintaining the same optimal solution as the minimum-cost flow framework, our new framework has a better theoretical complexity bound and leads to orders-of-magnitude improvements in practical efficiency. This new framework is motivated by the observation that minimum-cost flow only partially models the data association problem and must be accompanied by an additional and time-consuming searching scheme to determine the optimal object number. By employing a minimum-cost circulation framework, we eliminate the searching step and naturally integrate the number of objects into the optimization problem. By exploring the special property of the associated graph, that is, that an overwhelming majority of the vertices have unit capacity, we designed an implementation of the framework and proved it has the best theoretical complexity so far for the global data association problem. We evaluated our method with 40 experiments on five MOT benchmark datasets. Our method was always the most efficient and on average 53 to 1,192 times faster than the three state-of-the-art methods. When our method served as a sub-module for global data association methods using higher-order constraints, similar efficiency improvement was attained. We further illustrated through several case studies how the improved computational efficiency enables more sophisticated tracking models and yields better tracking accuracy.
Tasks	Multi-Object Tracking, Object Tracking
Published	2019-11-02
URL	https://arxiv.org/abs/1911.00796v2
PDF	https://arxiv.org/pdf/1911.00796v2.pdf
PWC	https://paperswithcode.com/paper/efficient-global-multi-object-tracking-under
Repo	https://github.com/yu-lab-vt/CINDA
Framework	none