February 1, 2020

Paper Group AWR 291

What the Constant Velocity Model Can Teach Us About Pedestrian Motion Prediction. CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity. Detecting drug-drug interactions using artificial neural networks and classic graph similarity measures. Machine Learning and Deep Learning Algorithms for Bearing Fault Diagnostics – A Comprehensive …

What the Constant Velocity Model Can Teach Us About Pedestrian Motion Prediction

Title What the Constant Velocity Model Can Teach Us About Pedestrian Motion Prediction
Authors Christoph Schöller, Vincent Aravantinos, Florian Lay, Alois Knoll
Abstract Pedestrian motion prediction is a fundamental task for autonomous robots and vehicles to operate safely. In recent years, many complex approaches based on neural networks have been proposed to address this problem. In this work we show that, surprisingly, a simple Constant Velocity Model can outperform even state-of-the-art neural models. This indicates that either neural networks are not able to make use of the additional information they are provided with, or that this information is not as relevant as commonly believed. Therefore, we analyze how neural networks process their input and how it impacts their predictions. Our analysis reveals pitfalls in training neural networks for pedestrian motion prediction and clarifies false assumptions about the problem itself. In particular, neural networks implicitly learn environmental priors that negatively impact their generalization capability, the motion history of pedestrians is irrelevant, and interactions are too complex to predict. Our work shows how neural networks for pedestrian motion prediction can be thoroughly evaluated, and our results indicate which research directions for neural motion prediction are promising in the future.
Tasks motion prediction
Published 2019-03-19
URL https://arxiv.org/abs/1903.07933v3
PDF https://arxiv.org/pdf/1903.07933v3.pdf
PWC https://paperswithcode.com/paper/the-simpler-the-better-constant-velocity-for
Repo https://github.com/elbuco1/AttentionMechanismsTrajectoryPrediction
Framework pytorch
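
To make the constant velocity baseline concrete, here is a minimal sketch (the observation and prediction horizons are illustrative, and this is not the authors' exact implementation):

```python
import numpy as np

def constant_velocity_predict(observed_xy, pred_len=12):
    """Extrapolate the last observed velocity for pred_len future steps.

    observed_xy: array of shape (obs_len, 2) with past (x, y) positions.
    Returns an array of shape (pred_len, 2) with predicted positions.
    """
    last_pos = observed_xy[-1]
    last_vel = observed_xy[-1] - observed_xy[-2]   # constant velocity assumption
    steps = np.arange(1, pred_len + 1).reshape(-1, 1)
    return last_pos + steps * last_vel

# Example: a pedestrian walking diagonally at a steady pace.
history = np.array([[0.0, 0.0], [0.4, 0.3], [0.8, 0.6], [1.2, 0.9]])
print(constant_velocity_predict(history, pred_len=3))
```

The paper's point is that this two-line extrapolation is a surprisingly strong baseline against which neural predictors should be compared.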

CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity

Title CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity
Authors Konpat Preechakul, Boonserm Kijsirikul
Abstract Most optimizers, including stochastic gradient descent (SGD) and its adaptive gradient derivatives, face the same problem: the effective learning rate varies vastly over the course of training. A learning rate schedule, mostly tuned by hand, is usually employed in practice. In this paper, we propose CProp, a gradient scaling method, which acts as a second-level learning rate adapting throughout the training process based on cues from past gradient conformity. When the past gradients agree on direction, CProp keeps the original learning rate. On the contrary, if the gradients do not agree on direction, CProp scales down the gradient proportionally to its uncertainty. Since it works by scaling, it can be applied to any existing optimizer, extending its learning rate scheduling capability. We put CProp to a series of tests showing significant gains in training speed for both SGD and adaptive gradient methods like Adam. Code is available at https://github.com/phizaz/cprop.
Tasks Stochastic Optimization
Published 2019-12-24
URL https://arxiv.org/abs/1912.11493v1
PDF https://arxiv.org/pdf/1912.11493v1.pdf
PWC https://paperswithcode.com/paper/cprop-adaptive-learning-rate-scaling-from
Repo https://github.com/phizaz/cprop
Framework pytorch
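
The gradient-conformity idea can be sketched as a wrapper that rescales gradients before the optimizer step. The statistic below (a per-parameter ratio of the running mean to the running RMS of gradients) is a simplified stand-in for the paper's conformity measure; see the linked repository for the official implementation.

```python
import torch

class CPropLikeScaler:
    """Illustrative second-level scaling of gradients by past conformity.

    Keeps running first/second moments of each gradient and scales the
    gradient by a factor in (0, 1) that is small when past gradients
    disagree in sign (high uncertainty) and close to 1 when they agree.
    This is a simplified proxy, not the official CProp statistic.
    """
    def __init__(self, params, beta=0.999, eps=1e-8):
        self.params = list(params)
        self.beta, self.eps = beta, eps
        self.m = [torch.zeros_like(p) for p in self.params]
        self.v = [torch.zeros_like(p) for p in self.params]

    @torch.no_grad()
    def scale_gradients(self):
        for p, m, v in zip(self.params, self.m, self.v):
            if p.grad is None:
                continue
            g = p.grad
            m.mul_(self.beta).add_(g, alpha=1 - self.beta)
            v.mul_(self.beta).add_(g * g, alpha=1 - self.beta)
            conformity = m.abs() / (v.sqrt() + self.eps)   # roughly |mean| / RMS
            p.grad.mul_(conformity.clamp(max=1.0))

# Usage: call scaler.scale_gradients() after loss.backward(), before optimizer.step().
```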

Detecting drug-drug interactions using artificial neural networks and classic graph similarity measures

Title Detecting drug-drug interactions using artificial neural networks and classic graph similarity measures
Authors Guy Shtar, Lior Rokach, Bracha Shapira
Abstract Drug-drug interactions are preventable causes of medical injuries and often result in doctor and emergency room visits. Computational techniques can be used to predict potential drug-drug interactions. We approach the drug-drug interaction prediction problem as a link prediction problem and present two novel methods for drug-drug interaction prediction based on artificial neural networks and factor propagation over graph nodes: adjacency matrix factorization (AMF) and adjacency matrix factorization with propagation (AMFP). We conduct a retrospective analysis by training our models on a previous release of the DrugBank database with 1,141 drugs and 45,296 drug-drug interactions and evaluate the results on a later version of DrugBank with 1,440 drugs and 248,146 drug-drug interactions. Additionally, we perform a holdout analysis using DrugBank. We report an area under the receiver operating characteristic curve of 0.807 and 0.990 for the retrospective and holdout analyses, respectively. Finally, we create an ensemble-based classifier using AMF, AMFP, and existing link prediction methods and obtain an area under the receiver operating characteristic curve of 0.814 and 0.991 for the retrospective and holdout analyses, respectively. We demonstrate that AMF and AMFP provide state-of-the-art results compared to existing methods and that the ensemble-based classifier improves performance by combining various predictors. These results suggest that AMF, AMFP, and the proposed ensemble-based classifier can provide important information during drug development and for drug prescription, given only partial or noisy data. These methods can also be used to solve other link prediction problems. Drug embeddings (compressed representations) created when training our models on the interaction network have been made public.
Tasks Graph Similarity, Link Prediction
Published 2019-03-11
URL https://arxiv.org/abs/1903.04571v2
PDF https://arxiv.org/pdf/1903.04571v2.pdf
PWC https://paperswithcode.com/paper/detecting-drug-drug-interactions-using
Repo https://github.com/goolig/DDI_prediction
Framework none
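
A minimal sketch of adjacency-matrix-factorization-style link prediction with learned drug embeddings is shown below. The embedding size, the placeholder training labels, and the omission of AMFP's propagation step are all simplifications for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class AdjacencyFactorizer(nn.Module):
    """Score a drug pair by the inner product of learned drug embeddings."""
    def __init__(self, n_drugs, dim=64):
        super().__init__()
        self.emb = nn.Embedding(n_drugs, dim)

    def forward(self, drug_i, drug_j):
        # Probability that an interaction (edge) exists between the two drugs.
        return torch.sigmoid((self.emb(drug_i) * self.emb(drug_j)).sum(-1))

# Training sketch: binary cross-entropy against observed edges / sampled non-edges.
n_drugs = 1141
model = AdjacencyFactorizer(n_drugs)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
i = torch.randint(0, n_drugs, (256,))
j = torch.randint(0, n_drugs, (256,))
labels = torch.randint(0, 2, (256,)).float()   # placeholder interaction labels
loss = nn.functional.binary_cross_entropy(model(i, j), labels)
loss.backward(); opt.step()
```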

Machine Learning and Deep Learning Algorithms for Bearing Fault Diagnostics – A Comprehensive Review

Title Machine Learning and Deep Learning Algorithms for Bearing Fault Diagnostics – A Comprehensive Review
Authors Shen Zhang, Shibo Zhang, Bingnan Wang, Thomas G. Habetler
Abstract In this survey paper, we systematically summarize the existing literature on bearing fault diagnostics with machine learning (ML) and data mining techniques. While conventional ML methods, including artificial neural networks (ANN), principal component analysis (PCA), support vector machines (SVM), etc., have been successfully applied to the detection and categorization of bearing faults for decades, developments in deep learning (DL) algorithms over the last five years have sparked renewed interest in both industry and academia for intelligent machine health monitoring. In this paper, we first provide a brief review of conventional ML methods before taking a deep dive into the state-of-the-art DL algorithms for bearing fault applications. Specifically, the superiority of DL-based methods over conventional ML methods is analyzed in terms of fault feature extraction and classification performance; many new functionalities enabled by DL techniques are also summarized. In addition, to obtain a more intuitive insight, a comparative study is conducted on the classification accuracy of different algorithms using the open-source Case Western Reserve University (CWRU) bearing dataset. Finally, to facilitate the transition to applying various DL algorithms to bearing fault diagnostics, detailed recommendations and suggestions are provided for specific application conditions such as the setup environment, the data size, and the number of sensors and sensor types. Future research directions to further enhance the performance of DL algorithms on health monitoring are also discussed.
Tasks Time Series
Published 2019-01-24
URL https://arxiv.org/abs/1901.08247v3
PDF https://arxiv.org/pdf/1901.08247v3.pdf
PWC https://paperswithcode.com/paper/machine-learning-and-deep-learning-algorithms
Repo https://github.com/hustcxl/Deep-learning-in-PHM
Framework none
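
To illustrate the kind of conventional ML pipeline the survey compares against DL methods, here is a hedged sketch: hand-crafted time-domain features from vibration windows fed to an SVM. The signals below are synthetic placeholders, not the CWRU data.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def time_domain_features(window):
    """RMS, peak, crest factor and kurtosis, commonly used for bearing signals."""
    rms = np.sqrt(np.mean(window ** 2))
    peak = np.max(np.abs(window))
    crest = peak / (rms + 1e-12)
    kurt = np.mean((window - window.mean()) ** 4) / (window.std() ** 4 + 1e-12)
    return np.array([rms, peak, crest, kurt])

# Synthetic stand-in for windowed vibration signals with 4 fault classes.
rng = np.random.default_rng(0)
X = np.stack([time_domain_features(rng.normal(scale=1 + c, size=2048))
              for c in range(4) for _ in range(100)])
y = np.repeat(np.arange(4), 100)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = SVC(kernel="rbf").fit(Xtr, ytr)
print("held-out accuracy:", clf.score(Xte, yte))
```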

Investigating Deep Neural Transformations for Spectrogram-based Musical Source Separation

Title Investigating Deep Neural Transformations for Spectrogram-based Musical Source Separation
Authors Woosung Choi, Minseok Kim, Jaehwa Chung, Daewon Lee, Soonyoung Jung
Abstract Musical Source Separation (MSS) is a signal processing task that tries to separate a mixed musical signal into its acoustic sound sources, such as singing voice or drums. Recently, many machine learning-based methods have been proposed for the MSS task, but no existing work evaluates and directly compares various types of networks. In this paper, we aim to design a variety of neural transformation methods, including time-invariant methods, time-frequency methods, and mixtures of two different transformations. Our experiments provide abundant material for future work by comparing several transformation methods. We train our models on raw complex-valued STFT outputs and achieve state-of-the-art SDR performance on the MUSDB singing voice separation task by a large margin of 1.0 dB.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.02591v2
PDF https://arxiv.org/pdf/1912.02591v2.pdf
PWC https://paperswithcode.com/paper/investigating-deep-neural-transformations-for
Repo https://github.com/Intelligence-Engineering-LAB-KU/Musical-Source-Separation
Framework pytorch
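
The contrast between transformation types can be sketched as follows: a time-invariant transformation applies the same frequency-wise mapping to every STFT frame, while a time-frequency transformation mixes both axes (here a 2-D convolution). Layer sizes are illustrative and this is not the paper's exact block design.

```python
import torch
import torch.nn as nn

n_fft, n_bins = 2048, 1025          # n_bins = n_fft // 2 + 1

# Time-invariant transformation: the same frequency-wise mapping at every frame.
time_invariant = nn.Sequential(nn.Linear(2 * n_bins, 512), nn.ReLU(),
                               nn.Linear(512, 2 * n_bins))

# Time-frequency transformation: a 2-D convolution mixing both axes.
time_frequency = nn.Conv2d(in_channels=2, out_channels=2, kernel_size=3, padding=1)

waveform = torch.randn(1, 44100)
stft = torch.stft(waveform, n_fft=n_fft, window=torch.hann_window(n_fft),
                  return_complex=True)                           # (1, n_bins, frames)
x = torch.view_as_real(stft)                                     # (1, n_bins, frames, 2)

# Frame-wise (time-invariant) pass: flatten real/imag per frame.
frames = x.permute(0, 2, 1, 3).reshape(1, -1, 2 * n_bins)
out_ti = time_invariant(frames)

# Time-frequency pass: treat real/imag as 2 input channels.
out_tf = time_frequency(x.permute(0, 3, 1, 2))
print(out_ti.shape, out_tf.shape)
```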

Unsupervised Deep Transfer Learning for Intelligent Fault Diagnosis: An Open Source and Comparative Study

Title Unsupervised Deep Transfer Learning for Intelligent Fault Diagnosis: An Open Source and Comparative Study
Authors Zhibin Zhao, Qiyang Zhang, Xiaolei Yu, Chuang Sun, Shibin Wang, Ruqiang Yan, Xuefeng Chen
Abstract Recent progress on intelligent fault diagnosis has greatly depended on deep learning and plentiful labeled data. However, machines often operate under varying working conditions, or the target task has a different distribution from the collected training data (the domain shift problem). This motivates deep transfer learning based (DTL-based) intelligent fault diagnosis, which attempts to mitigate the domain shift problem. Moreover, the newly collected testing data are usually unlabeled, which leads to the subclass of DTL-based methods called unsupervised deep transfer learning based (UDTL-based) intelligent fault diagnosis. Although the field has developed rapidly, a standard, open-source code framework and a comparative study for UDTL-based intelligent fault diagnosis have not yet been established. In this paper, commonly used UDTL-based algorithms for intelligent fault diagnosis are integrated into a unified testing framework, and the framework is tested on five datasets. Extensive experiments are performed to provide a systematic comparative analysis and benchmark accuracies for more comparable and meaningful further studies. To emphasize the importance and reproducibility of UDTL-based intelligent fault diagnosis, the testing framework with source code will be released to the research community to facilitate future research. Finally, the comparative analysis also reveals some open and essential issues in DTL for intelligent fault diagnosis that are rarely studied, including transferability of features, influence of backbones, negative transfer, and physical priors. In summary, the released framework and comparative study can serve as an extended interface and benchmark results for new studies on UDTL-based intelligent fault diagnosis. The code framework is available at https://github.com/ZhaoZhibin/UDTL.
Tasks Transfer Learning
Published 2019-12-28
URL https://arxiv.org/abs/1912.12528v1
PDF https://arxiv.org/pdf/1912.12528v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-deep-transfer-learning-for
Repo https://github.com/ZhaoZhibin/UDTL
Framework pytorch
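
One ingredient common to several UDTL algorithms is aligning source and target feature distributions with a maximum mean discrepancy (MMD) penalty added to the supervised source loss. The sketch below shows only an RBF-kernel MMD term, not the full benchmark framework from the linked repository.

```python
import torch

def rbf_mmd(source_feat, target_feat, sigma=1.0):
    """Squared maximum mean discrepancy between two feature batches (RBF kernel)."""
    def kernel(a, b):
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2 * sigma ** 2))
    return (kernel(source_feat, source_feat).mean()
            + kernel(target_feat, target_feat).mean()
            - 2 * kernel(source_feat, target_feat).mean())

# Training objective sketch: supervised loss on labeled source data plus an
# MMD term that pulls unlabeled target features toward the source features:
# total_loss = cross_entropy(classifier(fs), ys) + lambda_mmd * rbf_mmd(fs, ft)
```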

Probing the phonetic and phonological knowledge of tones in Mandarin TTS models

Title Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Authors Jian Zhu
Abstract This study probes the phonetic and phonological knowledge of lexical tones in TTS models through two experiments. Controlled stimuli for testing tonal coarticulation and tone sandhi in Mandarin were fed into Tacotron 2 and WaveGlow to generate speech samples, which were subject to acoustic analysis and human evaluation. Results show that both baseline Tacotron 2 and Tacotron 2 with BERT embeddings capture the surface tonal coarticulation patterns well but fail to consistently apply the Tone-3 sandhi rule to novel sentences. Incorporating pre-trained BERT embeddings into Tacotron 2 improves the naturalness and prosody performance, and yields better generalization of Tone-3 sandhi rules to novel complex sentences, although the overall accuracy for Tone-3 sandhi was still low. Given that TTS models do capture some linguistic phenomena, it is argued that they can be used to generate and validate certain linguistic hypotheses. On the other hand, it is also suggested that linguistically informed stimuli should be included in the training and the evaluation of TTS models.
Tasks
Published 2019-12-23
URL https://arxiv.org/abs/1912.10915v1
PDF https://arxiv.org/pdf/1912.10915v1.pdf
PWC https://paperswithcode.com/paper/probing-the-phonetic-and-phonological
Repo https://github.com/lingjzhu/probing-TTS-models
Framework pytorch
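
The acoustic analysis of tones largely comes down to comparing F0 contours of the synthesized stimuli. A possible sketch using librosa's pYIN tracker is shown below; the file path is a placeholder and the frequency range is an assumption.

```python
import librosa
import numpy as np

def f0_contour(wav_path, sr=22050):
    """Extract a fundamental-frequency contour for tone analysis."""
    y, sr = librosa.load(wav_path, sr=sr)
    f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C6"), sr=sr)
    return np.where(voiced, f0, np.nan)   # NaN over unvoiced frames

# e.g. compare the contour of a Tone-3 + Tone-3 sequence before and after sandhi:
# contour = f0_contour("synthesized_sample.wav")
```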

Fair Decisions Despite Imperfect Predictions

Title Fair Decisions Despite Imperfect Predictions
Authors Niki Kilbertus, Manuel Gomez-Rodriguez, Bernhard Schölkopf, Krikamol Muandet, Isabel Valera
Abstract Consequential decisions are increasingly informed by sophisticated data-driven predictive models. However, consistently learning accurate predictive models requires access to ground truth labels. Unfortunately, in practice, labels may only exist conditional on certain decisions—if a loan is denied, there is not even an option for the individual to pay back the loan. In this paper, we show that, in this selective labels setting, learning to predict is suboptimal in terms of both fairness and utility. To avoid this undesirable behavior, we propose to directly learn stochastic decision policies that maximize utility under fairness constraints. In the context of fair machine learning, our results suggest the need for a paradigm shift from “learning to predict” to “learning to decide”. Experiments on synthetic and real-world data illustrate the favorable properties of learning to decide, in terms of both utility and fairness.
Tasks Decision Making
Published 2019-02-08
URL https://arxiv.org/abs/1902.02979v3
PDF https://arxiv.org/pdf/1902.02979v3.pdf
PWC https://paperswithcode.com/paper/improving-consequential-decision-making-under
Repo https://github.com/nikikilbertus/fair-decisions
Framework none
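
A toy sketch of "learning to decide": a stochastic decision policy trained to maximize expected utility with a penalty on the gap in decision rates between two groups. This illustrates the policy-learning idea only; the data, the utility model, and the demographic-parity penalty are placeholders, not the authors' exact objective.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(5, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

x = torch.randn(512, 5)                 # applicant features (synthetic)
group = torch.randint(0, 2, (512,))     # protected attribute (synthetic)
repay = torch.rand(512)                 # probability of repayment (synthetic)
cost = 0.6                              # lender's break-even threshold
lam = 1.0                               # weight of the fairness penalty

for _ in range(200):
    d = policy(x).squeeze(-1)                       # probability of approving
    utility = (d * (repay - cost)).mean()           # expected profit of decisions
    parity_gap = (d[group == 0].mean() - d[group == 1].mean()).abs()
    loss = -utility + lam * parity_gap
    opt.zero_grad(); loss.backward(); opt.step()
```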

Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets

Title Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
Authors Robert Cornish, Paul Vanetti, Alexandre Bouchard-Côté, George Deligiannidis, Arnaud Doucet
Abstract Bayesian inference via standard Markov Chain Monte Carlo (MCMC) methods is too computationally intensive to handle large datasets, since the cost per step usually scales like $\Theta(n)$ in the number of data points $n$. We propose the Scalable Metropolis-Hastings (SMH) kernel that exploits Gaussian concentration of the posterior to require processing on average only $O(1)$ or even $O(1/\sqrt{n})$ data points per step. This scheme is based on a combination of factorized acceptance probabilities, procedures for fast simulation of Bernoulli processes, and control variate ideas. Contrary to many MCMC subsampling schemes such as fixed step-size Stochastic Gradient Langevin Dynamics, our approach is exact insofar as the invariant distribution is the true posterior and not an approximation to it. We characterise the performance of our algorithm theoretically, and give realistic and verifiable conditions under which it is geometrically ergodic. This theory is borne out by empirical results that demonstrate overall performance benefits over standard Metropolis-Hastings and various subsampling algorithms.
Tasks Bayesian Inference
Published 2019-01-28
URL https://arxiv.org/abs/1901.09881v3
PDF https://arxiv.org/pdf/1901.09881v3.pdf
PWC https://paperswithcode.com/paper/scalable-metropolis-hastings-for-exact
Repo https://github.com/pjcv/smh
Framework none
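
For context, the Θ(n) per-step cost that SMH removes comes from evaluating the full-data log-posterior inside the standard Metropolis-Hastings acceptance step, as in this plain random-walk MH sketch. The factorized acceptance probabilities and Bernoulli-process simulation of SMH itself are not shown.

```python
import numpy as np

def random_walk_mh(log_post, theta0, n_steps=1000, step=0.1, rng=None):
    """Plain random-walk Metropolis-Hastings; log_post touches all n data points."""
    rng = rng or np.random.default_rng(0)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    samples = []
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_post(prop)                      # Theta(n) work per step
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    return np.array(samples)

# Example target: Gaussian likelihood over n data points with a flat prior.
data = np.random.default_rng(1).normal(loc=2.0, size=10_000)
log_post = lambda mu: -0.5 * np.sum((data - mu) ** 2)
print(random_walk_mh(log_post, theta0=[0.0])[-5:])
```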

Deep weakly-supervised learning methods for classification and localization in histology images: a survey

Title Deep weakly-supervised learning methods for classification and localization in histology images: a survey
Authors Jérôme Rony, Soufiane Belharbi, Jose Dolz, Ismail Ben Ayed, Luke McCaffrey, Eric Granger
Abstract Using state-of-the-art deep learning models for the computer-assisted diagnosis of diseases like cancer raises several challenges related to the nature and availability of labeled histology images. In particular, cancer grading and localization in these images normally relies on both image- and pixel-level labels, the latter requiring a costly annotation process. In this survey, deep weakly-supervised learning (WSL) architectures are investigated to identify and locate diseases in histology images without the need for pixel-level annotations. Given a training dataset with globally-annotated images, these models allow one to simultaneously classify histology images while localizing the corresponding regions of interest. These models are organized into two main approaches: (1) bottom-up approaches (based on forward-pass information through a network, either by spatial pooling of representations/scores or by detecting class regions), and (2) top-down approaches (based on backward-pass information within a network, inspired by human visual attention). Since relevant WSL models have mainly been developed in the computer vision community and validated on natural scene images, we assess the extent to which they apply to histology images, which have challenging properties, e.g., large size, non-salient and highly unstructured regions, stain heterogeneity, and coarse/ambiguous labels. The most relevant deep WSL models (e.g., CAM, WILDCAT and Deep MIL) are compared experimentally in terms of accuracy (classification and pixel-level localization) on several public benchmark histology datasets for breast and colon cancer (BACH ICIAR 2018, BreakHis, CAMELYON16, and GlaS). Results indicate that several deep learning models, and in particular WILDCAT and Deep MIL, can provide a high level of classification accuracy, although pixel-wise localization of cancer regions remains an issue for such images.
Tasks
Published 2019-09-08
URL https://arxiv.org/abs/1909.03354v2
PDF https://arxiv.org/pdf/1909.03354v2.pdf
PWC https://paperswithcode.com/paper/deep-weakly-supervised-learning-methods-for
Repo https://github.com/jeromerony/survey_wsl_histology
Framework pytorch
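
Among the surveyed bottom-up approaches, CAM localizes a class by projecting the final classifier weights onto the last convolutional feature maps. A minimal sketch is below; the feature-map and classifier shapes are illustrative (ResNet-style), not tied to any specific model in the survey.

```python
import torch
import torch.nn.functional as F

def class_activation_map(features, fc_weight, class_idx):
    """CAM: weighted sum of the last conv feature maps.

    features:  (C, H, W) activations before global average pooling
    fc_weight: (num_classes, C) weights of the final linear classifier
    """
    cam = torch.einsum("c,chw->hw", fc_weight[class_idx], features)
    cam = F.relu(cam)
    return cam / (cam.max() + 1e-8)       # normalize to [0, 1] for overlay

# Example with random tensors standing in for a 2048-channel feature map.
feat = torch.randn(2048, 16, 16)
w = torch.randn(2, 2048)                  # binary tumor / normal classifier
heatmap = class_activation_map(feat, w, class_idx=1)
print(heatmap.shape)                      # (16, 16); upsample to the image size
```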

Tracking Holistic Object Representations

Title Tracking Holistic Object Representations
Authors Axel Sauer, Elie Aljalbout, Sami Haddadin
Abstract Recent advances in visual tracking are based on siamese feature extractors and template matching. For this category of trackers, latest research focuses on better feature embeddings and similarity measures. In this work, we focus on building holistic object representations for tracking. We propose a framework that is designed to be used on top of previous trackers without any need for further training of the siamese network. The framework leverages the idea of obtaining additional object templates during the tracking process. Since the number of stored templates is limited, our method only keeps the most diverse ones. We achieve this by providing a new diversity measure in the space of siamese features. The obtained representation contains information beyond the ground truth object location provided to the system. It is then useful for tracking itself but also for further tasks which require a visual understanding of objects. Strong empirical results on tracking benchmarks indicate that our method can improve the performance and robustness of the underlying trackers while barely reducing their speed. In addition, our method is able to match current state-of-the-art results, while using a simpler and older network architecture and running three times faster.
Tasks Visual Object Tracking, Visual Tracking
Published 2019-07-21
URL https://arxiv.org/abs/1907.12920v2
PDF https://arxiv.org/pdf/1907.12920v2.pdf
PWC https://paperswithcode.com/paper/tracking-holistic-object-representations
Repo https://github.com/xl-sr/THOR
Framework pytorch
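
A rough sketch of the template-memory idea: keep a candidate template only if it increases a diversity score computed in siamese feature space. The Gram-matrix determinant used here is one natural choice for such a score, shown purely for illustration; consult the repository for the tracker's actual measure and update rule.

```python
import torch

def gram_det(templates):
    """Diversity score: determinant of the Gram matrix of normalized templates."""
    z = torch.nn.functional.normalize(templates.flatten(1), dim=1)
    return torch.det(z @ z.t())

def maybe_store(memory, candidate, capacity=5):
    """Keep the candidate only if swapping it in increases template diversity."""
    if len(memory) < capacity:
        return memory + [candidate]
    base = gram_det(torch.stack(memory))
    best, best_gain = None, 0.0
    for i in range(capacity):
        trial = memory[:i] + [candidate] + memory[i + 1:]
        gain = gram_det(torch.stack(trial)) - base
        if gain > best_gain:
            best, best_gain = trial, gain
    return best if best is not None else memory

# Templates are siamese feature crops, e.g. tensors of shape (256, 6, 6).
```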

Learning Complex Basis Functions for Invariant Representations of Audio

Title Learning Complex Basis Functions for Invariant Representations of Audio
Authors Stefan Lattner, Monika Dörfler, Andreas Arzt
Abstract Learning features from data has been shown to be more successful than using hand-crafted features for many machine learning tasks. In music information retrieval (MIR), features learned from windowed spectrograms are highly variant to transformations like transposition or time-shift. Such variances are undesirable when they are irrelevant for the respective MIR task. We propose an architecture called the Complex Autoencoder (CAE) which learns features invariant to orthogonal transformations. Mapping signals onto complex basis functions learned by the CAE results in a transformation-invariant “magnitude space” and a transformation-variant “phase space”. The phase space is useful to infer transformations between data pairs. By exploiting the invariance property of the magnitude space, we achieve state-of-the-art results in audio-to-score alignment and repeated section discovery for audio. A PyTorch implementation of the CAE, including the repeated section discovery method, is available online.
Tasks Information Retrieval, Music Information Retrieval
Published 2019-07-13
URL https://arxiv.org/abs/1907.05982v1
PDF https://arxiv.org/pdf/1907.05982v1.pdf
PWC https://paperswithcode.com/paper/learning-complex-basis-functions-for
Repo https://github.com/SonyCSLParis/cae-invar
Framework pytorch
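
The core mechanism, projecting each input frame onto learned complex basis functions and splitting the result into a transformation-invariant magnitude and a transformation-variant phase, can be sketched with two real-valued linear layers. The layer sizes are illustrative and the CAE training objective is omitted.

```python
import torch
import torch.nn as nn

class ComplexProjection(nn.Module):
    """Project real-valued spectrogram frames onto learned complex basis functions."""
    def __init__(self, in_dim, n_basis):
        super().__init__()
        self.real = nn.Linear(in_dim, n_basis, bias=False)
        self.imag = nn.Linear(in_dim, n_basis, bias=False)

    def forward(self, x):
        re, im = self.real(x), self.imag(x)
        magnitude = torch.sqrt(re ** 2 + im ** 2 + 1e-12)   # invariant part
        phase = torch.atan2(im, re)                         # transformation-variant part
        return magnitude, phase

frames = torch.randn(8, 1025)            # e.g. windowed spectrogram frames
mag, phase = ComplexProjection(1025, 256)(frames)
print(mag.shape, phase.shape)
```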

A Distributed Fair Machine Learning Framework with Private Demographic Data Protection

Title A Distributed Fair Machine Learning Framework with Private Demographic Data Protection
Authors Hui Hu, Yijun Liu, Zhen Wang, Chao Lan
Abstract Fair machine learning has become a significant research topic with broad societal impact. However, most fair learning methods require direct access to personal demographic data, the use of which is increasingly restricted to protect user privacy (e.g., by the EU General Data Protection Regulation). In this paper, we propose a distributed fair learning framework for protecting the privacy of demographic data. We assume this data is privately held by a third party, which can communicate with the data center (responsible for model development) without revealing the demographic information. We propose a principled approach to designing fair learning methods under this framework, exemplify four methods, and show they consistently outperform their existing counterparts in both fairness and accuracy across three real-world datasets. We theoretically analyze the framework and prove it can learn models with high fairness or high accuracy, with their trade-off balanced by a threshold variable.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.08081v1
PDF https://arxiv.org/pdf/1909.08081v1.pdf
PWC https://paperswithcode.com/paper/a-distributed-fair-machine-learning-framework
Repo https://github.com/HuiHu1/Privacy-of-Distributed-Fair-Learning-Framework
Framework none
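
A toy sketch of the communication pattern: the data center sends per-example scores to a third party that privately holds the protected attribute and receives back only an aggregate fairness signal (here a demographic-parity gap), never the attribute itself. The actual protocol and the four proposed methods differ from this illustration.

```python
import numpy as np

class DemographicDataHolder:
    """Third party: holds protected attributes, reveals only aggregate statistics."""
    def __init__(self, protected):
        self._protected = np.asarray(protected)       # never leaves this class

    def fairness_gap(self, scores):
        scores = np.asarray(scores)
        g0 = scores[self._protected == 0].mean()
        g1 = scores[self._protected == 1].mean()
        return abs(g0 - g1)                            # single scalar sent back

# Data center side: it only ever sees the scalar gap, not the attributes.
holder = DemographicDataHolder(protected=np.random.randint(0, 2, size=1000))
model_scores = np.random.rand(1000)
penalty = holder.fairness_gap(model_scores)
print("fairness penalty used in training:", penalty)
```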

Contrastive Predictive Coding Based Feature for Automatic Speaker Verification

Title Contrastive Predictive Coding Based Feature for Automatic Speaker Verification
Authors Cheng-I Lai
Abstract This thesis describes our ongoing work on Contrastive Predictive Coding (CPC) features for speaker verification. CPC is a recently proposed representation learning framework based on predictive coding and noise contrastive estimation. We focus on incorporating CPC features into the standard automatic speaker verification systems, and we present our methods, experiments, and analysis. This thesis also details necessary background knowledge in past and recent work on automatic speaker verification systems, conventional speech features, and the motivation and techniques behind CPC.
Tasks Representation Learning, Speaker Verification
Published 2019-04-01
URL http://arxiv.org/abs/1904.01575v1
PDF http://arxiv.org/pdf/1904.01575v1.pdf
PWC https://paperswithcode.com/paper/contrastive-predictive-coding-based-feature
Repo https://github.com/jefflai108/Contrastive-Predictive-Coding-PyTorch
Framework pytorch
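
At the heart of CPC is the InfoNCE objective: a context vector produced by an autoregressive model must identify the true future latent among negatives drawn from the batch. The single-step sketch below uses illustrative encoder and GRU sizes, not the thesis's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Conv1d(1, 256, kernel_size=10, stride=5)    # waveform -> latent frames
ar = nn.GRU(256, 256, batch_first=True)              # latents -> context vectors
W = nn.Linear(256, 256, bias=False)                  # prediction head (1 step ahead)

wave = torch.randn(16, 1, 16000)                     # batch of 1-second clips
z = enc(wave).transpose(1, 2)                        # (B, T, 256)
c, _ = ar(z)                                         # (B, T, 256)

t = 50                                               # predict z[t+1] from c[t]
pred = W(c[:, t])                                    # (B, 256)
targets = z[:, t + 1]                                # (B, 256)
logits = pred @ targets.t()                          # scores vs. all items in batch
labels = torch.arange(wave.size(0))                  # positives are on the diagonal
info_nce = F.cross_entropy(logits, labels)
print(info_nce.item())
```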

Behaviour Suite for Reinforcement Learning

Title Behaviour Suite for Reinforcement Learning
Authors Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado Van Hasselt
Abstract This paper introduces the Behaviour Suite for Reinforcement Learning, or bsuite for short. bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives. First, to collect clear, informative and scalable problems that capture key issues in the design of general and efficient learning algorithms. Second, to study agent behaviour through their performance on these shared benchmarks. To complement this effort, we open-source github.com/deepmind/bsuite, which automates evaluation and analysis of any agent on bsuite. This library facilitates reproducible and accessible research on the core issues in RL, and ultimately the design of superior learning algorithms. Our code is in Python and easy to use within existing projects. We include examples with OpenAI Baselines and Dopamine, as well as new reference implementations. Going forward, we hope to incorporate more excellent experiments from the research community, and we commit to a periodic review of bsuite by a committee of prominent researchers.
Tasks
Published 2019-08-09
URL https://arxiv.org/abs/1908.03568v3
PDF https://arxiv.org/pdf/1908.03568v3.pdf
PWC https://paperswithcode.com/paper/behaviour-suite-for-reinforcement-learning
Repo https://github.com/deepmind/bsuite
Framework tf
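
A minimal usage sketch based on the repository's documented entry points (the API may differ across versions): load one experiment with result recording and run a random agent for the prescribed number of episodes.

```python
import numpy as np
import bsuite

# Load one experiment and record results to CSV for the bsuite analysis notebooks.
env = bsuite.load_and_record('catch/0', save_path='/tmp/bsuite', logging_mode='csv')

rng = np.random.default_rng(0)
for _ in range(env.bsuite_num_episodes):       # number of episodes is fixed per task
    timestep = env.reset()
    while not timestep.last():
        action = rng.integers(env.action_spec().num_values)   # random agent
        timestep = env.step(action)
```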