October 21, 2019


Paper Group AWR 54


Characterizing Well-Behaved vs. Pathological Deep Neural Networks


Title	Characterizing Well-Behaved vs. Pathological Deep Neural Networks
Authors	Antoine Labatie
Abstract	We introduce a novel approach, requiring only mild assumptions, for the characterization of deep neural networks at initialization. Our approach applies both to fully-connected and convolutional networks and easily incorporates batch normalization and skip-connections. Our key insight is to consider the evolution with depth of statistical moments of signal and noise, thereby characterizing the presence or absence of pathologies in the hypothesis space encoded by the choice of hyperparameters. We establish: (i) for feedforward networks, with and without batch normalization, the multiplicativity of layer composition inevitably leads to ill-behaved moments and pathologies; (ii) for residual networks with batch normalization, on the other hand, skip-connections induce power-law rather than exponential behaviour, leading to well-behaved moments and no pathology.
Tasks
Published	2018-11-07
URL	https://arxiv.org/abs/1811.03087v5
PDF	https://arxiv.org/pdf/1811.03087v5.pdf
PWC	https://paperswithcode.com/paper/characterizing-well-behaved-vs-pathological
Repo	https://github.com/alabatie/moments-dnns
Framework	tf

Learning Depth with Convolutional Spatial Propagation Network


Title	Learning Depth with Convolutional Spatial Propagation Network
Authors	Xinjing Cheng, Peng Wang, Ruigang Yang
Abstract	Depth prediction is one of the fundamental problems in computer vision. In this paper, we propose a simple yet effective convolutional spatial propagation network (CSPN) to learn the affinity matrix for various depth estimation tasks. Specifically, it is an efficient linear propagation model in which the propagation is performed by a recurrent convolutional operation, and the affinity among neighboring pixels is learned through a deep convolutional neural network (CNN). We can append this module to the output of any state-of-the-art (SOTA) depth estimation network to improve its performance. In practice, we further extend CSPN in two aspects: 1) we take a sparse depth map as additional input, which is useful for the task of depth completion; 2) similar to the commonly used 3D convolution operation in CNNs, we propose 3D CSPN to handle features with one additional dimension, which is effective in the task of stereo matching using a 3D cost volume. For the task of sparse-to-dense conversion, a.k.a. depth completion, we evaluated the proposed CSPN-based algorithms on the popular NYU v2 and KITTI datasets, where we show that they not only produce higher-quality results (e.g., 30% more reduction in depth error), but also run faster (e.g., 2 to 5x faster) than the previous SOTA spatial propagation network. We also evaluated our stereo matching algorithm on the Scene Flow and KITTI Stereo datasets, and rank 1st on both the KITTI Stereo 2012 and 2015 benchmarks, which demonstrates the effectiveness of the proposed module. The code of CSPN proposed in this work will be released at https://github.com/XinJCheng/CSPN.
Tasks	Depth Completion, Depth Estimation, Stereo Matching, Stereo Matching Hand
Published	2018-10-04
URL	https://arxiv.org/abs/1810.02695v3
PDF	https://arxiv.org/pdf/1810.02695v3.pdf
PWC	https://paperswithcode.com/paper/learning-depth-with-convolutional-spatial
Repo	https://github.com/XinJCheng/CSPN
Framework	pytorch
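
The CSPN abstract above describes a linear propagation model applied recurrently, with per-pixel affinities and an optional sparse depth input. The following is a minimal 1-D sketch of that idea, not the authors' implementation. The affinities are fixed numbers standing in for the output of a CNN, and the function names, the normalization (neighbor weights scaled so their absolute values sum to one) and the step that re-imposes sparse measurements after each iteration are all assumptions made for illustration.

```python
def cspn_step_1d(current, initial, affinity):
    """One propagation step on a 1-D depth profile (toy stand-in for the 2-D case)."""
    n = len(current)
    updated = []
    for i in range(n):
        left_w, right_w = affinity[i]
        # Assumed normalization: neighbor weights are scaled so |left| + |right| = 1.
        norm = abs(left_w) + abs(right_w)
        if norm > 0:
            left_w, right_w = left_w / norm, right_w / norm
        # The center weight takes whatever mass the neighbors leave over.
        center_w = 1.0 - (left_w + right_w)
        # Replicate the border value where a neighbor is missing.
        left = current[i - 1] if i > 0 else current[i]
        right = current[i + 1] if i < n - 1 else current[i]
        updated.append(center_w * initial[i] + left_w * left + right_w * right)
    return updated


def refine_depth(coarse, affinity, sparse=None, steps=10):
    """Run several propagation steps, keeping known sparse depths fixed."""
    sparse = sparse or {}

    def impose(values):
        # Overwrite positions that have a sparse measurement.
        return [sparse.get(i, v) for i, v in enumerate(values)]

    initial = impose(coarse)
    current = list(initial)
    for _ in range(steps):
        current = impose(cspn_step_1d(current, initial, affinity))
    return current


# Hypothetical data: a coarse prediction with one outlier and one measured pixel.
coarse = [1.0, 1.2, 5.0, 1.1, 1.0]
affinity = [(1.0, 1.0)] * len(coarse)  # placeholder for learned affinities
sparse = {2: 2.0}
refined = refine_depth(coarse, affinity, sparse)
print(refined)
```

With non-negative affinities each step is a convex combination of neighboring values, so the refined profile stays within the range of its starting values while the measured pixel is preserved exactly.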

Hamiltonian Descent Methods


Title	Hamiltonian Descent Methods
Authors	Chris J. Maddison, Daniel Paulin, Yee Whye Teh, Brendan O’Donoghue, Arnaud Doucet
Abstract	We propose a family of optimization methods that achieve linear convergence using first-order gradient information and constant step sizes on a class of convex functions much larger than the smooth and strongly convex ones. This larger class includes functions whose second derivatives may be singular or unbounded at their minima. Our methods are discretizations of conformal Hamiltonian dynamics, which generalize the classical momentum method to model the motion of a particle with non-standard kinetic energy exposed to a dissipative force and the gradient field of the function of interest. They are first-order in the sense that they require only gradient computation. Yet, crucially the kinetic gradient map can be designed to incorporate information about the convex conjugate in a fashion that allows for linear convergence on convex functions that may be non-smooth or non-strongly convex. We study in detail one implicit and two explicit methods. For one explicit method, we provide conditions under which it converges to stationary points of non-convex functions. For all, we provide conditions on the convex function and kinetic energy pair that guarantee linear convergence, and show that these conditions can be satisfied by functions with power growth. In sum, these methods expand the class of convex functions on which linear convergence is possible with first-order computation.
Tasks
Published	2018-09-13
URL	http://arxiv.org/abs/1809.05042v1
PDF	http://arxiv.org/pdf/1809.05042v1.pdf
PWC	https://paperswithcode.com/paper/hamiltonian-descent-methods
Repo	https://github.com/takyamamoto/FirstExplicitMethod-HDM
Framework	none
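
As a rough illustration of the Hamiltonian descent abstract above, the sketch below discretizes damped Hamiltonian dynamics with a non-standard kinetic energy and compares it with plain gradient descent. It is a toy under stated assumptions and is not claimed to match any of the paper's three methods exactly. The test function f(x) = x^4/4, the kinetic energy k(p) = (3/4)|p|^(4/3), the update order, and the step size and damping values are all illustrative choices. The function was picked because its second derivative vanishes at the minimum, the kind of case the abstract mentions.

```python
import math


def grad_f(x):
    # f(x) = x**4 / 4; its second derivative 3 * x**2 vanishes at the minimum x = 0.
    return x ** 3


def grad_k(p):
    # k(p) = (3/4) * |p|**(4/3), so grad k(p) = sign(p) * |p|**(1/3).
    return math.copysign(abs(p) ** (1.0 / 3.0), p)


def hamiltonian_descent(x, p=0.0, step=0.1, damping=1.0, iters=2000):
    """Explicit scheme: damp and kick the momentum, then move along grad k."""
    delta = 1.0 / (1.0 + damping * step)  # damping treated implicitly
    for _ in range(iters):
        p = delta * (p - step * grad_f(x))
        x = x + step * grad_k(p)
    return x


def gradient_descent(x, step=0.1, iters=2000):
    """Baseline with the same constant step size."""
    for _ in range(iters):
        x = x - step * grad_f(x)
    return x


x_hd = hamiltonian_descent(1.0)
x_gd = gradient_descent(1.0)
print(abs(x_hd), abs(x_gd))
```

Choosing the kinetic energy with exponent 4/3, the conjugate of the exponent 4 in f, is what lets the momentum variant keep contracting at a constant rate here, whereas gradient descent slows down as the gradient flattens near zero.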

On the Complexity of Opinions and Online Discussions


Title	On the Complexity of Opinions and Online Discussions
Authors	Utkarsh Upadhyay, Abir De, Aasish Pappu, Manuel Gomez-Rodriguez
Abstract	In an increasingly polarized world, demagogues who reduce complexity down to simple arguments based on emotion are gaining in popularity. Are opinions and online discussions falling into demagoguery? In this work, we aim to provide computational tools to investigate this question and, by doing so, explore the nature and complexity of online discussions and their space of opinions, uncovering where each participant lies. More specifically, we present a modeling framework to construct latent representations of opinions in online discussions which are consistent with human judgements, as measured by online voting. If two opinions are close in the resulting latent space of opinions, it is because humans think they are similar. Our modeling framework is theoretically grounded and establishes a surprising connection between opinions and voting models and the sign-rank of a matrix. Moreover, it also provides a set of practical algorithms to both estimate the dimension of the latent space of opinions and infer where opinions expressed by the participants of an online discussion lie in this space. Experiments on a large dataset from Yahoo! News, Yahoo! Finance, Yahoo! Sports, and the Newsroom app suggest that unidimensional opinion models may often be unable to accurately represent online discussions, provide insights into human judgements and opinions, and show that our framework is able to circumvent language nuances such as sarcasm or humor by relying on human judgements instead of textual analysis.
Tasks
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06807v2
PDF	http://arxiv.org/pdf/1802.06807v2.pdf
PWC	https://paperswithcode.com/paper/on-the-complexity-of-opinions-and-online
Repo	https://github.com/Networks-Learning/discussion-complexity
Framework	none

Action-Agnostic Human Pose Forecasting


Title	Action-Agnostic Human Pose Forecasting
Authors	Hsu-kuang Chiu, Ehsan Adeli, Borui Wang, De-An Huang, Juan Carlos Niebles
Abstract	Predicting and forecasting human dynamics is a very interesting but challenging task with several prospective applications in robotics, health-care, etc. Recently, several methods have been developed for human pose forecasting; however, they often introduce a number of limitations in their settings. For instance, previous work focused on either short-term or long-term predictions, sacrificing the other. Furthermore, previous methods included the activity labels as part of the training process and required them at testing time. These limitations restrict the use of pose forecasting models in real-world applications, as often there are no activity-related annotations for testing scenarios. In this paper, we propose a new action-agnostic method for short- and long-term human pose forecasting. To this end, we propose a new recurrent neural network for modeling the hierarchical and multi-scale characteristics of the human dynamics, denoted by triangular-prism RNN (TP-RNN). Our model captures the latent hierarchical structure embedded in temporal human pose sequences by encoding the temporal dependencies with different time-scales. For evaluation, we run an extensive set of experiments on Human 3.6M and Penn Action datasets and show that our method outperforms baseline and state-of-the-art methods quantitatively and qualitatively. Code is available at https://github.com/eddyhkchiu/pose_forecast_wacv/
Tasks	Human Dynamics, Human Pose Forecasting
Published	2018-10-23
URL	http://arxiv.org/abs/1810.09676v1
PDF	http://arxiv.org/pdf/1810.09676v1.pdf
PWC	https://paperswithcode.com/paper/action-agnostic-human-pose-forecasting
Repo	https://github.com/eddyhkchiu/pose_forecast_wacv
Framework	tf

CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++


Title	CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++
Authors	Xiaolin Wang
Abstract	This paper presents an open-source reinforcement learning toolkit named CytonRL (https://github.com/arthurxlw/cytonRL). The toolkit implements four recent advanced deep Q-learning algorithms from scratch using C++ and NVIDIA’s GPU-accelerated libraries. The code is simple and elegant, owing to an open-source general-purpose neural network library named CytonLib. Benchmarks show that the toolkit achieves competitive performance on the popular Atari game of Breakout.
Tasks	Q-Learning
Published	2018-04-14
URL	http://arxiv.org/abs/1804.05834v1
PDF	http://arxiv.org/pdf/1804.05834v1.pdf
PWC	https://paperswithcode.com/paper/cytonrl-an-efficient-reinforcement-learning
Repo	https://github.com/arthurxlw/cytonRL
Framework	none

Two-Stage Convolutional Neural Network for Breast Cancer Histology Image Classification


Title	Two-Stage Convolutional Neural Network for Breast Cancer Histology Image Classification
Authors	Kamyar Nazeri, Azad Aminpour, Mehran Ebrahimi
Abstract	This paper explores the problem of breast tissue classification of microscopy images. Based on the predominant cancer type, the goal is to classify images into four categories: normal, benign, in situ carcinoma, and invasive carcinoma. Given a suitable training dataset, we utilize deep learning techniques to address the classification problem. Due to the large size of each image in the training dataset, we propose a patch-based technique which consists of two consecutive convolutional neural networks. The first “patch-wise” network acts as an auto-encoder that extracts the most salient features of image patches while the second “image-wise” network performs classification of the whole image. The first network is pre-trained and aimed at extracting local information while the second network obtains global information of an input image. We trained the networks using the ICIAR 2018 grand challenge on BreAst Cancer Histology (BACH) dataset. The proposed method yields 95% accuracy on the validation set, compared to the 77% accuracy previously reported in the literature. Our code is publicly available at https://github.com/ImagingLab/ICIAR2018
Tasks	Breast Cancer Histology Image Classification, Image Classification
Published	2018-03-11
URL	http://arxiv.org/abs/1803.04054v2
PDF	http://arxiv.org/pdf/1803.04054v2.pdf
PWC	https://paperswithcode.com/paper/two-stage-convolutional-neural-network-for
Repo	https://github.com/ImagingLab/ICIAR2018
Framework	pytorch

DialogueRNN: An Attentive RNN for Emotion Detection in Conversations


Title	DialogueRNN: An Attentive RNN for Emotion Detection in Conversations
Authors	Navonil Majumder, Soujanya Poria, Devamanyu Hazarika, Rada Mihalcea, Alexander Gelbukh, Erik Cambria
Abstract	Emotion detection in conversations is a necessary step for a number of applications, including opinion mining over chat history, social media threads, debates, argumentation mining, understanding consumer feedback in live conversations, etc. Currently, systems do not treat the parties in the conversation individually by adapting to the speaker of each utterance. In this paper, we describe a new method based on recurrent neural networks that keeps track of the individual party states throughout the conversation and uses this information for emotion classification. Our model outperforms the state of the art by a significant margin on two different datasets.
Tasks	Emotion Classification, Emotion Recognition in Conversation, Multimodal Emotion Recognition, Opinion Mining
Published	2018-11-01
URL	https://arxiv.org/abs/1811.00405v4
PDF	https://arxiv.org/pdf/1811.00405v4.pdf
PWC	https://paperswithcode.com/paper/dialoguernn-an-attentive-rnn-for-emotion
Repo	https://github.com/SenticNet/conv-emotion
Framework	pytorch

Beyond Markov Logic: Efficient Mining of Prediction Rules in Large Graphs


Title	Beyond Markov Logic: Efficient Mining of Prediction Rules in Large Graphs
Authors	Tommaso Soru, André Valdestilhas, Edgard Marx, Axel-Cyrille Ngonga Ngomo
Abstract	Graph representations of large knowledge bases may comprise billions of edges. Usually built upon human-generated ontologies, several knowledge bases do not feature declared ontological rules and are far from being complete. Current rule mining approaches rely on schemata or store the graph in-memory, which can be infeasible for large graphs. In this paper, we introduce HornConcerto, an algorithm to discover Horn clauses in large graphs without the need for a schema. Using a standard fact-based confidence score, we can mine closed Horn rules having an arbitrary body size. We show that our method can outperform existing approaches in terms of runtime and memory consumption and mine high-quality rules for the link prediction task, achieving state-of-the-art results on a widely-used benchmark. Moreover, we find that rules alone can perform inference significantly faster than embedding-based methods and achieve accuracies on link prediction comparable to resource-demanding approaches such as Markov Logic Networks.
Tasks	Link Prediction
Published	2018-02-10
URL	http://arxiv.org/abs/1802.03638v2
PDF	http://arxiv.org/pdf/1802.03638v2.pdf
PWC	https://paperswithcode.com/paper/beyond-markov-logic-efficient-mining-of
Repo	https://github.com/mommi84/horn-concerto
Framework	none
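
The HornConcerto abstract above refers to a standard fact-based confidence score without defining it. One common choice in rule mining, assumed here, is the fraction of body matches for which the head fact is also present. The sketch below computes it for the simplest rule shape, head(x, y) ⇐ body(x, y). The triples, predicate names and function name are made up for illustration and do not come from the paper or its repository.

```python
def rule_confidence(triples, body_predicate, head_predicate):
    """Confidence of head(x, y) <= body(x, y): supported body matches / all body matches."""
    body_pairs = {(s, o) for s, p, o in triples if p == body_predicate}
    head_pairs = {(s, o) for s, p, o in triples if p == head_predicate}
    if not body_pairs:
        return 0.0
    # Pairs where both the body fact and the head fact are present in the graph.
    support = len(body_pairs & head_pairs)
    return support / len(body_pairs)


# Hypothetical toy graph of (subject, predicate, object) triples.
triples = [
    ("alice", "worksAt", "acme"),
    ("alice", "affiliatedWith", "acme"),
    ("bob", "worksAt", "acme"),
    ("bob", "affiliatedWith", "acme"),
    ("carol", "worksAt", "initech"),
    ("dave", "worksAt", "initech"),
    ("dave", "affiliatedWith", "initech"),
]

confidence = rule_confidence(triples, "worksAt", "affiliatedWith")
print(confidence)  # 3 of the 4 worksAt pairs also have affiliatedWith
```

Rules with longer bodies follow the same pattern, with the set of body matches obtained by joining several predicates on their shared variables.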

The UEA multivariate time series classification archive, 2018


Title	The UEA multivariate time series classification archive, 2018
Authors	Anthony Bagnall, Hoang Anh Dau, Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, Eamonn Keogh
Abstract	In 2002, the UCR time series classification archive was first released with sixteen datasets. It gradually expanded, until 2015 when it increased in size from 45 datasets to 85 datasets. In October 2018 more datasets were added, bringing the total to 128. The new archive contains a wide range of problems, including variable length series, but it still only contains univariate time series classification problems. One of the motivations for introducing the archive was to encourage researchers to perform a more rigorous evaluation of newly proposed time series classification (TSC) algorithms. It has worked: most recent research into TSC uses all 85 datasets to evaluate algorithmic advances. Research into multivariate time series classification, where more than one series are associated with each class label, is in a position where univariate TSC research was a decade ago. Algorithms are evaluated using very few datasets and claims of improvement are not based on statistical comparisons. We aim to address this problem by forming the first iteration of the MTSC archive, to be hosted at the website www.timeseriesclassification.com. Like the univariate archive, this formulation was a collaborative effort between researchers at the University of East Anglia (UEA) and the University of California, Riverside (UCR). The 2018 vintage consists of 30 datasets with a wide range of cases, dimensions and series lengths. For this first iteration of the archive we format all data to be of equal length, include no series with missing data and provide train/test splits.
Tasks	Time Series, Time Series Classification
Published	2018-10-31
URL	http://arxiv.org/abs/1811.00075v1
PDF	http://arxiv.org/pdf/1811.00075v1.pdf
PWC	https://paperswithcode.com/paper/the-uea-multivariate-time-series
Repo	https://github.com/FlorentF9/DeepTemporalClustering
Framework	tf

Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset


Title	Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset
Authors	Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau
Abstract	One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill. While it is straightforward for humans to recognize and acknowledge others’ feelings in a conversation, this is a significant challenge for AI systems due to the paucity of suitable publicly-available datasets for training and evaluation. This work proposes a new benchmark for empathetic dialogue generation and EmpatheticDialogues, a novel dataset of 25k conversations grounded in emotional situations. Our experiments indicate that dialogue models that use our dataset are perceived to be more empathetic by human evaluators, compared to models merely trained on large-scale Internet conversation data. We also present empirical comparisons of dialogue model adaptations for empathetic responding, leveraging existing models or datasets without requiring lengthy re-training of the full model.
Tasks	Dialogue Generation
Published	2018-11-01
URL	https://arxiv.org/abs/1811.00207v5
PDF	https://arxiv.org/pdf/1811.00207v5.pdf
PWC	https://paperswithcode.com/paper/i-know-the-feeling-learning-to-converse-with
Repo	https://github.com/facebookresearch/EmpatheticDialogues
Framework	pytorch

Representer Point Selection for Explaining Deep Neural Networks


Title	Representer Point Selection for Explaining Deep Neural Networks
Authors	Chih-Kuan Yeh, Joon Sik Kim, Ian E. H. Yen, Pradeep Ravikumar
Abstract	We propose to explain the predictions of a deep neural network by pointing to the set of what we call representer points in the training set, for a given test point prediction. Specifically, we show that we can decompose the pre-activation prediction of a neural network into a linear combination of activations of training points, with the weights corresponding to what we call representer values, which thus capture the importance of each training point on the learned parameters of the network. This decomposition provides a deeper understanding of the network than training point influence alone: positive representer values correspond to excitatory training points and negative values to inhibitory points, which, as we show, provides considerably more insight. Our method is also much more scalable, allowing for real-time feedback in a manner not feasible with influence functions.
Tasks
Published	2018-11-23
URL	http://arxiv.org/abs/1811.09720v1
PDF	http://arxiv.org/pdf/1811.09720v1.pdf
PWC	https://paperswithcode.com/paper/representer-point-selection-for-explaining
Repo	https://github.com/chihkuanyeh/Representer_Point_Selection
Framework	tf
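
The decomposition described in the representer point abstract above can be checked by hand on a very small model. The sketch below uses a scalar ridge regression as a stand-in for the last layer of a network. It assumes an L2-regularized squared loss at its exact minimizer, and the data, variable names and closed-form fit are illustrative, not taken from the paper's code. Under that assumption each training point i receives a weight alpha_i = -(prediction_i - y_i) / (lam * n), and the prediction at a test point equals the sum of alpha_i * f_i * f_test.

```python
def fit_ridge_scalar(features, targets, lam):
    """Minimize (1/n) * sum((w*f - y)**2) + lam * w**2 in closed form."""
    n = len(features)
    numerator = sum(f * y for f, y in zip(features, targets))
    denominator = sum(f * f for f in features) + lam * n
    return numerator / denominator


def representer_values(features, targets, weight, lam):
    """alpha_i = -(w*f_i - y_i) / (lam * n), from the stationarity condition."""
    n = len(features)
    return [-(weight * f - y) / (lam * n) for f, y in zip(features, targets)]


# Hypothetical training data: one scalar feature per point.
features = [1.0, 2.0, -1.0, 0.5]
targets = [1.0, 2.5, -0.5, 0.0]
lam = 0.1

w = fit_ridge_scalar(features, targets, lam)
alphas = representer_values(features, targets, w, lam)

f_test = 1.5
direct = w * f_test
decomposed = sum(a * f * f_test for a, f in zip(alphas, features))
print(direct, decomposed)
```

In this toy, a term alpha_i * f_i * f_test that is positive pushes the test prediction up and a negative one pushes it down, which is one way to read the abstract's excitatory and inhibitory training points.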

City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions


Title	City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions
Authors	Rion Brattig Correia, Luciana P. de Araújo, Mauro M. Mattos, Luis M. Rocha
Abstract	The occurrence of drug-drug interactions (DDI) from multiple drug dispensations is a serious problem, both for individuals and health-care systems, since patients with complications due to DDI are likely to reenter the system at a costlier level. We present a large-scale longitudinal study (18 months) of the DDI phenomenon at the primary- and secondary-care level using electronic health records (EHR) from the city of Blumenau in Southern Brazil (population approximately 340,000). We found that 181 distinct drug pairs known to interact were dispensed concomitantly to 12% of the patients in the city’s public health-care system. Further, 4% of the patients were dispensed drug pairs that are likely to result in major adverse drug reactions (ADR), with costs estimated to be much larger than previously reported in smaller studies. The large-scale analysis reveals that women have a 60% increased risk of DDI as compared to men; the increase becomes 90% when considering only DDI known to lead to major ADR. Furthermore, DDI risk increases substantially with age; patients aged 70-79 years have a 34% risk of DDI when they are dispensed two or more drugs concomitantly. Interestingly, a statistical null model demonstrates that age- and female-specific risks from increased polypharmacy fail by far to explain the observed DDI risks in those populations, suggesting unknown social or biological causes. We also provide a network visualization of drugs and demographic factors that characterize the DDI phenomenon and demonstrate that accurate DDI prediction can be included in healthcare and public-health management, to reduce DDI-related ADR and costs.
Tasks
Published	2018-03-09
URL	https://arxiv.org/abs/1803.03571v4
PDF	https://arxiv.org/pdf/1803.03571v4.pdf
PWC	https://paperswithcode.com/paper/city-wide-analysis-of-electronic-health
Repo	https://github.com/rionbr/DDIBlumenau
Framework	none

Trace your sources in large-scale data: one ring to find them all


Title	Trace your sources in large-scale data: one ring to find them all
Authors	Alexander Böttcher, Wieland Brendel, Bernhard Englitz, Matthias Bethge
Abstract	An important preprocessing step in most data analysis pipelines aims to extract a small set of sources that explain most of the data. Currently used algorithms for blind source separation (BSS), however, often fail to extract the desired sources and need extensive cross-validation. In contrast, their rarely used probabilistic counterparts can get away with little cross-validation and are more accurate and reliable but no simple and scalable implementations are available. Here we present a novel probabilistic BSS framework (DECOMPOSE) that can be flexibly adjusted to the data, is extensible and easy to use, adapts to individual sources and handles large-scale data through algorithmic efficiency. DECOMPOSE encompasses and generalises many traditional BSS algorithms such as PCA, ICA and NMF and we demonstrate substantial improvements in accuracy and robustness on artificial and real data.
Tasks
Published	2018-03-23
URL	http://arxiv.org/abs/1803.08882v1
PDF	http://arxiv.org/pdf/1803.08882v1.pdf
PWC	https://paperswithcode.com/paper/trace-your-sources-in-large-scale-data-one
Repo	https://github.com/bethgelab/decompose
Framework	tf

Game-Based Video-Context Dialogue


Title	Game-Based Video-Context Dialogue
Authors	Ramakanth Pasunuru, Mohit Bansal
Abstract	Current dialogue systems focus more on textual and speech context knowledge and are usually based on two speakers. Some recent work has investigated static image-based dialogue. However, several real-world human interactions also involve dynamic visual context (similar to videos) as well as dialogue exchanges among multiple speakers. To move closer towards such multimodal conversational skills and visually-situated applications, we introduce a new video-context, many-speaker dialogue dataset based on live-broadcast soccer game videos and chats from Twitch.tv. This challenging testbed allows us to develop visually-grounded dialogue models that should generate relevant temporal and spatial event language from the live video, while also being relevant to the chat history. For strong baselines, we also present several discriminative and generative models, e.g., based on tridirectional attention flow (TriDAF). We evaluate these models via retrieval ranking-recall, automatic phrase-matching metrics, as well as human evaluation studies. We also present dataset analyses, model ablations, and visualizations to understand the contribution of different modalities and model components.
Tasks
Published	2018-09-12
URL	http://arxiv.org/abs/1809.04560v2
PDF	http://arxiv.org/pdf/1809.04560v2.pdf
PWC	https://paperswithcode.com/paper/game-based-video-context-dialogue
Repo	https://github.com/ramakanth-pasunuru/video-dialogue
Framework	tf