Paper Group AWR 252
Probabilistic PARAFAC2
Title | Probabilistic PARAFAC2 |
Authors | Philip J. H. Jørgensen, Søren F. V. Nielsen, Jesper L. Hinrich, Mikkel N. Schmidt, Kristoffer H. Madsen, Morten Mørup |
Abstract | The PARAFAC2 is a multimodal factor analysis model suitable for analyzing multi-way data when one of the modes has incomparable observation units, for example because of differences in signal sampling or batch sizes. A fully probabilistic treatment of the PARAFAC2 is desirable in order to improve robustness to noise and provide a well-founded principle for determining the number of factors, but challenging because the factor loadings are constrained to be orthogonal. We develop two probabilistic formulations of the PARAFAC2 along with variational procedures for inference: In one approach, the mean values of the factor loadings are orthogonal, leading to closed-form variational updates, and in the other, the factor loadings themselves are orthogonal using a matrix von Mises-Fisher distribution. We contrast our probabilistic formulation to the conventional direct fitting algorithm based on maximum likelihood. On simulated data and real fluorescence spectroscopy and gas chromatography-mass spectrometry data, we compare our approach to the conventional PARAFAC2 model estimation and find that the probabilistic formulation is more robust to noise and model order misspecification. The probabilistic PARAFAC2 thus forms a promising framework for modeling multi-way data while accounting for uncertainty. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08195v1 |
PDF | http://arxiv.org/pdf/1806.08195v1.pdf |
PWC | https://paperswithcode.com/paper/probabilistic-parafac2 |
Repo | https://github.com/philipjhj/VBParafac2 |
Framework | none |
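
As background for the abstract above, here is a compact statement of the standard (direct-fitting) PARAFAC2 model whose orthogonality constraint makes a fully probabilistic treatment challenging; the notation is the conventional one and is not copied from the paper.

```latex
X_k \approx A \, D_k \, (P_k B)^{\top},
\qquad P_k^{\top} P_k = I, \qquad k = 1, \dots, K,
```

where $A$ holds the loadings shared across the slabs $X_k$, $D_k = \mathrm{diag}(c_k)$ carries the slab-specific weights, $B$ is a common loading matrix, and the orthonormal $P_k$ give each slab its own rotated loadings $B_k = P_k B$.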
A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues
Title | A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues |
Authors | Zhouxing Shi, Minlie Huang |
Abstract | Discourse structures are beneficial for various NLP tasks such as dialogue understanding, question answering, sentiment analysis, and so on. This paper presents a deep sequential model for parsing discourse dependency structures of multi-party dialogues. The proposed model aims to construct a discourse dependency tree by predicting dependency relations and constructing the discourse structure jointly and alternately. It makes a sequential scan of the Elementary Discourse Units (EDUs) in a dialogue. For each EDU, the model decides to which previous EDU the current one should link and what the corresponding relation type is. The predicted link and relation type are then used to build the discourse structure incrementally with a structured encoder. During link prediction and relation classification, the model utilizes not only local information that represents the concerned EDUs, but also global information that encodes the EDU sequence and the discourse structure that is already built at the current step. Experiments show that the proposed model outperforms all the state-of-the-art baselines. |
Tasks | Dialogue Understanding, Link Prediction, Question Answering, Relation Classification, Sentiment Analysis |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00176v1 |
PDF | http://arxiv.org/pdf/1812.00176v1.pdf |
PWC | https://paperswithcode.com/paper/a-deep-sequential-model-for-discourse-parsing |
Repo | https://github.com/shizhouxing/DialogueDiscourseParsing |
Framework | tf |
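
A toy sketch of the sequential scan the abstract describes: for each EDU, score candidate parents among the earlier EDUs, pick a link and a relation type, then fold the decision back into the incrementally built structure. The scoring functions here are random placeholders, not the authors' encoders.

```python
import random

EDUS = ["A: any plans tonight?", "B: thinking of a movie.",
        "A: which one?", "B: the new sci-fi film."]
RELATIONS = ["QAP", "Continuation", "Clarification_question", "Acknowledgement"]

def score_link(structure, i, j):
    # Placeholder for the link scorer, which in the paper combines local EDU
    # representations with a global encoding of the structure built so far.
    return random.random()

def score_relation(structure, i, j):
    # Placeholder relation classifier over the chosen link (i <- j).
    return max(RELATIONS, key=lambda r: random.random())

structure = []  # list of (parent, child, relation) edges, grown incrementally
for j in range(1, len(EDUS)):
    # Decide which previous EDU the current one links to ...
    parent = max(range(j), key=lambda i: score_link(structure, i, j))
    # ... and what the relation type is, then extend the structure.
    structure.append((parent, j, score_relation(structure, parent, j)))

print(structure)
```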
Adaptive Deep Learning through Visual Domain Localization
Title | Adaptive Deep Learning through Visual Domain Localization |
Authors | Gabriele Angeletti, Barbara Caputo, Tatiana Tommasi |
Abstract | A commercial robot, trained by its manufacturer to recognize a predefined number and type of objects, might be used in many settings that will in general differ in their illumination conditions, background, type and degree of clutter, and so on. Recent computer vision works tackle this generalization issue through domain adaptation methods, taking as source the visual domain where the system is trained and as target the domain of deployment. All these approaches assume access to images from all classes of the target during training, an unrealistic condition in robotics applications. We address this issue by proposing an algorithm that takes into account the specific needs of robot vision. Our intuition is that the nature of the domain shift experienced in robotics is mostly local. We exploit this through the learning of maps that spatially ground the domain and quantify the degree of shift, embedded into an end-to-end deep domain adaptation architecture. By explicitly localizing the roots of the domain shift we significantly reduce the number of parameters of the architecture to tune, we gain the flexibility necessary to deal with a subset of categories in the target domain at training time, and we provide clear feedback on the rationale behind any classification decision, which can be exploited in human-robot interactions. Experiments on two different settings of the iCub World database confirm the suitability of our method for robot vision. |
Tasks | Domain Adaptation |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08833v1 |
PDF | http://arxiv.org/pdf/1802.08833v1.pdf |
PWC | https://paperswithcode.com/paper/adaptive-deep-learning-through-visual-domain |
Repo | https://github.com/blackecho/LoAd-Network |
Framework | pytorch |
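
To make the "spatially ground the domain shift" idea concrete, here is an illustrative PyTorch module: a small head predicts a per-location map quantifying local shift and reweights the feature tensor with it. This is a conceptual sketch only, not the authors' LoAd architecture.

```python
import torch
import torch.nn as nn

class DomainLocalizationHead(nn.Module):
    """Predicts a per-location map of local domain shift and uses it to
    reweight convolutional features (illustrative, not the LoAd network)."""
    def __init__(self, channels):
        super().__init__()
        self.map_head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):                             # feats: (N, C, H, W)
        shift_map = torch.sigmoid(self.map_head(feats))   # (N, 1, H, W) in [0, 1]
        return feats * shift_map, shift_map               # reweighted features + map

feats = torch.randn(2, 256, 14, 14)
reweighted, shift_map = DomainLocalizationHead(256)(feats)
print(reweighted.shape, shift_map.shape)
```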
GAN Lab: Understanding Complex Deep Generative Models using Interactive Visual Experimentation
Title | GAN Lab: Understanding Complex Deep Generative Models using Interactive Visual Experimentation |
Authors | Minsuk Kahng, Nikhil Thorat, Duen Horng Chau, Fernanda Viégas, Martin Wattenberg |
Abstract | Recent success in deep learning has generated immense interest among practitioners and students, inspiring many to learn about this new technology. While visual and interactive approaches have been successfully developed to help people more easily learn deep learning, most existing tools focus on simpler models. In this work, we present GAN Lab, the first interactive visualization tool designed for non-experts to learn and experiment with Generative Adversarial Networks (GANs), a popular class of complex deep learning models. With GAN Lab, users can interactively train generative models and visualize the dynamic training process’s intermediate results. GAN Lab tightly integrates a model overview graph that summarizes GAN’s structure, and a layered distributions view that helps users interpret the interplay between submodels. GAN Lab introduces new interactive experimentation features for learning complex deep learning models, such as step-by-step training at multiple levels of abstraction for understanding intricate training dynamics. Implemented using TensorFlow.js, GAN Lab is accessible to anyone via modern web browsers, without the need for installation or specialized hardware, overcoming a major practical challenge in deploying interactive tools for deep learning. |
Tasks | |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01587v1 |
PDF | http://arxiv.org/pdf/1809.01587v1.pdf |
PWC | https://paperswithcode.com/paper/gan-lab-understanding-complex-deep-generative |
Repo | https://github.com/poloclub/ganlab |
Framework | tf |
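
GAN Lab itself is a TensorFlow.js web tool, but the model it visualizes is a small GAN over 2D points. The Python sketch below trains that kind of toy GAN, which is roughly what the tool's step-by-step generator/discriminator views expose; the architecture and data here are illustrative, not GAN Lab's own.

```python
import math
import torch
import torch.nn as nn

# Toy 2D data distribution (a noisy ring), similar in spirit to GAN Lab's 2D datasets.
def real_batch(n=256):
    angle = torch.rand(n) * 2 * math.pi
    return torch.stack([angle.cos(), angle.sin()], dim=1) + 0.05 * torch.randn(n, 2)

G = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))   # generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    # Discriminator step: real vs. generated samples.
    x, z = real_batch(), torch.randn(256, 2)
    loss_d = bce(D(x), torch.ones(256, 1)) + bce(D(G(z).detach()), torch.zeros(256, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to fool the discriminator.
    z = torch.randn(256, 2)
    loss_g = bce(D(G(z)), torch.ones(256, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```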
Pedestrian Detection with Autoregressive Network Phases
Title | Pedestrian Detection with Autoregressive Network Phases |
Authors | Garrick Brazil, Xiaoming Liu |
Abstract | We present an autoregressive pedestrian detection framework with cascaded phases designed to progressively improve precision. The proposed framework utilizes a novel lightweight stackable decoder-encoder module which uses convolutional re-sampling layers to improve features while maintaining efficient memory and runtime cost. Unlike previous cascaded detection systems, our proposed framework is designed within a region proposal network and thus retains greater context of nearby detections compared to independently processed RoI systems. We explicitly encourage increasing levels of precision by assigning strict labeling policies to each consecutive phase such that early phases develop features primarily focused on achieving high recall and later phases on accurate precision. In consequence, the final feature maps form more peaky radial gradients emanating from the centroids of unique pedestrians. Using our proposed autoregressive framework leads to new state-of-the-art performance on the reasonable and occlusion settings of the Caltech pedestrian dataset, and achieves competitive state-of-the-art performance on the KITTI dataset. |
Tasks | Pedestrian Detection |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00440v1 |
PDF | http://arxiv.org/pdf/1812.00440v1.pdf |
PWC | https://paperswithcode.com/paper/pedestrian-detection-with-autoregressive |
Repo | https://github.com/garrickbrazil/AR-Ped |
Framework | none |
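
A small numpy sketch of the "progressively stricter labeling policy" idea: the same anchors are relabeled in each phase with a higher IoU threshold, so early phases are pushed toward recall and later phases toward precision. The thresholds are illustrative values, not the paper's.

```python
import numpy as np

def iou(box, gt):
    x1, y1 = np.maximum(box[:2], gt[:2])
    x2, y2 = np.minimum(box[2:], gt[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box) + area(gt) - inter)

anchors = np.array([[10, 10, 50, 90], [12, 8, 48, 88], [60, 10, 100, 90]], float)
gt = np.array([10, 10, 50, 90], float)

# Each consecutive phase uses a stricter positive threshold (illustrative values).
for phase, thr in enumerate([0.4, 0.6, 0.8], start=1):
    labels = [1 if iou(a, gt) >= thr else 0 for a in anchors]
    print(f"phase {phase} (IoU >= {thr}):", labels)
```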
Joint Action Unit localisation and intensity estimation through heatmap regression
Title | Joint Action Unit localisation and intensity estimation through heatmap regression |
Authors | Enrique Sanchez-Lozano, Georgios Tzimiropoulos, Michel Valstar |
Abstract | This paper proposes a supervised learning approach to jointly perform facial Action Unit (AU) localisation and intensity estimation. Contrary to previous works that try to learn an unsupervised representation of the Action Unit regions, we propose to directly and jointly estimate all AU intensities through heatmap regression, along with the location in the face where they cause visible changes. Our approach aims to learn a pixel-wise regression function returning a score per AU, which indicates an AU intensity at a given spatial location. Heatmap regression then generates an image, or channel, per AU, in which each pixel indicates the corresponding AU intensity. To generate the ground-truth heatmaps for a target AU, the facial landmarks are first estimated, and a 2D Gaussian is drawn around the points where the AU is known to cause changes. The amplitude and size of the Gaussian are determined by the intensity of the AU. We show that using a single Hourglass network suffices to attain new state-of-the-art results, demonstrating the effectiveness of such a simple approach. The use of heatmap regression allows learning of a shared representation between AUs without the need to rely on latent representations, as these are implicitly learned from the data. We validate the proposed approach on the BP4D dataset, showing a modest improvement over recent, complex techniques, as well as robustness against misalignment errors. Code for testing and models will be available to download from https://github.com/ESanchezLozano/Action-Units-Heatmaps. |
Tasks | |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03487v2 |
PDF | http://arxiv.org/pdf/1805.03487v2.pdf |
PWC | https://paperswithcode.com/paper/joint-action-unit-localisation-and-intensity |
Repo | https://github.com/ESanchezLozano/Action-Units-Heatmaps |
Framework | pytorch |
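
A numpy sketch of the ground-truth heatmap construction the abstract describes: a 2D Gaussian is placed at an estimated landmark location, with amplitude and spread scaled by the AU intensity. The exact scaling used in the paper may differ; the values below are illustrative.

```python
import numpy as np

def au_heatmap(height, width, center, intensity, base_sigma=3.0):
    """2D Gaussian centred on a facial landmark; amplitude and size grow with
    the AU intensity (0..5), mirroring the regression targets described above."""
    y, x = np.mgrid[0:height, 0:width]
    cy, cx = center
    sigma = base_sigma * max(intensity, 1e-3)   # size scales with intensity
    g = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
    return intensity * g                        # amplitude scales with intensity

heatmap = au_heatmap(64, 64, center=(32, 20), intensity=3.0)
print(heatmap.shape, heatmap.max())             # (64, 64), peak equal to the AU intensity
```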
Disparity Sliding Window: Object Proposals From Disparity Images
Title | Disparity Sliding Window: Object Proposals From Disparity Images |
Authors | Julian Müller, Andreas Fregin, Klaus Dietmayer |
Abstract | Sliding window approaches have been widely used for object recognition tasks in recent years. They guarantee an investigation of the entire input image for the object to be detected and allow a localization of that object. Despite the current trend towards deep neural networks, sliding window methods are still used in combination with convolutional neural networks. The risk of overlooking an object is clearly reduced compared to alternative detection approaches which detect objects based on shape, edges or color. Nevertheless, the sliding window technique strongly increases the computational effort as the classifier has to verify a large number of object candidates. This paper proposes a sliding window approach which also uses depth information from a stereo camera. This leads to a greatly decreased number of object candidates without significantly reducing the detection accuracy. A theoretical investigation of the conventional sliding window approach is presented first. Other publications to date have only mentioned rough estimations of the computational cost. A mathematical derivation clarifies the number of object candidates with respect to parameters such as image and object size. Subsequently, the proposed disparity sliding window approach is presented in detail. The approach is evaluated on pedestrian detection with annotations and images from the KITTI object detection benchmark. Furthermore, a comparison with two state-of-the-art methods is made. Code is available in C++ and Python at https://github.com/julimueller/disparity-sliding-window. |
Tasks | Object Detection, Object Recognition, Pedestrian Detection |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06830v2 |
PDF | http://arxiv.org/pdf/1805.06830v2.pdf |
PWC | https://paperswithcode.com/paper/disparity-sliding-window-object-proposals |
Repo | https://github.com/julimueller/disparity-sliding-window |
Framework | none |
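
The key geometric step can be summarised in a few lines: from the disparity at a pixel and the stereo calibration one gets depth, and from depth and an assumed real-world pedestrian size one gets the expected box size in pixels, so only a handful of candidate windows per pixel need to be classified. The calibration and size values below are assumptions for illustration, not taken from the paper.

```python
def proposal_from_disparity(u, v, d, focal_px, baseline_m,
                            ped_height_m=1.75, ped_width_m=0.6):
    """Return an (x1, y1, x2, y2) proposal whose size matches a pedestrian at the
    depth implied by disparity d (stereo geometry: depth Z = f * B / d)."""
    if d <= 0:
        return None
    depth = focal_px * baseline_m / d          # metres
    h_px = focal_px * ped_height_m / depth     # expected pixel height
    w_px = focal_px * ped_width_m / depth      # expected pixel width
    return (u - w_px / 2, v - h_px / 2, u + w_px / 2, v + h_px / 2)

# Hypothetical KITTI-like calibration: ~721 px focal length, 0.54 m baseline.
print(proposal_from_disparity(u=600, v=180, d=30.0, focal_px=721.0, baseline_m=0.54))
```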
Efficient parametrization of multi-domain deep neural networks
Title | Efficient parametrization of multi-domain deep neural networks |
Authors | Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi |
Abstract | A practical limitation of deep neural networks is their high degree of specialization to a single task and visual domain. Recently, inspired by the successes of transfer learning, several authors have proposed to learn instead universal, fixed feature extractors that, used as the first stage of any deep network, work well for several tasks and domains simultaneously. Nevertheless, such universal features are still somewhat inferior to specialized networks. To overcome this limitation, in this paper we propose to consider instead universal parametric families of neural networks, which still contain specialized problem-specific models, but differ only by a small number of parameters. We study different designs for such parametrizations, including series and parallel residual adapters, joint adapter compression, and parameter allocations, and empirically identify the ones that yield the highest compression. We show that, in order to maximize performance, it is necessary to adapt both shallow and deep layers of a deep network, but the required changes are very small. We also show that these universal parametrizations are very effective for transfer learning, where they outperform traditional fine-tuning techniques. |
Tasks | Transfer Learning |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10082v1 |
PDF | http://arxiv.org/pdf/1803.10082v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-parametrization-of-multi-domain |
Repo | https://github.com/srebuffi/residual_adapters |
Framework | pytorch |
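
A minimal PyTorch sketch of a series residual adapter: a 1x1 convolution with a skip connection around it, inserted after a shared (frozen) convolution so that only the tiny adapter and its batch norm are learned per domain. This shows the general recipe, not the paper's exact configuration or initialization.

```python
import torch
import torch.nn as nn

class SeriesResidualAdapter(nn.Module):
    """y = x + BN(Conv1x1(x)); a small per-domain module added in series
    after a shared convolutional layer."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        nn.init.zeros_(self.conv.weight)   # start close to the identity mapping

    def forward(self, x):
        return x + self.bn(self.conv(x))

shared_conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)   # pretrained, kept frozen
for p in shared_conv.parameters():
    p.requires_grad = False

adapter = SeriesResidualAdapter(64)                          # domain-specific, trained
y = adapter(shared_conv(torch.randn(1, 64, 32, 32)))
print(y.shape)
```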
Weakly-Supervised Neural Text Classification
Title | Weakly-Supervised Neural Text Classification |
Authors | Yu Meng, Jiaming Shen, Chao Zhang, Jiawei Han |
Abstract | Deep neural networks are gaining increasing popularity for the classic text classification task, due to their strong expressive power and reduced need for feature engineering. Despite such attractiveness, neural text classification models suffer from the lack of training data in many real-world applications. Although many semi-supervised and weakly-supervised text classification models exist, they cannot be easily applied to deep neural models and support only limited types of supervision. In this paper, we propose a weakly-supervised method that addresses the lack of training data in neural text classification. Our method consists of two modules: (1) a pseudo-document generator that leverages seed information to generate pseudo-labeled documents for model pre-training, and (2) a self-training module that bootstraps on real unlabeled data for model refinement. Our method has the flexibility to handle different types of weak supervision and can be easily integrated into existing deep neural models for text classification. We have performed extensive experiments on three real-world datasets from different domains. The results demonstrate that our proposed method achieves strong performance without requiring excessive training data and outperforms baseline methods significantly. |
Tasks | Feature Engineering, Text Classification |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.01478v2 |
PDF | http://arxiv.org/pdf/1809.01478v2.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-neural-text-classification |
Repo | https://github.com/yumeng5/WeSTClass |
Framework | none |
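
A schematic version of the two modules: pre-train on generated pseudo-labeled documents, then self-train by refitting on the model's own predictions over unlabeled documents. The classifier, the "pseudo-documents", and the plain hard-label self-training loop below are stand-ins for the paper's neural generator and refinement procedure.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Module 1 (stand-in): "pseudo-documents" built from seed keywords per class.
pseudo_docs = ["goal match striker league", "election senate vote policy"]
pseudo_labels = [0, 1]                       # 0 = sports, 1 = politics
unlabeled_docs = ["the striker scored twice", "the senate passed the bill",
                  "a late goal won the match", "voters backed the new policy"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(pseudo_docs, pseudo_labels)        # pre-training on pseudo-labeled docs

# Module 2: self-training, bootstrapping on the model's own predictions.
for _ in range(3):
    hard = model.predict(unlabeled_docs)
    model.fit(pseudo_docs + unlabeled_docs, pseudo_labels + hard.tolist())

print(model.predict(["the vote on the policy"]))
```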
A Tutorial on Bayesian Optimization
Title | A Tutorial on Bayesian Optimization |
Authors | Peter I. Frazier |
Abstract | Bayesian optimization is an approach to optimizing objective functions that take a long time (minutes or hours) to evaluate. It is best-suited for optimization over continuous domains of less than 20 dimensions, and tolerates stochastic noise in function evaluations. It builds a surrogate for the objective and quantifies the uncertainty in that surrogate using a Bayesian machine learning technique, Gaussian process regression, and then uses an acquisition function defined from this surrogate to decide where to sample. In this tutorial, we describe how Bayesian optimization works, including Gaussian process regression and three common acquisition functions: expected improvement, entropy search, and knowledge gradient. We then discuss more advanced techniques, including running multiple function evaluations in parallel, multi-fidelity and multi-information source optimization, expensive-to-evaluate constraints, random environmental conditions, multi-task Bayesian optimization, and the inclusion of derivative information. We conclude with a discussion of Bayesian optimization software and future research directions in the field. Within our tutorial material we provide a generalization of expected improvement to noisy evaluations, beyond the noise-free setting where it is more commonly applied. This generalization is justified by a formal decision-theoretic argument, standing in contrast to previous ad hoc modifications. |
Tasks | Hyperparameter Optimization |
Published | 2018-07-08 |
URL | http://arxiv.org/abs/1807.02811v1 |
PDF | http://arxiv.org/pdf/1807.02811v1.pdf |
PWC | https://paperswithcode.com/paper/a-tutorial-on-bayesian-optimization |
Repo | https://github.com/wujian16/Cornell-MOE |
Framework | none |
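
Expected improvement, the first of the three acquisition functions named in the abstract, has a simple closed form under a Gaussian process posterior; the snippet below implements it for minimization. (The tutorial's generalization to noisy evaluations is not reproduced here.)

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_f):
    """Closed-form EI for minimisation, given the GP posterior mean mu and
    standard deviation sigma at candidate points and the best observed value."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    improvement = best_f - mu
    z = np.divide(improvement, sigma, out=np.zeros_like(mu), where=sigma > 0)
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    # With zero predictive variance, EI reduces to the (non-negative) improvement.
    return np.where(sigma > 0, ei, np.maximum(improvement, 0.0))

# A candidate with lower predicted mean and some uncertainty scores highest.
print(expected_improvement(mu=[0.9, 0.4, 0.7], sigma=[0.1, 0.2, 0.0], best_f=0.8))
```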
BigDL: A Distributed Deep Learning Framework for Big Data
Title | BigDL: A Distributed Deep Learning Framework for Big Data |
Authors | Jason Dai, Yiheng Wang, Xin Qiu, Ding Ding, Yao Zhang, Yanzhang Wang, Xianyan Jia, Cherry Zhang, Yan Wan, Zhichao Li, Jiao Wang, Shengsheng Huang, Zhongyuan Wu, Yang Wang, Yuhao Yang, Bowen She, Dongjie Shi, Qi Lu, Kai Huang, Guoqiong Song |
Abstract | This paper presents BigDL (a distributed deep learning framework for Apache Spark), which has been used by a variety of users in the industry for building deep learning applications on production big data platforms. It allows deep learning applications to run on the Apache Hadoop/Spark cluster so as to directly process the production data, and as a part of the end-to-end data analysis pipeline for deployment and management. Unlike existing deep learning frameworks, BigDL implements distributed, data-parallel training directly on top of the functional compute model (with copy-on-write and coarse-grained operations) of Spark. We also share real-world experience and “war stories” of users that have adopted BigDL to address their challenges (i.e., how to easily build end-to-end data analysis and deep learning pipelines for their production data). |
Tasks | Fraud Detection, Object Detection |
Published | 2018-04-16 |
URL | https://arxiv.org/abs/1804.05839v4 |
PDF | https://arxiv.org/pdf/1804.05839v4.pdf |
PWC | https://paperswithcode.com/paper/bigdl-a-distributed-deep-learning-framework |
Repo | https://github.com/intel-analytics/BigDL |
Framework | torch |
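
BigDL's training runs inside Spark, but the core pattern the abstract refers to, synchronous data-parallel training with gradient aggregation across partitions, can be illustrated in plain numpy. This is a conceptual toy on a linear model, not BigDL's API or its Spark-based aggregation.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.normal(size=1000)
w = np.zeros(5)                                    # shared model parameters

partitions = np.array_split(np.arange(1000), 4)    # stand-in for Spark partitions
for step in range(100):
    grads = []
    for idx in partitions:                         # each "worker" computes a local gradient
        err = X[idx] @ w - y[idx]
        grads.append(X[idx].T @ err / len(idx))
    w -= 0.1 * np.mean(grads, axis=0)              # aggregate, then one synchronous update

print(np.round(w, 3))
```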
Learning Named Entity Tagger using Domain-Specific Dictionary
Title | Learning Named Entity Tagger using Domain-Specific Dictionary |
Authors | Jingbo Shang, Liyuan Liu, Xiang Ren, Xiaotao Gu, Teng Ren, Jiawei Han |
Abstract | Recent advances in deep neural models allow us to build reliable named entity recognition (NER) systems without handcrafting features. However, such methods require large amounts of manually-labeled training data. There have been efforts on replacing human annotations with distant supervision (in conjunction with external dictionaries), but the generated noisy labels pose significant challenges on learning effective neural models. Here we propose two neural models to suit noisy distant supervision from the dictionary. First, under the traditional sequence labeling framework, we propose a revised fuzzy CRF layer to handle tokens with multiple possible labels. After identifying the nature of noisy labels in distant supervision, we go beyond the traditional framework and propose a novel, more effective neural model AutoNER with a new Tie or Break scheme. In addition, we discuss how to refine distant supervision for better NER performance. Extensive experiments on three benchmark datasets demonstrate that AutoNER achieves the best performance when only using dictionaries with no additional human effort, and delivers competitive results with state-of-the-art supervised benchmarks. |
Tasks | Named Entity Recognition |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03599v1 |
PDF | http://arxiv.org/pdf/1809.03599v1.pdf |
PWC | https://paperswithcode.com/paper/learning-named-entity-tagger-using-domain |
Repo | https://github.com/shangjingbo1226/AutoNER |
Framework | pytorch |
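
One way to picture the "Tie or Break" scheme: instead of labeling tokens, the model labels the connection between each pair of adjacent tokens as Tie (both belong to the same entity mention) or Break (they do not), which is more tolerant of noisy dictionary boundaries. The encoding below is a simplified reading of that idea, not the AutoNER code, and it omits the additional "Unknown" label used for uncertain high-quality phrases.

```python
def tie_or_break(tokens, entity_spans):
    """Label each gap between adjacent tokens: 'Tie' if both tokens fall inside
    the same dictionary-matched span, otherwise 'Break' (simplified)."""
    labels = []
    for i in range(len(tokens) - 1):
        same = any(s <= i and i + 1 < e for s, e in entity_spans)
        labels.append("Tie" if same else "Break")
    return labels

tokens = ["prostate", "cancer", "is", "treated", "with", "radiation", "therapy"]
spans = [(0, 2), (5, 7)]        # half-open dictionary matches: tokens [0,2) and [5,7)
print(list(zip(zip(tokens, tokens[1:]), tie_or_break(tokens, spans))))
```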
Unbiased Implicit Variational Inference
Title | Unbiased Implicit Variational Inference |
Authors | Michalis K. Titsias, Francisco J. R. Ruiz |
Abstract | We develop unbiased implicit variational inference (UIVI), a method that expands the applicability of variational inference by defining an expressive variational family. UIVI considers an implicit variational distribution obtained in a hierarchical manner using a simple reparameterizable distribution whose variational parameters are defined by arbitrarily flexible deep neural networks. Unlike previous works, UIVI directly optimizes the evidence lower bound (ELBO) rather than an approximation to the ELBO. We demonstrate UIVI on several models, including Bayesian multinomial logistic regression and variational autoencoders, and show that UIVI achieves both tighter ELBO and better predictive performance than existing approaches at a similar computational cost. |
Tasks | |
Published | 2018-08-06 |
URL | http://arxiv.org/abs/1808.02078v3 |
PDF | http://arxiv.org/pdf/1808.02078v3.pdf |
PWC | https://paperswithcode.com/paper/unbiased-implicit-variational-inference |
Repo | https://github.com/franrruiz/uivi |
Framework | none |
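
The hierarchical construction described in the abstract can be written compactly (notation mine, with a Gaussian conditional as one concrete choice of the "simple reparameterizable distribution"):

```latex
\varepsilon \sim q(\varepsilon),
\qquad
z \mid \varepsilon \sim \mathcal{N}\big(z;\, \mu_\theta(\varepsilon),\, \Sigma_\theta(\varepsilon)\big),
\qquad
q_\theta(z) = \int \mathcal{N}\big(z;\, \mu_\theta(\varepsilon),\, \Sigma_\theta(\varepsilon)\big)\, q(\varepsilon)\, d\varepsilon .
```

The marginal $q_\theta(z)$ has no closed form (it is implicit), yet UIVI targets the exact ELBO $\mathbb{E}_{q_\theta(z)}[\log p(x, z) - \log q_\theta(z)]$ rather than a surrogate bound on it.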
Graph-based Selective Outlier Ensembles
Title | Graph-based Selective Outlier Ensembles |
Authors | Hamed Sarvari, Carlotta Domeniconi, Giovanni Stilo |
Abstract | An ensemble technique is characterized by the mechanism that generates the components and by the mechanism that combines them. A common way to achieve the consensus is to enable each component to equally participate in the aggregation process. A problem with this approach is that poor components are likely to negatively affect the quality of the consensus result. To address this issue, alternatives have been explored in the literature to build selective classifier and cluster ensembles, where only a subset of the components contributes to the computation of the consensus. Of the family of ensemble methods, outlier ensembles are the least studied. Only recently, the selection problem for outlier ensembles has been discussed. In this work we define a new graph-based class of ranking selection methods. A method in this class is characterized by two main steps: (1) Mapping the rankings onto a graph structure; and (2) Mining the resulting graph to identify a subset of rankings. We define a specific instance of the graph-based ranking selection class. Specifically, we map the problem of selecting ensemble components onto a mining problem in a graph. An extensive evaluation was conducted on a variety of heterogeneous data and methods. Our empirical results show that our approach outperforms state-of-the-art selective outlier ensemble techniques. |
Tasks | Outlier Ensembles |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06378v1 |
PDF | http://arxiv.org/pdf/1804.06378v1.pdf |
PWC | https://paperswithcode.com/paper/graph-based-selective-outlier-ensembles |
Repo | https://github.com/HamedSarvari/Graph-Based-Selective-Outlier-Ensembles |
Framework | none |
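
A heavily simplified illustration of the two steps named in the abstract: (1) map each component's outlier ranking to a node and connect nodes whose rankings agree, here by Spearman correlation above a threshold; (2) mine the graph to pick a subset, here simply the well-connected nodes. Both of these concrete choices are stand-ins; the paper defines its own instantiation of the graph construction and mining.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
true_scores = rng.random(50)
# Five ensemble components: three roughly agree with the signal, two are noise.
components = [true_scores + 0.1 * rng.random(50) for _ in range(3)]
components += [rng.random(50) for _ in range(2)]

# Step 1: map the rankings onto a graph (edge = strong rank agreement).
n = len(components)
adj = np.zeros((n, n), dtype=bool)
for i in range(n):
    for j in range(i + 1, n):
        rho, _ = spearmanr(components[i], components[j])
        adj[i, j] = adj[j, i] = rho > 0.5

# Step 2: mine the graph -- here, keep components with at least one strong neighbour.
selected = [i for i in range(n) if adj[i].sum() >= 1]
consensus = np.mean([components[i] for i in selected], axis=0)
print("selected components:", selected)
```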
Benchmark Analysis of Representative Deep Neural Network Architectures
Title | Benchmark Analysis of Representative Deep Neural Network Architectures |
Authors | Simone Bianco, Remi Cadene, Luigi Celona, Paolo Napoletano |
Abstract | This work presents an in-depth analysis of the majority of the deep neural networks (DNNs) proposed in the state of the art for image recognition. For each DNN, multiple performance indices are observed, such as recognition accuracy, model complexity, computational complexity, memory usage, and inference time. The behavior of such performance indices and some combinations of them are analyzed and discussed. To measure the indices, we run the DNNs on two different computer architectures, a workstation equipped with an NVIDIA Titan X Pascal and an embedded system based on an NVIDIA Jetson TX1 board. This experimentation allows a direct comparison between DNNs running on machines with very different computational capacity. This study is useful for researchers to have a complete view of what solutions have been explored so far and which research directions are worth exploring in the future; and for practitioners to select the DNN architecture(s) that better fit the resource constraints of practical deployments and applications. To complete this work, all the DNNs, as well as the software used for the analysis, are available online. |
Tasks | |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00736v2 |
PDF | http://arxiv.org/pdf/1810.00736v2.pdf |
PWC | https://paperswithcode.com/paper/benchmark-analysis-of-representative-deep |
Repo | https://github.com/deyingk/Interesting_Papers |
Framework | none |
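
The kind of per-model measurement the study reports (the paper also covers accuracy and memory) is easy to reproduce for any torchvision model; a minimal sketch of counting parameters and timing CPU inference, with randomly initialized weights since only cost, not accuracy, is measured here:

```python
import time
import torch
import torchvision.models as models

def benchmark(model, input_size=(1, 3, 224, 224), runs=50):
    model.eval()
    params = sum(p.numel() for p in model.parameters()) / 1e6     # millions of parameters
    x = torch.randn(*input_size)
    with torch.no_grad():
        for _ in range(5):                                        # warm-up passes
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
    return params, (time.perf_counter() - start) / runs * 1000    # ms per forward pass

for name in ["resnet18", "mobilenet_v2"]:
    p, ms = benchmark(getattr(models, name)(weights=None))
    print(f"{name}: {p:.1f}M params, {ms:.1f} ms/forward (CPU)")
```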