Paper Group ANR 437
EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue. Pull Message Passing for Nonparametric Belief Propagation. Optimal Bayesian Transfer Learning. Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model. Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management …
EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue
Title | EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue |
Authors | Linkai Luo, Haiqing Yang, Francis Y. L. Chin |
Abstract | In this paper, we propose a self-attentive bidirectional long short-term memory (SA-BiLSTM) network to predict multiple emotions for the EmotionX challenge. The BiLSTM exhibits the power of modeling word dependencies and extracting the most relevant features for emotion classification. Building on top of the BiLSTM, the self-attentive network can model the contextual dependencies between utterances, which are helpful for classifying ambiguous emotions. We achieve 59.6 and 55.0 unweighted accuracy scores on the Friends and EmotionPush test sets, respectively. |
Tasks | Emotion Classification |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07039v2 |
http://arxiv.org/pdf/1806.07039v2.pdf | |
PWC | https://paperswithcode.com/paper/emotionx-dlc-self-attentive-bilstm-for-1 |
Repo | |
Framework | |
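A minimal PyTorch sketch of the idea described in the abstract above: a BiLSTM encodes the tokens of an utterance and a simple self-attention layer pools the hidden states into a single vector for emotion classification. The layer sizes, vocabulary, and number of emotion classes are illustrative assumptions, not the authors' exact SA-BiLSTM configuration.

```python
# Illustrative self-attentive BiLSTM utterance encoder; hyperparameters are assumptions.
import torch
import torch.nn as nn

class SelfAttentiveBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, num_emotions=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)      # scores each time step
        self.out = nn.Linear(2 * hidden_dim, num_emotions)

    def forward(self, token_ids):                      # (batch, seq_len) word ids
        h, _ = self.bilstm(self.embed(token_ids))      # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over time steps
        context = (weights * h).sum(dim=1)             # weighted sum -> utterance vector
        return self.out(context)                       # emotion logits

model = SelfAttentiveBiLSTM(vocab_size=10000)
logits = model(torch.randint(1, 10000, (8, 20)))       # 8 utterances, 20 tokens each
print(logits.shape)                                    # torch.Size([8, 4])
```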
Pull Message Passing for Nonparametric Belief Propagation
Title | Pull Message Passing for Nonparametric Belief Propagation |
Authors | Karthik Desingh, Anthony Opipari, Odest Chadwicke Jenkins |
Abstract | We present a “pull” approach to approximate products of Gaussian mixtures within message updates for Nonparametric Belief Propagation (NBP) inference. Existing NBP methods often represent messages between continuous-valued latent variables as Gaussian mixture models. To avoid computational intractability in loopy graphs, NBP necessitates an approximation of the product of such mixtures. Sampling-based product approximations have shown effectiveness for NBP inference. However, such approximations used within the traditional “push” message update procedures quickly become computationally prohibitive for multi-modal distributions over high-dimensional variables. In contrast, we propose a “pull” method, the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, and demonstrate its viability for efficient inference. We report results using an experiment from an existing NBP method, PAMPAS, for inferring the pose of an articulated structure in clutter. Results on this illustrative problem show that PMPNBP scales the number of components in its mixtures more efficiently and, consequently, improves inference accuracy. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10487v1 |
http://arxiv.org/pdf/1807.10487v1.pdf | |
PWC | https://paperswithcode.com/paper/pull-message-passing-for-nonparametric-belief |
Repo | |
Framework | |
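A rough numpy sketch of the "pull" intuition from the abstract above, reduced to one dimension: candidate samples are drawn first from one incoming mixture, and each candidate is then weighted by evaluating the remaining mixtures at it, rather than constructing the product mixture before sampling. This is a simplified illustration under those assumptions, not the full PMPNBP message schedule.

```python
# Simplified 1-D pull-style approximation of a product of Gaussian mixtures.
import numpy as np

def mixture_pdf(x, means, stds, weights):
    comps = weights * np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    return comps.sum(axis=1)

def pull_product_approx(mixtures, n_samples=500, rng=np.random.default_rng(0)):
    """mixtures: list of (means, stds, weights); returns equally weighted samples."""
    # "Pull": draw candidates from one incoming mixture (the proposal) ...
    means0, stds0, w0 = mixtures[0]
    comp = rng.choice(len(w0), size=n_samples, p=w0)
    cand = rng.normal(means0[comp], stds0[comp])
    # ... then weight each candidate by the *other* incoming mixtures evaluated at it.
    weights = np.ones(n_samples)
    for means, stds, w in mixtures[1:]:
        weights *= mixture_pdf(cand, means, stds, w)
    weights /= weights.sum()
    # Resample to an equally weighted particle set representing the product.
    return rng.choice(cand, size=n_samples, p=weights)

m1 = (np.array([-1.0, 2.0]), np.array([0.5, 0.5]), np.array([0.5, 0.5]))
m2 = (np.array([1.8, 5.0]),  np.array([0.7, 0.7]), np.array([0.6, 0.4]))
samples = pull_product_approx([m1, m2])
print(samples.mean())   # mass concentrates near the overlap around 2
```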
Optimal Bayesian Transfer Learning
Title | Optimal Bayesian Transfer Learning |
Authors | Alireza Karbalayghareh, Xiaoning Qian, Edward R. Dougherty |
Abstract | Transfer learning has recently attracted significant research attention, as it simultaneously learns from different source domains, which have plenty of labeled data, and transfers the relevant knowledge to the target domain with limited labeled data to improve the prediction performance. We propose a Bayesian transfer learning framework where the source and target domains are related through the joint prior density of the model parameters. The modeling of joint prior densities enables better understanding of the “transferability” between domains. We define a joint Wishart density for the precision matrices of the Gaussian feature-label distributions in the source and target domains to act like a bridge that transfers the useful information of the source domain to help classification in the target domain by improving the target posteriors. Using several theorems in multivariate statistics, the posteriors and posterior predictive densities are derived in closed forms with hypergeometric functions of matrix argument, leading to our novel closed-form and fast Optimal Bayesian Transfer Learning (OBTL) classifier. Experimental results on both synthetic and real-world benchmark data confirm the superb performance of the OBTL compared to the other state-of-the-art transfer learning and domain adaptation methods. |
Tasks | Domain Adaptation, Transfer Learning |
Published | 2018-01-02 |
URL | http://arxiv.org/abs/1801.00857v2 |
http://arxiv.org/pdf/1801.00857v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-bayesian-transfer-learning |
Repo | |
Framework | |
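A heavily simplified sketch of the transfer mechanism sketched in the abstract above: the target-domain precision matrix gets a Wishart prior whose scale is built from the source-domain scatter, so plentiful source data sharpens the target posterior even when target labels are scarce. This is a plug-in, conjugate-update approximation for illustration only, not the paper's closed-form OBTL classifier with hypergeometric functions; all class structure and hyperparameters are assumptions.

```python
# Source-informed Wishart prior on the target precision; plug-in Gaussian classifier.
import numpy as np

def posterior_mean_precision(X, mu, nu0, V0):
    """Conjugate Wishart update for a Gaussian likelihood with (plug-in) known mean."""
    S = (X - mu).T @ (X - mu)                 # target scatter matrix
    Vn = np.linalg.inv(np.linalg.inv(V0) + S)
    return (nu0 + len(X)) * Vn                # E[precision | data]

def transfer_gaussian_classifier(Xs_by_class, Xt_by_class, nu0=10.0):
    """Per-class (mean, precision) using source-informed Wishart priors."""
    params = {}
    for c in Xt_by_class:
        Xs, Xt = Xs_by_class[c], Xt_by_class[c]
        d = Xs.shape[1]
        # Prior scale chosen so E[precision_prior] ≈ inv(source covariance).
        V0 = np.linalg.inv(np.cov(Xs.T) + 1e-6 * np.eye(d)) / nu0
        mu = Xt.mean(axis=0)                  # small target set -> noisy, kept simple
        params[c] = (mu, posterior_mean_precision(Xt, mu, nu0, V0))
    return params

def predict(x, params):
    scores = {c: -0.5 * (x - mu) @ P @ (x - mu) + 0.5 * np.linalg.slogdet(P)[1]
              for c, (mu, P) in params.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(0)
Xs = {0: rng.normal(0, 1, (500, 2)), 1: rng.normal(3, 1, (500, 2))}   # source: plenty
Xt = {0: rng.normal(0.3, 1, (5, 2)), 1: rng.normal(3.3, 1, (5, 2))}   # target: scarce
params = transfer_gaussian_classifier(Xs, Xt)
print(predict(np.array([3.0, 3.0]), params))   # -> 1
```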
Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model
Title | Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model |
Authors | Daniel C. Castro, Sebastian Nowozin |
Abstract | Current face recognition systems typically operate via classification into known identities obtained from supervised identity annotations. There are two problems with this paradigm: (1) current systems are unable to benefit from often abundant unlabelled data; and (2) they equate successful recognition with labelling a given input image. Humans, on the other hand, regularly perform identification of individuals completely unsupervised, recognising the identity of someone they have seen before even without being able to name that individual. How can we go beyond the current classification paradigm towards a more human understanding of identities? In previous work, we proposed an integrated Bayesian model that coherently reasons about the observed images, identities, partial knowledge about names, and the situational context of each observation. Here, we propose extensions of the contextual component of this model, enabling unsupervised discovery of an unbounded number of contexts for improved face recognition. |
Tasks | Face Recognition |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07753v1 |
http://arxiv.org/pdf/1811.07753v1.pdf | |
PWC | https://paperswithcode.com/paper/contextual-face-recognition-with-a-nested |
Repo | |
Framework | |
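A hedged sketch of the nonparametric flavour of the model above: cluster face embeddings into an unbounded number of identities with a Dirichlet-process mixture (sklearn's truncated variational approximation), with no identity labels at all. The embeddings, dimensionality, and truncation level here are synthetic placeholders; the paper's full model additionally reasons about names and situational context.

```python
# Unsupervised identity discovery via a Dirichlet-process Gaussian mixture.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Pretend these are 128-D face embeddings of 3 unknown people (no names attached).
embeddings = np.vstack([rng.normal(loc, 0.05, size=(40, 128))
                        for loc in (0.0, 0.5, -0.5)])

dpmm = BayesianGaussianMixture(
    n_components=20,                              # truncation level, not the true count
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
    random_state=0,
).fit(embeddings)

identities = dpmm.predict(embeddings)
print("identities discovered:", len(np.unique(identities)))   # ≈ 3 effective clusters
```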
Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management
Title | Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management |
Authors | Ursula Challita, Walid Saad, Christian Bettstetter |
Abstract | In this paper, an interference-aware path planning scheme for a network of cellular-connected unmanned aerial vehicles (UAVs) is proposed. In particular, each UAV aims at achieving a tradeoff between maximizing energy efficiency and minimizing both wireless latency and the interference level caused on the ground network along its path. The problem is cast as a dynamic game among UAVs. To solve this game, a deep reinforcement learning algorithm, based on echo state network (ESN) cells, is proposed. The introduced deep ESN architecture is trained to allow each UAV to map each observation of the network state to an action, with the goal of minimizing a sequence of time-dependent utility functions. Each UAV uses the ESN to learn its optimal path, transmission power level, and cell association vector at different locations along its path. The proposed algorithm is shown to reach a subgame perfect Nash equilibrium (SPNE) upon convergence. Moreover, upper and lower bounds for the altitude of the UAVs are derived, thus reducing the computational complexity of the proposed algorithm. Simulation results show that the proposed scheme achieves better wireless latency per UAV and rate per ground user (UE) while requiring a number of steps comparable to a heuristic baseline that moves via the shortest distance towards the corresponding destinations. The results also show that the optimal altitude of the UAVs varies based on the ground network density and the UE data rate requirements, and plays a vital role in minimizing the interference level on the ground UEs as well as the wireless transmission delay of the UAV. |
Tasks | |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05500v1 |
http://arxiv.org/pdf/1801.05500v1.pdf | |
PWC | https://paperswithcode.com/paper/cellular-connected-uavs-over-5g-deep |
Repo | |
Framework | |
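A minimal numpy sketch of the echo state network (ESN) cell referenced above: a fixed random reservoir summarises the observation history and only a linear readout, mapping the reservoir state to per-action utilities, is trained. Sizes, the observation vector, and the surrounding reinforcement-learning loop (game, rewards, updates) are assumptions and are not reproduced here.

```python
# Fixed-reservoir ESN cell with a trainable linear readout.
import numpy as np

class ESNCell:
    def __init__(self, n_inputs, n_reservoir=200, n_actions=5,
                 spectral_radius=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))
        W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        # Scale recurrent weights so the reservoir has the echo-state property.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.W_out = np.zeros((n_actions, n_reservoir))   # the only trained part
        self.state = np.zeros(n_reservoir)

    def step(self, observation):
        self.state = np.tanh(self.W_in @ observation + self.W @ self.state)
        return self.W_out @ self.state                    # estimated action utilities

esn = ESNCell(n_inputs=8)
obs = np.random.default_rng(1).normal(size=8)             # e.g. interference, position
print(esn.step(obs).shape)                                 # (5,) one value per action
```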
Understanding and Comparing Scalable Gaussian Process Regression for Big Data
Title | Understanding and Comparing Scalable Gaussian Process Regression for Big Data |
Authors | Haitao Liu, Jianfei Cai, Yew-Soon Ong, Yi Wang |
Abstract | As a non-parametric Bayesian model that produces informative predictive distributions, the Gaussian process (GP) has been widely used in various fields, such as regression, classification and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature to improve scalability while retaining desirable prediction accuracy. This paper investigates the methodological characteristics and performance of representative global and local scalable GPs, including sparse approximations and local aggregations, from four main perspectives: scalability, capability, controllability and robustness. Numerical experiments on two toy examples and five real-world datasets with up to 250K points offer the following findings. In terms of scalability, most of the scalable GPs have a time complexity that is linear in the training size. In terms of capability, the sparse approximations capture long-term spatial correlations, while the local aggregations capture local patterns but suffer from over-fitting in some scenarios. In terms of controllability, we can improve the performance of sparse approximations by simply increasing the inducing size, but this is not the case for local aggregations. In terms of robustness, local aggregations are robust to various initializations of hyperparameters due to the local attention mechanism. Finally, we highlight that a proper hybrid of global and local scalable GPs may be a promising way to improve both model capability and scalability for big data. |
Tasks | |
Published | 2018-11-03 |
URL | http://arxiv.org/abs/1811.01159v1 |
http://arxiv.org/pdf/1811.01159v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-and-comparing-scalable-gaussian |
Repo | |
Framework | |
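A hedged sketch of one global scalable-GP idea compared in the paper above: a sparse approximation with m inducing points (here the subset-of-regressors predictive mean), whose cost is O(nm²) instead of the O(n³) of an exact GP. The kernel, data, inducing locations, and noise level are toy placeholders, and hyperparameter learning is omitted.

```python
# Subset-of-regressors sparse GP predictive mean on toy 1-D data.
import numpy as np

def rbf(A, B, lengthscale=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (2000, 1))                     # "big" training set
y = np.sin(8 * X[:, 0]) + 0.1 * rng.normal(size=2000)
Z = np.linspace(0, 1, 30)[:, None]                   # m = 30 inducing inputs
Xstar = np.linspace(0, 1, 5)[:, None]
noise = 0.1**2

Kmm, Kmn = rbf(Z, Z), rbf(Z, X)
Ksm = rbf(Xstar, Z)
A = noise * Kmm + Kmn @ Kmn.T                        # only an m x m system to solve
mean = Ksm @ np.linalg.solve(A, Kmn @ y)             # sparse predictive mean
print(np.round(mean, 2), np.round(np.sin(8 * Xstar[:, 0]), 2))
```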
Motion deblurring of faces
Title | Motion deblurring of faces |
Authors | Grigorios G. Chrysos, Paolo Favaro, Stefanos Zafeiriou |
Abstract | Face analysis is a core part of computer vision, in which remarkable progress has been observed in the past decades. Current methods achieve recognition and tracking with invariance to fundamental modes of variation such as illumination, 3D pose, and expressions. Notwithstanding, a much less studied mode of variation is motion blur, which presents substantial challenges in face analysis. Recent approaches either make oversimplifying assumptions, e.g. in cases of joint optimization with other tasks, or fail to preserve the highly structured shape/identity information. Therefore, we propose a data-driven method that encourages identity preservation. The proposed model includes two parallel streams (sub-networks): the first deblurs the image, while the second implicitly extracts and projects the identity of both the sharp and the blurred image into similar subspaces. We devise a method for creating realistic motion blur by averaging a variable number of frames to train our model. The averaged images originate from the 2MF2 dataset of 10 million facial frames, which we introduce for the task. Considering deblurring as an intermediate step, we utilize the deblurred outputs to conduct thorough experimentation on high-level face analysis tasks, i.e. landmark localization and face verification. The experimental evaluation demonstrates the superiority of our method. |
Tasks | Deblurring, Face Verification |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.03330v1 |
http://arxiv.org/pdf/1803.03330v1.pdf | |
PWC | https://paperswithcode.com/paper/motion-deblurring-of-faces |
Repo | |
Framework | |
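A minimal sketch of the blur-synthesis step described in the abstract above: average a variable number of consecutive video frames to create a realistically motion-blurred image, paired with a sharp reference frame for supervised training. The frames here are synthetic arrays standing in for real face video, and the window sizes are assumptions.

```python
# Synthesize (blurred, sharp) training pairs by temporal frame averaging.
import numpy as np

def synthesize_blur(frames, rng=np.random.default_rng(0), max_window=9):
    """frames: (T, H, W, C) uint8 video clip -> (blurred, sharp) uint8 pair."""
    T = len(frames)
    window = rng.integers(3, max_window + 1)          # variable blur length
    start = rng.integers(0, T - window + 1)
    clip = frames[start:start + window].astype(np.float32)
    blurred = clip.mean(axis=0).astype(np.uint8)      # temporal average = motion blur
    sharp = frames[start + window // 2]               # middle frame as ground truth
    return blurred, sharp

video = np.random.default_rng(1).integers(0, 256, (30, 64, 64, 3), dtype=np.uint8)
blurred, sharp = synthesize_blur(video)
print(blurred.shape, sharp.shape)                     # (64, 64, 3) (64, 64, 3)
```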
Algorithmic Bidding for Virtual Trading in Electricity Markets
Title | Algorithmic Bidding for Virtual Trading in Electricity Markets |
Authors | Sevi Baltaoglu, Lang Tong, Qing Zhao |
Abstract | We consider the problem of optimal bidding for virtual trading in two-settlement electricity markets. A virtual trader aims to arbitrage on the differences between day-ahead and real-time market prices; both prices, however, are random and unknown to market participants. An online learning algorithm is proposed to maximize the cumulative payoff over a finite number of trading sessions by allocating the trader’s budget among his bids for K options in each session. It is shown that the proposed algorithm converges, with an almost optimal convergence rate, to the global optimum corresponding to the case when the underlying price distribution is known. The proposed algorithm is also generalized for trading strategies with a risk measure. Using both cumulative payoff and the Sharpe ratio as performance metrics, evaluations were performed based on historical data spanning a ten-year period of the NYISO and PJM markets. It was shown that the proposed strategy outperforms standard benchmarks and the S&P 500 index over the same period. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.03010v2 |
http://arxiv.org/pdf/1802.03010v2.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-bidding-for-virtual-trading-in |
Repo | |
Framework | |
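A hedged sketch of the online-learning flavour of the bidding problem above: a fixed budget is spread across K bidding options each session and the allocation is updated multiplicatively from the observed per-unit payoffs. This is a generic Hedge / exponentiated-gradient style rule under synthetic payoffs, not the paper's specific algorithm or its risk-aware extension.

```python
# Multiplicative-weights budget allocation over K bidding options.
import numpy as np

def run_sessions(payoff_per_unit, budget=1.0, eta=0.5):
    """payoff_per_unit: (T, K) realised payoff per unit of budget on each option."""
    T, K = payoff_per_unit.shape
    weights = np.ones(K)
    total = 0.0
    for t in range(T):
        alloc = budget * weights / weights.sum()       # spread budget over K options
        total += alloc @ payoff_per_unit[t]            # session payoff
        weights *= np.exp(eta * payoff_per_unit[t])    # favour profitable options
    return total

rng = np.random.default_rng(0)
T, K = 1000, 5
means = np.array([0.01, -0.02, 0.03, 0.0, -0.01])      # option 2 is best on average
payoffs = rng.normal(means, 0.05, size=(T, K))
print(round(run_sessions(payoffs), 2))                  # close to the best option's total
```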
Improving Fast Segmentation With Teacher-student Learning
Title | Improving Fast Segmentation With Teacher-student Learning |
Authors | Jiafeng Xie, Bing Shuai, Jian-Fang Hu, Jingyang Lin, Wei-Shi Zheng |
Abstract | Recently, segmentation neural networks have been significantly improved, demonstrating very promising accuracies on public benchmarks. However, these models are very heavy and generally suffer from low inference speed, which limits their application scenarios in practice. Meanwhile, existing fast segmentation models usually fail to obtain satisfactory segmentation accuracies on public benchmarks. In this paper, we propose a teacher-student learning framework that transfers the knowledge gained by a heavy and better-performing segmentation network (i.e. the teacher) to guide the learning of fast segmentation networks (i.e. the student). Specifically, both zero-order and first-order knowledge depicted in the finely annotated images and unlabeled auxiliary data are transferred to regularize the student's learning. The proposed method can improve existing fast segmentation models without incurring extra computational overhead, so it can still process images at the same fast speed. Extensive experiments on the Pascal Context, Cityscapes and VOC 2012 datasets demonstrate that the proposed teacher-student learning framework is able to significantly boost the performance of the student network. |
Tasks | |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08476v1 |
http://arxiv.org/pdf/1810.08476v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-fast-segmentation-with-teacher |
Repo | |
Framework | |
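A minimal PyTorch sketch of the zero-order part of the teacher-student transfer discussed above: the student's per-pixel class distribution is matched to the teacher's softened outputs while also fitting the ground-truth labels. The temperature, weighting, and class count are assumptions; the first-order term and the unlabeled auxiliary data are omitted for brevity.

```python
# Pixel-wise knowledge-distillation loss for semantic segmentation.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Logits: (B, C, H, W); labels: (B, H, W) with class indices."""
    ce = F.cross_entropy(student_logits, labels, ignore_index=255)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                   # standard temperature scaling
    return alpha * ce + (1 - alpha) * kd

student_logits = torch.randn(2, 19, 64, 64, requires_grad=True)
teacher_logits = torch.randn(2, 19, 64, 64)
labels = torch.randint(0, 19, (2, 64, 64))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print(float(loss))
```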
Invariant properties of a locally salient dither pattern with a spatial-chromatic histogram
Title | Invariant properties of a locally salient dither pattern with a spatial-chromatic histogram |
Authors | A. M. R. R. Bandara, L. Ranathunga, N. A. Abdullah |
Abstract | Compacted Dither Pattern Code (CDPC) is a recently proposed feature that has been successful in irregular-shape-based visual depiction. The locally salient dither pattern feature is an attempt to extend the capability of CDPC to both regular- and irregular-shape-based visual depiction. This paper presents an analysis of the rotational and scale invariance of the locally salient dither pattern feature with a two-dimensional spatial-chromatic histogram, which expands the applicability of the visual feature. Experiments combining a linear Support Vector Machine (SVM) classifier with the new feature were conducted to demonstrate its rotational and scale invariance. The experimental results revealed that the locally salient dither pattern feature with the spatial-chromatic histogram is rotation- and scale-invariant. |
Tasks | |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1803.00037v1 |
http://arxiv.org/pdf/1803.00037v1.pdf | |
PWC | https://paperswithcode.com/paper/invariant-properties-of-a-locally-salient |
Repo | |
Framework | |
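A small sketch of the evaluation protocol implied above: train a linear SVM on features extracted from upright images, then test on features from rotated versions; an invariant feature keeps the accuracy high. The feature extractor here is a plain colour histogram standing in for the CDPC/dither-pattern feature, and the data are synthetic colour-biased images.

```python
# Train a linear SVM on upright images, test on rotated copies to probe invariance.
import numpy as np
from scipy.ndimage import rotate
from sklearn.svm import LinearSVC

def color_histogram(img, bins=8):                    # placeholder feature extractor
    h, _ = np.histogramdd(img.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    return h.ravel() / h.sum()

rng = np.random.default_rng(0)
imgs, labels = [], []
for c in (0, 1):                                     # two colour-biased classes
    for _ in range(30):
        img = rng.integers(0, 256, (32, 32, 3)).astype(float)
        img[..., c] = np.clip(img[..., c] + 80, 0, 255)
        imgs.append(img)
        labels.append(c)

X_train = np.array([color_histogram(im) for im in imgs])
X_rot = np.array([color_histogram(rotate(im, 30, axes=(0, 1), reshape=False,
                                         order=1, mode="nearest")) for im in imgs])

clf = LinearSVC(max_iter=5000).fit(X_train, labels)
print("accuracy on rotated inputs:", clf.score(X_rot, labels))
```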
EdgeSpeechNets: Highly Efficient Deep Neural Networks for Speech Recognition on the Edge
Title | EdgeSpeechNets: Highly Efficient Deep Neural Networks for Speech Recognition on the Edge |
Authors | Zhong Qiu Lin, Audrey G. Chung, Alexander Wong |
Abstract | Despite showing state-of-the-art performance, deep learning for speech recognition remains challenging to deploy in on-device edge scenarios such as mobile and other consumer devices. Recently, there have been greater efforts in the design of small, low-footprint deep neural networks (DNNs) that are more appropriate for edge devices, with much of the focus on design principles for hand-crafting efficient network architectures. In this study, we explore a human-machine collaborative design strategy for building low-footprint DNN architectures for speech recognition through a marriage of human-driven principled network design prototyping and machine-driven design exploration. The efficacy of this design strategy is demonstrated through the design of a family of highly-efficient DNNs (nicknamed EdgeSpeechNets) for limited-vocabulary speech recognition. Experimental results using the Google Speech Commands dataset for limited-vocabulary speech recognition showed that EdgeSpeechNets have higher accuracies than state-of-the-art DNNs (with the best EdgeSpeechNet achieving ~97% accuracy), while achieving significantly smaller network sizes (as much as 7.8x smaller) and lower computational cost (as much as 36x fewer multiply-add operations, 10x lower prediction latency, and 16x smaller memory footprint on a Motorola Moto E phone), making them very well-suited for on-device edge voice interface applications. |
Tasks | Speech Recognition |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08559v2 |
http://arxiv.org/pdf/1810.08559v2.pdf | |
PWC | https://paperswithcode.com/paper/edgespeechnets-highly-efficient-deep-neural |
Repo | |
Framework | |
Differential Diagnosis for Pancreatic Cysts in CT Scans Using Densely-Connected Convolutional Networks
Title | Differential Diagnosis for Pancreatic Cysts in CT Scans Using Densely-Connected Convolutional Networks |
Authors | Hongwei Li, Kanru Lin, Maximilian Reichert, Lina Xu, Rickmer Braren, Deliang Fu, Roland Schmid, Ji Li, Bjoern Menze, Kuangyu Shi |
Abstract | The lethal nature of pancreatic ductal adenocarcinoma (PDAC) calls for early differential diagnosis of pancreatic cysts, which are identified in up to 16% of normal subjects, and some of which may develop into PDAC. Previous computer-aided developments have achieved a certain accuracy for classification of segmented cystic lesions in CT. However, pancreatic cysts have a large variation in size and shape, and their precise segmentation remains rather challenging, which restricts the computer-aided interpretation of CT images acquired for differential diagnosis. We propose a computer-aided framework for early differential diagnosis of pancreatic cysts without pre-segmenting the lesions, using densely-connected convolutional networks (Dense-Net). The Dense-Net learns high-level features from the whole abnormal pancreas and builds mappings between medical imaging appearance and different pathological types of pancreatic cysts. To enhance clinical applicability, we integrate saliency maps in the framework to assist physicians in understanding the decisions of the deep learning method. The test on a cohort of 206 patients with 4 pathologically confirmed subtypes of pancreatic cysts achieved an overall accuracy of 72.8%, significantly higher than the baseline accuracy of 48.1%, which strongly supports the clinical potential of our developed method. |
Tasks | |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01023v3 |
http://arxiv.org/pdf/1806.01023v3.pdf | |
PWC | https://paperswithcode.com/paper/differential-diagnosis-for-pancreatic-cysts |
Repo | |
Framework | |
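A hedged sketch of the two ingredients named in the abstract above: a densely connected classifier over an image and a simple gradient saliency map showing which regions drove the prediction. The input is a random tensor standing in for a CT slice, the class count matches the four subtypes mentioned, and the preprocessing pipeline is not the paper's; this assumes a recent torchvision.

```python
# DenseNet classifier with a gradient saliency map for interpretability.
import torch
from torchvision.models import densenet121

num_cyst_subtypes = 4
model = densenet121(weights=None)
model.classifier = torch.nn.Linear(model.classifier.in_features, num_cyst_subtypes)
model.eval()

scan = torch.rand(1, 3, 224, 224, requires_grad=True)   # placeholder CT slice
logits = model(scan)
pred = logits.argmax(dim=1).item()

# Gradient saliency: d(score of predicted class) / d(input pixels).
logits[0, pred].backward()
saliency = scan.grad.abs().max(dim=1).values             # (1, 224, 224) heat map
print(pred, saliency.shape)
```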
Large-Scale Visual Relationship Understanding
Title | Large-Scale Visual Relationship Understanding |
Authors | Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny |
Abstract | Large-scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples. In real-world scenarios with large numbers of objects and relations, some are seen very commonly while others are barely seen. We develop a new relationship detection model that embeds objects and relations into two vector spaces where both discriminative capability and semantic affinity are preserved. We learn both a visual and a semantic module that map features from the two modalities into a shared space, where matched pairs of features have to discriminate against those unmatched, but also maintain close distances to semantically similar ones. Benefiting from that, our model can achieve superior performance even when the visual entity categories scale up to more than 80,000, with extremely skewed class distribution. We demonstrate the efficacy of our model on a large and imbalanced benchmark based on Visual Genome that comprises 53,000+ objects and 29,000+ relations, a scale at which no previous work has ever been evaluated. We show the superiority of our model over carefully designed baselines on the original Visual Genome dataset with 80,000+ categories. We also show state-of-the-art performance on the VRD dataset and the scene graph dataset, which is a subset of Visual Genome with 200 categories. |
Tasks | |
Published | 2018-04-27 |
URL | https://arxiv.org/abs/1804.10660v4 |
https://arxiv.org/pdf/1804.10660v4.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-visual-relationship-understanding |
Repo | |
Framework | |
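A minimal PyTorch sketch of the shared-embedding idea above: a visual branch and a semantic branch are mapped into one space, and a softmax over similarities pushes each visual feature towards its matching semantic embedding and away from unmatched ones. Feature sizes, the temperature, and the collapsing of objects and relations into one category set are simplifying assumptions.

```python
# Two-branch visual-semantic embedding with a similarity softmax loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchEmbedding(nn.Module):
    def __init__(self, visual_dim=2048, word_dim=300, shared_dim=256):
        super().__init__()
        self.visual = nn.Sequential(nn.Linear(visual_dim, 512), nn.ReLU(),
                                    nn.Linear(512, shared_dim))
        self.semantic = nn.Linear(word_dim, shared_dim)

    def forward(self, visual_feats, word_vecs):
        v = F.normalize(self.visual(visual_feats), dim=1)
        s = F.normalize(self.semantic(word_vecs), dim=1)
        return v @ s.t()                       # (batch, num_categories) similarities

model = TwoBranchEmbedding()
visual_feats = torch.randn(16, 2048)           # region features
word_vecs = torch.randn(1000, 300)             # one embedding per category
targets = torch.randint(0, 1000, (16,))        # matching category per region

sims = model(visual_feats, word_vecs)
loss = F.cross_entropy(sims / 0.1, targets)    # discriminate against unmatched pairs
loss.backward()
print(float(loss))
```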
A Spectral Regularizer for Unsupervised Disentanglement
Title | A Spectral Regularizer for Unsupervised Disentanglement |
Authors | Aditya Ramesh, Youngduck Choi, Yann LeCun |
Abstract | A generative model with a disentangled representation allows for independent control over different aspects of the output. Learning disentangled representations has been a recent topic of great interest, but it remains poorly understood. We show that even for GANs that do not possess disentangled representations, one can find curved trajectories in latent space over which local disentanglement occurs. These trajectories are found by iteratively following the leading right-singular vectors of the Jacobian of the generator with respect to its input. Based on this insight, we describe an efficient regularizer that aligns these vectors with the coordinate axes, and show that it can be used to induce disentangled representations in GANs, in a completely unsupervised manner. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01161v2 |
http://arxiv.org/pdf/1812.01161v2.pdf | |
PWC | https://paperswithcode.com/paper/a-spectral-regularizer-for-unsupervised |
Repo | |
Framework | |
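A sketch of the core computation mentioned in the abstract above: finding the leading right-singular vector of the generator's Jacobian at a latent point by power iteration on J^T J, using Jacobian-vector and vector-Jacobian products so the full Jacobian is never materialised. The "generator" here is a toy MLP standing in for a trained GAN generator, and the regularizer built on top of this vector is not reproduced.

```python
# Leading right-singular vector of the generator Jacobian via matrix-free power iteration.
import torch
from torch.autograd.functional import jvp, vjp

generator = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.Tanh(),
                                torch.nn.Linear(64, 128))

def leading_right_singular_vector(g, z, iters=20):
    v = torch.randn_like(z)
    v = v / v.norm()
    for _ in range(iters):
        _, Jv = jvp(g, (z,), (v,))        # J v          (forward-mode product)
        _, JtJv = vjp(g, (z,), Jv)        # J^T (J v)    (reverse-mode product)
        v = JtJv[0] / JtJv[0].norm()      # power-iteration step on J^T J
    return v                              # latent direction of strongest output change

z = torch.randn(8)
v = leading_right_singular_vector(generator, z)
print(v.norm().item())                    # 1.0 (unit direction in latent space)
```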
FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design
Title | FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design |
Authors | Jeff Zhang, Siddharth Garg |
Abstract | Deep neural networks (DNNs) are increasingly being accelerated on application-specific hardware such as the Google TPU, designed especially for deep learning. Timing speculation is a promising approach to further increase the energy efficiency of DNN accelerators. Architectural exploration for timing speculation requires detailed gate-level timing simulations that can be time-consuming for large DNNs that execute millions of multiply-and-accumulate (MAC) operations. In this paper we propose FATE, a new methodology for fast and accurate timing simulations of DNN accelerators like the Google TPU. FATE proposes two novel ideas: (i) DelayNet, a DNN-based timing model for MAC units; and (ii) a statistical sampling methodology that reduces the number of MAC operations for which timing simulations are performed. We show that FATE results in between 8x and 58x speed-up in timing simulations, while introducing less than 2% error in classification accuracy estimates. We demonstrate the use of FATE by comparing a conventional DNN accelerator that uses 2’s complement (2C) arithmetic with an alternative implementation that uses signed magnitude representations (SMR). We show that the SMR implementation provides 18% more energy savings than 2C for the same classification accuracy, a result that might be of independent interest. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00480v1 |
http://arxiv.org/pdf/1807.00480v1.pdf | |
PWC | https://paperswithcode.com/paper/fate-fast-and-accurate-timing-error |
Repo | |
Framework | |
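A hedged sketch of the statistical-sampling idea named in the abstract above: instead of timing-simulating every MAC operation, time only a random sample and estimate the overall timing-error rate with a confidence interval. The `mac_delay` function is a toy stand-in for a learned timing model such as DelayNet; the operand distribution, clock period, and delay formula are all assumptions.

```python
# Estimate a timing-error rate from a small random sample of MAC operations.
import numpy as np

def mac_delay(a, b):                         # placeholder delay model (ns)
    noise = np.random.default_rng(0).normal(size=len(a))
    return 0.8 + 0.02 * (np.abs(a) + np.abs(b)) + 0.05 * noise

def estimate_error_rate(operands_a, operands_b, clock_period, sample_frac=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n = len(operands_a)
    idx = rng.choice(n, size=max(1, int(sample_frac * n)), replace=False)
    delays = mac_delay(operands_a[idx], operands_b[idx])
    violations = delays > clock_period       # MACs that would miss the clock edge
    p = violations.mean()
    stderr = np.sqrt(p * (1 - p) / len(idx)) # binomial standard error of the estimate
    return p, 1.96 * stderr                  # estimate and ~95% half-interval

rng = np.random.default_rng(1)
a = rng.normal(0, 3, 1_000_000)              # one million MAC operand pairs
b = rng.normal(0, 3, 1_000_000)
p, half = estimate_error_rate(a, b, clock_period=1.0)
print(f"timing-error rate ≈ {p:.3f} ± {half:.3f} from a 1% sample")
```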