Paper Group ANR 437
EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue. Pull Message Passing for Nonparametric Belief Propagation. Optimal Bayesian Transfer Learning. Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model. Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management …
EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue
Title | EmotionX-DLC: Self-Attentive BiLSTM for Detecting Sequential Emotions in Dialogue |
Authors | Linkai Luo, Haiqing Yang, Francis Y. L. Chin |
Abstract | In this paper, we propose a self-attentive bidirectional long short-term memory (SA-BiLSTM) network to predict multiple emotions for the EmotionX challenge. The BiLSTM exhibits the power of modeling word dependencies and extracting the most relevant features for emotion classification. Building on top of the BiLSTM, the self-attentive network can model the contextual dependencies between utterances, which are helpful for classifying ambiguous emotions. We achieve 59.6 and 55.0 unweighted accuracy scores on the Friends and EmotionPush test sets, respectively. |
Tasks | Emotion Classification |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07039v2 |
http://arxiv.org/pdf/1806.07039v2.pdf | |
PWC | https://paperswithcode.com/paper/emotionx-dlc-self-attentive-bilstm-for-1 |
Repo | |
Framework | |
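A minimal PyTorch sketch of the idea described in the abstract above: a BiLSTM encodes the tokens of an utterance and a simple self-attention layer pools the hidden states into a single vector for emotion classification. The layer sizes, vocabulary, and number of emotion classes are illustrative assumptions, not the authors' exact SA-BiLSTM configuration.

```python
# Illustrative self-attentive BiLSTM utterance encoder; hyperparameters are assumptions.
import torch
import torch.nn as nn

class SelfAttentiveBiLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=128, num_emotions=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)      # scores each time step
        self.out = nn.Linear(2 * hidden_dim, num_emotions)

    def forward(self, token_ids):                      # (batch, seq_len) word ids
        h, _ = self.bilstm(self.embed(token_ids))      # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over time steps
        context = (weights * h).sum(dim=1)             # weighted sum -> utterance vector
        return self.out(context)                       # emotion logits

model = SelfAttentiveBiLSTM(vocab_size=10000)
logits = model(torch.randint(1, 10000, (8, 20)))       # 8 utterances, 20 tokens each
print(logits.shape)                                    # torch.Size([8, 4])
```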
Pull Message Passing for Nonparametric Belief Propagation
Title | Pull Message Passing for Nonparametric Belief Propagation |
Authors | Karthik Desingh, Anthony Opipari, Odest Chadwicke Jenkins |
Abstract | We present a “pull” approach to approximate products of Gaussian mixtures within message updates for Nonparametric Belief Propagation (NBP) inference. Existing NBP methods often represent messages between continuous-valued latent variables as Gaussian mixture models. To avoid computational intractability in loopy graphs, NBP necessitates an approximation of the product of such mixtures. Sampling-based product approximations have shown effectiveness for NBP inference. However, such approximations used within the traditional “push” message update procedures quickly become computationally prohibitive for multi-modal distributions over high-dimensional variables. In contrast, we propose a “pull” method, the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, and demonstrate its viability for efficient inference. We report results using an experiment from an existing NBP method, PAMPAS, for inferring the pose of an articulated structure in clutter. Results on this illustrative problem show that PMPNBP scales the number of components in its mixtures more efficiently and, consequently, improves inference accuracy. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10487v1 |
http://arxiv.org/pdf/1807.10487v1.pdf | |
PWC | https://paperswithcode.com/paper/pull-message-passing-for-nonparametric-belief |
Repo | |
Framework | |
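A rough numpy sketch of the "pull" intuition from the abstract above, reduced to one dimension: candidate samples are drawn first from one incoming mixture, and each candidate is then weighted by evaluating the remaining mixtures at it, rather than constructing the product mixture before sampling. This is a simplified illustration under those assumptions, not the full PMPNBP message schedule.

```python
# Simplified 1-D pull-style approximation of a product of Gaussian mixtures.
import numpy as np

def mixture_pdf(x, means, stds, weights):
    comps = weights * np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    return comps.sum(axis=1)

def pull_product_approx(mixtures, n_samples=500, rng=np.random.default_rng(0)):
    """mixtures: list of (means, stds, weights); returns equally weighted samples."""
    # "Pull": draw candidates from one incoming mixture (the proposal) ...
    means0, stds0, w0 = mixtures[0]
    comp = rng.choice(len(w0), size=n_samples, p=w0)
    cand = rng.normal(means0[comp], stds0[comp])
    # ... then weight each candidate by the *other* incoming mixtures evaluated at it.
    weights = np.ones(n_samples)
    for means, stds, w in mixtures[1:]:
        weights *= mixture_pdf(cand, means, stds, w)
    weights /= weights.sum()
    # Resample to an equally weighted particle set representing the product.
    return rng.choice(cand, size=n_samples, p=weights)

m1 = (np.array([-1.0, 2.0]), np.array([0.5, 0.5]), np.array([0.5, 0.5]))
m2 = (np.array([1.8, 5.0]),  np.array([0.7, 0.7]), np.array([0.6, 0.4]))
samples = pull_product_approx([m1, m2])
print(samples.mean())   # mass concentrates near the overlap around 2
```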
Optimal Bayesian Transfer Learning
Title | Optimal Bayesian Transfer Learning |
Authors | Alireza Karbalayghareh, Xiaoning Qian, Edward R. Dougherty |
Abstract | Transfer learning has recently attracted significant research attention, as it simultaneously learns from different source domains, which have plenty of labeled data, and transfers the relevant knowledge to the target domain with limited labeled data to improve the prediction performance. We propose a Bayesian transfer learning framework where the source and target domains are related through the joint prior density of the model parameters. The modeling of joint prior densities enables better understanding of the “transferability” between domains. We define a joint Wishart density for the precision matrices of the Gaussian feature-label distributions in the source and target domains to act like a bridge that transfers the useful information of the source domain to help classification in the target domain by improving the target posteriors. Using several theorems in multivariate statistics, the posteriors and posterior predictive densities are derived in closed forms with hypergeometric functions of matrix argument, leading to our novel closed-form and fast Optimal Bayesian Transfer Learning (OBTL) classifier. Experimental results on both synthetic and real-world benchmark data confirm the superb performance of the OBTL compared to the other state-of-the-art transfer learning and domain adaptation methods. |
Tasks | Domain Adaptation, Transfer Learning |
Published | 2018-01-02 |
URL | http://arxiv.org/abs/1801.00857v2 |
http://arxiv.org/pdf/1801.00857v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-bayesian-transfer-learning |
Repo | |
Framework | |
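A heavily simplified sketch of the transfer mechanism sketched in the abstract above: the target-domain precision matrix gets a Wishart prior whose scale is built from the source-domain scatter, so plentiful source data sharpens the target posterior even when target labels are scarce. This is a plug-in, conjugate-update approximation for illustration only, not the paper's closed-form OBTL classifier with hypergeometric functions; all class structure and hyperparameters are assumptions.

```python
# Source-informed Wishart prior on the target precision; plug-in Gaussian classifier.
import numpy as np

def posterior_mean_precision(X, mu, nu0, V0):
    """Conjugate Wishart update for a Gaussian likelihood with (plug-in) known mean."""
    S = (X - mu).T @ (X - mu)                 # target scatter matrix
    Vn = np.linalg.inv(np.linalg.inv(V0) + S)
    return (nu0 + len(X)) * Vn                # E[precision | data]

def transfer_gaussian_classifier(Xs_by_class, Xt_by_class, nu0=10.0):
    """Per-class (mean, precision) using source-informed Wishart priors."""
    params = {}
    for c in Xt_by_class:
        Xs, Xt = Xs_by_class[c], Xt_by_class[c]
        d = Xs.shape[1]
        # Prior scale chosen so E[precision_prior] ≈ inv(source covariance).
        V0 = np.linalg.inv(np.cov(Xs.T) + 1e-6 * np.eye(d)) / nu0
        mu = Xt.mean(axis=0)                  # small target set -> noisy, kept simple
        params[c] = (mu, posterior_mean_precision(Xt, mu, nu0, V0))
    return params

def predict(x, params):
    scores = {c: -0.5 * (x - mu) @ P @ (x - mu) + 0.5 * np.linalg.slogdet(P)[1]
              for c, (mu, P) in params.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(0)
Xs = {0: rng.normal(0, 1, (500, 2)), 1: rng.normal(3, 1, (500, 2))}   # source: plenty
Xt = {0: rng.normal(0.3, 1, (5, 2)), 1: rng.normal(3.3, 1, (5, 2))}   # target: scarce
params = transfer_gaussian_classifier(Xs, Xt)
print(predict(np.array([3.0, 3.0]), params))   # -> 1
```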
Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model
Title | Contextual Face Recognition with a Nested-Hierarchical Nonparametric Identity Model |
Authors | Daniel C. Castro, Sebastian Nowozin |
Abstract | Current face recognition systems typically operate via classification into known identities obtained from supervised identity annotations. There are two problems with this paradigm: (1) current systems are unable to benefit from often abundant unlabelled data; and (2) they equate successful recognition with labelling a given input image. Humans, on the other hand, regularly perform identification of individuals completely unsupervised, recognising the identity of someone they have seen before even without being able to name that individual. How can we go beyond the current classification paradigm towards a more human understanding of identities? In previous work, we proposed an integrated Bayesian model that coherently reasons about the observed images, identities, partial knowledge about names, and the situational context of each observation. Here, we propose extensions of the contextual component of this model, enabling unsupervised discovery of an unbounded number of contexts for improved face recognition. |
Tasks | Face Recognition |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07753v1 |
http://arxiv.org/pdf/1811.07753v1.pdf | |
PWC | https://paperswithcode.com/paper/contextual-face-recognition-with-a-nested |
Repo | |
Framework | |
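A hedged sketch of the nonparametric flavour of the model above: cluster face embeddings into an unbounded number of identities with a Dirichlet-process mixture (sklearn's truncated variational approximation), with no identity labels at all. The embeddings, dimensionality, and truncation level here are synthetic placeholders; the paper's full model additionally reasons about names and situational context.

```python
# Unsupervised identity discovery via a Dirichlet-process Gaussian mixture.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Pretend these are 128-D face embeddings of 3 unknown people (no names attached).
embeddings = np.vstack([rng.normal(loc, 0.05, size=(40, 128))
                        for loc in (0.0, 0.5, -0.5)])

dpmm = BayesianGaussianMixture(
    n_components=20,                              # truncation level, not the true count
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="diag",
    random_state=0,
).fit(embeddings)

identities = dpmm.predict(embeddings)
print("identities discovered:", len(np.unique(identities)))   # ≈ 3 effective clusters
```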
Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management
Title | Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management |
Authors | Ursula Challita, Walid Saad, Christian Bettstetter |
Abstract | In this paper, an interference-aware path planning scheme for a network of cellular-connected unmanned aerial vehicles (UAVs) is proposed. In particular, each UAV aims at achieving a tradeoff between maximizing energy efficiency and minimizing both wireless latency and the interference level caused on the ground network along its path. The problem is cast as a dynamic game among UAVs. To solve this game, a deep reinforcement learning algorithm, based on echo state network (ESN) cells, is proposed. The introduced deep ESN architecture is trained to allow each UAV to map each observation of the network state to an action, with the goal of minimizing a sequence of time-dependent utility functions. Each UAV uses the ESN to learn its optimal path, transmission power level, and cell association vector at different locations along its path. The proposed algorithm is shown to reach a subgame perfect Nash equilibrium (SPNE) upon convergence. Moreover, upper and lower bounds for the altitude of the UAVs are derived, thus reducing the computational complexity of the proposed algorithm. Simulation results show that the proposed scheme achieves better wireless latency per UAV and rate per ground user (UE) while requiring a number of steps comparable to a heuristic baseline that moves via the shortest distance towards the corresponding destinations. The results also show that the optimal altitude of the UAVs varies based on the ground network density and the UE data rate requirements, and plays a vital role in minimizing the interference level on the ground UEs as well as the wireless transmission delay of the UAV. |
Tasks | |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05500v1 |
http://arxiv.org/pdf/1801.05500v1.pdf | |
PWC | https://paperswithcode.com/paper/cellular-connected-uavs-over-5g-deep |
Repo | |
Framework | |
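A minimal numpy sketch of the echo state network (ESN) cell referenced above: a fixed random reservoir summarises the observation history and only a linear readout, mapping the reservoir state to per-action utilities, is trained. Sizes, the observation vector, and the surrounding reinforcement-learning loop (game, rewards, updates) are assumptions and are not reproduced here.

```python
# Fixed-reservoir ESN cell with a trainable linear readout.
import numpy as np

class ESNCell:
    def __init__(self, n_inputs, n_reservoir=200, n_actions=5,
                 spectral_radius=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))
        W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        # Scale recurrent weights so the reservoir has the echo-state property.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.W_out = np.zeros((n_actions, n_reservoir))   # the only trained part
        self.state = np.zeros(n_reservoir)

    def step(self, observation):
        self.state = np.tanh(self.W_in @ observation + self.W @ self.state)
        return self.W_out @ self.state                    # estimated action utilities

esn = ESNCell(n_inputs=8)
obs = np.random.default_rng(1).normal(size=8)             # e.g. interference, position
print(esn.step(obs).shape)                                 # (5,) one value per action
```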
Understanding and Comparing Scalable Gaussian Process Regression for Big Data
Title | Understanding and Comparing Scalable Gaussian Process Regression for Big Data |
Authors | Haitao Liu, Jianfei Cai, Yew-Soon Ong, Yi Wang |
Abstract | As a non-parametric Bayesian model that produces informative predictive distributions, the Gaussian process (GP) has been widely used in various fields, such as regression, classification and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature to improve scalability while retaining desirable prediction accuracy. This paper investigates the methodological characteristics and performance of representative global and local scalable GPs, including sparse approximations and local aggregations, from four main perspectives: scalability, capability, controllability and robustness. Numerical experiments on two toy examples and five real-world datasets with up to 250K points offer the following findings. In terms of scalability, most of the scalable GPs have a time complexity that is linear in the training size. In terms of capability, the sparse approximations capture long-term spatial correlations, while the local aggregations capture local patterns but suffer from over-fitting in some scenarios. In terms of controllability, we can improve the performance of sparse approximations by simply increasing the inducing size, but this is not the case for local aggregations. In terms of robustness, local aggregations are robust to various initializations of hyperparameters due to the local attention mechanism. Finally, we highlight that a proper hybrid of global and local scalable GPs may be a promising way to improve both model capability and scalability for big data. |
Tasks | |
Published | 2018-11-03 |
URL | http://arxiv.org/abs/1811.01159v1 |
http://arxiv.org/pdf/1811.01159v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-and-comparing-scalable-gaussian |
Repo | |
Framework | |
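A hedged sketch of one global scalable-GP idea compared in the paper above: a sparse approximation with m inducing points (here the subset-of-regressors predictive mean), whose cost is O(nm²) instead of the O(n³) of an exact GP. The kernel, data, inducing locations, and noise level are toy placeholders, and hyperparameter learning is omitted.

```python
# Subset-of-regressors sparse GP predictive mean on toy 1-D data.
import numpy as np

def rbf(A, B, lengthscale=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (2000, 1))                     # "big" training set
y = np.sin(8 * X[:, 0]) + 0.1 * rng.normal(size=2000)
Z = np.linspace(0, 1, 30)[:, None]                   # m = 30 inducing inputs
Xstar = np.linspace(0, 1, 5)[:, None]
noise = 0.1**2

Kmm, Kmn = rbf(Z, Z), rbf(Z, X)
Ksm = rbf(Xstar, Z)
A = noise * Kmm + Kmn @ Kmn.T                        # only an m x m system to solve
mean = Ksm @ np.linalg.solve(A, Kmn @ y)             # sparse predictive mean
print(np.round(mean, 2), np.round(np.sin(8 * Xstar[:, 0]), 2))
```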
Motion deblurring of faces
Title | Motion deblurring of faces |
Authors | Grigorios G. Chrysos, Paolo Favaro, Stefanos Zafeiriou |
Abstract | Face analysis is a core part of computer vision, in which remarkable progress has been observed in the past decades. Current methods achieve recognition and tracking with invariance to fundamental modes of variation such as illumination, 3D pose, and expressions. Notwithstanding, a much less studied mode of variation is motion blur, which presents substantial challenges in face analysis. Recent approaches either make oversimplifying assumptions, e.g. in cases of joint optimization with other tasks, or fail to preserve the highly structured shape/identity information. Therefore, we propose a data-driven method that encourages identity preservation. The proposed model includes two parallel streams (sub-networks): the first deblurs the image, while the second implicitly extracts and projects the identity of both the sharp and the blurred image into similar subspaces. We devise a method for creating realistic motion blur by averaging a variable number of frames to train our model. The averaged images originate from the 2MF2 dataset of 10 million facial frames, which we introduce for the task. Considering deblurring as an intermediate step, we utilize the deblurred outputs to conduct thorough experimentation on high-level face analysis tasks, i.e. landmark localization and face verification. The experimental evaluation demonstrates the superiority of our method. |
Tasks | Deblurring, Face Verification |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.03330v1 |
http://arxiv.org/pdf/1803.03330v1.pdf | |
PWC | https://paperswithcode.com/paper/motion-deblurring-of-faces |
Repo | |
Framework | |
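A minimal sketch of the blur-synthesis step described in the abstract above: average a variable number of consecutive video frames to create a realistically motion-blurred image, paired with a sharp reference frame for supervised training. The frames here are synthetic arrays standing in for real face video, and the window sizes are assumptions.

```python
# Synthesize (blurred, sharp) training pairs by temporal frame averaging.
import numpy as np

def synthesize_blur(frames, rng=np.random.default_rng(0), max_window=9):
    """frames: (T, H, W, C) uint8 video clip -> (blurred, sharp) uint8 pair."""
    T = len(frames)
    window = rng.integers(3, max_window + 1)          # variable blur length
    start = rng.integers(0, T - window + 1)
    clip = frames[start:start + window].astype(np.float32)
    blurred = clip.mean(axis=0).astype(np.uint8)      # temporal average = motion blur
    sharp = frames[start + window // 2]               # middle frame as ground truth
    return blurred, sharp

video = np.random.default_rng(1).integers(0, 256, (30, 64, 64, 3), dtype=np.uint8)
blurred, sharp = synthesize_blur(video)
print(blurred.shape, sharp.shape)                     # (64, 64, 3) (64, 64, 3)
```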
Algorithmic Bidding for Virtual Trading in Electricity Markets
Title | Algorithmic Bidding for Virtual Trading in Electricity Markets |
Authors | Sevi Baltaoglu, Lang Tong, Qing Zhao |
Abstract | We consider the problem of optimal bidding for virtual trading in two-settlement electricity markets. A virtual trader aims to arbitrage on the differences between day-ahead and real-time market prices; both prices, however, are random and unknown to market participants. An online learning algorithm is proposed to maximize the cumulative payoff over a finite number of trading sessions by allocating the trader’s budget among his bids for K options in each session. It is shown that the proposed algorithm converges, with an almost optimal convergence rate, to the global optimum corresponding to the case when the underlying price distribution is known. The proposed algorithm is also generalized for trading strategies with a risk measure. Using both cumulative payoff and the Sharpe ratio as performance metrics, evaluations were performed based on historical data spanning a ten-year period of the NYISO and PJM markets. It was shown that the proposed strategy outperforms standard benchmarks and the S&P 500 index over the same period. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.03010v2 |
http://arxiv.org/pdf/1802.03010v2.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-bidding-for-virtual-trading-in |
Repo | |
Framework | |
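A hedged sketch of the online-learning flavour of the bidding problem above: a fixed budget is spread across K bidding options each session and the allocation is updated multiplicatively from the observed per-unit payoffs. This is a generic Hedge / exponentiated-gradient style rule under synthetic payoffs, not the paper's specific algorithm or its risk-aware extension.

```python
# Multiplicative-weights budget allocation over K bidding options.
import numpy as np

def run_sessions(payoff_per_unit, budget=1.0, eta=0.5):
    """payoff_per_unit: (T, K) realised payoff per unit of budget on each option."""
    T, K = payoff_per_unit.shape
    weights = np.ones(K)
    total = 0.0
    for t in range(T):
        alloc = budget * weights / weights.sum()       # spread budget over K options
        total += alloc @ payoff_per_unit[t]            # session payoff
        weights *= np.exp(eta * payoff_per_unit[t])    # favour profitable options
    return total

rng = np.random.default_rng(0)
T, K = 1000, 5
means = np.array([0.01, -0.02, 0.03, 0.0, -0.01])      # option 2 is best on average
payoffs = rng.normal(means, 0.05, size=(T, K))
print(round(run_sessions(payoffs), 2))                  # close to the best option's total
```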
Improving Fast Segmentation With Teacher-student Learning
Title | Improving Fast Segmentation With Teacher-student Learning |
Authors | Jiafeng Xie, Bing Shuai, Jian-Fang Hu, Jingyang Lin, Wei-Shi Zheng |
Abstract | Recently, segmentation neural networks have been significantly improved, demonstrating very promising accuracies on public benchmarks. However, these models are very heavy and generally suffer from low inference speed, which limits their application scenarios in practice. Meanwhile, existing fast segmentation models usually fail to obtain satisfactory segmentation accuracies on public benchmarks. In this paper, we propose a teacher-student learning framework that transfers the knowledge gained by a heavy and better-performing segmentation network (i.e. the teacher) to guide the learning of fast segmentation networks (i.e. the student). Specifically, both zero-order and first-order knowledge depicted in the finely annotated images and unlabeled auxiliary data are transferred to regularize the student's learning. The proposed method can improve existing fast segmentation models without incurring extra computational overhead, so it can still process images at the same fast speed. Extensive experiments on the Pascal Context, Cityscapes and VOC 2012 datasets demonstrate that the proposed teacher-student learning framework is able to significantly boost the performance of the student network. |
Tasks | |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08476v1 |
http://arxiv.org/pdf/1810.08476v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-fast-segmentation-with-teacher |
Repo | |
Framework | |
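A minimal PyTorch sketch of the zero-order part of the teacher-student transfer discussed above: the student's per-pixel class distribution is matched to the teacher's softened outputs while also fitting the ground-truth labels. The temperature, weighting, and class count are assumptions; the first-order term and the unlabeled auxiliary data are omitted for brevity.

```python
# Pixel-wise knowledge-distillation loss for semantic segmentation.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Logits: (B, C, H, W); labels: (B, H, W) with class indices."""
    ce = F.cross_entropy(student_logits, labels, ignore_index=255)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                   # standard temperature scaling
    return alpha * ce + (1 - alpha) * kd

student_logits = torch.randn(2, 19, 64, 64, requires_grad=True)
teacher_logits = torch.randn(2, 19, 64, 64)
labels = torch.randint(0, 19, (2, 64, 64))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print(float(loss))
```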
Invariant properties of a locally salient dither pattern with a spatial-chromatic histogram
Title | Invariant properties of a locally salient dither pattern with a spatial-chromatic histogram |
Authors | A. M. R. R. Bandara, L. Ranathunga, N. A. Abdullah |
Abstract | Compacted Dither Pattern Code (CDPC) is a recently proposed feature that has been successful in irregular-shape-based visual depiction. The locally salient dither pattern feature is an attempt to extend the capability of CDPC to both regular- and irregular-shape-based visual depiction. This paper presents an analysis of the rotational and scale invariance of the locally salient dither pattern feature with a two-dimensional spatial-chromatic histogram, which expands the applicability of the visual feature. Experiments combining a linear Support Vector Machine (SVM) classifier with the new feature were conducted to demonstrate its rotational and scale invariance. The experimental results revealed that the locally salient dither pattern feature with the spatial-chromatic histogram is rotation- and scale-invariant. |
Tasks | |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1803.00037v1 |
http://arxiv.org/pdf/1803.00037v1.pdf | |
PWC | https://paperswithcode.com/paper/invariant-properties-of-a-locally-salient |
Repo | |
Framework | |
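A small sketch of the evaluation protocol implied above: train a linear SVM on features extracted from upright images, then test on features from rotated versions; an invariant feature keeps the accuracy high. The feature extractor here is a plain colour histogram standing in for the CDPC/dither-pattern feature, and the data are synthetic colour-biased images.

```python
# Train a linear SVM on upright images, test on rotated copies to probe invariance.
import numpy as np
from scipy.ndimage import rotate
from sklearn.svm import LinearSVC

def color_histogram(img, bins=8):                    # placeholder feature extractor
    h, _ = np.histogramdd(img.reshape(-1, 3), bins=bins, range=[(0, 256)] * 3)
    return h.ravel() / h.sum()

rng = np.random.default_rng(0)
imgs, labels = [], []
for c in (0, 1):                                     # two colour-biased classes
    for _ in range(30):
        img = rng.integers(0, 256, (32, 32, 3)).astype(float)
        img[..., c] = np.clip(img[..., c] + 80, 0, 255)
        imgs.append(img)
        labels.append(c)

X_train = np.array([color_histogram(im) for im in imgs])
X_rot = np.array([color_histogram(rotate(im, 30, axes=(0, 1), reshape=False,
                                         order=1, mode="nearest")) for im in imgs])

clf = LinearSVC(max_iter=5000).fit(X_train, labels)
print("accuracy on rotated inputs:", clf.score(X_rot, labels))
```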
EdgeSpeechNets: Highly Efficient Deep Neural Networks for Speech Recognition on the Edge
Title | EdgeSpeechNets: Highly Efficient Deep Neural Networks for Speech Recognition on the Edge |
Authors | Zhong Qiu Lin, Audrey G. Chung, Alexander Wong |
Abstract | Despite showing state-of-the-art performance, deep learning for speech recognition remains challenging to deploy in on-device edge scenarios such as mobile and other consumer devices. Recently, there have been greater efforts in the design of small, low-footprint deep neural networks (DNNs) that are more appropriate for edge devices, with much of the focus on design principles for hand-crafting efficient network architectures. In this study, we explore a human-machine collaborative design strategy for building low-footprint DNN architectures for speech recognition through a marriage of human-driven principled network design prototyping and machine-driven design exploration. The efficacy of this design strategy is demonstrated through the design of a family of highly-efficient DNNs (nicknamed EdgeSpeechNets) for limited-vocabulary speech recognition. Experimental results using the Google Speech Commands dataset for limited-vocabulary speech recognition showed that EdgeSpeechNets have higher accuracies than state-of-the-art DNNs (with the best EdgeSpeechNet achieving ~97% accuracy), while achieving significantly smaller network sizes (as much as 7.8x smaller) and lower computational cost (as much as 36x fewer multiply-add operations, 10x lower prediction latency, and 16x smaller memory footprint on a Motorola Moto E phone), making them very well-suited for on-device edge voice interface applications. |
Tasks | Speech Recognition |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08559v2 |
http://arxiv.org/pdf/1810.08559v2.pdf | |
PWC | https://paperswithcode.com/paper/edgespeechnets-highly-efficient-deep-neural |
Repo | |
Framework | |
Differential Diagnosis for Pancreatic Cysts in CT Scans Using Densely-Connected Convolutional Networks
Title | Differential Diagnosis for Pancreatic Cysts in CT Scans Using Densely-Connected Convolutional Networks |
Authors | Hongwei Li, Kanru Lin, Maximilian Reichert, Lina Xu, Rickmer Braren, Deliang Fu, Roland Schmid, Ji Li, Bjoern Menze, Kuangyu Shi |
Abstract | The lethal nature of pancreatic ductal adenocarcinoma (PDAC) calls for early differential diagnosis of pancreatic cysts, which are identified in up to 16% of normal subjects, and some of which may develop into PDAC. Previous computer-aided developments have achieved a certain accuracy for classification of segmented cystic lesions in CT. However, pancreatic cysts have a large variation in size and shape, and their precise segmentation remains rather challenging, which restricts the computer-aided interpretation of CT images acquired for differential diagnosis. We propose a computer-aided framework for early differential diagnosis of pancreatic cysts without pre-segmenting the lesions, using densely-connected convolutional networks (Dense-Net). The Dense-Net learns high-level features from the whole abnormal pancreas and builds mappings between medical imaging appearance and different pathological types of pancreatic cysts. To enhance clinical applicability, we integrate saliency maps in the framework to assist physicians in understanding the decisions of the deep learning method. The test on a cohort of 206 patients with 4 pathologically confirmed subtypes of pancreatic cysts achieved an overall accuracy of 72.8%, significantly higher than the baseline accuracy of 48.1%, which strongly supports the clinical potential of our developed method. |
Tasks | |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01023v3 |
http://arxiv.org/pdf/1806.01023v3.pdf | |
PWC | https://paperswithcode.com/paper/differential-diagnosis-for-pancreatic-cysts |
Repo | |
Framework | |
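A hedged sketch of the two ingredients named in the abstract above: a densely connected classifier over an image and a simple gradient saliency map showing which regions drove the prediction. The input is a random tensor standing in for a CT slice, the class count matches the four subtypes mentioned, and the preprocessing pipeline is not the paper's; this assumes a recent torchvision.

```python
# DenseNet classifier with a gradient saliency map for interpretability.
import torch
from torchvision.models import densenet121

num_cyst_subtypes = 4
model = densenet121(weights=None)
model.classifier = torch.nn.Linear(model.classifier.in_features, num_cyst_subtypes)
model.eval()

scan = torch.rand(1, 3, 224, 224, requires_grad=True)   # placeholder CT slice
logits = model(scan)
pred = logits.argmax(dim=1).item()

# Gradient saliency: d(score of predicted class) / d(input pixels).
logits[0, pred].backward()
saliency = scan.grad.abs().max(dim=1).values             # (1, 224, 224) heat map
print(pred, saliency.shape)
```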
Large-Scale Visual Relationship Understanding
Title | Large-Scale Visual Relationship Understanding |
Authors | Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny |
Abstract | Large-scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples. In real-world scenarios with large numbers of objects and relations, some are seen very commonly while others are barely seen. We develop a new relationship detection model that embeds objects and relations into two vector spaces where both discriminative capability and semantic affinity are preserved. We learn both a visual and a semantic module that map features from the two modalities into a shared space, where matched pairs of features have to discriminate against those unmatched, but also maintain close distances to semantically similar ones. Benefiting from that, our model can achieve superior performance even when the visual entity categories scale up to more than 80,000, with extremely skewed class distribution. We demonstrate the efficacy of our model on a large and imbalanced benchmark based on Visual Genome that comprises 53,000+ objects and 29,000+ relations, a scale at which no previous work has ever been evaluated. We show the superiority of our model over carefully designed baselines on the original Visual Genome dataset with 80,000+ categories. We also show state-of-the-art performance on the VRD dataset and the scene graph dataset, which is a subset of Visual Genome with 200 categories. |
Tasks | |
Published | 2018-04-27 |
URL | https://arxiv.org/abs/1804.10660v4 |
https://arxiv.org/pdf/1804.10660v4.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-visual-relationship-understanding |
Repo | |
Framework | |
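A minimal PyTorch sketch of the shared-embedding idea above: a visual branch and a semantic branch are mapped into one space, and a softmax over similarities pushes each visual feature towards its matching semantic embedding and away from unmatched ones. Feature sizes, the temperature, and the collapsing of objects and relations into one category set are simplifying assumptions.

```python
# Two-branch visual-semantic embedding with a similarity softmax loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchEmbedding(nn.Module):
    def __init__(self, visual_dim=2048, word_dim=300, shared_dim=256):
        super().__init__()
        self.visual = nn.Sequential(nn.Linear(visual_dim, 512), nn.ReLU(),
                                    nn.Linear(512, shared_dim))
        self.semantic = nn.Linear(word_dim, shared_dim)

    def forward(self, visual_feats, word_vecs):
        v = F.normalize(self.visual(visual_feats), dim=1)
        s = F.normalize(self.semantic(word_vecs), dim=1)
        return v @ s.t()                       # (batch, num_categories) similarities

model = TwoBranchEmbedding()
visual_feats = torch.randn(16, 2048)           # region features
word_vecs = torch.randn(1000, 300)             # one embedding per category
targets = torch.randint(0, 1000, (16,))        # matching category per region

sims = model(visual_feats, word_vecs)
loss = F.cross_entropy(sims / 0.1, targets)    # discriminate against unmatched pairs
loss.backward()
print(float(loss))
```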
A Spectral Regularizer for Unsupervised Disentanglement
Title | A Spectral Regularizer for Unsupervised Disentanglement |
Authors | Aditya Ramesh, Youngduck Choi, Yann LeCun |
Abstract | A generative model with a disentangled representation allows for independent control over different aspects of the output. Learning disentangled representations has been a recent topic of great interest, but it remains poorly understood. We show that even for GANs that do not possess disentangled representations, one can find curved trajectories in latent space over which local disentanglement occurs. These trajectories are found by iteratively following the leading right-singular vectors of the Jacobian of the generator with respect to its input. Based on this insight, we describe an efficient regularizer that aligns these vectors with the coordinate axes, and show that it can be used to induce disentangled representations in GANs, in a completely unsupervised manner. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01161v2 |
http://arxiv.org/pdf/1812.01161v2.pdf | |
PWC | https://paperswithcode.com/paper/a-spectral-regularizer-for-unsupervised |
Repo | |
Framework | |
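A sketch of the core computation mentioned in the abstract above: finding the leading right-singular vector of the generator's Jacobian at a latent point by power iteration on J^T J, using Jacobian-vector and vector-Jacobian products so the full Jacobian is never materialised. The "generator" here is a toy MLP standing in for a trained GAN generator, and the regularizer built on top of this vector is not reproduced.

```python
# Leading right-singular vector of the generator Jacobian via matrix-free power iteration.
import torch
from torch.autograd.functional import jvp, vjp

generator = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.Tanh(),
                                torch.nn.Linear(64, 128))

def leading_right_singular_vector(g, z, iters=20):
    v = torch.randn_like(z)
    v = v / v.norm()
    for _ in range(iters):
        _, Jv = jvp(g, (z,), (v,))        # J v          (forward-mode product)
        _, JtJv = vjp(g, (z,), Jv)        # J^T (J v)    (reverse-mode product)
        v = JtJv[0] / JtJv[0].norm()      # power-iteration step on J^T J
    return v                              # latent direction of strongest output change

z = torch.randn(8)
v = leading_right_singular_vector(generator, z)
print(v.norm().item())                    # 1.0 (unit direction in latent space)
```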
FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design
Title | FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design |
Authors | Jeff Zhang, Siddharth Garg |
Abstract | Deep neural networks (DNNs) are increasingly being accelerated on application-specific hardware such as the Google TPU, designed especially for deep learning. Timing speculation is a promising approach to further increase the energy efficiency of DNN accelerators. Architectural exploration for timing speculation requires detailed gate-level timing simulations that can be time-consuming for large DNNs that execute millions of multiply-and-accumulate (MAC) operations. In this paper we propose FATE, a new methodology for fast and accurate timing simulations of DNN accelerators like the Google TPU. FATE proposes two novel ideas: (i) DelayNet, a DNN-based timing model for MAC units; and (ii) a statistical sampling methodology that reduces the number of MAC operations for which timing simulations are performed. We show that FATE results in between 8x and 58x speed-up in timing simulations, while introducing less than 2% error in classification accuracy estimates. We demonstrate the use of FATE by comparing a conventional DNN accelerator that uses 2’s complement (2C) arithmetic with an alternative implementation that uses signed magnitude representations (SMR). We show that the SMR implementation provides 18% more energy savings than 2C for the same classification accuracy, a result that might be of independent interest. |
Tasks | |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00480v1 |
http://arxiv.org/pdf/1807.00480v1.pdf | |
PWC | https://paperswithcode.com/paper/fate-fast-and-accurate-timing-error |
Repo | |
Framework | |
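A hedged sketch of the statistical-sampling idea named in the abstract above: instead of timing-simulating every MAC operation, time only a random sample and estimate the overall timing-error rate with a confidence interval. The `mac_delay` function is a toy stand-in for a learned timing model such as DelayNet; the operand distribution, clock period, and delay formula are all assumptions.

```python
# Estimate a timing-error rate from a small random sample of MAC operations.
import numpy as np

def mac_delay(a, b):                         # placeholder delay model (ns)
    noise = np.random.default_rng(0).normal(size=len(a))
    return 0.8 + 0.02 * (np.abs(a) + np.abs(b)) + 0.05 * noise

def estimate_error_rate(operands_a, operands_b, clock_period, sample_frac=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n = len(operands_a)
    idx = rng.choice(n, size=max(1, int(sample_frac * n)), replace=False)
    delays = mac_delay(operands_a[idx], operands_b[idx])
    violations = delays > clock_period       # MACs that would miss the clock edge
    p = violations.mean()
    stderr = np.sqrt(p * (1 - p) / len(idx)) # binomial standard error of the estimate
    return p, 1.96 * stderr                  # estimate and ~95% half-interval

rng = np.random.default_rng(1)
a = rng.normal(0, 3, 1_000_000)              # one million MAC operand pairs
b = rng.normal(0, 3, 1_000_000)
p, half = estimate_error_rate(a, b, clock_period=1.0)
print(f"timing-error rate ≈ {p:.3f} ± {half:.3f} from a 1% sample")
```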