January 31, 2020

Paper Group ANR 56

Lessons from Contextual Bandit Learning in a Customer Support Bot


Title	Lessons from Contextual Bandit Learning in a Customer Support Bot
Authors	Nikos Karampatziakis, Sebastian Kochman, Jade Huang, Paul Mineiro, Kathy Osborne, Weizhu Chen
Abstract	In this work, we describe practical lessons we have learned from successfully using contextual bandits (CBs) to improve key business metrics of the Microsoft Virtual Agent for customer support. While our current use cases focus on single-step reinforcement learning (RL), mostly in the domain of natural language processing and information retrieval, we believe many of our findings are generally applicable. Through this article, we highlight certain issues that RL practitioners may encounter in similar types of applications and offer practical solutions to these challenges.
Tasks	Information Retrieval, Multi-Armed Bandits, Recommendation Systems
Published	2019-05-06
URL	https://arxiv.org/abs/1905.02219v2
PDF	https://arxiv.org/pdf/1905.02219v2.pdf
PWC	https://paperswithcode.com/paper/lessons-from-real-world-reinforcement
Repo
Framework

Hyperbolic Multiplex Network Embedding with Maps of Random Walk


Title	Hyperbolic Multiplex Network Embedding with Maps of Random Walk
Authors	Peiyuan Sun
Abstract	Recent research on network embedding in hyperbolic space has proven successful in several applications. However, nodes in real-world networks tend to interact through several distinct channels. Simply aggregating or ignoring this multiplexity leads to misleading results. On the other hand, there is redundant information across the different interaction patterns between nodes. Recent research reveals an analogy between community structure and hyperbolic coordinates. To learn an effective embedding for each node while reducing the redundancy of the multiplex network, we propose a unified framework combining multiplex network hyperbolic embedding and multiplex community detection. The intuitive rationale is that a high-order node embedding approach is expected to alleviate the sparse and noisy structure of the observed network, which benefits the community detection task. Conversely, the improved community structure also guides the node embedding task. To incorporate the features common to all channels while preserving unique ones, a random walk approach that traverses the latent multiplex hyperbolic space is proposed to detect communities across channels and to bridge node embedding and community detection. The proposed framework is evaluated on several network tasks using different real-world datasets. The results demonstrate that our framework is effective and efficient compared with state-of-the-art approaches.
Tasks	Community Detection, Network Embedding
Published	2019-11-23
URL	https://arxiv.org/abs/1912.08927v2
PDF	https://arxiv.org/pdf/1912.08927v2.pdf
PWC	https://paperswithcode.com/paper/hyperbolic-multiplex-network-embedding-with
Repo
Framework

Adversarial Learning with Margin-based Triplet Embedding Regularization


Title	Adversarial Learning with Margin-based Triplet Embedding Regularization
Authors	Yaoyao Zhong, Weihong Deng
Abstract	Deep neural networks (DNNs) have achieved great success on a variety of computer vision tasks; however, they are highly vulnerable to adversarial attacks. To address this problem, we propose to improve the local smoothness of the representation space by integrating a margin-based triplet embedding regularization term into the classification objective, so that the obtained model learns to resist adversarial examples. The regularization term consists of a two-step optimization that finds potential perturbations and punishes them by a large margin in an iterative way. Experimental results on MNIST, CASIA-WebFace, VGGFace2 and MS-Celeb-1M reveal that our approach increases the robustness of the network against both feature and label adversarial attacks in simple object classification and deep face recognition.
Tasks	Face Recognition, Object Classification
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09481v1
PDF	https://arxiv.org/pdf/1909.09481v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-learning-with-margin-based
Repo
Framework
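
The margin-based triplet term mentioned in the abstract can be illustrated with the standard triplet margin loss. This is only a minimal sketch of the generic loss, not the paper's regularizer: the Euclidean distance, the margin value of 0.5, and the toy embeddings are all assumptions chosen for illustration, and the paper's two-step search for perturbations is not reproduced here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Generic triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an assumed value here).
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # distance 1.0 from the anchor
far_negative = [3.0, 4.0]    # distance 5.0 from the anchor
near_negative = [0.0, 1.2]   # distance 1.2 from the anchor

loss_far = triplet_margin_loss(anchor, positive, far_negative)
loss_near = triplet_margin_loss(anchor, positive, near_negative)
```

In this toy setting the far negative already satisfies the margin, so it contributes no loss, while the near negative violates the margin and is penalized.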

Dual Encoder-Decoder based Generative Adversarial Networks for Disentangled Facial Representation Learning


Title	Dual Encoder-Decoder based Generative Adversarial Networks for Disentangled Facial Representation Learning
Authors	Cong Hu, Zhen-Hua Feng, Xiao-Jun Wu, Josef Kittler
Abstract	To learn disentangled representations of facial images, we present a Dual Encoder-Decoder based Generative Adversarial Network (DED-GAN). In the proposed method, both the generator and discriminator are designed with deep encoder-decoder architectures as their backbones. To be more specific, the encoder-decoder structured generator is used to learn a pose-disentangled face representation, and the encoder-decoder structured discriminator is tasked with real/fake classification, face reconstruction, identity determination, and face pose estimation. We further improve the proposed network architecture by minimising the additional pixel-wise loss defined by the Wasserstein distance at the output of the discriminator so that the adversarial framework can be better trained. Additionally, we consider face pose variation to be continuous, rather than discrete as in the existing literature, to inject richer pose information into our model. The pose estimation task is formulated as a regression problem, which helps to disentangle identity information from pose variations. The proposed network is evaluated on the tasks of pose-invariant face recognition (PIFR) and face synthesis across poses. An extensive quantitative and qualitative evaluation carried out on several controlled and in-the-wild benchmarking datasets demonstrates the superiority of the proposed DED-GAN method over the state-of-the-art approaches.
Tasks	Face Generation, Face Recognition, Face Reconstruction, Pose Estimation, Representation Learning, Robust Face Recognition
Published	2019-09-19
URL	https://arxiv.org/abs/1909.08797v1
PDF	https://arxiv.org/pdf/1909.08797v1.pdf
PWC	https://paperswithcode.com/paper/dual-encoder-decoder-based-generative
Repo
Framework

Priority to unemployed immigrants? A causal machine learning evaluation of training in Belgium


Title	Priority to unemployed immigrants? A causal machine learning evaluation of training in Belgium
Authors	Bart Cockx, Michael Lechner, Joost Bollens
Abstract	We investigate heterogeneous employment effects of Flemish training programmes. Based on administrative individual data, we analyse programme effects at various aggregation levels using Modified Causal Forests (MCF), a causal machine learning estimator for multiple programmes. While all programmes have positive effects after the lock-in period, we find substantial heterogeneity across programmes and types of unemployed. Simulations show that assigning unemployed to programmes that maximise individual gains as identified in our estimation can considerably improve effectiveness. Simplified rules, such as one giving priority to unemployed with low employability, mostly recent migrants, lead to about half of the gains obtained by more sophisticated rules.
Tasks
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12864v1
PDF	https://arxiv.org/pdf/1912.12864v1.pdf
PWC	https://paperswithcode.com/paper/priority-to-unemployed-immigrants-a-causal
Repo
Framework

Interpretable Segmentation of Medical Free-Text Records Based on Word Embeddings


Title	Interpretable Segmentation of Medical Free-Text Records Based on Word Embeddings
Authors	Adam Gabriel Dobrakowski, Agnieszka Mykowiecka, Małgorzata Marciniak, Wojciech Jaworski, Przemysław Biecek
Abstract	Is it true that patients with similar conditions get similar diagnoses? In this paper we show NLP methods and a unique corpus of documents to validate this claim. We (1) introduce a method for representation of medical visits based on free-text descriptions recorded by doctors, (2) introduce a new method for clustering of patients’ visits and (3) present an application of the proposed method on a corpus of 100,000 visits. With the proposed method we obtained stable and separated segments of visits which were positively validated against final medical diagnoses. We show how the presented algorithm may be used to aid doctors during their practice.
Tasks	Word Embeddings
Published	2019-07-03
URL	https://arxiv.org/abs/1907.04152v2
PDF	https://arxiv.org/pdf/1907.04152v2.pdf
PWC	https://paperswithcode.com/paper/clustering-of-medical-free-text-records-based
Repo
Framework

Learning to Manipulate Deformable Objects without Demonstrations


Title	Learning to Manipulate Deformable Objects without Demonstrations
Authors	Yilin Wu, Wilson Yan, Thanard Kurutach, Lerrel Pinto, Pieter Abbeel
Abstract	In this paper we tackle the problem of deformable object manipulation through model-free visual reinforcement learning (RL). In order to circumvent the sample inefficiency of RL, we propose two key ideas that accelerate learning. First, we propose an iterative pick-place action space that encodes the conditional relationship between picking and placing on deformable objects. The explicit structural encoding enables faster learning under complex object dynamics. Second, instead of jointly learning both the pick and the place locations, we only explicitly learn the placing policy conditioned on random pick points. Then, by selecting the pick point that has Maximal Value under Placing (MVP), we obtain our picking policy. This provides us with an informed picking policy during testing, while using only random pick points during training. Experimentally, this learning framework obtains an order of magnitude faster learning compared to independent action-spaces on our suite of deformable object manipulation tasks with visual RGB observations. Finally, using domain randomization, we transfer our policies to a real PR2 robot for challenging cloth and rope coverage tasks, and demonstrate significant improvements over standard RL techniques on average coverage.
Tasks	Deformable Object Manipulation
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13439v2
PDF	https://arxiv.org/pdf/1910.13439v2.pdf
PWC	https://paperswithcode.com/paper/191013439
Repo
Framework
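
The Maximal Value under Placing (MVP) idea in the abstract, where only a placing policy is learned and the pick point is chosen as the one with the highest value under that policy, can be sketched as follows. This is a minimal illustration under stated assumptions: `select_pick_by_mvp`, `toy_place_policy`, and `toy_q_value` are hypothetical names, and the toy value function stands in for a learned critic that the paper would train from visual observations.

```python
def select_pick_by_mvp(state, pick_candidates, place_policy, q_value):
    """Return the (pick, place, value) triple with the highest value.

    place_policy(state, pick) -> place   : placing policy conditioned on the pick
    q_value(state, pick, place) -> float : value of executing that pick-place pair
    Both callables are assumed to be given (learned elsewhere).
    """
    best_pick, best_place, best_value = None, None, float("-inf")
    for pick in pick_candidates:
        place = place_policy(state, pick)
        value = q_value(state, pick, place)
        if value > best_value:
            best_pick, best_place, best_value = pick, place, value
    return best_pick, best_place, best_value


# Toy 1-D stand-ins (hypothetical): always place at the target position,
# and score a pick by how far the picked point gets moved.
def toy_place_policy(state, pick):
    return state["target"]


def toy_q_value(state, pick, place):
    return abs(place - pick)


state = {"target": 5}
pick, place, value = select_pick_by_mvp(
    state, [1, 4, 6], toy_place_policy, toy_q_value
)
```

Because the pick is derived from the placing value at test time, training only ever needs random pick points, which is the property the abstract highlights.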

Memory Augmented Deep Generative models for Forecasting the Next Shot Location in Tennis


Title	Memory Augmented Deep Generative models for Forecasting the Next Shot Location in Tennis
Authors	Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
Abstract	This paper presents a novel framework for predicting shot location and type in tennis. Inspired by recent neuroscience discoveries, we incorporate neural memory modules to model the episodic and semantic memory components of a tennis player. We propose a Semi-Supervised Generative Adversarial Network architecture that couples these memory models with the automatic feature learning power of deep neural networks and demonstrate methodologies for learning player-level behavioural patterns with the proposed framework. We evaluate the effectiveness of the proposed model on tennis tracking data from the 2012 Australian Tennis Open and exhibit applications of the proposed method in discovering how players adapt their style depending on the match context.
Tasks
Published	2019-01-16
URL	http://arxiv.org/abs/1901.05123v1
PDF	http://arxiv.org/pdf/1901.05123v1.pdf
PWC	https://paperswithcode.com/paper/memory-augmented-deep-generative-models-for
Repo
Framework

End-to-End Single Image Fog Removal using Enhanced Cycle Consistent Adversarial Networks


Title	End-to-End Single Image Fog Removal using Enhanced Cycle Consistent Adversarial Networks
Authors	Wei Liu, Xianxu Hou, Jiang Duan, Guoping Qiu
Abstract	Single image defogging is a classical and challenging problem in computer vision. Existing methods for this problem mainly include handcrafted prior-based methods that rely on the atmospheric degradation model and learning-based approaches that require paired fog-fogfree training images. In practice, however, prior-based methods are prone to failure due to their own limitations, and paired training data are extremely difficult to acquire. Inspired by the principle of the CycleGAN network, we have developed an end-to-end learning system that uses unpaired fog and fogfree training images, adversarial discriminators and cycle consistency losses to automatically construct a fog removal system. Similar to CycleGAN, our system has two transformation paths; one maps fog images to a fogfree image domain and the other maps fogfree images to a fog image domain. Instead of a one-stage mapping, our system uses a two-stage mapping strategy in each transformation path to enhance the effectiveness of fog removal. Furthermore, we make explicit use of prior knowledge in the networks by embedding the atmospheric degradation principle and a sky prior for mapping fogfree images to the fog image domain. In addition, we contribute the first real-world natural fog-fogfree image dataset for defogging research. Our multiple real fog images dataset (MRFID) contains images of 200 natural outdoor scenes. For each scene, there is one clear image and four corresponding foggy images of different fog densities, manually selected from a sequence of images taken by a fixed camera over the course of one year. Qualitative and quantitative comparisons against several state-of-the-art methods on both synthetic and real-world images demonstrate that our approach is effective and performs favorably for recovering a clear image from a foggy image.
Tasks
Published	2019-02-04
URL	http://arxiv.org/abs/1902.01374v1
PDF	http://arxiv.org/pdf/1902.01374v1.pdf
PWC	https://paperswithcode.com/paper/end-to-end-single-image-fog-removal-using
Repo
Framework
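
The atmospheric degradation principle that the abstract embeds in the fogfree-to-fog mapping is commonly written as I(x) = J(x) t(x) + A (1 - t(x)), with transmission t(x) = exp(-beta d(x)). The sketch below applies that widely used formulation to a handful of pixel intensities. It is an illustration only: the function name `add_fog`, the airlight `A = 1.0`, the scattering coefficient `beta = 1.0`, and the toy depths are assumptions, and the paper's networks and sky prior are not reproduced.

```python
import math


def add_fog(clear, depth, airlight=1.0, beta=1.0):
    """Synthesize foggy intensities from clear intensities and scene depth.

    clear    : clear-scene intensities J(x) in [0, 1]
    depth    : scene depth d(x) for each pixel (same length as `clear`)
    airlight : global atmospheric light A (assumed value)
    beta     : scattering coefficient; larger means denser fog (assumed value)
    """
    foggy = []
    for j, d in zip(clear, depth):
        t = math.exp(-beta * d)  # transmission decays with depth
        foggy.append(j * t + airlight * (1.0 - t))
    return foggy


# Three pixels with the same clear intensity at increasing depth.
clear = [0.2, 0.2, 0.2]
depth = [0.0, 1.0, 10.0]
foggy = add_fog(clear, depth)
```

A pixel at zero depth is unchanged, while distant pixels converge towards the airlight, which is why far-away regions of a foggy image look uniformly bright.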

Higher-Order Visualization of Causal Structures in Dynamics Graphs


Title	Higher-Order Visualization of Causal Structures in Dynamics Graphs
Authors	Vincenzo Perri, Ingo Scholtes
Abstract	Graph drawing and visualisation techniques are important tools for the exploratory analysis of complex systems. While these methods are regularly applied to visualise data on complex networks, we increasingly have access to time series data that can be modelled as temporal networks or dynamic graphs. In such dynamic graphs, the temporal ordering of time-stamped edges determines the causal topology of a system, i.e. which nodes can directly and indirectly influence each other via a so-called causal path. While this causal topology is crucial to understand dynamical processes, the role of nodes, or cluster structures, we lack graph drawing techniques that incorporate this information into static visualisations. Addressing this gap, we present a novel dynamic graph drawing algorithm that utilises higher-order graphical models of causal paths in time series data to compute time-aware static graph visualisations. These visualisations combine the simplicity of static graphs with a time-aware layout algorithm that highlights patterns in the causal topology that result from the temporal dynamics of edges.
Tasks	Time Series
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05976v1
PDF	https://arxiv.org/pdf/1908.05976v1.pdf
PWC	https://paperswithcode.com/paper/higher-order-visualization-of-causal
Repo
Framework
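
The notion of a causal path in the abstract, where the temporal ordering of time-stamped edges decides which nodes can influence each other, can be made concrete with a small sketch. It enumerates causal paths of length two only, assuming that two edges (a, b, t1) and (b, c, t2) form such a path when 0 < t2 - t1 <= delta. The function name, the `delta` parameter, and the toy edge list are illustrative assumptions, not the paper's algorithm or its higher-order models.

```python
from collections import defaultdict


def causal_paths_length_two(edges, delta):
    """List time-respecting paths a -> b -> c from time-stamped edges.

    edges : iterable of (source, target, timestamp)
    delta : maximum allowed time gap between consecutive edges (assumed rule)
    """
    edges = list(edges)
    by_source = defaultdict(list)
    for src, dst, t in edges:
        by_source[src].append((dst, t))

    paths = []
    for a, b, t1 in edges:
        for c, t2 in by_source[b]:
            # The second edge must occur after the first, within delta.
            if 0 < t2 - t1 <= delta:
                paths.append((a, b, c))
    return paths


# Toy temporal network: the static graph contains b -> c -> d,
# but c -> d happens before b -> c, so b cannot influence d via c.
edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1), ("b", "d", 9)]
paths = causal_paths_length_two(edges, delta=2)
```

Here only a -> b -> c qualifies, even though a static drawing of the same edges would suggest longer chains of influence.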

FakeSpotter: A Simple yet Robust Baseline for Spotting AI-Synthesized Fake Faces


Title	FakeSpotter: A Simple yet Robust Baseline for Spotting AI-Synthesized Fake Faces
Authors	Run Wang, Felix Juefei-Xu, Lei Ma, Xiaofei Xie, Yihao Huang, Jian Wang, Yang Liu
Abstract	In recent years, generative adversarial networks (GANs) and their variants have achieved unprecedented success in image synthesis. They are widely adopted for synthesizing facial images, which raises potential security concerns as the fakes spread and fuel misinformation. However, robust detectors of these AI-synthesized fake faces are still in their infancy and are not ready to fully tackle this emerging challenge. In this work, we propose a novel approach, named FakeSpotter, based on monitoring neuron behaviors to spot AI-synthesized fake faces. Studies on neuron coverage and interactions have shown that they can serve as testing criteria for deep learning systems, especially when those systems are exposed to adversarial attacks. Here, we conjecture that monitoring neuron behavior can also serve as an asset in detecting fake faces, since layer-by-layer neuron activation patterns may capture more subtle features that are important for the fake detector. Experimental results on detecting four types of fake faces synthesized with state-of-the-art GANs (including the just released StyleGAN2 and the DFDC Dataset) and on evading four perturbation attacks show the effectiveness and robustness of our approach.
Tasks	Face Detection, Face Recognition, Image Generation
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06122v2
PDF	https://arxiv.org/pdf/1909.06122v2.pdf
PWC	https://paperswithcode.com/paper/fakespotter-a-simple-baseline-for-spotting-ai
Repo
Framework

Recognition of Handwritten Digit using Convolutional Neural Network in Python with Tensorflow and Comparison of Performance for Various Hidden Layers


Title	Recognition of Handwritten Digit using Convolutional Neural Network in Python with Tensorflow and Comparison of Performance for Various Hidden Layers
Authors	Fathma Siddique, Shadman Sakib, Md. Abu Bakr Siddique
Abstract	In recent times, with the rise of Artificial Neural Networks (ANNs), deep learning has brought a dramatic twist to the field of machine learning by making it more artificially intelligent. Deep learning is used in a vast range of fields because of its diverse applications, such as surveillance, health, medicine, sports, robotics, and drones. In deep learning, the Convolutional Neural Network (CNN) is at the center of spectacular advances that mix Artificial Neural Networks (ANNs) and up-to-date deep learning strategies. It has been used broadly in pattern recognition, sentence classification, speech recognition, face recognition, text categorization, document analysis, scene recognition, and handwritten digit recognition. The goal of this paper is to observe how the accuracy of a CNN classifying handwritten digits varies with the number of hidden layers and epochs, and to compare the resulting accuracies. For this performance evaluation of CNN, we performed our experiment using the Modified National Institute of Standards and Technology (MNIST) dataset. Further, the network is trained using stochastic gradient descent and the backpropagation algorithm.
Tasks	Face Recognition, Handwritten Digit Recognition, Sentence Classification, Speech Recognition, Text Categorization
Published	2019-09-12
URL	https://arxiv.org/abs/1909.08490v1
PDF	https://arxiv.org/pdf/1909.08490v1.pdf
PWC	https://paperswithcode.com/paper/recognition-of-handwritten-digit-using
Repo
Framework

6D Pose Estimation with Correlation Fusion


Title	6D Pose Estimation with Correlation Fusion
Authors	Yi Cheng, Hongyuan Zhu, Cihan Acar, Wei Jing, Yan Wu, Liyuan Li, Cheston Tan, Joo-Hwee Lim
Abstract	6D object pose estimation is widely applied in robotic tasks such as grasping and manipulation. Prior methods using RGB-only images are vulnerable to heavy occlusion and poor illumination, so it is important to complement them with depth information. However, existing methods using RGB-D data don’t adequately exploit consistent and complementary information between the two modalities. In this paper, we present a novel method that effectively considers the correlation within and across RGB and depth modalities with an attention mechanism to learn discriminative multi-modal features. Then, effective fusion strategies for intra- and inter-correlation modules are explored to ensure efficient information flow between RGB and depth. To the best of our knowledge, this is the first work to explore effective intra- and inter-modality fusion in 6D pose estimation, and experimental results show that our method can help achieve state-of-the-art performance on the LineMOD and YCB-Video datasets as well as benefit the robot grasping task.
Tasks	6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published	2019-09-24
URL	https://arxiv.org/abs/1909.12936v1
PDF	https://arxiv.org/pdf/1909.12936v1.pdf
PWC	https://paperswithcode.com/paper/6d-pose-estimation-with-correlation-fusion
Repo
Framework

Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis


Title	Learning Relationships between Text, Audio, and Video via Deep Canonical Correlation for Multimodal Language Analysis
Authors	Zhongkai Sun, Prathusha Sarma, William Sethares, Yingyu Liang
Abstract	Multimodal language analysis often considers relationships between features based on text and those based on acoustical and visual properties. Text features typically outperform non-text features in sentiment analysis or emotion recognition tasks in part because the text features are derived from advanced language models or word embeddings trained on massive data sources while audio and video features are human-engineered and comparatively underdeveloped. Given that the text, audio, and video are describing the same utterance in different ways, we hypothesize that the multimodal sentiment analysis and emotion recognition can be improved by learning (hidden) correlations between features extracted from the outer product of text and audio (we call this text-based audio) and analogous text-based video. This paper proposes a novel model, the Interaction Canonical Correlation Network (ICCN), to learn such multimodal embeddings. ICCN learns correlations between all three modes via deep canonical correlation analysis (DCCA) and the proposed embeddings are then tested on several benchmark datasets and against other state-of-the-art multimodal embedding algorithms. Empirical results and ablation studies confirm the effectiveness of ICCN in capturing useful information from all three views.
Tasks	Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis, Word Embeddings
Published	2019-11-13
URL	https://arxiv.org/abs/1911.05544v2
PDF	https://arxiv.org/pdf/1911.05544v2.pdf
PWC	https://paperswithcode.com/paper/learning-relationships-between-text-audio-and
Repo
Framework
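
The "text-based audio" representation in the abstract starts from the outer product of a text feature vector and an audio feature vector. A minimal sketch of that first step is shown below. The feature values and the function name `outer_product` are hypothetical, and the later stages described in the abstract (feature extraction from this matrix and deep canonical correlation analysis) are not reproduced.

```python
def outer_product(u, v):
    """Outer product of two vectors: result[i][j] = u[i] * v[j]."""
    return [[a * b for b in v] for a in u]


# Hypothetical utterance-level features.
text_features = [1.0, 2.0]
audio_features = [0.5, -1.0, 3.0]

# Every text dimension is paired with every audio dimension,
# giving a len(text) x len(audio) interaction matrix.
text_based_audio = outer_product(text_features, audio_features)
```

The resulting matrix has one row per text dimension and one column per audio dimension, so each entry captures a pairwise interaction between the two modalities.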

Knowledge forest: a novel model to organize knowledge fragments


Title	Knowledge forest: a novel model to organize knowledge fragments
Authors	Qinghua Zheng, Jun Liu, Hongwei Zeng, Zhaotong Guo, Bei Wu, Bifan Wei
Abstract	With the rapid growth of knowledge, there is a steady trend towards knowledge fragmentization: the knowledge related to a specific topic in a course is scattered across isolated and autonomous knowledge sources. We term the knowledge of a facet of a specific topic a knowledge fragment. The problem of knowledge fragmentization brings two challenges. First, knowledge is scattered across various knowledge sources, so users must expend considerable effort searching for knowledge on the topics they are interested in, which leads to information overload. Second, learning dependencies, which refer to the precedence relationships between topics in the learning process, are concealed by the isolation and autonomy of knowledge sources, which causes learning disorientation. To solve the knowledge fragmentization problem, we propose a novel knowledge organization model, knowledge forest, which consists of facet trees and learning dependencies. Facet trees can organize knowledge fragments with facet hyponymy to alleviate information overload. Learning dependencies can organize disordered topics to cope with learning disorientation. We conduct extensive experiments on three manually constructed datasets from the Data Structure, Data Mining, and Computer Network courses, and the experimental results show that knowledge forest can effectively organize knowledge fragments and alleviate information overload and learning disorientation.
Tasks
Published	2019-12-14
URL	https://arxiv.org/abs/1912.06825v1
PDF	https://arxiv.org/pdf/1912.06825v1.pdf
PWC	https://paperswithcode.com/paper/knowledge-forest-a-novel-model-to-organize
Repo
Framework
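
The two components named in the abstract, facet trees and learning dependencies, can be pictured as a nested mapping per topic plus a prerequisite graph over topics. The sketch below is only one possible in-memory representation, not the authors' model: the topic names, facets, and fragment strings are invented for illustration, and a topological sort (Python 3.9+ `graphlib`) stands in for deriving a learning order from the dependencies.

```python
from graphlib import TopologicalSorter

# Facet trees: topic -> facet -> (sub-facet ->) list of knowledge fragments.
# All names and fragment texts are hypothetical.
facet_trees = {
    "Array": {"definition": ["An array stores items contiguously."]},
    "Stack": {
        "definition": ["A stack is a last-in, first-out container."],
        "operation": {
            "push": ["Push adds an item to the top."],
            "pop": ["Pop removes the top item."],
        },
    },
    "Queue": {"definition": ["A queue is a first-in, first-out container."]},
}

# Learning dependencies: topic -> set of topics to learn first.
learning_dependencies = {"Stack": {"Array"}, "Queue": {"Array"}}


def count_fragments(tree):
    """Count leaf fragments in a (possibly nested) facet tree."""
    if isinstance(tree, list):
        return len(tree)
    return sum(count_fragments(child) for child in tree.values())


# One valid learning order that respects every dependency.
learning_order = list(TopologicalSorter(learning_dependencies).static_order())
stack_fragments = count_fragments(facet_trees["Stack"])
```

Grouping fragments under facets addresses the overload problem described above, while the dependency graph yields an order in which prerequisite topics always come first.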