October 20, 2019

2867 words 14 mins read

Paper Group AWR 271

Towards Automated Customer Support. Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation. Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem. 3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image. Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations. …

Towards Automated Customer Support

Title Towards Automated Customer Support
Authors Momchil Hardalov, Ivan Koychev, Preslav Nakov
Abstract Recent years have seen growing interest in conversational agents, such as chatbots, which are a very good fit for automated customer support because the domain in which they need to operate is narrow. This interest was in part inspired by recent advances in neural machine translation, especially the rise of sequence-to-sequence (seq2seq) and attention-based models such as the Transformer, which have been applied to various other tasks and have opened new research directions in question answering, chatbots, and conversational systems. Still, in many cases, it might be feasible and even preferable to use simple information retrieval techniques. Thus, here we compare three different models: (i) a retrieval model, (ii) a sequence-to-sequence model with attention, and (iii) Transformer. Our experiments with the Twitter Customer Support Dataset, which contains over two million posts from customer support services of twenty major brands, show that the seq2seq model outperforms the other two in terms of semantics and word overlap.
Tasks Information Retrieval, Machine Translation, Question Answering
Published 2018-09-02
URL http://arxiv.org/abs/1809.00303v1
PDF http://arxiv.org/pdf/1809.00303v1.pdf
PWC https://paperswithcode.com/paper/towards-automated-customer-support
Repo https://github.com/mhardalov/customer-support-chatbot
Framework none
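
As a concrete illustration of the retrieval option mentioned in the abstract, here is a minimal sketch of a TF-IDF nearest-neighbour baseline for response selection. It is not the authors' implementation; the question-answer pairs, the bigram range, and the helper retrieve_reply are all illustrative.

```python
# Hypothetical TF-IDF retrieval baseline for response selection: given a
# new customer question, return the agent reply whose original question
# is closest in TF-IDF space.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-in for (question, answer) pairs mined from support logs.
qa_pairs = [
    ("my order has not arrived yet", "Sorry to hear that! Can you DM us your order number?"),
    ("how do i reset my password", "You can reset it from Settings > Account > Reset password."),
    ("the app crashes on startup", "Please try reinstalling the app and let us know if it persists."),
]

questions = [q for q, _ in qa_pairs]
vectorizer = TfidfVectorizer(ngram_range=(1, 2)).fit(questions)
question_vecs = vectorizer.transform(questions)

def retrieve_reply(new_question: str) -> str:
    """Return the stored reply for the most similar past question."""
    sims = cosine_similarity(vectorizer.transform([new_question]), question_vecs)
    return qa_pairs[sims.argmax()][1]

print(retrieve_reply("password reset not working"))
```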

Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation

Title Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation
Authors Jing Li, Rafal K. Mantiuk, Junle Wang, Suiyi Ling, Patrick Le Callet
Abstract In this paper we present a hybrid active sampling strategy for pairwise preference aggregation, which aims at recovering the underlying rating of the test candidates from sparse and noisy pairwise labelling. Our method employs a Bayesian optimization framework and the Bradley-Terry model to construct the utility function and then obtain the Expected Information Gain (EIG) of each pair. For computational efficiency, Gauss-Hermite quadrature is used to estimate the EIG. The proposed hybrid strategy uses either Global Maximum (GM) EIG sampling or Minimum Spanning Tree (MST) sampling in each trial, as determined by the test budget. The proposed method has been validated on both simulated and real-world datasets, where it shows higher preference aggregation ability than state-of-the-art methods.
Tasks
Published 2018-10-20
URL http://arxiv.org/abs/1810.08851v1
PDF http://arxiv.org/pdf/1810.08851v1.pdf
PWC https://paperswithcode.com/paper/hybrid-mst-a-hybrid-active-sampling-strategy
Repo https://github.com/jingnantes/hybrid-mst
Framework none
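
The EIG computation described above can be sketched in a few lines: with a Gaussian belief over the score difference of a pair under the Bradley-Terry model, the information gain of one comparison reduces to a one-dimensional integral handled by Gauss-Hermite quadrature. The posterior parameters (mu, sigma) below are illustrative, not taken from the paper.

```python
# Sketch of the Expected Information Gain (EIG) of one pairwise comparison
# under a Bradley-Terry model, with the integral over the current Gaussian
# belief about the score difference approximated by Gauss-Hermite quadrature.
import numpy as np

def binary_entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log(p) - (1 - p) * np.log(1 - p)

def eig_pair(mu, sigma, n_nodes=30):
    """EIG of comparing items i, j with belief s_i - s_j ~ N(mu, sigma^2)."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    d = mu + np.sqrt(2.0) * sigma * x        # quadrature nodes for N(mu, sigma^2)
    p = 1.0 / (1.0 + np.exp(-d))             # Bradley-Terry win probability
    w = w / np.sqrt(np.pi)                   # weights now sum to 1
    p_bar = np.sum(w * p)                    # predictive win probability
    # Mutual information = H(E[p]) - E[H(p)]
    return binary_entropy(p_bar) - np.sum(w * binary_entropy(p))

# Pairs with uncertain outcomes (mu near 0) and poorly known scores
# (large sigma) carry the most information.
print(eig_pair(mu=0.0, sigma=2.0), eig_pair(mu=3.0, sigma=0.1))
```

In the paper's strategy, GM sampling then picks the single pair with maximum EIG, while MST sampling selects a batch of pairs forming a minimum spanning tree with edge weights given by negative EIG.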

Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem

Title Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem
Authors Savinay Nagendra, Nikhil Podila, Rashmi Ugarakhod, Koshy George
Abstract Designing optimal controllers continues to be challenging as systems become more complex and are inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn from interaction with the environment and provide an optimal control strategy. In this paper, RL is explored in the context of controlling the benchmark cart-pole dynamical system with no prior knowledge of the dynamics. RL algorithms such as temporal-difference learning, policy gradient actor-critic, and value function approximation are compared in this context with the standard LQR solution. Further, we propose a novel approach to integrate RL and swing-up controllers.
Tasks
Published 2018-10-03
URL http://arxiv.org/abs/1810.01940v1
PDF http://arxiv.org/pdf/1810.01940v1.pdf
PWC https://paperswithcode.com/paper/comparison-of-reinforcement-learning
Repo https://github.com/n1shetty/Cart-Pole-Balance-with-Reinforcement-Learning
Framework none
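
For a flavour of the kind of baseline compared in the paper, here is a minimal tabular Q-learning agent (a temporal-difference method) on CartPole with a hand-discretized state space. The bins, hyperparameters, and the pre-0.26 gym API are assumptions, not the authors' setup.

```python
# Tabular Q-learning on CartPole with a discretized state space.
# Assumes the classic gym API: reset() -> obs, step() -> (obs, r, done, info).
import numpy as np
import gym

env = gym.make("CartPole-v1")
bins = [np.linspace(-2.4, 2.4, 9),    # cart position
        np.linspace(-3.0, 3.0, 9),    # cart velocity
        np.linspace(-0.21, 0.21, 9),  # pole angle
        np.linspace(-3.0, 3.0, 9)]    # pole angular velocity

def discretize(obs):
    return tuple(int(np.digitize(o, b)) for o, b in zip(obs, bins))

Q = {}
alpha, gamma, eps = 0.1, 0.99, 0.1

def q(s):
    return Q.setdefault(s, np.zeros(env.action_space.n))

for episode in range(2000):
    s = discretize(env.reset())
    done = False
    while not done:
        a = env.action_space.sample() if np.random.rand() < eps else int(np.argmax(q(s)))
        obs, r, done, _ = env.step(a)
        s2 = discretize(obs)
        # Temporal-difference update toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * (0.0 if done else np.max(q(s2))) - Q[s][a])
        s = s2
```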

3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image

Title 3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image
Authors Priyanka Mandikal, K L Navaneet, Mayank Agarwal, R. Venkatesh Babu
Abstract 3D reconstruction from single view images is an ill-posed problem. Inferring the hidden regions from self-occluded images is both challenging and ambiguous. We propose a two-pronged approach to address these issues. To better incorporate the data prior and generate meaningful reconstructions, we propose 3D-LMNet, a latent embedding matching approach for 3D reconstruction. We first train a 3D point cloud auto-encoder and then learn a mapping from the 2D image to the corresponding learnt embedding. To tackle the issue of uncertainty in the reconstruction, we predict multiple reconstructions that are consistent with the input view. This is achieved by learning a probabilistic latent space with a novel view-specific diversity loss. Thorough quantitative and qualitative analysis is performed to highlight the significance of the proposed approach. We outperform state-of-the-art approaches on the task of single-view 3D reconstruction on both real and synthetic datasets while generating multiple plausible reconstructions, demonstrating the generalizability and utility of our approach.
Tasks 3D Reconstruction, Single-View 3D Reconstruction
Published 2018-07-20
URL http://arxiv.org/abs/1807.07796v2
PDF http://arxiv.org/pdf/1807.07796v2.pdf
PWC https://paperswithcode.com/paper/3d-lmnet-latent-embedding-matching-for
Repo https://github.com/val-iisc/3d-lmnet
Framework tf
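
A toy sketch of the two-stage latent-matching idea, with stand-in architectures: stage 1 trains a point-cloud autoencoder (typically with a Chamfer loss, omitted here), and stage 2 freezes it and regresses the image embedding onto the learnt latent. None of the sizes or modules below come from the paper.

```python
# Two-stage latent matching in the spirit of 3D-LMNet, with toy networks.
import torch
import torch.nn as nn

class PointCloudAE(nn.Module):
    def __init__(self, n_points=1024, latent=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv1d(3, 64, 1), nn.ReLU(), nn.Conv1d(64, latent, 1))
        self.dec = nn.Sequential(nn.Linear(latent, 512), nn.ReLU(), nn.Linear(512, n_points * 3))
        self.n_points = n_points

    def encode(self, pts):                      # pts: (B, 3, N)
        return self.enc(pts).max(dim=2).values  # global max-pool -> (B, latent)

    def forward(self, pts):
        z = self.encode(pts)
        return self.dec(z).view(-1, 3, self.n_points), z

ae = PointCloudAE()  # stage 1 would train this on point clouds alone
img_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU(), nn.Linear(256, 128))

# Stage 2: match the image embedding to the (frozen) point-cloud latent.
img = torch.randn(4, 3, 64, 64)
pts = torch.randn(4, 3, 1024)
with torch.no_grad():
    z_target = ae.encode(pts)
latent_loss = nn.functional.mse_loss(img_enc(img), z_target)
latent_loss.backward()
```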

Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations

Title Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations
Authors Alex Lamb, Jonathan Binas, Anirudh Goyal, Dmitriy Serdyuk, Sandeep Subramanian, Ioannis Mitliagkas, Yoshua Bengio
Abstract Deep networks have achieved impressive results across a variety of important tasks. However, a known weakness is a failure to perform well when evaluated on data which differ from the training distribution, even if these differences are very small, as is the case with adversarial examples. We propose Fortified Networks, a simple transformation of existing networks, which fortifies the hidden layers in a deep network by identifying when the hidden states are off the data manifold and mapping them back to parts of the manifold where the network performs well. Our principal contribution is to show that fortifying these hidden states improves the robustness of deep networks, and our experiments (i) demonstrate improved robustness to standard adversarial attacks in both black-box and white-box threat models; (ii) suggest that our improvements are not primarily due to the gradient masking problem; and (iii) show the advantage of doing this fortification in the hidden layers instead of the input space.
Tasks
Published 2018-04-07
URL http://arxiv.org/abs/1804.02485v1
PDF http://arxiv.org/pdf/1804.02485v1.pdf
PWC https://paperswithcode.com/paper/fortified-networks-improving-the-robustness
Repo https://github.com/jbinas/fortified-networks
Framework tf
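
The fortification idea can be sketched as a denoising autoencoder inserted at a hidden layer: noised hidden states are mapped back toward clean ones, and the reconstruction error joins the task loss. Layer sizes, the noise level, and the loss weight below are illustrative assumptions.

```python
# Sketch of "fortifying" one hidden layer with a denoising autoencoder (DAE).
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
head = nn.Linear(256, 10)
dae = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 256))

x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
h = backbone(x)
h_denoised = dae(h + 0.1 * torch.randn_like(h))   # corrupt, then reconstruct
logits = head(h_denoised)                          # forward pass uses fortified states

task_loss = nn.functional.cross_entropy(logits, y)
rec_loss = nn.functional.mse_loss(h_denoised, h.detach())
(task_loss + 0.5 * rec_loss).backward()            # 0.5 is an illustrative weight
```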

Quasi-Monte Carlo for multivariate distributions via generative neural networks

Title Quasi-Monte Carlo for multivariate distributions via generative neural networks
Authors Marius Hofert, Avinash Prasad, Mu Zhu
Abstract Generative moment matching networks (GMMNs) are introduced as quasi-random number generators (QRNGs) for multivariate models with any underlying copula in order to estimate expectations with variance reduction. So far, QRNGs for multivariate distributions required a careful design, exploiting specific properties (such as conditional distributions) of the implied copula or the underlying quasi-Monte Carlo (QMC) point set, and were only tractable for a small number of models. Utilizing GMMNs allows one to construct QRNGs for a much larger variety of multivariate distributions without such restrictions. Once trained with a pseudo-random sample, these neural networks only require a multivariate standard uniform randomized QMC point set as input and are thus fast in estimating expectations of interest under dependence with variance reduction. Numerical examples are considered to demonstrate the approach, including applications inspired by risk management practice. All results are reproducible with the demo HPZ19 as part of the new R package gnn; select minimal working examples are provided in the demo GMMN_QMC of gnn.
Tasks
Published 2018-11-01
URL https://arxiv.org/abs/1811.00683v2
PDF https://arxiv.org/pdf/1811.00683v2.pdf
PWC https://paperswithcode.com/paper/quasi-random-number-generators-for
Repo https://github.com/jinghuazhao/Caprion
Framework none
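
A hedged sketch of the QRNG usage: a randomized Sobol point set is pushed through a generator network in place of pseudo-random uniforms. The generator here is an untrained stand-in; in the paper it would be a GMMN trained with a maximum mean discrepancy loss (the paper's tooling is the R package gnn, not this Python code).

```python
# Using a generative network as a quasi-random number generator: feed a
# randomized QMC point set through the generator instead of pseudo-random
# uniforms, then average f over the outputs.
import torch
import torch.nn as nn
from scipy.stats import qmc

gen = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))  # untrained stand-in

sobol = qmc.Sobol(d=2, scramble=True)            # randomized QMC point set
u = torch.tensor(sobol.random(1024), dtype=torch.float32)
with torch.no_grad():
    x = gen(u)                                   # quasi-random sample from the model

# QMC analogue of a Monte Carlo estimate of E[f(X)], here f(x) = exp(x1 + x2).
estimate = x.sum(dim=1).exp().mean()
print(estimate.item())
```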

Learning Latent Fractional dynamics with Unknown Unknowns

Title Learning Latent Fractional dynamics with Unknown Unknowns
Authors Gaurav Gupta, Sergio Pequito, Paul Bogdan
Abstract Despite significant effort in understanding complex systems (CS), we lack a theory for the modeling, inference, analysis, and efficient control of time-varying complex networks (TVCNs) in uncertain environments. From brain activity dynamics to the microbiome, and even chromatin interactions within the genome architecture, many such TVCNs exhibit a pronounced spatio-temporal fractality. Moreover, for many TVCNs only limited information (e.g., a few variables) is accessible for modeling, which hampers the capabilities of analytical tools to uncover the true degrees of freedom and infer the CS model, the hidden states, and their parameters. Another fundamental limitation is understanding and unveiling unknown drivers of the dynamics that can sporadically excite the network in ways straightforward modeling cannot capture, owing to our inability to model non-stationary processes. Towards addressing these challenges, in this paper we consider the problem of learning fractional dynamical complex networks under unknown unknowns (i.e., hidden drivers) and partial observability (i.e., only partial data is available). More precisely, we consider a generalized modeling approach for TVCNs consisting of discrete-time fractional dynamical equations and propose an iterative framework to determine the network parameterization and predict the state of the system. We showcase the performance of the proposed framework in the context of task classification using real electroencephalogram data.
Tasks
Published 2018-11-02
URL http://arxiv.org/abs/1811.00703v2
PDF http://arxiv.org/pdf/1811.00703v2.pdf
PWC https://paperswithcode.com/paper/learning-latent-fractional-dynamics-with
Repo https://github.com/gaurav71531/hiddenState
Framework none
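
To make the fractional dynamics concrete, the sketch below simulates a scalar discrete-time fractional-order system via the Grunwald-Letnikov coefficients, which couple the next state to the entire history. The order alpha and coefficient A are illustrative, and the paper's inference procedure for unknown unknowns is not reproduced here.

```python
# Toy simulation of a discrete-time fractional-order linear system.
import numpy as np

def gl_coeffs(alpha, n):
    """psi[j] = (-1)^j * binom(alpha, j), via the stable recursion."""
    psi = np.empty(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 - alpha) / j
    return psi

alpha, A, T = 0.7, 0.95, 200
psi = gl_coeffs(alpha, T + 1)
x = np.zeros(T + 1)
x[0] = 1.0
for k in range(T):
    # Delta^alpha x[k+1] = A x[k]  =>  x[k+1] = A x[k] - sum_{j>=1} psi[j] x[k+1-j]
    history = sum(psi[j] * x[k + 1 - j] for j in range(1, k + 2))
    x[k + 1] = A * x[k] - history
```

The long-memory term `history` is what distinguishes fractional dynamics from an ordinary first-order recursion: the state never depends on the previous step alone.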

Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction

Title Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction
Authors Christina Wadsworth, Francesca Vera, Chris Piech
Abstract Recidivism prediction scores are used across the USA to determine sentencing and supervision for hundreds of thousands of inmates. One such generator of recidivism prediction scores is Northpointe’s Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) score, used in states like California and Florida, which past research has shown to be biased against black inmates according to certain measures of fairness. To counteract this racial bias, we present an adversarially-trained neural network that predicts recidivism and is trained to remove racial bias. When comparing the results of our model to COMPAS, we gain predictive accuracy and get closer to achieving two out of three measures of fairness: parity and equality of odds. Our model can be generalized to any prediction task and demographic attribute. This piece of research contributes an example of scientific replication and simplification in a high-stakes real-world application like recidivism prediction.
Tasks
Published 2018-06-30
URL http://arxiv.org/abs/1807.00199v1
PDF http://arxiv.org/pdf/1807.00199v1.pdf
PWC https://paperswithcode.com/paper/achieving-fairness-through-adversarial
Repo https://github.com/dns43/fairness
Framework tf
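
A minimal sketch of adversarial debiasing of this kind: an adversary tries to recover the protected attribute from the predictor's score, and the predictor is rewarded for defeating it. The architectures, loss weight, and single alternating step are illustrative, not the authors' training recipe, and the data here are random stand-ins.

```python
# One alternating step of adversarial debiasing.
import torch
import torch.nn as nn

predictor = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)

x = torch.randn(64, 10)
y = torch.randint(0, 2, (64, 1)).float()       # outcome label (stand-in)
race = torch.randint(0, 2, (64, 1)).float()    # protected attribute (stand-in)
bce = nn.BCEWithLogitsLoss()

# Adversary step: predict the protected attribute from the model's score.
logits = predictor(x)
adv_loss = bce(adversary(logits.detach()), race)
opt_a.zero_grad(); adv_loss.backward(); opt_a.step()

# Predictor step: fit the label while *fooling* the adversary.
logits = predictor(x)
pred_loss = bce(logits, y) - 0.5 * bce(adversary(logits), race)
opt_p.zero_grad(); pred_loss.backward(); opt_p.step()
```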

Slimmable Neural Networks

Title Slimmable Neural Networks
Authors Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, Thomas Huang
Abstract We present a simple and general method to train a single neural network executable at different widths (numbers of channels in a layer), permitting instant and adaptive accuracy-efficiency trade-offs at runtime. Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization. At runtime, the network can adjust its width on the fly according to on-device benchmarks and resource constraints, rather than downloading and offloading different models. Our trained networks, named slimmable neural networks, achieve ImageNet classification accuracy similar to (and in many cases better than) that of individually trained MobileNet v1, MobileNet v2, ShuffleNet, and ResNet-50 models at the corresponding widths. We also demonstrate better performance of slimmable models compared with individual ones across a wide range of applications, including COCO bounding-box object detection, instance segmentation, and person keypoint detection, without tuning hyper-parameters. Lastly, we visualize and discuss the learned features of slimmable networks. Code and models are available at: https://github.com/JiahuiYu/slimmable_networks
Tasks Instance Segmentation, Keypoint Detection, Object Detection, Semantic Segmentation
Published 2018-12-21
URL http://arxiv.org/abs/1812.08928v1
PDF http://arxiv.org/pdf/1812.08928v1.pdf
PWC https://paperswithcode.com/paper/slimmable-neural-networks
Repo https://github.com/JiahuiYu/slimmable_networks
Framework pytorch
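
The switchable batch normalization trick can be sketched directly: one shared convolution is sliced to several widths, with each width owning its own BatchNorm statistics. The widths and shapes below are illustrative; the official repository above is the authoritative implementation.

```python
# Sketch of a slimmable conv layer with switchable batch normalization.
import torch
import torch.nn as nn

class SlimmableConv(nn.Module):
    def __init__(self, in_ch=3, max_out=64, width_mults=(0.25, 0.5, 1.0)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, max_out, 3, padding=1)
        self.widths = [int(max_out * m) for m in width_mults]
        # One BatchNorm per switch: running stats differ across widths.
        self.bns = nn.ModuleDict({str(w): nn.BatchNorm2d(w) for w in self.widths})

    def forward(self, x, width):
        w = self.conv.weight[:width]            # slice the shared kernels
        b = self.conv.bias[:width]
        out = nn.functional.conv2d(x, w, b, padding=1)
        return self.bns[str(width)](out)

layer = SlimmableConv()
x = torch.randn(2, 3, 32, 32)
for width in layer.widths:                       # same weights, three widths
    print(layer(x, width).shape)
```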

Accurate and Scalable Image Clustering Based On Sparse Representation of Camera Fingerprint

Title Accurate and Scalable Image Clustering Based On Sparse Representation of Camera Fingerprint
Authors Quoc-Tin Phan, Giulia Boato, Francesco G. B. De Natale
Abstract Clustering images according to their acquisition devices is a well-known problem in multimedia forensics, which is typically faced by means of camera Sensor Pattern Noise (SPN). The issue is challenging since SPN is a noise-like signal that is hard to estimate and easy to attenuate or destroy by many factors. Moreover, the high dimensionality of SPN hinders large-scale applications. Existing approaches are typically based on the correlation among SPNs in the pixel domain, which might not be able to capture intrinsic data structure in a union of vector subspaces. In this paper, we propose an accurate clustering framework that exploits linear dependencies among SPNs in their intrinsic vector subspaces. Such dependencies are encoded under sparse representations obtained by solving a LASSO problem with a non-negativity constraint. The proposed framework is highly accurate in estimating the number of clusters and in image association. Moreover, it is scalable in the number of images and robust against double JPEG compression as well as the presence of outliers, showing strong potential for real-world applications. Experimental results on the Dresden and Vision databases show that our proposed framework adapts well to both medium-scale and large-scale contexts, and outperforms state-of-the-art methods.
Tasks Image Clustering
Published 2018-10-18
URL http://arxiv.org/abs/1810.07945v2
PDF http://arxiv.org/pdf/1810.07945v2.pdf
PWC https://paperswithcode.com/paper/accurate-and-scalable-image-clustering-based
Repo https://github.com/quoctin/residual-clustering
Framework none
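
A sketch of the sparse-representation step under stated assumptions: each fingerprint is coded as a non-negative sparse combination of the others via LASSO with a positivity constraint, and the symmetrized codes feed spectral clustering. Random vectors stand in for real SPN fingerprints, and the alpha value and cluster count are illustrative.

```python
# Non-negative LASSO sparse codes -> affinity -> spectral clustering.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
spn = rng.standard_normal((30, 500))            # 30 fingerprints, 500-dim stand-ins

C = np.zeros((30, 30))
for i in range(30):
    dictionary = np.delete(spn, i, axis=0).T    # express x_i using the others
    lasso = Lasso(alpha=0.05, positive=True).fit(dictionary, spn[i])
    C[i, np.arange(30) != i] = lasso.coef_

affinity = np.abs(C) + np.abs(C).T              # symmetrize the sparse codes
labels = SpectralClustering(n_clusters=3, affinity="precomputed").fit_predict(affinity)
print(labels)
```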

Decipherment of Historical Manuscript Images

Title Decipherment of Historical Manuscript Images
Authors Xusen Yin, Nada Aldarrab, Beáta Megyesi, Kevin Knight
Abstract European libraries and archives are filled with enciphered manuscripts from the early modern period. These include military and diplomatic correspondence, records of secret societies, private letters, and so on. Although they are enciphered with classical cryptographic algorithms, their contents are unavailable to working historians. We therefore attack the problem of automatically converting cipher manuscript images into plaintext. We develop unsupervised models for character segmentation, character-image clustering, and decipherment of cluster sequences. We experiment with both pipelined and joint models, and we give empirical results for multiple ciphers.
Tasks Image Clustering
Published 2018-10-09
URL https://arxiv.org/abs/1810.04297v3
PDF https://arxiv.org/pdf/1810.04297v3.pdf
PWC https://paperswithcode.com/paper/decipherment-of-historical-manuscript-images
Repo https://github.com/yinxusen/decipherment-images
Framework none

Single Image Haze Removal using a Generative Adversarial Network

Title Single Image Haze Removal using a Generative Adversarial Network
Authors Bharath Raj N., Venkateswaran N
Abstract Single image haze removal is an under-constrained problem due to the lack of depth information. It is usually performed by estimating the transmission map directly or by using a prior. Other methods use predictive models to estimate the transmission map and perform guided dehazing. In this paper, we propose a conditional GAN that can directly remove haze from an image, without explicitly estimating the transmission map or haze-relevant features. We find that a single module, comprising the generator and the discriminator, is enough. We replaced the classic U-Net with the Tiramisu model, yielding much higher parameter efficiency and performance. We also observe that performance during inference depends on the diversity of the dataset used for training. Experiments on synthetic and real-world hazy images show that our model performs competitively with state-of-the-art models.
Tasks Single Image Haze Removal
Published 2018-10-22
URL http://arxiv.org/abs/1810.09479v1
PDF http://arxiv.org/pdf/1810.09479v1.pdf
PWC https://paperswithcode.com/paper/single-image-haze-removal-using-a-generative
Repo https://github.com/thatbrguy/Dehaze-GAN
Framework tf
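
One hedged training step of a conditional GAN for dehazing, in the single-module spirit described above: the discriminator judges (hazy, clean) against (hazy, generated) pairs, and the generator combines the adversarial term with an L1 term. The tiny networks stand in for the paper's Tiramisu generator, and the loss weights are illustrative.

```python
# One training step of a conditional GAN for image dehazing (toy networks).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
D = nn.Sequential(nn.Conv2d(6, 16, 3, stride=2), nn.ReLU(), nn.Flatten(), nn.LazyLinear(1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

hazy, clean = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)

# Discriminator: real (hazy, clean) pairs vs. fake (hazy, G(hazy)) pairs.
fake = G(hazy)
d_loss = bce(D(torch.cat([hazy, clean], 1)), torch.ones(4, 1)) + \
         bce(D(torch.cat([hazy, fake.detach()], 1)), torch.zeros(4, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator: fool the discriminator and stay close to the clean target.
g_loss = bce(D(torch.cat([hazy, fake], 1)), torch.ones(4, 1)) + \
         100.0 * nn.functional.l1_loss(fake, clean)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```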

On the Spectral Bias of Neural Networks

Title On the Spectral Bias of Neural Networks
Authors Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville
Abstract Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with 100% accuracy. In this work, we present properties of neural networks that complement this aspect of expressivity. By using tools from Fourier analysis, we show that deep ReLU networks are biased towards low frequency functions, meaning that they cannot have local fluctuations without affecting their global behavior. Intuitively, this property is in line with the observation that over-parameterized networks find simple patterns that generalize across data samples. We also investigate how the shape of the data manifold affects expressivity by showing evidence that learning high frequencies gets easier with increasing manifold complexity, and present a theoretical understanding of this behavior. Finally, we study the robustness of the frequency components with respect to parameter perturbation, to develop the intuition that the parameters must be finely tuned to express high frequency functions.
Tasks
Published 2018-06-22
URL https://arxiv.org/abs/1806.08734v3
PDF https://arxiv.org/pdf/1806.08734v3.pdf
PWC https://paperswithcode.com/paper/on-the-spectral-bias-of-neural-networks
Repo https://github.com/nasimrahaman/SpectralBias
Framework none
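
The spectral bias is easy to reproduce in miniature: fit a ReLU MLP to the sum of a low- and a high-frequency sinusoid and track the Fourier amplitudes of the fit; the low frequency is captured first. The network size, target frequencies, and step counts below are illustrative, not the paper's experimental setup.

```python
# Miniature spectral-bias experiment: watch which frequency is learned first.
import numpy as np
import torch
import torch.nn as nn

x = torch.linspace(0, 1, 256).unsqueeze(1)
y = torch.sin(2 * np.pi * 2 * x) + torch.sin(2 * np.pi * 20 * x)

net = nn.Sequential(nn.Linear(1, 128), nn.ReLU(),
                    nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(3001):
    loss = nn.functional.mse_loss(net(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1000 == 0:
        # With 256 samples over [0, 1), rfft bin k corresponds to k cycles.
        spectrum = np.abs(np.fft.rfft(net(x).detach().numpy().ravel()))
        print(step, "freq 2:", round(float(spectrum[2]), 1),
              "freq 20:", round(float(spectrum[20]), 1))
```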

Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation

Title Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation
Authors Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz
Abstract Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem. In this paper, we show that by applying a straightforward modification to an existing photorealistic style transfer algorithm, we achieve state-of-the-art synthetic-to-real domain adaptation results. We conduct extensive experimental validations on four synthetic-to-real tasks for semantic segmentation and object detection, and show that our approach exceeds the performance of any current state-of-the-art GAN-based image translation approach as measured by segmentation and object detection metrics. Furthermore, we offer a distance-based analysis of our method, which shows a dramatic reduction in the Fréchet Inception Distance between the source and target domains, offering a quantitative metric that demonstrates the effectiveness of our algorithm in bridging the synthetic-to-real gap.
Tasks Domain Adaptation, Object Detection, Semantic Segmentation, Style Transfer
Published 2018-07-24
URL http://arxiv.org/abs/1807.09384v1
PDF http://arxiv.org/pdf/1807.09384v1.pdf
PWC https://paperswithcode.com/paper/domain-stylization-a-strong-simple-baseline
Repo https://github.com/smitheric95/domain_stylization
Framework pytorch
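
The paper's modification targets an existing photorealistic style transfer method; as a crude, self-contained stand-in for that step, the sketch below merely matches per-channel color statistics of a synthetic image to a randomly paired real image before training on the stylized result. match_color_stats is a hypothetical helper, not the paper's algorithm.

```python
# Crude stand-in for "stylize synthetic data with real styles":
# per-channel mean/std matching between a synthetic render and a real photo.
import numpy as np

def match_color_stats(synthetic, real):
    """Shift/scale each channel of `synthetic` to the mean/std of `real`."""
    out = np.empty_like(synthetic, dtype=np.float64)
    for c in range(3):
        s = synthetic[..., c].astype(np.float64)
        r = real[..., c].astype(np.float64)
        out[..., c] = (s - s.mean()) / (s.std() + 1e-8) * r.std() + r.mean()
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
synthetic = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)   # stand-in render
real = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)        # stand-in photo
stylized = match_color_stats(synthetic, real)                   # train on these
```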

Lipschitz Continuity in Model-based Reinforcement Learning

Title Lipschitz Continuity in Model-based Reinforcement Learning
Authors Kavosh Asadi, Dipendra Misra, Michael L. Littman
Abstract We examine the impact of learning Lipschitz continuous models in the context of model-based reinforcement learning. We provide a novel bound on multi-step prediction error of Lipschitz models where we quantify the error using the Wasserstein metric. We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz. We conclude with empirical results that show the benefits of controlling the Lipschitz constant of neural-network models.
Tasks
Published 2018-04-19
URL http://arxiv.org/abs/1804.07193v3
PDF http://arxiv.org/pdf/1804.07193v3.pdf
PWC https://paperswithcode.com/paper/lipschitz-continuity-in-model-based
Repo https://github.com/kavosh8/Lip
Framework none
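
The quantity being controlled can be sketched via a standard upper bound: for a feed-forward ReLU network, the product of the layers' spectral norms bounds the global l2 Lipschitz constant, since ReLU is itself 1-Lipschitz. The network below is an illustrative stand-in, not the paper's model.

```python
# Upper-bounding the Lipschitz constant of a ReLU MLP by the product of
# layer spectral norms (largest singular values of the weight matrices).
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 2))

def lipschitz_upper_bound(model):
    bound = 1.0
    for layer in model:
        if isinstance(layer, nn.Linear):
            # Spectral norm = largest singular value; ReLU contributes a factor of 1.
            bound *= torch.linalg.matrix_norm(layer.weight, ord=2).item()
    return bound

print(lipschitz_upper_bound(net))
```

Constraining each factor (for instance, by penalizing or projecting the layer spectral norms during training) is one concrete way to keep the overall model Lipschitz.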