Paper Group AWR 271
Towards Automated Customer Support
Title | Towards Automated Customer Support |
Authors | Momchil Hardalov, Ivan Koychev, Preslav Nakov |
Abstract | Recent years have seen growing interest in conversational agents, such as chatbots, which are a very good fit for automated customer support because the domain in which they need to operate is narrow. This interest was in part inspired by recent advances in neural machine translation, especially the rise of sequence-to-sequence (seq2seq) and attention-based models such as the Transformer, which have been applied to various other tasks and have opened new research directions in question answering, chatbots, and conversational systems. Still, in many cases, it might be feasible and even preferable to use simple information retrieval techniques. Thus, here we compare three different models: (i) a retrieval model, (ii) a sequence-to-sequence model with attention, and (iii) a Transformer. Our experiments with the Twitter Customer Support Dataset, which contains over two million posts from the customer support services of twenty major brands, show that the seq2seq model outperforms the other two in terms of semantics and word overlap. |
Tasks | Information Retrieval, Machine Translation, Question Answering |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00303v1 |
PDF | http://arxiv.org/pdf/1809.00303v1.pdf |
PWC | https://paperswithcode.com/paper/towards-automated-customer-support |
Repo | https://github.com/mhardalov/customer-support-chatbot |
Framework | none |
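
As a concrete illustration of the retrieval baseline discussed in the abstract, the sketch below indexes historical (post, reply) pairs with TF-IDF and answers a new query with the reply of the most similar past post. This is a minimal stand-in, not the paper's exact retrieval model; the toy data and the `respond` helper are assumptions for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy history of (customer post, support reply) pairs -- illustrative only.
history = [
    ("my order never arrived", "Sorry to hear that! Please DM us your order number."),
    ("how do I reset my password", "You can reset it at example.com/reset."),
    ("the app crashes on startup", "Please try reinstalling and DM us your device model."),
]
posts, replies = zip(*history)

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
post_matrix = vectorizer.fit_transform(posts)

def respond(query: str) -> str:
    """Return the reply attached to the most similar historical post."""
    sims = cosine_similarity(vectorizer.transform([query]), post_matrix)
    return replies[sims.argmax()]

print(respond("i forgot my password"))  # -> the password-reset reply
```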
Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation
Title | Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation |
Authors | Jing Li, Rafal K. Mantiuk, Junle Wang, Suiyi Ling, Patrick Le Callet |
Abstract | In this paper we present a hybrid active sampling strategy for pairwise preference aggregation, which aims at recovering the underlying ratings of the test candidates from sparse and noisy pairwise labelling. Our method employs a Bayesian optimization framework and the Bradley-Terry model to construct the utility function and obtain the Expected Information Gain (EIG) of each pair. For computational efficiency, Gauss-Hermite quadrature is used to estimate the EIG. The proposed hybrid strategy uses either Global Maximum (GM) EIG sampling or Minimum Spanning Tree (MST) sampling in each trial, as determined by the test budget. The proposed method has been validated on both simulated and real-world datasets, where it shows higher preference aggregation ability than state-of-the-art methods. |
Tasks | |
Published | 2018-10-20 |
URL | http://arxiv.org/abs/1810.08851v1 |
PDF | http://arxiv.org/pdf/1810.08851v1.pdf |
PWC | https://paperswithcode.com/paper/hybrid-mst-a-hybrid-active-sampling-strategy |
Repo | https://github.com/jingnantes/hybrid-mst |
Framework | none |
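
To make the sampling criterion concrete, here is a minimal sketch of the expected information gain of a single comparison, assuming a Gaussian posterior on the Bradley-Terry score difference and using Gauss-Hermite quadrature as the abstract describes. The simplified mutual-information form of the EIG and all constants below are assumptions, not the paper's exact utility function.

```python
import numpy as np

def pair_eig(mu: float, sigma: float, n_nodes: int = 30) -> float:
    """EIG (mutual information) of one comparison when the latent score
    difference d ~ N(mu, sigma^2) and P(i beats j | d) = sigmoid(d),
    estimated with Gauss-Hermite quadrature."""
    t, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes/weights for e^{-x^2}
    d = mu + np.sqrt(2.0) * sigma * t                 # change of variables to N(mu, sigma^2)
    p = 1.0 / (1.0 + np.exp(-d))
    h = lambda q: -q * np.log(q + 1e-12) - (1 - q) * np.log(1 - q + 1e-12)  # binary entropy
    p_bar = np.sum(w * p) / np.sqrt(np.pi)            # predictive win probability
    exp_h = np.sum(w * h(p)) / np.sqrt(np.pi)         # expected conditional entropy
    return h(p_bar) - exp_h                           # EIG = H(E[p]) - E[H(p)]

# An uncertain, evenly matched pair is the most informative to compare.
print(pair_eig(0.0, 2.0), pair_eig(3.0, 0.3))
```

Per the abstract, each trial would then select either the single pair maximizing this quantity (GM) or a batch of pairs along a spanning tree built on the EIG values (MST), depending on the remaining test budget.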
Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem
Title | Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem |
Authors | Savinay Nagendra, Nikhil Podila, Rashmi Ugarakhod, Koshy George |
Abstract | Designing optimal controllers continues to be challenging as systems become more complex and inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn from interaction with the environment and provide an optimal control strategy. In this paper, RL is explored in the context of controlling the benchmark cart-pole dynamical system with no prior knowledge of its dynamics. RL algorithms such as temporal-difference learning, policy-gradient actor-critic, and value-function approximation are compared in this context with the standard LQR solution. Further, we propose a novel approach to integrate RL and swing-up controllers. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01940v1 |
PDF | http://arxiv.org/pdf/1810.01940v1.pdf |
PWC | https://paperswithcode.com/paper/comparison-of-reinforcement-learning |
Repo | https://github.com/n1shetty/Cart-Pole-Balance-with-Reinforcement-Learning |
Framework | none |
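
For a feel of the simplest approach in the comparison, below is a tabular temporal-difference (Q-learning) sketch on a discretized cart-pole state space. The bin edges and hyper-parameters are assumptions, and the snippet targets the classic `gym` API (where `reset()` returns an observation and `step()` returns four values); newer Gym/Gymnasium versions differ.

```python
import gym
import numpy as np

env = gym.make("CartPole-v0")
bins = [np.linspace(-2.4, 2.4, 9), np.linspace(-3, 3, 9),
        np.linspace(-0.21, 0.21, 9), np.linspace(-3, 3, 9)]

def discretize(obs):
    return tuple(int(np.digitize(x, b)) for x, b in zip(obs, bins))

Q = {}
alpha, gamma, eps = 0.1, 0.99, 0.1
for episode in range(2000):
    s = discretize(env.reset())
    done = False
    while not done:
        q = Q.setdefault(s, np.zeros(env.action_space.n))
        a = env.action_space.sample() if np.random.rand() < eps else int(q.argmax())
        obs, r, done, _ = env.step(a)
        s2 = discretize(obs)
        q2 = Q.setdefault(s2, np.zeros(env.action_space.n))
        # Temporal-difference update toward the one-step bootstrapped target.
        q[a] += alpha * (r + gamma * (0 if done else q2.max()) - q[a])
        s = s2
```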
3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image
Title | 3D-LMNet: Latent Embedding Matching for Accurate and Diverse 3D Point Cloud Reconstruction from a Single Image |
Authors | Priyanka Mandikal, K L Navaneet, Mayank Agarwal, R. Venkatesh Babu |
Abstract | 3D reconstruction from single view images is an ill-posed problem. Inferring the hidden regions from self-occluded images is both challenging and ambiguous. We propose a two-pronged approach to address these issues. To better incorporate the data prior and generate meaningful reconstructions, we propose 3D-LMNet, a latent embedding matching approach for 3D reconstruction. We first train a 3D point cloud auto-encoder and then learn a mapping from the 2D image to the corresponding learnt embedding. To tackle the issue of uncertainty in the reconstruction, we predict multiple reconstructions that are consistent with the input view. This is achieved by learning a probabilistic latent space with a novel view-specific diversity loss. Thorough quantitative and qualitative analysis is performed to highlight the significance of the proposed approach. We outperform state-of-the-art approaches on the task of single-view 3D reconstruction on both real and synthetic datasets while generating multiple plausible reconstructions, demonstrating the generalizability and utility of our approach. |
Tasks | 3D Reconstruction, Single-View 3D Reconstruction |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07796v2 |
PDF | http://arxiv.org/pdf/1807.07796v2.pdf |
PWC | https://paperswithcode.com/paper/3d-lmnet-latent-embedding-matching-for |
Repo | https://github.com/val-iisc/3d-lmnet |
Framework | tf |
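
A minimal sketch of the two ingredients the abstract names: a Chamfer reconstruction loss for the point-cloud autoencoder, and latent matching for the image encoder. The exact losses and weightings in the paper may differ; this is an assumed simplified form.

```python
import torch

def chamfer_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer distance between point clouds a: (B,N,3), b: (B,M,3).
    (A common choice for the autoencoder's reconstruction loss; the paper's
    exact formulation may differ.)"""
    d = torch.cdist(a, b)                        # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()

# Stage 1: train a point-cloud encoder E_p / decoder D with chamfer_distance.
# Stage 2: freeze them and train an image encoder E_i to match the latent code,
#   loss = ||E_i(image) - E_p(point_cloud)||^2,
# so that at test time D(E_i(image)) reconstructs a plausible cloud.
```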
Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations
Title | Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations |
Authors | Alex Lamb, Jonathan Binas, Anirudh Goyal, Dmitriy Serdyuk, Sandeep Subramanian, Ioannis Mitliagkas, Yoshua Bengio |
Abstract | Deep networks have achieved impressive results across a variety of important tasks. However, a known weakness is their failure to perform well when evaluated on data which differ from the training distribution, even when these differences are very small, as is the case with adversarial examples. We propose Fortified Networks, a simple transformation of existing networks that fortifies the hidden layers of a deep network by identifying when the hidden states are off the data manifold and mapping them back to the parts of the manifold where the network performs well. Our principal contribution is to show that fortifying these hidden states improves the robustness of deep networks, and our experiments (i) demonstrate improved robustness to standard adversarial attacks in both black-box and white-box threat models; (ii) suggest that our improvements are not primarily due to the gradient masking problem; and (iii) show the advantage of performing this fortification in the hidden layers instead of the input space. |
Tasks | |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02485v1 |
PDF | http://arxiv.org/pdf/1804.02485v1.pdf |
PWC | https://paperswithcode.com/paper/fortified-networks-improving-the-robustness |
Repo | https://github.com/jbinas/fortified-networks |
Framework | tf |
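
The sketch below shows one plausible reading of a "fortified" block: a small denoising autoencoder inserted at a hidden layer, whose reconstruction loss pulls off-manifold hidden states back toward the data manifold. Layer sizes, the noise level, and the loss weighting `lam` are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FortifiedLayer(nn.Module):
    """A denoising autoencoder wrapped around a hidden representation
    (a simplified sketch of one fortified block; sizes/noise are assumptions)."""
    def __init__(self, dim: int, hidden: int = 64, noise_std: float = 0.1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, dim)
        self.noise_std = noise_std

    def forward(self, h):
        h_noisy = h + self.noise_std * torch.randn_like(h) if self.training else h
        h_rec = self.dec(self.enc(h_noisy))
        # Auxiliary reconstruction loss pulls off-manifold states back toward h.
        rec_loss = ((h_rec - h.detach()) ** 2).mean()
        return h_rec, rec_loss

# In the host network: h, rec = fortified(h); total_loss = task_loss + lam * rec
```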
Quasi-Monte Carlo for multivariate distributions via generative neural networks
Title | Quasi-Monte Carlo for multivariate distributions via generative neural networks |
Authors | Marius Hofert, Avinash Prasad, Mu Zhu |
Abstract | Generative moment matching networks (GMMNs) are introduced as quasi-random number generators (QRNGs) for multivariate models with any underlying copula in order to estimate expectations with variance reduction. So far, QRNGs for multivariate distributions required a careful design, exploiting specific properties (such as conditional distributions) of the implied copula or the underlying quasi-Monte Carlo (QMC) point set, and were only tractable for a small number of models. Utilizing GMMNs allows one to construct QRNGs for a much larger variety of multivariate distributions without such restrictions. Once trained with a pseudo-random sample, these neural networks only require a multivariate standard uniform randomized QMC point set as input and are thus fast in estimating expectations of interest under dependence with variance reduction. Numerical examples are considered to demonstrate the approach, including applications inspired by risk management practice. All results are reproducible with the demo HPZ19, part of the new R package gnn; select minimal working examples are provided in the demo GMMN_QMC of gnn. |
Tasks | |
Published | 2018-11-01 |
URL | https://arxiv.org/abs/1811.00683v2 |
PDF | https://arxiv.org/pdf/1811.00683v2.pdf |
PWC | https://paperswithcode.com/paper/quasi-random-number-generators-for |
Repo | https://github.com/jinghuazhao/Caprion |
Framework | none |
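
The key mechanic — train a generative moment matching network with an MMD loss, then swap a randomized QMC point set in as the generator's input — can be sketched as follows. The paper's implementation is the R package gnn; this PyTorch/SciPy rendering, including the kernel bandwidth and network shape, is an assumption.

```python
import torch
from scipy.stats import qmc

def mmd2(x: torch.Tensor, y: torch.Tensor, bandwidth: float = 1.0) -> torch.Tensor:
    """Biased squared MMD with a Gaussian kernel (the GMMN training criterion)."""
    k = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

gen = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
# ... train gen by minimizing mmd2(gen(U_pseudo), data) over pseudo-random U_pseudo ...

# At evaluation time, swap a randomized QMC point set in as the generator input:
sobol = qmc.Sobol(d=2, scramble=True)
u_qmc = torch.as_tensor(sobol.random(2 ** 10), dtype=torch.float32)
qrn_sample = gen(u_qmc)   # quasi-random sample from the learned dependence model
```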
Learning Latent Fractional dynamics with Unknown Unknowns
Title | Learning Latent Fractional dynamics with Unknown Unknowns |
Authors | Gaurav Gupta, Sergio Pequito, Paul Bogdan |
Abstract | Despite significant effort in understanding complex systems (CS), we lack a theory for the modeling, inference, analysis, and efficient control of time-varying complex networks (TVCNs) in uncertain environments. From brain activity dynamics to the microbiome, and even chromatin interactions within the genome architecture, many such TVCNs exhibit a pronounced spatio-temporal fractality. Moreover, for many TVCNs only limited information (e.g., a few variables) is accessible for modeling, which hampers the capability of analytical tools to uncover the true degrees of freedom and infer the CS model, the hidden states, and their parameters. Another fundamental limitation is understanding and unveiling unknown drivers of the dynamics that can sporadically excite the network in ways straightforward modeling cannot capture, due to our inability to model non-stationary processes. Towards addressing these challenges, in this paper we consider the problem of learning fractional dynamical complex networks under unknown unknowns (i.e., hidden drivers) and partial observability (i.e., only partial data is available). More precisely, we consider a generalized modeling approach for TVCNs consisting of discrete-time fractional dynamical equations and propose an iterative framework to determine the network parameterization and predict the state of the system. We showcase the performance of the proposed framework in the context of task classification using real electroencephalogram data. |
Tasks | |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00703v2 |
PDF | http://arxiv.org/pdf/1811.00703v2.pdf |
PWC | https://paperswithcode.com/paper/learning-latent-fractional-dynamics-with |
Repo | https://github.com/gaurav71531/hiddenState |
Framework | none |
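
To illustrate the model class (not the paper's estimation of hidden states and unknown inputs), the sketch below simulates a discrete-time fractional dynamical system using Grünwald-Letnikov coefficients; the system matrix and fractional order are toy values.

```python
import numpy as np

def gl_coeffs(alpha: float, n: int) -> np.ndarray:
    """Grünwald-Letnikov coefficients psi_j = (-1)^j C(alpha, j), j = 0..n."""
    c = np.empty(n + 1)
    c[0] = 1.0
    for j in range(1, n + 1):
        c[j] = c[j - 1] * (j - 1 - alpha) / j
    return c

def simulate(A: np.ndarray, alpha: float, x0: np.ndarray, steps: int) -> np.ndarray:
    """Simulate Delta^alpha x[k+1] = A x[k] (a toy forward model; the paper
    additionally estimates hidden states and unknown inputs, omitted here)."""
    psi = gl_coeffs(alpha, steps + 1)
    xs = [x0]
    for k in range(steps):
        # Expand the fractional difference:
        # x[k+1] = A x[k] - sum_{j>=1} psi_j x[k+1-j]
        hist = sum(psi[j] * xs[k + 1 - j] for j in range(1, k + 2))
        xs.append(A @ xs[k] - hist)
    return np.array(xs)

X = simulate(np.array([[0.4, 0.1], [0.0, 0.5]]), alpha=0.7, x0=np.ones(2), steps=50)
```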
Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction
Title | Achieving Fairness through Adversarial Learning: an Application to Recidivism Prediction |
Authors | Christina Wadsworth, Francesca Vera, Chris Piech |
Abstract | Recidivism prediction scores are used across the USA to determine sentencing and supervision for hundreds of thousands of inmates. One such generator of recidivism prediction scores is Northpointe’s Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) score, used in states like California and Florida, which past research has shown to be biased against black inmates according to certain measures of fairness. To counteract this racial bias, we present an adversarially-trained neural network that predicts recidivism and is trained to remove racial bias. When comparing the results of our model to COMPAS, we gain predictive accuracy and get closer to achieving two out of three measures of fairness: parity and equality of odds. Our model can be generalized to any prediction and demographic. This piece of research contributes an example of scientific replication and simplification in a high-stakes real-world application like recidivism prediction. |
Tasks | |
Published | 2018-06-30 |
URL | http://arxiv.org/abs/1807.00199v1 |
PDF | http://arxiv.org/pdf/1807.00199v1.pdf |
PWC | https://paperswithcode.com/paper/achieving-fairness-through-adversarial |
Repo | https://github.com/dns43/fairness |
Framework | tf |
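
A hedged sketch of the adversarial training loop the abstract describes: a predictor outputs a recidivism logit, an adversary tries to recover the protected attribute from that logit (and the true label, which targets equality of odds), and the predictor is trained to fool it. Network shapes, the trade-off weight `lam`, and input dimensions are assumptions.

```python
import torch
import torch.nn as nn

predictor = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
adversary = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # fairness/accuracy trade-off (an assumed value)

def train_step(x, y, z):
    """x: (B,10) features; y, z: (B,1) float labels in {0,1}
    (recidivism outcome and protected attribute)."""
    # 1) Adversary learns to recover z from the predictor's output and y
    #    (conditioning on y targets equality of odds rather than plain parity).
    logit = predictor(x).detach()
    opt_a.zero_grad()
    bce(adversary(torch.cat([logit, y], dim=1)), z).backward()
    opt_a.step()
    # 2) Predictor learns to predict y while *maximizing* the adversary's loss.
    logit = predictor(x)
    loss = bce(logit, y) - lam * bce(adversary(torch.cat([logit, y], dim=1)), z)
    opt_p.zero_grad(); loss.backward(); opt_p.step()
```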
Slimmable Neural Networks
Title | Slimmable Neural Networks |
Authors | Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, Thomas Huang |
Abstract | We present a simple and general method to train a single neural network executable at different widths (numbers of channels in a layer), permitting instant and adaptive accuracy-efficiency trade-offs at runtime. Instead of training individual networks with different width configurations, we train a shared network with switchable batch normalization. At runtime, the network can adjust its width on the fly according to on-device benchmarks and resource constraints, rather than downloading and offloading different models. Our trained networks, named slimmable neural networks, achieve ImageNet classification accuracy similar to (and in many cases better than) that of individually trained MobileNet v1, MobileNet v2, ShuffleNet, and ResNet-50 models at the corresponding widths. We also demonstrate better performance of slimmable models compared with individual ones across a wide range of applications, including COCO bounding-box object detection, instance segmentation, and person keypoint detection, without tuning hyper-parameters. Lastly, we visualize and discuss the learned features of slimmable networks. Code and models are available at: https://github.com/JiahuiYu/slimmable_networks |
Tasks | Instance Segmentation, Keypoint Detection, Object Detection, Semantic Segmentation |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1812.08928v1 |
PDF | http://arxiv.org/pdf/1812.08928v1.pdf |
PWC | https://paperswithcode.com/paper/slimmable-neural-networks |
Repo | https://github.com/JiahuiYu/slimmable_networks |
Framework | pytorch |
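
The core trick, switchable batch normalization, is easy to sketch for a single layer: one shared convolution kernel sliced to the active width, with a private BatchNorm per width. This is a one-layer simplification; the released code (linked above) generalizes it across whole networks, including the input-channel slicing that deeper layers need.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableConvBN(nn.Module):
    """One shared convolution whose output channels are sliced to the active
    width, with a private BatchNorm per width (switchable batch normalization).
    A single-layer sketch under assumed widths and kernel size."""
    def __init__(self, in_ch: int, max_out: int, widths=(0.25, 0.5, 0.75, 1.0)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, max_out, 3, padding=1)
        self.bns = nn.ModuleList(nn.BatchNorm2d(int(max_out * w)) for w in widths)

    def forward(self, x, width_idx: int):
        out_ch = self.bns[width_idx].num_features
        w = self.conv.weight[:out_ch]            # slice the shared kernel
        b = self.conv.bias[:out_ch]
        y = F.conv2d(x, w, b, padding=1)
        return F.relu(self.bns[width_idx](y))    # width-specific statistics

layer = SlimmableConvBN(3, 64)
x = torch.randn(2, 3, 32, 32)
for i in range(4):                                # train each width per batch
    y = layer(x, i)
```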
Accurate and Scalable Image Clustering Based On Sparse Representation of Camera Fingerprint
Title | Accurate and Scalable Image Clustering Based On Sparse Representation of Camera Fingerprint |
Authors | Quoc-Tin Phan, Giulia Boato, Francesco G. B. De Natale |
Abstract | Clustering images according to their acquisition devices is a well-known problem in multimedia forensics, typically addressed by means of camera Sensor Pattern Noise (SPN). The problem is challenging because SPN is a noise-like signal, hard to estimate and easy to attenuate or destroy. Moreover, the high dimensionality of SPN hinders large-scale applications. Existing approaches are typically based on the correlation among SPNs in the pixel domain, which might not capture the intrinsic data structure in a union of vector subspaces. In this paper, we propose an accurate clustering framework that exploits linear dependencies among SPNs in their intrinsic vector subspaces. Such dependencies are encoded in sparse representations obtained by solving a LASSO problem with a non-negativity constraint. The proposed framework is highly accurate in estimating the number of clusters and in associating images with devices. Moreover, it scales with the number of images and is robust against double JPEG compression as well as the presence of outliers, showing strong potential for real-world applications. Experimental results on the Dresden and Vision databases show that our framework adapts well to both medium-scale and large-scale contexts and outperforms state-of-the-art methods. |
Tasks | Image Clustering |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.07945v2 |
PDF | http://arxiv.org/pdf/1810.07945v2.pdf |
PWC | https://paperswithcode.com/paper/accurate-and-scalable-image-clustering-based |
Repo | https://github.com/quoctin/residual-clustering |
Framework | none |
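
A small sketch of the framework's two stages: express each fingerprint as a sparse non-negative combination of the others (a non-negative LASSO), then cluster the symmetrized coefficient magnitudes spectrally. The regularization strength, toy dimensions, and use of scikit-learn solvers are assumptions; the paper's solver and scale differ.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def sparse_affinity(X: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Self-representation: solve x_i ~ X_{-i} c_i with a non-negative LASSO,
    then symmetrize |C| into an affinity matrix (a simplified sketch)."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        lasso = Lasso(alpha=alpha, positive=True, max_iter=5000)
        lasso.fit(X[idx].T, X[i])            # columns are the other fingerprints
        C[i, idx] = lasso.coef_
    return np.abs(C + C.T)

# Toy fingerprints: rows are (flattened, dimension-reduced) SPN estimates.
X = np.random.randn(30, 200)
labels = SpectralClustering(n_clusters=3, affinity="precomputed").fit_predict(
    sparse_affinity(X))
```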
Decipherment of Historical Manuscript Images
Title | Decipherment of Historical Manuscript Images |
Authors | Xusen Yin, Nada Aldarrab, Beáta Megyesi, Kevin Knight |
Abstract | European libraries and archives are filled with enciphered manuscripts from the early modern period. These include military and diplomatic correspondence, records of secret societies, private letters, and so on. Although they are enciphered with classical cryptographic algorithms, their contents are unavailable to working historians. We therefore attack the problem of automatically converting cipher manuscript images into plaintext. We develop unsupervised models for character segmentation, character-image clustering, and decipherment of cluster sequences. We experiment with both pipelined and joint models, and we give empirical results for multiple ciphers. |
Tasks | Image Clustering |
Published | 2018-10-09 |
URL | https://arxiv.org/abs/1810.04297v3 |
PDF | https://arxiv.org/pdf/1810.04297v3.pdf |
PWC | https://paperswithcode.com/paper/decipherment-of-historical-manuscript-images |
Repo | https://github.com/yinxusen/decipherment-images |
Framework | none |
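
As a toy stand-in for the final decipherment stage, the sketch below maps cluster IDs to letters purely by frequency rank; this only makes sense for simple substitution ciphers and long texts, and the paper's segmentation, clustering, and decipherment models are far richer.

```python
from collections import Counter

# English letters by approximate corpus frequency, most frequent first.
ENGLISH_BY_FREQ = "etaoinshrdlcumwfgypbvkjxqz"

def frequency_decipher(cluster_seq):
    """Map cluster IDs to letters by frequency rank -- a toy stand-in for the
    paper's decipherment models, workable only for simple substitution
    ciphers and long texts."""
    ranked = [c for c, _ in Counter(cluster_seq).most_common()]
    table = {c: ENGLISH_BY_FREQ[r] for r, c in enumerate(ranked)}
    return "".join(table[c] for c in cluster_seq)

# cluster_seq would come from character segmentation + image clustering, e.g.:
print(frequency_decipher([3, 1, 4, 4, 2, 0, 1, 3, 2]))
```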
Single Image Haze Removal using a Generative Adversarial Network
Title | Single Image Haze Removal using a Generative Adversarial Network |
Authors | Bharath Raj N., Venkateswaran N |
Abstract | Single image haze removal is an under-constrained problem due to the lack of depth information. It is usually performed by estimating the transmission map directly or by using a prior. Other methods use predictive models to estimate the transmission map and perform guided dehazing. In this paper, we propose a conditional GAN that can directly remove haze from an image without explicitly estimating a transmission map or haze-relevant features. We find that only one module, comprising the generator and the discriminator, is enough. We replaced the classic U-Net with the Tiramisu model, yielding much higher parameter efficiency and performance. We also observe that performance during inference depends on the diversity of the dataset used for training. Experiments on synthetic and real-world hazy images show that our model performs competitively with state-of-the-art models. |
Tasks | Single Image Haze Removal |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09479v1 |
PDF | http://arxiv.org/pdf/1810.09479v1.pdf |
PWC | https://paperswithcode.com/paper/single-image-haze-removal-using-a-generative |
Repo | https://github.com/thatbrguy/Dehaze-GAN |
Framework | tf |
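
One plausible rendering of the training objective: a standard conditional-GAN step in which the discriminator judges (hazy, image) pairs and the generator receives an adversarial term plus an L1 reconstruction term. The exact losses, weights, and the Tiramisu generator architecture are not reproduced here; `G`, `D`, and the optimizers are assumed to be provided.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def gan_step(G, D, opt_g, opt_d, hazy, clear, lam=100.0):
    """One conditional-GAN step: D sees (hazy, image) pairs; G gets an
    adversarial term plus an L1 term (a standard cGAN recipe; the paper's
    exact losses/weights may differ)."""
    fake = G(hazy)
    # --- discriminator: real pairs -> 1, fake pairs -> 0 ---
    d_real = D(torch.cat([hazy, clear], dim=1))
    d_fake = D(torch.cat([hazy, fake.detach()], dim=1))
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # --- generator: fool D and stay close to the ground-truth clear image ---
    d_fake = D(torch.cat([hazy, fake], dim=1))
    g_loss = bce(d_fake, torch.ones_like(d_fake)) + lam * l1(fake, clear)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```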
On the Spectral Bias of Neural Networks
Title | On the Spectral Bias of Neural Networks |
Authors | Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville |
Abstract | Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy. In this work, we present properties of neural networks that complement this aspect of expressivity. By using tools from Fourier analysis, we show that deep ReLU networks are biased towards low frequency functions, meaning that they cannot have local fluctuations without affecting their global behavior. Intuitively, this property is in line with the observation that over-parameterized networks find simple patterns that generalize across data samples. We also investigate how the shape of the data manifold affects expressivity by showing evidence that learning high frequencies gets \emph{easier} with increasing manifold complexity, and present a theoretical understanding of this behavior. Finally, we study the robustness of the frequency components with respect to parameter perturbation, to develop the intuition that the parameters must be finely tuned to express high frequency functions. |
Tasks | |
Published | 2018-06-22 |
URL | https://arxiv.org/abs/1806.08734v3 |
PDF | https://arxiv.org/pdf/1806.08734v3.pdf |
PWC | https://paperswithcode.com/paper/on-the-spectral-bias-of-neural-networks |
Repo | https://github.com/nasimrahaman/SpectralBias |
Framework | none |
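
The paper's central observation is easy to reproduce qualitatively: fit a ReLU MLP to a signal with equal-amplitude low- and high-frequency components and watch the residual spectrum. The architecture and training schedule below are arbitrary assumptions; the low-frequency component should vanish from the residual long before the high-frequency one.

```python
import numpy as np
import torch

x = torch.arange(512).unsqueeze(1) / 512.0
# Target with equal-amplitude low (k=2) and high (k=50) frequency components.
y = torch.sin(2 * np.pi * 2 * x) + torch.sin(2 * np.pi * 50 * x)

net = torch.nn.Sequential(torch.nn.Linear(1, 256), torch.nn.ReLU(),
                          torch.nn.Linear(256, 256), torch.nn.ReLU(),
                          torch.nn.Linear(256, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(3001):
    loss = ((net(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1000 == 0:
        spec = np.abs(np.fft.rfft((net(x) - y).detach().squeeze().numpy()))
        # The k=2 residual shrinks quickly; k=50 lingers: spectral bias.
        print(step, f"residual @k=2: {spec[2]:.2f}", f"@k=50: {spec[50]:.2f}")
```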
Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation
Title | Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation |
Authors | Aysegul Dundar, Ming-Yu Liu, Ting-Chun Wang, John Zedlewski, Jan Kautz |
Abstract | Deep neural networks have largely failed to effectively utilize synthetic data when applied to real images due to the covariate shift problem. In this paper, we show that by applying a straightforward modification to an existing photorealistic style transfer algorithm, we achieve state-of-the-art synthetic-to-real domain adaptation results. We conduct extensive experimental validations on four synthetic-to-real tasks for semantic segmentation and object detection, and show that our approach exceeds the performance of any current state-of-the-art GAN-based image translation approach as measured by segmentation and object detection metrics. Furthermore, we offer a distance-based analysis of our method, which shows a dramatic reduction in the Fréchet Inception distance between the source and target domains, offering a quantitative metric that demonstrates the effectiveness of our algorithm in bridging the synthetic-to-real gap. |
Tasks | Domain Adaptation, Object Detection, Semantic Segmentation, Style Transfer |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.09384v1 |
PDF | http://arxiv.org/pdf/1807.09384v1.pdf |
PWC | https://paperswithcode.com/paper/domain-stylization-a-strong-simple-baseline |
Repo | https://github.com/smitheric95/domain_stylization |
Framework | pytorch |
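
In outline, the baseline amounts to restyling every synthetic image with a randomly drawn real image while keeping the synthetic labels. The sketch below uses a hypothetical `stylize(content, style)` stand-in for the photorealistic style transfer call; it is not a real API, and the paper additionally couples stylization with the evolving segmentation, which this sketch omits.

```python
import random

def stylize(content_img, style_img):
    """Hypothetical stand-in for a photorealistic style transfer call;
    not a real API -- substitute an actual implementation here."""
    raise NotImplementedError

def domain_stylize(synthetic_set, real_set):
    """Restyle each synthetic image with a randomly drawn real image,
    keeping the synthetic labels unchanged (a sketch of the 'simple
    baseline'; the paper iterates this jointly with segmentation)."""
    out = []
    for img, label in synthetic_set:
        style = random.choice(real_set)
        out.append((stylize(img, style), label))   # labels carry over as-is
    return out

# Train the segmentation/detection model on domain_stylize(synthetic, real).
```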
Lipschitz Continuity in Model-based Reinforcement Learning
Title | Lipschitz Continuity in Model-based Reinforcement Learning |
Authors | Kavosh Asadi, Dipendra Misra, Michael L. Littman |
Abstract | We examine the impact of learning Lipschitz continuous models in the context of model-based reinforcement learning. We provide a novel bound on multi-step prediction error of Lipschitz models where we quantify the error using the Wasserstein metric. We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz. We conclude with empirical results that show the benefits of controlling the Lipschitz constant of neural-network models. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07193v3 |
PDF | http://arxiv.org/pdf/1804.07193v3.pdf |
PWC | https://paperswithcode.com/paper/lipschitz-continuity-in-model-based |
Repo | https://github.com/kavosh8/Lip |
Framework | none |
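
While the paper's contribution is the theoretical bounds, a practical companion is monitoring an upper bound on a network's Lipschitz constant: for 1-Lipschitz activations, the product of the layers' spectral norms. The sketch below computes this (often loose) bound for a transition model; the projection idea in the closing comment is one assumed way to control the constant, not the paper's procedure.

```python
import torch
import torch.nn as nn

def lipschitz_upper_bound(model: nn.Module) -> float:
    """Product of spectral norms of the linear layers -- an (often loose)
    upper bound on the Lipschitz constant when all activations are
    1-Lipschitz (e.g., ReLU, tanh)."""
    bound = 1.0
    for m in model.modules():
        if isinstance(m, nn.Linear):
            bound *= torch.linalg.matrix_norm(m.weight, ord=2).item()
    return bound

transition = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 4))
print(lipschitz_upper_bound(transition))
# To *control* the constant, one could project after each update, e.g.
# rescale each weight matrix whenever its spectral norm exceeds a target.
```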