Paper Group ANR 126
A Survey on Practical Applications of Multi-Armed and Contextual Bandits
Title | A Survey on Practical Applications of Multi-Armed and Contextual Bandits |
Authors | Djallel Bouneffouf, Irina Rish |
Abstract | In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance, due to its stellar performance combined with certain attractive properties, such as learning from less feedback. The multi-armed bandit field is currently flourishing, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical bandit problem. This article aims to provide a comprehensive review of top recent developments in multiple real-life applications of the multi-armed bandit. Specifically, we introduce a taxonomy of common MAB-based applications and summarize the state of the art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this exciting and fast-growing field. |
Tasks | Information Retrieval, Multi-Armed Bandits, Recommendation Systems |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.10040v1 |
http://arxiv.org/pdf/1904.10040v1.pdf | |
PWC | https://paperswithcode.com/paper/190410040 |
Repo | |
Framework | |
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech
Title | Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech |
Authors | Daniel Korzekwa, Roberto Barra-Chicote, Bozena Kostek, Thomas Drugman, Mateusz Lajszczak |
Abstract | This paper proposes a novel approach for the detection and reconstruction of dysarthric speech. The encoder-decoder model factorizes speech into a low-dimensional latent space and an encoding of the input text. We show that the latent space conveys interpretable characteristics of dysarthria, such as intelligibility and fluency of speech. A MUSHRA perceptual test demonstrated that adapting the latent space lets the model generate speech of improved fluency. The multi-task supervised approach, which predicts both the probability of dysarthric speech and the mel-spectrogram, improves the accuracy of dysarthria detection. This is thanks to the low-dimensional latent space of the auto-encoder, as opposed to directly predicting dysarthria from a high-dimensional mel-spectrogram. |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04743v1 |
https://arxiv.org/pdf/1907.04743v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-deep-learning-model-for-the |
Repo | |
Framework | |
Scaling Up Quasi-Newton Algorithms: Communication Efficient Distributed SR1
Title | Scaling Up Quasi-Newton Algorithms: Communication Efficient Distributed SR1 |
Authors | Majid Jahani, Mohammadreza Nazari, Sergey Rusakov, Albert S. Berahas, Martin Takáč |
Abstract | In this paper, we present a scalable distributed implementation of the sampled LSR1 (S-LSR1) algorithm. First, we show that a naive distributed implementation of S-LSR1 requires multiple rounds of expensive communications at every iteration and thus is inefficient. We then propose DS-LSR1, a communication-efficient variant of the S-LSR1 method, that drastically reduces the amount of data communicated at every iteration, that has favorable work-load balancing across nodes and that is matrix-free and inverse-free. The proposed method scales well in terms of both the dimension of the problem and the number of data points. Finally, we illustrate the performance of DS-LSR1 on standard neural network training tasks. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13096v1 |
https://arxiv.org/pdf/1905.13096v1.pdf | |
PWC | https://paperswithcode.com/paper/scaling-up-quasi-newton-algorithms |
Repo | |
Framework | |
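The abstract above builds on the symmetric rank-one (SR1) quasi-Newton update. As a rough illustration only, the sketch below shows the dense textbook SR1 update with the usual skip rule. It is not the sampled, limited-memory, or distributed DS-LSR1 method from the paper, and the function name `sr1_update` and the skip tolerance `r` are assumptions made for this example.

```python
def sr1_update(B, s, y, r=1e-8):
    """Textbook dense SR1 update of a Hessian approximation B.

    B is a square matrix (list of lists), s is the step x_new - x_old,
    and y is the gradient difference g_new - g_old.
    Returns B + v v^T / (v^T s) with v = y - B s, or a copy of B when the
    denominator is too small for the update to be numerically safe.
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    v = [y[i] - Bs[i] for i in range(n)]          # residual y - B s
    denom = sum(v[i] * s[i] for i in range(n))    # (y - B s)^T s
    norm_s = sum(x * x for x in s) ** 0.5
    norm_v = sum(x * x for x in v) ** 0.5
    if abs(denom) <= r * norm_s * norm_v:         # standard skip condition
        return [row[:] for row in B]
    return [[B[i][j] + v[i] * v[j] / denom for j in range(n)] for i in range(n)]


def matvec(B, s):
    """Multiply matrix B by vector s."""
    return [sum(bij * sj for bij, sj in zip(row, s)) for row in B]
```

After a non-skipped update the new matrix satisfies the secant equation, so `matvec(sr1_update(B, s, y), s)` reproduces `y` up to rounding.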
A Novel BiLevel Paradigm for Image-to-Image Translation
Title | A Novel BiLevel Paradigm for Image-to-Image Translation |
Authors | Liqian Ma, Qianru Sun, Bernt Schiele, Luc Van Gool |
Abstract | Image-to-image (I2I) translation is a pixel-level mapping that requires a large number of paired training data and often suffers from the problems of high diversity and strong category bias in image scenes. In order to tackle these problems, we propose a novel BiLevel (BiL) learning paradigm that alternates the learning of two models, respectively at an instance-specific (IS) and a general-purpose (GP) level. In each scene, the IS model learns to maintain the specific scene attributes. It is initialized by the GP model that learns from all the scenes to obtain the generalizable translation knowledge. This GP initialization gives the IS model an efficient starting point, thus enabling its fast adaptation to the new scene with scarce training data. We conduct extensive I2I translation experiments on human face and street view datasets. Quantitative results validate that our approach can significantly boost the performance of classical I2I translation models, such as PG2 and Pix2Pix. Our visualization results show both higher image quality and more appropriate instance-specific details, e.g., the translated image of a person looks more like that person in terms of identity. |
Tasks | Image-to-Image Translation |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.09028v1 |
http://arxiv.org/pdf/1904.09028v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-bilevel-paradigm-for-image-to-image |
Repo | |
Framework | |
Structured Output Learning with Conditional Generative Flows
Title | Structured Output Learning with Conditional Generative Flows |
Authors | You Lu, Bert Huang |
Abstract | Traditional structured prediction models try to learn the conditional likelihood, i.e., $p(y \mid x)$, to capture the relationship between the structured output y and the input features x. For many models, computing the likelihood is intractable. These models are therefore hard to train, requiring the use of surrogate objectives or variational inference to approximate likelihood. In this paper, we propose conditional Glow (c-Glow), a conditional generative flow for structured output learning. C-Glow benefits from the ability of flow-based models to compute $p(y \mid x)$ exactly and efficiently. Learning with c-Glow does not require a surrogate objective or performing inference during training. Once trained, we can directly and efficiently generate conditional samples. We develop a sample-based prediction method, which can use this advantage to do efficient and effective inference. In our experiments, we test c-Glow on five different tasks. C-Glow outperforms the state-of-the-art baselines in some tasks and predicts comparable outputs in the other tasks. The results show that c-Glow is versatile and is applicable to many different structured prediction problems. |
Tasks | Structured Prediction |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13288v3 |
https://arxiv.org/pdf/1905.13288v3.pdf | |
PWC | https://paperswithcode.com/paper/structured-output-learning-with-conditional |
Repo | |
Framework | |
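The abstract above relies on the fact that a flow-based model gives an exact conditional likelihood through the change-of-variables formula. The toy sketch below illustrates that idea with a single one-dimensional affine flow whose shift and log-scale depend on x. It is not the c-Glow architecture; the linear conditioners and the weights `w_mu` and `w_s` are hypothetical placeholders for this example.

```python
import math
import random


def conditional_affine_log_prob(y, x, w_mu=0.5, w_s=0.1):
    """Exact log p(y | x) for the toy flow y = mu(x) + exp(s(x)) * z, z ~ N(0, 1)."""
    mu = w_mu * x                 # hypothetical conditioner for the shift
    log_sigma = w_s * x           # hypothetical conditioner for the log-scale
    z = (y - mu) * math.exp(-log_sigma)                  # inverse flow: y -> z
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))  # log N(z; 0, 1)
    log_det = -log_sigma                                 # log |dz/dy|
    return log_base + log_det


def conditional_sample(x, rng, w_mu=0.5, w_s=0.1):
    """Draw y ~ p(y | x) by pushing base noise through the forward flow."""
    z = rng.gauss(0.0, 1.0)
    return w_mu * x + math.exp(w_s * x) * z
```

Because the flow is invertible with a known Jacobian, both the likelihood and the samples are exact; no surrogate objective is needed in this toy setting. For example, `conditional_sample(2.0, random.Random(0))` draws one output for the input `x = 2.0`.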
Improving the Explainability of Neural Sentiment Classifiers via Data Augmentation
Title | Improving the Explainability of Neural Sentiment Classifiers via Data Augmentation |
Authors | Hanjie Chen, Yangfeng Ji |
Abstract | Sentiment analysis has been widely used by businesses for social media opinion mining, especially in the financial services industry, where customer feedback is critical for companies. Recent neural network models have achieved remarkable performance on sentiment classification, but the lack of interpretable classifications may raise trustworthiness and many other issues in practice. In this work, we study the problem of improving the explainability of existing sentiment classifiers. We propose two data augmentation methods that create additional training examples to help improve model explainability: one method with a predefined sentiment word list as external knowledge and the other with adversarial examples. We test the proposed methods on both CNN and RNN classifiers with three benchmark sentiment datasets. The model explainability is assessed by both human evaluators and a simple automatic evaluation measurement. Experiments show the proposed data augmentation methods significantly improve the explainability of both neural classifiers. |
Tasks | Data Augmentation, Opinion Mining, Sentiment Analysis |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04225v3 |
https://arxiv.org/pdf/1909.04225v3.pdf | |
PWC | https://paperswithcode.com/paper/improving-the-interpretability-of-neural |
Repo | |
Framework | |
Neural Network Attributions: A Causal Perspective
Title | Neural Network Attributions: A Causal Perspective |
Authors | Aditya Chattopadhyay, Piyushi Manupriya, Anirban Sarkar, Vineeth N Balasubramanian |
Abstract | We propose a new attribution method for neural networks developed using first principles of causality (to the best of our knowledge, the first such). The neural network architecture is viewed as a Structural Causal Model, and a methodology to compute the causal effect of each feature on the output is presented. With reasonable assumptions on the causal structure of the input data, we propose algorithms to efficiently compute the causal effects, as well as scale the approach to data with large dimensionality. We also show how this method can be used for recurrent neural networks. We report experimental results on both simulated and real datasets showcasing the promise and usefulness of the proposed algorithm. |
Tasks | |
Published | 2019-02-06 |
URL | https://arxiv.org/abs/1902.02302v4 |
https://arxiv.org/pdf/1902.02302v4.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-attributions-a-causal |
Repo | |
Framework | |
The Generalized Likelihood Ratio Test meets klUCB: an Improved Algorithm for Piece-Wise Non-Stationary Bandits
Title | The Generalized Likelihood Ratio Test meets klUCB: an Improved Algorithm for Piece-Wise Non-Stationary Bandits |
Authors | Lilian Besson, Emilie Kaufmann |
Abstract | We propose a new algorithm for the piece-wise i.i.d. non-stationary bandit problem with bounded rewards. Our proposal, GLR-klUCB, combines an efficient bandit algorithm, klUCB, with an efficient, parameter-free, change-point detector, the Bernoulli Generalized Likelihood Ratio Test, for which we provide new theoretical guarantees of independent interest. We analyze two variants of our strategy, based on local restarts and global restarts, and show that their regret is upper-bounded by $\mathcal{O}(\Upsilon_T \sqrt{T \log(T)})$ if the number of change-points $\Upsilon_T$ is unknown, and by $\mathcal{O}(\sqrt{\Upsilon_T T \log(T)})$ if $\Upsilon_T$ is known. This improves the state-of-the-art bounds, as our algorithm needs no tuning based on knowledge of the problem complexity other than $\Upsilon_T$. We present numerical experiments showing that GLR-klUCB outperforms passively and actively adaptive algorithms from the literature, and highlight the benefit of using local restarts. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01575v1 |
http://arxiv.org/pdf/1902.01575v1.pdf | |
PWC | https://paperswithcode.com/paper/the-generalized-likelihood-ratio-test-meets |
Repo | |
Framework | |
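The two ingredients named in the abstract above, a klUCB index and a Bernoulli generalized likelihood ratio (GLR) change-point statistic, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' GLR-klUCB implementation: the exploration `budget` and the detection `threshold` are left as caller-supplied parameters, the paper's specific choices for them are not reproduced, and forced exploration and restarts are omitted. All function names are chosen for this example.

```python
import math


def kl_bernoulli(p, q, eps=1e-15):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def klucb_index(mean, pulls, budget, iters=50):
    """Largest q in [mean, 1] with pulls * kl(mean, q) <= budget, found by bisection."""
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if pulls * kl_bernoulli(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def glr_statistic(rewards):
    """Max over split points s of s*kl(mean_head, mean_all) + (n-s)*kl(mean_tail, mean_all)."""
    n = len(rewards)
    total = sum(rewards)
    mean_all = total / n
    best, head = 0.0, 0.0
    for s in range(1, n):
        head += rewards[s - 1]
        mean_head = head / s
        mean_tail = (total - head) / (n - s)
        stat = (s * kl_bernoulli(mean_head, mean_all)
                + (n - s) * kl_bernoulli(mean_tail, mean_all))
        best = max(best, stat)
    return best


def change_detected(rewards, threshold):
    """Flag a change-point when the GLR statistic reaches the given threshold."""
    return glr_statistic(rewards) >= threshold
```

A stream whose mean shifts abruptly yields a large statistic, while a constant stream yields a statistic of zero, which is what lets the detector trigger a restart of the arm's statistics.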
From the Token to the Review: A Hierarchical Multimodal approach to Opinion Mining
Title | From the Token to the Review: A Hierarchical Multimodal approach to Opinion Mining |
Authors | Alexandre Garcia, Pierre Colombo, Slim Essid, Florence d’Alché-Buc, Chloé Clavel |
Abstract | The task of predicting fine grained user opinion based on spontaneous spoken language is a key problem arising in the development of Computational Agents as well as in the development of social network based opinion miners. Unfortunately, gathering reliable data on which a model can be trained is notoriously difficult and existing works rely only on coarsely labeled opinions. In this work we aim at bridging the gap separating fine grained opinion models already developed for written language and coarse grained models developed for spontaneous multimodal opinion mining. We take advantage of the implicit hierarchical structure of opinions to build a joint fine and coarse grained opinion model that exploits different views of the opinion expression. The resulting model shares some properties with attention-based models and is shown to provide competitive results on a recently released multimodal fine grained annotated corpus. |
Tasks | Opinion Mining |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11216v3 |
https://arxiv.org/pdf/1908.11216v3.pdf | |
PWC | https://paperswithcode.com/paper/from-the-token-to-the-review-a-hierarchical |
Repo | |
Framework | |
Mixture Dense Regression for Object Detection and Human Pose Estimation
Title | Mixture Dense Regression for Object Detection and Human Pose Estimation |
Authors | Ali Varamesh, Tinne Tuytelaars |
Abstract | Mixture models are well-established machine learning approaches that, in computer vision, have mostly been applied to inverse or ill-defined problems. However, they are general-purpose divide-and-conquer techniques, splitting the input space into relatively homogeneous subsets, in a data-driven manner. Therefore, not only ill-defined but also well-defined complex problems should benefit from them. To this end, we devise a multi-modal solution for spatial regression using mixture density networks for dense object detection and human pose estimation. For both tasks, we show that a mixture model converges faster, yields higher accuracy, and divides the input space into interpretable modes. For object detection, mixture components learn to focus on object scale with the distribution of components closely following the distribution of ground truth object scale. For human pose estimation, a mixture model divides the data based on viewpoint and uncertainty – namely, front and back views, with back view imposing higher uncertainty. We conduct our experiments on the MS COCO dataset and do not face any mode collapse. However, to avoid numerical instabilities, we had to modify the activation function for the mixture variance terms slightly. |
Tasks | Dense Object Detection, Object Detection, Pose Estimation |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00821v1 |
https://arxiv.org/pdf/1912.00821v1.pdf | |
PWC | https://paperswithcode.com/paper/mixture-dense-regression-for-object-detection |
Repo | |
Framework | |
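The mixture density networks mentioned in the abstract above are typically trained by minimizing the negative log-likelihood of a Gaussian mixture whose parameters are predicted by the network. The sketch below computes that loss for a single scalar target, assuming the mixture weights arrive as unnormalized logits and the standard deviations as log-values. It does not reproduce the paper's modified activation for the variance terms, and the function name `mixture_nll` is an assumption made for this example.

```python
import math


def mixture_nll(y, weight_logits, means, log_sigmas):
    """Negative log-likelihood of scalar y under a 1-D Gaussian mixture.

    weight_logits are unnormalized mixture logits (softmax is applied here),
    means and log_sigmas are the per-component Gaussian parameters.
    """
    # Log-softmax normalizer, computed with the max trick for stability.
    m = max(weight_logits)
    log_z = m + math.log(sum(math.exp(a - m) for a in weight_logits))

    log_terms = []
    for a, mu, ls in zip(weight_logits, means, log_sigmas):
        log_pi = a - log_z                                   # log mixture weight
        z = (y - mu) * math.exp(-ls)                         # standardized residual
        log_n = -0.5 * z * z - ls - 0.5 * math.log(2.0 * math.pi)
        log_terms.append(log_pi + log_n)

    # Log-sum-exp over components gives log p(y); negate for the loss.
    top = max(log_terms)
    return -(top + math.log(sum(math.exp(t - top) for t in log_terms)))
```

With a single component the loss reduces to the ordinary Gaussian negative log-likelihood, which is a convenient sanity check.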
LNDb: A Lung Nodule Database on Computed Tomography
Title | LNDb: A Lung Nodule Database on Computed Tomography |
Authors | João Pedrosa, Guilherme Aresta, Carlos Ferreira, Márcio Rodrigues, Patrícia Leitão, André Silva Carvalho, João Rebelo, Eduardo Negrão, Isabel Ramos, António Cunha, Aurélio Campilho |
Abstract | Lung cancer is the deadliest type of cancer worldwide and late detection is the major factor for the low survival rate of patients. Low dose computed tomography has been suggested as a potential screening tool but manual screening is costly, time-consuming and prone to variability. This has fueled the development of automatic methods for the detection, segmentation and characterisation of pulmonary nodules but their application to clinical routine is challenging. In this study, a new database for the development and testing of pulmonary nodule computer-aided strategies is presented which intends to complement current databases by giving additional focus to radiologist variability and local clinical reality. State-of-the-art nodule detection, segmentation and characterization methods are tested and compared to manual annotations as well as to collaborative strategies that combine multiple radiologists or combine radiologists with computer-aided systems. It is shown that state-of-the-art methodologies can determine a patient’s follow-up recommendation as accurately as a radiologist, though the nodule detection method used shows decreased performance in this database. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08434v3 |
https://arxiv.org/pdf/1911.08434v3.pdf | |
PWC | https://paperswithcode.com/paper/lndb-a-lung-nodule-database-on-computed |
Repo | |
Framework | |
Linking emotions to behaviors through deep transfer learning
Title | Linking emotions to behaviors through deep transfer learning |
Authors | Haoqi Li, Brian Baucom, Panayiotis Georgiou |
Abstract | Human behavior refers to the way humans act and interact. Understanding human behavior is a cornerstone of observational practice, especially in psychotherapy. An important cue of behavior analysis is the dynamical changes of emotions during the conversation. Domain experts integrate emotional information in a highly nonlinear manner, thus, it is challenging to explicitly quantify the relationship between emotions and behaviors. In this work, we employ deep transfer learning to analyze their inferential capacity and contextual importance. We first train a network to quantify emotions from acoustic signals and then use information from the emotion recognition network as features for behavior recognition. We treat this emotion-related information as behavioral primitives and further train higher level layers towards behavior quantification. Through our analysis, we find that emotion-related information is an important cue for behavior recognition. Further, we investigate the importance of emotional-context in the expression of behavior by constraining (or not) the neural networks’ contextual view of the data. This demonstrates that the sequence of emotions is critical in behavior expression. To achieve these frameworks we employ hybrid architectures of convolutional networks and recurrent networks to extract emotion-related behavior primitives and facilitate automatic behavior recognition from speech. |
Tasks | Emotion Recognition, Transfer Learning |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03641v1 |
https://arxiv.org/pdf/1910.03641v1.pdf | |
PWC | https://paperswithcode.com/paper/linking-emotions-to-behaviors-through-deep |
Repo | |
Framework | |
Utilizing Eye Gaze to Enhance the Generalization of Imitation Networks to Unseen Environments
Title | Utilizing Eye Gaze to Enhance the Generalization of Imitation Networks to Unseen Environments |
Authors | Congcong Liu, Yuying Chen, Lei Tai, Ming Liu, Bertram Shi |
Abstract | Vision-based autonomous driving through imitation learning mimics the behaviors of human drivers by training on pairs of data of raw driver-view images and actions. However, there are other cues, e.g. gaze behavior, available from human drivers that have yet to be exploited. Previous research has shown that novice human learners can benefit from observing experts’ gaze patterns. We show here that deep neural networks can also benefit from this. We demonstrate different approaches to integrating gaze information into imitation networks. Our results show that the integration of gaze information improves the generalization performance of networks to unseen environments. |
Tasks | Autonomous Driving, Imitation Learning |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04728v2 |
https://arxiv.org/pdf/1907.04728v2.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-eye-gaze-to-enhance-the |
Repo | |
Framework | |
Learning Compact Target-Oriented Feature Representations for Visual Tracking
Title | Learning Compact Target-Oriented Feature Representations for Visual Tracking |
Authors | Chenglong Li, Yan Huang, Liang Wang, Jin Tang, Liang Lin |
Abstract | Many state-of-the-art trackers resort to a pretrained convolutional neural network (CNN) model for correlation filtering, in which deep features can be redundant, noisy and less discriminative for certain instances, and the tracking performance might thus be affected. To handle this problem, we propose a novel approach to visual tracking that takes advantage of both the good generalization of generative models and the excellent discrimination of discriminative models. In particular, we learn compact, discriminative and target-oriented feature representations using the Laplacian coding algorithm that exploits the dependence among the input local features in a discriminative correlation filter framework. The feature representations and the correlation filter are jointly learnt to enhance each other via a fast solver that imposes only a very slight computational burden on the tracking speed. Extensive experiments on three benchmark datasets demonstrate that the proposed framework clearly outperforms baseline trackers with a modest impact on the frame rate, and performs comparably against the state-of-the-art methods. |
Tasks | Visual Tracking |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01442v1 |
https://arxiv.org/pdf/1908.01442v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-compact-target-oriented-feature |
Repo | |
Framework | |
Prediction of Horizontal Data Partitioning Through Query Execution Cost Estimation
Title | Prediction of Horizontal Data Partitioning Through Query Execution Cost Estimation |
Authors | Nino Arsov, Goran Velinov, Aleksandar S. Dimovski, Bojana Koteska, Dragan Sahpaski, Margina Kon-Popovska |
Abstract | The greatly increased volume of data in modern data management systems demands improved system performance, frequently provided by data distribution, system scalability and performance optimization techniques. Optimized horizontal data partitioning has a significant influence on distributed data management systems. An optimally partitioned schema found in the early phase of logical database design, without loading real data into the system, and its adaptation to changes in the business environment are very important for a successful implementation, system scalability and performance improvement. In this paper we present a novel approach for finding an optimal horizontally partitioned schema that manifests a minimal total execution cost of a given database workload. Our approach is based on a formal model that enables abstraction of the predicates in the workload queries, which are subsequently used to define all relational fragments. This approach has predictive features acquired by simulating horizontal partitioning without loading any data into the partitions, instead altering the statistics in the database catalogs. We define an optimization problem and employ a genetic algorithm (GA) to find an approximately optimal horizontally partitioned schema. The solutions to the optimization problem are evaluated using PostgreSQL’s query optimizer. The initial experimental evaluation of our approach confirms its efficiency and correctness, and the numbers imply that the approach is effective in reducing the workload execution cost. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11725v1 |
https://arxiv.org/pdf/1911.11725v1.pdf | |
PWC | https://paperswithcode.com/paper/prediction-of-horizontal-data-partitioning |
Repo | |
Framework | |