Paper Group ANR 694
A Model-based Projection Technique for Segmenting Customers
Title | A Model-based Projection Technique for Segmenting Customers |
Authors | Srikanth Jagabathula, Lakshminarayanan Subramanian, Ashwin Venkataraman |
Abstract | We consider the problem of segmenting a large population of customers into non-overlapping groups with similar preferences, using diverse preference observations such as purchases, ratings, clicks, etc. over subsets of items. We focus on the setting where the universe of items is large (ranging from thousands to millions) and unstructured (lacking well-defined attributes) and each customer provides observations for only a few items. These data characteristics limit the applicability of existing techniques in marketing and machine learning. To overcome these limitations, we propose a model-based projection technique, which transforms the diverse set of observations into a more comparable scale and deals with missing data by projecting the transformed data onto a low-dimensional space. We then cluster the projected data to obtain the customer segments. Theoretically, we derive precise necessary and sufficient conditions that guarantee asymptotic recovery of the true customer segments. Empirically, we demonstrate the speed and performance of our method in two real-world case studies: (a) 84% improvement in the accuracy of new movie recommendations on the MovieLens data set and (b) 6% improvement in the performance of similar item recommendations algorithm on an offline dataset at eBay. We show that our method outperforms standard latent-class and demographic-based techniques. |
Tasks | |
Published | 2017-01-25 |
URL | http://arxiv.org/abs/1701.07483v1, http://arxiv.org/pdf/1701.07483v1.pdf |
PWC | https://paperswithcode.com/paper/a-model-based-projection-technique-for |
Repo | |
Framework | |
Human Perception of Performance
Title | Human Perception of Performance |
Authors | Luca Pappalardo, Paolo Cintia, Dino Pedreschi, Fosca Giannotti, Albert-Laszlo Barabasi |
Abstract | Humans are routinely asked to evaluate the performance of other individuals, separating success from failure and affecting outcomes from science to education and sports. Yet, in many contexts, the metrics driving the human evaluation process remain unclear. Here we analyse a massive dataset capturing players’ evaluations by human judges to explore human perception of performance in soccer, the world’s most popular sport. We use machine learning to design an artificial judge which accurately reproduces human evaluation, allowing us to demonstrate how human observers are biased towards diverse contextual features. By investigating the structure of the artificial judge, we uncover the aspects of the players’ behavior which attract the attention of human judges, demonstrating that human evaluation is based on a noticeability heuristic where only feature values far from the norm are considered to rate an individual’s performance. |
Tasks | |
Published | 2017-12-05 |
URL | http://arxiv.org/abs/1712.02224v1, http://arxiv.org/pdf/1712.02224v1.pdf |
PWC | https://paperswithcode.com/paper/human-perception-of-performance |
Repo | |
Framework | |
Robust Budget Allocation via Continuous Submodular Functions
Title | Robust Budget Allocation via Continuous Submodular Functions |
Authors | Matthew Staib, Stefanie Jegelka |
Abstract | The optimal allocation of resources for maximizing influence, spread of information or coverage, has gained attention in the past years, in particular in machine learning and data mining. But in applications, the parameters of the problem are rarely known exactly, and using wrong parameters can lead to undesirable outcomes. We hence revisit a continuous version of the Budget Allocation or Bipartite Influence Maximization problem introduced by Alon et al. (2012) from a robust optimization perspective, where an adversary may choose the least favorable parameters within a confidence set. The resulting problem is a nonconvex-concave saddle point problem (or game). We show that this nonconvex problem can be solved exactly by leveraging connections to continuous submodular functions, and by solving a constrained submodular minimization problem. Although constrained submodular minimization is hard in general, here, we establish conditions under which such a problem can be solved to arbitrary precision $\epsilon$. |
Tasks | |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08791v2, http://arxiv.org/pdf/1702.08791v2.pdf |
PWC | https://paperswithcode.com/paper/robust-budget-allocation-via-continuous |
Repo | |
Framework | |
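The objective described in the abstract can be illustrated with a minimal sketch. It assumes the commonly stated bipartite influence model, in which a target stays uninfluenced only if every attempt from every source fails; the abstract does not spell out this formula, so treat it as an assumption. The finite list of candidate parameters is also a simplification standing in for the paper's confidence set, and the names `expected_influence` and `worst_case_influence` are hypothetical.

```python
from math import prod


def expected_influence(budget, probs):
    """Expected number of influenced targets (assumed model, see lead-in).

    budget[s] is the (possibly fractional) budget given to source s.
    probs[s][t] is the assumed probability that one unit of budget on
    source s influences target t.
    """
    num_targets = len(probs[0])
    total = 0.0
    for t in range(num_targets):
        # Target t stays uninfluenced only if every attempt fails.
        miss = prod((1.0 - probs[s][t]) ** budget[s] for s in range(len(budget)))
        total += 1.0 - miss
    return total


def worst_case_influence(budget, candidate_probs):
    """The adversary picks the least favorable parameters from a finite set."""
    return min(expected_influence(budget, p) for p in candidate_probs)


# Two hypothetical parameter settings: a nominal one and a pessimistic one.
nominal = [[0.5, 0.2], [0.1, 0.4]]
pessimistic = [[0.3, 0.1], [0.1, 0.2]]
budget = [1.0, 1.0]
robust_value = worst_case_influence(budget, [nominal, pessimistic])
```

A robust allocation would then choose `budget` to maximize `worst_case_influence`, which is the saddle point problem the abstract refers to; the sketch only evaluates the inner minimum.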
MARGIN: Uncovering Deep Neural Networks using Graph Signal Analysis
Title | MARGIN: Uncovering Deep Neural Networks using Graph Signal Analysis |
Authors | Rushil Anirudh, Jayaraman J. Thiagarajan, Rahul Sridhar, Timo Bremer |
Abstract | Interpretability has emerged as a crucial aspect of machine learning, aimed at providing insights into the working of complex neural networks. However, existing solutions vary vastly based on the nature of the interpretability task, with each use case requiring substantial time and effort. This paper introduces MARGIN, a simple yet general approach to address a large set of interpretability tasks ranging from identifying prototypes to explaining image predictions. MARGIN exploits ideas rooted in graph signal analysis to determine influential nodes in a graph, which are defined as those nodes that maximally describe a function defined on the graph. By carefully defining task-specific graphs and functions, we demonstrate that MARGIN outperforms existing approaches in a number of disparate interpretability challenges. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05407v3, http://arxiv.org/pdf/1711.05407v3.pdf |
PWC | https://paperswithcode.com/paper/margin-uncovering-deep-neural-networks-using |
Repo | |
Framework | |
Privacy-Preserving Deep Inference for Rich User Data on The Cloud
Title | Privacy-Preserving Deep Inference for Rich User Data on The Cloud |
Authors | Seyed Ali Osia, Ali Shahin Shamsabadi, Ali Taheri, Kleomenis Katevas, Hamid R. Rabiee, Nicholas D. Lane, Hamed Haddadi |
Abstract | Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator can perform secondary inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger, and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep models for cooperative, privacy-preserving analytics. We do this by breaking down the popular deep architectures and fine-tuning them in a particular way. We then evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also assess the local inference cost of different layers on a modern handset for mobile applications. Our evaluations show that by using certain kinds of fine-tuning and embedding techniques, and at a small processing cost, we can greatly reduce the level of information available to unintended tasks applied to the data feature on the cloud, hence achieving the desired tradeoff between privacy and performance. |
Tasks | |
Published | 2017-10-04 |
URL | http://arxiv.org/abs/1710.01727v3, http://arxiv.org/pdf/1710.01727v3.pdf |
PWC | https://paperswithcode.com/paper/privacy-preserving-deep-inference-for-rich |
Repo | |
Framework | |
End-to-End Learning for Structured Prediction Energy Networks
Title | End-to-End Learning for Structured Prediction Energy Networks |
Authors | David Belanger, Bishan Yang, Andrew McCallum |
Abstract | Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of non-convex SPEN energies is superior to baseline methods that use simplistic energy functions that can be minimized exactly. |
Tasks | Denoising, Image Denoising, Semantic Role Labeling, Structured Prediction |
Published | 2017-03-16 |
URL | http://arxiv.org/abs/1703.05667v2, http://arxiv.org/pdf/1703.05667v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-learning-for-structured-prediction |
Repo | |
Framework | |
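To make "predictions are formed by gradient-based optimization" concrete, here is a minimal sketch of prediction by gradient descent on an energy function. The energy is a hand-written quadratic (a data-fit term plus a smoothness term), chosen only so the example runs without dependencies; in a SPEN the energy is a deep network, and the function names, step size, and iteration count here are assumptions.

```python
def energy(y, x, smooth=1.0):
    """Toy energy: squared distance to the input plus a roughness penalty."""
    fit = sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    rough = sum((y[i + 1] - y[i]) ** 2 for i in range(len(y) - 1))
    return fit + smooth * rough


def energy_grad(y, x, smooth=1.0):
    """Gradient of the toy energy with respect to the candidate output y."""
    grad = [2.0 * (yi - xi) for yi, xi in zip(y, x)]
    for i in range(len(y) - 1):
        diff = y[i + 1] - y[i]
        grad[i] -= 2.0 * smooth * diff
        grad[i + 1] += 2.0 * smooth * diff
    return grad


def predict(x, steps=50, lr=0.1, smooth=1.0):
    """Form a prediction by running gradient descent on the energy over y."""
    y = list(x)  # start from the input itself
    for _ in range(steps):
        g = energy_grad(y, x, smooth)
        y = [yi - lr * gi for yi, gi in zip(y, g)]
    return y


noisy = [0.0, 1.0, 0.0, 1.0, 0.0]
denoised = predict(noisy)
```

End-to-end learning, as described in the abstract, would additionally back-propagate a loss through the unrolled loop in `predict` to train the energy's parameters; the sketch covers only the prediction step.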
Combining Thesaurus Knowledge and Probabilistic Topic Models
Title | Combining Thesaurus Knowledge and Probabilistic Topic Models |
Authors | Natalia Loukachevitch, Michael Nokel, Kirill Ivanov |
Abstract | In this paper we present an approach for introducing thesaurus knowledge into probabilistic topic models. The main idea of the approach is based on the assumption that the frequencies of semantically related words and phrases that occur in the same texts should be enhanced: this leads to their larger contribution to the topics found in these texts. We have conducted experiments with several thesauri and found that for improving topic models, it is useful to utilize domain-specific knowledge. If a general thesaurus, such as WordNet, is used, the thesaurus-based improvement of topic models can be achieved by excluding hyponymy relations in combined topic models. |
Tasks | Topic Models |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1707.09816v1, http://arxiv.org/pdf/1707.09816v1.pdf |
PWC | https://paperswithcode.com/paper/combining-thesaurus-knowledge-and |
Repo | |
Framework | |
Z-Forcing: Training Stochastic Recurrent Networks
Title | Z-Forcing: Training Stochastic Recurrent Networks |
Authors | Anirudh Goyal, Alessandro Sordoni, Marc-Alexandre Côté, Nan Rosemary Ke, Yoshua Bengio |
Abstract | Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset where the auxiliary cost helps in learning interpretable latent variables. Source Code: https://github.com/anirudh9119/zforcing_nips17 |
Tasks | Language Modelling, Latent Variable Models |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05411v2, http://arxiv.org/pdf/1711.05411v2.pdf |
PWC | https://paperswithcode.com/paper/z-forcing-training-stochastic-recurrent |
Repo | |
Framework | |
Comparison of Distances for Supervised Segmentation of White Matter Tractography
Title | Comparison of Distances for Supervised Segmentation of White Matter Tractography |
Authors | Emanuele Olivetti, Giulia Bertò, Pietro Gori, Nusrat Sharmin, Paolo Avesani |
Abstract | Tractograms are mathematical representations of the main paths of axons within the white matter of the brain, from diffusion MRI data. Such representations are in the form of polylines, called streamlines, and one streamline approximates the common path of tens of thousands of axons. The analysis of tractograms is a task of interest in multiple fields, like neurosurgery and neurology. A basic building block of many pipelines of analysis is the definition of a distance function between streamlines. Multiple distance functions have been proposed in the literature, and different authors use different distances, usually without a specific reason other than invoking the “common practice”. In this work we test such common practices in order to obtain factual reasons for choosing one distance over another. To that end, we compare many streamline distance functions available in the literature. We focus on the common task of automatic bundle segmentation and we adopt the recent approach of supervised segmentation from expert-based examples. Using the HCP dataset, we compare several distances, obtaining guidelines on which distance function one should use for supervised bundle segmentation. |
Tasks | |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01440v1, http://arxiv.org/pdf/1708.01440v1.pdf |
PWC | https://paperswithcode.com/paper/comparison-of-distances-for-supervised |
Repo | |
Framework | |
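As an illustration of what "a distance function between streamlines" can look like, the sketch below computes a symmetric mean-of-closest-points distance between two polylines. This is one familiar style of streamline distance; the abstract does not list which distances the paper compares, so its inclusion there is an assumption, and the function names are hypothetical.

```python
from math import dist


def mean_closest_distance(a, b):
    """Average, over the points of a, of the distance to the nearest point of b."""
    return sum(min(dist(p, q) for q in b) for p in a) / len(a)


def symmetric_mean_distance(a, b):
    """Symmetrize by averaging the two directed distances."""
    return 0.5 * (mean_closest_distance(a, b) + mean_closest_distance(b, a))


# Two parallel three-point streamlines, one unit apart along the y axis.
s1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
s2 = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
d12 = symmetric_mean_distance(s1, s2)
```

Because each point is matched to its nearest neighbor on the other streamline, the result does not depend on the order in which a streamline's points are stored, which matters since a streamline has no intrinsic direction.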
Hamiltonian Monte Carlo with Energy Conserving Subsampling
Title | Hamiltonian Monte Carlo with Energy Conserving Subsampling |
Authors | Khue-Dung Dang, Matias Quiroz, Robert Kohn, Minh-Ngoc Tran, Mattias Villani |
Abstract | Hamiltonian Monte Carlo (HMC) samples efficiently from high-dimensional posterior distributions with proposed parameter draws obtained by iterating on a discretized version of the Hamiltonian dynamics. The iterations make HMC computationally costly, especially in problems with large datasets, since it is necessary to compute posterior densities and their derivatives with respect to the parameters. Naively computing the Hamiltonian dynamics on a subset of the data causes HMC to lose its key ability to generate distant parameter proposals with high acceptance probability. The key insight in our article is that efficient subsampling HMC for the parameters is possible if both the dynamics and the acceptance probability are computed from the same data subsample in each complete HMC iteration. We show that this is possible to do in a principled way in an HMC-within-Gibbs framework where the subsample is updated using a pseudo marginal MH step and the parameters are then updated using an HMC step, based on the current subsample. We show that our subsampling methods are fast and compare favorably to two popular sampling algorithms that utilize gradient estimates from data subsampling. We also explore the current limitations of subsampling HMC algorithms by varying the quality of the variance reducing control variates used in the estimators of the posterior density and its gradients. |
Tasks | |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00955v3, http://arxiv.org/pdf/1708.00955v3.pdf |
PWC | https://paperswithcode.com/paper/hamiltonian-monte-carlo-with-energy |
Repo | |
Framework | |
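The "discretized version of the Hamiltonian dynamics" mentioned in the abstract is usually the leapfrog integrator. The sketch below runs it on a one-dimensional standard normal target, where the potential is U(q) = q^2 / 2. It does not implement the paper's subsampling scheme; the target, step size, and function names are assumptions chosen for illustration.

```python
def leapfrog(q, p, grad_u, step, n_steps):
    """Leapfrog integration of Hamiltonian dynamics with unit mass."""
    p = p - 0.5 * step * grad_u(q)  # initial half step for momentum
    for i in range(n_steps):
        q = q + step * p  # full step for position
        if i < n_steps - 1:
            p = p - step * grad_u(q)  # full step for momentum
    p = p - 0.5 * step * grad_u(q)  # final half step for momentum
    return q, p


def potential(q):
    """Negative log density of a standard normal, up to a constant."""
    return 0.5 * q * q


def grad_potential(q):
    return q


def hamiltonian(q, p):
    """Total energy: potential plus kinetic."""
    return potential(q) + 0.5 * p * p


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_potential, step=0.1, n_steps=10)
energy_error = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

In a real posterior, every call to `grad_potential` sums over the full dataset, which is the cost the abstract describes; the small `energy_error` is what lets HMC accept distant proposals with high probability.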
Community detection and stochastic block models: recent developments
Title | Community detection and stochastic block models: recent developments |
Authors | Emmanuel Abbe |
Abstract | The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and provides generally a fertile ground to study the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a., detection). The main results discussed are the phase transitions for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest of achieving the limits, in particular two-round algorithms via graph-splitting, semi-definite programming, linearized belief propagation, classical and nonbacktracking spectral methods. A few open problems are also discussed. |
Tasks | Community Detection |
Published | 2017-03-29 |
URL | http://arxiv.org/abs/1703.10146v1, http://arxiv.org/pdf/1703.10146v1.pdf |
PWC | https://paperswithcode.com/paper/community-detection-and-stochastic-block |
Repo | |
Framework | |
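A "random graph model with planted clusters" can be sketched in a few lines: each pair of nodes is connected independently, with one probability if the two nodes share a community and another if they do not. The two-parameter symmetric form, the function name `sample_sbm`, and the example values below are assumptions for illustration; the general SBM allows a full matrix of connection probabilities.

```python
import random


def sample_sbm(labels, p_in, p_out, rng):
    """Sample an undirected graph from a symmetric stochastic block model.

    labels[i] is the planted community of node i. Nodes in the same
    community connect with probability p_in, others with probability p_out.
    """
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1  # undirected edge, no self-loops
    return adj


labels = [0, 0, 0, 1, 1, 1]
graph = sample_sbm(labels, p_in=0.8, p_out=0.1, rng=random.Random(0))
```

Community detection is the inverse problem: given only `graph`, recover `labels` exactly, partially, or better than chance, which are the recovery requirements the survey discusses.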
Does Neural Machine Translation Benefit from Larger Context?
Title | Does Neural Machine Translation Benefit from Larger Context? |
Authors | Sebastien Jean, Stanislas Lauly, Orhan Firat, Kyunghyun Cho |
Abstract | We propose a neural machine translation architecture that models the surrounding text in addition to the source sentence. These models lead to better performance, both in terms of general translation quality and pronoun prediction, when trained on small corpora, although this improvement largely disappears when trained with a larger corpus. We also discover that attention-based neural machine translation is well suited for pronoun prediction and compares favorably with other approaches that were specifically designed for this task. |
Tasks | Machine Translation |
Published | 2017-04-17 |
URL | http://arxiv.org/abs/1704.05135v1, http://arxiv.org/pdf/1704.05135v1.pdf |
PWC | https://paperswithcode.com/paper/does-neural-machine-translation-benefit-from |
Repo | |
Framework | |
Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices
Title | Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices |
Authors | Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama |
Abstract | In a recent conference paper, we reported a rhythm transcription method based on a merged-output hidden Markov model (HMM) that explicitly describes the multiple-voice structure of polyphonic music. This model solves a major problem of conventional methods that could not properly describe the nature of multiple voices as in polyrhythmic scores or in the phenomenon of loose synchrony between voices. In this paper we present a complete description of the proposed model and develop an inference technique, which is valid for any merged-output HMM for which output probabilities depend on past events. We also examine the influence of the architecture and parameters of the method in terms of accuracies of rhythm transcription and voice separation and perform comparative evaluations with six other algorithms. Using MIDI recordings of classical piano pieces, we found that the proposed model outperformed other methods by more than 12 points in the accuracy for polyrhythmic performances and performed almost as well as the best one for non-polyrhythmic performances. This reveals the state-of-the-art methods of rhythm transcription for the first time in the literature. Publicly available source code is also provided for future comparisons. |
Tasks | |
Published | 2017-01-29 |
URL | http://arxiv.org/abs/1701.08343v1, http://arxiv.org/pdf/1701.08343v1.pdf |
PWC | https://paperswithcode.com/paper/rhythm-transcription-of-polyphonic-piano |
Repo | |
Framework | |
Developing an ontology for the access to the contents of an archival fonds: the case of the Catasto Gregoriano
Title | Developing an ontology for the access to the contents of an archival fonds: the case of the Catasto Gregoriano |
Authors | Lina Antonietta Coppola |
Abstract | The research was proposed to exploit and extend the relational and contextual nature of the information assets of the Catasto Gregoriano, kept at the Archivio di Stato in Rome. Developed within the MODEUS project (Making Open Data Effectively Usable), this study originates from the following key ideas of MODEUS: to require Open Data to be expressed in terms of an ontology, and to include such an ontology as a documentation of the data themselves. Thus, Open Data are naturally linked by means of the ontology, which meets the requirements of the Linked Open Data vision. |
Tasks | |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04584v1, http://arxiv.org/pdf/1702.04584v1.pdf |
PWC | https://paperswithcode.com/paper/developing-an-ontology-for-the-access-to-the |
Repo | |
Framework | |
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Title | Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning |
Authors | Baolin Peng, Xiujun Li, Lihong Li, Jianfeng Gao, Asli Celikyilmaz, Sungjin Lee, Kam-Fai Wong |
Abstract | Building a dialogue agent to fulfill complex tasks, such as travel planning, is challenging because the agent has to learn to collectively complete multiple subtasks. For example, the agent needs to reserve a hotel and book a flight so that enough time is left for the commute between arrival and hotel check-in. This paper addresses this challenge by formulating the task in the mathematical framework of options over Markov Decision Processes (MDPs), and proposing a hierarchical deep reinforcement learning approach to learning a dialogue manager that operates at different temporal scales. The dialogue manager consists of: (1) a top-level dialogue policy that selects among subtasks or options, (2) a low-level dialogue policy that selects primitive actions to complete the subtask given by the top-level policy, and (3) a global state tracker that helps ensure all cross-subtask constraints are satisfied. Experiments on a travel planning task with simulated and real users show that our approach leads to significant improvements over three baselines, two based on handcrafted rules and the other based on flat deep reinforcement learning. |
Tasks | Task-Completion Dialogue Policy Learning |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.03084v3, http://arxiv.org/pdf/1704.03084v3.pdf |
PWC | https://paperswithcode.com/paper/composite-task-completion-dialogue-policy |
Repo | |
Framework | |