January 29, 2020

Paper Group ANR 724

Listening between the Lines: Learning Personal Attributes from Conversations. Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder. On the Proof of Fixed-Point Convergence for Plug-and-Play ADMM. Modeling Confidence in Sequence-to-Sequence Models. Stochastic Conditional Gradient++. Improved Algorithm on Online Cluster …

Listening between the Lines: Learning Personal Attributes from Conversations

Title Listening between the Lines: Learning Personal Attributes from Conversations
Authors Anna Tigunova, Andrew Yates, Paramita Mirza, Gerhard Weikum
Abstract Open-domain dialogue agents must be able to converse about many topics while incorporating knowledge about the user into the conversation. In this work we address the acquisition of such knowledge, for personalization in downstream Web applications, by extracting personal attributes from conversations. This problem is more challenging than the established task of information extraction from scientific publications or Wikipedia articles, because dialogues often give merely implicit cues about the speaker. We propose methods for inferring personal attributes, such as profession, age or family status, from conversations using deep learning. Specifically, we propose several Hidden Attribute Models, which are neural networks leveraging attention mechanisms and embeddings. Our methods are trained on a per-predicate basis to output rankings of object values for a given subject-predicate combination (e.g., ranking the doctor and nurse professions high when speakers talk about patients, emergency rooms, etc.). Experiments with various conversational texts, including Reddit discussions, movie scripts and a collection of crowdsourced personal dialogues, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
Tasks
Published 2019-04-24
URL http://arxiv.org/abs/1904.10887v1
PDF http://arxiv.org/pdf/1904.10887v1.pdf
PWC https://paperswithcode.com/paper/listening-between-the-lines-learning-personal
Repo
Framework

Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder

Title Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder
Authors Budhaditya Deb, Peter Bailey, Milad Shokouhi
Abstract We consider the problem of diversifying automated reply suggestions for a commercial instant-messaging (IM) system (Skype). Our conversation model is a standard matching based information retrieval architecture, which consists of two parallel encoders to project messages and replies into a common feature representation. During inference, we select replies from a fixed response set using nearest neighbors in the feature space. To diversify responses, we formulate the model as a generative latent variable model with Conditional Variational Auto-Encoder (M-CVAE). We propose a constrained-sampling approach to make the variational inference in M-CVAE efficient for our production system. In offline experiments, M-CVAE consistently increased diversity by ~30-40% without significant impact on relevance. This translated to a 5% gain in click-rate in our online production system.
Tasks Information Retrieval
Published 2019-03-25
URL http://arxiv.org/abs/1903.10630v1
PDF http://arxiv.org/pdf/1903.10630v1.pdf
PWC https://paperswithcode.com/paper/diversifying-reply-suggestions-using-a
Repo
Framework
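
The retrieval step described in the abstract above (project messages and replies into a common feature space, then select replies from a fixed response set by nearest neighbors) can be sketched as follows. This is a minimal illustration, not the authors' system: the vectors are assumed to be precomputed by some encoder pair, and the names `cosine` and `top_k_replies` are hypothetical. The M-CVAE diversification itself is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_replies(message_vec, reply_vecs, k=3):
    """Return the k reply texts whose vectors are closest to the message.

    reply_vecs maps each candidate reply text to its (assumed precomputed)
    feature vector in the same space as message_vec.
    """
    ranked = sorted(
        reply_vecs.items(),
        key=lambda item: cosine(message_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy response set with hand-written 2-D vectors (illustrative only).
replies = {
    "Sounds good": [1.0, 0.0],
    "No thanks": [-1.0, 0.0],
    "Maybe later": [0.5, 0.5],
}
suggestions = top_k_replies([1.0, 0.1], replies, k=2)
```

Plain nearest-neighbor selection like this tends to return near-duplicate replies, which is the diversity problem the paper sets out to address.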

On the Proof of Fixed-Point Convergence for Plug-and-Play ADMM

Title On the Proof of Fixed-Point Convergence for Plug-and-Play ADMM
Authors Ruturaj G. Gavaskar, Kunal N. Chaudhury
Abstract In most state-of-the-art image restoration methods, the sum of a data-fidelity and a regularization term is optimized using an iterative algorithm such as ADMM (alternating direction method of multipliers). In recent years, the possibility of using denoisers for regularization has been explored in several works. A popular approach is to formally replace the proximal operator within the ADMM framework with some powerful denoiser. However, since most state-of-the-art denoisers cannot be posed as a proximal operator, one cannot guarantee the convergence of these so-called plug-and-play (PnP) algorithms. In fact, the theoretical convergence of PnP algorithms is an active research topic. In this letter, we consider the result of Chan et al. (IEEE TCI, 2017), where fixed-point convergence of an ADMM-based PnP algorithm was established for a class of denoisers. We argue that the original proof is incomplete, since convergence is not analyzed for one of the three possible cases outlined in the paper. Moreover, we explain why the argument for the other cases does not apply in this case. We give a different analysis to fill this gap, which firmly establishes the original convergence theorem.
Tasks Image Restoration
Published 2019-10-31
URL https://arxiv.org/abs/1910.14325v1
PDF https://arxiv.org/pdf/1910.14325v1.pdf
PWC https://paperswithcode.com/paper/on-the-proof-of-fixed-point-convergence-for
Repo
Framework
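
To make the plug-and-play idea in the abstract above concrete, here is a minimal sketch of ADMM in which the proximal step for the regularizer is replaced by a denoiser. Everything specific here is an assumption for illustration: the data-fidelity term is taken to be 0.5*||x - y||^2, the denoiser is a toy moving average, and the penalty `rho` is fixed. It is not the algorithm analyzed by Chan et al. or by this letter, and it makes no convergence claim.

```python
def moving_average(signal, width=3):
    """Toy denoiser (assumption): centered moving average, truncated at edges."""
    n = len(signal)
    half = width // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def pnp_admm(y, denoiser, rho=1.0, iters=50):
    """Plug-and-play ADMM sketch for the data term 0.5 * ||x - y||^2."""
    x = list(y)
    v = list(y)
    u = [0.0] * len(y)
    for _ in range(iters):
        # x-update: closed-form proximal step of the quadratic data term
        # evaluated at (v - u).
        x = [(yi + rho * (vi - ui)) / (1.0 + rho) for yi, vi, ui in zip(y, v, u)]
        # v-update: the denoiser stands in for the regularizer's proximal operator.
        v = denoiser([xi + ui for xi, ui in zip(x, u)])
        # Scaled dual update.
        u = [ui + xi - vi for ui, xi, vi in zip(u, x, v)]
    return x


restored = pnp_admm([2.0, 2.0, 2.0, 2.0, 2.0], moving_average)
```

A constant signal is a fixed point of both steps, so the sketch returns it unchanged; that makes it a convenient sanity check.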

Modeling Confidence in Sequence-to-Sequence Models

Title Modeling Confidence in Sequence-to-Sequence Models
Authors Jan Niehues, Ngoc-Quan Pham
Abstract Recently, significant improvements have been achieved in various natural language processing tasks using neural sequence-to-sequence models. While aiming for the best generation quality is important, ultimately it is also necessary to develop models that can assess the quality of their output. In this work, we propose to use the similarity between training and test conditions as a measure for models’ confidence. We investigate methods solely using the similarity as well as methods combining it with the posterior probability. While traditionally only target tokens are annotated with confidence measures, we also investigate methods to annotate source tokens with confidence. By learning an internal alignment model, we can significantly improve confidence projection over using state-of-the-art external alignment tools. We evaluate the proposed methods on downstream confidence estimation for machine translation (MT). We show improvements on segment-level confidence estimation as well as on confidence estimation for source tokens. In addition, we show that the same methods can also be applied to other tasks using sequence-to-sequence models. On the automatic speech recognition (ASR) task, we are able to find 60% of the errors by looking at 20% of the data.
Tasks Machine Translation, Speech Recognition
Published 2019-10-04
URL https://arxiv.org/abs/1910.01859v1
PDF https://arxiv.org/pdf/1910.01859v1.pdf
PWC https://paperswithcode.com/paper/modeling-confidence-in-sequence-to-sequence
Repo
Framework

Stochastic Conditional Gradient++

Title Stochastic Conditional Gradient++
Authors Hamed Hassani, Amin Karbasi, Aryan Mokhtari, Zebang Shen
Abstract In this paper, we consider the general non-oblivious stochastic optimization where the underlying stochasticity may change during the optimization procedure and depends on the point at which the function is evaluated. We develop Stochastic Frank-Wolfe++ ($\text{SFW}{++}$), an efficient variant of the conditional gradient method for minimizing a smooth non-convex function subject to a convex body constraint. We show that $\text{SFW}{++}$ converges to an $\epsilon$-first order stationary point by using $O(1/\epsilon^3)$ stochastic gradients. Once further structures are present, $\text{SFW}{++}$'s theoretical guarantees, in terms of the convergence rate and quality of its solution, improve. In particular, for minimizing a convex function, $\text{SFW}{++}$ achieves an $\epsilon$-approximate optimum while using $O(1/\epsilon^2)$ stochastic gradients. It is known that this rate is optimal in terms of stochastic gradient evaluations. Similarly, for maximizing a monotone continuous DR-submodular function, a slightly different form of $\text{SFW}{++}$, called Stochastic Continuous Greedy++ ($\text{SCG}{++}$), achieves a tight $[(1-1/e)\text{OPT} -\epsilon]$ solution while using $O(1/\epsilon^2)$ stochastic gradients. Through an information theoretic argument, we also prove that $\text{SCG}{++}$'s convergence rate is optimal. Finally, for maximizing a non-monotone continuous DR-submodular function, we can achieve a $[(1/e)\text{OPT} -\epsilon]$ solution by using $O(1/\epsilon^2)$ stochastic gradients. We should highlight that our results and our novel variance reduction technique trivially extend to the standard and easier oblivious stochastic optimization settings for (non-)convex and continuous submodular settings.
Tasks Stochastic Optimization
Published 2019-02-19
URL https://arxiv.org/abs/1902.06992v3
PDF https://arxiv.org/pdf/1902.06992v3.pdf
PWC https://paperswithcode.com/paper/stochastic-conditional-gradient
Repo
Framework
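
For readers unfamiliar with the conditional gradient method that the abstract above builds on, the following is a sketch of the classical deterministic Frank-Wolfe iteration, not SFW++ or SCG++. The problem is an assumed toy: minimize 0.5*||x - target||^2 over the probability simplex, with the standard 2/(k+2) step size. The function name `frank_wolfe_simplex` is hypothetical.

```python
def frank_wolfe_simplex(target, iters=2000):
    """Classical Frank-Wolfe for min 0.5 * ||x - target||^2 over the simplex."""
    n = len(target)
    x = [1.0 / n] * n  # start at the simplex center
    for k in range(iters):
        # Exact gradient of the quadratic objective.
        grad = [xi - ti for xi, ti in zip(x, target)]
        # Linear minimization oracle: over the simplex, the minimizer of
        # <grad, s> is the vertex with the smallest gradient coordinate.
        s = min(range(n), key=grad.__getitem__)
        gamma = 2.0 / (k + 2)
        # Convex combination keeps the iterate feasible without projection.
        x = [(1.0 - gamma) * xi for xi in x]
        x[s] += gamma
    return x


solution = frank_wolfe_simplex([0.2, 0.3, 0.5])
```

The stochastic variants in the paper replace the exact gradient with variance-reduced estimates; the projection-free update structure is the part this sketch illustrates.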

Improved Algorithm on Online Clustering of Bandits

Title Improved Algorithm on Online Clustering of Bandits
Authors Shuai Li, Wei Chen, Shuai Li, Kwong-Sak Leung
Abstract We generalize the setting of online clustering of bandits by allowing non-uniform distribution over user frequencies. A more efficient algorithm is proposed with simple set structures to represent clusters. We prove a regret bound for the new algorithm which is free of the minimal frequency over users. The experiments on both synthetic and real datasets consistently show the advantage of the new algorithm over existing methods.
Tasks
Published 2019-02-25
URL https://arxiv.org/abs/1902.09162v2
PDF https://arxiv.org/pdf/1902.09162v2.pdf
PWC https://paperswithcode.com/paper/improved-algorithm-on-online-clustering-of
Repo
Framework

Solver Recommendation For Transport Problems in Slabs Using Machine Learning

Title Solver Recommendation For Transport Problems in Slabs Using Machine Learning
Authors Jinzhao Chen, Japan K. Patel, Richard Vasques
Abstract The use of machine learning algorithms to address classification problems is on the rise in many research areas. The current study is aimed at testing the potential of using such algorithms to auto-select the best solvers for transport problems in uniform slabs. Three solvers are used in this work: Richardson, diffusion synthetic acceleration, and nonlinear diffusion acceleration. Three parameters are manipulated to create different transport problem scenarios. Five machine learning algorithms are applied: linear discriminant analysis, K-nearest neighbors, support vector machine, random forest, and neural networks. We present and analyze the results of these algorithms for the test problems, showing that random forest and K-nearest neighbors are potentially the best suited candidates for this type of classification problem.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.08259v1
PDF https://arxiv.org/pdf/1906.08259v1.pdf
PWC https://paperswithcode.com/paper/solver-recommendation-for-transport-problems
Repo
Framework

Interpreting Basis Path Set in Neural Networks

Title Interpreting Basis Path Set in Neural Networks
Authors Juanping Zhu, Qi Meng, Wei Chen, Zhi-ming Ma
Abstract Based on the basis path set, the G-SGD algorithm significantly outperforms the conventional SGD algorithm in optimizing neural networks. However, how the inner mechanism of basis paths works remains mysterious. From the perspective of graph theory, this paper defines the basis path, investigates structural properties of basis paths in a regular fully connected neural network, and interprets the graph representation of the basis path set. Moreover, we propose a hierarchical algorithm, HBPS, to find the basis path set B in a fully connected neural network by decomposing the network into several independent and parallel substructures. Algorithm HBPS requires that no two independent substructure paths share an edge.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.09402v1
PDF https://arxiv.org/pdf/1910.09402v1.pdf
PWC https://paperswithcode.com/paper/interpreting-basis-path-set-in-neural
Repo
Framework

Deep Sequential Models for Suicidal Ideation from Multiple Source Data

Title Deep Sequential Models for Suicidal Ideation from Multiple Source Data
Authors Ignacio Peis, Pablo M. Olmos, Constanza Vera-Varela, María Luisa Barrigón, Philippe Courtet, Enrique Baca-García, Antonio Artés-Rodríguez
Abstract This article presents a novel method for predicting suicidal ideation from Electronic Health Records (EHR) and Ecological Momentary Assessment (EMA) data using deep sequential models. Both EHR longitudinal data and EMA question forms are defined by asynchronous, variable length, randomly-sampled data sequences. In our method, we model each of them with a Recurrent Neural Network (RNN), and both sequences are aligned by concatenating the hidden state of each of them using temporal marks. Furthermore, we incorporate attention schemes to improve performance in long sequences and time-independent pre-trained schemes to cope with very short sequences. Using a database of 1023 patients, our experimental results show that the addition of EMA records boosts the system recall to predict the suicidal ideation diagnosis from 48.13% obtained exclusively from EHR-based state-of-the-art methods to 67.78%. Additionally, our method provides interpretability through the t-SNE representation of the latent space. Further, the most relevant input features are identified and interpreted medically.
Tasks
Published 2019-11-06
URL https://arxiv.org/abs/1911.03522v1
PDF https://arxiv.org/pdf/1911.03522v1.pdf
PWC https://paperswithcode.com/paper/deep-sequential-models-for-suicidal-ideation
Repo
Framework

On Variational Learning of Controllable Representations for Text without Supervision

Title On Variational Learning of Controllable Representations for Text without Supervision
Authors Peng Xu, Jackie Chi Kit Cheung, Yanshuai Cao
Abstract The variational autoencoder (VAE) can learn the manifold of natural images on certain datasets, as evidenced by meaningful interpolation or extrapolation in the continuous latent space. However, on discrete data such as text, it is unclear if unsupervised learning can discover a similar latent space that allows controllable manipulation. In this work, we find that sequence VAEs trained on text fail to properly decode when the latent codes are manipulated, because the modified codes often land in holes or vacant regions in the aggregated posterior latent space, where the decoding network fails to generalize. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex, and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method outperforms unsupervised baselines and strong supervised approaches on text style transfer. On automatic evaluation metrics used in text style transfer, even with the decoding network trained from scratch, our method achieves comparable results with state-of-the-art supervised approaches leveraging large-scale pre-trained models for generation. Furthermore, it is capable of performing more flexible fine-grained control over text generation than existing methods.
Tasks Style Transfer, Text Generation, Text Style Transfer
Published 2019-05-28
URL https://arxiv.org/abs/1905.11975v3
PDF https://arxiv.org/pdf/1905.11975v3.pdf
PWC https://paperswithcode.com/paper/unsupervised-controllable-text-generation
Repo
Framework

Towards automatic extractive text summarization of A-133 Single Audit reports with machine learning

Title Towards automatic extractive text summarization of A-133 Single Audit reports with machine learning
Authors Vivian T. Chou, LeAnna Kent, Joel A. Góngora, Sam Ballerini, Carl D. Hoover
Abstract The rapid growth of text data has motivated the development of machine-learning based automatic text summarization strategies that concisely capture the essential ideas in a larger text. This study aimed to devise an extractive summarization method for A-133 Single Audits, which assess if recipients of federal grants are compliant with program requirements for use of federal funding. Currently, these voluminous audits must be manually analyzed by officials for oversight, risk management, and prioritization purposes. Automated summarization has the potential to streamline these processes. Analysis focused on the “Findings” section of ~20,000 Single Audits spanning 2016-2018. Following text preprocessing and GloVe embedding, sentence-level k-means clustering was performed to partition sentences by topic and to establish the importance of each sentence. For each audit, key summary sentences were extracted by proximity to cluster centroids. Summaries were judged by non-expert human evaluation and compared to human-generated summaries using the ROUGE metric. Though the goal was to fully automate summarization of A-133 audits, human input was required at various stages due to large variability in audit writing style, content, and context. Examples of human inputs include the number of clusters, the choice to keep or discard certain clusters based on their content relevance, and the definition of a top sentence. Overall, this approach made progress towards automated extractive summaries of A-133 audits, with future work to focus on full automation and improving summary consistency. This work highlights the inherent difficulty and subjective nature of automated summarization in a real-world application.
Tasks Text Summarization
Published 2019-11-08
URL https://arxiv.org/abs/1911.06197v1
PDF https://arxiv.org/pdf/1911.06197v1.pdf
PWC https://paperswithcode.com/paper/towards-automatic-extractive-text
Repo
Framework
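
The extraction step described in the abstract above, selecting summary sentences by proximity to cluster centroids, can be sketched as follows. This is an illustration under assumptions: sentence vectors and centroids are taken as already computed (the study used GloVe embeddings and k-means; neither is reproduced here), and the helper name `closest_sentences` is hypothetical.

```python
def closest_sentences(sentences, vectors, centroids):
    """Pick, for each centroid, the sentence whose vector is nearest to it.

    Returns the selected sentences in their original document order,
    with duplicates removed when two centroids share a nearest sentence.
    """
    picked = set()
    for centroid in centroids:
        # Squared Euclidean distance from every sentence vector to this centroid.
        dists = [
            sum((vi - ci) ** 2 for vi, ci in zip(vec, centroid))
            for vec in vectors
        ]
        picked.add(min(range(len(vectors)), key=dists.__getitem__))
    return [sentences[i] for i in sorted(picked)]


# Toy inputs: three sentences with 2-D vectors and two cluster centroids.
summary = closest_sentences(
    ["Finding one.", "Background text.", "Finding two."],
    [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]],
    [[0.2, 0.1], [4.0, 4.0]],
)
```

The human inputs the abstract mentions (number of clusters, which clusters to keep, what counts as a top sentence) correspond to choices made before this step is called.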

D3M: A deep domain decomposition method for partial differential equations

Title D3M: A deep domain decomposition method for partial differential equations
Authors Ke Li, Kejun Tang, Tianfan Wu, Qifeng Liao
Abstract A state-of-the-art deep domain decomposition method (D3M) based on the variational principle is proposed for partial differential equations (PDEs). The solution of PDEs can be formulated as the solution of a constrained optimization problem, and we design a multi-fidelity neural network framework to solve this optimization problem. Our contribution is to develop a systematic computational procedure for the underlying problem in parallel with domain decomposition. Our analysis shows that the D3M approximation solution converges to the exact solution of the underlying PDEs. Our proposed framework establishes a foundation to use variational deep learning in large-scale engineering problems and designs. We present a general mathematical framework of D3M, validate its accuracy and demonstrate its efficiency with numerical experiments.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.12236v1
PDF https://arxiv.org/pdf/1909.12236v1.pdf
PWC https://paperswithcode.com/paper/d3m-a-deep-domain-decomposition-method-for
Repo
Framework

A Vietnamese Text-Based Conversational Agent

Title A Vietnamese Text-Based Conversational Agent
Authors Dai Quoc Nguyen, Dat Quoc Nguyen, Son Bao Pham
Abstract This paper introduces a Vietnamese text-based conversational agent architecture for a specific knowledge domain, integrated into a question answering system. When the question answering system fails to answer a user's input, our conversational agent can step in and interact with the user to provide an answer. Experimental results are promising: our Vietnamese text-based conversational agent received positive feedback in a study conducted in the university academic regulation domain.
Tasks Question Answering
Published 2019-11-26
URL https://arxiv.org/abs/1911.11547v1
PDF https://arxiv.org/pdf/1911.11547v1.pdf
PWC https://paperswithcode.com/paper/a-vietnamese-text-based-conversational-agent
Repo
Framework

Traffic signal control optimization under severe incident conditions using Genetic Algorithm

Title Traffic signal control optimization under severe incident conditions using Genetic Algorithm
Authors Tuo Mao, Adriana-Simona Mihaita, Chen Cai
Abstract Traffic control optimization is a challenging task for traffic centres around the world, and the majority of approaches focus only on applying adaptive methods under normal (recurrent) traffic conditions. Optimizing the control plans when severe incidents occur remains a hard problem, especially if a high number of lanes or entire intersections are affected. This paper aims at tackling this problem and presents a novel methodology for optimizing the traffic signal timings in signalized urban intersections under non-recurrent traffic incidents. The approach relies on genetic algorithms (GA), taking the phase durations as decision variables and the total travel time in the network as the objective function to minimize. Firstly, we develop the GA on a signalized testbed network under recurrent traffic conditions, with the purpose of fine-tuning the algorithm for crossover, mutation, and fitness calculation, and obtaining the optimal phase durations. Secondly, we apply the optimal signal timings previously found under severe incidents affecting the traffic flow in the network but without any further optimization. Lastly, we further apply the GA optimization under incident conditions and show that our approach improved the total travel time by almost 40.76%.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.05356v1
PDF https://arxiv.org/pdf/1906.05356v1.pdf
PWC https://paperswithcode.com/paper/traffic-signal-control-optimization-under
Repo
Framework
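
The setup in the abstract above (phase durations as decision variables, total travel time as the objective, tuned by crossover and mutation) can be sketched with a small genetic algorithm. Everything concrete here is an assumption: `toy_travel_time` is a hypothetical stand-in for a traffic simulator, and the population size, bounds, elitism scheme, and mutation settings are illustrative choices, not the authors' configuration.

```python
import random


def toy_travel_time(phases, demand=(30.0, 20.0, 10.0)):
    """Hypothetical cost: penalize mismatch between green time and demand."""
    return sum((p - d) ** 2 for p, d in zip(phases, demand))


def genetic_search(cost, n_phases=3, bounds=(5.0, 60.0), pop_size=20,
                   generations=40, mutation_rate=0.2, seed=0):
    """Minimize cost over phase durations with a simple elitist GA."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_phases)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=cost)
        history.append(cost(pop[0]))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_phases)  # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n_phases):
                if rng.random() < mutation_rate:
                    # Gaussian mutation, clamped to the allowed duration range.
                    child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, 2.0)))
            children.append(child)
        pop = elite + children
    pop.sort(key=cost)
    history.append(cost(pop[0]))
    return pop[0], history


best_phases, best_costs = genetic_search(toy_travel_time)
```

Because the better half is carried over unchanged, the best cost recorded per generation never increases. In a realistic setting the cost function would be a simulation run, which dominates the runtime.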
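
Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search

Title Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search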
Authors Sahin Cem Geyik, Stuart Ambler, Krishnaram Kenthapadi
Abstract We present a framework for quantifying and mitigating algorithmic bias in mechanisms designed for ranking individuals, typically used as part of web-scale search and recommendation systems. We first propose complementary measures to quantify bias with respect to protected attributes such as gender and age. We then present algorithms for computing fairness-aware re-ranking of results. For a given search or recommendation task, our algorithms seek to achieve a desired distribution of top ranked results with respect to one or more protected attributes. We show that such a framework can be tailored to achieve fairness criteria such as equality of opportunity and demographic parity depending on the choice of the desired distribution. We evaluate the proposed algorithms via extensive simulations over different parameter choices, and study the effect of fairness-aware ranking on both bias and utility measures. We finally present the online A/B testing results from applying our framework towards representative ranking in LinkedIn Talent Search, and discuss the lessons learned in practice. Our approach resulted in tremendous improvement in the fairness metrics (nearly threefold increase in the number of search queries with representative results) without affecting the business metrics, which paved the way for deployment to 100% of LinkedIn Recruiter users worldwide. Ours is the first large-scale deployed framework for ensuring fairness in the hiring domain, with potential positive impact for more than 630M LinkedIn members.
Tasks Recommendation Systems
Published 2019-04-30
URL https://arxiv.org/abs/1905.01989v3
PDF https://arxiv.org/pdf/1905.01989v3.pdf
PWC https://paperswithcode.com/paper/fairness-aware-ranking-in-search
Repo
Framework
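
The re-ranking goal stated in the abstract above, achieving a desired distribution over a protected attribute among the top ranked results, can be sketched with a simple greedy rule. This is a simplified illustration and not one of the paper's algorithms: at each position it picks the best-scoring candidate from any group that is below its minimum count for that prefix, and otherwise the best-scoring candidate overall. The name `fair_rerank` and the input format are hypothetical.

```python
import math


def fair_rerank(candidates, target, k):
    """Greedy re-ranking toward a target group distribution.

    candidates: list of (score, group) pairs.
    target: dict mapping group -> desired proportion in every prefix.
    k: number of positions to fill.
    """
    # One score-sorted queue per group.
    queues = {}
    for score, group in sorted(candidates, reverse=True):
        queues.setdefault(group, []).append((score, group))
    counts = {group: 0 for group in queues}
    ranked = []
    for pos in range(1, k + 1):
        available = [g for g in queues if queues[g]]
        if not available:
            break
        # Groups that have fewer results than floor(proportion * prefix length).
        below_min = [
            g for g in available
            if counts[g] < math.floor(target.get(g, 0.0) * pos)
        ]
        pool = below_min or available
        best = max(pool, key=lambda g: queues[g][0][0])
        ranked.append(queues[best].pop(0))
        counts[best] += 1
    return ranked


results = fair_rerank(
    [(0.9, "a"), (0.8, "a"), (0.7, "a"), (0.6, "b"), (0.5, "b"), (0.4, "a")],
    {"a": 0.5, "b": 0.5},
    k=4,
)
```

With equal targets for the two groups, the toy input yields an alternating top four even though group "a" holds the three highest scores.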