Paper Group ANR 328
Bootstrapping incremental dialogue systems: using linguistic knowledge to learn from minimal data
Title | Bootstrapping incremental dialogue systems: using linguistic knowledge to learn from minimal data |
Authors | Dimitrios Kalatzis, Arash Eshghi, Oliver Lemon |
Abstract | We present a method for inducing new dialogue systems from very small amounts of unannotated dialogue data, showing how word-level exploration using Reinforcement Learning (RL), combined with an incremental and semantic grammar - Dynamic Syntax (DS) - allows systems to discover, generate, and understand many new dialogue variants. The method avoids the use of expensive and time-consuming dialogue act annotations, and supports more natural (incremental) dialogues than turn-based systems. Here, language generation and dialogue management are treated as a joint decision/optimisation problem, and the MDP model for RL is constructed automatically. With an implemented system, we show that this method enables a wide range of dialogue variations to be automatically captured, even when the system is trained from only a single dialogue. The variants include question-answer pairs, over- and under-answering, self- and other-corrections, clarification interaction, split-utterances, and ellipsis. This generalisation property results from the structural knowledge and constraints present within the DS grammar, and highlights some limitations of recent systems built using machine learning techniques only. |
Tasks | Dialogue Management, Text Generation |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00347v1 |
PDF | http://arxiv.org/pdf/1612.00347v1.pdf |
PWC | https://paperswithcode.com/paper/bootstrapping-incremental-dialogue-systems |
Repo | |
Framework | |
Bayesian Model Selection of Stochastic Block Models
Title | Bayesian Model Selection of Stochastic Block Models |
Authors | Xiaoran Yan |
Abstract | A central problem in analyzing networks is partitioning them into modules or communities. One of the best tools for this is the stochastic block model, which clusters vertices into blocks with statistically homogeneous patterns of links. Despite its flexibility and popularity, there has been a lack of principled statistical model selection criteria for the stochastic block model. Here we propose a Bayesian framework for choosing the number of blocks as well as comparing it to the more elaborate degree-corrected block models, ultimately leading to a universal model selection framework capable of comparing multiple modeling combinations. We also investigate its connection to the minimum description length principle. |
Tasks | Model Selection |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.07057v1 |
PDF | http://arxiv.org/pdf/1605.07057v1.pdf |
PWC | https://paperswithcode.com/paper/bayesian-model-selection-of-stochastic-block |
Repo | |
Framework | |
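The model selection here rests on Bayesian evidence: integrating out the block-to-block edge probabilities and comparing the resulting marginal likelihoods across block counts. Below is a minimal sketch of that computation for a fixed partition, assuming Beta(1,1) priors; it is not the paper's full framework, which also handles degree correction.

```python
import numpy as np
from scipy.special import betaln

def sbm_log_evidence(A, z, K):
    """Log marginal likelihood of an undirected SBM under Beta(1,1)
    priors on the block-to-block edge probabilities, for a fixed
    partition z; each probability integrates out in closed form."""
    logev = 0.0
    for r in range(K):
        for s in range(r, K):
            idx_r, idx_s = np.where(z == r)[0], np.where(z == s)[0]
            if r == s:  # unordered vertex pairs within a block
                n_pairs = len(idx_r) * (len(idx_r) - 1) // 2
                m_edges = A[np.ix_(idx_r, idx_r)].sum() // 2
            else:
                n_pairs = len(idx_r) * len(idx_s)
                m_edges = A[np.ix_(idx_r, idx_s)].sum()
            # integral of p^m (1-p)^(n-m) dp under a flat prior
            logev += betaln(m_edges + 1, n_pairs - m_edges + 1)
    return logev

# Compare candidate block counts for fixed clusterings (e.g. from
# spectral clustering); higher evidence wins, a Bayesian Occam's razor.
```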
subgraph2vec: Learning Distributed Representations of Rooted Sub-graphs from Large Graphs
Title | subgraph2vec: Learning Distributed Representations of Rooted Sub-graphs from Large Graphs |
Authors | Annamalai Narayanan, Mahinthan Chandramohan, Lihui Chen, Yang Liu, Santhoshkumar Saminathan |
Abstract | In this paper, we present subgraph2vec, a novel approach for learning latent representations of rooted subgraphs from large graphs, inspired by recent advancements in Deep Learning and Graph Kernels. These latent representations encode semantic substructure dependencies in a continuous vector space, which is easily exploited by statistical models for tasks such as graph classification, clustering, link prediction and community detection. subgraph2vec leverages local information obtained from neighbourhoods of nodes to learn their latent representations in an unsupervised fashion. We demonstrate that subgraph vectors learnt by our approach can be used in conjunction with classifiers such as CNNs, SVMs and relational data clustering algorithms to achieve significantly superior accuracies. We also show that the subgraph vectors can be used to build a deep learning variant of the Weisfeiler-Lehman graph kernel. Our experiments on several benchmark and large-scale real-world datasets reveal that subgraph2vec achieves significant improvements in accuracy over existing graph kernels on both supervised and unsupervised learning tasks. Specifically, on two real-world program analysis tasks, namely code clone and malware detection, subgraph2vec outperforms state-of-the-art kernels by more than 17% and 4%, respectively. |
Tasks | Community Detection, Graph Classification, Link Prediction, Malware Detection |
Published | 2016-06-29 |
URL | http://arxiv.org/abs/1606.08928v1 |
PDF | http://arxiv.org/pdf/1606.08928v1.pdf |
PWC | https://paperswithcode.com/paper/subgraph2vec-learning-distributed |
Repo | |
Framework | |
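The core idea is a skip-gram-style embedding where the "words" are Weisfeiler-Lehman-relabeled rooted subgraphs. A rough sketch under stated assumptions: networkx graphs, gensim's Word2Vec as the embedding model, and a plain co-occurrence window standing in for the paper's radial skip-gram context.

```python
import hashlib
import networkx as nx
from gensim.models import Word2Vec

def wl_tokens(G, max_degree=2):
    """Rooted-subgraph tokens per node via Weisfeiler-Lehman relabeling:
    the degree-d token of a node hashes its own degree-(d-1) token
    together with its neighbours' degree-(d-1) tokens (sorted)."""
    labels = {v: str(G.degree(v)) for v in G}
    tokens = list(labels.values())
    for _ in range(max_degree):
        new = {}
        for v in G:
            sig = labels[v] + "|" + ",".join(sorted(labels[u] for u in G[v]))
            new[v] = hashlib.md5(sig.encode()).hexdigest()[:8]
        labels = new
        tokens.extend(labels.values())
    return tokens

# One "sentence" of subgraph tokens per graph; Word2Vec's window then
# plays the role of the paper's radial context (a simplification).
graphs = [nx.erdos_renyi_graph(30, 0.1, seed=s) for s in range(50)]
corpus = [wl_tokens(G) for G in graphs]
model = Word2Vec(corpus, vector_size=64, window=5, min_count=1, sg=1)
```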
Stochastic Variance Reduction Methods for Saddle-Point Problems
Title | Stochastic Variance Reduction Methods for Saddle-Point Problems |
Authors | P Balamurugan, Francis Bach |
Abstract | We consider convex-concave saddle-point problems where the objective functions may be split in many components, and extend recent stochastic variance reduction methods (such as SVRG or SAGA) to provide the first large-scale linearly convergent algorithms for this class of problems, which is common in machine learning. While the algorithmic extension is straightforward, it comes with challenges and opportunities: (a) the convex minimization analysis does not apply and we use the notion of monotone operators to prove convergence, showing in particular that the same algorithm applies to a larger class of problems, such as variational inequalities, (b) there are two notions of splits, in terms of functions, or in terms of partial derivatives, (c) the split does need to be done with convex-concave terms, (d) non-uniform sampling is key to an efficient algorithm, both in theory and practice, and (e) these incremental algorithms can be easily accelerated using a simple extension of the “catalyst” framework, leading to an algorithm which is always superior to accelerated batch algorithms. |
Tasks | |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06398v2 |
PDF | http://arxiv.org/pdf/1605.06398v2.pdf |
PWC | https://paperswithcode.com/paper/stochastic-variance-reduction-methods-for-1 |
Repo | |
Framework | |
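The algorithmic core, SVRG-style variance reduction applied to the monotone operator of a saddle function, can be illustrated on a strongly convex-concave bilinear problem. A sketch with uniform sampling (the paper stresses that non-uniform sampling is key in practice; constants here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 20
A = rng.standard_normal((n, d)) / np.sqrt(n)
lam, gam, eta = 0.1, 0.1, 0.05

def full_op(x, y):
    # Monotone operator (grad_x, -grad_y) of the saddle function
    # lam/2 |x|^2 + (1/n) y^T A x - gam/2 |y|^2; its zero is (0, 0).
    return lam * x + A.T @ y / n, gam * y - A @ x / n

def comp_op(i, x, y):
    # i-th component operator; averaging over i recovers full_op.
    gx = lam * x + y[i] * A[i]
    gy = gam * y.copy()
    gy[i] -= A[i] @ x
    return gx, gy

x, y = np.ones(d), np.ones(n)
for epoch in range(50):
    sx, sy = x.copy(), y.copy()          # snapshot
    Fx, Fy = full_op(sx, sy)             # full operator at snapshot
    for _ in range(n):
        i = rng.integers(n)
        gx, gy = comp_op(i, x, y)
        hx, hy = comp_op(i, sx, sy)
        x -= eta * (gx - hx + Fx)        # variance-reduced forward step
        y -= eta * (gy - hy + Fy)

print(np.linalg.norm(x), np.linalg.norm(y))  # both shrink toward the saddle point at 0
```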
Solving Systems of Random Quadratic Equations via Truncated Amplitude Flow
Title | Solving Systems of Random Quadratic Equations via Truncated Amplitude Flow |
Authors | Gang Wang, Georgios B. Giannakis, Yonina C. Eldar |
Abstract | This paper presents a new algorithm, termed \emph{truncated amplitude flow} (TAF), to recover an unknown vector $\bm{x}$ from a system of quadratic equations of the form $y_i=\langle\bm{a}_i,\bm{x}\rangle^2$, where $\bm{a}_i$'s are given random measurement vectors. This problem is known to be \emph{NP-hard} in general. We prove that as soon as the number of equations is on the order of the number of unknowns, TAF recovers the solution exactly (up to a global unimodular constant) with high probability and complexity growing linearly with both the number of unknowns and the number of equations. Our TAF approach adopts the \emph{amplitude-based} empirical loss function, and proceeds in two stages. In the first stage, we introduce an \emph{orthogonality-promoting} initialization that can be obtained with a few power iterations. Stage two refines the initial estimate by successive updates of scalable \emph{truncated generalized gradient iterations}, which are able to handle the rather challenging nonconvex and nonsmooth amplitude-based objective function. In particular, when vectors $\bm{x}$ and $\bm{a}_i$'s are real-valued, our gradient truncation rule provably eliminates erroneously estimated signs with high probability to markedly improve upon its untruncated version. Numerical tests using synthetic data and real images demonstrate that our initialization returns more accurate and robust estimates relative to spectral initializations. Furthermore, even under the same initialization, the proposed amplitude-based refinement outperforms existing Wirtinger flow variants, corroborating the superior performance of TAF over state-of-the-art algorithms. |
Tasks | |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08285v5 |
PDF | http://arxiv.org/pdf/1605.08285v5.pdf |
PWC | https://paperswithcode.com/paper/solving-systems-of-random-quadratic-equations |
Repo | |
Framework | |
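A rough real-valued sketch of the two stages described in the abstract: the orthogonality-promoting initialization (here, the smallest eigenvector of the covariance of the measurement rows most orthogonal to the signal) followed by truncated gradient steps on the amplitude loss. The truncation threshold and step size are illustrative, not the paper's tuned values.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 100, 600
x_true = rng.standard_normal(n)
Amat = rng.standard_normal((m, n))
psi = np.abs(Amat @ x_true)                # amplitudes sqrt(y_i)

# Stage 1: rows a_i with small |<a_i, x>| / ||a_i|| are nearly
# orthogonal to x, so x should be least correlated with their span.
ratio = psi / np.linalg.norm(Amat, axis=1)
I_bar = np.argsort(ratio)[: m // 2]        # most-orthogonal half
S = Amat[I_bar] / np.linalg.norm(Amat[I_bar], axis=1, keepdims=True)
w, V = np.linalg.eigh(S.T @ S / len(I_bar))
z = V[:, 0] * np.sqrt(np.mean(psi ** 2))   # smallest eigenvector, scaled to ~||x||

# Stage 2: truncated gradient iterations on the amplitude loss
# (1/2m) sum_i (|a_i^T z| - psi_i)^2, dropping terms whose sign
# estimate sign(a_i^T z) is unreliable (|a_i^T z| small vs. psi_i).
gamma, mu = 0.7, 0.6
for _ in range(500):
    az = Amat @ z
    keep = np.abs(az) >= psi / (1 + gamma)  # truncation rule
    grad = Amat[keep].T @ (az[keep] - psi[keep] * np.sign(az[keep])) / m
    z -= mu * grad

err = min(np.linalg.norm(z - x_true), np.linalg.norm(z + x_true))
print(err / np.linalg.norm(x_true))        # small relative error (up to global sign)
```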
Mixed Strategy for Constrained Stochastic Optimal Control
Title | Mixed Strategy for Constrained Stochastic Optimal Control |
Authors | Masahiro Ono, Mahmoud El Chamie, Marco Pavone, Behcet Acikmese |
Abstract | Choosing control inputs randomly can result in a reduced expected cost in optimal control problems with stochastic constraints, such as stochastic model predictive control (SMPC). We consider a controller with initial randomization, meaning that the controller randomly chooses from K+1 control sequences at the beginning (called K-randomization). It is known that, for a finite-state, finite-action Markov Decision Process (MDP) with K constraints, K-randomization is sufficient to achieve the minimum cost. We found that the same result holds for stochastic optimal control problems with continuous state and action spaces. Furthermore, we show that randomization of the control input can result in a reduced cost when the optimization problem is nonconvex, and that the cost reduction is equal to the duality gap. We then provide necessary and sufficient conditions for the optimality of a randomized solution, and develop an efficient solution method based on dual optimization. Furthermore, in the special case K=1, such as a joint chance-constrained problem, the dual optimization can be solved even more efficiently by root finding. Finally, we test the theory and demonstrate the solution method on multiple practical problems ranging from path planning to the planning of entry, descent, and landing (EDL) for future Mars missions. |
Tasks | |
Published | 2016-07-06 |
URL | http://arxiv.org/abs/1607.01478v1 |
PDF | http://arxiv.org/pdf/1607.01478v1.pdf |
PWC | https://paperswithcode.com/paper/mixed-strategy-for-constrained-stochastic |
Repo | |
Framework | |
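For K=1, the optimal mixed strategy reduces to mixing two deterministic sequences so that the chance constraint becomes active, which is where the root-finding observation comes from. A toy sketch with invented costs and risks:

```python
# Toy K=1 mixed strategy: two candidate control sequences with costs J
# and constraint-violation risks r; mix them so the chance constraint
# sum_k p_k r_k <= Delta holds while the expected cost is minimized.
def mix_two(J0, r0, J1, r1, Delta):
    # Assumes r0 <= Delta <= r1 and J1 < J0 (the cheaper sequence is
    # riskier); otherwise a single deterministic sequence is optimal.
    p1 = (Delta - r0) / (r1 - r0)   # makes the constraint active: p0*r0 + p1*r1 = Delta
    return {0: 1 - p1, 1: p1}

probs = mix_two(J0=10.0, r0=0.01, J1=6.0, r1=0.10, Delta=0.05)
# Expected cost 0.556*10 + 0.444*6 ~= 8.2 < 10, with risk exactly 0.05:
# the mixed strategy beats either deterministic sequence on its own.
```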
Extending Unification in $\mathcal{EL}$ to Disunification: The Case of Dismatching and Local Disunification
Title | Extending Unification in $\mathcal{EL}$ to Disunification: The Case of Dismatching and Local Disunification |
Authors | Franz Baader, Stefan Borgwardt, Barbara Morawska |
Abstract | Unification in Description Logics has been introduced as a means to detect redundancies in ontologies. We try to extend the known decidability results for unification in the Description Logic $\mathcal{EL}$ to disunification since negative constraints can be used to avoid unwanted unifiers. While decidability of the solvability of general $\mathcal{EL}$-disunification problems remains an open problem, we obtain NP-completeness results for two interesting special cases: dismatching problems, where one side of each negative constraint must be ground, and local solvability of disunification problems, where we consider only solutions that are constructed from terms occurring in the input problem. More precisely, we first show that dismatching can be reduced to local disunification, and then provide two complementary NP-algorithms for finding local solutions of disunification problems. |
Tasks | |
Published | 2016-09-19 |
URL | http://arxiv.org/abs/1609.05621v2 |
PDF | http://arxiv.org/pdf/1609.05621v2.pdf |
PWC | https://paperswithcode.com/paper/extending-unification-in-mathcalel-to |
Repo | |
Framework | |
Neural Sentence Ordering
Title | Neural Sentence Ordering |
Authors | Xinchi Chen, Xipeng Qiu, Xuanjing Huang |
Abstract | Sentence ordering is a general and critical task for natural language generation applications. Previous works have focused on improving its performance in an external, downstream task, such as multi-document summarization. Given its importance, we propose to study it as an isolated task. We collect a large corpus of academic texts, derive a data-driven approach to learn the pairwise ordering of sentences, and validate its efficacy with extensive experiments. The source code and dataset for this paper will be made publicly available. |
Tasks | Document Summarization, Multi-Document Summarization, Sentence Ordering, Text Generation |
Published | 2016-07-23 |
URL | http://arxiv.org/abs/1607.06952v1 |
PDF | http://arxiv.org/pdf/1607.06952v1.pdf |
PWC | https://paperswithcode.com/paper/neural-sentence-ordering |
Repo | |
Framework | |
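The pairwise formulation suggests a simple recipe: learn P(a precedes b) from sentence-pair features, then order sentences by aggregate pairwise scores. A sketch with scikit-learn, using TF-IDF features as a stand-in for the paper's neural encoders:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Training pairs from documents with known order: label 1 if a precedes b.
docs = [["First we define the model.",
         "Then we train it on the corpus.",
         "Finally we report test results."]]
vec = TfidfVectorizer().fit(s for d in docs for s in d)

def pair_feats(a, b):
    fa = vec.transform([a]).toarray()[0]
    fb = vec.transform([b]).toarray()[0]
    return np.concatenate([fa, fb])

X, y = [], []
for d in docs:
    for i in range(len(d)):
        for j in range(len(d)):
            if i != j:
                X.append(pair_feats(d[i], d[j]))
                y.append(int(i < j))
clf = LogisticRegression(max_iter=1000).fit(X, y)

def order(sents):
    # Rank each sentence by how many others it is predicted to precede.
    wins = [sum(clf.predict_proba([pair_feats(s, t)])[0, 1]
                for t in sents if t is not s) for s in sents]
    return [s for _, s in sorted(zip(wins, sents), reverse=True)]
```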
Blind phoneme segmentation with temporal prediction errors
Title | Blind phoneme segmentation with temporal prediction errors |
Authors | Paul Michel, Okko Räsänen, Roland Thiollière, Emmanuel Dupoux |
Abstract | Phonemic segmentation of speech is a critical step in speech recognition systems. We propose a novel unsupervised algorithm based on sequence prediction models such as Markov chains and recurrent neural networks. Our approach consists of analyzing the error profile of a model trained to predict speech features frame-by-frame. Specifically, we try to learn the dynamics of speech in the MFCC space and hypothesize boundaries from local maxima in the prediction error. We evaluate our system on the TIMIT dataset, with improvements over similar methods. |
Tasks | Speech Recognition |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00508v2 |
PDF | http://arxiv.org/pdf/1608.00508v2.pdf |
PWC | https://paperswithcode.com/paper/blind-phoneme-segmentation-with-temporal |
Repo | |
Framework | |
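The method is easy to prototype: train a frame predictor, then hypothesize boundaries at local maxima of its error. A sketch using ridge regression as a stand-in for the paper's Markov-chain and RNN predictors:

```python
import numpy as np

def boundaries_from_prediction_error(mfcc, context=2, peak_win=3):
    """Hypothesize phoneme boundaries at local maxima of the error of a
    linear next-frame predictor (ridge regression). mfcc: (T, D) array."""
    T, D = mfcc.shape
    # Stack `context` consecutive frames as input, next frame as target.
    X = np.hstack([mfcc[t:T - context + t] for t in range(context)])
    Y = mfcc[context:]
    W = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ Y)
    err = np.linalg.norm(X @ W - Y, axis=1)       # per-frame prediction error
    # Local maxima of the error profile become boundary hypotheses.
    bnds = [t for t in range(peak_win, len(err) - peak_win)
            if err[t] == err[t - peak_win:t + peak_win + 1].max()]
    return [t + context for t in bnds]
```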
Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks
Title | Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks |
Authors | Jefferson Ryan Medel, Andreas Savakis |
Abstract | Automating the detection of anomalous events within long video sequences is challenging due to the ambiguity of how such events are defined. We approach the problem by learning generative models that can identify anomalies in videos using limited supervision. We propose end-to-end trainable composite Convolutional Long Short-Term Memory (Conv-LSTM) networks that are able to predict the evolution of a video sequence from a small number of input frames. Regularity scores are derived from the reconstruction errors of a set of predictions, with abnormal video sequences yielding lower regularity scores as they diverge further from the actual sequence over time. The models utilize a composite structure, and we examine the effects of conditioning on learning more meaningful representations. The best model is chosen based on reconstruction and prediction accuracy. The Conv-LSTM models are evaluated both qualitatively and quantitatively, demonstrating competitive results on anomaly detection datasets. Conv-LSTM units are shown to be an effective tool for modeling and predicting video sequences. |
Tasks | Anomaly Detection |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00390v2 |
PDF | http://arxiv.org/pdf/1612.00390v2.pdf |
PWC | https://paperswithcode.com/paper/anomaly-detection-in-video-using-predictive |
Repo | |
Framework | |
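The regularity score in this line of work is typically a min-max normalization of per-frame reconstruction or prediction error (following Hasan et al.). A minimal sketch, assuming the model's per-frame predictions are already available:

```python
import numpy as np

def regularity_score(pred_frames, true_frames):
    """Per-frame regularity in [0, 1]; low scores flag anomalous frames.
    pred_frames/true_frames: (T, H, W) arrays of predicted and observed frames."""
    err = np.array([np.sum((p - t) ** 2)
                    for p, t in zip(pred_frames, true_frames)])
    return 1.0 - (err - err.min()) / (err.max() - err.min() + 1e-8)
```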
Modeling Visual Compatibility through Hierarchical Mid-level Elements
Title | Modeling Visual Compatibility through Hierarchical Mid-level Elements |
Authors | Jose Oramas, Tinne Tuytelaars |
Abstract | In this paper we present a hierarchical method to discover mid-level elements with the objective of modeling visual compatibility between objects. At the base-level, our method identifies patterns of CNN activations with the aim of modeling different variations/styles in which objects of the classes of interest may occur. At the top-level, the proposed method discovers patterns of co-occurring activations of base-level elements that define visual compatibility between pairs of object classes. Experiments on the massive Amazon dataset show the strength of our method at describing object classes and the characteristics that drive the compatibility between them. |
Tasks | |
Published | 2016-03-31 |
URL | http://arxiv.org/abs/1604.00036v1 |
PDF | http://arxiv.org/pdf/1604.00036v1.pdf |
PWC | https://paperswithcode.com/paper/modeling-visual-compatibility-through |
Repo | |
Framework | |
Identifying Significant Predictive Bias in Classifiers
Title | Identifying Significant Predictive Bias in Classifiers |
Authors | Zhe Zhang, Daniel B. Neill |
Abstract | We present a novel subset scan method to detect whether a probabilistic binary classifier has statistically significant bias – over- or under-predicting the risk – for some subgroup, and to identify the characteristics of this subgroup. This form of model checking and goodness-of-fit test provides a way to interpretably detect the presence of classifier bias or regions of poor classifier fit. It allows consideration of not just subgroups of a priori interest or of small dimension, but the space of all possible subgroups of features. To address the difficulty of considering these exponentially many possible subgroups, we use subset scan and parametric bootstrap-based methods. Extending this method, we can penalize the complexity of the detected subgroup and also identify subgroups with high classification error. We demonstrate these methods and find interesting results on the COMPAS crime recidivism and credit delinquency data. |
Tasks | |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08292v2 |
PDF | http://arxiv.org/pdf/1611.08292v2.pdf |
PWC | https://paperswithcode.com/paper/identifying-significant-predictive-bias-in |
Repo | |
Framework | |
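A drastically simplified version of the scan: score subgroups of a single categorical feature by a Bernoulli likelihood ratio under a fitted multiplicative odds shift. The paper's method instead searches multi-feature subgroups with fast subset scan and calibrates significance by parametric bootstrap.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def subgroup_bias_score(y, p, mask):
    """Max log-likelihood ratio for "predicted odds are off by a factor
    q" in the subgroup vs. q = 1, given binary outcomes y and predicted
    probabilities p."""
    ys, ps = y[mask], p[mask]
    def negll(logq):
        q = np.exp(logq)
        pq = q * ps / (1 - ps + q * ps)          # odds-shifted probabilities
        return -np.sum(ys * np.log(pq) + (1 - ys) * np.log(1 - pq))
    base = negll(0.0)                            # likelihood at q = 1
    res = minimize_scalar(negll, bounds=(-3, 3), method="bounded")
    return base - res.fun                        # >= 0; large means biased subgroup

def scan_feature(y, p, feature):
    # Scan the values of one categorical feature for predictive bias.
    return {v: subgroup_bias_score(y, p, feature == v)
            for v in np.unique(feature)}
```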
A probabilistic network for the diagnosis of acute cardiopulmonary diseases
Title | A probabilistic network for the diagnosis of acute cardiopulmonary diseases |
Authors | Alessandro Magrini, Davide Luciani, Federico Mattia Stefanini |
Abstract | In this paper, the development of a probabilistic network for the diagnosis of acute cardiopulmonary diseases is presented. This paper is a draft version of the article published after peer review in 2018 (https://doi.org/10.1002/bimj.201600206). A panel of expert physicians collaborated to specify the qualitative part, that is, a directed acyclic graph defining a factorization of the joint probability distribution of the domain variables. The quantitative part, that is, the set of all conditional probability distributions defined by each factor, was estimated in the Bayesian paradigm: we applied a special formal representation, characterized by a low number of parameters and a parameterization intelligible to physicians, elicited the joint prior distribution of the parameters from medical experts, and updated it by conditioning on a dataset of hospital patient records using Markov Chain Monte Carlo simulation. Refinement was performed cyclically until the probabilistic network provided satisfactory Concordance Index values for a selection of acute diseases and reasonable inference on six fictitious patient cases. The probabilistic network can be employed to perform medical diagnosis for a total of 63 diseases (38 acute and 25 chronic) on the basis of up to 167 patient findings. |
Tasks | Medical Diagnosis |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.06864v2 |
PDF | http://arxiv.org/pdf/1609.06864v2.pdf |
PWC | https://paperswithcode.com/paper/a-probabilistic-network-for-the-diagnosis-of |
Repo | |
Framework | |
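The qualitative/quantitative split amounts to a DAG factorization with estimated conditional probability tables. A toy two-finding example with invented numbers, purely to illustrate the factorized inference (the actual network covers 63 diseases and up to 167 findings, with parameters estimated by MCMC from expert priors and patient records):

```python
import numpy as np

# Toy diagnosis net: P(D, F1, F2) = P(D) P(F1 | D) P(F2 | D),
# all numbers invented for illustration.
p_d = np.array([0.01, 0.99])        # disease present / absent
p_f1_d = np.array([[0.90, 0.05],    # P(F1 = yes | D), P(F1 = yes | no D)
                   [0.10, 0.95]])
p_f2_d = np.array([[0.70, 0.20],
                   [0.30, 0.80]])

# Posterior P(D | F1 = yes, F2 = yes) by enumerating the factorization.
joint = p_d * p_f1_d[0] * p_f2_d[0]
posterior = joint / joint.sum()
print(posterior[0])   # ~0.39: two positive findings raise the 1% prior
```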
Nonrigid Optical Flow Ground Truth for Real-World Scenes with Time-Varying Shading Effects
Title | Nonrigid Optical Flow Ground Truth for Real-World Scenes with Time-Varying Shading Effects |
Authors | Wenbin Li, Darren Cosker, Zhihan Lv, Matthew Brown |
Abstract | In this paper we present a dense ground truth dataset of nonrigidly deforming real-world scenes. Our dataset contains both long and short video sequences, and enables the quantitative evaluation of RGB-based tracking and registration methods. To construct ground truth for the RGB sequences, we simultaneously capture Near-Infrared (NIR) image sequences in which dense markers - visible only in NIR - represent ground truth positions. This allows for comparison with automatically tracked RGB positions and the formation of error metrics. Most previous datasets containing nonrigidly deforming sequences are based on synthetic data. Our capture protocol enables us to acquire real-world deforming objects with realistic photometric effects - such as blur and illumination change - as well as occlusion and complex deformations. A public evaluation website is constructed to allow ranking of RGB-based optical flow and other dense tracking algorithms with various statistical measures. Furthermore, we present an RGB-NIR multispectral optical flow model that allows for energy optimization by adaptively combining feature information from both the RGB and the complementary NIR channels. In our experiments we evaluate eight existing RGB-based optical flow methods on our new dataset. We also evaluate our hybrid optical flow algorithm by comparing to two existing multispectral approaches, as well as by varying our input channels across RGB, NIR and RGB-NIR. |
Tasks | Optical Flow Estimation |
Published | 2016-03-26 |
URL | http://arxiv.org/abs/1603.08120v3 |
PDF | http://arxiv.org/pdf/1603.08120v3.pdf |
PWC | https://paperswithcode.com/paper/nonrigid-optical-flow-ground-truth-for-real |
Repo | |
Framework | |
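Evaluation against the NIR-marker ground truth comes down to standard dense-flow error statistics. A sketch of the two most common metrics, average endpoint error and angular error, assuming flows stored as (H, W, 2) arrays:

```python
import numpy as np

def endpoint_error(flow_est, flow_gt):
    """Average endpoint error (AEE) between estimated and ground-truth
    dense flow fields of shape (H, W, 2)."""
    return np.mean(np.linalg.norm(flow_est - flow_gt, axis=-1))

def angular_error(flow_est, flow_gt):
    """Mean angular error in radians, using the standard 3-D extension
    (u, v, 1) so zero-flow pixels remain well-defined."""
    def ext(f):
        ones = np.ones(f.shape[:2] + (1,))
        v = np.concatenate([f, ones], axis=-1)
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    cos = np.clip(np.sum(ext(flow_est) * ext(flow_gt), axis=-1), -1, 1)
    return np.mean(np.arccos(cos))
```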
Fundamental Limits of Budget-Fidelity Trade-off in Label Crowdsourcing
Title | Fundamental Limits of Budget-Fidelity Trade-off in Label Crowdsourcing |
Authors | Farshad Lahouti, Babak Hassibi |
Abstract | Digital crowdsourcing (CS) is a modern approach to performing certain large projects using small contributions from a large crowd. In CS, a taskmaster typically breaks the project down into small batches of tasks and assigns them to so-called workers with imperfect skill levels. The crowdsourcer then collects and analyzes the results for inference, serving the purpose of the project. In this work, the CS problem, as a human-in-the-loop computation problem, is modeled and analyzed in an information-theoretic rate-distortion framework. The purpose is to identify the ultimate fidelity that one can achieve by any form of query from the crowd and any decoding (inference) algorithm, with a given budget. The results are established by a joint source channel (de)coding scheme, which represents the query scheme and inference, over parallel noisy channels, which model workers with imperfect skill levels. We also present and analyze a query scheme dubbed $k$-ary incidence coding and study optimized query pricing in this setting. |
Tasks | |
Published | 2016-08-25 |
URL | http://arxiv.org/abs/1608.07328v1 |
PDF | http://arxiv.org/pdf/1608.07328v1.pdf |
PWC | https://paperswithcode.com/paper/fundamental-limits-of-budget-fidelity-trade |
Repo | |
Framework | |