Paper Group ANR 678
Node Attribute Generation on Graphs. Normalizing Constant Estimation with Gaussianized Bridge Sampling. Unsupervised Feature Selection based on Adaptive Similarity Learning and Subspace Clustering. Learning to Infer Entities, Properties and their Relations from Clinical Conversations. Learning Graphs from Noisy Epidemic Cascades. Extracting Symptom …
Node Attribute Generation on Graphs
Title | Node Attribute Generation on Graphs |
Authors | Xu Chen, Siheng Chen, Huangjie Zheng, Jiangchao Yao, Kenan Cui, Ya Zhang, Ivor W. Tsang |
Abstract | Graph-structured data provide two-fold information: graph structures and node attributes. Numerous graph-based algorithms rely on both sources of information to succeed in supervised tasks, such as node classification and link prediction. However, node attributes can be missing or incomplete, which significantly deteriorates performance. The task of node attribute generation aims to generate attributes for nodes whose attributes are completely unobserved. This task benefits many real-world problems such as profiling, node classification, and graph data augmentation. To tackle it, we propose a deep adversarial learning based method, the node attribute neural generator (NANG). NANG learns a unifying latent representation that is shared by both node attributes and graph structures and can be translated to different modalities. We thus use this latent representation as a bridge to convert information from one modality to the other. We further introduce practical applications to quantify the performance of node attribute generation. Extensive experiments on four real-world datasets show that the node attributes generated by the proposed method are of high quality and beneficial to other applications. The datasets and code are available online. |
Tasks | Data Augmentation, Link Prediction, Node Classification |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09708v1 |
https://arxiv.org/pdf/1907.09708v1.pdf | |
PWC | https://paperswithcode.com/paper/node-attribute-generation-on-graphs |
Repo | |
Framework | |
Normalizing Constant Estimation with Gaussianized Bridge Sampling
Title | Normalizing Constant Estimation with Gaussianized Bridge Sampling |
Authors | He Jia, Uroš Seljak |
Abstract | The normalizing constant (also called the partition function, Bayesian evidence, or marginal likelihood) is one of the central quantities of Bayesian inference, yet most existing methods for estimating it are both expensive and inaccurate. Here we develop a new approach, starting from posterior samples obtained with standard Markov Chain Monte Carlo (MCMC). We apply a novel Normalizing Flow (NF) approach to obtain an analytic density estimator from these samples, followed by Optimal Bridge Sampling (OBS) to obtain the normalizing constant. We compare our method, which we call Gaussianized Bridge Sampling (GBS), to existing methods such as Nested Sampling (NS) and Annealed Importance Sampling (AIS) on several examples, showing that it is both significantly faster and substantially more accurate, and comes with reliable error estimation. |
Tasks | Bayesian Inference |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06073v1 |
https://arxiv.org/pdf/1912.06073v1.pdf | |
PWC | https://paperswithcode.com/paper/normalizing-constant-estimation-with |
Repo | |
Framework | |
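The OBS step above can be illustrated with the classic iterative bridge-sampling estimator of Meng and Wong. The sketch below is a generic illustration, not the authors' GBS code: instead of an NF-fitted density it uses a hand-picked analytic proposal, and the `bridge_sampling` helper and the Gaussian example are assumptions for the demo.

```python
import math
import random

def bridge_sampling(q, p, p_sample, target_samples, n_draws=5000, iters=50):
    """Estimate Z = integral of unnormalized q via iterative optimal bridge sampling.

    q: unnormalized target density; p: normalized proposal density;
    p_sample: draws one sample from p; target_samples: samples from the target
    (in GBS these would come from MCMC, with p fitted by a normalizing flow).
    """
    xs = target_samples                        # samples from the normalized target q/Z
    ys = [p_sample() for _ in range(n_draws)]  # samples from the proposal p
    n1, n2 = len(xs), len(ys)
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    r = 1.0  # initial guess for Z
    for _ in range(iters):
        num = sum(q(y) / (s1 * q(y) + s2 * r * p(y)) for y in ys) / n2
        den = sum(p(x) / (s1 * q(x) + s2 * r * p(x)) for x in xs) / n1
        r = num / den
    return r

if __name__ == "__main__":
    random.seed(0)
    q = lambda x: math.exp(-0.5 * x * x)  # unnormalized N(0,1); true Z = sqrt(2*pi)
    # analytic N(0, 1.5^2) proposal standing in for the NF density estimator
    p = lambda x: math.exp(-x * x / 4.5) / (1.5 * math.sqrt(2 * math.pi))
    xs = [random.gauss(0.0, 1.0) for _ in range(5000)]  # stand-in for MCMC samples
    z_hat = bridge_sampling(q, p, lambda: random.gauss(0.0, 1.5), xs)
    print(round(z_hat, 3))  # close to sqrt(2*pi) ~ 2.507
```

With well-overlapping proposal and target, the iteration converges in a handful of steps; the quality of the NF density fit is what controls the variance in the actual GBS method.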
Unsupervised Feature Selection based on Adaptive Similarity Learning and Subspace Clustering
Title | Unsupervised Feature Selection based on Adaptive Similarity Learning and Subspace Clustering |
Authors | Mohsen Ghassemi Parsa, Hadi Zare, Mehdi Ghatee |
Abstract | Feature selection methods play an important role in the readability of data and in reducing the complexity of learning algorithms. In recent years, a variety of unsupervised feature selection methods have been investigated, owing to the laborious task of labeling large datasets. In this paper, we propose a novel approach to unsupervised feature selection built on subspace clustering, which preserves sample similarities by learning low-dimensional subspace representations. A self-expressive model is employed to implicitly learn the cluster similarities in an adaptive manner. The proposed method not only maintains sample similarities through subspace clustering, but also captures discriminative information via a regularized regression model. Together with a convergence analysis of the proposed method, experimental results on benchmark datasets demonstrate the effectiveness of our approach compared with state-of-the-art methods. |
Tasks | Feature Selection, Representation Learning |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.05458v1 |
https://arxiv.org/pdf/1912.05458v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-feature-selection-based-on |
Repo | |
Framework | |
Learning to Infer Entities, Properties and their Relations from Clinical Conversations
Title | Learning to Infer Entities, Properties and their Relations from Clinical Conversations |
Authors | Nan Du, Mingqiu Wang, Linh Tran, Gang Li, Izhak Shafran |
Abstract | Recently we proposed the Span Attribute Tagging (SAT) model (Du et al., 2019) to infer clinical entities (e.g., symptoms) and their properties (e.g., duration). It tackles the challenge of a large label space and limited training data using a hierarchical two-stage approach that identifies the span of interest in a tagging step and assigns labels to the span in a classification step. We extend the SAT model to jointly infer not only entities and their properties but also the relations between them. Most relation extraction models restrict inference to relations between tokens within a few neighboring sentences, mainly to avoid high computational complexity. In contrast, our proposed Relation-SAT (R-SAT) model is computationally efficient and can infer relations over an entire conversation, spanning an average duration of 10 minutes. We evaluate our model on a corpus of clinical conversations. When the entities are given, R-SAT outperforms baselines in identifying relations between symptoms and their properties by about 32% (0.82 vs 0.62 F-score) and between medications and their properties by about 50% (0.60 vs 0.41 F-score). On the more difficult task of jointly inferring entities and relations, the R-SAT model achieves F-scores of 0.34 and 0.45 for symptoms and medications respectively, significantly better than the baseline model's 0.18 and 0.35. The contributions of different components of the model are quantified using ablation analysis. |
Tasks | Relation Extraction |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11536v1 |
https://arxiv.org/pdf/1908.11536v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-infer-entities-properties-and |
Repo | |
Framework | |
Learning Graphs from Noisy Epidemic Cascades
Title | Learning Graphs from Noisy Epidemic Cascades |
Authors | Jessica Hoffmann, Constantine Caramanis |
Abstract | We consider the problem of learning the weighted edges of a graph by observing the noisy times of infection for multiple epidemic cascades on this graph. Past work has considered this problem when the cascade information, i.e., infection times, is known exactly. Though the noisy setting is well motivated by many epidemic processes (e.g., most human epidemics), to the best of our knowledge, very little is known about when it is solvable. Previous work on the no-noise setting critically uses the ordering information. If noise can reverse this – a node’s reported (noisy) infection time comes after the reported infection time of some node it infected – then we are unable to see how previous results can be extended. We therefore tackle two versions of the noisy setting: the limited-noise setting, where we know noisy times of infections, and the extreme-noise setting, in which we only know whether or not a node was infected. We provide a polynomial time algorithm for recovering the structure of bidirectional trees in the extreme-noise setting, and show our algorithm almost matches lower bounds established in the no-noise setting, and hence is optimal up to log-factors. We extend our results to general degree-bounded graphs, where again we show that our (poly-time) algorithm can recover the structure of the graph with optimal sample complexity. We also provide the first efficient algorithm to learn the weights of the bidirectional tree in the limited-noise setting. Finally, we give a polynomial time algorithm for learning the weights of general bounded-degree graphs in the limited-noise setting. This algorithm extends to general graphs (at the price of exponential running time), proving the problem is solvable in the general case. All our algorithms work for any noise distribution, without any restriction on the variance. |
Tasks | |
Published | 2019-03-06 |
URL | https://arxiv.org/abs/1903.02650v2 |
https://arxiv.org/pdf/1903.02650v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-graphs-from-noisy-epidemic-cascades |
Repo | |
Framework | |
Extracting Symptoms and their Status from Clinical Conversations
Title | Extracting Symptoms and their Status from Clinical Conversations |
Authors | Nan Du, Kai Chen, Anjuli Kannan, Linh Tran, Yuhui Chen, Izhak Shafran |
Abstract | This paper describes novel models tailored for a new application: extracting the symptoms mentioned in clinical conversations along with their status. The lack of any publicly available corpus in this privacy-sensitive domain led us to develop our own corpus, consisting of about 3K conversations annotated by professional medical scribes. We propose two novel deep learning approaches to infer the symptom names and their status: (1) a new hierarchical span-attribute tagging (SAT) model, trained using curriculum learning, and (2) a variant of the sequence-to-sequence model which decodes the symptoms and their status from a few speaker turns within a sliding window over the conversation. This task stems from a realistic application of assisting medical providers in capturing symptoms mentioned by patients in their clinical conversations. To reflect this application, we define multiple metrics. From inter-rater agreement, we find that the task is inherently difficult. We conduct comprehensive evaluations under several contrasting conditions and observe that the performance of the models ranges from an F-score of 0.5 to 0.8 depending on the condition. Our analysis not only reveals the inherent challenges of the task, but also provides useful directions for improving the models. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02239v1 |
https://arxiv.org/pdf/1906.02239v1.pdf | |
PWC | https://paperswithcode.com/paper/extracting-symptoms-and-their-status-from |
Repo | |
Framework | |
Capsule Neural Network based Height Classification using Low-Cost Automotive Ultrasonic Sensors
Title | Capsule Neural Network based Height Classification using Low-Cost Automotive Ultrasonic Sensors |
Authors | Maximilian Pöpperl, Raghavendra Gulagundi, Senthil Yogamani, Stefan Milz |
Abstract | High-performance ultrasonic sensor hardware is mainly used in medical applications. Although automotive development is moving towards autonomous driving, automotive ultrasonic sensor hardware remains low-cost and low-performance. To overcome these strict hardware limitations, we propose to use capsule neural networks. Thanks to the high classification capability of this network architecture, we achieve outstanding results in performing a detailed height analysis of detected objects. We apply a novel resorting and reshaping method to feed the neural network with ultrasonic data, which increases classification performance and computation speed. We tested the approach under different environmental conditions to verify that the proposed method works independently of external parameters, which is required for autonomous driving. |
Tasks | Autonomous Driving |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09839v1 |
http://arxiv.org/pdf/1902.09839v1.pdf | |
PWC | https://paperswithcode.com/paper/capsule-neural-network-based-height |
Repo | |
Framework | |
Relationships between dilemma strength and fixation properties in coevolutionary games
Title | Relationships between dilemma strength and fixation properties in coevolutionary games |
Authors | Hendrik Richter |
Abstract | Whether or not cooperation is favored over defection in evolutionary games can be assigned by structure coefficients for any arrangement of cooperators and defectors on any network modeled as a regular graph. We study how these structure coefficients relate to a scaling of dilemma strength in social dilemma games. It is shown that some graphs permit certain arrangements of cooperators and defectors to possess particularly large structure coefficients. Moreover, these large coefficients imply particularly large sections of a bounded parameter plane spanned by scaling gamble-intending and risk-averting dilemma strength. |
Tasks | |
Published | 2019-01-11 |
URL | https://arxiv.org/abs/1901.03545v2 |
https://arxiv.org/pdf/1901.03545v2.pdf | |
PWC | https://paperswithcode.com/paper/relationships-between-dilemma-strength-and |
Repo | |
Framework | |
Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance
Title | Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance |
Authors | Ulysse Marteau-Ferey, Dmitrii Ostrovskii, Francis Bach, Alessandro Rudi |
Abstract | We consider learning methods based on the regularization of a convex empirical risk by a squared Hilbertian norm, a setting that includes linear predictors and non-linear predictors through positive-definite kernels. In order to go beyond the generic analysis leading to convergence rates of the excess risk as $O(1/\sqrt{n})$ from $n$ observations, we assume that the individual losses are self-concordant, that is, their third-order derivatives are bounded by their second-order derivatives. This setting includes least-squares, as well as all generalized linear models such as logistic and softmax regression. For this class of losses, we provide a bias-variance decomposition and show that the assumptions commonly made in least-squares regression, such as the source and capacity conditions, can be adapted to obtain fast non-asymptotic rates of convergence by improving the bias terms, the variance terms or both. |
Tasks | |
Published | 2019-02-08 |
URL | https://arxiv.org/abs/1902.03046v3 |
https://arxiv.org/pdf/1902.03046v3.pdf | |
PWC | https://paperswithcode.com/paper/beyond-least-squares-fast-rates-for |
Repo | |
Framework | |
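The self-concordance assumption in the abstract can be stated explicitly. As a sketch (the paper's generalized definition may use different constants), a convex loss $\varphi$ is self-concordant-like when its third derivative is bounded by its second:

```latex
\left| \varphi'''(t) \right| \;\le\; R \, \varphi''(t)
\quad \text{for all } t \in \mathbb{R}, \text{ for some } R > 0.
% Logistic regression is the standard example, with R = 1:
%   \varphi(t) = \log\bigl(1 + e^{-t}\bigr), \qquad \sigma(t) = \frac{1}{1 + e^{-t}},
%   \varphi''(t) = \sigma(t)\bigl(1 - \sigma(t)\bigr), \qquad
%   \varphi'''(t) = \varphi''(t)\bigl(1 - 2\sigma(t)\bigr),
% so |\varphi'''(t)| \le \varphi''(t) since |1 - 2\sigma(t)| \le 1.
```

Least-squares satisfies the condition trivially ($\varphi''' \equiv 0$), which is why the class covers both settings discussed in the abstract.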
Differentiable reservoir computing
Title | Differentiable reservoir computing |
Authors | Lyudmila Grigoryeva, Juan-Pablo Ortega |
Abstract | Much effort has been devoted in the last two decades to characterize the situations in which a reservoir computing system exhibits the so-called echo state (ESP) and fading memory (FMP) properties. These important features amount, in mathematical terms, to the existence and continuity of global reservoir system solutions. That research is complemented in this paper with the characterization of the differentiability of reservoir filters for very general classes of discrete-time deterministic inputs. This constitutes a novel strong contribution to the long line of research on the ESP and the FMP and, in particular, links to existing research on the input-dependence of the ESP. Differentiability has been shown in the literature to be a key feature in the learning of attractors of chaotic dynamical systems. A Volterra-type series representation for reservoir filters with semi-infinite discrete-time inputs is constructed in the analytic case using Taylor’s theorem and corresponding approximation bounds are provided. Finally, it is shown as a corollary of these results that any fading memory filter can be uniformly approximated by a finite Volterra series with finite memory. |
Tasks | |
Published | 2019-02-16 |
URL | http://arxiv.org/abs/1902.06094v2 |
http://arxiv.org/pdf/1902.06094v2.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-reservoir-computing |
Repo | |
Framework | |
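The echo state property (ESP) mentioned above can be illustrated numerically. A minimal sketch, not from the paper: the reservoir matrix is hand-chosen with infinity-norm 0.5, so that the state update is a contraction and trajectories started from different initial states merge, i.e., the reservoir forgets its initial condition (fading memory).

```python
import math

# Minimal echo state network: h_{t+1} = tanh(W h_t + w_in * u_t).
# W has infinity-norm 0.5 and tanh is 1-Lipschitz, so the state map is a
# contraction and the echo state property holds.
W = [[0.3, 0.2, 0.0],
     [0.0, 0.3, 0.2],
     [0.2, 0.0, 0.3]]
W_IN = [0.5, -0.3, 0.8]

def step(h, u):
    # one reservoir update for scalar input u
    return [math.tanh(sum(W[i][j] * h[j] for j in range(3)) + W_IN[i] * u)
            for i in range(3)]

def run(h0, inputs):
    # drive the reservoir from initial state h0 with a full input sequence
    h = h0
    for u in inputs:
        h = step(h, u)
    return h

if __name__ == "__main__":
    inputs = [math.sin(0.1 * t) for t in range(60)]
    h_a = run([1.0, -1.0, 1.0], inputs)    # two very different initial states,
    h_b = run([-1.0, 1.0, -1.0], inputs)   # same input sequence
    gap = max(abs(a - b) for a, b in zip(h_a, h_b))
    print(gap)  # essentially zero: the trajectories have merged
```

The gap shrinks by at least a factor of 2 per step here, which is the discrete analogue of the existence-and-uniqueness of the global reservoir solution that the paper studies; differentiability of the resulting filter is the further property the paper establishes.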
Solving Nurse Scheduling Problem Using Constraint Programming Technique
Title | Solving Nurse Scheduling Problem Using Constraint Programming Technique |
Authors | O. M. Alade, A. O. Amusat |
Abstract | Staff scheduling is a universal problem encountered in many organizations, such as call centers, educational institutions, industry, hospitals, and other public services. It is one of the most important aspects of workforce management strategy and the one most prone to errors or issues, as many factors must be considered, such as staff turnover, employee availability, time between rotations, unusual periods of activity, and even last-minute shift changes. The nurse scheduling problem is a variant of staff scheduling that assigns nurses to shifts as well as rooms per day, taking into account both hard constraints, i.e., hospital requirements, and soft constraints, i.e., nurse preferences. Most algorithms used for scheduling problems fall short when it comes to the number of inputs they can handle. In this paper, a constraint programming model was developed to solve the nurse scheduling problem and implemented in the Python programming language. |
Tasks | |
Published | 2019-02-04 |
URL | http://arxiv.org/abs/1902.01193v1 |
http://arxiv.org/pdf/1902.01193v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-nurse-scheduling-problem-using |
Repo | |
Framework | |
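The constraint-programming formulation can be sketched as a small constraint satisfaction problem solved by backtracking. This is a toy illustration, not the paper's model: the nurse names, horizon, and the two hard constraints (at most one shift per nurse per day, and a workload cap) are assumptions for the demo, and a real CP solver would add propagation and the paper's soft constraints.

```python
from itertools import product

NURSES = ["Ada", "Ben", "Cleo"]
DAYS = range(3)
SHIFTS = ["day", "night"]
MAX_SHIFTS = 3  # workload cap per nurse over the horizon

# Variables: one nurse per (day, shift) slot.
SLOTS = list(product(DAYS, SHIFTS))

def consistent(assignment, slot, nurse):
    day, _ = slot
    # hard constraint: at most one shift per day for each nurse
    if any(n == nurse and d == day for (d, s), n in assignment.items()):
        return False
    # hard constraint: workload cap across the horizon
    if sum(1 for n in assignment.values() if n == nurse) >= MAX_SHIFTS:
        return False
    return True

def solve(assignment=None, i=0):
    """Depth-first backtracking search over slot assignments."""
    if assignment is None:
        assignment = {}
    if i == len(SLOTS):
        return dict(assignment)
    slot = SLOTS[i]
    for nurse in NURSES:
        if consistent(assignment, slot, nurse):
            assignment[slot] = nurse
            result = solve(assignment, i + 1)
            if result is not None:
                return result
            del assignment[slot]  # undo and try the next nurse
    return None  # no feasible schedule

if __name__ == "__main__":
    schedule = solve()
    for (day, shift), nurse in sorted(schedule.items()):
        print(f"day {day} {shift}: {nurse}")
```

A dedicated CP solver scales far better than this naive search, which is the point the abstract makes about handling larger numbers of inputs.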
Voting with Random Classifiers (VORACE)
Title | Voting with Random Classifiers (VORACE) |
Authors | Cristina Cornelio, Michele Donini, Andrea Loreggia, Maria Silvia Pini, Francesca Rossi |
Abstract | In many machine learning scenarios, looking for the best classifier that fits a particular dataset can be very costly in terms of time and resources. Moreover, it can require deep knowledge of the specific domain. We propose a new technique which does not require profound expertise in the domain and avoids the commonly used strategy of hyper-parameter tuning and model selection. Our method is an innovative ensemble technique that uses voting rules over a set of randomly-generated classifiers. Given a new input sample, we interpret the output of each classifier as a ranking over the set of possible classes. We then aggregate these output rankings using a voting rule, which treats them as preferences over the classes. We show that our approach obtains good results compared to the state of the art, providing both a theoretical analysis and an empirical evaluation of the approach on several datasets. |
Tasks | Model Selection |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08996v2 |
https://arxiv.org/pdf/1909.08996v2.pdf | |
PWC | https://paperswithcode.com/paper/voting-with-random-classifiers-vorace |
Repo | |
Framework | |
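The aggregation step can be sketched with the Borda rule, one common positional voting rule; the abstract speaks of voting rules over rankings in general, so the choice of Borda here and the class names are illustrative assumptions.

```python
from collections import defaultdict

def borda_winner(rankings):
    """Aggregate classifier output rankings with the Borda voting rule.

    Each ranking lists the classes from most to least preferred; a class in
    position i of an m-class ranking scores m - 1 - i points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for i, cls in enumerate(ranking):
            scores[cls] += m - 1 - i
    # deterministic tie-breaking: alphabetically first among top scorers
    return max(sorted(scores), key=lambda c: scores[c])

if __name__ == "__main__":
    # Hypothetical per-classifier rankings over three classes.
    votes = [["cat", "dog", "bird"],
             ["cat", "bird", "dog"],
             ["dog", "cat", "bird"]]
    print(borda_winner(votes))  # "cat": 2+2+1 = 5 points, beats "dog" (3) and "bird" (1)
```

Swapping in plurality (score only the top class) or another rule only changes the scoring loop, which is what lets the ensemble treat the voting rule as a design parameter.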
GLEE: Geometric Laplacian Eigenmap Embedding
Title | GLEE: Geometric Laplacian Eigenmap Embedding |
Authors | Leo Torres, Kevin S Chan, Tina Eliassi-Rad |
Abstract | Graph embedding seeks to build a low-dimensional representation of a graph G. This low-dimensional representation is then used for various downstream tasks. One popular approach is Laplacian Eigenmaps, which constructs a graph embedding based on the spectral properties of the Laplacian matrix of G. The intuition behind it, and many other embedding techniques, is that the embedding of a graph must respect node similarity: similar nodes must have embeddings that are close to one another. Here, we dispose of this distance-minimization assumption. Instead, we use the Laplacian matrix to find an embedding with geometric properties instead of spectral ones, by leveraging the so-called simplex geometry of G. We introduce a new approach, Geometric Laplacian Eigenmap Embedding (or GLEE for short), and demonstrate that it outperforms various other techniques (including Laplacian Eigenmaps) in the tasks of graph reconstruction and link prediction. |
Tasks | Graph Embedding, Link Prediction |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09763v2 |
https://arxiv.org/pdf/1905.09763v2.pdf | |
PWC | https://paperswithcode.com/paper/geometric-laplacian-eigenmap-embedding |
Repo | |
Framework | |
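The geometric construction hinted at in the abstract can be summarized as a factorization of the Laplacian rather than a distance minimization. A sketch, assuming the embedding is taken from the eigendecomposition of $L$:

```latex
% L is positive semidefinite, so it admits an eigendecomposition
L = Q \Lambda Q^{\top}, \qquad
S_d = Q_d \Lambda_d^{1/2} \in \mathbb{R}^{n \times d}, \qquad
S_d S_d^{\top} \approx L,
% where (Q_d, \Lambda_d) keep the d largest eigenpairs and each row of S_d
% is a node embedding. Classical Laplacian Eigenmaps instead keep the
% eigenvectors of the smallest nonzero eigenvalues, which is the
% distance-minimization view the abstract dispenses with.
```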
Automatically Extracting Challenge Sets for Non-local Phenomena in Neural Machine Translation
Title | Automatically Extracting Challenge Sets for Non-local Phenomena in Neural Machine Translation |
Authors | Leshem Choshen, Omri Abend |
Abstract | We show that the state-of-the-art Transformer Machine Translation (MT) model is not biased towards monotonic reordering (unlike previous recurrent neural network models), but that nevertheless, long-distance dependencies remain a challenge for the model. Since most dependencies are short-distance, common evaluation metrics will be little influenced by how well systems perform on them. We therefore propose an automatic approach for extracting challenge sets replete with long-distance dependencies, and argue that evaluation using this methodology provides a complementary perspective on system performance. To support our claim, we compile challenge sets for English-German and German-English, which are much larger than any previously released challenge set for MT. The extracted sets are large enough to allow reliable automatic evaluation, which makes the proposed approach a scalable and practical solution for evaluating MT performance on the long tail of syntactic phenomena. |
Tasks | Machine Translation |
Published | 2019-09-15 |
URL | https://arxiv.org/abs/1909.06814v4 |
https://arxiv.org/pdf/1909.06814v4.pdf | |
PWC | https://paperswithcode.com/paper/automatically-extracting-challenge-sets-for |
Repo | |
Framework | |
Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle
Title | Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle |
Authors | Maximin Coavoux, Shay B. Cohen |
Abstract | We introduce a novel transition system for discontinuous constituency parsing. Instead of storing subtrees in a stack –i.e. a data structure with linear-time sequential access– the proposed system uses a set of parsing items, with constant-time random access. This change makes it possible to construct any discontinuous constituency tree in exactly $4n - 2$ transitions for a sentence of length $n$. At each parsing step, the parser considers every item in the set to be combined with a focus item and to construct a new constituent in a bottom-up fashion. The parsing strategy is based on the assumption that most syntactic structures can be parsed incrementally and that the set –the memory of the parser– remains reasonably small on average. Moreover, we introduce a provably correct dynamic oracle for the new transition system, and present the first experiments in discontinuous constituency parsing using a dynamic oracle. Our parser obtains state-of-the-art results on three English and German discontinuous treebanks. |
Tasks | Constituency Parsing |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00615v1 |
http://arxiv.org/pdf/1904.00615v1.pdf | |
PWC | https://paperswithcode.com/paper/discontinuous-constituency-parsing-with-a |
Repo | |
Framework | |