January 29, 2020

Paper Group ANR 644

Maximum Likelihood Estimation for Learning Populations of Parameters

Title Maximum Likelihood Estimation for Learning Populations of Parameters
Authors Ramya Korlakai Vinayak, Weihao Kong, Gregory Valiant, Sham M. Kakade
Abstract Consider a setting with $N$ independent individuals, each with an unknown parameter, $p_i \in [0, 1]$ drawn from some unknown distribution $P^\star$. After observing the outcomes of $t$ independent Bernoulli trials, i.e., $X_i \sim \text{Binomial}(t, p_i)$ per individual, our objective is to accurately estimate $P^\star$. This problem arises in numerous domains, including the social sciences, psychology, health care, and biology, where the size of the population under study is usually large while the number of observations per individual is often limited. Our main result shows that, in the regime where $t \ll N$, the maximum likelihood estimator (MLE) is both statistically minimax optimal and efficiently computable. Precisely, for sufficiently large $N$, the MLE achieves the information-theoretic optimal error bound of $\mathcal{O}(\frac{1}{t})$ for $t < c\log{N}$, with respect to the earth mover’s distance (between the estimated and true distributions). More generally, in an exponentially large interval of $t$ beyond $c \log{N}$, the MLE achieves the minimax error bound of $\mathcal{O}(\frac{1}{\sqrt{t\log N}})$. In contrast, regardless of how large $N$ is, the naive “plug-in” estimator for this problem only achieves the sub-optimal error of $\Theta(\frac{1}{\sqrt{t}})$.
Tasks
Published 2019-02-12
URL http://arxiv.org/abs/1902.04553v1
PDF http://arxiv.org/pdf/1902.04553v1.pdf
PWC https://paperswithcode.com/paper/maximum-likelihood-estimation-for-learning
Repo
Framework
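
The estimation problem in the abstract above can be illustrated with a small sketch. The abstract does not say how the MLE is computed, so the approach here is an assumption: restrict $P^\star$ to a fixed grid on $[0, 1]$ and run EM on the grid weights of the resulting binomial mixture. The function name `mle_population`, the grid size, and the iteration count are all hypothetical choices.

```python
import math


def binomial_pmf(x, t, q):
    """P(X = x) for X ~ Binomial(t, q)."""
    return math.comb(t, x) * q ** x * (1 - q) ** (t - x)


def mle_population(counts, t, grid_size=51, iters=500):
    """Approximate the MLE of the parameter distribution on a fixed grid.

    counts: observed X_i, one per individual, each in 0..t.
    Returns (grid, weights), where weights[j] estimates the mass near grid[j].
    Assumption: EM over grid weights; not necessarily the paper's solver.
    """
    grid = [j / (grid_size - 1) for j in range(grid_size)]
    n = len(counts)

    # The likelihood depends on the data only through the histogram of counts.
    hist = [0] * (t + 1)
    for x in counts:
        hist[x] += 1

    like = [[binomial_pmf(x, t, q) for q in grid] for x in range(t + 1)]
    weights = [1.0 / grid_size] * grid_size

    for _ in range(iters):
        new_weights = [0.0] * grid_size
        for x in range(t + 1):
            if hist[x] == 0:
                continue
            # Marginal probability of observing x under the current weights.
            mix = sum(w * l for w, l in zip(weights, like[x]))
            for j in range(grid_size):
                # Posterior responsibility of grid point j, summed over
                # all individuals who produced the count x.
                new_weights[j] += hist[x] * weights[j] * like[x][j] / mix
        weights = [v / n for v in new_weights]

    return grid, weights
```

Calling `mle_population([0] * 50 + [5] * 50, t=5)` should place nearly all of the estimated mass on the two endpoints of the grid, since a population split between $p = 0$ and $p = 1$ explains such data best.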

UDS–DFKI Submission to the WMT2019 Similar Language Translation Shared Task

Title UDS–DFKI Submission to the WMT2019 Similar Language Translation Shared Task
Authors Santanu Pal, Marcos Zampieri, Josef van Genabith
Abstract In this paper we present the UDS-DFKI system submitted to the Similar Language Translation shared task at WMT 2019. The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish. Participants could choose to participate in any of these three tracks and submit system outputs in any translation direction. We report the results obtained by our system in translating from Czech to Polish and comment on the impact of out-of-domain test data on the performance of our system. UDS-DFKI achieved competitive performance, ranking second among ten teams in Czech to Polish translation.
Tasks
Published 2019-08-16
URL https://arxiv.org/abs/1908.06138v1
PDF https://arxiv.org/pdf/1908.06138v1.pdf
PWC https://paperswithcode.com/paper/uds-dfki-submission-to-the-wmt2019-similar
Repo
Framework

Graph Matching Networks for Learning the Similarity of Graph Structured Objects

Title Graph Matching Networks for Learning the Similarity of Graph Structured Objects
Authors Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, Pushmeet Kohli
Abstract This paper addresses the challenging problem of retrieval and matching of graph structured objects, and makes two key contributions. First, we demonstrate how Graph Neural Networks (GNN), which have emerged as an effective model for various supervised prediction problems defined on structured data, can be trained to produce embeddings of graphs in vector spaces that enable efficient similarity reasoning. Second, we propose a novel Graph Matching Network model that, given a pair of graphs as input, computes a similarity score between them by jointly reasoning on the pair through a new cross-graph attention-based matching mechanism. We demonstrate the effectiveness of our models on different domains, including the challenging problem of control-flow-graph based function similarity search, which plays an important role in the detection of vulnerabilities in software systems. The experimental analysis demonstrates that our models are not only able to exploit structure in the context of similarity learning but can also outperform domain-specific baseline systems that have been carefully hand-engineered for these problems.
Tasks Graph Matching
Published 2019-04-29
URL https://arxiv.org/abs/1904.12787v2
PDF https://arxiv.org/pdf/1904.12787v2.pdf
PWC https://paperswithcode.com/paper/graph-matching-networks-for-learning-the-1
Repo
Framework
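
The cross-graph attention idea mentioned in the abstract above can be sketched in a few lines. The abstract gives no formula, so everything here is an assumption made for illustration: node states are plain lists of floats, similarity is a dot product, and the matching signal for a node is the difference between its own state and the attention-weighted average of the other graph's nodes. The function name `cross_graph_attention` is hypothetical.

```python
import math


def cross_graph_attention(nodes_a, nodes_b):
    """For every node in graph A, attend over all nodes of graph B.

    nodes_a, nodes_b: lists of equal-length float vectors (node states).
    Returns one matching vector per node of A: the node's own state minus
    the softmax-weighted average of B's node states.
    Assumption: dot-product similarity; the paper may use another score.
    """
    matches = []
    for h in nodes_a:
        scores = [sum(x * y for x, y in zip(h, g)) for g in nodes_b]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]
        attended = [
            sum(a * g[d] for a, g in zip(attn, nodes_b))
            for d in range(len(h))
        ]
        matches.append([hd - ad for hd, ad in zip(h, attended)])
    return matches
```

With this formulation, a node that has an exact counterpart dominating the attention gets a matching vector close to zero, so large matching vectors flag parts of one graph that the other graph cannot explain.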

Unsupervised Multi-Document Opinion Summarization as Copycat-Review Generation

Title Unsupervised Multi-Document Opinion Summarization as Copycat-Review Generation
Authors Arthur Bražinskas, Mirella Lapata, Ivan Titov
Abstract Summarization of opinions is the process of automatically creating text summaries that reflect subjective information expressed in input documents, such as product reviews. While most previous research in opinion summarization has focused on the extractive setting, i.e. selecting fragments of the input documents to produce a summary, we let the model generate novel sentences and hence produce fluent text. Supervised abstractive summarization methods typically rely on large quantities of document-summary pairs which are expensive to acquire. In contrast, we consider the unsupervised setting, in other words, we do not use any summaries in training. We define a generative model for a multi-product review collection. Intuitively, we want to design such a model that, when generating a new review given a set of other reviews of the product, we can control the ‘amount of novelty’ going into the new review or, equivalently, vary the degree of deviation from the input reviews. At test time, when generating summaries, we force the novelty to be minimal, and produce a text reflecting consensus opinions. We capture this intuition by defining a hierarchical variational autoencoder model. Both individual reviews and products they correspond to are associated with stochastic latent codes, and the review generator (‘decoder’) has direct access to the text of input reviews through the pointer-generator mechanism. In experiments on Amazon and Yelp data, we show that in this model by setting at test time the review’s latent code to its mean, we produce fluent and coherent summaries.
Tasks Abstractive Text Summarization
Published 2019-11-06
URL https://arxiv.org/abs/1911.02247v1
PDF https://arxiv.org/pdf/1911.02247v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-multi-document-opinion
Repo
Framework

Learning Key-Value Store Design

Title Learning Key-Value Store Design
Authors Stratos Idreos, Niv Dayan, Wilson Qin, Mali Akmanalp, Sophie Hilgard, Andrew Ross, James Lennon, Varun Jain, Harshita Gupta, David Li, Zichen Zhu
Abstract We introduce the concept of design continuums for the data layout of key-value stores. A design continuum unifies major distinct data structure designs under the same model. The critical insight and potential long-term impact is that such unifying models 1) let what we have so far considered fundamentally different data structures be seen as views of the very same overall design space, and 2) allow seeing new data structure designs with performance properties that are not feasible by existing designs. The core intuition behind the construction of design continuums is that all data structures arise from the very same set of fundamental design principles, i.e., a small set of data layout design concepts out of which we can synthesize any design that exists in the literature as well as new ones. We show how to construct, evaluate, and expand design continuums, and we also present the first continuum that unifies major data structure designs, i.e., B+tree, B-epsilon-tree, LSM-tree, and LSH-table. The practical benefit of a design continuum is that it creates a fast inference engine for the design of data structures. For example, we can predict near instantly how a specific design change in the underlying storage of a data system would affect performance, or conversely what would be the optimal data structure (from a given set of designs) given workload characteristics and a memory budget. In turn, these properties allow us to envision a new class of self-designing key-value stores with a substantially improved ability to adapt to workload and hardware changes by transitioning between drastically different data structure designs to assume a diverse set of performance properties at will.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05443v1
PDF https://arxiv.org/pdf/1907.05443v1.pdf
PWC https://paperswithcode.com/paper/learning-key-value-store-design
Repo
Framework

Online Explanation Generation for Human-Robot Teaming

Title Online Explanation Generation for Human-Robot Teaming
Authors Mehrdad Zakershahrak, Ze Gong, Nikhillesh Sadassivam, Yu Zhang
Abstract As AI becomes an integral part of our lives, the development of explainable AI, embodied in the decision-making process of an AI or robotic agent, becomes imperative. For a robotic teammate, the ability to generate explanations to explain its behavior is one of the key requirements of explainable agency. Prior work on explanation generation focuses on supporting the rationale behind the robot’s decision (or behavior). These approaches, however, fail to consider the mental workload needed to understand the received explanation. In other words, the human teammate is expected to understand any explanation provided no matter how much information is presented. In this work, we argue that explanations, especially ones of a complex nature, should be made in an online fashion during the execution, which helps spread out the information to be explained and thus reduce the mental workload of humans in highly demanding tasks. However, a challenge here is that the different parts of an explanation may be dependent on each other, which must be taken into account when generating online explanations. To this end, a general formulation of online explanation generation is presented with three variations satisfying different properties. The new explanation generation methods are based on a model reconciliation setting introduced in our prior work. We evaluate our methods both with human subjects in a standard planning competition (IPC) domain, using NASA Task Load Index (TLX), as well as in simulation with ten different problems across two IPC domains. Results strongly suggest that our methods not only generate explanations that are perceived as less cognitively demanding and much preferred over the baselines but also are computationally efficient.
Tasks Decision Making
Published 2019-03-15
URL https://arxiv.org/abs/1903.06418v5
PDF https://arxiv.org/pdf/1903.06418v5.pdf
PWC https://paperswithcode.com/paper/online-explanation-generation-for-human-robot
Repo
Framework

Approaching Machine Learning Fairness through Adversarial Network

Title Approaching Machine Learning Fairness through Adversarial Network
Authors Xiaoqian Wang, Heng Huang
Abstract Fairness is a rising concern with respect to machine learning model performance. Especially in sensitive fields such as criminal justice and loan decisions, eliminating prediction discrimination against a certain population group (characterized by sensitive features such as race and gender) is important for enhancing the trustworthiness of a model. In this paper, we present a new general framework to improve machine learning fairness. The goal of our model is to minimize the influence of the sensitive feature from the perspectives of both the data input and the predictive model. To achieve this goal, we reformulate the data input by removing the sensitive information and strengthen model fairness by minimizing the marginal contribution of the sensitive feature. We propose to learn the non-sensitive input via sampling among features and design an adversarial network to minimize the dependence between the reformulated input and the sensitive information. Extensive experiments on three benchmark datasets suggest that our model achieves better results than related state-of-the-art methods with respect to both fairness metrics and prediction performance.
Tasks
Published 2019-09-06
URL https://arxiv.org/abs/1909.03013v1
PDF https://arxiv.org/pdf/1909.03013v1.pdf
PWC https://paperswithcode.com/paper/approaching-machine-learning-fairness-through
Repo
Framework

Representation Learning for Discovering Phonemic Tone Contours

Title Representation Learning for Discovering Phonemic Tone Contours
Authors Bai Li, Jing Yi Xie, Frank Rudzicz
Abstract Tone is a prosodic feature used to distinguish words in many languages, some of which are endangered and scarcely documented. In this work, we use unsupervised representation learning to identify probable clusters of syllables that share the same phonemic tone. Our method extracts the pitch for each syllable, then trains a convolutional autoencoder to learn a low dimensional representation for each contour. We then apply the mean shift algorithm to cluster tones in high-density regions of the latent space. Furthermore, by feeding the centers of each cluster into the decoder, we produce a prototypical contour that represents each cluster. We apply this method to spoken multi-syllable words in Mandarin Chinese and Cantonese and evaluate how closely our clusters match the ground truth tone categories. Finally, we discuss some difficulties with our approach, including contextual tone variation and allophony effects.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2019-10-20
URL https://arxiv.org/abs/1910.08987v1
PDF https://arxiv.org/pdf/1910.08987v1.pdf
PWC https://paperswithcode.com/paper/representation-learning-for-discovering
Repo
Framework
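
The clustering step described in the abstract above (mean shift over points in a latent space) can be sketched as follows. This is a minimal flat-kernel version written for illustration, not the authors' implementation: the bandwidth, the iteration count, the merge radius of half the bandwidth, and the function name `mean_shift` are all assumptions, and the input points stand in for the autoencoder's latent codes.

```python
def mean_shift(points, bandwidth, iters=50):
    """Flat-kernel mean shift over a list of equal-length float tuples.

    Each point is moved repeatedly to the mean of the data points within
    `bandwidth` of it; points whose final positions coincide share a label.
    Returns (labels, centers).
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    radius2 = bandwidth ** 2
    modes = [tuple(p) for p in points]
    for _ in range(iters):
        shifted = []
        for m in modes:
            neigh = [p for p in points if dist2(p, m) <= radius2]
            if not neigh:  # defensive guard; keep the mode where it is
                shifted.append(m)
                continue
            shifted.append(tuple(sum(c) / len(neigh) for c in zip(*neigh)))
        modes = shifted

    # Merge modes that ended up close together (hypothetical merge radius).
    centers, labels = [], []
    merge2 = (bandwidth / 2) ** 2
    for m in modes:
        for k, c in enumerate(centers):
            if dist2(m, c) <= merge2:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers
```

The returned centers play the role of the cluster centers that the abstract feeds into the decoder to obtain a prototypical contour per cluster. Unlike k-means, mean shift does not need the number of clusters in advance, which matters when the number of tones is unknown.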

Unsupervised Representation for EHR Signals and Codes as Patient Status Vector

Title Unsupervised Representation for EHR Signals and Codes as Patient Status Vector
Authors Sajad Darabi, Mohammad Kachuee, Majid Sarrafzadeh
Abstract Effective modeling of electronic health records presents many challenges because the records are highly irregular, mostly owing to the varying procedures and diagnoses a patient may have. Despite the recent progress in machine learning, unsupervised learning remains largely an open problem, especially in the healthcare domain. In this work, we present a two-step unsupervised representation learning scheme to summarize the multi-modal clinical time series consisting of signals and medical codes into a patient status vector. First, an auto-encoder step is used to reduce sparse medical codes and clinical time series into a distributed representation. Subsequently, the concatenation of the distributed representations is further fine-tuned using a forecasting task. We evaluate the usefulness of the representation on two downstream tasks: mortality and readmission. Our proposed method shows improved generalization performance for both short duration ICU visits and long duration ICU visits.
Tasks Representation Learning, Time Series, Unsupervised Representation Learning
Published 2019-10-04
URL https://arxiv.org/abs/1910.01803v1
PDF https://arxiv.org/pdf/1910.01803v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-representation-for-ehr-signals
Repo
Framework

Learning Structural Graph Layouts and 3D Shapes for Long Span Bridges 3D Reconstruction

Title Learning Structural Graph Layouts and 3D Shapes for Long Span Bridges 3D Reconstruction
Authors Fangqiao Hu, Jin Zhao, Yong Huang, Hui Li
Abstract A learning-based 3D reconstruction method for long-span bridges is proposed in this paper. 3D reconstruction generates a 3D computer model of a real object or scene from images; it involves many stages and open problems. Existing point-based methods focus on generating 3D point clouds and then reconstructing polygonal meshes or fitting-based geometrical models from them, and they have achieved strong results for urban scenes and civil structures under Manhattan-world constraints. Difficulties arise when these systems are transferred to structures with complex topology and part relations, such as steel trusses and long-span bridges. The point clouds are often unevenly distributed, noisy, occluded, and incomplete, so recovering a satisfactory 3D model from these highly unstructured point clouds in a bottom-up pattern while preserving the geometrical and topological properties poses an enormous challenge to existing algorithms. Drawing on the prior human knowledge that the components of these structures conform to regular spatial layouts, we propose a learning-based, topology-aware 3D reconstruction method that obtains high-level structural graph layouts and low-level 3D shapes from images. We demonstrate the feasibility of this method by testing on two real long-span steel truss cable-stayed bridges.
Tasks 3D Reconstruction, Generating 3D Point Clouds
Published 2019-07-08
URL https://arxiv.org/abs/1907.03387v1
PDF https://arxiv.org/pdf/1907.03387v1.pdf
PWC https://paperswithcode.com/paper/learning-structural-graph-layouts-and-3d
Repo
Framework

Distributed interference cancellation in multi-agent scenarios

Title Distributed interference cancellation in multi-agent scenarios
Authors Mahdi Shamsi, Alireza Moslemi Haghighi, Farokh Marvasti
Abstract This paper considers the problem of detecting impaired and noisy nodes over a network. In a distributed algorithm, many processing units cooperate and communicate with each other to reach a global goal. Depending on its state in the shared environment, each unit can help the other nodes or mislead them (because of noise or a deliberate attempt). Previous works mainly focused on properly locating agents and assigning weights based on the initial environment state to minimize the malfunctioning caused by noisy nodes. We propose an algorithm that adapts the sharing weights according to the behavior of the agents. Applying the introduced algorithm to a multi-agent RL scenario and to the well-known diffusion LMS demonstrates its capability and generality.
Tasks
Published 2019-10-22
URL https://arxiv.org/abs/1910.10109v1
PDF https://arxiv.org/pdf/1910.10109v1.pdf
PWC https://paperswithcode.com/paper/distributed-interference-cancellation-in
Repo
Framework

Expression, Affect, Action Unit Recognition: Aff-Wild2, Multi-Task Learning and ArcFace

Title Expression, Affect, Action Unit Recognition: Aff-Wild2, Multi-Task Learning and ArcFace
Authors Dimitrios Kollias, Stefanos Zafeiriou
Abstract Affective computing has been largely limited in terms of available data resources. The need to collect and annotate diverse in-the-wild datasets has become apparent with the rise of deep learning models, as the default approach to address any computer vision task. Some in-the-wild databases have been recently proposed. However: i) their size is small, ii) they are not audiovisual, iii) only a small part is manually annotated, iv) they contain a small number of subjects, or v) they are not annotated for all main behavior tasks (valence-arousal estimation, action unit detection and basic expression classification). To address these, we substantially extend the largest available in-the-wild database (Aff-Wild) to study continuous emotions such as valence and arousal. Furthermore, we annotate parts of the database with basic expressions and action units. As a consequence, for the first time, this allows the joint study of all three types of behavior states. We call this database Aff-Wild2. We conduct extensive experiments with CNN and CNN-RNN architectures that use visual and audio modalities; these networks are trained on Aff-Wild2 and their performance is then evaluated on 10 publicly available emotion databases. We show that the networks achieve state-of-the-art performance for the emotion recognition tasks. Additionally, we adapt the ArcFace loss function in the emotion recognition context and use it for training two new networks on Aff-Wild2 and then re-train them in a variety of diverse expression recognition databases. The networks are shown to improve the existing state-of-the-art. The database, emotion recognition models and source code are available at http://ibug.doc.ic.ac.uk/resources/aff-wild2.
Tasks Action Unit Detection, Emotion Recognition, Multi-Task Learning
Published 2019-09-25
URL https://arxiv.org/abs/1910.04855v1
PDF https://arxiv.org/pdf/1910.04855v1.pdf
PWC https://paperswithcode.com/paper/expression-affect-action-unit-recognition-aff
Repo
Framework

Partially Detected Intelligent Traffic Signal Control: Environmental Adaptation

Title Partially Detected Intelligent Traffic Signal Control: Environmental Adaptation
Authors Rusheng Zhang, Romain Leteurtre, Benjamin Striner, Ammar Alanazi, Abdullah Alghafis, Ozan K. Tonguz
Abstract Partially Detected Intelligent Traffic Signal Control (PD-ITSC) systems that can optimize traffic signals based on limited detected information could be a cost-efficient solution for mitigating traffic congestion in the future. In this paper, we focus on a particular problem in PD-ITSC - adaptation to changing environments. To this end, we investigate different reinforcement learning algorithms, including Q-learning, Proximal Policy Optimization (PPO), Advantage Actor-Critic (A2C), and Actor-Critic with Kronecker-Factored Trust Region (ACKTR). Our findings suggest that RL algorithms can find optimal strategies under partial vehicle detection; however, policy-based algorithms can adapt to changing environments more efficiently than value-based algorithms. We use these findings to draw conclusions about the value of different models for PD-ITSC systems.
Tasks Q-Learning
Published 2019-10-23
URL https://arxiv.org/abs/1910.10808v1
PDF https://arxiv.org/pdf/1910.10808v1.pdf
PWC https://paperswithcode.com/paper/partially-detected-intelligent-traffic-signal
Repo
Framework
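
Q-learning is one of the algorithms named in the abstract above. Its standard tabular update can be sketched as below; the state and action names, the reward, and the learning-rate and discount values are hypothetical and say nothing about the authors' actual state encoding or reward design.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q: dict mapping state -> dict mapping action -> estimated return.
    The target is the reward plus the discounted best value of next_state.
    """
    best_next = max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


# Hypothetical two-state example: a signal that either keeps or switches
# its phase, with a negative reward standing in for vehicle delay.
q_table = {
    "queue_short": {"keep": 0.0, "switch": 0.0},
    "queue_long": {"keep": 1.0, "switch": 0.0},
}
new_value = q_update(q_table, "queue_short", "keep", reward=-1.0,
                     next_state="queue_long", alpha=0.5, gamma=0.9)
```

Because the update bootstraps from a stored value table, a value-based learner of this kind has to re-estimate many entries when the environment changes, which gives some intuition for the abstract's comparison with policy-based methods.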

Semi-supervised Learning for Word Sense Disambiguation

Title Semi-supervised Learning for Word Sense Disambiguation
Authors Darío Garigliotti
Abstract This work is a study of the impact of multiple aspects in a classic unsupervised word sense disambiguation algorithm. We identify relevant factors in a decision rule algorithm, including the initial labeling of examples, the formalization of the rule confidence, and the criteria for accepting a decision rule. Some of these factors are only implicitly considered in the original literature. We then propose a lightly supervised version of the algorithm, and employ a pseudo-word-based strategy to evaluate the impact of these factors. The obtained performances are comparable with those of highly optimized formulations of the word sense disambiguation method.
Tasks Word Sense Disambiguation
Published 2019-08-26
URL https://arxiv.org/abs/1908.09641v1
PDF https://arxiv.org/pdf/1908.09641v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-for-word-sense
Repo
Framework

Semantic Hierarchy Preserving Deep Hashing for Large-scale Image Retrieval

Title Semantic Hierarchy Preserving Deep Hashing for Large-scale Image Retrieval
Authors Xuefei Zhe, Le Ou-Yang, Shifeng Chen, Hong Yan
Abstract Convolutional neural networks have been widely used in content-based image retrieval. To better deal with large-scale data, the deep hashing model is proposed as an effective method, which maps an image to a binary code that can be used for hashing search. However, most existing deep hashing models only utilize fine-level semantic labels or convert them to similar/dissimilar labels for training. The natural semantic hierarchy structures are ignored in the training stage of the deep hashing model. In this paper, we present an effective algorithm to train a deep hashing model that can preserve a semantic hierarchy structure for large-scale image retrieval. Experiments on two datasets show that our method improves the fine-level retrieval performance. Meanwhile, our model achieves state-of-the-art results in terms of hierarchical retrieval.
Tasks Content-Based Image Retrieval, Image Retrieval
Published 2019-01-31
URL http://arxiv.org/abs/1901.11259v2
PDF http://arxiv.org/pdf/1901.11259v2.pdf
PWC https://paperswithcode.com/paper/semantic-hierarchy-preserving-deep-hashing
Repo
Framework
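
The "hashing search" mentioned in the abstract above usually means ranking database items by the Hamming distance between binary codes. A minimal sketch of that retrieval step follows, assuming the codes have already been produced by some hashing model; storing each code as a Python integer and the names `hamming` and `rank_by_hamming` are choices made for this illustration only.

```python
def hamming(code_a, code_b):
    """Number of bit positions in which two integer-encoded codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database):
    """Return database keys ordered from nearest to farthest code.

    database: dict mapping an item name to its integer-encoded binary code.
    Ties keep the dictionary's insertion order because sorted() is stable.
    """
    return sorted(database, key=lambda name: hamming(query_code, database[name]))


# Hypothetical 4-bit codes for three images.
codes = {"cat": 0b1010, "dog": 0b1000, "car": 0b0101}
ranking = rank_by_hamming(0b1011, codes)
```

A hierarchy-preserving model of the kind the abstract describes would aim for codes where images from the same coarse category (here, the two animals) end up closer in Hamming distance than images from unrelated categories.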