Paper Group ANR 319
Finding dissimilar explanations in Bayesian networks: Complexity results
Title | Finding dissimilar explanations in Bayesian networks: Complexity results |
Authors | Johan Kwisthout |
Abstract | Finding the most probable explanation for observed variables in a Bayesian network is a notoriously intractable problem, particularly if there are hidden variables in the network. In this paper we examine the complexity of a related problem, that is, the problem of finding a set of sufficiently dissimilar, yet all plausible, explanations. Applications of this problem arise, for example, in search query results (one does not want ten results that all link to the same website) and in decision support systems. We show that the problem of finding a ‘good enough’ explanation that differs in structure from the best explanation is at least as hard as finding the best explanation itself. |
Tasks | |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11391v2 |
http://arxiv.org/pdf/1810.11391v2.pdf | |
PWC | https://paperswithcode.com/paper/finding-dissimilar-explanations-in-bayesian |
Repo | |
Framework | |
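
The paper contrasts two computational problems: finding the most probable explanation (MPE) and finding a 'good enough' explanation that is structurally dissimilar from it. A minimal brute-force sketch over a toy joint distribution, not the paper's complexity construction; the probabilities and the Hamming-distance notion of dissimilarity below are illustrative assumptions:

```python
from itertools import product

# Toy joint distribution over three binary hidden variables (H1, H2, H3),
# given some fixed evidence; the probabilities are made up for illustration.
joint = {
    (0, 0, 0): 0.05, (0, 0, 1): 0.10, (0, 1, 0): 0.05, (0, 1, 1): 0.30,
    (1, 0, 0): 0.20, (1, 0, 1): 0.05, (1, 1, 0): 0.20, (1, 1, 1): 0.05,
}

def hamming(a, b):
    # Number of variables on which two explanations disagree.
    return sum(x != y for x, y in zip(a, b))

# Most probable explanation (MPE).
mpe = max(joint, key=joint.get)

# Best explanation that differs from the MPE in at least d variable assignments.
d = 2
dissimilar = max((e for e in joint if hamming(e, mpe) >= d), key=joint.get)

print("MPE:", mpe, joint[mpe])
print("Best d-dissimilar explanation:", dissimilar, joint[dissimilar])
```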
Modeling Word Emotion in Historical Language: Quantity Beats Supposed Stability in Seed Word Selection
Title | Modeling Word Emotion in Historical Language: Quantity Beats Supposed Stability in Seed Word Selection |
Authors | Johannes Hellrich, Sven Buechel, Udo Hahn |
Abstract | To understand historical texts, we must be aware that language – including the emotional connotation attached to words – changes over time. In this paper, we aim to estimate the emotion associated with a given word in former language stages of English and German. Emotion is represented following the popular Valence-Arousal-Dominance (VAD) annotation scheme. While VAD is more expressive than polarity alone, existing word emotion induction methods are typically not suited to it. To overcome this limitation, we present adaptations of two popular algorithms to VAD. To measure their effectiveness in diachronic settings, we present the first gold standard for historical word emotions, which was created by scholars with proficiency in the respective language stages and covers both English and German. In contrast to claims in previous work, our findings indicate that hand-selecting small sets of seed words with supposedly stable emotional meaning is actually harmful rather than helpful. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08115v2 |
http://arxiv.org/pdf/1806.08115v2.pdf | |
PWC | https://paperswithcode.com/paper/inducing-affective-lexical-semantics-in |
Repo | |
Framework | |
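
One simple way to picture VAD induction from seed words (not the paper's two adapted algorithms; the embeddings and seed lexicon below are illustrative assumptions): estimate a target word's Valence-Arousal-Dominance as a similarity-weighted average of seed-word VAD scores in an embedding space.

```python
import numpy as np

# Hypothetical word embeddings and a tiny seed lexicon with VAD scores in [1, 9].
emb = {
    "joy":    np.array([0.9, 0.1, 0.2]),
    "grief":  np.array([-0.8, 0.2, 0.1]),
    "terror": np.array([-0.6, 0.9, -0.3]),
    "mirth":  np.array([0.8, 0.2, 0.1]),   # target word to be scored
}
seed_vad = {"joy": (8.2, 5.5, 6.9), "grief": (2.0, 4.8, 3.2), "terror": (1.9, 7.8, 2.5)}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def induce_vad(word, emb, seed_vad):
    # Similarity-weighted average of seed VAD values (negative similarities clipped to 0).
    sims = {s: max(cosine(emb[word], emb[s]), 0.0) for s in seed_vad}
    total = sum(sims.values()) or 1.0
    return tuple(sum(sims[s] * v[i] for s, v in seed_vad.items()) / total for i in range(3))

print(induce_vad("mirth", emb, seed_vad))   # lands near the 'joy' seed, as expected
```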
Answer Set Programming Modulo `Space-Time’
Title | Answer Set Programming Modulo `Space-Time’ |
Authors | Carl Schultz, Mehul Bhatt, Jakob Suchan, Przemysław Wałęga |
Abstract | We present `ASP Modulo Space-Time’, a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities combine and synergise for applications in a range of AI application areas where the processing and interpretation of spatio-temporal data is crucial. The framework and resulting system constitute the only general KR-based method for declaratively reasoning about the dynamics of `space-time’ regions as first-class objects. We present an empirical evaluation (with scalability and robustness results), and include diverse application examples involving interpretation and control tasks. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06861v1 |
http://arxiv.org/pdf/1805.06861v1.pdf | |
PWC | https://paperswithcode.com/paper/answer-set-programming-modulo-space-time |
Repo | |
Framework | |
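
The system itself is ASP-based, but the flavour of mixed qualitative-quantitative space-time reasoning can be illustrated in plain Python: derive a qualitative temporal (Allen-style) relation and a coarse spatial (RCC-style) relation from quantitative interval and box data. Everything below is an illustrative assumption, not the framework's own API or encoding.

```python
def allen_relation(a, b):
    """Qualitative Allen-style relation between two time intervals (start, end)."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:  return "before"
    if a1 == b0: return "meets"
    if a0 == b0 and a1 == b1: return "equals"
    if a0 >= b0 and a1 <= b1: return "during-or-starts/finishes"
    if b0 >= a0 and b1 <= a1: return "contains"
    return "overlaps"

def rcc_relation(p, q):
    """Coarse RCC-style relation between axis-aligned boxes ((xmin, ymin), (xmax, ymax))."""
    (px0, py0), (px1, py1) = p
    (qx0, qy0), (qx1, qy1) = q
    if px1 < qx0 or qx1 < px0 or py1 < qy0 or qy1 < py0:
        return "disconnected"
    if px0 >= qx0 and py0 >= qy0 and px1 <= qx1 and py1 <= qy1:
        return "part-of"
    return "overlapping"

# A 'space-time' region = a spatial box plus the time interval during which it holds.
car   = (((0, 0), (2, 2)), (0, 5))
plaza = (((1, 1), (6, 6)), (3, 9))
print(rcc_relation(car[0], plaza[0]), allen_relation(car[1], plaza[1]))
```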
KNPTC: Knowledge and Neural Machine Translation Powered Chinese Pinyin Typo Correction
Title | KNPTC: Knowledge and Neural Machine Translation Powered Chinese Pinyin Typo Correction |
Authors | Hengyi Cai, Xingguang Ji, Yonghao Song, Yan Jin, Yang Zhang, Mairgup Mansur, Xiaofang Zhao |
Abstract | Chinese pinyin input methods are very important for Chinese language processing. In practice, users inevitably make typos when they input pinyin. Moreover, pinyin typo correction has become an increasingly important task with the popularity of smartphones and the mobile Internet. How to exploit the knowledge of users’ typing behaviors and support typo correction for acronym pinyin remains a challenging problem. To tackle these challenges, we propose KNPTC, a novel approach based on neural machine translation (NMT). In contrast to previous work, KNPTC is able to integrate explicit knowledge into NMT for pinyin typo correction, and learns to correct a variety of typos without the guidance of manually selected constraints or language-specific features. In this approach, we first obtain the transition probabilities between adjacent letters from large-scale real-life datasets. Then, we construct the “ground-truth” alignments of training sentence pairs by utilizing these probabilities. Furthermore, these alignments are integrated into NMT to capture sensible pinyin typo correction patterns. Applied to correcting typos in real-life datasets, KNPTC achieves a 32.77% average improvement in typo correction accuracy over the state-of-the-art system. |
Tasks | Machine Translation |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.00741v1 |
http://arxiv.org/pdf/1805.00741v1.pdf | |
PWC | https://paperswithcode.com/paper/knptc-knowledge-and-neural-machine |
Repo | |
Framework | |
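
A minimal sketch of the first step described above: estimating transition probabilities between adjacent letters from a corpus of pinyin strings. The tiny corpus and the add-alpha smoothing are illustrative assumptions; the paper builds these statistics from large-scale real-life typing data and feeds the resulting alignments into an NMT model.

```python
from collections import Counter

# Tiny illustrative corpus of typed pinyin strings.
corpus = ["nihao", "zhongguo", "pinyin", "shouji", "beijing"]

bigrams, unigrams = Counter(), Counter()
for word in corpus:
    for a, b in zip(word, word[1:]):
        bigrams[(a, b)] += 1
        unigrams[a] += 1

def transition_prob(a, b, alpha=0.1, vocab_size=26):
    # Add-alpha smoothed P(next letter = b | current letter = a).
    return (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab_size)

print(round(transition_prob("n", "g"), 3), round(transition_prob("n", "x"), 3))
```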
Interpreting weight maps in terms of cognitive or clinical neuroscience: nonsense?
Title | Interpreting weight maps in terms of cognitive or clinical neuroscience: nonsense? |
Authors | Jessica Schrouff, Janaina Mourao-Miranda |
Abstract | Since machine learning models were first applied to neuroimaging data, researchers have drawn conclusions from the derived weight maps. In particular, weight maps of classifiers between two conditions are often described as a proxy for the underlying signal differences between the conditions. Recent studies have, however, suggested that such weight maps may not reliably recover the source of the neural signals and can even lead to false positives (FP). In this work, we used semi-simulated data from ElectroCorticoGraphy (ECoG) to investigate how the signal-to-noise ratio and sparsity of the neural signal affect the similarity between signal and weights. We show that not all cases produce FP and that it is unlikely for FP features to have a high weight in most cases. |
Tasks | |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11259v1 |
http://arxiv.org/pdf/1804.11259v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-weight-maps-in-terms-of |
Repo | |
Framework | |
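
A small simulation in the spirit of the study above (purely illustrative numbers, not the ECoG setup): embed a sparse signal in noise, train a linear classifier, and compare the weight map against the true signal support to see whether non-signal features pick up large weights (false positives).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, k, snr = 400, 50, 5, 1.0          # samples, features, signal features, signal-to-noise ratio

y = rng.integers(0, 2, size=n)
signal = np.zeros(d); signal[:k] = 1.0  # sparse "true" pattern on the first k features
X = rng.normal(size=(n, d))             # noise on every feature
X += snr * np.outer(2 * y - 1, signal)  # class-dependent signal on the signal features only

clf = LogisticRegression(max_iter=2000).fit(X, y)
w = np.abs(clf.coef_.ravel())

top = np.argsort(w)[::-1][:k]           # features with the largest weights
false_pos = [i for i in top if i >= k]  # top-weight features carrying no true signal
print("top-weight features:", sorted(top), "false positives among them:", false_pos)
```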
Logically-Constrained Reinforcement Learning
Title | Logically-Constrained Reinforcement Learning |
Authors | Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening |
Abstract | We present the first model-free Reinforcement Learning (RL) algorithm to synthesise policies for an unknown Markov Decision Process (MDP) such that a linear temporal logic (LTL) property is satisfied. The given temporal property is converted into a Limit Deterministic Büchi Automaton (LDBA), and a robust reward function is defined over the state-action pairs of the MDP according to the resulting LDBA. With this reward function, the policy synthesis procedure is “constrained” by the given specification. These constraints guide the MDP exploration so as to minimize the solution time by only considering the portion of the MDP that is relevant to satisfaction of the LTL property. This improves the performance and scalability of the proposed method by avoiding an exhaustive update over the whole state space, whereas the efficiency of standard methods such as dynamic programming is hindered by the excessive memory required to store a full model. Additionally, we show that the RL procedure sets up a local value iteration method to efficiently calculate the maximum probability of satisfying the given property at any given state of the MDP. We prove that our algorithm is guaranteed to find a policy whose traces probabilistically satisfy the LTL property if such a policy exists, and we additionally show that our method produces reasonable control policies even when the LTL property cannot be satisfied. The performance of the algorithm is evaluated via a set of numerical examples. We observe an improvement of one order of magnitude in the number of iterations required for synthesis compared to existing approaches. |
Tasks | |
Published | 2018-01-24 |
URL | http://arxiv.org/abs/1801.08099v8 |
http://arxiv.org/pdf/1801.08099v8.pdf | |
PWC | https://paperswithcode.com/paper/logically-constrained-reinforcement-learning |
Repo | |
Framework | |
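
A toy sketch of the core idea, not the paper's LDBA construction: compose an MDP with a small automaton that tracks progress toward a temporal goal, give reward only when the automaton reaches its accepting state, and run tabular Q-learning on the product. The corridor MDP, the two-state automaton ("eventually reach the goal"), and the hyperparameters are illustrative assumptions.

```python
import random
random.seed(0)

N = 5                                   # 1-D corridor: states 0..N-1, goal at N-1
ACTIONS = (-1, +1)
Q = {((s, q), a): 0.0 for s in range(N) for q in (0, 1) for a in ACTIONS}

def automaton_step(q, s):
    # q = 1 (accepting) once the goal state has been visited.
    return 1 if (q == 1 or s == N - 1) else 0

alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(2000):
    s, q = 0, 0
    for _ in range(30):
        a = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda a: Q[((s, q), a)])
        s2 = min(max(s + a, 0), N - 1)
        q2 = automaton_step(q, s2)
        r = 1.0 if (q == 0 and q2 == 1) else 0.0      # reward only on reaching acceptance
        Q[((s, q), a)] += alpha * (r + gamma * max(Q[((s2, q2), b)] for b in ACTIONS) - Q[((s, q), a)])
        s, q = s2, q2

policy = {s: max(ACTIONS, key=lambda a: Q[((s, 0), a)]) for s in range(N - 1)}
print(policy)   # expected: +1 (move right) from every pre-acceptance state
```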
Vision Based Dynamic Offside Line Marker for Soccer Games
Title | Vision Based Dynamic Offside Line Marker for Soccer Games |
Authors | Karthik Muthuraman, Pranav Joshi, Suraj Kiran Raman |
Abstract | Offside calls have emerged as some of the most important decisions in soccer, with an average of 50 offside decisions every game. False detections and rash calls adversely affect game conditions and in many cases drastically change the outcome of the game. The human eye has finite precision and can only discern a limited amount of detail in a given instance. Current offside decisions are made manually by sideline referees and tend to remain controversial in many games. This calls for automated offside detection techniques to assist accurate refereeing. In this work, we explicitly use computer vision and image processing techniques such as the Hough transform, color similarity (quantization), graph connected components, and vanishing-point ideas to identify the probable offside regions. Keywords: Hough transform, connected components, KLT tracking, color similarity. |
Tasks | Quantization |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06438v1 |
http://arxiv.org/pdf/1804.06438v1.pdf | |
PWC | https://paperswithcode.com/paper/vision-based-dynamic-offside-line-marker-for |
Repo | |
Framework | |
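
A minimal OpenCV sketch of the line-detection step mentioned in the abstract (candidate field lines via edge detection and the probabilistic Hough transform). The file name and thresholds are illustrative assumptions; the paper additionally uses colour quantization, connected components, KLT tracking and vanishing-point estimation.

```python
import cv2
import numpy as np

frame = cv2.imread("frame.jpg")                      # hypothetical broadcast frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Probabilistic Hough transform: candidate field lines as (x1, y1, x2, y2) segments.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)

for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
    cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)
cv2.imwrite("frame_lines.jpg", frame)
```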
Statement networks: a power structure narrative as depicted by newspapers
Title | Statement networks: a power structure narrative as depicted by newspapers |
Authors | Shoumik Sharar Chowdhury, Nazmus Saquib, Niamat Zawad, Manash Kumar Mandal, Syed Haque |
Abstract | We report a data mining pipeline and subsequent analysis to understand the core-periphery power structure depicted in three national newspapers in Bangladesh, based on statements made by people appearing in the news. Statements made by one actor about another actor can be considered a form of public conversation. Named entity recognition techniques can be used to create a temporal actor network from such conversations, which shows some unique structure and reveals both considerable room for improvement in news reporting and the top actors’ conversation preferences. Our results indicate cliquishness among powerful political leaders in terms of their appearance in the news. We also show how these cohesive cores form through the news articles and how, over a decade, news cycles change the actors belonging to these groups. |
Tasks | Named Entity Recognition |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03632v1 |
http://arxiv.org/pdf/1812.03632v1.pdf | |
PWC | https://paperswithcode.com/paper/statement-networks-a-power-structure |
Repo | |
Framework | |
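
A rough sketch of the pipeline's first stage, for illustration only and cruder than the paper's approach: it links persons co-mentioned in a sentence rather than extracting who made a statement about whom. It assumes spaCy with the `en_core_web_sm` model and networkx are installed; the sample articles are made up.

```python
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")
articles = [
    "Alice Rahman criticised Bashir Khan over the budget. Bashir Khan praised Carla Das.",
    "Carla Das said Alice Rahman misled parliament.",
]

G = nx.DiGraph()
for text in articles:
    for sent in nlp(text).sents:
        persons = [ent.text for ent in sent.ents if ent.label_ == "PERSON"]
        # Crude heuristic: treat the first mentioned person as the speaker, later ones as targets.
        for target in persons[1:]:
            G.add_edge(persons[0], target)

# Core-periphery proxy: k-core numbers on the undirected statement graph.
print(nx.core_number(nx.Graph(G)))
```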
Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
Title | Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile |
Authors | Panayotis Mertikopoulos, Bruno Lecouat, Houssam Zenati, Chuan-Sheng Foo, Vijay Chandrasekhar, Georgios Piliouras |
Abstract | Owing to their connection with generative adversarial networks (GANs), saddle-point problems have recently attracted considerable interest in machine learning and beyond. By necessity, most theoretical guarantees revolve around convex-concave (or even linear) problems; however, making theoretical inroads towards efficient GAN training depends crucially on moving beyond this classic framework. To make piecemeal progress along these lines, we analyze the behavior of mirror descent (MD) in a class of non-monotone problems whose solutions coincide with those of a naturally associated variational inequality - a property which we call coherence. We first show that ordinary, “vanilla” MD converges under a strict version of this condition, but not otherwise; in particular, it may fail to converge even in bilinear models with a unique solution. We then show that this deficiency is mitigated by optimism: by taking an “extra-gradient” step, optimistic mirror descent (OMD) converges in all coherent problems. Our analysis generalizes and extends the results of Daskalakis et al. (2018) for optimistic gradient descent (OGD) in bilinear problems, and makes concrete headway for establishing convergence beyond convex-concave games. We also provide stochastic analogues of these results, and we validate our analysis by numerical experiments in a wide array of GAN models (including Gaussian mixture models, as well as the CelebA and CIFAR-10 datasets). |
Tasks | |
Published | 2018-07-07 |
URL | http://arxiv.org/abs/1807.02629v2 |
http://arxiv.org/pdf/1807.02629v2.pdf | |
PWC | https://paperswithcode.com/paper/optimistic-mirror-descent-in-saddle-point |
Repo | |
Framework | |
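
The effect of the "extra (gradient) mile" is easy to see on the bilinear saddle point min_x max_y xy, whose unique solution is (0, 0). Plain simultaneous gradient descent-ascent spirals outward, while the extra-gradient update converges. The step size and iteration count below are illustrative.

```python
import numpy as np

def grad(x, y):            # f(x, y) = x * y: df/dx = y, df/dy = x
    return y, x

eta, T = 0.1, 200

# Plain gradient descent-ascent.
x, y = 1.0, 1.0
for _ in range(T):
    gx, gy = grad(x, y)
    x, y = x - eta * gx, y + eta * gy
print("GDA:           ", round(np.hypot(x, y), 3))   # norm grows: the iterates spiral outward

# Extra-gradient: take a half-step, then update with the gradient evaluated at the half-step.
x, y = 1.0, 1.0
for _ in range(T):
    gx, gy = grad(x, y)
    xh, yh = x - eta * gx, y + eta * gy
    gxh, gyh = grad(xh, yh)
    x, y = x - eta * gxh, y + eta * gyh
print("extra-gradient:", round(np.hypot(x, y), 3))   # norm shrinks toward the equilibrium (0, 0)
```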
Multi-turn Dialogue Response Generation in an Adversarial Learning Framework
Title | Multi-turn Dialogue Response Generation in an Adversarial Learning Framework |
Authors | Oluwatobi Olabiyi, Alan Salimov, Anish Khazane, Erik T. Mueller |
Abstract | We propose an adversarial learning approach for generating multi-turn dialogue responses. Our proposed framework, hredGAN, is based on conditional generative adversarial networks (GANs). The GAN’s generator is a modified hierarchical recurrent encoder-decoder network (HRED) and the discriminator is a word-level bidirectional RNN that shares context and word embeddings with the generator. During inference, noise samples conditioned on the dialogue history are used to perturb the generator’s latent space to generate several possible responses. The final response is the one ranked best by the discriminator. The hredGAN shows improved performance over existing methods: (1) it generalizes better than networks trained using only the log-likelihood criterion, and (2) it generates longer, more informative and more diverse responses with high utterance and topic relevance even with limited training data. This improvement is demonstrated on the Movie triples and Ubuntu dialogue datasets using both automatic and human evaluations. |
Tasks | Word Embeddings |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.11752v5 |
https://arxiv.org/pdf/1805.11752v5.pdf | |
PWC | https://paperswithcode.com/paper/multi-turn-dialogue-response-generation-in-an |
Repo | |
Framework | |
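
The inference procedure described above (sample several noise-perturbed candidates, keep the one the discriminator ranks highest) reduces to a short loop. The generator and discriminator below are placeholder callables, purely illustrative of the control flow, not the hredGAN models.

```python
import random
random.seed(0)

# Placeholders: a real setup would use the trained HRED generator and bidirectional RNN discriminator.
def generator(history, noise):
    return f"response(noise={noise:.2f}) to: {history[-1]}"

def discriminator(history, response):
    return random.random()      # stand-in for the word-level adversarial score

def respond(history, n_candidates=8):
    candidates = [generator(history, random.gauss(0.0, 1.0)) for _ in range(n_candidates)]
    return max(candidates, key=lambda c: discriminator(history, c))

print(respond(["Hi there!", "How is the project going?"]))
```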
Training Neural Machine Translation using Word Embedding-based Loss
Title | Training Neural Machine Translation using Word Embedding-based Loss |
Authors | Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura |
Abstract | In neural machine translation (NMT), the computational cost at the output layer increases with the size of the target-side vocabulary. Using a limited-size vocabulary instead may cause a significant decrease in translation quality. This trade-off stems from a softmax-based loss function that handles in-dictionary words independently, without considering word similarity. In this paper, we propose a novel NMT loss function that incorporates word similarity in the form of distances in a word embedding space. The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder choose similar, acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation. In experiments on the ASPEC Japanese-to-English and IWSLT17 English-to-French datasets, the proposed method showed improvements over a standard NMT baseline on both datasets; on IWSLT17 En-Fr in particular, it achieved gains of up to +1.72 BLEU and +1.99 METEOR. When the target-side vocabulary was limited to only 1,000 words, the proposed method demonstrated a substantial gain of +1.72 METEOR on ASPEC Ja-En. |
Tasks | Machine Translation |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11219v1 |
http://arxiv.org/pdf/1807.11219v1.pdf | |
PWC | https://paperswithcode.com/paper/training-neural-machine-translation-using |
Repo | |
Framework | |
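
One plausible way to realise an embedding-distance training loss of the kind described above (an assumption for illustration, not necessarily the paper's exact formulation): penalise the distance between the expected output embedding under the model's predictive distribution and the reference word's embedding, on top of the usual cross-entropy.

```python
import torch
import torch.nn.functional as F

def embedding_aware_loss(logits, targets, out_emb, alpha=0.1):
    """logits: (batch, vocab), targets: (batch,), out_emb: (vocab, dim) output-side embeddings."""
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=-1)
    expected = probs @ out_emb              # expected embedding under the model distribution
    reference = out_emb[targets]            # embedding of the reference word
    dist = (expected - reference).pow(2).sum(dim=-1).mean()
    return ce + alpha * dist                # alpha trades off likelihood vs. embedding distance

# Toy usage with random tensors.
vocab, dim, batch = 1000, 64, 4
logits = torch.randn(batch, vocab)
targets = torch.randint(0, vocab, (batch,))
out_emb = torch.randn(vocab, dim)
print(embedding_aware_loss(logits, targets, out_emb).item())
```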
DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures
Title | DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures |
Authors | Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun |
Abstract | Recent breakthroughs in Neural Architecture Search (NAS) have achieved state-of-the-art performance in applications such as image classification and language modeling. However, these techniques typically ignore device-related objectives such as inference time, memory usage, and power consumption. Optimizing neural architectures for device-related objectives is crucial for deploying deep networks on portable devices with limited computing resources. We propose DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures, which optimizes both device-related (e.g., inference time and memory usage) and device-agnostic (e.g., accuracy and model size) objectives. DPP-Net employs a compact search space inspired by current state-of-the-art mobile CNNs, and further improves search efficiency by adopting progressive search (Liu et al. 2017). Experimental results on CIFAR-10 demonstrate the effectiveness of the Pareto-optimal networks found by DPP-Net for three different devices: (1) a workstation with a Titan X GPU, (2) an NVIDIA Jetson TX1 embedded system, and (3) a mobile phone with an ARM Cortex-A53. Compared to CondenseNet and NASNet (Mobile), DPP-Net achieves better performance: higher accuracy and shorter inference time on various devices. Additional experimental results show that models found by DPP-Net also achieve considerably good performance on ImageNet. |
Tasks | Image Classification, Language Modelling |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08198v2 |
http://arxiv.org/pdf/1806.08198v2.pdf | |
PWC | https://paperswithcode.com/paper/dpp-net-device-aware-progressive-search-for |
Repo | |
Framework | |
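
"Pareto-optimal" here means no other candidate is at least as good on every objective and strictly better on at least one. A tiny sketch of filtering candidate architectures by (error, latency), with made-up numbers; the actual search uses a learned surrogate and progressive expansion.

```python
# Hypothetical candidates: (name, error rate, inference time in ms) - lower is better on both.
candidates = [
    ("net-A", 0.055, 12.0),
    ("net-B", 0.048, 25.0),
    ("net-C", 0.060, 10.0),
    ("net-D", 0.050, 30.0),   # dominated by net-B (worse on both objectives)
]

def dominates(p, q):
    # p dominates q if it is no worse on every objective and strictly better on one.
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

pareto = [c for c in candidates
          if not any(dominates(o[1:], c[1:]) for o in candidates if o is not c)]
print([name for name, *_ in pareto])   # ['net-A', 'net-B', 'net-C']
```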
DP-ADMM: ADMM-based Distributed Learning with Differential Privacy
Title | DP-ADMM: ADMM-based Distributed Learning with Differential Privacy |
Authors | Zonghao Huang, Rui Hu, Yuanxiong Guo, Eric Chan-Tin, Yanmin Gong |
Abstract | Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an iterative process of local computation and message passing. Such an iterative process could cause privacy concerns for data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches to differentially private ADMM exhibit low utility under a high privacy guarantee and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under a high end-to-end differential privacy guarantee. |
Tasks | |
Published | 2018-08-30 |
URL | https://arxiv.org/abs/1808.10101v6 |
https://arxiv.org/pdf/1808.10101v6.pdf | |
PWC | https://paperswithcode.com/paper/dp-admm-admm-based-distributed-learning-with |
Repo | |
Framework | |
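
A compact consensus-ADMM sketch with per-iteration Gaussian noise added to the local updates, to illustrate the mechanism described above. The noise scale is not calibrated to a formal (epsilon, delta) guarantee, and the local step uses an exact ridge-style solve rather than the paper's approximate augmented Lagrangian; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_workers, n_local = 5, 4, 50
w_true = rng.normal(size=d)
data = []
for _ in range(n_workers):
    A = rng.normal(size=(n_local, d))
    b = A @ w_true + 0.1 * rng.normal(size=n_local)
    data.append((A, b))

rho, sigma, T = 1.0, 0.3, 100          # sigma: per-iteration Gaussian noise scale (illustrative)
x = np.zeros((n_workers, d))
u = np.zeros((n_workers, d))
z = np.zeros(d)
for _ in range(T):
    for i, (A, b) in enumerate(data):
        # Local least-squares step of the augmented Lagrangian, then Gaussian perturbation.
        x[i] = np.linalg.solve(A.T @ A + rho * np.eye(d), A.T @ b + rho * (z - u[i]))
        x[i] += sigma * rng.normal(size=d)
    z = (x + u).mean(axis=0)           # consensus (z) update
    u += x - z                         # dual update

print("noisy consensus estimate:", np.round(z, 2))
print("ground truth:            ", np.round(w_true, 2))
```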
Fairness in Supervised Learning: An Information Theoretic Approach
Title | Fairness in Supervised Learning: An Information Theoretic Approach |
Authors | AmirEmad Ghassami, Sajad Khodadadian, Negar Kiyavash |
Abstract | Automated decision making systems are increasingly being used in real-world applications. In these systems, the decision rules are, for the most part, derived by minimizing the training error on the available historical data. Therefore, if the data contain a bias related to a sensitive attribute such as gender, race, or religion, say due to cultural or historical discriminatory practices against a certain demographic, the system could perpetuate discrimination by incorporating that bias into its decision rule. We present an information theoretic framework for designing fair predictors from data, which aims to prevent discrimination against a specified sensitive attribute in a supervised learning setting. We use equalized odds as the criterion for discrimination, which demands that the prediction be independent of the protected attribute conditioned on the actual label. To ensure fairness and generalization simultaneously, we compress the data to an auxiliary variable, which is used for the prediction task. This auxiliary variable is chosen such that it is decontaminated from the discriminatory attribute in the sense of equalized odds. The final predictor is obtained by applying a Bayesian decision rule to the auxiliary variable. |
Tasks | Decision Making |
Published | 2018-01-13 |
URL | http://arxiv.org/abs/1801.04378v2 |
http://arxiv.org/pdf/1801.04378v2.pdf | |
PWC | https://paperswithcode.com/paper/fairness-in-supervised-learning-an |
Repo | |
Framework | |
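
Equalized odds requires the prediction to be independent of the sensitive attribute given the true label, i.e. P(Yhat = 1 | A = a, Y = y) should match across groups a for each y. A short check of the per-label gaps on synthetic predictions (all arrays below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
y    = rng.integers(0, 2, size=10_000)            # true labels
a    = rng.integers(0, 2, size=10_000)            # sensitive attribute
# A deliberately biased predictor: extra positives for group a == 1.
yhat = ((y + (a == 1) * (rng.random(10_000) < 0.2)) > 0).astype(int)

def equalized_odds_gaps(y, a, yhat):
    gaps = {}
    for label in (0, 1):
        rates = [yhat[(a == g) & (y == label)].mean() for g in (0, 1)]
        gaps[f"P(yhat=1 | y={label})"] = abs(rates[0] - rates[1])
    return gaps

print(equalized_odds_gaps(y, a, yhat))   # the y=0 gap (about 0.2) exposes the bias
```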
Multi-label Classification of User Reactions in Online News
Title | Multi-label Classification of User Reactions in Online News |
Authors | Zacarias Curi, Alceu de Souza Britto Jr, Emerson Cabrera Paraiso |
Abstract | The increase in the number of Internet users and the strong interaction brought by Web 2.0 have made opinion mining an important task in natural language processing. Although several methods are capable of performing this task, few use multi-label classification, where there is a set of true labels for each example. This type of classification is useful when opinions are analyzed from the reader’s perspective, since each person can have different interpretations of, and opinions on, the same subject. This paper discusses the efficiency of problem transformation methods combined with different classification algorithms for the task of multi-label classification of reactions in news texts. To do so, extensive tests were carried out on two news corpora written in Brazilian Portuguese and annotated with reactions. A new corpus called BFRC-PT is presented. In the tests performed, the highest number of correct predictions was obtained with the Classifier Chains method combined with the Random Forest algorithm. When considering the class distribution, the best results were obtained with the Binary Relevance method combined with the LSTM and Random Forest algorithms. |
Tasks | Multi-Label Classification, Opinion Mining |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02811v2 |
http://arxiv.org/pdf/1809.02811v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-label-classification-of-user-reactions |
Repo | |
Framework | |
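
The two problem-transformation methods highlighted in the abstract map directly onto scikit-learn components: `ClassifierChain` for classifier chains and `MultiOutputClassifier` for binary relevance (one independent classifier per label). A minimal sketch on synthetic multi-label data; the BFRC-PT corpus itself is not bundled with scikit-learn, so the data here is synthetic.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=1000, n_features=30, n_classes=6, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

models = {
    "classifier chains": ClassifierChain(RandomForestClassifier(n_estimators=100, random_state=0)),
    "binary relevance":  MultiOutputClassifier(RandomForestClassifier(n_estimators=100, random_state=0)),
}
for name, model in models.items():
    pred = model.fit(X_tr, Y_tr).predict(X_te)
    print(name, round(f1_score(Y_te, pred, average="micro"), 3))
```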