Paper Group ANR 319
Finding dissimilar explanations in Bayesian networks: Complexity results
Title | Finding dissimilar explanations in Bayesian networks: Complexity results |
Authors | Johan Kwisthout |
Abstract | Finding the most probable explanation for observed variables in a Bayesian network is a notoriously intractable problem, particularly if there are hidden variables in the network. In this paper we examine the complexity of a related problem, that is, the problem of finding a set of sufficiently dissimilar, yet all plausible, explanations. Applications of this problem arise, for example, in search query results (one does not want ten results that all link to the same website) and in decision support systems. We show that the problem of finding a ‘good enough’ explanation that differs in structure from the best explanation is at least as hard as finding the best explanation itself. |
Tasks | |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11391v2 |
http://arxiv.org/pdf/1810.11391v2.pdf | |
PWC | https://paperswithcode.com/paper/finding-dissimilar-explanations-in-bayesian |
Repo | |
Framework | |
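
The paper contrasts two computational problems: finding the most probable explanation (MPE) and finding a 'good enough' explanation that is structurally dissimilar from it. A minimal brute-force sketch over a toy joint distribution, not the paper's complexity construction; the probabilities and the Hamming-distance notion of dissimilarity below are illustrative assumptions:

```python
from itertools import product

# Toy joint distribution over three binary hidden variables (H1, H2, H3),
# given some fixed evidence; the probabilities are made up for illustration.
joint = {
    (0, 0, 0): 0.05, (0, 0, 1): 0.10, (0, 1, 0): 0.05, (0, 1, 1): 0.30,
    (1, 0, 0): 0.20, (1, 0, 1): 0.05, (1, 1, 0): 0.20, (1, 1, 1): 0.05,
}

def hamming(a, b):
    # Number of variables on which two explanations disagree.
    return sum(x != y for x, y in zip(a, b))

# Most probable explanation (MPE).
mpe = max(joint, key=joint.get)

# Best explanation that differs from the MPE in at least d variable assignments.
d = 2
dissimilar = max((e for e in joint if hamming(e, mpe) >= d), key=joint.get)

print("MPE:", mpe, joint[mpe])
print("Best d-dissimilar explanation:", dissimilar, joint[dissimilar])
```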
Modeling Word Emotion in Historical Language: Quantity Beats Supposed Stability in Seed Word Selection
Title | Modeling Word Emotion in Historical Language: Quantity Beats Supposed Stability in Seed Word Selection |
Authors | Johannes Hellrich, Sven Buechel, Udo Hahn |
Abstract | To understand historical texts, we must be aware that language – including the emotional connotation attached to words – changes over time. In this paper, we aim to estimate the emotion associated with a given word in former language stages of English and German. Emotion is represented following the popular Valence-Arousal-Dominance (VAD) annotation scheme. While VAD is more expressive than polarity alone, existing word emotion induction methods are typically not suited to it. To overcome this limitation, we present adaptations of two popular algorithms to VAD. To measure their effectiveness in diachronic settings, we present the first gold standard for historical word emotions, which was created by scholars with proficiency in the respective language stages and covers both English and German. In contrast to claims in previous work, our findings indicate that hand-selecting small sets of seed words with supposedly stable emotional meaning is actually harmful rather than helpful. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08115v2 |
http://arxiv.org/pdf/1806.08115v2.pdf | |
PWC | https://paperswithcode.com/paper/inducing-affective-lexical-semantics-in |
Repo | |
Framework | |
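
One simple way to picture VAD induction from seed words (not the paper's two adapted algorithms; the embeddings and seed lexicon below are illustrative assumptions): estimate a target word's Valence-Arousal-Dominance as a similarity-weighted average of seed-word VAD scores in an embedding space.

```python
import numpy as np

# Hypothetical word embeddings and a tiny seed lexicon with VAD scores in [1, 9].
emb = {
    "joy":    np.array([0.9, 0.1, 0.2]),
    "grief":  np.array([-0.8, 0.2, 0.1]),
    "terror": np.array([-0.6, 0.9, -0.3]),
    "mirth":  np.array([0.8, 0.2, 0.1]),   # target word to be scored
}
seed_vad = {"joy": (8.2, 5.5, 6.9), "grief": (2.0, 4.8, 3.2), "terror": (1.9, 7.8, 2.5)}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def induce_vad(word, emb, seed_vad):
    # Similarity-weighted average of seed VAD values (negative similarities clipped to 0).
    sims = {s: max(cosine(emb[word], emb[s]), 0.0) for s in seed_vad}
    total = sum(sims.values()) or 1.0
    return tuple(sum(sims[s] * v[i] for s, v in seed_vad.items()) / total for i in range(3))

print(induce_vad("mirth", emb, seed_vad))   # lands near the 'joy' seed, as expected
```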
Answer Set Programming Modulo `Space-Time’
Title | Answer Set Programming Modulo `Space-Time’ |
Authors | Carl Schultz, Mehul Bhatt, Jakob Suchan, Przemysław Wałęga |
Abstract | We present `ASP Modulo Space-Time’, a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities combine and synergise for applications in a range of AI application areas where the processing and interpretation of spatio-temporal data is crucial. The framework and resulting system constitute the only general KR-based method for declaratively reasoning about the dynamics of `space-time’ regions as first-class objects. We present an empirical evaluation (with scalability and robustness results), and include diverse application examples involving interpretation and control tasks. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06861v1 |
http://arxiv.org/pdf/1805.06861v1.pdf | |
PWC | https://paperswithcode.com/paper/answer-set-programming-modulo-space-time |
Repo | |
Framework | |
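
The system itself is ASP-based, but the flavour of mixed qualitative-quantitative space-time reasoning can be illustrated in plain Python: derive a qualitative temporal (Allen-style) relation and a coarse spatial (RCC-style) relation from quantitative interval and box data. Everything below is an illustrative assumption, not the framework's own API or encoding.

```python
def allen_relation(a, b):
    """Qualitative Allen-style relation between two time intervals (start, end)."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:  return "before"
    if a1 == b0: return "meets"
    if a0 == b0 and a1 == b1: return "equals"
    if a0 >= b0 and a1 <= b1: return "during-or-starts/finishes"
    if b0 >= a0 and b1 <= a1: return "contains"
    return "overlaps"

def rcc_relation(p, q):
    """Coarse RCC-style relation between axis-aligned boxes ((xmin, ymin), (xmax, ymax))."""
    (px0, py0), (px1, py1) = p
    (qx0, qy0), (qx1, qy1) = q
    if px1 < qx0 or qx1 < px0 or py1 < qy0 or qy1 < py0:
        return "disconnected"
    if px0 >= qx0 and py0 >= qy0 and px1 <= qx1 and py1 <= qy1:
        return "part-of"
    return "overlapping"

# A 'space-time' region = a spatial box plus the time interval during which it holds.
car   = (((0, 0), (2, 2)), (0, 5))
plaza = (((1, 1), (6, 6)), (3, 9))
print(rcc_relation(car[0], plaza[0]), allen_relation(car[1], plaza[1]))
```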
KNPTC: Knowledge and Neural Machine Translation Powered Chinese Pinyin Typo Correction
Title | KNPTC: Knowledge and Neural Machine Translation Powered Chinese Pinyin Typo Correction |
Authors | Hengyi Cai, Xingguang Ji, Yonghao Song, Yan Jin, Yang Zhang, Mairgup Mansur, Xiaofang Zhao |
Abstract | Chinese pinyin input methods are very important for Chinese language processing. In practice, users inevitably make typos when they input pinyin. Moreover, pinyin typo correction has become an increasingly important task with the popularity of smartphones and the mobile Internet. How to exploit the knowledge of users’ typing behaviors and support typo correction for acronym pinyin remains a challenging problem. To tackle these challenges, we propose KNPTC, a novel approach based on neural machine translation (NMT). In contrast to previous work, KNPTC is able to integrate explicit knowledge into NMT for pinyin typo correction, and learns to correct a variety of typos without the guidance of manually selected constraints or language-specific features. In this approach, we first obtain the transition probabilities between adjacent letters from large-scale real-life datasets. Then, we construct the “ground-truth” alignments of training sentence pairs by utilizing these probabilities. Furthermore, these alignments are integrated into NMT to capture sensible pinyin typo correction patterns. Applied to correcting typos in real-life datasets, KNPTC achieves a 32.77% average improvement in typo correction accuracy over the state-of-the-art system. |
Tasks | Machine Translation |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.00741v1 |
http://arxiv.org/pdf/1805.00741v1.pdf | |
PWC | https://paperswithcode.com/paper/knptc-knowledge-and-neural-machine |
Repo | |
Framework | |
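
A minimal sketch of the first step described above: estimating transition probabilities between adjacent letters from a corpus of pinyin strings. The tiny corpus and the add-alpha smoothing are illustrative assumptions; the paper builds these statistics from large-scale real-life typing data and feeds the resulting alignments into an NMT model.

```python
from collections import Counter

# Tiny illustrative corpus of typed pinyin strings.
corpus = ["nihao", "zhongguo", "pinyin", "shouji", "beijing"]

bigrams, unigrams = Counter(), Counter()
for word in corpus:
    for a, b in zip(word, word[1:]):
        bigrams[(a, b)] += 1
        unigrams[a] += 1

def transition_prob(a, b, alpha=0.1, vocab_size=26):
    # Add-alpha smoothed P(next letter = b | current letter = a).
    return (bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * vocab_size)

print(round(transition_prob("n", "g"), 3), round(transition_prob("n", "x"), 3))
```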
Interpreting weight maps in terms of cognitive or clinical neuroscience: nonsense?
Title | Interpreting weight maps in terms of cognitive or clinical neuroscience: nonsense? |
Authors | Jessica Schrouff, Janaina Mourao-Miranda |
Abstract | Since machine learning models were first applied to neuroimaging data, researchers have drawn conclusions from the derived weight maps. In particular, weight maps of classifiers between two conditions are often described as a proxy for the underlying signal differences between the conditions. Recent studies have, however, suggested that such weight maps may not reliably recover the source of the neural signals and can even lead to false positives (FP). In this work, we used semi-simulated data from ElectroCorticoGraphy (ECoG) to investigate how the signal-to-noise ratio and sparsity of the neural signal affect the similarity between signal and weights. We show that not all cases produce FP and that it is unlikely for FP features to have a high weight in most cases. |
Tasks | |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11259v1 |
http://arxiv.org/pdf/1804.11259v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-weight-maps-in-terms-of |
Repo | |
Framework | |
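
A small simulation in the spirit of the study above (purely illustrative numbers, not the ECoG setup): embed a sparse signal in noise, train a linear classifier, and compare the weight map against the true signal support to see whether non-signal features pick up large weights (false positives).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, k, snr = 400, 50, 5, 1.0          # samples, features, signal features, signal-to-noise ratio

y = rng.integers(0, 2, size=n)
signal = np.zeros(d); signal[:k] = 1.0  # sparse "true" pattern on the first k features
X = rng.normal(size=(n, d))             # noise on every feature
X += snr * np.outer(2 * y - 1, signal)  # class-dependent signal on the signal features only

clf = LogisticRegression(max_iter=2000).fit(X, y)
w = np.abs(clf.coef_.ravel())

top = np.argsort(w)[::-1][:k]           # features with the largest weights
false_pos = [i for i in top if i >= k]  # top-weight features carrying no true signal
print("top-weight features:", sorted(top), "false positives among them:", false_pos)
```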
Logically-Constrained Reinforcement Learning
Title | Logically-Constrained Reinforcement Learning |
Authors | Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening |
Abstract | We present the first model-free Reinforcement Learning (RL) algorithm to synthesise policies for an unknown Markov Decision Process (MDP) such that a linear temporal logic (LTL) property is satisfied. The given temporal property is converted into a Limit Deterministic Büchi Automaton (LDBA), and a robust reward function is defined over the state-action pairs of the MDP according to the resulting LDBA. With this reward function, the policy synthesis procedure is “constrained” by the given specification. These constraints guide the MDP exploration so as to minimize the solution time by only considering the portion of the MDP that is relevant to satisfaction of the LTL property. This improves the performance and scalability of the proposed method by avoiding an exhaustive update over the whole state space, whereas the efficiency of standard methods such as dynamic programming is hindered by the excessive memory required to store a full model. Additionally, we show that the RL procedure sets up a local value iteration method to efficiently calculate the maximum probability of satisfying the given property at any given state of the MDP. We prove that our algorithm is guaranteed to find a policy whose traces probabilistically satisfy the LTL property if such a policy exists, and we additionally show that our method produces reasonable control policies even when the LTL property cannot be satisfied. The performance of the algorithm is evaluated via a set of numerical examples. We observe an improvement of one order of magnitude in the number of iterations required for synthesis compared to existing approaches. |
Tasks | |
Published | 2018-01-24 |
URL | http://arxiv.org/abs/1801.08099v8 |
http://arxiv.org/pdf/1801.08099v8.pdf | |
PWC | https://paperswithcode.com/paper/logically-constrained-reinforcement-learning |
Repo | |
Framework | |
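
A toy sketch of the core idea, not the paper's LDBA construction: compose an MDP with a small automaton that tracks progress toward a temporal goal, give reward only when the automaton reaches its accepting state, and run tabular Q-learning on the product. The corridor MDP, the two-state automaton ("eventually reach the goal"), and the hyperparameters are illustrative assumptions.

```python
import random
random.seed(0)

N = 5                                   # 1-D corridor: states 0..N-1, goal at N-1
ACTIONS = (-1, +1)
Q = {((s, q), a): 0.0 for s in range(N) for q in (0, 1) for a in ACTIONS}

def automaton_step(q, s):
    # q = 1 (accepting) once the goal state has been visited.
    return 1 if (q == 1 or s == N - 1) else 0

alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(2000):
    s, q = 0, 0
    for _ in range(30):
        a = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda a: Q[((s, q), a)])
        s2 = min(max(s + a, 0), N - 1)
        q2 = automaton_step(q, s2)
        r = 1.0 if (q == 0 and q2 == 1) else 0.0      # reward only on reaching acceptance
        Q[((s, q), a)] += alpha * (r + gamma * max(Q[((s2, q2), b)] for b in ACTIONS) - Q[((s, q), a)])
        s, q = s2, q2

policy = {s: max(ACTIONS, key=lambda a: Q[((s, 0), a)]) for s in range(N - 1)}
print(policy)   # expected: +1 (move right) from every pre-acceptance state
```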
Vision Based Dynamic Offside Line Marker for Soccer Games
Title | Vision Based Dynamic Offside Line Marker for Soccer Games |
Authors | Karthik Muthuraman, Pranav Joshi, Suraj Kiran Raman |
Abstract | Offside calls have emerged as some of the most important decisions in soccer, with an average of 50 offside decisions every game. False detections and rash calls adversely affect game conditions and in many cases drastically change the outcome of the game. The human eye has finite precision and can only discern a limited amount of detail in a given instance. Current offside decisions are made manually by sideline referees and tend to remain controversial in many games. This calls for automated offside detection techniques to assist accurate refereeing. In this work, we explicitly use computer vision and image processing techniques such as the Hough transform, color similarity (quantization), graph connected components, and vanishing-point ideas to identify the probable offside regions. Keywords: Hough transform, connected components, KLT tracking, color similarity. |
Tasks | Quantization |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06438v1 |
http://arxiv.org/pdf/1804.06438v1.pdf | |
PWC | https://paperswithcode.com/paper/vision-based-dynamic-offside-line-marker-for |
Repo | |
Framework | |
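
A minimal OpenCV sketch of the line-detection step mentioned in the abstract (candidate field lines via edge detection and the probabilistic Hough transform). The file name and thresholds are illustrative assumptions; the paper additionally uses colour quantization, connected components, KLT tracking and vanishing-point estimation.

```python
import cv2
import numpy as np

frame = cv2.imread("frame.jpg")                      # hypothetical broadcast frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Probabilistic Hough transform: candidate field lines as (x1, y1, x2, y2) segments.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)

for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
    cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)
cv2.imwrite("frame_lines.jpg", frame)
```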
Statement networks: a power structure narrative as depicted by newspapers
Title | Statement networks: a power structure narrative as depicted by newspapers |
Authors | Shoumik Sharar Chowdhury, Nazmus Saquib, Niamat Zawad, Manash Kumar Mandal, Syed Haque |
Abstract | We report a data mining pipeline and subsequent analysis to understand the core-periphery power structure depicted in three national newspapers in Bangladesh, based on statements made by people appearing in the news. Statements made by one actor about another actor can be considered a form of public conversation. Named entity recognition techniques can be used to create a temporal actor network from such conversations, which shows some unique structure and reveals both considerable room for improvement in news reporting and the top actors’ conversation preferences. Our results indicate cliquishness among powerful political leaders in terms of their appearance in the news. We also show how these cohesive cores form through the news articles and how, over a decade, news cycles change the actors belonging to these groups. |
Tasks | Named Entity Recognition |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03632v1 |
http://arxiv.org/pdf/1812.03632v1.pdf | |
PWC | https://paperswithcode.com/paper/statement-networks-a-power-structure |
Repo | |
Framework | |
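
A rough sketch of the pipeline's first stage, for illustration only and cruder than the paper's approach: it links persons co-mentioned in a sentence rather than extracting who made a statement about whom. It assumes spaCy with the `en_core_web_sm` model and networkx are installed; the sample articles are made up.

```python
import spacy
import networkx as nx

nlp = spacy.load("en_core_web_sm")
articles = [
    "Alice Rahman criticised Bashir Khan over the budget. Bashir Khan praised Carla Das.",
    "Carla Das said Alice Rahman misled parliament.",
]

G = nx.DiGraph()
for text in articles:
    for sent in nlp(text).sents:
        persons = [ent.text for ent in sent.ents if ent.label_ == "PERSON"]
        # Crude heuristic: treat the first mentioned person as the speaker, later ones as targets.
        for target in persons[1:]:
            G.add_edge(persons[0], target)

# Core-periphery proxy: k-core numbers on the undirected statement graph.
print(nx.core_number(nx.Graph(G)))
```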
Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
Title | Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile |
Authors | Panayotis Mertikopoulos, Bruno Lecouat, Houssam Zenati, Chuan-Sheng Foo, Vijay Chandrasekhar, Georgios Piliouras |
Abstract | Owing to their connection with generative adversarial networks (GANs), saddle-point problems have recently attracted considerable interest in machine learning and beyond. By necessity, most theoretical guarantees revolve around convex-concave (or even linear) problems; however, making theoretical inroads towards efficient GAN training depends crucially on moving beyond this classic framework. To make piecemeal progress along these lines, we analyze the behavior of mirror descent (MD) in a class of non-monotone problems whose solutions coincide with those of a naturally associated variational inequality - a property which we call coherence. We first show that ordinary, “vanilla” MD converges under a strict version of this condition, but not otherwise; in particular, it may fail to converge even in bilinear models with a unique solution. We then show that this deficiency is mitigated by optimism: by taking an “extra-gradient” step, optimistic mirror descent (OMD) converges in all coherent problems. Our analysis generalizes and extends the results of Daskalakis et al. (2018) for optimistic gradient descent (OGD) in bilinear problems, and makes concrete headway for establishing convergence beyond convex-concave games. We also provide stochastic analogues of these results, and we validate our analysis by numerical experiments in a wide array of GAN models (including Gaussian mixture models, as well as the CelebA and CIFAR-10 datasets). |
Tasks | |
Published | 2018-07-07 |
URL | http://arxiv.org/abs/1807.02629v2 |
http://arxiv.org/pdf/1807.02629v2.pdf | |
PWC | https://paperswithcode.com/paper/optimistic-mirror-descent-in-saddle-point |
Repo | |
Framework | |
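
The effect of the "extra (gradient) mile" is easy to see on the bilinear saddle point min_x max_y xy, whose unique solution is (0, 0). Plain simultaneous gradient descent-ascent spirals outward, while the extra-gradient update converges. The step size and iteration count below are illustrative.

```python
import numpy as np

def grad(x, y):            # f(x, y) = x * y: df/dx = y, df/dy = x
    return y, x

eta, T = 0.1, 200

# Plain gradient descent-ascent.
x, y = 1.0, 1.0
for _ in range(T):
    gx, gy = grad(x, y)
    x, y = x - eta * gx, y + eta * gy
print("GDA:           ", round(np.hypot(x, y), 3))   # norm grows: the iterates spiral outward

# Extra-gradient: take a half-step, then update with the gradient evaluated at the half-step.
x, y = 1.0, 1.0
for _ in range(T):
    gx, gy = grad(x, y)
    xh, yh = x - eta * gx, y + eta * gy
    gxh, gyh = grad(xh, yh)
    x, y = x - eta * gxh, y + eta * gyh
print("extra-gradient:", round(np.hypot(x, y), 3))   # norm shrinks toward the equilibrium (0, 0)
```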
Multi-turn Dialogue Response Generation in an Adversarial Learning Framework
Title | Multi-turn Dialogue Response Generation in an Adversarial Learning Framework |
Authors | Oluwatobi Olabiyi, Alan Salimov, Anish Khazane, Erik T. Mueller |
Abstract | We propose an adversarial learning approach for generating multi-turn dialogue responses. Our proposed framework, hredGAN, is based on conditional generative adversarial networks (GANs). The GAN’s generator is a modified hierarchical recurrent encoder-decoder network (HRED) and the discriminator is a word-level bidirectional RNN that shares context and word embeddings with the generator. During inference, noise samples conditioned on the dialogue history are used to perturb the generator’s latent space to generate several possible responses. The final response is the one ranked best by the discriminator. The hredGAN shows improved performance over existing methods: (1) it generalizes better than networks trained using only the log-likelihood criterion, and (2) it generates longer, more informative and more diverse responses with high utterance and topic relevance even with limited training data. This improvement is demonstrated on the Movie triples and Ubuntu dialogue datasets using both automatic and human evaluations. |
Tasks | Word Embeddings |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.11752v5 |
https://arxiv.org/pdf/1805.11752v5.pdf | |
PWC | https://paperswithcode.com/paper/multi-turn-dialogue-response-generation-in-an |
Repo | |
Framework | |
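
The inference procedure described above (sample several noise-perturbed candidates, keep the one the discriminator ranks highest) reduces to a short loop. The generator and discriminator below are placeholder callables, purely illustrative of the control flow, not the hredGAN models.

```python
import random
random.seed(0)

# Placeholders: a real setup would use the trained HRED generator and bidirectional RNN discriminator.
def generator(history, noise):
    return f"response(noise={noise:.2f}) to: {history[-1]}"

def discriminator(history, response):
    return random.random()      # stand-in for the word-level adversarial score

def respond(history, n_candidates=8):
    candidates = [generator(history, random.gauss(0.0, 1.0)) for _ in range(n_candidates)]
    return max(candidates, key=lambda c: discriminator(history, c))

print(respond(["Hi there!", "How is the project going?"]))
```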
Training Neural Machine Translation using Word Embedding-based Loss
Title | Training Neural Machine Translation using Word Embedding-based Loss |
Authors | Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura |
Abstract | In neural machine translation (NMT), the computational cost at the output layer increases with the size of the target-side vocabulary. Using a limited-size vocabulary instead may cause a significant decrease in translation quality. This trade-off stems from a softmax-based loss function that handles in-dictionary words independently, without considering word similarity. In this paper, we propose a novel NMT loss function that incorporates word similarity in the form of distances in a word embedding space. The proposed loss function encourages an NMT decoder to generate words close to their references in the embedding space; this helps the decoder choose similar, acceptable words when the actual best candidates are not included in the vocabulary due to its size limitation. In experiments on the ASPEC Japanese-to-English and IWSLT17 English-to-French datasets, the proposed method showed improvements over a standard NMT baseline on both datasets; on IWSLT17 En-Fr in particular, it achieved gains of up to +1.72 BLEU and +1.99 METEOR. When the target-side vocabulary was limited to only 1,000 words, the proposed method demonstrated a substantial gain of +1.72 METEOR on ASPEC Ja-En. |
Tasks | Machine Translation |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11219v1 |
http://arxiv.org/pdf/1807.11219v1.pdf | |
PWC | https://paperswithcode.com/paper/training-neural-machine-translation-using |
Repo | |
Framework | |
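
One plausible way to realise an embedding-distance training loss of the kind described above (an assumption for illustration, not necessarily the paper's exact formulation): penalise the distance between the expected output embedding under the model's predictive distribution and the reference word's embedding, on top of the usual cross-entropy.

```python
import torch
import torch.nn.functional as F

def embedding_aware_loss(logits, targets, out_emb, alpha=0.1):
    """logits: (batch, vocab), targets: (batch,), out_emb: (vocab, dim) output-side embeddings."""
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=-1)
    expected = probs @ out_emb              # expected embedding under the model distribution
    reference = out_emb[targets]            # embedding of the reference word
    dist = (expected - reference).pow(2).sum(dim=-1).mean()
    return ce + alpha * dist                # alpha trades off likelihood vs. embedding distance

# Toy usage with random tensors.
vocab, dim, batch = 1000, 64, 4
logits = torch.randn(batch, vocab)
targets = torch.randint(0, vocab, (batch,))
out_emb = torch.randn(vocab, dim)
print(embedding_aware_loss(logits, targets, out_emb).item())
```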
DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures
Title | DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures |
Authors | Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, Min Sun |
Abstract | Recent breakthroughs in Neural Architecture Search (NAS) have achieved state-of-the-art performance in applications such as image classification and language modeling. However, these techniques typically ignore device-related objectives such as inference time, memory usage, and power consumption. Optimizing neural architectures for device-related objectives is crucial for deploying deep networks on portable devices with limited computing resources. We propose DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures, which optimizes both device-related (e.g., inference time and memory usage) and device-agnostic (e.g., accuracy and model size) objectives. DPP-Net employs a compact search space inspired by current state-of-the-art mobile CNNs, and further improves search efficiency by adopting progressive search (Liu et al. 2017). Experimental results on CIFAR-10 demonstrate the effectiveness of the Pareto-optimal networks found by DPP-Net for three different devices: (1) a workstation with a Titan X GPU, (2) an NVIDIA Jetson TX1 embedded system, and (3) a mobile phone with an ARM Cortex-A53. Compared to CondenseNet and NASNet (Mobile), DPP-Net achieves better performance: higher accuracy and shorter inference time on various devices. Additional experimental results show that models found by DPP-Net also achieve considerably good performance on ImageNet. |
Tasks | Image Classification, Language Modelling |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08198v2 |
http://arxiv.org/pdf/1806.08198v2.pdf | |
PWC | https://paperswithcode.com/paper/dpp-net-device-aware-progressive-search-for |
Repo | |
Framework | |
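
"Pareto-optimal" here means no other candidate is at least as good on every objective and strictly better on at least one. A tiny sketch of filtering candidate architectures by (error, latency), with made-up numbers; the actual search uses a learned surrogate and progressive expansion.

```python
# Hypothetical candidates: (name, error rate, inference time in ms) - lower is better on both.
candidates = [
    ("net-A", 0.055, 12.0),
    ("net-B", 0.048, 25.0),
    ("net-C", 0.060, 10.0),
    ("net-D", 0.050, 30.0),   # dominated by net-B (worse on both objectives)
]

def dominates(p, q):
    # p dominates q if it is no worse on every objective and strictly better on one.
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

pareto = [c for c in candidates
          if not any(dominates(o[1:], c[1:]) for o in candidates if o is not c)]
print([name for name, *_ in pareto])   # ['net-A', 'net-B', 'net-C']
```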
DP-ADMM: ADMM-based Distributed Learning with Differential Privacy
Title | DP-ADMM: ADMM-based Distributed Learning with Differential Privacy |
Authors | Zonghao Huang, Rui Hu, Yuanxiong Guo, Eric Chan-Tin, Yanmin Gong |
Abstract | Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an iterative process of local computation and message passing. Such an iterative process could cause privacy concerns for data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches to differentially private ADMM exhibit low utility under a high privacy guarantee and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under a high end-to-end differential privacy guarantee. |
Tasks | |
Published | 2018-08-30 |
URL | https://arxiv.org/abs/1808.10101v6 |
https://arxiv.org/pdf/1808.10101v6.pdf | |
PWC | https://paperswithcode.com/paper/dp-admm-admm-based-distributed-learning-with |
Repo | |
Framework | |
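
A compact consensus-ADMM sketch with per-iteration Gaussian noise added to the local updates, to illustrate the mechanism described above. The noise scale is not calibrated to a formal (epsilon, delta) guarantee, and the local step uses an exact ridge-style solve rather than the paper's approximate augmented Lagrangian; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_workers, n_local = 5, 4, 50
w_true = rng.normal(size=d)
data = []
for _ in range(n_workers):
    A = rng.normal(size=(n_local, d))
    b = A @ w_true + 0.1 * rng.normal(size=n_local)
    data.append((A, b))

rho, sigma, T = 1.0, 0.3, 100          # sigma: per-iteration Gaussian noise scale (illustrative)
x = np.zeros((n_workers, d))
u = np.zeros((n_workers, d))
z = np.zeros(d)
for _ in range(T):
    for i, (A, b) in enumerate(data):
        # Local least-squares step of the augmented Lagrangian, then Gaussian perturbation.
        x[i] = np.linalg.solve(A.T @ A + rho * np.eye(d), A.T @ b + rho * (z - u[i]))
        x[i] += sigma * rng.normal(size=d)
    z = (x + u).mean(axis=0)           # consensus (z) update
    u += x - z                         # dual update

print("noisy consensus estimate:", np.round(z, 2))
print("ground truth:            ", np.round(w_true, 2))
```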
Fairness in Supervised Learning: An Information Theoretic Approach
Title | Fairness in Supervised Learning: An Information Theoretic Approach |
Authors | AmirEmad Ghassami, Sajad Khodadadian, Negar Kiyavash |
Abstract | Automated decision making systems are increasingly being used in real-world applications. In these systems, the decision rules are, for the most part, derived by minimizing the training error on the available historical data. Therefore, if the data contain a bias related to a sensitive attribute such as gender, race, or religion, say due to cultural or historical discriminatory practices against a certain demographic, the system could perpetuate discrimination by incorporating that bias into its decision rule. We present an information theoretic framework for designing fair predictors from data, which aims to prevent discrimination against a specified sensitive attribute in a supervised learning setting. We use equalized odds as the criterion for discrimination, which demands that the prediction be independent of the protected attribute conditioned on the actual label. To ensure fairness and generalization simultaneously, we compress the data to an auxiliary variable, which is used for the prediction task. This auxiliary variable is chosen such that it is decontaminated from the discriminatory attribute in the sense of equalized odds. The final predictor is obtained by applying a Bayesian decision rule to the auxiliary variable. |
Tasks | Decision Making |
Published | 2018-01-13 |
URL | http://arxiv.org/abs/1801.04378v2 |
http://arxiv.org/pdf/1801.04378v2.pdf | |
PWC | https://paperswithcode.com/paper/fairness-in-supervised-learning-an |
Repo | |
Framework | |
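
Equalized odds requires the prediction to be independent of the sensitive attribute given the true label, i.e. P(Yhat = 1 | A = a, Y = y) should match across groups a for each y. A short check of the per-label gaps on synthetic predictions (all arrays below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
y    = rng.integers(0, 2, size=10_000)            # true labels
a    = rng.integers(0, 2, size=10_000)            # sensitive attribute
# A deliberately biased predictor: extra positives for group a == 1.
yhat = ((y + (a == 1) * (rng.random(10_000) < 0.2)) > 0).astype(int)

def equalized_odds_gaps(y, a, yhat):
    gaps = {}
    for label in (0, 1):
        rates = [yhat[(a == g) & (y == label)].mean() for g in (0, 1)]
        gaps[f"P(yhat=1 | y={label})"] = abs(rates[0] - rates[1])
    return gaps

print(equalized_odds_gaps(y, a, yhat))   # the y=0 gap (about 0.2) exposes the bias
```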
Multi-label Classification of User Reactions in Online News
Title | Multi-label Classification of User Reactions in Online News |
Authors | Zacarias Curi, Alceu de Souza Britto Jr, Emerson Cabrera Paraiso |
Abstract | The increase in the number of Internet users and the strong interaction brought by Web 2.0 have made opinion mining an important task in natural language processing. Although several methods are capable of performing this task, few use multi-label classification, where there is a set of true labels for each example. This type of classification is useful when opinions are analyzed from the reader’s perspective, since each person can have different interpretations of, and opinions on, the same subject. This paper discusses the efficiency of problem transformation methods combined with different classification algorithms for the task of multi-label classification of reactions in news texts. To do so, extensive tests were carried out on two news corpora written in Brazilian Portuguese and annotated with reactions. A new corpus called BFRC-PT is presented. In the tests performed, the highest number of correct predictions was obtained with the Classifier Chains method combined with the Random Forest algorithm. When considering the class distribution, the best results were obtained with the Binary Relevance method combined with the LSTM and Random Forest algorithms. |
Tasks | Multi-Label Classification, Opinion Mining |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02811v2 |
http://arxiv.org/pdf/1809.02811v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-label-classification-of-user-reactions |
Repo | |
Framework | |
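
The two problem-transformation methods highlighted in the abstract map directly onto scikit-learn components: `ClassifierChain` for classifier chains and `MultiOutputClassifier` for binary relevance (one independent classifier per label). A minimal sketch on synthetic multi-label data; the BFRC-PT corpus itself is not bundled with scikit-learn, so the data here is synthetic.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=1000, n_features=30, n_classes=6, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

models = {
    "classifier chains": ClassifierChain(RandomForestClassifier(n_estimators=100, random_state=0)),
    "binary relevance":  MultiOutputClassifier(RandomForestClassifier(n_estimators=100, random_state=0)),
}
for name, model in models.items():
    pred = model.fit(X_tr, Y_tr).predict(X_te)
    print(name, round(f1_score(Y_te, pred, average="micro"), 3))
```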