January 30, 2020

3162 words 15 mins read

Paper Group ANR 392

Transferable Recognition-Aware Image Processing. Improvement of Batch Normalization in Imbalanced Data. Theoretical Foundations of Defeasible Description Logics. Deep Generative Graph Distribution Learning for Synthetic Power Grids. Finding Your Voice: The Linguistic Development of Mental Health Counselors. JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus. Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits. Hierarchical Graph Network for Multi-hop Question Answering. Block-distributed Gradient Boosted Trees. Adversarial Examples for Deep Learning Cyber Security Analytics. Chinese-Japanese Unsupervised Neural Machine Translation Using Sub-character Level Information. Learning to Dynamically Coordinate Multi-Robot Teams in Graph Attention Networks. Sparse Tensor Additive Regression. Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics. Deep Smoothing of the Implied Volatility Surface.

Transferable Recognition-Aware Image Processing

Title Transferable Recognition-Aware Image Processing
Authors Zhuang Liu, Tinghui Zhou, Zhiqiang Shen, Bingyi Kang, Trevor Darrell
Abstract Recent progress in image recognition has stimulated the deployment of vision systems (e.g. image search engines) at an unprecedented scale. As a result, visual data are now often consumed not only by humans but also by machines. Meanwhile, existing image processing methods only optimize for better human perception, whereas the resulting images may not be accurately recognized by machines. This can be undesirable, e.g., the images can be improperly handled by search engines or recommendation systems. In this work, we propose simple approaches to improve machine interpretability of processed images: optimizing the recognition loss directly on the image processing network or through an intermediate transforming model, a process which we show can also be done in an unsupervised manner. Interestingly, the processing model’s ability to enhance the recognition performance can transfer when evaluated on different recognition models, even if they are of different architectures, trained on different object categories or even different recognition tasks. This makes the solutions applicable even when we do not have the knowledge about future downstream recognition models, e.g., if we are to upload the processed images to the Internet. We conduct comprehensive experiments on three image processing tasks with two downstream recognition tasks, and confirm our method brings substantial accuracy improvement on both the same recognition model and when transferring to a different one, with minimal or no loss in the image processing quality.
Tasks Image Retrieval, Recommendation Systems
Published 2019-10-21
URL https://arxiv.org/abs/1910.09185v1
PDF https://arxiv.org/pdf/1910.09185v1.pdf
PWC https://paperswithcode.com/paper/transferable-recognition-aware-image-1
Repo
Framework
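
A minimal sketch of the joint objective the abstract describes, assuming a PyTorch-style setup; the two stand-in networks and the trade-off weight lambda_rec are illustrative placeholders, not the authors' actual architectures:

```python
import torch
import torch.nn as nn

proc_net = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))             # stand-in image processing model
recog_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)) # stand-in recognition model
for p in recog_net.parameters():
    p.requires_grad_(False)  # the recognizer stays fixed; only the processing net is optimized

pixel_loss, recog_loss = nn.MSELoss(), nn.CrossEntropyLoss()
opt = torch.optim.Adam(proc_net.parameters(), lr=1e-4)
lambda_rec = 0.1  # assumed trade-off between perceptual quality and machine recognizability

x = torch.randn(8, 3, 32, 32)        # degraded inputs
target = torch.randn(8, 3, 32, 32)   # ground-truth processed images
labels = torch.randint(0, 10, (8,))  # recognition labels

out = proc_net(x)
loss = pixel_loss(out, target) + lambda_rec * recog_loss(recog_net(out), labels)
opt.zero_grad()
loss.backward()
opt.step()
```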

Improvement of Batch Normalization in Imbalanced Data

Title Improvement of Batch Normalization in Imbalanced Data
Authors Muneki Yasuda, Seishirou Ueno
Abstract In this study, we consider classification problems based on neural networks in a data-imbalanced environment. Learning from an imbalanced data set is one of the most important and practical problems in the field of machine learning. A weighted loss function based on a cost-sensitive approach is a well-known, effective method for imbalanced data sets. In this study, we consider combining a weighted loss function with batch normalization (BN), a powerful standard technique in recent deep learning. A naive combination of the two leads to a size-mismatch problem, because the two methods interpret the effective size of the data set differently. We propose a simple modification to BN that corrects this size mismatch, and demonstrate that the modified BN is effective in data-imbalanced environments.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/1911.10687v1
PDF https://arxiv.org/pdf/1911.10687v1.pdf
PWC https://paperswithcode.com/paper/improvement-of-batch-normalization-in
Repo
Framework
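
The abstract does not spell out the proposed BN modification, but the cost-sensitive weighted loss it builds on can be sketched as follows; inverse-frequency weights are one common choice, assumed here:

```python
import torch
import torch.nn as nn

counts = torch.tensor([900.0, 100.0])             # per-class sample counts (imbalanced)
weights = counts.sum() / (len(counts) * counts)   # inverse-frequency weights: [~0.56, 5.0]
criterion = nn.CrossEntropyLoss(weight=weights)   # cost-sensitive weighted loss

logits = torch.randn(16, 2)
labels = torch.randint(0, 2, (16,))
loss = criterion(logits, labels)
```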

Theoretical Foundations of Defeasible Description Logics

Title Theoretical Foundations of Defeasible Description Logics
Authors Katarina Britz, Giovanni Casini, Thomas Meyer, Kody Moodley, Uli Sattler, Ivan Varzinczak
Abstract We extend description logics (DLs) with non-monotonic reasoning features. We start by investigating a notion of defeasible subsumption in the spirit of defeasible conditionals as studied by Kraus, Lehmann and Magidor in the propositional case. In particular, we consider a natural and intuitive semantics for defeasible subsumption, and investigate KLM-style syntactic properties for both preferential and rational subsumption. Our contribution includes two representation results linking our semantic constructions to the set of preferential and rational properties considered. Besides showing that our semantics is appropriate, these results pave the way for more effective decision procedures for defeasible reasoning in DLs. Indeed, we also analyse the problem of non-monotonic reasoning in DLs at the level of entailment and present an algorithm for the computation of rational closure of a defeasible ontology. Importantly, our algorithm relies completely on classical entailment and shows that the computational complexity of reasoning over defeasible ontologies is no worse than that of reasoning in the underlying classical DL ALC.
Tasks
Published 2019-04-16
URL http://arxiv.org/abs/1904.07559v1
PDF http://arxiv.org/pdf/1904.07559v1.pdf
PWC https://paperswithcode.com/paper/theoretical-foundations-of-defeasible
Repo
Framework
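
One representative KLM-style property of the kind the abstract refers to is rational monotonicity, lifted from the propositional setting to defeasible subsumption between concepts; the symbol below is an assumed notation, and this is a single example from the full preferential/rational sets the paper characterizes:

```latex
% Rational monotonicity (RM) for defeasible subsumption; \defsub is an
% assumed notation for "is usually subsumed by".
\newcommand{\defsub}{\mathrel{\widetilde{\sqsubseteq}}}
\[
\textup{(RM)}\qquad
\frac{C \defsub D \qquad C \not\defsub \lnot E}{C \sqcap E \defsub D}
\]
```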

Deep Generative Graph Distribution Learning for Synthetic Power Grids

Title Deep Generative Graph Distribution Learning for Synthetic Power Grids
Authors Mahdi Khodayar, Jianhui Wang, Zhaoyu Wang
Abstract Power system studies require the topological structures of real-world power networks; however, such data is confidential due to important security concerns. Thus, power grid synthesis (PGS), i.e., creating realistic power grids that imitate actual power networks, has gained significant attention. In this letter, we cast PGS into a graph distribution learning (GDL) problem where the probability distribution functions (PDFs) of the nodes (buses) and edges (lines) are captured. A novel deep GDL (DeepGDL) model is proposed to learn the topological patterns of buses/lines with their physical features (e.g., power injection and line impedance). Having a deep nonlinear recurrent structure, DeepGDL understands complex nonlinear topological properties and captures the graph PDF. Sampling from the obtained PDF, we are able to create a large set of realistic networks that all resemble the original power grid. Simulation results show the significant accuracy of our created synthetic power grids in terms of various topological metrics and power flow measurements.
Tasks
Published 2019-01-17
URL http://arxiv.org/abs/1901.09674v3
PDF http://arxiv.org/pdf/1901.09674v3.pdf
PWC https://paperswithcode.com/paper/a-deep-generative-model-for-graphs-supervised
Repo
Framework
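
The abstract leaves the DeepGDL architecture unspecified; as a stand-in, the sketch below illustrates the underlying idea of fitting a topological distribution and sampling new graphs from it, using a simple empirical degree-distribution fit and a configuration-model sampler (both assumptions, not the paper's model):

```python
import numpy as np
import networkx as nx

grid = nx.barabasi_albert_graph(118, 2, seed=0)  # stand-in for a real bus/line topology
degrees = [d for _, d in grid.degree()]

# "Learn" an empirical degree PDF, then sample new degree sequences from it.
values, counts = np.unique(degrees, return_counts=True)
pdf = counts / counts.sum()

rng = np.random.default_rng(0)
sample = rng.choice(values, size=grid.number_of_nodes(), p=pdf)
if sample.sum() % 2:  # a configuration model needs an even degree sum
    sample[0] += 1

synthetic = nx.configuration_model(sample.tolist(), seed=0)
synthetic = nx.Graph(synthetic)                            # collapse parallel edges
synthetic.remove_edges_from(nx.selfloop_edges(synthetic))  # drop self-loops
```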

Finding Your Voice: The Linguistic Development of Mental Health Counselors

Title Finding Your Voice: The Linguistic Development of Mental Health Counselors
Authors Justine Zhang, Robert Filbin, Christine Morrison, Jaclyn Weiser, Cristian Danescu-Niculescu-Mizil
Abstract Mental health counseling is an enterprise with profound societal importance where conversations play a primary role. In order to acquire the conversational skills needed to face a challenging range of situations, mental health counselors must rely on training and on continued experience with actual clients. However, in the absence of large scale longitudinal studies, the nature and significance of this developmental process remain unclear. For example, prior literature suggests that experience might not translate into consequential changes in counselor behavior. This has led some to even argue that counseling is a profession without expertise. In this work, we develop a computational framework to quantify the extent to which individuals change their linguistic behavior with experience and to study the nature of this evolution. We use our framework to conduct a large longitudinal study of mental health counseling conversations, tracking over 3,400 counselors across their tenure. We reveal that overall, counselors do indeed change their conversational behavior to become more diverse across interactions, developing an individual voice that distinguishes them from other counselors. Furthermore, a finer-grained investigation shows that the rate and nature of this diversification vary across functionally different conversational components.
Tasks
Published 2019-06-17
URL https://arxiv.org/abs/1906.07194v1
PDF https://arxiv.org/pdf/1906.07194v1.pdf
PWC https://paperswithcode.com/paper/finding-your-voice-the-linguistic-development
Repo
Framework
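
One simple way to operationalize "diversity across interactions" is the mean pairwise cosine distance between TF-IDF vectors of a counselor's messages; this is an assumed illustration, not the paper's actual framework:

```python
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

conversations = [  # toy stand-ins for one counselor's messages over time
    "how are you feeling tonight",
    "that sounds really hard, tell me more about it",
    "have you thought about what might help you cope",
]
X = TfidfVectorizer().fit_transform(conversations)
sims = cosine_similarity(X)
pairs = list(combinations(range(len(conversations)), 2))
diversity = sum(1 - sims[i, j] for i, j in pairs) / len(pairs)
print(f"cross-interaction diversity: {diversity:.3f}")
```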

JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus

Title JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
Authors Makoto Morishita, Jun Suzuki, Masaaki Nagata
Abstract Recent machine translation algorithms mainly rely on parallel corpora. However, since the availability of parallel corpora remains limited, only some resource-rich language pairs can benefit from them. We constructed a parallel corpus for English-Japanese, a pair for which the amount of publicly available parallel data is still limited, by broadly crawling the web and automatically aligning parallel sentences. Our collected corpus, called JParaCrawl, amassed over 8.7 million sentence pairs. We show that it covers a broad range of domains and that a neural machine translation model trained on it serves as a good pre-trained model for fine-tuning on specific domains. This pre-train-and-fine-tune approach matched or surpassed the performance of models trained from scratch while reducing training time. Additionally, combining JParaCrawl with an in-domain dataset during training yielded the best performance. JParaCrawl and the pre-trained models are freely available online for research purposes.
Tasks Machine Translation
Published 2019-11-25
URL https://arxiv.org/abs/1911.10668v2
PDF https://arxiv.org/pdf/1911.10668v2.pdf
PWC https://paperswithcode.com/paper/jparacrawl-a-large-scale-web-based-english
Repo
Framework
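
A schematic of the pre-train-then-fine-tune recipe the abstract describes, with a toy model standing in for a full NMT system; the data callables and hyperparameters are placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 16)  # toy stand-in for an encoder-decoder NMT model
loss_fn = nn.MSELoss()

def train(model, batch_fn, lr, steps):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        x, y = batch_fn()
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

jparacrawl = lambda: (torch.randn(32, 16), torch.randn(32, 16))  # broad web-crawled pairs
in_domain = lambda: (torch.randn(8, 16), torch.randn(8, 16))     # small domain-specific set

train(model, jparacrawl, lr=1e-3, steps=100)  # pre-train on the large general corpus
train(model, in_domain, lr=1e-4, steps=20)    # then fine-tune on the target domain
```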

Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits

Title Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits
Authors Achyudh Ram, Ji Xin, Meiyappan Nagappan, Yaoliang Yu, Rocío Cabrera Lozoya, Antonino Sabetta, Jimmy Lin
Abstract Public vulnerability databases such as CVE and NVD account for only 60% of security vulnerabilities present in open-source projects, and are known to suffer from inconsistent quality. Over the last two years, there has been considerable growth in the number of known vulnerabilities across projects available in various repositories such as NPM and Maven Central. Such increasing risk calls for a mechanism to infer the presence of security threats in a timely manner. We propose novel hierarchical deep learning models for the identification of security-relevant commits from either the commit diff or the source code of the Java classes. By comparing the performance of our model against code2vec, a state-of-the-art model that learns from path-based representations of code, and a logistic regression baseline, we demonstrate that deep learning models yield promising results in identifying security-relevant commits. We also conduct a comparative analysis of how various deep learning models learn across different input representations, and of the effect of regularization on the generalization of our models.
Tasks
Published 2019-11-15
URL https://arxiv.org/abs/1911.07620v1
PDF https://arxiv.org/pdf/1911.07620v1.pdf
PWC https://paperswithcode.com/paper/exploiting-token-and-path-based
Repo
Framework
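
The logistic regression baseline the abstract mentions can be sketched on toy diff text; real inputs and labels would come from a curated vulnerability-fix corpus, and the snippets below are hypothetical:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

diffs = [  # hypothetical commit diff snippets
    "validate user input before constructing sql query",
    "bump version number in pom.xml",
    "escape html entities in error message output",
    "rename local variable for readability",
]
labels = [1, 0, 1, 0]  # 1 = security-relevant, 0 = not

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(diffs, labels)
print(clf.predict(["sanitize file path before opening"]))
```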

Hierarchical Graph Network for Multi-hop Question Answering

Title Hierarchical Graph Network for Multi-hop Question Answering
Authors Yuwei Fang, Siqi Sun, Zhe Gan, Rohit Pillai, Shuohang Wang, Jingjing Liu
Abstract In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question answering. To aggregate clues from scattered texts across multiple paragraphs, a hierarchical graph is created by constructing nodes from different levels of granularity (i.e., questions, paragraphs, sentences, and entities), the representations of which are initialized with BERT-based context encoders. By weaving heterogeneous nodes in an integral unified graph, this characteristic hierarchical differentiation of node granularity enables HGN to support different question answering sub-tasks simultaneously (e.g., paragraph selection, supporting facts extraction, and answer prediction). Given a constructed hierarchical graph for each question, the initial node representations are updated through graph propagation; and for each sub-task, multi-hop reasoning is performed by traversing through graph edges. Extensive experiments on the HotpotQA benchmark demonstrate that the proposed HGN approach significantly outperforms prior state-of-the-art methods by a large margin in both Distractor and Fullwiki settings.
Tasks Question Answering
Published 2019-11-09
URL https://arxiv.org/abs/1911.03631v1
PDF https://arxiv.org/pdf/1911.03631v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-graph-network-for-multi-hop
Repo
Framework
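
A sketch of the hierarchical graph construction (question, paragraph, sentence, and entity nodes); node contents are toy examples, and the real model initializes node features with BERT encodings before graph propagation:

```python
import networkx as nx

g = nx.Graph()
g.add_node("Q", level="question")
for p in ("P1", "P2"):  # candidate paragraphs hang off the question
    g.add_node(p, level="paragraph")
    g.add_edge("Q", p)
g.add_node("P1.S1", level="sentence")
g.add_edge("P1", "P1.S1")
g.add_node("E:Obama", level="entity")  # hypothetical entity mention
g.add_edge("P1.S1", "E:Obama")

print(nx.shortest_path(g, "Q", "E:Obama"))  # a multi-hop path through the hierarchy
```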

Block-distributed Gradient Boosted Trees

Title Block-distributed Gradient Boosted Trees
Authors Theodore Vasiloudis, Hyunsu Cho, Henrik Boström
Abstract The Gradient Boosted Tree (GBT) algorithm is one of the most popular machine learning algorithms used in production, for tasks that include Click-Through Rate (CTR) prediction and learning-to-rank. To deal with the massive datasets available today, many distributed GBT methods have been proposed. However, they all assume a row-distributed dataset, addressing scalability only with respect to the number of data points and not the number of features, and increasing communication cost for high-dimensional data. In order to allow for scalability across both the data point and feature dimensions, and reduce communication cost, we propose block-distributed GBTs. We achieve communication efficiency by making full use of the data sparsity and adapting the Quickscorer algorithm to the block-distributed setting. We evaluate our approach using datasets with millions of features, and demonstrate that we are able to achieve multiple orders of magnitude reduction in communication cost for sparse data, with no loss in accuracy, while providing a more scalable design. As a result, we are able to reduce the training time for high-dimensional data, and allow more cost-effective scale-out without the need for expensive network communication.
Tasks Click-Through Rate Prediction, Learning-To-Rank
Published 2019-04-23
URL https://arxiv.org/abs/1904.10522v2
PDF https://arxiv.org/pdf/1904.10522v2.pdf
PWC https://paperswithcode.com/paper/block-distributed-gradient-boosted-trees
Repo
Framework
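
The row- vs block-distribution contrast can be illustrated by sharding a sparse feature matrix along both axes onto a 2x2 worker grid; the array slicing below stands in for cross-machine sharding:

```python
import numpy as np
from scipy import sparse

X = sparse.random(8, 6, density=0.3, format="csr", random_state=0)  # sparse training matrix
row_parts = np.array_split(np.arange(X.shape[0]), 2)  # split data points
col_parts = np.array_split(np.arange(X.shape[1]), 2)  # and features

blocks = {
    (i, j): X[rows][:, cols]
    for i, rows in enumerate(row_parts)
    for j, cols in enumerate(col_parts)
}
for (i, j), blk in blocks.items():
    print(f"worker ({i},{j}) holds {blk.nnz} nonzeros")
```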

Adversarial Examples for Deep Learning Cyber Security Analytics

Title Adversarial Examples for Deep Learning Cyber Security Analytics
Authors Alesia Chernikova, Alina Oprea
Abstract As advances in Deep Neural Networks demonstrate unprecedented levels of performance in many critical applications, their vulnerability to attacks is still an open question. Adversarial examples are small modifications of legitimate data points that result in misclassification at testing time. As Deep Neural Networks have found a wide range of applications in cyber security analytics, it becomes important to study the robustness of these models in this setting. We consider adversarial testing-time attacks against Deep Learning models designed for cyber security applications. In security applications, machine learning models are typically trained not directly on raw network traffic or security logs, but on intermediate features defined by domain experts. Existing attacks applied directly to this intermediate feature representation violate feature constraints, leading to invalid adversarial examples. We propose a general framework for crafting adversarial attacks that takes into consideration the mathematical dependencies between intermediate features in the model input vector, as well as physical constraints imposed by the applications. We apply our methods to two security applications, a malicious connection classifier and a malicious domain classifier, to generate feasible adversarial examples in these domains. We show that with minimal effort (e.g., generating 12 network connections), an attacker can change the prediction of a model from Malicious to Benign. We extensively evaluate the success of our attacks, and how they depend on several optimization objectives and imbalance ratios in the training data.
Tasks
Published 2019-09-23
URL https://arxiv.org/abs/1909.10480v2
PDF https://arxiv.org/pdf/1909.10480v2.pdf
PWC https://paperswithcode.com/paper/adversarial-examples-for-deep-learning-cyber
Repo
Framework
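
A minimal sketch of a constrained evasion attack in the spirit of the abstract: gradient steps toward the Benign class, followed by a projection that re-imposes a (toy) dependency between features. The classifier and the constraint are assumptions, not the paper's:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(3, 2))  # stand-in malicious-connection classifier
x_adv = torch.tensor([[2.0, 3.0, 6.0]], requires_grad=True)  # a "malicious" feature vector
target = torch.tensor([0])  # desired prediction: Benign

for _ in range(50):
    loss = nn.functional.cross_entropy(model(x_adv), target)
    grad, = torch.autograd.grad(loss, x_adv)
    with torch.no_grad():
        x_adv -= 0.1 * grad                      # move toward the Benign decision
        x_adv[0, 2] = x_adv[0, 0] * x_adv[0, 1]  # project back onto toy dependency f2 = f0 * f1
```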

Chinese-Japanese Unsupervised Neural Machine Translation Using Sub-character Level Information

Title Chinese-Japanese Unsupervised Neural Machine Translation Using Sub-character Level Information
Authors Longtu Zhang, Mamoru Komachi
Abstract Unsupervised neural machine translation (UNMT) requires only monolingual data of similar language pairs during training and can produce bi-directional translation models with relatively good performance on alphabetic languages (Lample et al., 2018). However, no research has been done on logographic language pairs. This study focuses on Chinese-Japanese UNMT trained on data containing sub-character (ideograph or stroke) level information decomposed from character-level data. BLEU scores of character- and sub-character-level systems were compared, and the results showed that despite the effectiveness of UNMT on character-level data, sub-character-level data could further enhance performance, with the stroke-level system outperforming the ideograph-level system.
Tasks Machine Translation
Published 2019-03-01
URL http://arxiv.org/abs/1903.00149v1
PDF http://arxiv.org/pdf/1903.00149v1.pdf
PWC https://paperswithcode.com/paper/chinese-japanese-unsupervised-neural-machine
Repo
Framework
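
A toy illustration of the sub-character decomposition the study builds on; the ideograph mappings below are hypothetical stand-ins for a real decomposition resource (e.g., ideographic description sequences):

```python
ideograph_map = {"好": ["女", "子"], "明": ["日", "月"]}  # hypothetical decomposition table

def to_ideographs(sentence):
    units = []
    for ch in sentence:
        units.extend(ideograph_map.get(ch, [ch]))  # fall back to the character itself
    return units

print(to_ideographs("明日"))  # ['日', '月', '日'] -- sub-character units shared across CJK text
```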

Learning to Dynamically Coordinate Multi-Robot Teams in Graph Attention Networks

Title Learning to Dynamically Coordinate Multi-Robot Teams in Graph Attention Networks
Authors Zheyuan Wang, Matthew Gombolay
Abstract Increasing interest in integrating advanced robotics within manufacturing has spurred a renewed concentration in developing real-time scheduling solutions to coordinate human-robot collaboration in this environment. Traditionally, the problem of scheduling agents to complete tasks with temporal and spatial constraints has been approached either with exact algorithms, which are computationally intractable for large-scale, dynamic coordination, or approximate methods that require domain experts to craft heuristics for each application. We seek to overcome the limitations of these conventional methods by developing a novel graph attention network formulation to automatically learn features of scheduling problems to allow their deployment. To learn effective policies for combinatorial optimization problems via machine learning, we combine imitation learning on smaller problems with deep Q-learning on larger problems, in a non-parametric framework, to allow for fast, near-optimal scheduling of robot teams. We show that our network-based policy finds at least twice as many solutions over prior state-of-the-art methods in all testing scenarios.
Tasks Combinatorial Optimization, Imitation Learning, Q-Learning
Published 2019-12-04
URL https://arxiv.org/abs/1912.02059v1
PDF https://arxiv.org/pdf/1912.02059v1.pdf
PWC https://paperswithcode.com/paper/learning-to-dynamically-coordinate-multi
Repo
Framework
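
A single graph-attention layer of the kind the scheduling policy builds on (a simplified GAT layer); the task features and adjacency below are toy placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W = nn.Linear(d_in, d_out, bias=False)
        self.a = nn.Linear(2 * d_out, 1, bias=False)

    def forward(self, h, adj):
        z = self.W(h)                                  # (N, d_out)
        n = z.size(0)
        pairs = torch.cat([z.repeat_interleave(n, 0),  # every (i, j) pair of nodes
                           z.repeat(n, 1)], dim=1)
        e = F.leaky_relu(self.a(pairs)).view(n, n)     # raw attention scores
        e = e.masked_fill(adj == 0, float("-inf"))     # attend only along graph edges
        alpha = torch.softmax(e, dim=1)
        return alpha @ z                               # attention-weighted aggregation

h = torch.randn(5, 8)                              # 5 tasks, 8 features each
adj = torch.eye(5) + torch.diag(torch.ones(4), 1)  # toy precedence structure with self-loops
out = GATLayer(8, 16)(h, adj)
```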

Sparse Tensor Additive Regression

Title Sparse Tensor Additive Regression
Authors Botao Hao, Boxiang Wang, Pengyuan Wang, Jingfei Zhang, Jian Yang, Will Wei Sun
Abstract Tensors are becoming prevalent in modern applications such as medical imaging and digital marketing. In this paper, we propose a sparse tensor additive regression (STAR) that models a scalar response as a flexible nonparametric function of tensor covariates. The proposed model effectively exploits the sparse and low-rank structures in the tensor additive regression. We formulate the parameter estimation as a non-convex optimization problem, and propose an efficient penalized alternating minimization algorithm. We establish a non-asymptotic error bound for the estimator obtained from each iteration of the proposed algorithm, which reveals an interplay between the optimization error and the statistical rate of convergence. We demonstrate the efficacy of STAR through extensive comparative simulation studies, and an application to the click-through-rate prediction in online advertising.
Tasks Click-Through Rate Prediction
Published 2019-03-31
URL https://arxiv.org/abs/1904.00479v2
PDF https://arxiv.org/pdf/1904.00479v2.pdf
PWC https://paperswithcode.com/paper/sparse-tensor-additive-regression
Repo
Framework
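
One way to write down a sparse, low-rank tensor additive model of the kind the abstract describes; the notation is illustrative rather than copied from the paper:

```latex
% Response y modeled additively through basis expansions \psi_h applied
% entrywise to the tensor covariate \mathcal{X}; each coefficient tensor
% \mathcal{B}_h is given a low-rank (CP) and sparse structure.
\[
\mathbb{E}[\,y \mid \mathcal{X}\,]
  = \sum_{h=1}^{H} \bigl\langle \mathcal{B}_h,\; \psi_h(\mathcal{X}) \bigr\rangle,
\qquad
\mathcal{B}_h = \sum_{r=1}^{R} \beta^{(1)}_{hr} \circ \beta^{(2)}_{hr} \circ \beta^{(3)}_{hr}.
\]
```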

Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics

Title Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics
Authors Preslav Nakov
Abstract An important characteristic of English written text is the abundance of noun compounds - sequences of nouns acting as a single noun, e.g., colon cancer tumor suppressor protein. While eventually mastered by domain experts, their interpretation poses a major challenge for automated analysis. Understanding noun compound syntax and semantics is important for many natural language applications, including question answering, machine translation, information retrieval, and information extraction. I address the problem of noun compound syntax by means of novel, highly accurate unsupervised and lightly supervised algorithms using the Web as a corpus and search engines as interfaces to that corpus. Traditionally, the Web has been viewed as a source of page hit counts, used as an estimate for n-gram word frequencies. I extend this approach by introducing novel surface features and paraphrases, which yield state-of-the-art results for the task of noun compound bracketing. I also show how these kinds of features can be applied to other structural ambiguity problems, like prepositional phrase attachment and noun phrase coordination. I address noun compound semantics by automatically generating paraphrasing verbs and prepositions that make explicit the hidden semantic relations between the nouns in a noun compound. I also demonstrate how these paraphrasing verbs can be used to solve various relational similarity problems, and how paraphrasing noun compounds can improve machine translation.
Tasks Information Retrieval, Machine Translation, Prepositional Phrase Attachment, Question Answering
Published 2019-11-23
URL https://arxiv.org/abs/1912.01113v1
PDF https://arxiv.org/pdf/1912.01113v1.pdf
PWC https://paperswithcode.com/paper/using-the-web-as-an-implicit-training-set
Repo
Framework
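
The classic count-based bracketing decision for a three-noun compound compares the association of (w1, w2) against (w2, w3); the counts below are made up, while the thesis obtains them from web search engines and adds richer surface features on top:

```python
counts = {  # hypothetical web hit counts
    ("colon", "cancer"): 950_000,
    ("cancer", "tumor"): 320_000,
    "colon": 4_000_000, "cancer": 9_000_000, "tumor": 5_000_000,
}

def assoc(w1, w2):
    # pointwise-association proxy: pair count normalized by unigram counts
    return counts[(w1, w2)] / (counts[w1] * counts[w2])

w1, w2, w3 = "colon", "cancer", "tumor"
if assoc(w1, w2) > assoc(w2, w3):
    print(f"left bracketing: [[{w1} {w2}] {w3}]")
else:
    print(f"right bracketing: [{w1} [{w2} {w3}]]")
```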

Deep Smoothing of the Implied Volatility Surface

Title Deep Smoothing of the Implied Volatility Surface
Authors Damien Ackerer, Natasa Tagasovska, Thibault Vatter
Abstract We present an artificial neural network (ANN) approach to value financial derivatives. Unlike in standard ANN applications, practitioners use option pricing models both to validate market prices and to infer unobserved prices. Importantly, models need to generate realistic arbitrage-free prices, meaning that no option portfolio can lead to risk-free profits. The absence of arbitrage opportunities is guaranteed by penalizing the loss using soft constraints on an extended grid of input values. ANNs can be pre-trained by first calibrating a standard option pricing model, and then training the ANN on a larger synthetic dataset generated from the calibrated model. The parameter transfer, as well as the no-arbitrage constraints, appears to be particularly useful when only sparse or erroneous data are available. We also explore how deeper ANNs improve over shallower ones, as well as other properties of the network architecture. We benchmark our method against standard option pricing models, such as Heston with and without jumps. We validate our method on both training and testing sets, highlighting its capacity to reproduce observed prices and to predict new ones.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05065v1
PDF https://arxiv.org/pdf/1906.05065v1.pdf
PWC https://paperswithcode.com/paper/deep-smoothing-of-the-implied-volatility
Repo
Framework
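
A sketch of the soft no-arbitrage penalties: the surface produced by a network is penalized wherever it violates the standard static conditions (monotonicity in maturity against calendar arbitrage, convexity in strike against butterfly arbitrage). The toy network, the grid, and the choice to penalize call prices directly are assumptions:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))  # toy price surface C(k, t)

k = torch.linspace(0.5, 1.5, 50).unsqueeze(1).requires_grad_(True)  # strikes
t = torch.full_like(k, 0.5).requires_grad_(True)                    # maturities
c = net(torch.cat([k, t], dim=1))

dc_dt, = torch.autograd.grad(c.sum(), t, create_graph=True)
dc_dk, = torch.autograd.grad(c.sum(), k, create_graph=True)
d2c_dk2, = torch.autograd.grad(dc_dk.sum(), k, create_graph=True)

calendar = torch.relu(-dc_dt).mean()     # calendar spread: want dC/dt >= 0
butterfly = torch.relu(-d2c_dk2).mean()  # butterfly: want d2C/dk2 >= 0
penalty = calendar + butterfly           # added to the pricing loss as a soft constraint
```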