January 29, 2020

Paper Group ANR 711

How Sequence-to-Sequence Models Perceive Language Styles?


Title	How Sequence-to-Sequence Models Perceive Language Styles?
Authors	Ruozi Huang, Mi Zhang, Xudong Pan, Beina Sheng
Abstract	Style is ubiquitous in our daily language use, but what is language style to a learning machine? In this paper, by exploiting the second-order statistics of semantic vectors of different corpora, we present a novel perspective on this question via the style matrix, i.e., the covariance matrix of semantic vectors, and explain for the first time how Sequence-to-Sequence models innately encode style information in their semantic vectors. As an application, we devise a learning-free text style transfer algorithm, which explicitly constructs a pair of transfer operators from the style matrices. Moreover, our algorithm is also observed to be flexible enough to transfer out-of-domain sentences. Extensive experimental evidence justifies the informativeness of the style matrix and shows that the proposed style transfer algorithm is competitive with state-of-the-art methods.
Tasks	Style Transfer, Text Style Transfer
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05947v1
PDF	https://arxiv.org/pdf/1908.05947v1.pdf
PWC	https://paperswithcode.com/paper/how-sequence-to-sequence-models-perceive
Repo
Framework
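
The abstract defines the style matrix as the covariance matrix of a corpus's semantic vectors. A minimal sketch of that computation follows. The vectors are made-up toy values rather than real encoder outputs, the function name `covariance_matrix` is hypothetical, and the sample (n - 1) normalization is an assumption, since the abstract does not say which normalization the authors use. The transfer operators are not sketched because the abstract does not describe how they are built.

```python
from typing import List


def covariance_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Sample covariance matrix (n - 1 denominator) of equal-length vectors."""
    n = len(vectors)
    if n < 2:
        raise ValueError("need at least two vectors")
    dim = len(vectors[0])
    # Per-dimension mean over the corpus.
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    # Subtract the mean so that only second-order structure remains.
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


# Hypothetical 2-dimensional "semantic vectors" for a toy corpus.
corpus_vectors = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
style_matrix = covariance_matrix(corpus_vectors)
print(style_matrix)  # [[4.0, 8.0], [8.0, 16.0]]
```

Computing one such matrix per corpus gives the pair of style matrices that the abstract says the transfer operators are constructed from.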

Deep Learning for Physical-Layer 5G Wireless Techniques: Opportunities, Challenges and Solutions


Title	Deep Learning for Physical-Layer 5G Wireless Techniques: Opportunities, Challenges and Solutions
Authors	Hongji Huang, Song Guo, Guan Gui, Zhen Yang, Jianhua Zhang, Hikmet Sari, Fumiyuki Adachi
Abstract	The new demands for high-reliability and ultra-high capacity wireless communication have led to extensive research into 5G communications. However, the current communication systems, which were designed on the basis of conventional communication theories, significantly restrict further performance improvements and lead to severe limitations. Recently, the emerging deep learning techniques have been recognized as a promising tool for handling the complicated communication systems, and their potential for optimizing wireless communications has been demonstrated. In this article, we first review the development of deep learning solutions for 5G communication, and then propose efficient schemes for deep learning-based 5G scenarios. Specifically, the key ideas for several important deep learning-based communication methods are presented along with the research opportunities and challenges. In particular, novel communication frameworks of non-orthogonal multiple access (NOMA), massive multiple-input multiple-output (MIMO), and millimeter wave (mmWave) are investigated, and their superior performances are demonstrated. We envision that the appealing deep learning-based wireless physical layer frameworks will bring a new direction in communication theories and that this work will move us forward along this road.
Tasks
Published	2019-04-21
URL	http://arxiv.org/abs/1904.09673v1
PDF	http://arxiv.org/pdf/1904.09673v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-physical-layer-5g-wireless
Repo
Framework

Soft edit distance for differentiable comparison of symbolic sequences


Title	Soft edit distance for differentiable comparison of symbolic sequences
Authors	Evgenii Ofitserov, Vasily Tsvetkov, Vadim Nazarov
Abstract	Edit distance, also known as Levenshtein distance, is an essential way to compare two strings that has proved particularly useful in the analysis of genetic sequences and in natural language processing. However, edit distance is a discrete function that is known to be hard to optimize. This fact hampers the use of this metric in machine learning. Even an algorithm as simple as K-means fails to cluster a set of sequences using edit distance if they are of variable length and abundance. In this paper we propose a novel metric, soft edit distance (SED), which is a smooth approximation of edit distance. It is differentiable, and therefore it can be optimized with gradient methods. As with the original edit distance, SED and its derivatives can be calculated with recurrent formulas in polynomial time. We demonstrate the usefulness of the proposed metric on synthetic datasets and on clustering of biological sequences.
Tasks
Published	2019-04-29
URL	http://arxiv.org/abs/1904.12562v1
PDF	http://arxiv.org/pdf/1904.12562v1.pdf
PWC	https://paperswithcode.com/paper/soft-edit-distance-for-differentiable
Repo
Framework
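
For reference, the discrete edit distance that the abstract starts from can be computed with the standard dynamic-programming recurrence. The sketch below is the classic Levenshtein algorithm, not the paper's soft edit distance, whose formula is not given in the abstract. The `min` over three integer costs is the discrete step that the abstract describes as hard to optimize with gradient methods.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: minimum number of insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The recurrence runs in time proportional to the product of the two string lengths, which matches the polynomial-time cost the abstract mentions for the original metric.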

Subtractive Perceptrons for Learning Images: A Preliminary Report


Title	Subtractive Perceptrons for Learning Images: A Preliminary Report
Authors	H. R. Tizhoosh, Shivam Kalra, Shalev Lifshitz, Morteza Babaie
Abstract	In recent years, artificial neural networks have achieved tremendous success for many vision-based tasks. However, this success remains within the paradigm of weak AI where networks, among others, are specialized for just one given task. The path toward strong AI, or Artificial General Intelligence, remains rather obscure. One factor, however, is clear, namely that the feed-forward structure of current networks is not a realistic abstraction of the human brain. In this preliminary work, some ideas are proposed to define a subtractive Perceptron (s-Perceptron), a graph-based neural network that delivers a more compact topology to learn one specific task. In this preliminary study, we test the s-Perceptron with the MNIST dataset, a commonly used image archive for digit recognition. The proposed network achieves excellent results compared to the benchmark networks that rely on more complex topologies.
Tasks
Published	2019-09-15
URL	https://arxiv.org/abs/1909.12933v1
PDF	https://arxiv.org/pdf/1909.12933v1.pdf
PWC	https://paperswithcode.com/paper/subtractive-perceptrons-for-learning-images-a
Repo
Framework

Fast Training of Sparse Graph Neural Networks on Dense Hardware


Title	Fast Training of Sparse Graph Neural Networks on Dense Hardware
Authors	Matej Balog, Bart van Merriënboer, Subhodeep Moitra, Yujia Li, Daniel Tarlow
Abstract	Graph neural networks have become increasingly popular in recent years due to their ability to naturally encode relational input data and their ability to scale to large graphs by operating on a sparse representation of graph adjacency matrices. As we look to scale up these models using custom hardware, a natural assumption would be that we need hardware tailored to sparse operations and/or dynamic control flow. In this work, we question this assumption by scaling up sparse graph neural networks using a platform targeted at dense computation on fixed-size data. Drawing inspiration from optimization of numerical algorithms on sparse matrices, we develop techniques that enable training the sparse graph neural network model from Allamanis et al. [2018] in 13 minutes using a 512-core TPUv2 Pod, whereas the original training takes almost a day.
Tasks
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11786v1
PDF	https://arxiv.org/pdf/1906.11786v1.pdf
PWC	https://paperswithcode.com/paper/fast-training-of-sparse-graph-neural-networks
Repo
Framework

An Efficient and Margin-Approaching Zero-Confidence Adversarial Attack


Title	An Efficient and Margin-Approaching Zero-Confidence Adversarial Attack
Authors	Yang Zhang, Shiyu Chang, Mo Yu, Kaizhi Qian
Abstract	There are two major paradigms of white-box adversarial attacks that attempt to impose input perturbations. The first paradigm, called the fix-perturbation attack, crafts adversarial samples within a given perturbation level. The second paradigm, called the zero-confidence attack, finds the smallest perturbation needed to cause misclassification, also known as the margin of an input feature. While the former paradigm is well-resolved, the latter is not. Existing zero-confidence attacks either introduce significant approximation errors, or are too time-consuming. We therefore propose MARGINATTACK, a zero-confidence attack framework that is able to compute the margin with improved accuracy and efficiency. Our experiments show that MARGINATTACK is able to compute a smaller margin than the state-of-the-art zero-confidence attacks, and matches the state-of-the-art fix-perturbation attacks. In addition, it runs significantly faster than the Carlini-Wagner attack, currently the most accurate zero-confidence attack algorithm.
Tasks	Adversarial Attack
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00511v1
PDF	https://arxiv.org/pdf/1910.00511v1.pdf
PWC	https://paperswithcode.com/paper/an-efficient-and-margin-approaching-zero-1
Repo
Framework

Multi-view Locality Low-rank Embedding for Dimension Reduction


Title	Multi-view Locality Low-rank Embedding for Dimension Reduction
Authors	Lin Feng, Xiangzhu Meng, Huibing Wang
Abstract	During the last decades, we have witnessed a surge of interest in learning a low-dimensional space with discriminative information from one single view. Even though most such methods achieve satisfactory performance in certain situations, they fail to fully consider the information from multiple views, which are highly relevant but sometimes look different from each other. Besides, correlations between features from multiple views always vary greatly, which challenges multi-view subspace learning. Therefore, learning an appropriate subspace that maintains valuable information from multi-view features is of vital importance but challenging. To tackle this problem, this paper proposes a novel multi-view dimension reduction method named Multi-view Locality Low-rank Embedding for Dimension Reduction (MvL2E). MvL2E makes full use of correlations between multi-view features by adopting low-rank representations. Meanwhile, it aims to maintain the correlations and construct a suitable manifold space to capture the low-dimensional embedding for multi-view features. A centroid-based scheme is designed to force multiple views to learn from each other, and an iterative alternating strategy is developed to obtain the optimal solution of MvL2E. The proposed method is evaluated on 5 benchmark datasets. Comprehensive experiments show that the proposed MvL2E achieves performance comparable with previous approaches in the recent literature.
Tasks	Dimensionality Reduction
Published	2019-05-20
URL	https://arxiv.org/abs/1905.08138v1
PDF	https://arxiv.org/pdf/1905.08138v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-locality-low-rank-embedding-for
Repo
Framework

Generic Multilayer Network Data Analysis with the Fusion of Content and Structure


Title	Generic Multilayer Network Data Analysis with the Fusion of Content and Structure
Authors	Xuan-Son Vu, Abhishek Santra, Sharma Chakravarthy, Lili Jiang
Abstract	Multi-feature data analysis (e.g., on Facebook, LinkedIn) is challenging, especially if one wants to do it efficiently and retain the flexibility of choosing features of interest for analysis. Features (e.g., age, gender, relationship, political view, etc.) can be given explicitly in datasets, but can also be derived from content (e.g., political view based on Facebook posts). Analysis from multiple perspectives is needed to understand the datasets (or subsets of them) and to infer meaningful knowledge. For example, the influence of age, location, and marital status on political views may need to be inferred separately (or in combination). In this paper, we adapt multilayer network (MLN) analysis, a nontraditional approach, to model the Facebook datasets, integrate content analysis, and conduct analysis driven by a list of desired application-based queries. Our experimental analysis shows the flexibility and efficiency of the proposed approach when modeling and analyzing datasets with multiple features.
Tasks
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08635v1
PDF	https://arxiv.org/pdf/1905.08635v1.pdf
PWC	https://paperswithcode.com/paper/generic-multilayer-network-data-analysis-with
Repo
Framework

The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis


Title	The Secrets of Machine Learning: Ten Things You Wish You Had Known Earlier to be More Effective at Data Analysis
Authors	Cynthia Rudin, David Carlson
Abstract	Despite the widespread usage of machine learning throughout organizations, there are some key principles that are commonly missed. In particular: 1) There are at least four main families for supervised learning: logical modeling methods, linear combination methods, case-based reasoning methods, and iterative summarization methods. 2) For many application domains, almost all machine learning methods perform similarly (with some caveats). Deep learning methods, which are the leading technique for computer vision problems, do not maintain an edge over other methods for most problems (and there are reasons why). 3) Neural networks are hard to train and weird stuff often happens when you try to train them. 4) If you don’t use an interpretable model, you can make bad mistakes. 5) Explanations can be misleading and you can’t trust them. 6) You can pretty much always find an accurate-yet-interpretable model, even for deep neural networks. 7) Special properties such as decision making or robustness must be built in, they don’t happen on their own. 8) Causal inference is different than prediction (correlation is not causation). 9) There is a method to the madness of deep neural architectures, but not always. 10) It is a myth that artificial intelligence can do anything.
Tasks	Causal Inference, Decision Making
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01998v1
PDF	https://arxiv.org/pdf/1906.01998v1.pdf
PWC	https://paperswithcode.com/paper/the-secrets-of-machine-learning-ten-things
Repo
Framework

Efficient two-sample functional estimation and the super-oracle phenomenon


Title	Efficient two-sample functional estimation and the super-oracle phenomenon
Authors	Thomas B. Berrett, Richard J. Samworth
Abstract	We consider the estimation of two-sample integral functionals, of the type that occur naturally, for example, when the object of interest is a divergence between unknown probability densities. Our first main result is that, in wide generality, a weighted nearest neighbour estimator is efficient, in the sense of achieving the local asymptotic minimax lower bound. Moreover, we also prove a corresponding central limit theorem, which facilitates the construction of asymptotically valid confidence intervals for the functional, having asymptotically minimal width. One interesting consequence of our results is the discovery that, for certain functionals, the worst-case performance of our estimator may improve on that of the natural ‘oracle’ estimator, which is given access to the values of the unknown densities at the observations.
Tasks
Published	2019-04-18
URL	http://arxiv.org/abs/1904.09347v1
PDF	http://arxiv.org/pdf/1904.09347v1.pdf
PWC	https://paperswithcode.com/paper/190409347
Repo
Framework

Multi-Robot Path Planning Via Genetic Programming


Title	Multi-Robot Path Planning Via Genetic Programming
Authors	Alexandre Trudeau, Christopher M. Clark
Abstract	This paper presents a Genetic Programming (GP) approach to solving multi-robot path planning (MRPP) problems in single-lane workspaces, specifically those easily mapped to graph representations. GP’s versatility enables this approach to produce programs optimizing for multiple attributes rather than a single attribute such as path length or completeness. When optimizing for the number of time steps needed to solve individual MRPP problems, the GP-constructed programs outperformed complete MRPP algorithms, i.e. Push-Swap-Wait (PSW), by 54.1%. The GP-constructed programs also consistently outperformed PSW in solving problems that did not meet PSW’s completeness conditions. Furthermore, the GP-constructed programs exhibited a greater capacity for scaling than PSW as the number of robots navigating within an MRPP environment increased. This research illustrates the benefits of using Genetic Programming for solving individual MRPP problems, including instances in which the number of robots exceeds the number of leaves in the tree-modeled workspace.
Tasks
Published	2019-12-19
URL	https://arxiv.org/abs/1912.09503v1
PDF	https://arxiv.org/pdf/1912.09503v1.pdf
PWC	https://paperswithcode.com/paper/multi-robot-path-planning-via-genetic
Repo
Framework

Science and Technology Advance through Surprise


Title	Science and Technology Advance through Surprise
Authors	Feng Shi, James Evans
Abstract	Breakthrough discoveries and inventions involve unexpected combinations of contents including problems, methods, and natural entities, and also diverse contexts such as journals, subfields, and conferences. Drawing on data from tens of millions of research papers, patents, and researchers, we construct models that predict next year’s content and context combinations with an AUC of 95% based on embeddings constructed from high-dimensional stochastic block models, where the improbability of new combinations itself predicts up to 50% of the likelihood that they will gain outsized citations and major awards. Most of these breakthroughs occur when problems in one field are unexpectedly solved by researchers from a distant other. These findings demonstrate the critical role of surprise in advance, and enable evaluation of scientific institutions ranging from education and peer review to awards in supporting it.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.09370v2
PDF	https://arxiv.org/pdf/1910.09370v2.pdf
PWC	https://paperswithcode.com/paper/science-and-technology-advance-through
Repo
Framework

DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain


Title	DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain
Authors	Yichong Xu, Xiaodong Liu, Chunyuan Li, Hoifung Poon, Jianfeng Gao
Abstract	This paper describes the system we entered in the MEDIQA-2019 competition. We use a multi-source transfer learning approach to transfer the knowledge from MT-DNN and SciBERT to natural language understanding tasks in the medical domain. For transfer learning fine-tuning, we use multi-task learning on NLI, RQE and QA tasks on general and medical domains to improve performance. The proposed methods prove effective for natural language understanding in the medical domain, and we ranked first on the QA task.
Tasks	Multi-Task Learning, Transfer Learning
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04382v1
PDF	https://arxiv.org/pdf/1906.04382v1.pdf
PWC	https://paperswithcode.com/paper/doubletransfer-at-mediqa-2019-multi-source
Repo
Framework

Anomaly Detection with Inexact Labels


Title	Anomaly Detection with Inexact Labels
Authors	Tomoharu Iwata, Machiko Toyoda, Shotaro Tora, Naonori Ueda
Abstract	We propose a supervised anomaly detection method for data with inexact anomaly labels, where each label, which is assigned to a set of instances, indicates that at least one instance in the set is anomalous. Although many anomaly detection methods have been proposed, they cannot handle inexact anomaly labels. To measure the performance with inexact anomaly labels, we define the inexact AUC, which is our extension of the area under the ROC curve (AUC) for inexact labels. The proposed method trains an anomaly score function so that the smooth approximation of the inexact AUC increases while anomaly scores for non-anomalous instances become low. We model the anomaly score function by a neural network-based unsupervised anomaly detection method, e.g., autoencoders. The proposed method performs well even when only a small number of inexact labels are available by incorporating an unsupervised anomaly detection mechanism with inexact AUC maximization. Using various datasets, we experimentally demonstrate that our proposed method improves the anomaly detection performance with inexact anomaly labels, and outperforms existing unsupervised and supervised anomaly detection and multiple instance learning methods.
Tasks	Anomaly Detection, Multiple Instance Learning, Unsupervised Anomaly Detection
Published	2019-09-11
URL	https://arxiv.org/abs/1909.04807v1
PDF	https://arxiv.org/pdf/1909.04807v1.pdf
PWC	https://paperswithcode.com/paper/anomaly-detection-with-inexact-labels
Repo
Framework
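
One plausible reading of the inexact AUC described in the abstract is sketched below. Because an inexact label only says that at least one instance in a set is anomalous, the sketch scores each labelled set by its highest anomaly score and counts how often that score exceeds the score of a non-anomalous instance. This max-based form and the function name `inexact_auc` are assumptions made for illustration; the abstract does not state the exact formula, and the paper optimizes a smooth approximation that is not shown here.

```python
from typing import List, Sequence


def inexact_auc(anomalous_sets: Sequence[Sequence[float]],
                normal_scores: Sequence[float]) -> float:
    """Fraction of (labelled set, normal instance) pairs in which the
    set's highest anomaly score exceeds the normal instance's score.
    Assumed form; see the hedged description above."""
    pairs = 0
    wins = 0
    for scores in anomalous_sets:
        top = max(scores)  # at least one instance in the set is anomalous
        for normal in normal_scores:
            pairs += 1
            if top > normal:
                wins += 1
    return wins / pairs


# Toy anomaly scores (made up): two labelled sets and two normal instances.
sets: List[List[float]] = [[0.1, 0.9], [0.2, 0.3]]
normals = [0.25, 0.5]
print(inexact_auc(sets, normals))  # 0.75
```

When every labelled set contains exactly one instance, this quantity reduces to the ordinary AUC computed over anomalous and non-anomalous pairs (ignoring ties).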

Randomized Ablation Feature Importance


Title	Randomized Ablation Feature Importance
Authors	Luke Merrick
Abstract	Given a model $f$ that predicts a target $y$ from a vector of input features $\pmb{x} = (x_1, x_2, \ldots, x_M)$, we seek to measure the importance of each feature with respect to the model’s ability to make a good prediction. To this end, we consider how (on average) some measure of goodness or badness of prediction (which we term “loss” $\ell$) changes when we hide or ablate each feature from the model. To ablate a feature, we replace its value with another possible value randomly. By averaging over many points and many possible replacements, we measure the importance of a feature on the model’s ability to make good predictions. Furthermore, we present statistical measures of uncertainty that quantify how confident we are that the feature importance we measure from our finite dataset and finite number of ablations is close to the theoretical true importance value.
Tasks	Feature Importance
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00174v2
PDF	https://arxiv.org/pdf/1910.00174v2.pdf
PWC	https://paperswithcode.com/paper/randomized-ablation-feature-importance
Repo
Framework
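
The procedure in the abstract (replace a feature's value at random, then average the change in loss over many points and replacements) can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `ablation_importance`, the choice to draw replacement values from the feature's observed column, and the toy model and data are all assumptions. The uncertainty measures mentioned in the abstract are not covered.

```python
import random
from typing import Callable, List, Sequence


def ablation_importance(
    model: Callable[[Sequence[float]], float],
    X: List[List[float]],
    y: List[float],
    loss: Callable[[float, float], float],
    feature: int,
    n_repeats: int = 20,
    seed: int = 0,
) -> float:
    """Average increase in loss when one feature is randomly replaced."""
    rng = random.Random(seed)
    # Assumption: replacement values are drawn from the observed column.
    column = [row[feature] for row in X]
    total = 0.0
    for row, target in zip(X, y):
        base = loss(model(row), target)
        for _ in range(n_repeats):
            ablated = list(row)
            ablated[feature] = rng.choice(column)
            total += loss(model(ablated), target) - base
    return total / (len(X) * n_repeats)


# Toy setup: the model depends only on feature 0 and ignores feature 1.
def toy_model(row: Sequence[float]) -> float:
    return 2.0 * row[0]


def squared_error(pred: float, target: float) -> float:
    return (pred - target) ** 2


X = [[0.0, 5.0], [1.0, 7.0], [2.0, 9.0], [3.0, 1.0]]
y = [0.0, 2.0, 4.0, 6.0]

used = ablation_importance(toy_model, X, y, squared_error, feature=0)
unused = ablation_importance(toy_model, X, y, squared_error, feature=1)
print(used > 0.0, unused == 0.0)
```

In this toy case the ignored feature gets an importance of exactly zero, because replacing its value never changes the prediction, while the feature the model relies on gets a positive score.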