April 2, 2020

# Paper Group ANR 342

Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents. Connectivity-driven Communication in Multi-agent Reinforcement Learning through Diffusion Processes on Graphs. Transformer on a Diet. Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition. Identification of Indian Languages …

#### Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents

Title Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents
Authors Sashank Santhanam, Alireza Karduni, Samira Shaikh
Abstract Humans quite frequently interact with conversational agents. The rapid advancement in generative language modeling through neural networks has helped advance the creation of intelligent conversational agents. Researchers typically evaluate the output of their models through crowdsourced judgments, but there are no established best practices for conducting such studies. Moreover, it is unclear if cognitive biases in decision-making are affecting crowdsourced workers’ judgments when they undertake these tasks. To investigate, we conducted a between-subjects study with 77 crowdsourced workers to understand the role of cognitive biases, specifically anchoring bias, when humans are asked to evaluate the output of conversational agents. Our results provide insight into how best to evaluate conversational agents. We find increased consistency in ratings across two experimental conditions may be a result of anchoring bias. We also determine that external factors such as time and prior experience in similar tasks have effects on inter-rater consistency.
Published 2020-02-18
URL https://arxiv.org/abs/2002.07927v2
PDF https://arxiv.org/pdf/2002.07927v2.pdf
PWC https://paperswithcode.com/paper/studying-the-effects-of-cognitive-biases-in
Repo
Framework

#### Connectivity-driven Communication in Multi-agent Reinforcement Learning through Diffusion Processes on Graphs

Title Connectivity-driven Communication in Multi-agent Reinforcement Learning through Diffusion Processes on Graphs
Authors Emanuele Pesce, Giovanni Montana
Abstract We discuss the problem of learning collaborative behaviour in multi-agent systems using deep reinforcement learning (DRL). A connectivity-driven communication (CDC) algorithm is proposed to address three key aspects: what agents to involve in the communication, what information content to share, and how often to share it. We introduce the notion of a connectivity network, modelled as a weighted graph, where nodes represent agents and edges represent the degree of connectivity between pairs of agents. The optimal graph topology is learned end-to-end concurrently with the stochastic policy so as to maximise future expected returns. The communication patterns depend on the graph’s topology through a diffusion process on the graph, the heat kernel, which is found by exponentiating the Laplacian eigensystem through time and is fully differentiable. Empirical results show that CDC is capable of superior performance over alternative algorithms for a range of cooperative navigation tasks.
Published 2020-02-12
URL https://arxiv.org/abs/2002.05233v1
PDF https://arxiv.org/pdf/2002.05233v1.pdf
PWC https://paperswithcode.com/paper/connectivity-driven-communication-in-multi
Repo
Framework
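
The heat kernel mentioned in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes an undirected graph with a symmetric weight matrix and the combinatorial Laplacian L = D - W (the abstract does not say which Laplacian is used). The function name `heat_kernel`, the example weights and the diffusion time `t` are hypothetical.

```python
import numpy as np

def heat_kernel(weights, t):
    """Heat kernel exp(-t L) of a weighted undirected graph (assumes symmetric weights)."""
    weights = np.asarray(weights, dtype=float)
    degree = np.diag(weights.sum(axis=1))
    laplacian = degree - weights                    # assumed form: L = D - W
    eigvals, eigvecs = np.linalg.eigh(laplacian)    # eigensystem of the symmetric Laplacian
    # Exponentiate the eigenvalues through time t, then map back to node space.
    return eigvecs @ np.diag(np.exp(-t * eigvals)) @ eigvecs.T

# Hypothetical connectivity graph over three agents.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.5],
              [0.0, 0.5, 0.0]])
H = heat_kernel(W, t=1.0)
print(H.round(3))
```

Each row of `H` sums to one, so row `i` can be read as how much of agent `i`'s signal reaches every other agent after diffusing for time `t`. Every step is a differentiable matrix operation, which is consistent with the abstract's remark that the kernel is fully differentiable.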

#### Transformer on a Diet

Title Transformer on a Diet
Authors Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alexander J. Smola
Abstract The Transformer has been widely used thanks to its ability to capture sequence information efficiently. However, recent developments, such as BERT and GPT-2, deliver only heavy architectures with a focus on effectiveness. In this paper, we explore three carefully designed light Transformer architectures to find out whether a Transformer with less computation can produce competitive results. Experimental results on language model benchmark datasets hint that such a trade-off is promising: the light Transformer reduces the parameter count by up to 70% while obtaining competitive perplexity compared to the standard Transformer. The source code is publicly available.
Published 2020-02-14
URL https://arxiv.org/abs/2002.06170v1
PDF https://arxiv.org/pdf/2002.06170v1.pdf
PWC https://paperswithcode.com/paper/transformer-on-a-diet
Repo
Framework

#### Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition

Title Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition
Authors Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jianhua Tao, Ye Bai
Abstract Recently, language identity information has been utilized to improve the performance of end-to-end code-switching (CS) speech recognition. However, previous works use an additional language identification (LID) model as an auxiliary module, which makes the system complex. In this work, we propose an improved recurrent neural network transducer (RNN-T) model with language bias to alleviate the problem. We use the language identities to bias the model to predict the CS points. This encourages the model to learn the language identity information directly from the transcription, and no additional LID model is needed. We evaluate the approach on the Mandarin-English CS corpus SEAME. Compared to our RNN-T baseline, the proposed method achieves 16.2% and 12.9% relative error reduction on the two test sets, respectively.
Published 2020-02-19
URL https://arxiv.org/abs/2002.08126v1
PDF https://arxiv.org/pdf/2002.08126v1.pdf
PWC https://paperswithcode.com/paper/rnn-transducer-with-language-bias-for-end-to
Repo
Framework

#### Identification of Indian Languages using Ghost-VLAD pooling

Title Identification of Indian Languages using Ghost-VLAD pooling
Authors Krishna D N, Ankita Patil, M. S. P Raj, Sai Prasad H S, Prabhu Aashish Garapati
Abstract In this work, we propose a new pooling strategy for language identification, focusing on Indian languages. The idea is to obtain utterance-level features for audio of any length for robust language recognition. We use the GhostVLAD approach to generate an utterance-level feature vector for any variable-length input audio by aggregating the local frame-level features across time. The generated feature vector is shown to be highly language-discriminative and helps achieve state-of-the-art results on the language identification task. We conduct our experiments on 635 hours of audio data for 7 Indian languages. Our method outperforms the previous state-of-the-art x-vector [11] method by an absolute improvement of 1.88% in F1-score and achieves 98.43% F1-score on the held-out test data. We compare our system with various pooling approaches and show that GhostVLAD is the best pooling approach for this task. We also provide a visualization of the utterance-level embeddings generated using Ghost-VLAD pooling and show that this method creates embeddings that have very good language-discriminative features.
Published 2020-02-05
URL https://arxiv.org/abs/2002.01664v1
PDF https://arxiv.org/pdf/2002.01664v1.pdf
PWC https://paperswithcode.com/paper/identification-of-indian-languages-using
Repo
Framework
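
The aggregation of frame-level features into one fixed-size utterance vector can be sketched as below. This is an illustrative NumPy version of GhostVLAD-style pooling under stated assumptions, not the authors' code: the soft assignment is assumed to be a linear layer followed by a softmax, the parameters are random stand-ins for learned ones, and the name `ghostvlad_pool` and all sizes are hypothetical.

```python
import numpy as np

def ghostvlad_pool(frames, centers, assign_w, assign_b, num_ghost):
    """Pool (T, D) frame features into one (K * D,) utterance vector.

    centers:  (K + G, D) cluster centres; the last G rows are ghost clusters
    assign_w: (D, K + G) and assign_b: (K + G,) soft-assignment parameters
    """
    logits = frames @ assign_w + assign_b                  # (T, K+G)
    logits -= logits.max(axis=1, keepdims=True)            # numerically stable softmax
    soft = np.exp(logits)
    soft /= soft.sum(axis=1, keepdims=True)
    residuals = frames[:, None, :] - centers[None, :, :]   # (T, K+G, D)
    vlad = (soft[:, :, None] * residuals).sum(axis=0)      # sum over time -> (K+G, D)
    vlad = vlad[: centers.shape[0] - num_ghost]            # discard the ghost clusters
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12  # per-cluster L2 norm
    flat = vlad.reshape(-1)
    return flat / (np.linalg.norm(flat) + 1e-12)           # final L2 norm

rng = np.random.default_rng(0)
D, K, G = 8, 4, 2                                          # hypothetical sizes
centers = rng.normal(size=(K + G, D))
assign_w = rng.normal(size=(D, K + G))
assign_b = np.zeros(K + G)
short = ghostvlad_pool(rng.normal(size=(50, D)), centers, assign_w, assign_b, G)
long = ghostvlad_pool(rng.normal(size=(120, D)), centers, assign_w, assign_b, G)
print(short.shape, long.shape)  # both (32,), regardless of the number of frames
```

Summing over the time axis is what removes the dependence on utterance length, while the ghost clusters absorb frames that should not contribute to the final vector.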

#### Forecasting Bitcoin closing price series using linear regression and neural networks models

Title Forecasting Bitcoin closing price series using linear regression and neural networks models
Authors Nicola Uras, Lodovica Marchesi, Michele Marchesi, Roberto Tonelli
Abstract This paper studies how to forecast the daily closing price series of Bitcoin, using data on prices and volumes of prior days. Bitcoin price behaviour is still largely unexplored, presenting new opportunities. We compared our results with two modern works on Bitcoin price forecasting and with a well-known recent paper that uses Intel, National Bank shares and Microsoft daily NASDAQ closing prices spanning a 3-year interval. We followed different approaches in parallel, implementing both statistical techniques and machine learning algorithms. The SLR model for univariate series forecasting uses only closing prices, whereas the MLR model for multivariate series uses both price and volume data. We applied the ADF test to these series, which turned out to be indistinguishable from a random walk. We also used two artificial neural networks: MLP and LSTM. We then partitioned the dataset into shorter sequences, representing different price regimes, obtaining the best results using more than one previous price, thus confirming our regime hypothesis. All the models were evaluated in terms of MAPE and relative RMSE. They performed well, and their results were overall better than those obtained in the benchmarks. Based on the results, it was possible to demonstrate the efficacy of the proposed methodology and its contribution to the state of the art.
Published 2020-01-04
URL https://arxiv.org/abs/2001.01127v1
PDF https://arxiv.org/pdf/2001.01127v1.pdf
PWC https://paperswithcode.com/paper/forecasting-bitcoin-closing-price-series
Repo
Framework
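
The two evaluation metrics named in the abstract can be computed as in the sketch below. MAPE follows its standard definition. The abstract does not define relative RMSE, so the version here (root mean square of the per-day relative errors) is an assumption. The price values are made up for illustration.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def relative_rmse(actual, predicted):
    """Root mean square of the relative errors (assumed definition)."""
    return math.sqrt(sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical closing prices and forecasts.
closing = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(round(mape(closing, forecast), 3))           # 6.667
print(round(relative_rmse(closing, forecast), 4))  # 0.0816
```

Both metrics divide each error by the actual price, which makes results comparable across series with very different price levels, such as Bitcoin and individual stocks.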

#### An Efficient Software-Hardware Design Framework for Spiking Neural Network Systems

Title An Efficient Software-Hardware Design Framework for Spiking Neural Network Systems
Authors Khanh N. Dang, Abderazek Ben Abdallah
Abstract Spiking Neural Networks (SNNs) are the third generation of Neural Networks (NNs), mimicking the natural behavior of the brain. By processing based on binary input/output, SNNs offer lower complexity, higher density and lower power consumption. This work presents an efficient software-hardware design framework for developing SNN systems in hardware. In addition, a design of a low-cost neurosynaptic core is presented based on a packet-switching communication approach. The evaluation results show that the ANN-to-SNN conversion method with the size 784:1200:1200:10 achieves 99% accuracy for MNIST, while the unsupervised STDP achieves 89% with the size 784:400 with recurrent connections. The design of 256 neurons and 65k synapses is also implemented in ASIC 45nm technology with an area cost of 0.205 $mm^2$.
Published 2020-03-22
URL https://arxiv.org/abs/2003.09847v1
PDF https://arxiv.org/pdf/2003.09847v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-software-hardware-design
Repo
Framework

#### Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior

Title Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Authors Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu
Abstract Recent neural text-to-speech (TTS) models with fine-grained latent features enable precise control of the prosody of synthesized speech. Such models typically incorporate a fine-grained variational autoencoder (VAE) structure, extracting latent features at each input token (e.g., phonemes). However, generating samples with the standard VAE prior often results in unnatural and discontinuous speech, with dramatic prosodic variation between tokens. This paper proposes a sequential prior in a discrete latent space which can generate more naturally sounding samples. This is accomplished by discretizing the latent features using vector quantization (VQ), and separately training an autoregressive (AR) prior model over the result. We evaluate the approach using listening tests, objective metrics of automatic speech recognition (ASR) performance, and measurements of prosody attributes. Experimental results show that the proposed model significantly improves the naturalness in random sample generation. Furthermore, initial experiments demonstrate that randomly sampling from the proposed model can be used as data augmentation to improve the ASR performance.
Tasks Data Augmentation, Quantization, Speech Recognition
Published 2020-02-06
URL https://arxiv.org/abs/2002.03788v1
PDF https://arxiv.org/pdf/2002.03788v1.pdf
PWC https://paperswithcode.com/paper/generating-diverse-and-natural-text-to-speech
Repo
Framework
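
The vector quantization step described in the abstract replaces each continuous latent vector with its nearest codebook entry. A minimal pure-Python sketch of that lookup follows. The codebook and latent values are hypothetical, and neither the training of the codebook nor the autoregressive prior is shown.

```python
def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    indices = [
        min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
        for z in latents
    ]
    return indices, [codebook[k] for k in indices]

# Hypothetical 2-D codebook and per-token latents.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [1.8, 0.1]]
indices, quantized = quantize(latents, codebook)
print(indices)  # [0, 1, 2]
```

The resulting index sequence is a string of discrete symbols, which is the kind of input over which a separately trained autoregressive prior can be fitted.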

#### Inexpensive surface electromyography sleeve with consistent electrode placement enables dexterous and stable prosthetic control through deep learning

Title Inexpensive surface electromyography sleeve with consistent electrode placement enables dexterous and stable prosthetic control through deep learning
Authors Jacob A. George, Anna Neibling, Michael D. Paskett, Gregory A. Clark
Abstract The dexterity of conventional myoelectric prostheses is limited in part by the small datasets used to train the control algorithms. Variations in surface electrode positioning make it difficult to collect consistent data and to estimate motor intent reliably over time. To address these challenges, we developed an inexpensive, easy-to-don sleeve that can record robust and repeatable surface electromyography from 32 embedded monopolar electrodes. Embedded grommets are used to consistently align the sleeve with natural skin markings (e.g., moles, freckles, scars). The sleeve can be manufactured in a few hours for less than \$60. Data from seven intact participants show the sleeve provides a signal-to-noise ratio of 14, a don-time under 11 seconds, and sub-centimeter precision for electrode placement. Furthermore, in a case study with one intact participant, we use the sleeve to demonstrate that neural networks can provide simultaneous and proportional control of six degrees of freedom, even 263 days after initial algorithm training. We also highlight that consistent recordings, accumulated over time to establish a large dataset, significantly improve dexterity. These results suggest that deep learning with a 74-layer neural network can substantially improve the dexterity and stability of myoelectric prosthetic control, and that deep-learning techniques can be readily instantiated and further validated through inexpensive sleeves/sockets with consistent recording locations.
Published 2020-02-28
URL https://arxiv.org/abs/2003.00070v1
PDF https://arxiv.org/pdf/2003.00070v1.pdf
PWC https://paperswithcode.com/paper/inexpensive-surface-electromyography-sleeve
Repo
Framework

Abstract Deep Neural Networks (DNNs) are commonly used for various traffic analysis problems, such as website fingerprinting and flow correlation, as they outperform traditional (e.g., statistical) techniques by large margins. However, deep neural networks are known to be vulnerable to adversarial examples: adversarial inputs to the model that get labeled incorrectly by the model due to small adversarial perturbations. In this paper, for the first time, we show that an adversary can defeat DNN-based traffic analysis techniques by applying adversarial perturbations on the patterns of live network traffic.
Published 2020-02-16
URL https://arxiv.org/abs/2002.06495v1
PDF https://arxiv.org/pdf/2002.06495v1.pdf
Repo
Framework

#### A Primer on Domain Adaptation

Title A Primer on Domain Adaptation
Authors Pirmin Lemberger, Ivan Panico
Abstract Standard supervised machine learning assumes that the distribution of the source samples used to train an algorithm is the same as that of the target samples on which it is supposed to make predictions. However, as any data scientist will confirm, this is hardly ever the case in practice. The set of statistical and numerical methods that deal with such situations is known as domain adaptation, a field with a long and rich history. The myriad of methods available and the unfortunate lack of a clear and universally accepted terminology can however make the topic rather daunting for the newcomer. Therefore, rather than aiming at completeness, which leads to exhibiting a tedious catalog of methods, this pedagogical review aims at a coherent presentation of four important special cases: (1) prior shift, a situation in which training samples were selected according to their labels without any knowledge of their actual distribution in the target, (2) covariate shift, which deals with a situation where training examples were picked according to their features but with some selection bias, (3) concept shift, where the dependence of the labels on the features differs between the source and the target, and last but not least (4) subspace mapping, which deals with a situation where features in the target have been subjected to an unknown distortion with respect to the source features. In each case we first build an intuition, next we provide the appropriate mathematical framework and eventually we describe a practical application.
Published 2020-01-27
URL https://arxiv.org/abs/2001.09994v2
PDF https://arxiv.org/pdf/2001.09994v2.pdf
Repo
Framework
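
As a small illustration of the prior shift case, a common correction is to reweight each training sample by the ratio of target to source class priors, w(y) = p_target(y) / p_source(y). The sketch below assumes the target priors are already known or have been estimated separately, which the abstract does not promise. The function name and the label data are hypothetical.

```python
from collections import Counter

def prior_shift_weights(source_labels, target_priors):
    """Per-class importance weights w(y) = p_target(y) / p_source(y)."""
    counts = Counter(source_labels)
    n = len(source_labels)
    return {y: target_priors[y] / (counts[y] / n) for y in counts}

# Hypothetical case: classes balanced in the source, rare positives in the target.
source_labels = ["pos"] * 5 + ["neg"] * 5
target_priors = {"pos": 0.1, "neg": 0.9}   # assumed known
weights = prior_shift_weights(source_labels, target_priors)
print({y: round(w, 3) for y, w in weights.items()})  # {'pos': 0.2, 'neg': 1.8}
```

Weighting each sample's loss by `weights[label]` makes the weighted source label distribution match the target priors, and the average weight over the source samples is one.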

#### On the impact of modern deep-learning techniques to the performance and time-requirements of classification models in experimental high-energy physics

Title On the impact of modern deep-learning techniques to the performance and time-requirements of classification models in experimental high-energy physics
Authors Giles Chatham Strong
Abstract Beginning from a basic neural-network architecture, we test the potential benefits offered by a range of advanced techniques for machine learning and deep learning in the context of a typical classification problem encountered in the domain of high-energy physics, using a well-studied dataset: the 2014 Higgs ML Kaggle dataset. The advantages are evaluated in terms of both performance metrics and the time required to train and apply the resulting models. Techniques examined include domain-specific data-augmentation, learning rate and momentum scheduling, (advanced) ensembling in both model-space and weight-space, and alternative architectures and connection methods. Following the investigation, we arrive at a model which achieves equal performance to the winning solution of the original Kaggle challenge, whilst requiring about 1% of the training time and less than 5% of the inference time using much less specialised hardware. Additionally, a new wrapper library for PyTorch called LUMIN is presented, which incorporates all of the techniques studied.
Published 2020-02-03
URL https://arxiv.org/abs/2002.01427v2
PDF https://arxiv.org/pdf/2002.01427v2.pdf
PWC https://paperswithcode.com/paper/on-the-impact-of-modern-deep-learning
Repo
Framework

#### Style Example-Guided Text Generation using Generative Adversarial Transformers

Title Style Example-Guided Text Generation using Generative Adversarial Transformers
Authors Kuo-Hao Zeng, Mohammad Shoeybi, Ming-Yu Liu
Abstract We introduce a language generative model framework for generating a styled paragraph based on a context sentence and a style reference example. The framework consists of a style encoder and a text decoder. The style encoder extracts a style code from the reference example, and the text decoder generates text based on the style code and the context. We propose a novel objective function to train our framework. We also investigate different network design choices. We conduct extensive experimental validation with comparison to strong baselines to validate the effectiveness of the proposed framework using a newly collected dataset with diverse text styles. Both code and dataset will be released upon publication.
Published 2020-03-02
URL https://arxiv.org/abs/2003.00674v1
PDF https://arxiv.org/pdf/2003.00674v1.pdf
PWC https://paperswithcode.com/paper/style-example-guided-text-generation-using-1
Repo
Framework

#### Functional Data Analysis and Visualisation of Three-dimensional Surface Shape

Title Functional Data Analysis and Visualisation of Three-dimensional Surface Shape
Authors Stanislav Katina, Liberty Vittert, Adrian W. Bowman
Abstract The advent of high resolution imaging has made data on surface shape widespread. Methods for the analysis of shape based on landmarks are well established but high resolution data require a functional approach. The starting point is a systematic and consistent description of each surface shape. Three innovative forms of analysis are then introduced. The first uses surface integration to address issues of registration, principal component analysis and the measurement of asymmetry, all in functional form. Computational issues are handled through discrete approximations to integrals, based in this case on appropriate surface area weighted sums. The second innovation is to focus on sub-spaces where interesting behaviour such as group differences are exhibited, rather than on individual principal components. The third innovation concerns the comparison of individual shapes with a relevant control set, where the concept of a normal range is extended to the highly multivariate setting of surface shape. This has particularly strong applications to medical contexts where the assessment of individual patients is very important. All of these ideas are developed and illustrated in the important context of human facial shape, with a strong emphasis on the effective visual communication of effects of interest.
Published 2020-03-14
URL https://arxiv.org/abs/2003.08817v1
PDF https://arxiv.org/pdf/2003.08817v1.pdf
PWC https://paperswithcode.com/paper/functional-data-analysis-and-visualisation-of
Repo
Framework
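
The idea of approximating a surface integral by an area-weighted sum can be sketched on a triangle mesh. The scheme below (triangle area times the mean of the function values at its three vertices) is one simple choice and an assumption here; the abstract does not specify the weighting the authors use. The mesh and function values are made up for illustration.

```python
import math

def triangle_area(p, q, r):
    """Area of a 3-D triangle from the cross product of two edge vectors."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(c * c for c in cross))

def surface_integral(vertices, triangles, values):
    """Approximate the integral of a function over a mesh by an area-weighted sum."""
    total = 0.0
    for i, j, k in triangles:
        area = triangle_area(vertices[i], vertices[j], vertices[k])
        total += area * (values[i] + values[j] + values[k]) / 3.0
    return total

# Hypothetical mesh: the unit square in the plane z = 0, split into two triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
area = surface_integral(vertices, triangles, [1.0] * 4)           # integral of f = 1
x_integral = surface_integral(vertices, triangles, [v[0] for v in vertices])  # f = x
print(round(area, 6), round(x_integral, 6))  # 1.0 0.5
```

The same weighted sum can stand in for the integrals that appear in functional registration, principal component analysis and asymmetry measures once each surface is represented on a common mesh.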

#### Model Watermarking for Image Processing Networks

Title Model Watermarking for Image Processing Networks
Authors Jie Zhang, Dongdong Chen, Jing Liao, Han Fang, Weiming Zhang, Wenbo Zhou, Hao Cui, Nenghai Yu
Abstract Deep learning has achieved tremendous success in numerous industrial applications. As training a good model often needs massive high-quality data and computation resources, the learned models often have significant business value. However, these valuable deep models are exposed to a huge risk of infringement. For example, if the attacker has full information about a target model, including the network structure and weights, the model can be easily finetuned on new datasets. Even if the attacker can only access the output of the target model, he/she can still train a similar surrogate model by generating a large number of input-output training pairs. How to protect the intellectual property of deep models is a very important but seriously under-researched problem. There are a few recent attempts, but at classification network protection only. In this paper, we propose the first model watermarking framework for protecting image processing models. To achieve this goal, we leverage the spatial invisible watermarking mechanism. Specifically, given a black-box target model, a unified and invisible watermark is hidden in its outputs, which can be regarded as a special task-agnostic barrier. In this way, when the attacker trains a surrogate model using the input-output pairs of the target model, the hidden watermark will be learned and can be extracted afterward. To enable watermarks ranging from binary bits to high-resolution images, both traditional and deep spatial invisible watermarking mechanisms are considered. Experiments demonstrate the robustness of the proposed watermarking mechanism, which can resist surrogate models learned with different network structures and objective functions. Besides deep models, the proposed method can also be easily extended to protect data and traditional image processing algorithms.