January 30, 2020

3214 words 16 mins read

Paper Group ANR 344

Classifying textual data: shallow, deep and ensemble methods. Learning Restricted Boltzmann Machines with Arbitrary External Fields. Understand Dynamic Regret with Switching Cost for Online Decision Making. Coordination Group Formation for OnLine Coordinated Routing Mechanisms. Spiking Neural Networks and Online Learning: An Overview and Perspectives …

Classifying textual data: shallow, deep and ensemble methods

Title Classifying textual data: shallow, deep and ensemble methods
Authors Laura Anderlucci, Lucia Guastadisegni, Cinzia Viroli
Abstract This paper focuses on a comparative evaluation of the most common and modern methods for text classification, including the recent deep learning strategies and ensemble methods. The study is motivated by a challenging real data problem, characterized by high-dimensional and extremely sparse data, deriving from incoming calls to the customer care of an Italian phone company. We will show that deep learning outperforms many classical (shallow) strategies but the combination of shallow and deep learning methods in a unique ensemble classifier may improve the robustness and the accuracy of “single” classification methods.
Tasks Text Classification
Published 2019-02-18
URL http://arxiv.org/abs/1902.07068v1
PDF http://arxiv.org/pdf/1902.07068v1.pdf
PWC https://paperswithcode.com/paper/classifying-textual-data-shallow-deep-and
Repo
Framework

Learning Restricted Boltzmann Machines with Arbitrary External Fields

Title Learning Restricted Boltzmann Machines with Arbitrary External Fields
Authors Surbhi Goel
Abstract We study the problem of learning graphical models with latent variables. We give the first algorithm for learning locally consistent (ferromagnetic or antiferromagnetic) Restricted Boltzmann Machines (RBMs) with arbitrary external fields. Our algorithm has optimal dependence on dimension in the sample complexity and run time; however, it suffers from a sub-optimal dependence on the underlying parameters of the RBM. Prior results had been established only for ferromagnetic RBMs with consistent external fields (all signs must be the same) [Bresler et al., 2018]. The algorithm proposed there strongly relies on the concavity of magnetization, which does not hold in our setting. We show the following key structural property: even in the presence of arbitrary external fields, for any two observed nodes that share a common latent neighbor, the covariance is high. This enables us to design a simple greedy algorithm that maximizes covariance to iteratively build the neighborhood of each vertex.
Tasks
Published 2019-06-15
URL https://arxiv.org/abs/1906.06595v1
PDF https://arxiv.org/pdf/1906.06595v1.pdf
PWC https://paperswithcode.com/paper/learning-restricted-boltzmann-machines-with
Repo
Framework
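The greedy neighborhood-building step described in the abstract can be illustrated with a small sketch: repeatedly add the observed node with the highest empirical covariance. This is only the covariance-maximization idea, not the paper's actual algorithm; the threshold `tau` and the toy data are assumptions for illustration.

```python
import numpy as np

def greedy_neighborhood(samples, v, tau):
    """Grow the neighborhood of observed node v by repeatedly adding the
    node with the highest absolute empirical covariance with v, stopping
    when that covariance falls below the threshold tau."""
    cov = np.cov(samples, rowvar=False)          # empirical covariance matrix
    candidates = set(range(samples.shape[1])) - {v}
    neighborhood = []
    while candidates:
        u = max(candidates, key=lambda j: abs(cov[v, j]))
        if abs(cov[v, u]) < tau:
            break
        neighborhood.append(u)
        candidates.remove(u)
    return neighborhood

# Toy data: nodes 0 and 1 share a latent "neighbor"; node 2 is independent.
rng = np.random.default_rng(0)
latent = rng.standard_normal(5000)
x = np.column_stack([
    latent + 0.1 * rng.standard_normal(5000),
    latent + 0.1 * rng.standard_normal(5000),
    rng.standard_normal(5000),
])
nbrs = greedy_neighborhood(x, v=0, tau=0.5)      # picks up node 1, not node 2
```

As the structural property predicts, the two nodes driven by the shared latent variable exhibit high covariance and get grouped, while the independent node is rejected by the threshold.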

Understand Dynamic Regret with Switching Cost for Online Decision Making

Title Understand Dynamic Regret with Switching Cost for Online Decision Making
Authors Yawei Zhao, Qian Zhao, Xingxing Zhang, En Zhu, Xinwang Liu, Jianping Yin
Abstract As a metric for the performance of an online method, dynamic regret with switching cost has drawn much attention in online decision making problems. Although sublinear regret has been established in much previous research, we still know little about the relation between the dynamic regret and the switching cost. In this paper, we investigate this relation for two classic online settings: Online Algorithms (OA) and Online Convex Optimization (OCO). We provide a new theoretical analysis framework which yields an interesting observation: the relation between the switching cost and the dynamic regret differs between the settings of OA and OCO. Specifically, the switching cost has a significant impact on the dynamic regret in the setting of OA, but it has no impact on the dynamic regret in the setting of OCO. Furthermore, we provide a lower bound on the regret for the setting of OCO, which is the same as the lower bound in the case of no switching cost. This shows that the switching cost does not change the difficulty of online decision making problems in the setting of OCO.
Tasks Decision Making
Published 2019-11-28
URL https://arxiv.org/abs/1911.12595v1
PDF https://arxiv.org/pdf/1911.12595v1.pdf
PWC https://paperswithcode.com/paper/understand-dynamic-regret-with-switching-cost
Repo
Framework
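The quantity studied in the abstract, dynamic regret plus the path length of the decision sequence, can be computed directly for a given run. The quadratic losses, comparator sequence, and "lazy" learner below are made-up toy choices for illustration, not from the paper.

```python
import numpy as np

def dynamic_regret_with_switching(losses, decisions, comparators):
    """Dynamic regret of `decisions` against the time-varying `comparators`,
    plus the total switching cost (l2 path length of the decision sequence)."""
    regret = sum(f(x) - f(u) for f, x, u in zip(losses, decisions, comparators))
    switching = sum(np.linalg.norm(decisions[t] - decisions[t - 1])
                    for t in range(1, len(decisions)))
    return regret + switching

# Toy OCO instance: quadratic losses centered at drifting points.
centers = [np.array([0.0]), np.array([1.0]), np.array([2.0])]
losses = [lambda x, c=c: float(np.sum((x - c) ** 2)) for c in centers]
comparators = centers                                    # per-round minimizers
decisions = [np.array([0.0]), np.array([0.0]), np.array([1.0])]  # lagging learner
r = dynamic_regret_with_switching(losses, decisions, comparators)  # 2.0 + 1.0
```

Here the learner trails the drifting minimizers, paying 2.0 in dynamic regret and 1.0 in switching cost, for a total of 3.0.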

Coordination Group Formation for OnLine Coordinated Routing Mechanisms

Title Coordination Group Formation for OnLine Coordinated Routing Mechanisms
Authors Wang Peng, Lili Du
Abstract This study considers that the collective route choices of travelers en route represent a resolution of their competition over network routes. Understanding this competition and coordinating travelers’ route choices helps mitigate urban traffic congestion. Even though existing studies have developed such mechanisms (e.g., the CRM [1]), we still lack a quantitative method to evaluate the coordination potential and identify proper coordination groups (CGs) for implementing the CRM. Thus, existing mechanisms hit prohibitive computing difficulty when implemented with many opt-in travelers. Motivated by this view, this study develops mathematical approaches to quantify the coordination potential between two and among multiple travelers. Next, we develop the adaptive centroid-based clustering algorithm (ACCA), which splits travelers en route in a local network into CGs, each with a proper size and strong coordination potential. Moreover, the ACCA is statistically guaranteed to stop at a locally optimal clustering solution, which balances the inner-cluster and inter-cluster coordination potential, and it can be implemented with parallel computation to accelerate its computing efficiency. Furthermore, we propose a clustering-based coordinated routing mechanism (CB-CRM), which implements a CRM on each individual CG. Numerical experiments built upon both the Sioux Falls and Hardee city networks show that the ACCA works efficiently to form proper coordination groups, so that, compared to the CRM, the CB-CRM significantly improves computational efficiency with minor system performance loss in a large network. This merit becomes more apparent under high penetration rates and congested traffic conditions. Lastly, the experiments validate the good features of the ACCA as well as the value of parallel computation.
Tasks
Published 2019-11-12
URL https://arxiv.org/abs/1911.05159v1
PDF https://arxiv.org/pdf/1911.05159v1.pdf
PWC https://paperswithcode.com/paper/coordination-group-formation-for-online
Repo
Framework

Spiking Neural Networks and Online Learning: An Overview and Perspectives

Title Spiking Neural Networks and Online Learning: An Overview and Perspectives
Authors Jesus L. Lobo, Javier Del Ser, Albert Bifet, Nikola Kasabov
Abstract Applications that generate huge amounts of data in the form of fast streams are becoming increasingly prevalent, making it necessary to learn in an online manner. These conditions usually impose memory and processing time restrictions, and they often give rise to evolving environments where a change may affect the input data distribution. Such a change causes predictive models trained over these stream data to become obsolete and fail to adapt suitably to the new distribution. Especially in these non-stationary scenarios, there is a pressing need for new algorithms that adapt to changes as fast as possible while maintaining good performance scores. Unfortunately, most off-the-shelf classification models need to be retrained if they are used in changing environments, and fail to scale properly. Spiking Neural Networks have emerged as one of the most successful approaches to modeling the behavior and learning potential of the brain, and to exploiting that potential for practical online learning tasks. Besides, some specific flavors of Spiking Neural Networks can overcome the need for retraining after a drift occurs. This work intends to merge both fields by serving as a comprehensive overview, motivating further developments that embrace Spiking Neural Networks for online learning scenarios, and offering a friendly entry point for non-experts.
Tasks
Published 2019-07-23
URL https://arxiv.org/abs/1908.08019v1
PDF https://arxiv.org/pdf/1908.08019v1.pdf
PWC https://paperswithcode.com/paper/spiking-neural-networks-and-online-learning
Repo
Framework
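The streaming regime the overview describes (predict on each arriving sample, then update the model immediately) is commonly evaluated prequentially. The sketch below is a generic test-then-train loop with a hypothetical toy learner, not a Spiking Neural Network; it only illustrates the evaluation protocol.

```python
# Minimal prequential (test-then-train) loop: each sample is first used for
# prediction, then for an immediate model update. The running accuracy
# reflects how well the learner tracks the (possibly drifting) stream.
def prequential(stream, model):
    correct = 0
    for i, (x, y) in enumerate(stream, start=1):
        correct += (model.predict(x) == y)   # test first ...
        model.update(x, y)                   # ... then train
    return correct / i

class MajorityClass:
    """Hypothetical toy learner: always predicts the most frequent label so far."""
    def __init__(self):
        self.counts = {}
    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None
    def update(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

# A stream with an abrupt concept drift: label 1 for 6 steps, then label 0.
acc = prequential([(None, 1)] * 6 + [(None, 0)] * 4, MajorityClass())
```

The majority-class learner never recovers after the drift, which is exactly the obsolescence problem the abstract attributes to non-adaptive models.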

Multilingual Speech Recognition with Corpus Relatedness Sampling

Title Multilingual Speech Recognition with Corpus Relatedness Sampling
Authors Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze
Abstract Multilingual acoustic models have been successfully applied to low-resource speech recognition. Most existing works combine many small corpora and pretrain a multilingual model by sampling from each corpus uniformly; the model is eventually fine-tuned on each target corpus. This approach, however, fails to exploit the relatedness and similarity among corpora in the training set. For example, the target corpus might benefit more from a corpus in the same domain or a corpus in a closely related language. In this work, we propose a simple but useful sampling strategy to take advantage of this relatedness. We first compute corpus-level embeddings and estimate the similarity between each pair of corpora. Next, we train the multilingual model by sampling uniformly from each corpus at first, then gradually increase the probability of sampling from related corpora based on their similarity with the target corpus. Finally, the model is fine-tuned automatically on the target corpus. Our sampling strategy outperforms the baseline multilingual model on 16 low-resource tasks. Additionally, we demonstrate that our corpus embeddings capture the language and domain information of each corpus.
Tasks Speech Recognition
Published 2019-08-02
URL https://arxiv.org/abs/1908.01060v1
PDF https://arxiv.org/pdf/1908.01060v1.pdf
PWC https://paperswithcode.com/paper/multilingual-speech-recognition-with-corpus
Repo
Framework
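The annealed sampling idea, starting uniform and drifting toward similarity-weighted sampling, can be sketched as a simple interpolation between two distributions. The linear schedule and the similarity values are assumptions for illustration; the paper's exact rule may differ.

```python
import numpy as np

def sampling_probs(similarities, step, total_steps):
    """Interpolate from uniform corpus sampling toward similarity-weighted
    sampling as training progresses. `similarities` holds each training
    corpus's similarity to the target corpus (illustrative values)."""
    sim = np.asarray(similarities, dtype=float)
    related = sim / sim.sum()                 # similarity-proportional distribution
    uniform = np.full_like(sim, 1.0 / len(sim))
    alpha = min(step / total_steps, 1.0)      # 0 -> uniform, 1 -> related
    return (1 - alpha) * uniform + alpha * related

p_start = sampling_probs([0.9, 0.5, 0.1], step=0, total_steps=100)    # uniform
p_end = sampling_probs([0.9, 0.5, 0.1], step=100, total_steps=100)    # related
```

Early in training every corpus is sampled equally; by the end, the most related corpus dominates (0.9/1.5 = 0.6 of the draws in this toy example), matching the described shift toward related corpora.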

On Model Robustness Against Adversarial Examples

Title On Model Robustness Against Adversarial Examples
Authors Shufei Zhang, Kaizhu Huang, Zenglin Xu
Abstract We study model robustness against adversarial examples: slightly perturbed input data that can nonetheless fool many state-of-the-art deep learning models. Unlike previous research, we establish a novel theory addressing the robustness issue from the perspective of the stability of the loss function in a small neighborhood of natural examples. We propose an energy function to describe this stability and prove that reducing this energy guarantees robustness against adversarial examples. We also show that traditional training methods, including adversarial training with the $l_2$ norm constraint (AT) and Virtual Adversarial Training (VAT), tend to minimize a lower bound of our proposed energy function. We then show analytically that minimizing this lower bound can nevertheless lead to insufficient robustness within the neighborhood of an input sample. Furthermore, we design a more principled method with an energy regularization that achieves better robustness than previous methods. Through a series of experiments, we demonstrate the superiority of our model on both supervised and semi-supervised tasks. In particular, our proposed adversarial framework achieves the best performance compared with previous adversarial training methods on the benchmark datasets MNIST, CIFAR-10, and SVHN, and it demonstrates much better robustness against adversarial examples than all the other comparison methods.
Tasks
Published 2019-11-15
URL https://arxiv.org/abs/1911.06479v1
PDF https://arxiv.org/pdf/1911.06479v1.pdf
PWC https://paperswithcode.com/paper/on-model-robustness-against-adversarial
Repo
Framework
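The notion of an "energy" measuring the stability of the loss near a natural example can be conveyed with a sampled estimate: average how much the loss increases under random perturbations in a small l2-ball. This numpy sketch is a stand-in for intuition only; the paper's actual energy function is defined differently.

```python
import numpy as np

def stability_energy(loss, x, radius, n_samples=100, seed=0):
    """Illustrative stand-in for a stability 'energy': the average increase
    of the loss over random perturbations of x on an l2-sphere of the given
    radius. Low values mean the loss is stable around x."""
    rng = np.random.default_rng(seed)
    base = loss(x)
    total = 0.0
    for _ in range(n_samples):
        d = rng.standard_normal(x.shape)
        d *= radius / np.linalg.norm(d)       # scale onto the sphere
        total += loss(x + d) - base
    return total / n_samples

flat = lambda x: 0.0 * np.sum(x)              # perfectly stable loss surface
steep = lambda x: float(np.sum(x ** 2))       # grows away from the origin
x0 = np.zeros(10)
e_flat = stability_energy(flat, x0, radius=0.1)
e_steep = stability_energy(steep, x0, radius=0.1)
```

A flat loss has zero energy, while the quadratic bowl pays radius² on average, which is the sense in which low energy corresponds to robustness in a small neighborhood.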

Towards a Model for Spoken Conversational Search

Title Towards a Model for Spoken Conversational Search
Authors Johanne R. Trippas, Damiano Spina, Paul Thomas, Mark Sanderson, Hideo Joho, Lawrence Cavedon
Abstract Conversation is the natural mode of information exchange in daily life, so a spoken conversational interaction for search input and output is a logical format for information seeking. However, the conceptualisation of user-system interactions, or information exchange, in spoken conversational search (SCS) has not been explored. The first step in conceptualising SCS is to understand the conversational moves used in an audio-only communication channel for search. This paper explores conversational actions for the task of search. We define a qualitative methodology for creating conversational datasets, propose analysis protocols, and develop the SCSdata. Furthermore, we use the SCSdata to create the first annotation schema for SCS, the SCoSAS, enabling us to investigate interactivity in SCS. We further establish that SCS needs to incorporate interactivity and pro-activity to overcome the complexity that the information-seeking process in an audio-only channel poses. In summary, this exploratory study unpacks the breadth of SCS. Our results highlight the need to integrate discourse in future SCS models and contribute to the advancement of the formalisation of SCS models and the design of SCS systems.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13166v2
PDF https://arxiv.org/pdf/1910.13166v2.pdf
PWC https://paperswithcode.com/paper/191013166
Repo
Framework

Local Regularization of Noisy Point Clouds: Improved Global Geometric Estimates and Data Analysis

Title Local Regularization of Noisy Point Clouds: Improved Global Geometric Estimates and Data Analysis
Authors Nicolas Garcia Trillos, Daniel Sanz-Alonso, Ruiyi Yang
Abstract Several data analysis techniques employ similarity relationships between data points to uncover the intrinsic dimension and geometric structure of the underlying data-generating mechanism. In this paper we work under the model assumption that the data consist of random perturbations of feature vectors lying on a low-dimensional manifold. We study two questions: how to define the similarity relationship over noisy data points, and what the resulting impact of the choice of similarity is on the extraction of global geometric information from the underlying manifold. We provide concrete mathematical evidence that using a local regularization of the noisy data to define the similarity improves the approximation of the hidden Euclidean distance between unperturbed points. Furthermore, graph-based objects constructed with the locally regularized similarity function satisfy better error bounds in their recovery of global geometric information. Our theory is supported by numerical experiments demonstrating that the gain in geometric understanding facilitated by local regularization translates into a gain in classification accuracy on simulated and real data.
Tasks
Published 2019-04-06
URL http://arxiv.org/abs/1904.03335v1
PDF http://arxiv.org/pdf/1904.03335v1.pdf
PWC https://paperswithcode.com/paper/local-regularization-of-noisy-point-clouds
Repo
Framework
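A minimal form of local regularization is to replace each noisy point by the average of its k nearest neighbors before building any similarity graph. The sketch below (with a made-up 1D manifold, noise level, and k) only illustrates the denoising idea, not the paper's exact construction.

```python
import numpy as np

def local_regularize(points, k):
    """Replace each noisy point by the mean of its k nearest neighbors
    (itself included) -- a simple local-averaging denoiser in the spirit
    of the local regularization studied in the paper."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    idx = np.argsort(dists, axis=1)[:, :k]   # k nearest neighbors per point
    return points[idx].mean(axis=1)

# Noisy samples around a 1D manifold (a straight line) embedded in 2D.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
clean = np.column_stack([t, t])
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
denoised = local_regularize(noisy, k=10)

err_before = np.linalg.norm(noisy - clean, axis=1).mean()
err_after = np.linalg.norm(denoised - clean, axis=1).mean()
```

Local averaging shrinks the distance to the unperturbed points, which is the mechanism behind the improved approximation of hidden Euclidean distances described in the abstract.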

Assessing Architectural Similarity in Populations of Deep Neural Networks

Title Assessing Architectural Similarity in Populations of Deep Neural Networks
Authors Audrey Chung, Paul Fieguth, Alexander Wong
Abstract Evolutionary deep intelligence has recently shown great promise for producing small, powerful deep neural network models via the synthesis of increasingly efficient architectures over successive generations. Despite recent research showing the efficacy of multi-parent evolutionary synthesis, little has been done to directly assess architectural similarity between networks during the synthesis process for improved parent network selection. In this work, we present a preliminary study into quantifying architectural similarity via the percentage overlap of architectural clusters. Results show that networks synthesized using architectural alignment (via gene tagging) maintain higher architectural similarities within each generation, potentially restricting the search space of highly efficient network architectures.
Tasks
Published 2019-04-19
URL http://arxiv.org/abs/1904.09879v1
PDF http://arxiv.org/pdf/1904.09879v1.pdf
PWC https://paperswithcode.com/paper/190409879
Repo
Framework
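One plausible reading of "percentage overlap of architectural clusters" is a best-match set overlap averaged over clusters. The function below uses Jaccard overlap as the per-cluster measure; that choice, and the toy cluster indices, are assumptions, since the paper's exact definition is not given here.

```python
def percentage_overlap(clusters_a, clusters_b):
    """Illustrative similarity between two networks' architectural clusters:
    for each cluster in A, take its best Jaccard overlap with any cluster
    in B, then average over the clusters of A."""
    def jaccard(s, t):
        return len(s & t) / len(s | t)
    return sum(max(jaccard(a, b) for b in clusters_b)
               for a in clusters_a) / len(clusters_a)

# Hypothetical clusters of architectural units (by index) in two networks.
net1 = [{0, 1, 2}, {3, 4}]
net2 = [{0, 1, 2}, {4, 5}]
overlap = percentage_overlap(net1, net2)   # (1.0 + 1/3) / 2
```

Networks synthesized with architectural alignment would score high under such a measure, reflecting the higher within-generation similarity reported in the abstract.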

Video-to-Video Translation for Visual Speech Synthesis

Title Video-to-Video Translation for Visual Speech Synthesis
Authors Michail C. Doukas, Viktoriia Sharmanska, Stefanos Zafeiriou
Abstract Despite remarkable success in image-to-image translation that celebrates the advancements of generative adversarial networks (GANs), very limited attempts are known for video domain translation. We study the task of video-to-video translation in the context of visual speech generation, where the goal is to transform an input video of any spoken word to an output video of a different word. This is a multi-domain translation, where each word forms a domain of videos uttering this word. Adaptation of the state-of-the-art image-to-image translation model (StarGAN) to this setting falls short with a large vocabulary size. Instead we propose to use character encodings of the words and design a novel character-based GANs architecture for video-to-video translation called Visual Speech GAN (ViSpGAN). We are the first to demonstrate video-to-video translation with a vocabulary of 500 words.
Tasks Image-to-Image Translation, Speech Synthesis
Published 2019-05-28
URL https://arxiv.org/abs/1905.12043v1
PDF https://arxiv.org/pdf/1905.12043v1.pdf
PWC https://paperswithcode.com/paper/video-to-video-translation-for-visual-speech
Repo
Framework

Modeling Recurrence for Transformer

Title Modeling Recurrence for Transformer
Authors Jie Hao, Xing Wang, Baosong Yang, Longyue Wang, Jinfeng Zhang, Zhaopeng Tu
Abstract Recently, the Transformer model, which is based solely on attention mechanisms, has advanced the state-of-the-art on various machine translation tasks. However, recent studies reveal that the lack of recurrence hinders further improvement of its translation capacity. In response to this problem, we propose to directly model recurrence for the Transformer with an additional recurrence encoder. In addition to the standard recurrent neural network, we introduce a novel attentive recurrent network to leverage the strengths of both attention and recurrent networks. Experimental results on the widely used WMT14 English-German and WMT17 Chinese-English translation tasks demonstrate the effectiveness of the proposed approach. Our studies also reveal that the proposed model benefits from a short-cut that bridges the source and target sequences with a single recurrent layer, which outperforms its deep counterpart.
Tasks Machine Translation
Published 2019-04-05
URL http://arxiv.org/abs/1904.03092v1
PDF http://arxiv.org/pdf/1904.03092v1.pdf
PWC https://paperswithcode.com/paper/modeling-recurrence-for-transformer
Repo
Framework

CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network

Title CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network
Authors Vincent Wan, Chun-an Chan, Tom Kenter, Jakub Vit, Rob Clark
Abstract The prosodic aspects of speech signals produced by current text-to-speech systems are typically averaged over training material, and as such lack the variety and liveliness found in natural speech. To avoid monotony and averaged prosody contours, it is desirable to have a way of modeling the variation in the prosodic aspects of speech, so audio signals can be synthesized in multiple ways for a given text. We present a new, hierarchically structured conditional variational autoencoder to generate prosodic features (fundamental frequency, energy and duration) suitable for use with a vocoder or a generative model like WaveNet. At inference time, an embedding representing the prosody of a sentence may be sampled from the variational layer to allow for prosodic variation. To efficiently capture the hierarchical nature of the linguistic input (words, syllables and phones), both the encoder and decoder parts of the auto-encoder are hierarchical, in line with the linguistic structure, with layers being clocked dynamically at the respective rates. We show in our experiments that our dynamic hierarchical network outperforms a non-hierarchical state-of-the-art baseline, and, additionally, that prosody transfer across sentences is possible by employing the prosody embedding of one sentence to generate the speech signal of another.
Tasks Speech Synthesis
Published 2019-05-17
URL https://arxiv.org/abs/1905.07195v2
PDF https://arxiv.org/pdf/1905.07195v2.pdf
PWC https://paperswithcode.com/paper/chive-varying-prosody-in-speech-synthesis
Repo
Framework

Speaker-Independent Speech-Driven Visual Speech Synthesis using Domain-Adapted Acoustic Models

Title Speaker-Independent Speech-Driven Visual Speech Synthesis using Domain-Adapted Acoustic Models
Authors Ahmed Hussen Abdelaziz, Barry-John Theobald, Justin Binder, Gabriele Fanelli, Paul Dixon, Nicholas Apostoloff, Thibaut Weise, Sachin Kajareker
Abstract Speech-driven visual speech synthesis involves mapping features extracted from acoustic speech to the corresponding lip animation controls for a face model. This mapping can take many forms, but a powerful approach is to use deep neural networks (DNNs). However, a limitation is the lack of the synchronized audio, video, and depth data required to reliably train the DNNs, especially for speaker-independent models. In this paper, we investigate adapting an automatic speech recognition (ASR) acoustic model (AM) for the visual speech synthesis problem. We train the AM on ten thousand hours of audio-only data. The AM is then adapted to the visual speech synthesis domain using ninety hours of synchronized audio-visual speech. Using a subjective assessment test, we compared the performance of the AM-initialized DNN to one with a random initialization. The results show that viewers significantly prefer animations generated from the AM-initialized DNN to those generated using the randomly initialized model. We conclude that visual speech synthesis can significantly benefit from the powerful representation of speech in ASR acoustic models.
Tasks Speech Recognition, Speech Synthesis
Published 2019-05-15
URL https://arxiv.org/abs/1905.06860v1
PDF https://arxiv.org/pdf/1905.06860v1.pdf
PWC https://paperswithcode.com/paper/speaker-independent-speech-driven-visual
Repo
Framework

Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination

Title Assessing Algorithmic Fairness with Unobserved Protected Class Using Data Combination
Authors Nathan Kallus, Xiaojie Mao, Angela Zhou
Abstract The increasing impact of algorithmic decisions on people’s lives compels us to scrutinize their fairness and, in particular, the disparate impacts that ostensibly color-blind algorithms can have on different groups. Examples include credit decisioning, hiring, advertising, criminal justice, personalized medicine, and targeted policymaking, where in some cases legislative or regulatory frameworks for fairness exist and define specific protected classes. In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data. This is particularly a problem in lending and healthcare. We consider the use of an auxiliary dataset, such as the US census, that includes class labels but not decisions or outcomes. We show that a variety of common disparity measures are generally unidentifiable aside from some unrealistic cases, providing a new perspective on the documented biases of popular proxy-based methods. We provide exact characterizations of the sharpest-possible partial identification sets of disparities, either under no assumptions or when we incorporate mild smoothness constraints. We further provide optimization-based algorithms for computing and visualizing these sets, which enable reliable and robust assessments – an important tool when disparity assessment can have far-reaching policy implications. We demonstrate this in two case studies with real data: mortgage lending and personalized medicine dosing.
Tasks
Published 2019-06-01
URL https://arxiv.org/abs/1906.00285v1
PDF https://arxiv.org/pdf/1906.00285v1.pdf
PWC https://paperswithcode.com/paper/190600285
Repo
Framework