January 26, 2020

Paper Group ANR 1449

A Personalized Affective Memory Neural Model for Improving Emotion Recognition. Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes. RNN-T For Latency Controlled ASR With Improved Beam Search. Estimating the Density of States of Boolean Satisfiability Problems on Classi …

A Personalized Affective Memory Neural Model for Improving Emotion Recognition


Title	A Personalized Affective Memory Neural Model for Improving Emotion Recognition
Authors	Pablo Barros, German I. Parisi, Stefan Wermter
Abstract	Recent models of emotion recognition strongly rely on supervised deep learning solutions for the distinction of general emotion expressions. However, they are not reliable when recognizing online and personalized facial expressions, e.g., for person-specific affective understanding. In this paper, we present a neural model based on a conditional adversarial autoencoder to learn how to represent and edit general emotion expressions. We then propose Grow-When-Required networks as personalized affective memories to learn individualized aspects of emotion expressions. Our model achieves state-of-the-art performance on emotion recognition when evaluated on in-the-wild datasets. Furthermore, our experiments include ablation studies and neural visualizations in order to explain the behavior of our model.
Tasks	Emotion Recognition
Published	2019-04-23
URL	http://arxiv.org/abs/1904.12632v1
PDF	http://arxiv.org/pdf/1904.12632v1.pdf
PWC	https://paperswithcode.com/paper/190412632
Repo
Framework

Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes


Title	Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes
Authors	Yujia Bao, Zhengyi Deng, Yan Wang, Heeyoon Kim, Victor Diego Armengol, Francisco Acevedo, Nofal Ouardaoui, Cathy Wang, Giovanni Parmigiani, Regina Barzilay, Danielle Braun, Kevin S Hughes
Abstract	PURPOSE: The medical literature relevant to germline genetics is growing exponentially. Clinicians need tools for monitoring and prioritizing the literature to understand the clinical implications of pathogenic genetic variants. We developed and evaluated two machine learning models to classify abstracts as relevant to the penetrance (risk of cancer for germline mutation carriers) or prevalence of germline genetic mutations. METHODS: We conducted literature searches in PubMed and retrieved paper titles and abstracts to create an annotated dataset for training and evaluating the two machine learning classification models. Our first model is a support vector machine (SVM) which learns a linear decision rule based on the bag-of-ngrams representation of each title and abstract. Our second model is a convolutional neural network (CNN) which learns a complex nonlinear decision rule based on the raw title and abstract. We evaluated the performance of the two models on the classification of papers as relevant to penetrance or prevalence. RESULTS: For penetrance classification, we annotated 3740 paper titles and abstracts and used 60% for training the model, 20% for tuning the model, and 20% for evaluating the model. The SVM model achieves 89.53% accuracy (percentage of papers that were correctly classified) while the CNN model achieves 88.95% accuracy. For prevalence classification, we annotated 3753 paper titles and abstracts. The SVM model achieves 89.14% accuracy while the CNN model achieves 89.13% accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts as relevant to penetrance or prevalence. By facilitating literature review, this tool could help clinicians and researchers keep abreast of the burgeoning knowledge of gene-cancer associations and keep the knowledge bases for clinical decision support tools up to date.
Tasks
Published	2019-04-24
URL	http://arxiv.org/abs/1904.12617v1
PDF	http://arxiv.org/pdf/1904.12617v1.pdf
PWC	https://paperswithcode.com/paper/190412617
Repo
Framework
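
The bag-of-ngrams representation and linear decision rule mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the whitespace tokenizer, the function names `bag_of_ngrams` and `linear_score`, and the example weights are all hypothetical, and a real SVM would learn its weights from the annotated abstracts.

```python
from collections import Counter


def bag_of_ngrams(text, max_n=2):
    """Count all word n-grams of length 1..max_n (naive whitespace tokenizer)."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def linear_score(features, weights, bias=0.0):
    """Linear decision rule: a positive score means 'relevant'."""
    return bias + sum(weights.get(gram, 0.0) * count
                      for gram, count in features.items())


# Hypothetical hand-picked weights, for illustration only (not learned).
weights = {"mutation carriers": 1.5, "brca1": 0.5}
features = bag_of_ngrams("BRCA1 mutation carriers")
score = linear_score(features, weights, bias=-1.0)
```

With these made-up weights the example title scores above zero, so the rule would label it relevant; the point is only that the classifier is a weighted sum over n-gram counts.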

RNN-T For Latency Controlled ASR With Improved Beam Search


Title	RNN-T For Latency Controlled ASR With Improved Beam Search
Authors	Mahaveer Jain, Kjell Schubert, Jay Mahadeokar, Ching-Feng Yeh, Kaustubh Kalgaonkar, Anuroop Sriram, Christian Fuegen, Michael L. Seltzer
Abstract	Neural transducer-based systems such as RNN Transducers (RNN-T) for automatic speech recognition (ASR) blend the individual components of a traditional hybrid ASR system (acoustic model, language model, punctuation model, inverse text normalization) into one single model. This greatly simplifies training and inference and hence makes RNN-T a desirable choice for ASR systems. In this work, we investigate the use of RNN-T in applications that require a tunable latency budget during inference time. We also improved the decoding speed of the originally proposed RNN-T beam search algorithm. We evaluated our proposed system on an English video ASR dataset and show that neural RNN-T models can achieve comparable WER and better computational efficiency compared to a well tuned hybrid ASR baseline.
Tasks	Language Modelling, Speech Recognition
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01629v2
PDF	https://arxiv.org/pdf/1911.01629v2.pdf
PWC	https://paperswithcode.com/paper/rnn-t-for-latency-controlled-asr-with
Repo
Framework

Estimating the Density of States of Boolean Satisfiability Problems on Classical and Quantum Computing Platforms


Title	Estimating the Density of States of Boolean Satisfiability Problems on Classical and Quantum Computing Platforms
Authors	Tuhin Sahai, Anurag Mishra, Jose Miguel Pasini, Susmit Jha
Abstract	Given a Boolean formula $\phi(x)$ in conjunctive normal form (CNF), the density of states counts the number of variable assignments that violate exactly $e$ clauses, for all values of $e$. Thus, the density of states is a histogram of the number of unsatisfied clauses over all possible assignments. This computation generalizes both maximum-satisfiability (MAX-SAT) and model counting problems and not only provides insight into the entire solution space, but also yields a measure for the hardness of the problem instance. Consequently, in real-world scenarios, this problem is typically infeasible even when using state-of-the-art algorithms. While finding an exact answer to this problem is a computationally intensive task, we propose a novel approach for estimating density of states based on the concentration of measure inequalities. The methodology results in a quadratic unconstrained binary optimization (QUBO), which is particularly amenable to quantum annealing-based solutions. We present the overall approach and compare results from the D-Wave quantum annealer against the best-known classical algorithms such as the Hamze-de Freitas-Selby (HFS) algorithm and satisfiability modulo theory (SMT) solvers.
Tasks
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13088v1
PDF	https://arxiv.org/pdf/1910.13088v1.pdf
PWC	https://paperswithcode.com/paper/estimating-the-density-of-states-of-boolean
Repo
Framework
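
To make the definition of the density of states in the abstract above concrete, here is a brute-force sketch that enumerates every assignment of a tiny CNF formula. It is exponential in the number of variables and is not the paper's estimation method; the DIMACS-style clause encoding and the name `density_of_states` are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product


def density_of_states(clauses, num_vars):
    """Return a list g where g[e] is the number of assignments that
    violate exactly e clauses (exact, by exhaustive enumeration).

    Each clause is a list of nonzero ints: k means variable k appears
    positively, -k means it appears negated (DIMACS-style, assumed here).
    """
    histogram = Counter()
    for assignment in product([False, True], repeat=num_vars):
        violated = 0
        for clause in clauses:
            satisfied = any(assignment[abs(lit) - 1] == (lit > 0)
                            for lit in clause)
            if not satisfied:
                violated += 1
        histogram[violated] += 1
    return [histogram[e] for e in range(len(clauses) + 1)]


# (x1 or x2) and (not x1 or x2) and (not x2)
example = [[1, 2], [-1, 2], [-2]]
dos = density_of_states(example, num_vars=2)
```

In the returned histogram, entry 0 is the model count and the smallest index with a nonzero entry is the MAX-SAT optimum, which is how this single computation generalizes both problems.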

Towards Hardware Implementation of Neural Network-based Communication Algorithms


Title	Towards Hardware Implementation of Neural Network-based Communication Algorithms
Authors	Fayçal Ait Aoudia, Jakob Hoydis
Abstract	There is a recent interest in neural network (NN)-based communication algorithms which have been shown to achieve (beyond) state-of-the-art performance for a variety of problems or lead to reduced implementation complexity. However, most work on this topic is simulation based and implementation on specialized hardware for fast inference, such as field-programmable gate arrays (FPGAs), is widely ignored. In particular for practical uses, NN weights should be quantized and inference carried out by a fixed-point instead of a floating-point system, the latter being widely used in consumer class computers and graphics processing units (GPUs). Moving to such representations enables higher inference rates and complexity reductions, at the cost of precision loss. We demonstrate that it is possible to implement NN-based algorithms in fixed-point arithmetic with quantized weights at negligible performance loss and with hardware complexity compatible with practical systems, such as FPGAs and application-specific integrated circuits (ASICs).
Tasks
Published	2019-02-19
URL	http://arxiv.org/abs/1902.06939v1
PDF	http://arxiv.org/pdf/1902.06939v1.pdf
PWC	https://paperswithcode.com/paper/towards-hardware-implementation-of-neural
Repo
Framework
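
As a rough illustration of what inference with quantized weights in fixed-point arithmetic involves, the sketch below quantizes values to a signed fixed-point format and computes a dot product using integers only. The 8-bit word with 6 fractional bits, the saturation behaviour, and the function names are assumptions for this example; the abstract does not state the paper's actual number format.

```python
def to_fixed_point(value, frac_bits=6, total_bits=8):
    """Quantize a float to a signed fixed-point integer, saturating on overflow."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    quantized = int(round(value * scale))
    return max(lowest, min(highest, quantized))


def fixed_point_dot(weights_q, inputs_q, frac_bits=6):
    """Integer-only dot product of two fixed-point vectors.

    Each product carries 2 * frac_bits fractional bits, so the accumulated
    sum is shifted right once to return to frac_bits fractional bits.
    """
    accumulator = sum(w * x for w, x in zip(weights_q, inputs_q))
    return accumulator >> frac_bits


weights_q = [to_fixed_point(w) for w in (0.5, -0.25)]
inputs_q = [to_fixed_point(x) for x in (1.0, 0.5)]
result_q = fixed_point_dot(weights_q, inputs_q)
result = result_q / (1 << 6)  # convert back to float only for inspection
```

Values outside the representable range saturate rather than wrap, which is one common design choice; the precision loss the abstract mentions shows up when inputs are not exact multiples of 2^-6.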

Improving Robustness Without Sacrificing Accuracy with Patch Gaussian Augmentation


Title	Improving Robustness Without Sacrificing Accuracy with Patch Gaussian Augmentation
Authors	Raphael Gontijo Lopes, Dong Yin, Ben Poole, Justin Gilmer, Ekin D. Cubuk
Abstract	Deploying machine learning systems in the real world requires both high accuracy on clean data and robustness to naturally occurring corruptions. While architectural advances have led to improved accuracy, building robust models remains challenging. Prior work has argued that there is an inherent trade-off between robustness and accuracy, which is exemplified by standard data augmentation techniques such as Cutout, which improves clean accuracy but not robustness, and additive Gaussian noise, which improves robustness but hurts accuracy. To overcome this trade-off, we introduce Patch Gaussian, a simple augmentation scheme that adds noise to randomly selected patches in an input image. Models trained with Patch Gaussian achieve state of the art on the CIFAR-10 and ImageNet Common Corruptions benchmarks while also improving accuracy on clean data. We find that this augmentation leads to reduced sensitivity to high frequency noise (similar to Gaussian) while retaining the ability to take advantage of relevant high frequency information in the image (similar to Cutout). Finally, we show that Patch Gaussian can be used in conjunction with other regularization methods and data augmentation policies such as AutoAugment, and improves performance on the COCO object detection benchmark.
Tasks	Data Augmentation, Object Detection
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02611v1
PDF	https://arxiv.org/pdf/1906.02611v1.pdf
PWC	https://paperswithcode.com/paper/improving-robustness-without-sacrificing
Repo
Framework
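
The idea of adding noise to a randomly selected patch, as described in the abstract above, can be sketched in a few lines. This is a simplified single-channel version under several assumptions not stated in the abstract: one square patch per image, a uniformly sampled patch centre, clipping at the image border, and pixel values in [0, 1]. The name `patch_gaussian` and its parameters are illustrative.

```python
import random


def patch_gaussian(image, patch_size, sigma, rng=None):
    """Return a copy of `image` with Gaussian noise added inside one
    randomly placed square patch.

    image: list of rows, each a list of floats in [0, 1] (assumed format).
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])
    # Sample the patch centre uniformly; the patch is clipped at the border.
    center_y = rng.randrange(height)
    center_x = rng.randrange(width)
    half = patch_size // 2
    top, bottom = max(0, center_y - half), min(height, center_y + half + 1)
    left, right = max(0, center_x - half), min(width, center_x + half + 1)
    noisy = [row[:] for row in image]  # leave the input untouched
    for y in range(top, bottom):
        for x in range(left, right):
            value = noisy[y][x] + rng.gauss(0.0, sigma)
            noisy[y][x] = min(1.0, max(0.0, value))  # keep the valid range
    return noisy


clean = [[0.5] * 8 for _ in range(8)]
augmented = patch_gaussian(clean, patch_size=3, sigma=0.1, rng=random.Random(0))
```

Pixels outside the patch are returned unchanged, which is what distinguishes this from additive Gaussian noise over the whole image.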

Making Bayesian Predictive Models Interpretable: A Decision Theoretic Approach


Title	Making Bayesian Predictive Models Interpretable: A Decision Theoretic Approach
Authors	Homayun Afrabandpey, Tomi Peltola, Juho Piironen, Aki Vehtari, Samuel Kaski
Abstract	A salient approach to interpretable machine learning is to restrict modeling to simple and hence understandable models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models. Fundamentally, however, interpretability is about users’ preferences, not the data generation mechanism: it is more natural to formulate interpretability as a utility function. In this work, we propose an interpretability utility, which explicates the trade-off between explanation fidelity and interpretability in the Bayesian framework. The method consists of two steps. First, a reference model, possibly a black-box Bayesian predictive model compromising no accuracy, is constructed and fitted to the training data. Second, a proxy model from an interpretable model family that best mimics the predictive behaviour of the reference model is found by optimizing the interpretability utility function. The approach is model agnostic - neither the interpretable model nor the reference model are restricted to be from a certain class of models - and the optimization problem can be solved using standard tools in the chosen model family. Through experiments on real-world data sets using decision trees as interpretable models and Bayesian additive regression models as reference models, we show that for the same level of interpretability, our approach generates more accurate models than the earlier alternative of restricting the prior. We also propose a systematic way to measure stabilities of interpretable models constructed by different interpretability approaches and show that our proposed approach generates more stable models.
Tasks	Interpretable Machine Learning
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09358v1
PDF	https://arxiv.org/pdf/1910.09358v1.pdf
PWC	https://paperswithcode.com/paper/making-bayesian-predictive-models
Repo
Framework

Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks


Title	Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks
Authors	Steffen Eger, Paul Youssef, Iryna Gurevych
Abstract	Activation functions play a crucial role in neural networks because they are the nonlinearities which have been attributed to the success story of deep learning. One of the currently most popular activation functions is ReLU, but several competitors have recently been proposed or ‘discovered’, including LReLU functions and swish. While most works compare newly proposed activation functions on few tasks (usually from image classification) and against few competitors (usually ReLU), we perform the first large-scale comparison of 21 activation functions across eight different NLP tasks. We find that a largely unknown activation function performs most stably across all tasks, the so-called penalized tanh function. We also show that it can successfully replace the sigmoid and tanh gates in LSTM cells, leading to a 2 percentage point (pp) improvement over the standard choices on a challenging NLP task.
Tasks	Image Classification
Published	2019-01-09
URL	http://arxiv.org/abs/1901.02671v1
PDF	http://arxiv.org/pdf/1901.02671v1.pdf
PWC	https://paperswithcode.com/paper/is-it-time-to-swish-comparing-deep-learning
Repo
Framework
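
For readers unfamiliar with the activation functions named in the abstract above, the sketch below gives scalar versions of ReLU, swish, and penalized tanh. The abstract does not define them; the formulas here are the commonly cited ones (swish as x times sigmoid(x), penalized tanh as tanh scaled by 0.25 for non-positive inputs), and the slope value 0.25 in particular should be treated as an assumption.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x):
    return max(0.0, x)


def swish(x):
    # Commonly cited form: x * sigmoid(x).
    return x * sigmoid(x)


def penalized_tanh(x, a=0.25):
    # tanh for positive inputs, a damped tanh otherwise (a = 0.25 is assumed).
    t = math.tanh(x)
    return t if x > 0 else a * t


values = [(x, relu(x), swish(x), penalized_tanh(x)) for x in (-2.0, 0.0, 2.0)]
```

Unlike ReLU, both swish and penalized tanh pass a small negative signal for negative inputs, and penalized tanh is additionally bounded on both sides.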

Phase transition in PCA with missing data: Reduced signal-to-noise ratio, not sample size!


Title	Phase transition in PCA with missing data: Reduced signal-to-noise ratio, not sample size!
Authors	Niels Bruun Ipsen, Lars Kai Hansen
Abstract	How does missing data affect our ability to learn signal structures? It has been shown that learning signal structure in terms of principal components is dependent on the ratio of sample size and dimensionality and that a critical number of observations is needed before learning starts (Biehl and Mietzner, 1993). Here we generalize this analysis to include missing data. Probabilistic principal component analysis is regularly used for estimating signal structures in datasets with missing data. Our analytic result suggests that the effect of missing data is to effectively reduce signal-to-noise ratio rather than - as generally believed - to reduce sample size. The theory predicts a phase transition in the learning curves and this is indeed found both in simulation data and in real datasets.
Tasks
Published	2019-05-02
URL	https://arxiv.org/abs/1905.00709v1
PDF	https://arxiv.org/pdf/1905.00709v1.pdf
PWC	https://paperswithcode.com/paper/phase-transition-in-pca-with-missing-data
Repo
Framework

Distributed Machine Learning through Heterogeneous Edge Systems


Title	Distributed Machine Learning through Heterogeneous Edge Systems
Authors	Hanpeng Hu, Dan Wang, Chuan Wu
Abstract	Many emerging AI applications request distributed machine learning (ML) among edge systems (e.g., IoT devices and PCs at the edge of the Internet), where data cannot be uploaded to a central venue for model training, due to their large volumes and/or security/privacy concerns. Edge devices are intrinsically heterogeneous in computing capacity, posing significant challenges to parameter synchronization for parallel training with the parameter server (PS) architecture. This paper proposes ADSP, a parameter synchronization scheme for distributed machine learning (ML) with heterogeneous edge systems. Eliminating the significant waiting time occurring with existing parameter synchronization models, the core idea of ADSP is to let faster edge devices continue training, while committing their model updates at strategically decided intervals. We design algorithms that decide time points for each worker to commit its model update, and ensure not only global model convergence but also faster convergence. Our testbed implementation and experiments show that ADSP outperforms existing parameter synchronization models significantly in terms of ML model convergence time, scalability and adaptability to large heterogeneity.
Tasks
Published	2019-11-16
URL	https://arxiv.org/abs/1911.06949v1
PDF	https://arxiv.org/pdf/1911.06949v1.pdf
PWC	https://paperswithcode.com/paper/distributed-machine-learning-through
Repo
Framework

On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator


Title	On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator
Authors	Qi Cai, Mingyi Hong, Yongxin Chen, Zhaoran Wang
Abstract	We study the global convergence of generative adversarial imitation learning for linear quadratic regulators, which is posed as minimax optimization. To address the challenges arising from non-convex-concave geometry, we analyze the alternating gradient algorithm and establish its Q-linear rate of convergence to a unique saddle point, which simultaneously recovers the globally optimal policy and reward function. We hope our results may serve as a small step towards understanding and taming the instability in imitation learning as well as in more general non-convex-concave alternating minimax optimization that arises from reinforcement learning and generative adversarial learning.
Tasks	Imitation Learning
Published	2019-01-11
URL	http://arxiv.org/abs/1901.03674v1
PDF	http://arxiv.org/pdf/1901.03674v1.pdf
PWC	https://paperswithcode.com/paper/on-the-global-convergence-of-imitation
Repo
Framework

Coordination of PV Smart Inverters Using Deep Reinforcement Learning for Grid Voltage Regulation


Title	Coordination of PV Smart Inverters Using Deep Reinforcement Learning for Grid Voltage Regulation
Authors	Changfu Li, Chenrui Jin, Ratnesh Sharma
Abstract	Increasing adoption of solar photovoltaic (PV) presents new challenges to the modern power grid due to its variable and intermittent nature. Fluctuating outputs from PV generation can cause the grid to violate voltage operation limits. PV smart inverters (SIs) provide a fast-response method to regulate voltage by modulating real and/or reactive power at the connection point. Yet the existing local autonomous control scheme of SIs is based on local information without coordination, which can lead to suboptimal performance. In this paper, a deep reinforcement learning (DRL) based algorithm is developed and implemented for coordinating multiple SIs. The reward scheme of the DRL is carefully designed to ensure voltage operation limits of the grid are met with more effective utilization of SI reactive power. The proposed DRL agent for voltage control can learn its policy through interaction with massive offline simulations, and adapts to load and solar variations. The performance of the DRL agent is compared against the local autonomous control on the IEEE 37 node system with thousands of scenarios. The results show a properly trained DRL agent can intelligently coordinate different SIs for maintaining grid voltage within allowable ranges, achieving reduction of PV production curtailment, and decreasing system losses.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.05907v1
PDF	https://arxiv.org/pdf/1910.05907v1.pdf
PWC	https://paperswithcode.com/paper/coordination-of-pv-smart-inverters-using-deep
Repo
Framework

End-to-End Model for Speech Enhancement by Consistent Spectrogram Masking


Title	End-to-End Model for Speech Enhancement by Consistent Spectrogram Masking
Authors	Xingjian Du, Mengyao Zhu, Xuan Shi, Xinpeng Zhang, Wen Zhang, Jingdong Chen
Abstract	Recently, phase processing is attracting increasing interest in the speech enhancement community. Some researchers integrate a phase estimation module into speech enhancement models by using complex-valued short-time Fourier transform (STFT) spectrogram based training targets, e.g. Complex Ratio Mask (cRM) [1]. However, masking on a spectrogram would violate its consistency constraints. In this work, we prove that the inconsistency problem enlarges the solution space of the speech enhancement model and causes unintended artifacts. Consistency Spectrogram Masking (CSM) is proposed to estimate the complex spectrogram of a signal with the consistency constraint in a simple but not trivial way. The experiments comparing our CSM based end-to-end model with other methods are conducted to confirm that the CSM accelerates the model training and yields significant improvements in speech quality. From our experimental results, we assured that our method could enha
Tasks	Speech Enhancement
Published	2019-01-02
URL	http://arxiv.org/abs/1901.00295v1
PDF	http://arxiv.org/pdf/1901.00295v1.pdf
PWC	https://paperswithcode.com/paper/end-to-end-model-for-speech-enhancement-by
Repo
Framework

Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data


Title	Investigating Correlations of Inter-coder Agreement and Machine Annotation Performance for Historical Video Data
Authors	Kader Pustu-Iren, Markus Mühling, Nikolaus Korfhage, Joanna Bars, Sabrina Bernhöft, Angelika Hörth, Bernd Freisleben, Ralph Ewerth
Abstract	Video indexing approaches such as visual concept classification and person recognition are essential to enable fine-grained semantic search in large-scale video archives such as the historical video collection of the former German Democratic Republic (GDR) maintained by the German Broadcasting Archive (DRA). Typically, a lexicon of visual concepts has to be defined for semantic search. However, the definition of visual concepts can be more or less subjective due to individually differing judgments of annotators, which may have an impact on annotation quality and subsequently training of supervised machine learning methods. In this paper, we analyze the inter-coder agreement for historical TV data of the former GDR for visual concept classification and person recognition. The inter-coder agreement is evaluated for a group of expert as well as non-expert annotators in order to determine differences in annotation homogeneity. Furthermore, correlations between visual recognition performance and inter-annotator agreement are measured. In this context, information about image quantity and agreement is used to predict average precision for concept classification. Finally, the expert and non-expert annotations acquired in the study are used to evaluate their influence on person recognition.
Tasks	Person Recognition
Published	2019-07-24
URL	https://arxiv.org/abs/1907.10450v1
PDF	https://arxiv.org/pdf/1907.10450v1.pdf
PWC	https://paperswithcode.com/paper/investigating-correlations-of-inter-coder
Repo
Framework

Nonnegative Matrix Factorization with Local Similarity Learning


Title	Nonnegative Matrix Factorization with Local Similarity Learning
Authors	Chong Peng, Zhao Kang, Chenglizhao Chen, Qiang Cheng
Abstract	Existing nonnegative matrix factorization methods focus on learning global structure of the data to construct basis and coefficient matrices, which ignores the local structure that commonly exists among data. In this paper, we propose a new type of nonnegative matrix factorization method, which learns local similarity and clustering in a mutually enhancing way. The learned new representation is more representative in that it better reveals inherent geometric property of the data. Nonlinear expansion is given and efficient multiplicative updates are developed with theoretical convergence guarantees. Extensive experimental results have confirmed the effectiveness of the proposed model.
Tasks
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04150v1
PDF	https://arxiv.org/pdf/1907.04150v1.pdf
PWC	https://paperswithcode.com/paper/nonnegative-matrix-factorization-with-local
Repo
Framework