January 29, 2020

2978 words 14 mins read

Paper Group ANR 584

GmCN: Graph Mask Convolutional Network. MCTS-based Automated Negotiation Agent (Extended Abstract). Asymmetric Correntropy for Robust Adaptive Filtering. Neural Network Inference on Mobile SoCs. Cost-Sensitive Training for Autoregressive Models. $α$ Belief Propagation as Fully Factorized Approximation. Implicit Langevin Algorithms for Sampling From …

GmCN: Graph Mask Convolutional Network

Title GmCN: Graph Mask Convolutional Network
Authors Bo Jiang, Beibei Wang, Jin Tang, Bin Luo
Abstract Graph Convolutional Networks (GCNs) have proven very powerful for graph data representation and learning tasks. Existing GCNs usually conduct feature aggregation on a fixed neighborhood graph, in which each node computes its representation by aggregating the feature representations of all its neighbors, which is biased by its own representation. However, this fixed aggregation strategy is not guaranteed to be optimal for GCN-based graph learning and can also be affected by graph structure noise, such as incorrect or undesired edge connections. To address these issues, we propose a novel Graph mask Convolutional Network (GmCN) in which nodes can adaptively select the optimal neighbors for their feature aggregation to better serve GCN learning. GmCN can be interpreted theoretically through a regularization framework, based on which we derive a simple update algorithm that determines the optimal mask adaptively during GmCN training. Experiments on several datasets validate the effectiveness of GmCN.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1910.01735v2
PDF https://arxiv.org/pdf/1910.01735v2.pdf
PWC https://paperswithcode.com/paper/graph-mask-convolutional-network
Repo
Framework
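
The masked aggregation GmCN describes can be illustrated with a minimal sketch: a standard GCN layer aggregates over a fixed normalized adjacency, while an element-wise mask gates which neighbors contribute. This is an assumption-laden toy: the mask M is supplied by hand, whereas the paper's contribution is an update algorithm that learns it, and all names and shapes below are illustrative, not the authors' code.

```python
import numpy as np

def gcn_layer(A, X, W):
    """Vanilla GCN aggregation over a symmetrically normalized adjacency."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)  # ReLU

def masked_gcn_layer(A, M, X, W):
    """GmCN-style aggregation (sketch): a mask M in [0, 1]^{n x n} gates
    each edge, so a node can drop noisy or undesired neighbors. How M is
    optimized is the paper's contribution; here it is simply given."""
    return gcn_layer(A * M, X, W)                  # element-wise edge selection

rng = np.random.default_rng(0)
A = (rng.random((5, 5)) < 0.4).astype(float)       # toy graph
A = np.maximum(A, A.T)                             # make it undirected
X = rng.standard_normal((5, 8))                    # node features
W = rng.standard_normal((8, 4))
M = (rng.random((5, 5)) < 0.8).astype(float)       # hypothetical mask
print(masked_gcn_layer(A, M, X, W).shape)          # (5, 4)
```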

MCTS-based Automated Negotiation Agent (Extended Abstract)

Title MCTS-based Automated Negotiation Agent (Extended Abstract)
Authors Cédric Buron, Zahia Guessoum, Sylvain Ductor
Abstract This paper introduces a new negotiating agent for automated negotiation on continuous domains without a specified deadline. The agent's bidding strategy relies on Monte Carlo Tree Search, a method that has been used with success on games with a high branching factor, such as Go. It uses two opponent-modeling techniques for its bidding strategy and its utility: Gaussian process regression and Bayesian learning. Evaluation is done by confronting our agent with the existing agents able to negotiate in this context: Random Walker, Tit-for-Tat, and Nice Tit-for-Tat. None of these agents succeeds in beating our agent; moreover, the modular and adaptive nature of our approach is a major advantage when it comes to optimizing it for specific application contexts.
Tasks
Published 2019-03-29
URL http://arxiv.org/abs/1903.12411v1
PDF http://arxiv.org/pdf/1903.12411v1.pdf
PWC https://paperswithcode.com/paper/mcts-based-automated-negotiation-agent
Repo
Framework
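
As a rough illustration of the MCTS machinery the agent builds on, here is a generic UCT selection and backpropagation skeleton. It is not the paper's agent: the opponent models (Gaussian process regression, Bayesian learning) and the continuous-bid expansion step are omitted, and `Node`, `uct_select`, and `backpropagate` are illustrative names.

```python
import math

class Node:
    def __init__(self, bid, parent=None):
        self.bid = bid            # an offer in the continuous domain
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0          # accumulated utility estimate

def uct_select(node, c=1.4):
    """Pick the child maximizing the UCT score: exploit mean utility,
    explore rarely visited offers."""
    return max(
        node.children,
        key=lambda ch: ch.value / (ch.visits + 1e-9)
        + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)),
    )

def backpropagate(node, reward):
    """Propagate a simulated negotiation outcome up the tree."""
    while node is not None:
        node.visits += 1
        node.value += reward
        node = node.parent

root = Node(bid=None)
root.children = [Node(bid=0.4, parent=root), Node(bid=0.7, parent=root)]
backpropagate(root.children[0], reward=0.8)   # one simulated playout
print(uct_select(root).bid)                   # unexplored child is picked
```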

Asymmetric Correntropy for Robust Adaptive Filtering

Title Asymmetric Correntropy for Robust Adaptive Filtering
Authors Badong Chen, Zhuang Li, Yingsong Li, Pengju Ren
Abstract In recent years, correntropy has been successfully applied to robust adaptive filtering to eliminate the adverse effects of impulsive noise and outliers. Correntropy is generally defined as the expectation of a Gaussian kernel between two random variables. This definition is reasonable when the error between the two random variables is symmetrically distributed around zero. For the case of an asymmetric error distribution, however, the symmetric Gaussian kernel is inappropriate and cannot adapt well to the error distribution. To address this problem, in this letter we propose a new variant of correntropy, named asymmetric correntropy, which uses an asymmetric Gaussian model as the kernel function. In addition, a robust adaptive filtering algorithm based on asymmetric correntropy is developed and its steady-state convergence performance is analyzed. Simulations are provided to confirm the theoretical results and the good performance of the proposed algorithm.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.11855v1
PDF https://arxiv.org/pdf/1911.11855v1.pdf
PWC https://paperswithcode.com/paper/asymmetric-correntropy-for-robust-adaptive
Repo
Framework
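
The definitions involved are simple enough to sketch. Correntropy is estimated as the sample mean of a Gaussian kernel of the error; the asymmetric variant below uses a different bandwidth on each side of zero, which is one plausible reading of the abstract's asymmetric Gaussian model (the paper's exact parameterization may differ). The filter update is the standard MCC-style stochastic-gradient step, not necessarily the algorithm analyzed in the letter.

```python
import numpy as np

def gaussian_kernel(e, sigma):
    """Gaussian kernel of the error; large errors map to values near zero."""
    return np.exp(-np.square(e) / (2.0 * sigma ** 2))

def asymmetric_correntropy(e, sigma_pos, sigma_neg):
    """Sample estimate of correntropy with an asymmetric Gaussian kernel:
    a different bandwidth for positive and negative errors (a sketch)."""
    k = np.where(e >= 0, gaussian_kernel(e, sigma_pos),
                 gaussian_kernel(e, sigma_neg))
    return k.mean()

def filter_step(w, x, d, mu, sigma_pos, sigma_neg):
    """One stochastic-gradient update of an adaptive filter ascending the
    (asymmetric) correntropy of the error e = d - w.x; the kernel factor
    down-weights impulsive outliers."""
    e = d - w @ x
    sigma = sigma_pos if e >= 0 else sigma_neg
    return w + mu * gaussian_kernel(e, sigma) * e * x

rng = np.random.default_rng(1)
e = rng.normal(0.5, 1.0, 1000)                 # skewed (nonzero-mean) errors
print(asymmetric_correntropy(e, sigma_pos=2.0, sigma_neg=1.0))
```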

Neural Network Inference on Mobile SoCs

Title Neural Network Inference on Mobile SoCs
Authors Siqi Wang, Anuj Pathania, Tulika Mitra
Abstract The ever-increasing demand from mobile Machine Learning (ML) applications calls for ever more powerful on-chip computing resources. Mobile devices are empowered with heterogeneous multi-processor Systems-on-Chips (SoCs) to process ML workloads such as Convolutional Neural Network (CNN) inference. Mobile SoCs house several different types of ML-capable components on-die, such as CPUs, GPUs, and accelerators. These components can each perform inference independently, but with very different power-performance characteristics. In this article, we provide a quantitative evaluation of the inference capabilities of the different components on mobile SoCs. We also present insights into their respective power-performance behavior. Finally, we explore the performance limit of mobile SoCs by synergistically engaging all the components concurrently. We observe that a mobile SoC provides up to a 2x improvement with parallel inference when all its components are engaged, as opposed to engaging only one component.
Tasks
Published 2019-08-24
URL https://arxiv.org/abs/1908.11450v2
PDF https://arxiv.org/pdf/1908.11450v2.pdf
PWC https://paperswithcode.com/paper/neural-network-inference-on-mobile-socs
Repo
Framework
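
The article is an empirical study rather than an algorithm, but the core idea of engaging all SoC components concurrently can be sketched: partition a stream of inference requests across components in proportion to their measured throughput. The component names, latencies, and the `run_on` stub below are placeholders, not measurements from the article.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Hypothetical per-image latencies (seconds) for each SoC component;
# real numbers would come from profiling, as in the article's evaluation.
LATENCY = {"big-CPU": 0.040, "LITTLE-CPU": 0.090, "GPU": 0.025}

def run_on(component, images):
    """Stand-in for running CNN inference on one component."""
    time.sleep(LATENCY[component] * len(images))
    return component, len(images)

def parallel_inference(images, shares):
    """Split a batch across components proportionally to their speed."""
    futures, start = [], 0
    with ThreadPoolExecutor() as pool:
        for comp, share in shares.items():
            n = int(share * len(images))
            futures.append(pool.submit(run_on, comp, images[start:start + n]))
            start += n
        return [f.result() for f in futures]

print(parallel_inference(list(range(100)),
                         {"big-CPU": 0.3, "LITTLE-CPU": 0.1, "GPU": 0.6}))
```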

Cost-Sensitive Training for Autoregressive Models

Title Cost-Sensitive Training for Autoregressive Models
Authors Irina Saparina, Anton Osokin
Abstract Training autoregressive models to better predict under the test metric, instead of maximizing the likelihood, has been reported to be beneficial in several use cases but brings additional complications that prevent wider adoption. In this paper, we follow the learning-to-search approach (Daumé III et al., 2009; Leblond et al., 2018) and investigate several of its components. First, we propose a way to construct a reference policy based on an alignment between the model output and the ground truth. Our reference policy is optimal when applied to the Kendall-tau distance between permutations (which appears in the task of word ordering) and helps when working with the METEOR score for machine translation. Second, we observe that the learning-to-search approach benefits from choosing costs related to the test metrics. Finally, we study the effect of different learning objectives and find that the standard KL loss only learns several high-probability tokens and can be replaced with ranking objectives that target these tokens explicitly.
Tasks Machine Translation
Published 2019-12-08
URL https://arxiv.org/abs/1912.03771v1
PDF https://arxiv.org/pdf/1912.03771v1.pdf
PWC https://paperswithcode.com/paper/cost-sensitive-training-for-autoregressive
Repo
Framework
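
For concreteness, here is a worked example of the Kendall-tau distance, the permutation metric for which the proposed alignment-based reference policy is optimal (the reference policy itself is not reproduced here):

```python
from itertools import combinations

def kendall_tau_distance(perm_a, perm_b):
    """Number of item pairs ordered differently by the two permutations,
    i.e. the count of discordant pairs."""
    pos_a = {x: i for i, x in enumerate(perm_a)}
    pos_b = {x: i for i, x in enumerate(perm_b)}
    return sum(
        1
        for x, y in combinations(perm_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )

reference = ["the", "cat", "sat", "down"]
hypothesis = ["cat", "the", "down", "sat"]
print(kendall_tau_distance(reference, hypothesis))  # 2 discordant pairs
```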

$α$ Belief Propagation as Fully Factorized Approximation

Title $α$ Belief Propagation as Fully Factorized Approximation
Authors Dong Liu, Nima N. Moghadam, Lars K. Rasmussen, Jinliang Huang, Saikat Chatterjee
Abstract Belief propagation (BP) can perform exact inference in loop-free graphs, but its performance can be poor in graphs with loops, and the understanding of its solution is limited. This work gives an interpretable belief propagation rule that is actually the minimization of a localized $\alpha$-divergence. We term this algorithm $\alpha$ belief propagation ($\alpha$-BP). The performance of $\alpha$-BP is tested on MAP (maximum a posteriori) inference problems, where $\alpha$-BP can outperform (loopy) BP by a significant margin even in fully connected graphs.
Tasks
Published 2019-08-23
URL https://arxiv.org/abs/1908.08906v1
PDF https://arxiv.org/pdf/1908.08906v1.pdf
PWC https://paperswithcode.com/paper/belief-propagation-as-fully-factorized
Repo
Framework
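
For reference, a common form of the $\alpha$-divergence family (Minka's convention) on which such a localized minimization builds is shown below; the limits $\alpha \to 1$ and $\alpha \to 0$ recover the two directions of the KL divergence. The exact localized objective is the paper's contribution and is not restated here.

```latex
D_\alpha(p \,\|\, q) = \frac{1}{\alpha(1-\alpha)}
  \int \Big( \alpha\, p(x) + (1-\alpha)\, q(x)
             - p(x)^{\alpha}\, q(x)^{1-\alpha} \Big)\, dx
```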

Implicit Langevin Algorithms for Sampling From Log-concave Densities

Title Implicit Langevin Algorithms for Sampling From Log-concave Densities
Authors Liam Hodgkinson, Robert Salomone, Fred Roosta
Abstract For sampling from a log-concave density, we study implicit integrators resulting from $\theta$-method discretization of the overdamped Langevin diffusion stochastic differential equation. Theoretical and algorithmic properties of the resulting sampling methods for $ \theta \in [0,1] $ and a range of step sizes are established. Our results generalize and extend prior works in several directions. In particular, for $\theta\ge1/2$, we prove geometric ergodicity and stability of the resulting methods for all step sizes. We show that obtaining subsequent samples amounts to solving a strongly convex optimization problem, which is readily achievable using one of numerous existing methods. Numerical examples supporting our theoretical analysis are also presented.
Tasks
Published 2019-03-29
URL http://arxiv.org/abs/1903.12322v1
PDF http://arxiv.org/pdf/1903.12322v1.pdf
PWC https://paperswithcode.com/paper/implicit-langevin-algorithms-for-sampling
Repo
Framework
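
A minimal sketch of the $\theta$-method step makes the "implicit step = strongly convex problem" observation concrete. Assuming the target density is $\propto e^{-f}$ with convex $f$, the step $x_{k+1} = x_k - h[(1-\theta)\nabla f(x_k) + \theta \nabla f(x_{k+1})] + \sqrt{2h}\,\xi_k$ is solved by minimizing a strongly convex objective; the generic L-BFGS call below stands in for whichever solver one prefers.

```python
import numpy as np
from scipy.optimize import minimize

def theta_langevin_step(x, f, grad_f, h, theta, rng):
    """One theta-method step for the overdamped Langevin SDE (sketch).
    For theta > 0 the update is implicit: x_{k+1} solves
    theta*h*grad_f(y) + y - c = 0, the stationarity condition of the
    strongly convex objective minimized below."""
    xi = rng.standard_normal(x.shape)
    c = x - (1.0 - theta) * h * grad_f(x) + np.sqrt(2.0 * h) * xi
    obj = lambda y: theta * f(y) + 0.5 * np.dot(y - c, y - c) / h
    jac = lambda y: theta * grad_f(y) + (y - c) / h
    return minimize(obj, x, jac=jac, method="L-BFGS-B").x

# Sample from a standard Gaussian: f(x) = ||x||^2 / 2 is log-concave.
f = lambda x: 0.5 * np.dot(x, x)
grad_f = lambda x: x
rng = np.random.default_rng(0)
x, samples = np.zeros(2), []
for _ in range(1000):
    x = theta_langevin_step(x, f, grad_f, h=0.5, theta=1.0, rng=rng)
    samples.append(x.copy())
print(np.mean(samples, axis=0), np.var(samples, axis=0))  # roughly 0 and 1
```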

Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization

Title Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization
Authors Jihun Yun, Jung Hyun Lee, Sung Ju Hwang, Eunho Yang
Abstract Neural network quantization, which aims to reduce the bit-lengths of network weights and activations, is one of the key ingredients in reducing the size of neural networks for deployment to resource-limited devices. However, compressing to low bit-lengths may incur a large loss of information, and preserving the performance of full-precision networks under these settings is extremely challenging, even with state-of-the-art quantization approaches. To tackle this problem of low-bit quantization, we propose a novel Semi-Relaxed Quantization (SRQ) that can effectively reduce the quantization error, along with a new regularization technique, DropBits, which replaces dropout regularization by randomly dropping bits instead of neurons to minimize information loss while improving generalization on low-bit networks. Moreover, we show the possibility of learning heterogeneous quantization levels, finding proper bit-lengths for each layer using DropBits. We experimentally validate our method on various benchmark datasets and network architectures; the results show that our method largely outperforms recent quantization approaches. To the best of our knowledge, we are the first to obtain competitive performance on 3-bit quantization of ResNet-18 on the ImageNet dataset with both weights and activations quantized across all layers. Last but not least, we show promising results on heterogeneous quantization, which we believe will open the door to new research directions in neural network quantization.
Tasks Quantization
Published 2019-11-29
URL https://arxiv.org/abs/1911.12990v1
PDF https://arxiv.org/pdf/1911.12990v1.pdf
PWC https://paperswithcode.com/paper/semi-relaxed-quantization-with-dropbits
Repo
Framework
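
Since the abstract does not spell out SRQ or DropBits in detail, the snippet below is only a loose illustration of the stated idea of "randomly dropping bits instead of neurons": each training-time forward pass quantizes weights with a stochastically reduced bit-length. The uniform quantizer, the drop distribution, and all names are assumptions, not the paper's algorithm.

```python
import numpy as np

def quantize(x, bits, x_max=1.0):
    """Uniform symmetric quantizer to the given bit-length."""
    levels = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / x_max * levels), -levels, levels) / levels * x_max

def dropbits_forward(w, max_bits=8, drop_p=0.25, rng=None):
    """DropBits-like regularization (sketch): on each forward pass,
    randomly drop some low-order bits, i.e. quantize with a
    stochastically reduced bit-length."""
    rng = rng or np.random.default_rng()
    dropped = rng.binomial(max_bits - 2, drop_p)   # always keep >= 2 bits
    return quantize(w, max_bits - dropped)

rng = np.random.default_rng(0)
w = rng.uniform(-1, 1, 5)
print(dropbits_forward(w, rng=rng))
```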

A Music Classification Model based on Metric Learning and Feature Extraction from MP3 Audio Files

Title A Music Classification Model based on Metric Learning and Feature Extraction from MP3 Audio Files
Authors Angelo C. Mendes da Silva, Mauricio A. Nunes, Raul Fonseca Neto
Abstract The development of models for learning music similarity and for feature extraction from audio media files is an increasingly important task for the entertainment industry. This work proposes a novel music classification model based on metric learning and feature extraction from MP3 audio files. The metric learning process considers the learning of a set of parameterized distances, employing a structured prediction approach, from a set of MP3 audio files containing several music genres. The main objective of this work is to make it possible to learn a personalized metric for each customer. To extract the acoustic information we use Mel-Frequency Cepstral Coefficients (MFCC) and perform dimensionality reduction using Principal Component Analysis. We attest the model's validity by performing a set of experiments and comparing the training and testing results with baseline algorithms, such as K-means and the soft-margin linear Support Vector Machine (SVM). Experiments show promising results and encourage the future development of an online version of the learning model.
Tasks Dimensionality Reduction, Metric Learning, Music Classification, Structured Prediction
Published 2019-05-30
URL https://arxiv.org/abs/1905.12804v2
PDF https://arxiv.org/pdf/1905.12804v2.pdf
PWC https://paperswithcode.com/paper/a-music-classification-model-based-on-metric
Repo
Framework
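
The extraction-plus-baseline pipeline described above is straightforward to sketch with standard tools (librosa and scikit-learn): pool MFCCs over time, reduce dimensionality with PCA, and fit a soft-margin linear SVM baseline. The file names, labels, pooling choice, and component count are hypothetical, and the paper's structured-prediction metric learning itself is not reproduced.

```python
import numpy as np
import librosa
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

def song_features(path, n_mfcc=20):
    """MFCC-based track descriptor: mean and std of each coefficient over
    time (a common pooling choice; the paper's pooling may differ)."""
    y, sr = librosa.load(path, mono=True)          # decodes MP3 via audioread
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical file list and genre labels.
paths = ["rock_01.mp3", "jazz_01.mp3", "rock_02.mp3", "jazz_02.mp3"]
labels = ["rock", "jazz", "rock", "jazz"]

X = np.stack([song_features(p) for p in paths])
X = PCA(n_components=3).fit_transform(X)           # dimensionality reduction
clf = LinearSVC().fit(X, labels)                   # soft-margin linear SVM
print(clf.predict(X))
```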

HARK Side of Deep Learning – From Grad Student Descent to Automated Machine Learning

Title HARK Side of Deep Learning – From Grad Student Descent to Automated Machine Learning
Authors Oguzhan Gencoglu, Mark van Gils, Esin Guldogan, Chamin Morikawa, Mehmet Süzen, Mathias Gruber, Jussi Leinonen, Heikki Huttunen
Abstract Recent advancements in machine learning research, i.e., deep learning, have introduced methods that surpass conventional algorithms as well as humans in several complex tasks, ranging from detection of objects in images and speech recognition to playing difficult strategic games. However, the current methodology of machine learning research, and consequently implementations of real-world applications of such algorithms, seems to have a recurring HARKing (Hypothesizing After the Results are Known) issue. In this work, we elaborate on the algorithmic, economic, and social reasons for and consequences of this phenomenon. We present examples from current common practices of conducting machine learning research (e.g., avoidance of reporting negative results) and failures of the proposed algorithms and datasets to generalize in actual real-life usage. Furthermore, a potential future trajectory of machine learning research and development from the perspective of accountable, unbiased, ethical, and privacy-aware algorithmic decision making is discussed. We would like to emphasize that with this discussion we neither claim to provide exhaustive argumentation nor blame any specific institution or individual for the raised issues. This is simply a discussion put forth by us, insiders of the machine learning field, reflecting on ourselves.
Tasks Decision Making, Speech Recognition
Published 2019-04-16
URL http://arxiv.org/abs/1904.07633v1
PDF http://arxiv.org/pdf/1904.07633v1.pdf
PWC https://paperswithcode.com/paper/hark-side-of-deep-learning-from-grad-student
Repo
Framework

Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates

Title Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
Authors Yang Liu, Hongyi Guo
Abstract Learning with noisy labels is a common problem in supervised learning. Existing approaches require practitioners to specify \emph{noise rates}, i.e., a set of parameters controlling the severity of label noise in the problem. The specifications are either assumed to be given or estimated using additional approaches. In this work, we introduce a technique for learning from noisy labels that does not require a priori specification of the noise rates. In particular, we introduce a new family of loss functions that we name \emph{peer loss} functions. Our approach then uses a standard empirical risk minimization (ERM) framework with peer loss functions. Peer loss functions associate each training sample with a certain form of "peer" samples, which evaluate a classifier's predictions jointly. We show that, under mild conditions, performing ERM with peer loss functions on the noisy dataset leads to the optimal or a near-optimal classifier, as if performing ERM over the clean training data, which we do not have access to. To the best of our knowledge, this is the first result on "learning with noisy labels without knowing noise rates" with theoretical guarantees. We pair our results with an extensive set of experiments, in which we compare with state-of-the-art techniques for learning with noisy labels. Our results show that the peer-loss-based method consistently outperforms the baseline benchmarks, as well as some recent new results. Peer loss provides a way to simplify model development when facing potentially noisy training labels, and can be promoted as a robust candidate loss function in such situations.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.03231v2
PDF https://arxiv.org/pdf/1910.03231v2.pdf
PWC https://paperswithcode.com/paper/peer-loss-functions-learning-from-noisy
Repo
Framework
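
The peer-sample construction is concrete enough to sketch: the peer term pairs the prediction on one independently drawn sample with the label of another, and is subtracted from the ordinary loss. The cross-entropy base loss and the toy data below are illustrative choices; the paper's results cover a family of loss functions.

```python
import numpy as np

def cross_entropy(probs, y):
    """Per-sample cross-entropy for predicted class probabilities."""
    return -np.log(probs[np.arange(len(y)), y] + 1e-12)

def peer_loss(probs, y, rng):
    """Peer loss (sketch): the usual loss on (x_n, y_n) minus a loss that
    pairs the prediction for a randomly drawn peer sample with the label
    of another, independently drawn sample."""
    i = rng.permutation(len(y))       # peer predictions
    j = rng.permutation(len(y))       # peer labels, drawn independently
    return (cross_entropy(probs, y) - cross_entropy(probs[i], y[j])).mean()

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=8)   # toy predictions for 8 samples
y = rng.integers(0, 3, size=8)              # (possibly noisy) labels
print(peer_loss(probs, y, rng))
```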

Unsupervised training of neural mask-based beamforming

Title Unsupervised training of neural mask-based beamforming
Authors Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach
Abstract We present an unsupervised training approach for a neural network-based mask estimator in an acoustic beamforming application. The network is trained to maximize a likelihood criterion derived from a spatial mixture model of the observations. It is trained from scratch without requiring any parallel data consisting of degraded input and clean training targets. Thus, training can be carried out on real recordings of noisy speech rather than simulated ones. In contrast to previous work on unsupervised training of neural mask estimators, our approach avoids the need for a possibly pre-trained teacher model entirely. We demonstrate the effectiveness of our approach by speech recognition experiments on two different datasets: one mainly deteriorated by noise (CHiME 4) and one by reverberation (REVERB). The results show that the performance of the proposed system is on par with a supervised system using oracle target masks for training and with a system trained using a model-based teacher.
Tasks Speech Recognition
Published 2019-04-02
URL http://arxiv.org/abs/1904.01578v2
PDF http://arxiv.org/pdf/1904.01578v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-training-of-neural-mask-based
Repo
Framework
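
The training criterion can be sketched as the log-likelihood of the multichannel observations under a spatial mixture model, which requires no clean targets. For simplicity, the sketch uses a zero-mean complex Gaussian mixture with fixed covariances; the paper uses a different spatial model and ties the mixture posteriors to the mask network's outputs.

```python
import numpy as np

def log_complex_gauss(Y, R):
    """Log-density of zero-mean circular complex Gaussians.
    Y: (T, D) observations, R: (D, D) spatial covariance."""
    D = Y.shape[1]
    Ri = np.linalg.inv(R)
    quad = np.einsum("td,de,te->t", Y.conj(), Ri, Y).real
    _, logdet = np.linalg.slogdet(R)
    return -quad - D * np.log(np.pi) - logdet

def mixture_log_likelihood(Y, pis, Rs):
    """Unsupervised training criterion (sketch): log-likelihood of a
    spatial mixture model over time-frequency observations, computed
    with a log-sum-exp over components for numerical stability."""
    comp = np.stack([np.log(pi) + log_complex_gauss(Y, R)
                     for pi, R in zip(pis, Rs)])          # (K, T)
    m = comp.max(axis=0)
    return (m + np.log(np.exp(comp - m).sum(axis=0))).sum()

rng = np.random.default_rng(0)
Y = rng.standard_normal((100, 4)) + 1j * rng.standard_normal((100, 4))
Rs = [np.eye(4) * 2.0, np.eye(4) * 0.5]   # stand-ins for speech / noise
print(mixture_log_likelihood(Y, pis=[0.5, 0.5], Rs=Rs))
```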

Exquisitor: Interactive Learning at Large

Title Exquisitor: Interactive Learning at Large
Authors Björn Þór Jónsson, Omar Shahbaz Khan, Hanna Ragnarsdóttir, Þórhildur Þorleiksdóttir, Jan Zahálka, Stevan Rudinac, Gylfi Þór Guðmundsson, Laurent Amsaleg, Marcel Worring
Abstract Increasing scale is a dominant trend in today’s multimedia collections, which especially impacts interactive applications. To facilitate interactive exploration of large multimedia collections, new approaches are needed that are capable of learning new analytic categories on the fly, based on visual and textual content. To facilitate general use on standard desktops, laptops, and mobile devices, they must furthermore work with limited computing resources. We present Exquisitor, a highly scalable interactive learning approach capable of intelligent exploration of the large-scale YFCC100M image collection with extremely efficient responses from the interactive classifier. Based on relevance feedback from the user on previously suggested items, Exquisitor uses semantic features, extracted from both visual and text attributes, to suggest relevant media items to the user. Exquisitor builds upon the state of the art in large-scale data representation, compression, and indexing, introducing a cluster-based retrieval mechanism that facilitates efficient suggestions. With Exquisitor, each interaction round over the full YFCC100M collection is completed in less than 0.3 seconds using a single CPU core. That is 4x less time using 16x smaller computational resources than the most efficient state-of-the-art method, with a positive impact on result quality. These results open up many interesting research avenues, both for exploration of industry-scale media collections and for media exploration on mobile devices.
Tasks
Published 2019-04-18
URL https://arxiv.org/abs/1904.08689v3
PDF https://arxiv.org/pdf/1904.08689v3.pdf
PWC https://paperswithcode.com/paper/exquisitor-interactive-learning-at-large
Repo
Framework
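
One round of the interactive loop can be sketched with a linear model over the semantic features: fit on the user's positive and negative judgments, then return the top-scoring unjudged items. This sketch scores the whole collection with logistic regression; Exquisitor's actual efficiency comes from its compressed, cluster-based index, which is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def interaction_round(features, pos_idx, neg_idx, k=5):
    """One round of interactive learning (sketch): fit a linear model on
    the user's relevance judgments, then suggest the k unjudged items
    the model scores highest."""
    X = np.concatenate([features[pos_idx], features[neg_idx]])
    y = np.array([1] * len(pos_idx) + [0] * len(neg_idx))
    clf = LogisticRegression().fit(X, y)
    scores = clf.decision_function(features)
    judged = set(pos_idx) | set(neg_idx)
    ranked = [i for i in np.argsort(-scores) if i not in judged]
    return ranked[:k]

rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 16))     # stand-in semantic features
print(interaction_round(features, pos_idx=[1, 2, 3], neg_idx=[10, 11]))
```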

Fast Task-Adaptation for Tasks Labeled Using Natural Language in Reinforcement Learning

Title Fast Task-Adaptation for Tasks Labeled Using Natural Language in Reinforcement Learning
Authors Matthias Hutsebaut-Buysse, Kevin Mets, Steven Latré
Abstract Over its lifetime, a reinforcement learning agent is often faced with many different tasks. How to efficiently adapt a previously learned control policy from one task to another remains an open research question. In this paper, we investigate how instructions formulated in natural language can enable faster and more effective task adaptation. This can serve as the basis for developing language-instructed skills, which can be used in a lifelong learning setting. Our method is capable of assessing, given a set of developed base control policies, which policy will adapt best to a new, unseen task.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.04040v1
PDF https://arxiv.org/pdf/1910.04040v1.pdf
PWC https://paperswithcode.com/paper/fast-task-adaptation-for-tasks-labeled-using
Repo
Framework
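
The abstract does not detail the assessment mechanism, so the following is a purely hypothetical reading: embed the new task's natural-language instruction and adapt from the base policy whose training-task embedding is most similar. All names and the cosine-similarity rule are assumptions for illustration only.

```python
import numpy as np

def pick_base_policy(instruction_vec, policy_task_vecs):
    """Hypothetical selection rule (not from the paper): choose the base
    policy whose training-task embedding is closest, by cosine
    similarity, to the embedding of the new instruction."""
    sims = [
        v @ instruction_vec
        / (np.linalg.norm(v) * np.linalg.norm(instruction_vec))
        for v in policy_task_vecs
    ]
    return int(np.argmax(sims))

rng = np.random.default_rng(0)
policy_task_vecs = [rng.standard_normal(32) for _ in range(4)]  # one per policy
new_task = rng.standard_normal(32)        # embedding of the new instruction
print(pick_base_policy(new_task, policy_task_vecs))
```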

Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text

Title Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text
Authors Kui Xue, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Huanhuan Zhang, Ping He
Abstract Entity and relation extraction is a necessary step in structuring medical text. However, the feature extraction ability of the bidirectional long short-term memory network used in existing models does not achieve the best effect. At the same time, language models have achieved excellent results in more and more natural language processing tasks. In this paper, we present a focused attention model for the joint entity and relation extraction task. Our model integrates the well-known BERT language model into joint learning through a dynamic range attention mechanism, thus improving the feature representation ability of the shared parameter layer. Experimental results on coronary angiography texts collected from Shuguang Hospital show that the F1-scores of the named entity recognition and relation classification tasks reach 96.89% and 88.51%, outperforming state-of-the-art methods by 1.65% and 1.22%, respectively.
Tasks Joint Entity and Relation Extraction, Language Modelling, Named Entity Recognition, Relation Classification, Relation Extraction
Published 2019-08-21
URL https://arxiv.org/abs/1908.07721v2
PDF https://arxiv.org/pdf/1908.07721v2.pdf
PWC https://paperswithcode.com/paper/190807721
Repo
Framework
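
A shared-encoder joint model of the kind described can be sketched with the `transformers` library: one BERT encoder feeds a token-level NER head and a sequence-level relation head, trained jointly (e.g., by summing the two cross-entropy losses). The head shapes and pooling are illustrative choices, and the paper's dynamic range attention mechanism is not reproduced.

```python
import torch.nn as nn
from transformers import BertModel

class JointExtractor(nn.Module):
    """Sketch of a shared-encoder joint model: a single BERT encoder
    feeds a token-level entity-tagging head and a sequence-level
    relation-classification head."""
    def __init__(self, num_entity_tags, num_relations,
                 pretrained="bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        hidden = self.bert.config.hidden_size
        self.ner_head = nn.Linear(hidden, num_entity_tags)
        self.rel_head = nn.Linear(hidden, num_relations)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        ner_logits = self.ner_head(out.last_hidden_state)   # per token
        rel_logits = self.rel_head(out.pooler_output)       # per sequence
        return ner_logits, rel_logits
```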