January 26, 2020


Paper Group ANR 1583


Hybrid Data-Model Parallel Training for Sequence-to-Sequence Recurrent Neural Network Machine Translation


Title	Hybrid Data-Model Parallel Training for Sequence-to-Sequence Recurrent Neural Network Machine Translation
Authors	Junya Ono, Masao Utiyama, Eiichiro Sumita
Abstract	Reduction of training time is an important issue in many tasks like patent translation involving neural networks. Data parallelism and model parallelism are two common approaches for reducing training time using multiple graphics processing units (GPUs) on one machine. In this paper, we propose a hybrid data-model parallel approach for sequence-to-sequence (Seq2Seq) recurrent neural network (RNN) machine translation. We apply a model parallel approach to the RNN encoder-decoder part of the Seq2Seq model and a data parallel approach to the attention-softmax part of the model. We achieved a speed-up of 4.13 to 4.20 times when using 4 GPUs compared with the training speed when using 1 GPU without affecting machine translation accuracy as measured in terms of BLEU scores.
Tasks	Machine Translation
Published	2019-09-02
URL	https://arxiv.org/abs/1909.00562v2
PDF	https://arxiv.org/pdf/1909.00562v2.pdf
PWC	https://paperswithcode.com/paper/hybrid-data-model-parallel-training-for
Repo
Framework

AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models


Title	AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models
Authors	Ke Sun, Zhouchen Lin, Zhanxing Zhu
Abstract	The design of deep graph models still remains to be investigated and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way. In this paper, we propose a novel RNN-like deep graph neural network architecture by incorporating AdaBoost into the computation of network; and the proposed graph convolutional network called AdaGCN (AdaBoosting Graph Convolutional Network) has the ability to efficiently extract knowledge from high-order neighbors and integrate knowledge from different hops of neighbors into the network in an AdaBoost way. We also present the architectural difference between AdaGCN and existing graph convolutional methods to show the benefits of our proposal. Finally, extensive experiments demonstrate the state-of-the-art prediction performance and the computational advantage of our approach AdaGCN.
Tasks	Node Classification
Published	2019-08-14
URL	https://arxiv.org/abs/1908.05081v1
PDF	https://arxiv.org/pdf/1908.05081v1.pdf
PWC	https://paperswithcode.com/paper/adagcn-adaboosting-graph-convolutional
Repo
Framework

On the Detection of Mutual Influences and Their Consideration in Reinforcement Learning Processes


Title	On the Detection of Mutual Influences and Their Consideration in Reinforcement Learning Processes
Authors	Stefan Rudolph, Sven Tomforde, Jörg Hähner
Abstract	Self-adaptation has been proposed as a mechanism to counter complexity in control problems of technical systems. A major driver behind self-adaptation is the idea to transfer traditional design-time decisions to runtime and into the responsibility of systems themselves. In order to deal with unforeseen events and conditions, systems need creativity – typically realized by means of machine learning capabilities. Such learning mechanisms are based on different sources of knowledge. Feedback from the environment used for reinforcement purposes is probably the most prominent one within the self-adapting and self-organizing (SASO) systems community. However, the impact of other (sub-)systems on the success of the individual system’s learning performance has mostly been neglected in this context. In this article, we propose a novel methodology to identify effects of actions performed by other systems in a shared environment on the utility achievement of an autonomous system. Consider smart cameras (SCs) as an illustrative example: for goals such as 3D reconstruction of objects, the most promising configuration of one SC in terms of pan/tilt/zoom parameters depends largely on the configuration of other SCs in the vicinity. Since such mutual influences cannot be pre-defined for dynamic systems, they have to be learned at runtime. Furthermore, they have to be taken into consideration when a system self-improves its own configuration decisions based on a feedback loop concept, e.g., known from the SASO domain or the Autonomic and Organic Computing initiatives. We define a methodology to detect such influences at runtime, present an approach to consider this information in a reinforcement learning technique, and analyze the behavior in artificial as well as real-world SASO system settings.
Tasks	3D Reconstruction
Published	2019-05-10
URL	https://arxiv.org/abs/1905.04205v1
PDF	https://arxiv.org/pdf/1905.04205v1.pdf
PWC	https://paperswithcode.com/paper/on-the-detection-of-mutual-influences-and
Repo
Framework

Value-of-Information based Arbitration between Model-based and Model-free Control


Title	Value-of-Information based Arbitration between Model-based and Model-free Control
Authors	Krishn Bera, Yash Mandilwar, Bapi Raju
Abstract	There have been numerous attempts to explain general learning behaviours using model-based and model-free methods. While the model-based control is flexible yet computationally expensive in planning, the model-free control is quick but inflexible. The model-based control is therefore immune from reward devaluation and contingency degradation. Multiple arbitration schemes have been suggested to achieve the data efficiency and computational efficiency of model-based and model-free control respectively. In this context, we propose a quantitative ‘value of information’ based arbitration between both the controllers in order to establish a general computational framework for skill learning. The interacting model-based and model-free reinforcement learning processes are arbitrated using an uncertainty-based value of information. We further show that our algorithm performs better than Q-learning as well as Q-learning with experience replay.
Tasks	Q-Learning
Published	2019-12-08
URL	https://arxiv.org/abs/1912.05453v1
PDF	https://arxiv.org/pdf/1912.05453v1.pdf
PWC	https://paperswithcode.com/paper/value-of-information-based-arbitration
Repo
Framework
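
The abstract above names Q-learning and Q-learning with experience replay as baselines. For readers unfamiliar with them, the following is a minimal tabular sketch of the standard update rule and a uniform replay step. It is not the authors' arbitration algorithm; the function names, the `(state, action)` dictionary layout, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
import random
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, done=False):
    """Apply one standard tabular Q-learning update and return the new value.

    Q maps (state, action) pairs to value estimates (hypothetical layout).
    """
    if done:
        target = reward  # no bootstrapping from a terminal state
    else:
        target = reward + gamma * max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


def replay(Q, buffer, actions, batch_size, rng, alpha=0.1, gamma=0.99):
    """Re-apply the update to transitions sampled uniformly from a buffer."""
    batch = rng.sample(buffer, min(batch_size, len(buffer)))
    for state, action, reward, next_state, done in batch:
        q_learning_update(Q, state, action, reward, next_state, actions,
                          alpha=alpha, gamma=gamma, done=done)


if __name__ == "__main__":
    Q = defaultdict(float)
    actions = ["left", "right"]
    buffer = [(0, "right", 1.0, 1, True), (0, "left", 0.0, 0, False)]
    replay(Q, buffer, actions, batch_size=2, rng=random.Random(0))
    print(dict(Q))
```

Replay reuses stored transitions, so each environment step can contribute to several updates; that is the data-efficiency lever the plain update lacks.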

Vehicle Shape and Color Classification Using Convolutional Neural Network


Title	Vehicle Shape and Color Classification Using Convolutional Neural Network
Authors	Mohamed Nafzi, Michael Brauckmann, Tobias Glasmachers
Abstract	This paper presents a module of vehicle reidentification based on make/model and color classification. It could be used in Automated Vehicular Surveillance (AVS) or for the fast analysis of video data. Many problems related to this topic had to be addressed. In order to facilitate and accelerate progress in this subject, we present our approach to collecting and labeling a large-scale data set. We used deeper neural networks in our training. They showed good classification accuracy. We show the results of make/model and color classification on controlled and video data sets. With the help of a developed application, we demonstrate the re-identification of vehicles on video images based on make/model and color classification. This work was partially funded under the grant.
Tasks
Published	2019-05-15
URL	https://arxiv.org/abs/1905.08612v1
PDF	https://arxiv.org/pdf/1905.08612v1.pdf
PWC	https://paperswithcode.com/paper/190508612
Repo
Framework

A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation


Title	A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
Authors	Pan Xu, Quanquan Gu
Abstract	Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning remains virtually unknown. In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with $O(1/\sqrt{T})$ convergence rate if the neural function approximator is sufficiently overparameterized, where $T$ is the number of iterations. To our best knowledge, our result is the first finite-time analysis of neural Q-learning under non-i.i.d. data assumption.
Tasks	Q-Learning
Published	2019-12-10
URL	https://arxiv.org/abs/1912.04511v2
PDF	https://arxiv.org/pdf/1912.04511v2.pdf
PWC	https://paperswithcode.com/paper/a-finite-time-analysis-of-q-learning-with-1
Repo
Framework

Effect of Imbalanced Datasets on Security of Industrial IoT Using Machine Learning


Title	Effect of Imbalanced Datasets on Security of Industrial IoT Using Machine Learning
Authors	Maede Zolanvari, Marcio A. Teixeira, Raj Jain
Abstract	Machine learning algorithms have been shown to be suitable for securing platforms for IT systems. However, due to the fundamental differences between the industrial internet of things (IIoT) and regular IT networks, a special performance review needs to be considered. The vulnerabilities and security requirements of IIoT systems demand different considerations. In this paper, we study the reasons why machine learning must be integrated into the security mechanisms of the IIoT, and where it currently falls short in having a satisfactory performance. The challenges and real-world considerations associated with this matter are studied in our experimental design. We use an IIoT testbed resembling a real industrial plant to show our proof of concept.
Tasks
Published	2019-12-02
URL	https://arxiv.org/abs/1912.02651v1
PDF	https://arxiv.org/pdf/1912.02651v1.pdf
PWC	https://paperswithcode.com/paper/effect-of-imbalanced-datasets-on-security-of
Repo
Framework

The Temporal Dynamics of Belief-based Updating of Epistemic Trust: Light at the End of the Tunnel?


Title	The Temporal Dynamics of Belief-based Updating of Epistemic Trust: Light at the End of the Tunnel?
Authors	Momme von Sydow, Christoph Merdes, Ulrike Hahn
Abstract	We start with the distinction of outcome- and belief-based Bayesian models of the sequential update of agents’ beliefs and subjective reliability of sources (trust). We then focus on discussing the influential Bayesian model of belief-based trust update by Eric Olsson, which models dichotomic events and explicitly represents anti-reliability. After sketching some disastrous recent results for this perhaps most promising model of belief update, we show new simulation results for the temporal dynamics of learning belief with and without trust update and with and without communication. The results seem to shed at least a somewhat more positive light on the communicating-and-trust-updating agents. This may be a light at the end of the tunnel of belief-based models of trust updating, but the interpretation of the clear findings is much less clear.
Tasks
Published	2019-12-24
URL	https://arxiv.org/abs/1912.13380v1
PDF	https://arxiv.org/pdf/1912.13380v1.pdf
PWC	https://paperswithcode.com/paper/the-temporal-dynamics-of-belief-based
Repo
Framework

Learning Accurate Extended-Horizon Predictions of High Dimensional Trajectories


Title	Learning Accurate Extended-Horizon Predictions of High Dimensional Trajectories
Authors	Brian Gaudet, Richard Linares, Roberto Furfaro
Abstract	We present a novel predictive model architecture based on the principles of predictive coding that enables open loop prediction of future observations over extended horizons. There are two key innovations. First, whereas current methods typically learn to make long-horizon open-loop predictions using a multi-step cost function, we instead run the model open loop in the forward pass during training. Second, current predictive coding models initialize the representation layer’s hidden state to a constant value at the start of an episode, and consequently typically require multiple steps of interaction with the environment before the model begins to produce accurate predictions. Instead, we learn a mapping from the first observation in an episode to the hidden state, allowing the trained model to immediately produce accurate predictions. We compare the performance of our architecture to a standard predictive coding model and demonstrate the ability of the model to make accurate long horizon open-loop predictions of simulated Doppler radar altimeter readings during a six degree of freedom Mars landing. Finally, we demonstrate a 2X reduction in sample complexity by using the model to implement a Dyna style algorithm to accelerate policy learning with proximal policy optimization.
Tasks
Published	2019-01-12
URL	http://arxiv.org/abs/1901.03895v1
PDF	http://arxiv.org/pdf/1901.03895v1.pdf
PWC	https://paperswithcode.com/paper/learning-accurate-extended-horizon
Repo
Framework

Visualizing Attention in Transformer-Based Language Representation Models


Title	Visualizing Attention in Transformer-Based Language Representation Models
Authors	Jesse Vig
Abstract	We present an open-source tool for visualizing multi-head self-attention in Transformer-based language representation models. The tool extends earlier work by visualizing attention at three levels of granularity: the attention-head level, the model level, and the neuron level. We describe how each of these views can help to interpret the model, and we demonstrate the tool on the BERT model and the OpenAI GPT-2 model. We also present three use cases for analyzing GPT-2: detecting model bias, identifying recurring patterns, and linking neurons to model behavior.
Tasks	Language Modelling
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02679v2
PDF	http://arxiv.org/pdf/1904.02679v2.pdf
PWC	https://paperswithcode.com/paper/visualizing-attention-in-transformer-based
Repo
Framework

Clustering Higher Order Data: Finite Mixtures of Multidimensional Arrays


Title	Clustering Higher Order Data: Finite Mixtures of Multidimensional Arrays
Authors	Peter A. Tait, Paul D. McNicholas
Abstract	An approach for clustering multi-way data is introduced based on a finite mixture of multidimensional arrays. Attention to the use of multidimensional arrays for clustering has thus far been limited to two-dimensional arrays, i.e., matrices or order-two tensors. Accordingly, this is the first paper to develop an approach for clustering d-dimensional arrays for d>2 or, in other words, for clustering using order-d tensors.
Tasks
Published	2019-07-19
URL	https://arxiv.org/abs/1907.08566v2
PDF	https://arxiv.org/pdf/1907.08566v2.pdf
PWC	https://paperswithcode.com/paper/clustering-higher-order-data-finite-mixtures
Repo
Framework

Adaptive Loss-aware Quantization for Multi-bit Networks


Title	Adaptive Loss-aware Quantization for Multi-bit Networks
Authors	Zhongnan Qu, Zimu Zhou, Yun Cheng, Lothar Thiele
Abstract	We investigate the compression of deep neural networks by quantizing their weights and activations into multiple binary bases, known as multi-bit networks (MBNs), which accelerate the inference and reduce the storage for the deployment on low-resource mobile and embedded platforms. We propose Adaptive Loss-aware Quantization (ALQ), a new MBN quantization pipeline that is able to achieve an average bitwidth below one-bit without notable loss in inference accuracy. Unlike previous MBN quantization solutions that train a quantizer by minimizing the error to reconstruct full precision weights, ALQ directly minimizes the quantization-induced error on the loss function involving neither gradient approximation nor full precision maintenance. ALQ also exploits strategies including adaptive bitwidth, smooth bitwidth reduction, and iterative trained quantization to allow a smaller network size without loss in accuracy. Experiment results on popular image datasets show that ALQ outperforms state-of-the-art compressed networks in terms of both storage and accuracy.
Tasks	Quantization
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08883v2
PDF	https://arxiv.org/pdf/1912.08883v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-loss-aware-quantization-for-multi
Repo
Framework
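
The abstract above describes multi-bit networks as representing weights with multiple binary bases. As a rough illustration of that representation only, the sketch below uses a simple greedy residual scheme: each step fits one scaled binary vector to what is left over. This is not ALQ, which the abstract says minimizes the error on the loss function rather than the reconstruction error; the `{-1, +1}` basis convention and all function names here are assumptions.

```python
def greedy_multibit(weights, num_bits):
    """Approximate weights as sum_i alpha_i * b_i with b_i in {-1, +1}^n.

    Greedy reconstruction-error baseline (assumed for illustration, not ALQ).
    """
    residual = list(weights)
    alphas, bases = [], []
    for _ in range(num_bits):
        # For a fixed sign pattern, the mean absolute residual is the
        # least-squares optimal scale.
        alpha = sum(abs(r) for r in residual) / len(residual)
        basis = [1 if r >= 0 else -1 for r in residual]
        alphas.append(alpha)
        bases.append(basis)
        residual = [r - alpha * b for r, b in zip(residual, basis)]
    return alphas, bases


def reconstruct(alphas, bases):
    """Rebuild the approximate weights from coefficients and binary bases."""
    n = len(bases[0])
    return [sum(a * b[i] for a, b in zip(alphas, bases)) for i in range(n)]


def squared_error(weights, approx):
    """Sum of squared differences between two equal-length vectors."""
    return sum((w - a) ** 2 for w, a in zip(weights, approx))


if __name__ == "__main__":
    w = [0.5, -1.5, 1.0, -0.2]
    for bits in (1, 2, 3):
        alphas, bases = greedy_multibit(w, bits)
        print(bits, squared_error(w, reconstruct(alphas, bases)))
```

Storing one float per basis plus one bit per weight per basis is what makes the format compact; adding bases trades storage for a smaller reconstruction error.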

The Gambler’s Problem and Beyond


Title	The Gambler’s Problem and Beyond
Authors	Baoxiang Wang, Shuai Li, Jiajin Li, Siu On Chan
Abstract	We analyze the Gambler’s problem, a simple reinforcement learning problem where the gambler has the chance to double or lose their bets until the target is reached. This is an early example introduced in the reinforcement learning textbook by Sutton and Barto (2018), where they mention an interesting pattern of the optimal value function with high-frequency components and repeating non-smooth points. The textbook, however, does not investigate it further. We provide the exact formula for the optimal value function for both the discrete and the continuous cases. Simple as it might seem, the value function is pathological: fractal, self-similar, with a derivative that is either zero or infinity, not smooth on any interval, and not expressible in elementary functions. It is in fact one of the generalized Cantor functions, where it holds a complexity that has been uncharted thus far. Our analyses could lead to insights into improving value function approximation, gradient-based algorithms, and Q-learning, in real applications and implementations.
Tasks	Q-Learning
Published	2019-12-31
URL	https://arxiv.org/abs/2001.00102v1
PDF	https://arxiv.org/pdf/2001.00102v1.pdf
PWC	https://paperswithcode.com/paper/the-gamblers-problem-and-beyond-1
Repo
Framework
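
The discrete Gambler's problem described in the abstract above can be explored numerically with plain value iteration. The sketch below follows the textbook setup as an assumption (a goal of 100, a win probability of 0.4, stakes in whole units); it approximates the optimal value function by iteration and does not implement the paper's exact formula.

```python
def gambler_value_iteration(goal=100, p_win=0.4, tol=1e-10, max_sweeps=1000):
    """Return approximate optimal values V[s] for capital s = 0..goal.

    V[s] estimates the probability of reaching the goal from capital s
    under optimal play. Parameter defaults are illustrative assumptions.
    """
    V = [0.0] * (goal + 1)
    V[goal] = 1.0  # reaching the goal counts as success
    for _ in range(max_sweeps):
        delta = 0.0
        for s in range(1, goal):
            # A stake can be at most the current capital and never needs
            # to exceed the amount still missing to reach the goal.
            best = max(
                p_win * V[s + stake] + (1.0 - p_win) * V[s - stake]
                for stake in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return V


if __name__ == "__main__":
    V = gambler_value_iteration()
    for capital in (25, 50, 75):
        print(capital, round(V[capital], 6))
```

Plotting `V` against capital shows the repeating non-smooth points that the abstract refers to.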


Voice-Face Cross-modal Matching and Retrieval: A Benchmark


Title	Voice-Face Cross-modal Matching and Retrieval: A Benchmark
Authors	Chuyuan Xiong, Deyuan Zhang, Tao Liu, Xiaoyong Du
Abstract	Cross-modal associations between the voice and face of a person can be learnt algorithmically, which can benefit a lot of applications. The problem can be defined as voice-face matching and retrieval tasks. Much research attention has been paid to these tasks recently. However, this research is still at an early stage. Test schemes based on random tuple mining tend to have low test confidence. The generalization ability of models cannot be evaluated on small-scale datasets. Performance metrics on various tasks are scarce. A benchmark for this problem needs to be established. In this paper, first, a framework based on comprehensive studies is proposed for voice-face matching and retrieval. It achieves state-of-the-art performance with various performance metrics on different tasks and with high test confidence on large-scale datasets, which can be taken as a baseline for follow-up research. In this framework, a voice-anchored L2-norm constrained metric space is proposed, and cross-modal embeddings are learned with CNN-based networks and triplet loss in the metric space. The embedding learning process can be more effective and efficient with this strategy. Different network structures of the framework and the cross-language transfer abilities of the model are also analyzed. Second, a voice-face dataset (with 1.15M face data and 0.29M audio data) from Chinese speakers is constructed, and a convenient and quality-controllable dataset collection tool is developed. The dataset and source code of the paper will be published together with this paper.
Tasks
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09338v2
PDF	https://arxiv.org/pdf/1911.09338v2.pdf
PWC	https://paperswithcode.com/paper/voice-face-cross-modal-matching-and-retrieval
Repo
Framework

The Impact of Data Preparation on the Fairness of Software Systems


Title	The Impact of Data Preparation on the Fairness of Software Systems
Authors	Inês Valentim, Nuno Lourenço, Nuno Antunes
Abstract	Machine learning models are widely adopted in scenarios that directly affect people. The development of software systems based on these models raises societal and legal concerns, as their decisions may lead to the unfair treatment of individuals based on attributes like race or gender. Data preparation is key in any machine learning pipeline, but its effect on fairness is yet to be studied in detail. In this paper, we evaluate how the fairness and effectiveness of the learned models are affected by the removal of the sensitive attribute, the encoding of the categorical attributes, and instance selection methods (including cross-validators and random undersampling). We used the Adult Income and the German Credit Data datasets, which are widely studied and known to have fairness concerns. We applied each data preparation technique individually to analyse the difference in predictive performance and fairness, using statistical parity difference, disparate impact, and the normalised prejudice index. The results show that fairness is affected by transformations made to the training data, particularly in imbalanced datasets. Removing the sensitive attribute is insufficient to eliminate all the unfairness in the predictions, as expected, but it is key to achieve fairer models. Additionally, the standard random undersampling with respect to the true labels is sometimes more prejudicial than performing no random undersampling.
Tasks
Published	2019-10-05
URL	https://arxiv.org/abs/1910.02321v1
PDF	https://arxiv.org/pdf/1910.02321v1.pdf
PWC	https://paperswithcode.com/paper/the-impact-of-data-preparation-on-the
Repo
Framework
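
Two of the fairness measures named in the abstract above, statistical parity difference and disparate impact, are commonly computed from group-wise positive prediction rates. The sketch below uses one widespread convention (unprivileged rate minus, or divided by, privileged rate); the paper may define them differently, and the function names and toy data are hypothetical.

```python
def positive_rate(predictions, groups, group):
    """Share of positive (1) predictions among members of one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(predictions, groups, unprivileged, privileged):
    """P(pred = 1 | unprivileged) - P(pred = 1 | privileged); 0 means parity."""
    return (positive_rate(predictions, groups, unprivileged)
            - positive_rate(predictions, groups, privileged))


def disparate_impact(predictions, groups, unprivileged, privileged):
    """P(pred = 1 | unprivileged) / P(pred = 1 | privileged); 1 means parity."""
    privileged_rate = positive_rate(predictions, groups, privileged)
    if privileged_rate == 0:
        raise ZeroDivisionError("privileged group has no positive predictions")
    return positive_rate(predictions, groups, unprivileged) / privileged_rate


if __name__ == "__main__":
    # Toy data: group "a" receives 1 of 4 positives, group "b" 3 of 4.
    preds = [1, 0, 0, 0, 1, 1, 1, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(statistical_parity_difference(preds, groups, "a", "b"))
    print(disparate_impact(preds, groups, "a", "b"))
```

Both measures depend only on predictions and group membership, not on true labels, which is why dropping the sensitive attribute from the features does not by itself drive them to parity.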