Paper Group ANR 245
State Distribution-aware Sampling for Deep Q-learning. Deep learning for dehazing: Comparison and analysis. On Attention Models for Human Activity Recognition. The MeMAD Submission to the IWSLT 2018 Speech Translation Task. Learning and Querying Fast Generative Models for Reinforcement Learning. Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation. Baselines for Reinforcement Learning in Text Games. Improving Data Quality through Deep Learning and Statistical Models. ARiA: Utilizing Richard’s Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets. Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter. Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning. Multi-Fidelity Recursive Behavior Prediction. Theoretical Guarantees of Transfer Learning. SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network. Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories.
State Distribution-aware Sampling for Deep Q-learning
Title | State Distribution-aware Sampling for Deep Q-learning |
Authors | Weichao Li, Fuxian Huang, Xi Li, Gang Pan, Fei Wu |
Abstract | A critical and challenging problem in reinforcement learning is how to learn the state-action value function from the experience replay buffer while maintaining sample efficiency and fast convergence to a high-quality solution. In prior works, transitions are either sampled uniformly at random from the replay buffer or sampled based on their priority as measured by the temporal-difference (TD) error. However, these approaches do not fully take into account the intrinsic characteristics of the transition distribution in the state space and can result in redundant and unnecessary TD updates, slowing down the convergence of the learning procedure. To overcome this problem, we propose a novel state distribution-aware sampling method that balances the replay times for transitions with a skewed distribution, taking into account both the occurrence frequencies of transitions and the uncertainty of state-action values. Consequently, our approach reduces unnecessary TD updates and increases TD updates for state-action values with higher uncertainty, making experience replay more effective and efficient. Extensive experiments are conducted on both classic control tasks and Atari 2600 games on the OpenAI Gym platform, and the results demonstrate the effectiveness of our approach in comparison with the standard DQN approach. |
Tasks | Atari Games, Q-Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08619v1 |
PDF | http://arxiv.org/pdf/1804.08619v1.pdf |
PWC | https://paperswithcode.com/paper/state-distribution-aware-sampling-for-deep-q |
Repo | |
Framework | |
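
The abstract describes weighting replay by both how often a region of state space occurs and how uncertain its value estimates are. A minimal sketch of such a buffer, assuming discretized state keys and a smoothed |TD error| as the uncertainty proxy — both illustrative choices, not the paper's exact scheme:

```python
import random
from collections import defaultdict, deque

class StateAwareReplayBuffer:
    """Replay buffer that samples rare / uncertain states more often."""

    def __init__(self, capacity=100_000, eps=1e-2):
        self.buffer = deque(maxlen=capacity)
        self.visit_counts = defaultdict(int)          # occurrence frequency per state key
        self.uncertainty = defaultdict(lambda: 1.0)   # running |TD error| per state key
        self.eps = eps

    def add(self, state_key, transition):
        self.buffer.append((state_key, transition))
        self.visit_counts[state_key] += 1

    def update_uncertainty(self, state_key, td_error, momentum=0.9):
        # Track a smoothed magnitude of the TD error as an uncertainty proxy.
        u = self.uncertainty[state_key]
        self.uncertainty[state_key] = momentum * u + (1 - momentum) * abs(td_error)

    def sample(self, batch_size):
        # Weight each transition by uncertainty / frequency, so densely
        # visited regions of the state space are not replayed redundantly.
        weights = [
            (self.uncertainty[k] + self.eps) / self.visit_counts[k]
            for k, _ in self.buffer
        ]
        batch = random.choices(self.buffer, weights=weights, k=batch_size)
        return [t for _, t in batch]
```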
Deep learning for dehazing: Comparison and analysis
Title | Deep learning for dehazing: Comparison and analysis |
Authors | A Benoit, Leonel Cuevas, Jean-Baptiste Thomas |
Abstract | We compare a recent deep-learning-based dehazing method, DehazeNet, with traditional state-of-the-art approaches on benchmark data with reference. DehazeNet estimates the transmission factor from a single color image, which is used to invert the Koschmieder model of imaging in the presence of haze; in this sense, the solution is still attached to the Koschmieder model. We demonstrate that the transmission is very well estimated by the network, but also that this method exhibits the same limitations as the others due to the use of the same imaging model. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10923v1 |
PDF | http://arxiv.org/pdf/1806.10923v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-dehazing-comparison-and |
Repo | |
Framework | |
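
The Koschmieder model referenced in the abstract is I(x) = J(x)·t(x) + A·(1 − t(x)); once a network such as DehazeNet has estimated the transmission t, the haze-free radiance J is recovered by inverting it. A sketch of that inversion, where the lower clamp t0 is a common heuristic rather than anything specified in this entry:

```python
import numpy as np

def dehaze(hazy, transmission, airlight, t0=0.1):
    """Invert the Koschmieder model I = J*t + A*(1 - t) to recover J.

    hazy:         HxWx3 image in [0, 1]
    transmission: HxW map, e.g. as estimated by a network such as DehazeNet
    airlight:     per-channel atmospheric light A, shape (3,)
    t0:           lower clamp on t to avoid amplifying noise in dense haze
    """
    t = np.clip(transmission, t0, 1.0)[..., None]  # broadcast over channels
    J = (hazy - airlight) / t + airlight           # J = (I - A)/t + A
    return np.clip(J, 0.0, 1.0)
```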
On Attention Models for Human Activity Recognition
Title | On Attention Models for Human Activity Recognition |
Authors | Vishvak S Murahari, Thomas Ploetz |
Abstract | Most approaches that model time-series data in human activity recognition (HAR) based on body-worn sensing use a fixed-size temporal context to represent different activities. This might, however, not be apt for sets of activities with individually varying durations. We introduce attention models into HAR research as a data-driven approach for exploring relevant temporal context. Attention models learn a set of weights over input data, which we leverage to weight the temporal context being considered to model each sensor reading. We construct attention models for HAR by adding attention layers to a state-of-the-art deep learning HAR model (DeepConvLSTM) and evaluate our approach on benchmark datasets, achieving a significant increase in performance. Finally, we visualize the learned weights to better understand what constitutes relevant temporal context. |
Tasks | Activity Recognition, Human Activity Recognition, Time Series |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07648v1 |
PDF | http://arxiv.org/pdf/1805.07648v1.pdf |
PWC | https://paperswithcode.com/paper/on-attention-models-for-human-activity |
Repo | |
Framework | |
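
The approach adds attention layers on top of a recurrent HAR model so the temporal context is weighted per sensor window. A minimal sketch of soft attention over LSTM hidden states; the layer placement and scoring function follow common practice and stand in for the paper's exact DeepConvLSTM integration:

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Soft attention over recurrent hidden states, pooling a weighted context."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, h):                              # h: (batch, time, hidden_dim)
        scores = self.score(h).squeeze(-1)             # (batch, time)
        weights = torch.softmax(scores, dim=1)         # attention over time steps
        context = (weights.unsqueeze(-1) * h).sum(dim=1)  # (batch, hidden_dim)
        return context, weights

# Usage on top of a recurrent encoder (a stand-in for DeepConvLSTM's LSTM part):
lstm = nn.LSTM(input_size=9, hidden_size=128, batch_first=True)
attn = TemporalAttention(128)
x = torch.randn(4, 24, 9)        # 4 windows, 24 time steps, 9 sensor channels
h, _ = lstm(x)
context, weights = attn(h)       # weights reveal which time steps mattered
```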
The MeMAD Submission to the IWSLT 2018 Speech Translation Task
Title | The MeMAD Submission to the IWSLT 2018 Speech Translation Task |
Authors | Umut Sulubacak, Jörg Tiedemann, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo |
Abstract | This paper describes the MeMAD project entry to the IWSLT Speech Translation Shared Task, addressing the translation of English audio into German text. Between the pipeline and end-to-end model tracks, we participated only in the former, with three contrastive systems. We also tried the latter, but were not able to finish our end-to-end model in time. All of our systems start by transcribing the audio into text through an automatic speech recognition (ASR) model trained on the TED-LIUM English Speech Recognition Corpus (TED-LIUM). Afterwards, we feed the transcripts into English-German text-based neural machine translation (NMT) models. Our systems employ three different translation models trained on separate training sets compiled from the English-German part of the TED Speech Translation Corpus (TED-Trans) and the OpenSubtitles2018 section of the OPUS collection. In this paper, we also describe the experiments leading up to our final systems. Our experiments indicate that using OpenSubtitles2018 in training significantly improves translation performance. We also experimented with various pre- and postprocessing routines for the NMT module, but did not have much success with these. Our best-scoring system attains a BLEU score of 16.45 on the test set for this year’s task. |
Tasks | Machine Translation, Speech Recognition |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10320v1 |
PDF | http://arxiv.org/pdf/1810.10320v1.pdf |
PWC | https://paperswithcode.com/paper/the-memad-submission-to-the-iwslt-2018-speech |
Repo | |
Framework | |
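
The pipeline track composes two independently trained components: English ASR followed by English-German NMT. A schematic sketch, where asr_model and nmt_model are hypothetical wrappers around the trained systems (a TED-LIUM ASR model and a TED-Trans/OpenSubtitles2018 NMT model):

```python
def speech_translate(audio, asr_model, nmt_model):
    """Pipeline speech translation: English audio -> English text -> German text.

    asr_model and nmt_model are hypothetical wrappers around the trained
    components; only the two-stage composition is taken from the abstract.
    """
    transcript = asr_model.transcribe(audio)   # English ASR hypothesis
    return nmt_model.translate(transcript)     # German translation of the transcript
```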
Learning and Querying Fast Generative Models for Reinforcement Learning
Title | Learning and Querying Fast Generative Models for Reinforcement Learning |
Authors | Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David P. Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra |
Abstract | A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models. We show that carefully designed generative models that learn and operate on compact state representations, so-called state-space models, substantially reduce the computational costs of predicting the outcomes of sequences of actions. Extensive experiments establish that state-space models accurately capture, from raw pixels, the dynamics of Atari games in the Arcade Learning Environment. The computational speed-up of state-space models, achieved while maintaining high accuracy, makes their application in RL feasible: we demonstrate that agents that query these models for decision making outperform strong model-free baselines on the game MSPACMAN, demonstrating the potential of learned environment models for planning. |
Tasks | Atari Games, Decision Making |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.03006v1 |
PDF | http://arxiv.org/pdf/1802.03006v1.pdf |
PWC | https://paperswithcode.com/paper/learning-and-querying-fast-generative-models |
Repo | |
Framework | |
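
The speed-up comes from rolling the model forward in a compact latent state instead of pixel space, decoding frames only when needed. A deterministic sketch of such an action-conditional rollout (the paper's models also include stochastic latent transitions; the sizes here are illustrative):

```python
import torch
import torch.nn as nn

class LatentRollout(nn.Module):
    """Roll out an action-conditional state-space model in latent space.

    Predicting s_{t+1} from (s_t, a_t) with a small MLP is far cheaper than
    autoregressive prediction in pixel space; frames need never be decoded
    during planning unless a pixel-level prediction is explicitly required.
    """

    def __init__(self, state_dim=64, action_dim=4):
        super().__init__()
        self.dynamics = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, state_dim),
        )
        self.reward_head = nn.Linear(state_dim, 1)

    def forward(self, state, actions):          # actions: (batch, T, action_dim)
        rewards = []
        for t in range(actions.shape[1]):       # latent-only rollout over T steps
            state = self.dynamics(torch.cat([state, actions[:, t]], dim=-1))
            rewards.append(self.reward_head(state))
        return state, torch.stack(rewards, dim=1)
```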
Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation
Title | Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation |
Authors | Shang-Yu Su, Yun-Nung Chen |
Abstract | Natural language generation (NLG) is a critical component in spoken dialogue systems and can be divided into two phases: (1) sentence planning, deciding the overall sentence structure, and (2) surface realization, determining specific word forms and flattening the sentence structure into a string. With the rise of deep learning, most modern NLG models are based on a sequence-to-sequence (seq2seq) model, which is essentially an encoder-decoder structure; these NLG models generate sentences from scratch by jointly optimizing sentence planning and surface realization. However, such a simple encoder-decoder architecture usually fails to generate complex and long sentences, because the decoder has difficulty learning all grammar and diction knowledge well. This paper introduces an NLG model with a hierarchical attentional decoder, where the hierarchy focuses on leveraging linguistic knowledge in a specific order. The experiments show that the proposed method significantly outperforms the traditional seq2seq model with a smaller model size, and that the design of the hierarchical attentional decoder can be applied to various NLG systems. Furthermore, different generation strategies based on linguistic patterns are investigated and analyzed in order to guide future NLG research. |
Tasks | Text Generation |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07629v1 |
PDF | http://arxiv.org/pdf/1809.07629v1.pdf |
PWC | https://paperswithcode.com/paper/investigating-linguistic-pattern-ordering-in |
Repo | |
Framework | |
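
The hierarchical decoder injects linguistic knowledge by generating in a fixed pattern order, with later stages conditioning on earlier ones. A sketch of that layering; the four-stage part-of-speech ordering and the per-stage GRUs are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class HierarchicalDecoder(nn.Module):
    """Decode in a fixed linguistic order, one GRU layer per pattern stage.

    Each stage re-reads the output of the previous stage, so later stages
    (e.g. function words) can condition on earlier ones (e.g. nouns, verbs).
    The stage ordering here is an illustrative assumption.
    """

    def __init__(self, hidden=128, stages=("noun", "verb", "adj_adv", "others")):
        super().__init__()
        self.stages = stages
        self.layers = nn.ModuleList(
            nn.GRU(hidden, hidden, batch_first=True) for _ in stages
        )

    def forward(self, semantic_repr):           # (batch, T, hidden)
        h = semantic_repr
        per_stage = []
        for layer in self.layers:               # pass through stages in order
            h, _ = layer(h)
            per_stage.append(h)                 # each supervised against its stage's targets
        return per_stage
```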
Baselines for Reinforcement Learning in Text Games
Title | Baselines for Reinforcement Learning in Text Games |
Authors | Mikuláš Zelinka |
Abstract | The ability to learn optimal control policies in systems where the action space is defined by sentences in natural language would enable many interesting real-world applications, such as automatic optimisation of dialogue systems. Text-based games with multiple endings and rewards are a promising platform for this task, since their feedback allows us to employ reinforcement learning techniques to jointly learn text representations and control policies. We argue that the key property of AI agents, especially in the text-games context, is their ability to generalise to previously unseen games. We present a minimalistic text-game-playing agent, test its generalisation and transfer-learning performance, and show its ability to play multiple games at once. We also present pyfiction, an open-source library for universal access to different text games that could, together with our agent implementing its interface, serve as a baseline for future research. |
Tasks | Transfer Learning |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02872v2 |
PDF | http://arxiv.org/pdf/1811.02872v2.pdf |
PWC | https://paperswithcode.com/paper/baselines-for-reinforcement-learning-in-text |
Repo | |
Framework | |
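
An agent for such games must score natural-language actions against the current state description. A minimal epsilon-greedy sketch, assuming fixed-size text embeddings and a bilinear Q-function Q(s, a) = sᵀWa — a hypothetical stand-in for the paper's jointly learned representations:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator for reproducible exploration

def choose_action(state_vec, action_vecs, W, epsilon=0.1):
    """Epsilon-greedy choice among natural-language actions.

    state_vec / action_vecs are fixed-size text embeddings (e.g. averaged
    word vectors); the bilinear scorer W is a hypothetical minimal
    Q-function over jointly learned text representations.
    """
    if rng.random() < epsilon:
        return int(rng.integers(len(action_vecs)))    # explore
    q_values = np.array([state_vec @ W @ a for a in action_vecs])
    return int(np.argmax(q_values))                   # exploit best-scoring action
```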
Improving Data Quality through Deep Learning and Statistical Models
Title | Improving Data Quality through Deep Learning and Statistical Models |
Authors | Wei Dai, Kenji Yoshigoe, William Parsley |
Abstract | Traditional data quality control methods are based on user experience or previously established business rules, which limits performance in addition to being very time-consuming, with lower-than-desirable accuracy. Utilizing deep learning, we can leverage computing resources and advanced techniques to overcome these challenges and provide greater value to users. In this paper, we first review relevant works and discuss machine learning techniques, tools, and statistical quality models. Second, we offer a data quality framework based on deep learning and a statistical model algorithm for assessing data quality. Third, we use data on salary levels from an open dataset published by the state of Arkansas to demonstrate how to identify outlier data and how to improve data quality via deep learning. Finally, we discuss future work. |
Tasks | |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.07132v1 |
PDF | http://arxiv.org/pdf/1810.07132v1.pdf |
PWC | https://paperswithcode.com/paper/improving-data-quality-through-deep-learning |
Repo | |
Framework | |
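
The framework pairs a statistical quality model with a deep model. A sketch of a statistical first pass on salary values using a MAD-based robust z-score — an illustrative choice, not necessarily the paper's statistical model — with deep-learning re-examination of the flagged records left as the second stage:

```python
import numpy as np

def flag_outliers(salaries, z_thresh=3.0):
    """Statistical first pass: flag salaries far from the robust center.

    A deep model (e.g. an autoencoder scored by reconstruction error) can
    then re-examine the flagged records together with their other fields.
    The threshold and the median/MAD statistics here are illustrative.
    """
    salaries = np.asarray(salaries, dtype=float)
    median = np.median(salaries)
    mad = np.median(np.abs(salaries - median)) or 1.0   # guard against zero spread
    robust_z = 0.6745 * (salaries - median) / mad       # MAD-based z-score
    return np.abs(robust_z) > z_thresh                  # boolean outlier mask
```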
ARiA: Utilizing Richard’s Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets
Title | ARiA: Utilizing Richard’s Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets |
Authors | Narendra Patwardhan, Madhura Ingalhalikar, Rahee Walambe |
Abstract | This work introduces a novel activation unit that can be efficiently employed in deep neural nets (DNNs) and performs significantly better than the traditional Rectified Linear Unit (ReLU). The function developed is a two-parameter version of the specialized Richard’s Curve, which we call the Adaptive Richard’s Curve weighted Activation (ARiA). This function is non-monotonic, analogous to the recently introduced Swish; however, it allows precise control over its non-monotonic convexity by varying its hyper-parameters. We first demonstrate the mathematical significance of the two-parameter ARiA and then apply it to benchmark problems such as MNIST, CIFAR-10, and CIFAR-100, where we compare its performance with ReLU and Swish units. Our results illustrate significantly superior performance on all these datasets, making ARiA a potential replacement for ReLU and other activations in DNNs. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08878v1 |
PDF | http://arxiv.org/pdf/1805.08878v1.pdf |
PWC | https://paperswithcode.com/paper/aria-utilizing-richards-curve-for-controlling |
Repo | |
Framework | |
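
A sketch of the two-parameter form suggested by the abstract: the input gated by a Richard's-curve-shaped sigmoid, ARiA(x) = x·σ(βx)^α, which reduces to the Swish-like x·σ(x) at α = β = 1. The exact parameterization in the paper may differ, and the default hyper-parameter values here are illustrative:

```python
import numpy as np

def aria2(x, alpha=1.5, beta=2.0):
    """Two-parameter ARiA-style activation: x * sigmoid(beta * x) ** alpha.

    alpha and beta control the non-monotonic bump near the origin; with
    alpha = beta = 1 this reduces to the Swish/SiLU activation x * sigmoid(x).
    """
    return x * np.power(1.0 / (1.0 + np.exp(-beta * x)), alpha)

# Quick look at the non-monotonic region left of zero:
xs = np.linspace(-4, 4, 9)
print(np.round(aria2(xs), 3))
```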
Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter
Title | Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter |
Authors | Ziqi Zhang, Lei Luo |
Abstract | In recent years, the increasing propagation of hate speech on social media and the urgent need for effective countermeasures have drawn significant investment from governments, companies, and researchers. A large number of methods have been developed for automated hate speech detection online, classifying textual content as non-hate or hate speech, in which case the method may also identify the targeting characteristics (i.e., types of hate, such as race and religion) in the hate speech. However, we notice a significant difference between the performance on the two classes (i.e., non-hate vs. hate). In this work, we argue for a focus on the latter problem for practical reasons. We show that it is a much more challenging task: our analysis of the language in the typical datasets shows that hate speech lacks unique, discriminative features and is therefore found in the ‘long tail’ of a dataset, where it is difficult to discover. We then propose deep neural network structures serving as feature extractors that are particularly effective for capturing the semantics of hate speech. Our methods are evaluated on the largest collection of hate-speech datasets based on Twitter and are shown to outperform the best-performing method by up to 5 percentage points in macro-average F1, or 8 percentage points in the more challenging case of identifying hateful content. |
Tasks | Hate Speech Detection |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1803.03662v2 |
PDF | http://arxiv.org/pdf/1803.03662v2.pdf |
PWC | https://paperswithcode.com/paper/hate-speech-detection-a-solved-problem-the |
Repo | |
Framework | |
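
A sketch of a convolution-plus-recurrent feature extractor of the kind the abstract describes for capturing hate-speech semantics; the specific layer sizes and the Conv1d-then-GRU arrangement are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ConvGRUClassifier(nn.Module):
    """Convolution over word embeddings followed by a GRU, a common pattern
    for tweet classification; sizes here are illustrative."""

    def __init__(self, vocab_size=20_000, emb=100, channels=100, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, channels, kernel_size=4, padding=2)
        self.gru = nn.GRU(channels, channels, batch_first=True)
        self.out = nn.Linear(channels, classes)

    def forward(self, tokens):                       # tokens: (batch, T) word ids
        e = self.embed(tokens).transpose(1, 2)       # (batch, emb, T)
        c = torch.relu(self.conv(e)).transpose(1, 2) # local n-gram features
        _, h = self.gru(c)                           # h: (1, batch, channels)
        return self.out(h.squeeze(0))                # class logits
```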
Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning
Title | Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning |
Authors | Hao Yu, Sen Yang, Shenghuo Zhu |
Abstract | In distributed training of deep neural networks, parallel mini-batch SGD is widely used to speed up training with multiple workers. It uses the workers to sample local stochastic gradients in parallel, aggregates all gradients on a single server to obtain the average, and updates each worker’s local model with an SGD step using the averaged gradient. Ideally, parallel mini-batch SGD achieves a linear speed-up of training time (with respect to the number of workers) compared with SGD on a single worker. In practice, however, such linear scalability is significantly limited by the growing demand for gradient communication as more workers are involved. Model averaging, which periodically averages individual models trained on the parallel workers, is another common practice for distributed training of deep neural networks (Zinkevich et al. 2010; McDonald, Hall, and Mann 2010). Compared with parallel mini-batch SGD, the communication overhead of model averaging is significantly reduced. Impressively, extensive experimental work has verified that model averaging can still achieve a good training speed-up as long as the averaging interval is carefully controlled. However, it remains a mystery in theory why such a simple heuristic works so well. This paper provides a thorough and rigorous theoretical study of why model averaging can work as well as parallel mini-batch SGD with significantly less communication overhead. |
Tasks | |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06629v3 |
PDF | http://arxiv.org/pdf/1807.06629v3.pdf |
PWC | https://paperswithcode.com/paper/parallel-restarted-sgd-with-faster |
Repo | |
Framework | |
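
The scheme under analysis is local SGD with periodic restarts from the averaged model. A minimal sketch, where grad_fns are hypothetical per-worker stochastic-gradient oracles; parameters are communicated once per `local_steps` updates rather than at every step:

```python
import numpy as np

def parallel_restarted_sgd(grad_fns, w0, lr=0.1, rounds=50, local_steps=8):
    """Local SGD with periodic model averaging.

    grad_fns: one stochastic-gradient function per worker (hypothetical
    oracles over each worker's local data shard). Workers take `local_steps`
    independent SGD steps, then restart from the averaged model, cutting
    communication by a factor of `local_steps` versus per-step aggregation.
    """
    models = [w0.copy() for _ in grad_fns]
    for _ in range(rounds):
        for w, grad in zip(models, grad_fns):
            for _ in range(local_steps):
                w -= lr * grad(w)            # local, communication-free update
        avg = np.mean(models, axis=0)        # one round of communication
        models = [avg.copy() for _ in grad_fns]
    return models[0]
```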
Multi-Fidelity Recursive Behavior Prediction
Title | Multi-Fidelity Recursive Behavior Prediction |
Authors | Mihir Jain, Kyle Brown, Ahmed K. Sadek |
Abstract | Predicting the behavior of surrounding vehicles is a critical problem in automated driving. We present a novel game-theoretic behavior prediction model that achieves state-of-the-art prediction accuracy by explicitly reasoning about possible future interactions between agents. We evaluate our approach on the NGSIM vehicle trajectory dataset and demonstrate lower root mean square error than state-of-the-art methods. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1901.01831v1 |
PDF | http://arxiv.org/pdf/1901.01831v1.pdf |
PWC | https://paperswithcode.com/paper/multi-fidelity-recursive-behavior-prediction |
Repo | |
Framework | |
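
The abstract does not spell out the recursion, but "explicitly reasoning about possible future interaction" suggests a level-k-style scheme in which each agent's forecast is refined against the others' lower-depth forecasts. A purely illustrative sketch under that reading, with base_predict and interact_predict as hypothetical predictors:

```python
def recursive_predict(agents, base_predict, interact_predict, depth=2):
    """Level-k style recursive behavior prediction (an illustrative reading
    of the abstract, which does not detail the exact recursion).

    base_predict(agent):             trajectory forecast ignoring interaction
    interact_predict(agent, others): forecast given the others' predictions
    """
    preds = {a: base_predict(a) for a in agents}   # depth-0 (non-interactive)
    for _ in range(depth):                         # refine against the others
        preds = {
            a: interact_predict(a, {b: p for b, p in preds.items() if b is not a})
            for a in agents
        }
    return preds
```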
Theoretical Guarantees of Transfer Learning
Title | Theoretical Guarantees of Transfer Learning |
Authors | Zirui Wang |
Abstract | Transfer learning has proven effective when within-target labeled data is scarce. Many works have developed successful algorithms and empirically observed a positive transfer effect that improves target generalization error using source knowledge. However, theoretical analysis of transfer learning is more challenging due to the nature of the problem, and is thus less studied. In this report, we survey theoretical works on transfer learning and summarize key theoretical guarantees that prove its effectiveness. The theoretical bounds are derived using model complexity and learning-algorithm stability. As we shall see, these works exhibit a trade-off between tight bounds and restrictive assumptions. Moreover, we prove a new generalization bound for the multi-source transfer learning problem using VC theory, which is more informative than the one proved in previous work. |
Tasks | Transfer Learning |
Published | 2018-10-14 |
URL | http://arxiv.org/abs/1810.05986v2 |
PDF | http://arxiv.org/pdf/1810.05986v2.pdf |
PWC | https://paperswithcode.com/paper/theoretical-guarantees-of-transfer-learning |
Repo | |
Framework | |
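
Representative of the guarantees such surveys cover is the classic single-source bound of Ben-David et al. (2010), relating target risk to source risk plus a divergence term; it is shown here for context only, and the report's new multi-source VC bound is a different result:

```latex
% For a hypothesis h with source/target risks \epsilon_S, \epsilon_T:
\epsilon_T(h) \;\le\; \epsilon_S(h)
  \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  \;+\; \lambda
```

where d_HΔH measures the divergence between the source and target distributions and λ is the risk of the ideal joint hypothesis on both domains combined.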
SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network
Title | SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network |
Authors | Kwanyoung Kim, Se Young Chun |
Abstract | Deep-learning-based single-image super-resolution (SR) methods have evolved rapidly over the past few years and have yielded state-of-the-art performance over conventional methods. Since these methods usually minimize the l1 loss between the output SR image and the ground-truth image, they yield very high peak signal-to-noise ratio (PSNR), which is inversely proportional to this loss. Unfortunately, minimizing such losses inevitably leads to blurred edges due to the averaging of plausible solutions. Recently, SRGAN was proposed to avoid this averaging effect by minimizing perceptual losses instead of the l1 loss, and it yields perceptually better SR images (images with sharp edges) at the price of lower PSNR. In this paper, we propose SREdgeNet, an edge-enhanced single-image SR network, inspired by conventional SR theory, so that the averaging effect is avoided not by changing the loss but by changing the SR network’s properties under the same l1 loss. SREdgeNet consists of three sequential deep neural network modules: the first module is any state-of-the-art SR network, for which we selected a variant of EDSR. The second module is any edge detection network taking the output of the first SR module as input; we propose DenseEdgeNet for this module. Lastly, the third module merges the outputs of the first and second modules to yield an edge-enhanced SR image; we propose MergeNet for this module. Qualitatively, our proposed method yields images with sharper edges than other state-of-the-art SR methods. Quantitatively, SREdgeNet achieves state-of-the-art performance in terms of structural similarity (SSIM) while maintaining comparable PSNR for ×8 enlargement. |
Tasks | Edge Detection, Image Super-Resolution, Super-Resolution |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07174v1 |
PDF | http://arxiv.org/pdf/1812.07174v1.pdf |
PWC | https://paperswithcode.com/paper/sredgenet-edge-enhanced-single-image-super |
Repo | |
Framework | |
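
The three modules compose sequentially: super-resolve, detect edges on the result, then merge. A schematic sketch, with sr_net, edge_net, and merge_net as hypothetical stand-ins for the EDSR variant, DenseEdgeNet, and MergeNet, and channel concatenation as an assumed fusion interface:

```python
import torch

def sredgenet_forward(lr_image, sr_net, edge_net, merge_net):
    """Three-module composition described in the abstract.

    The module internals are stand-ins (sr_net ~ an EDSR variant,
    edge_net ~ DenseEdgeNet, merge_net ~ MergeNet); whether MergeNet
    consumes a channel concatenation is an assumption for illustration.
    """
    sr = sr_net(lr_image)                            # stage 1: upscale
    edges = edge_net(sr)                             # stage 2: edges on the SR output
    return merge_net(torch.cat([sr, edges], dim=1))  # stage 3: fuse into final SR image
```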
Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories
Title | Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories |
Authors | Huanhuan Yu, Menglei Hu, Songcan Chen |
Abstract | Unsupervised domain adaptation (UDA) aims to learn an unlabeled target domain by transferring knowledge from a labeled source domain. To date, most existing works focus on the scenario of one source domain and one target domain (1S1T), and just a few concern the scenario of multiple source domains and one target domain (mS1T). To the best of our knowledge, however, almost no work concerns the scenario of one source domain and multiple target domains (1SmT), in which the unlabeled target domains do not necessarily share the same categories; in contrast to mS1T, 1SmT is therefore more challenging. Accordingly, for this new UDA scenario, we propose a UDA framework based on model parameter adaptation (PA-1SmT). A key ingredient of PA-1SmT is to transfer knowledge through adaptive learning of a common model-parameter dictionary, which is completely different from existing popular methods for UDA, such as subspace alignment and distribution matching, and which can also be used directly for privacy-preserving DA, since knowledge is transferred only via the model parameters rather than the data itself. Finally, our experimental results on three domain adaptation benchmark datasets demonstrate the superiority of our framework. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.00852v2 |
PDF | http://arxiv.org/pdf/1809.00852v2.pdf |
PWC | https://paperswithcode.com/paper/multi-target-unsupervised-domain-adaptation |
Repo | |
Framework | |
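
The key ingredient is a shared model-parameter dictionary: each target model's parameters are approximated as a sparse combination of common atoms, so knowledge crosses domains through the dictionary rather than the data. A sketch of encoding one target model against a given dictionary via a generic ISTA sparse-coding step; the paper's actual optimization differs, and all names here are illustrative:

```python
import numpy as np

def encode_target_model(target_params, dictionary, lam=0.1, steps=200, lr=0.05):
    """Represent a target model's flattened parameters as a sparse
    combination of shared dictionary atoms, w_t ~= D @ a_t.

    Knowledge then flows between domains through D alone, never through
    data. The ISTA update below is a generic sparse-coding step standing
    in for the paper's optimization.
    """
    D = dictionary                        # (param_dim, n_atoms), shared across domains
    a = np.zeros(D.shape[1])              # sparse code for this target model
    for _ in range(steps):
        grad = D.T @ (D @ a - target_params)          # gradient of 0.5*||Da - w||^2
        a -= lr * grad                                # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft-threshold
    return a
```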