Paper Group ANR 245
State Distribution-aware Sampling for Deep Q-learning. Deep learning for dehazing: Comparison and analysis. On Attention Models for Human Activity Recognition. The MeMAD Submission to the IWSLT 2018 Speech Translation Task. Learning and Querying Fast Generative Models for Reinforcement Learning. Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation. Baselines for Reinforcement Learning in Text Games. Improving Data Quality through Deep Learning and Statistical Models. ARiA: Utilizing Richard’s Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets. Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter. Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning. Multi-Fidelity Recursive Behavior Prediction. Theoretical Guarantees of Transfer Learning. SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network. Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories.
State Distribution-aware Sampling for Deep Q-learning
Title | State Distribution-aware Sampling for Deep Q-learning |
Authors | Weichao Li, Fuxian Huang, Xi Li, Gang Pan, Fei Wu |
Abstract | A critical and challenging problem in reinforcement learning is how to learn the state-action value function from the experience replay buffer while maintaining sample efficiency and fast convergence to a high-quality solution. In prior works, transitions are either sampled uniformly at random from the replay buffer or sampled based on their priority as measured by the temporal-difference (TD) error. However, these approaches do not fully take into account the intrinsic characteristics of the transition distribution in the state space and can result in redundant and unnecessary TD updates, slowing down the convergence of the learning procedure. To overcome this problem, we propose a novel state distribution-aware sampling method that balances the replay times for transitions with a skewed distribution, taking into account both the occurrence frequencies of transitions and the uncertainty of state-action values. Consequently, our approach reduces unnecessary TD updates and increases TD updates for state-action values with higher uncertainty, making experience replay more effective and efficient. Extensive experiments are conducted on both classic control tasks and Atari 2600 games on the OpenAI Gym platform, and the results demonstrate the effectiveness of our approach in comparison with the standard DQN approach. |
Tasks | Atari Games, Q-Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08619v1 |
PDF | http://arxiv.org/pdf/1804.08619v1.pdf |
PWC | https://paperswithcode.com/paper/state-distribution-aware-sampling-for-deep-q |
Repo | |
Framework | |
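
The abstract describes weighting replay by both how often a region of state space occurs and how uncertain its value estimates are. A minimal sketch of such a buffer, assuming discretized state keys and a smoothed |TD error| as the uncertainty proxy — both illustrative choices, not the paper's exact scheme:

```python
import random
from collections import defaultdict, deque

class StateAwareReplayBuffer:
    """Replay buffer that samples rare / uncertain states more often."""

    def __init__(self, capacity=100_000, eps=1e-2):
        self.buffer = deque(maxlen=capacity)
        self.visit_counts = defaultdict(int)          # occurrence frequency per state key
        self.uncertainty = defaultdict(lambda: 1.0)   # running |TD error| per state key
        self.eps = eps

    def add(self, state_key, transition):
        self.buffer.append((state_key, transition))
        self.visit_counts[state_key] += 1

    def update_uncertainty(self, state_key, td_error, momentum=0.9):
        # Track a smoothed magnitude of the TD error as an uncertainty proxy.
        u = self.uncertainty[state_key]
        self.uncertainty[state_key] = momentum * u + (1 - momentum) * abs(td_error)

    def sample(self, batch_size):
        # Weight each transition by uncertainty / frequency, so densely
        # visited regions of the state space are not replayed redundantly.
        weights = [
            (self.uncertainty[k] + self.eps) / self.visit_counts[k]
            for k, _ in self.buffer
        ]
        batch = random.choices(self.buffer, weights=weights, k=batch_size)
        return [t for _, t in batch]
```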
Deep learning for dehazing: Comparison and analysis
Title | Deep learning for dehazing: Comparison and analysis |
Authors | A Benoit, Leonel Cuevas, Jean-Baptiste Thomas |
Abstract | We compare a recent deep-learning-based dehazing method, DehazeNet, with traditional state-of-the-art approaches on benchmark data with reference. DehazeNet estimates the transmission factor from a single color image, which is used to invert the Koschmieder model of imaging in the presence of haze; in this sense, the solution is still attached to the Koschmieder model. We demonstrate that the transmission is very well estimated by the network, but also that this method exhibits the same limitations as the others due to the use of the same imaging model. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10923v1 |
PDF | http://arxiv.org/pdf/1806.10923v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-dehazing-comparison-and |
Repo | |
Framework | |
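
The Koschmieder model referenced in the abstract is I(x) = J(x)·t(x) + A·(1 − t(x)); once a network such as DehazeNet has estimated the transmission t, the haze-free radiance J is recovered by inverting it. A sketch of that inversion, where the lower clamp t0 is a common heuristic rather than anything specified in this entry:

```python
import numpy as np

def dehaze(hazy, transmission, airlight, t0=0.1):
    """Invert the Koschmieder model I = J*t + A*(1 - t) to recover J.

    hazy:         HxWx3 image in [0, 1]
    transmission: HxW map, e.g. as estimated by a network such as DehazeNet
    airlight:     per-channel atmospheric light A, shape (3,)
    t0:           lower clamp on t to avoid amplifying noise in dense haze
    """
    t = np.clip(transmission, t0, 1.0)[..., None]  # broadcast over channels
    J = (hazy - airlight) / t + airlight           # J = (I - A)/t + A
    return np.clip(J, 0.0, 1.0)
```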
On Attention Models for Human Activity Recognition
Title | On Attention Models for Human Activity Recognition |
Authors | Vishvak S Murahari, Thomas Ploetz |
Abstract | Most approaches that model time-series data in human activity recognition (HAR) based on body-worn sensing use a fixed-size temporal context to represent different activities. This might, however, not be apt for sets of activities with individually varying durations. We introduce attention models into HAR research as a data-driven approach for exploring relevant temporal context. Attention models learn a set of weights over input data, which we leverage to weight the temporal context being considered to model each sensor reading. We construct attention models for HAR by adding attention layers to a state-of-the-art deep learning HAR model (DeepConvLSTM) and evaluate our approach on benchmark datasets, achieving a significant increase in performance. Finally, we visualize the learned weights to better understand what constitutes relevant temporal context. |
Tasks | Activity Recognition, Human Activity Recognition, Time Series |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07648v1 |
PDF | http://arxiv.org/pdf/1805.07648v1.pdf |
PWC | https://paperswithcode.com/paper/on-attention-models-for-human-activity |
Repo | |
Framework | |
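
The approach adds attention layers on top of a recurrent HAR model so the temporal context is weighted per sensor window. A minimal sketch of soft attention over LSTM hidden states; the layer placement and scoring function follow common practice and stand in for the paper's exact DeepConvLSTM integration:

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Soft attention over recurrent hidden states, pooling a weighted context."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, h):                              # h: (batch, time, hidden_dim)
        scores = self.score(h).squeeze(-1)             # (batch, time)
        weights = torch.softmax(scores, dim=1)         # attention over time steps
        context = (weights.unsqueeze(-1) * h).sum(dim=1)  # (batch, hidden_dim)
        return context, weights

# Usage on top of a recurrent encoder (a stand-in for DeepConvLSTM's LSTM part):
lstm = nn.LSTM(input_size=9, hidden_size=128, batch_first=True)
attn = TemporalAttention(128)
x = torch.randn(4, 24, 9)        # 4 windows, 24 time steps, 9 sensor channels
h, _ = lstm(x)
context, weights = attn(h)       # weights reveal which time steps mattered
```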
The MeMAD Submission to the IWSLT 2018 Speech Translation Task
Title | The MeMAD Submission to the IWSLT 2018 Speech Translation Task |
Authors | Umut Sulubacak, Jörg Tiedemann, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo |
Abstract | This paper describes the MeMAD project entry to the IWSLT Speech Translation Shared Task, addressing the translation of English audio into German text. Between the pipeline and end-to-end model tracks, we participated only in the former, with three contrastive systems. We also tried the latter, but were not able to finish our end-to-end model in time. All of our systems start by transcribing the audio into text through an automatic speech recognition (ASR) model trained on the TED-LIUM English Speech Recognition Corpus (TED-LIUM). Afterwards, we feed the transcripts into English-German text-based neural machine translation (NMT) models. Our systems employ three different translation models trained on separate training sets compiled from the English-German part of the TED Speech Translation Corpus (TED-Trans) and the OpenSubtitles2018 section of the OPUS collection. In this paper, we also describe the experiments leading up to our final systems. Our experiments indicate that using OpenSubtitles2018 in training significantly improves translation performance. We also experimented with various pre- and postprocessing routines for the NMT module, but did not have much success with these. Our best-scoring system attains a BLEU score of 16.45 on the test set for this year’s task. |
Tasks | Machine Translation, Speech Recognition |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10320v1 |
PDF | http://arxiv.org/pdf/1810.10320v1.pdf |
PWC | https://paperswithcode.com/paper/the-memad-submission-to-the-iwslt-2018-speech |
Repo | |
Framework | |
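
The pipeline track composes two independently trained components: English ASR followed by English-German NMT. A schematic sketch, where asr_model and nmt_model are hypothetical wrappers around the trained systems (a TED-LIUM ASR model and a TED-Trans/OpenSubtitles2018 NMT model):

```python
def speech_translate(audio, asr_model, nmt_model):
    """Pipeline speech translation: English audio -> English text -> German text.

    asr_model and nmt_model are hypothetical wrappers around the trained
    components; only the two-stage composition is taken from the abstract.
    """
    transcript = asr_model.transcribe(audio)   # English ASR hypothesis
    return nmt_model.translate(transcript)     # German translation of the transcript
```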
Learning and Querying Fast Generative Models for Reinforcement Learning
Title | Learning and Querying Fast Generative Models for Reinforcement Learning |
Authors | Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David P. Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra |
Abstract | A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models. We show that carefully designed generative models that learn and operate on compact state representations, so-called state-space models, substantially reduce the computational costs of predicting the outcomes of sequences of actions. Extensive experiments establish that state-space models accurately capture, from raw pixels, the dynamics of Atari games in the Arcade Learning Environment. The computational speed-up of state-space models, achieved while maintaining high accuracy, makes their application in RL feasible: we demonstrate that agents that query these models for decision making outperform strong model-free baselines on the game MSPACMAN, demonstrating the potential of learned environment models for planning. |
Tasks | Atari Games, Decision Making |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.03006v1 |
PDF | http://arxiv.org/pdf/1802.03006v1.pdf |
PWC | https://paperswithcode.com/paper/learning-and-querying-fast-generative-models |
Repo | |
Framework | |
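
The speed-up comes from rolling the model forward in a compact latent state instead of pixel space, decoding frames only when needed. A deterministic sketch of such an action-conditional rollout (the paper's models also include stochastic latent transitions; the sizes here are illustrative):

```python
import torch
import torch.nn as nn

class LatentRollout(nn.Module):
    """Roll out an action-conditional state-space model in latent space.

    Predicting s_{t+1} from (s_t, a_t) with a small MLP is far cheaper than
    autoregressive prediction in pixel space; frames need never be decoded
    during planning unless a pixel-level prediction is explicitly required.
    """

    def __init__(self, state_dim=64, action_dim=4):
        super().__init__()
        self.dynamics = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, state_dim),
        )
        self.reward_head = nn.Linear(state_dim, 1)

    def forward(self, state, actions):          # actions: (batch, T, action_dim)
        rewards = []
        for t in range(actions.shape[1]):       # latent-only rollout over T steps
            state = self.dynamics(torch.cat([state, actions[:, t]], dim=-1))
            rewards.append(self.reward_head(state))
        return state, torch.stack(rewards, dim=1)
```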
Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation
Title | Investigating Linguistic Pattern Ordering in Hierarchical Natural Language Generation |
Authors | Shang-Yu Su, Yun-Nung Chen |
Abstract | Natural language generation (NLG) is a critical component in spoken dialogue systems and can be divided into two phases: (1) sentence planning, deciding the overall sentence structure, and (2) surface realization, determining specific word forms and flattening the sentence structure into a string. With the rise of deep learning, most modern NLG models are based on a sequence-to-sequence (seq2seq) model, which is essentially an encoder-decoder structure; these NLG models generate sentences from scratch by jointly optimizing sentence planning and surface realization. However, such a simple encoder-decoder architecture usually fails to generate complex and long sentences, because the decoder has difficulty learning all grammar and diction knowledge well. This paper introduces an NLG model with a hierarchical attentional decoder, where the hierarchy focuses on leveraging linguistic knowledge in a specific order. The experiments show that the proposed method significantly outperforms the traditional seq2seq model with a smaller model size, and that the design of the hierarchical attentional decoder can be applied to various NLG systems. Furthermore, different generation strategies based on linguistic patterns are investigated and analyzed in order to guide future NLG research. |
Tasks | Text Generation |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07629v1 |
PDF | http://arxiv.org/pdf/1809.07629v1.pdf |
PWC | https://paperswithcode.com/paper/investigating-linguistic-pattern-ordering-in |
Repo | |
Framework | |
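
The hierarchical decoder injects linguistic knowledge by generating in a fixed pattern order, with later stages conditioning on earlier ones. A sketch of that layering; the four-stage part-of-speech ordering and the per-stage GRUs are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class HierarchicalDecoder(nn.Module):
    """Decode in a fixed linguistic order, one GRU layer per pattern stage.

    Each stage re-reads the output of the previous stage, so later stages
    (e.g. function words) can condition on earlier ones (e.g. nouns, verbs).
    The stage ordering here is an illustrative assumption.
    """

    def __init__(self, hidden=128, stages=("noun", "verb", "adj_adv", "others")):
        super().__init__()
        self.stages = stages
        self.layers = nn.ModuleList(
            nn.GRU(hidden, hidden, batch_first=True) for _ in stages
        )

    def forward(self, semantic_repr):           # (batch, T, hidden)
        h = semantic_repr
        per_stage = []
        for layer in self.layers:               # pass through stages in order
            h, _ = layer(h)
            per_stage.append(h)                 # each supervised against its stage's targets
        return per_stage
```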
Baselines for Reinforcement Learning in Text Games
Title | Baselines for Reinforcement Learning in Text Games |
Authors | Mikuláš Zelinka |
Abstract | The ability to learn optimal control policies in systems where the action space is defined by sentences in natural language would enable many interesting real-world applications, such as automatic optimisation of dialogue systems. Text-based games with multiple endings and rewards are a promising platform for this task, since their feedback allows us to employ reinforcement learning techniques to jointly learn text representations and control policies. We argue that the key property of AI agents, especially in the text-games context, is their ability to generalise to previously unseen games. We present a minimalistic text-game-playing agent, test its generalisation and transfer-learning performance, and show its ability to play multiple games at once. We also present pyfiction, an open-source library for universal access to different text games that could, together with our agent implementing its interface, serve as a baseline for future research. |
Tasks | Transfer Learning |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02872v2 |
PDF | http://arxiv.org/pdf/1811.02872v2.pdf |
PWC | https://paperswithcode.com/paper/baselines-for-reinforcement-learning-in-text |
Repo | |
Framework | |
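
An agent for such games must score natural-language actions against the current state description. A minimal epsilon-greedy sketch, assuming fixed-size text embeddings and a bilinear Q-function Q(s, a) = sᵀWa — a hypothetical stand-in for the paper's jointly learned representations:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator for reproducible exploration

def choose_action(state_vec, action_vecs, W, epsilon=0.1):
    """Epsilon-greedy choice among natural-language actions.

    state_vec / action_vecs are fixed-size text embeddings (e.g. averaged
    word vectors); the bilinear scorer W is a hypothetical minimal
    Q-function over jointly learned text representations.
    """
    if rng.random() < epsilon:
        return int(rng.integers(len(action_vecs)))    # explore
    q_values = np.array([state_vec @ W @ a for a in action_vecs])
    return int(np.argmax(q_values))                   # exploit best-scoring action
```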
Improving Data Quality through Deep Learning and Statistical Models
Title | Improving Data Quality through Deep Learning and Statistical Models |
Authors | Wei Dai, Kenji Yoshigoe, William Parsley |
Abstract | Traditional data quality control methods are based on user experience or previously established business rules, which limits performance in addition to being very time-consuming, with lower-than-desirable accuracy. Utilizing deep learning, we can leverage computing resources and advanced techniques to overcome these challenges and provide greater value to users. In this paper, we first review relevant works and discuss machine learning techniques, tools, and statistical quality models. Second, we offer a data quality framework based on deep learning and a statistical model algorithm for assessing data quality. Third, we use data on salary levels from an open dataset published by the state of Arkansas to demonstrate how to identify outlier data and how to improve data quality via deep learning. Finally, we discuss future work. |
Tasks | |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.07132v1 |
PDF | http://arxiv.org/pdf/1810.07132v1.pdf |
PWC | https://paperswithcode.com/paper/improving-data-quality-through-deep-learning |
Repo | |
Framework | |
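
The framework pairs a statistical quality model with a deep model. A sketch of a statistical first pass on salary values using a MAD-based robust z-score — an illustrative choice, not necessarily the paper's statistical model — with deep-learning re-examination of the flagged records left as the second stage:

```python
import numpy as np

def flag_outliers(salaries, z_thresh=3.0):
    """Statistical first pass: flag salaries far from the robust center.

    A deep model (e.g. an autoencoder scored by reconstruction error) can
    then re-examine the flagged records together with their other fields.
    The threshold and the median/MAD statistics here are illustrative.
    """
    salaries = np.asarray(salaries, dtype=float)
    median = np.median(salaries)
    mad = np.median(np.abs(salaries - median)) or 1.0   # guard against zero spread
    robust_z = 0.6745 * (salaries - median) / mad       # MAD-based z-score
    return np.abs(robust_z) > z_thresh                  # boolean outlier mask
```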
ARiA: Utilizing Richard’s Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets
Title | ARiA: Utilizing Richard’s Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets |
Authors | Narendra Patwardhan, Madhura Ingalhalikar, Rahee Walambe |
Abstract | This work introduces a novel activation unit that can be efficiently employed in deep neural nets (DNNs) and performs significantly better than the traditional Rectified Linear Unit (ReLU). The function developed is a two-parameter version of the specialized Richard’s Curve, which we call the Adaptive Richard’s Curve weighted Activation (ARiA). This function is non-monotonic, analogous to the recently introduced Swish; however, it allows precise control over its non-monotonic convexity by varying its hyper-parameters. We first demonstrate the mathematical significance of the two-parameter ARiA and then apply it to benchmark problems such as MNIST, CIFAR-10, and CIFAR-100, where we compare its performance with ReLU and Swish units. Our results illustrate significantly superior performance on all these datasets, making ARiA a potential replacement for ReLU and other activations in DNNs. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08878v1 |
PDF | http://arxiv.org/pdf/1805.08878v1.pdf |
PWC | https://paperswithcode.com/paper/aria-utilizing-richards-curve-for-controlling |
Repo | |
Framework | |
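
A sketch of the two-parameter form suggested by the abstract: the input gated by a Richard's-curve-shaped sigmoid, ARiA(x) = x·σ(βx)^α, which reduces to the Swish-like x·σ(x) at α = β = 1. The exact parameterization in the paper may differ, and the default hyper-parameter values here are illustrative:

```python
import numpy as np

def aria2(x, alpha=1.5, beta=2.0):
    """Two-parameter ARiA-style activation: x * sigmoid(beta * x) ** alpha.

    alpha and beta control the non-monotonic bump near the origin; with
    alpha = beta = 1 this reduces to the Swish/SiLU activation x * sigmoid(x).
    """
    return x * np.power(1.0 / (1.0 + np.exp(-beta * x)), alpha)

# Quick look at the non-monotonic region left of zero:
xs = np.linspace(-4, 4, 9)
print(np.round(aria2(xs), 3))
```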
Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter
Title | Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter |
Authors | Ziqi Zhang, Lei Luo |
Abstract | In recent years, the increasing propagation of hate speech on social media and the urgent need for effective countermeasures have drawn significant investment from governments, companies, and researchers. A large number of methods have been developed for automated hate speech detection online, classifying textual content as non-hate or hate speech, in which case the method may also identify the targeting characteristics (i.e., types of hate, such as race and religion) in the hate speech. However, we notice a significant difference between the performance on the two classes (i.e., non-hate vs. hate). In this work, we argue for a focus on the latter problem for practical reasons. We show that it is a much more challenging task: our analysis of the language in the typical datasets shows that hate speech lacks unique, discriminative features and is therefore found in the ‘long tail’ of a dataset, where it is difficult to discover. We then propose deep neural network structures serving as feature extractors that are particularly effective for capturing the semantics of hate speech. Our methods are evaluated on the largest collection of hate-speech datasets based on Twitter and are shown to outperform the best-performing method by up to 5 percentage points in macro-average F1, or 8 percentage points in the more challenging case of identifying hateful content. |
Tasks | Hate Speech Detection |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1803.03662v2 |
PDF | http://arxiv.org/pdf/1803.03662v2.pdf |
PWC | https://paperswithcode.com/paper/hate-speech-detection-a-solved-problem-the |
Repo | |
Framework | |
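
A sketch of a convolution-plus-recurrent feature extractor of the kind the abstract describes for capturing hate-speech semantics; the specific layer sizes and the Conv1d-then-GRU arrangement are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ConvGRUClassifier(nn.Module):
    """Convolution over word embeddings followed by a GRU, a common pattern
    for tweet classification; sizes here are illustrative."""

    def __init__(self, vocab_size=20_000, emb=100, channels=100, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, channels, kernel_size=4, padding=2)
        self.gru = nn.GRU(channels, channels, batch_first=True)
        self.out = nn.Linear(channels, classes)

    def forward(self, tokens):                       # tokens: (batch, T) word ids
        e = self.embed(tokens).transpose(1, 2)       # (batch, emb, T)
        c = torch.relu(self.conv(e)).transpose(1, 2) # local n-gram features
        _, h = self.gru(c)                           # h: (1, batch, channels)
        return self.out(h.squeeze(0))                # class logits
```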
Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning
Title | Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning |
Authors | Hao Yu, Sen Yang, Shenghuo Zhu |
Abstract | In distributed training of deep neural networks, parallel mini-batch SGD is widely used to speed up training with multiple workers. It uses the workers to sample local stochastic gradients in parallel, aggregates all gradients on a single server to obtain the average, and updates each worker’s local model with an SGD step using the averaged gradient. Ideally, parallel mini-batch SGD achieves a linear speed-up of training time (with respect to the number of workers) compared with SGD on a single worker. In practice, however, such linear scalability is significantly limited by the growing demand for gradient communication as more workers are involved. Model averaging, which periodically averages individual models trained on the parallel workers, is another common practice for distributed training of deep neural networks (Zinkevich et al. 2010; McDonald, Hall, and Mann 2010). Compared with parallel mini-batch SGD, the communication overhead of model averaging is significantly reduced. Impressively, extensive experimental work has verified that model averaging can still achieve a good training speed-up as long as the averaging interval is carefully controlled. However, it remains a mystery in theory why such a simple heuristic works so well. This paper provides a thorough and rigorous theoretical study of why model averaging can work as well as parallel mini-batch SGD with significantly less communication overhead. |
Tasks | |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06629v3 |
PDF | http://arxiv.org/pdf/1807.06629v3.pdf |
PWC | https://paperswithcode.com/paper/parallel-restarted-sgd-with-faster |
Repo | |
Framework | |
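
The scheme under analysis is local SGD with periodic restarts from the averaged model. A minimal sketch, where grad_fns are hypothetical per-worker stochastic-gradient oracles; parameters are communicated once per `local_steps` updates rather than at every step:

```python
import numpy as np

def parallel_restarted_sgd(grad_fns, w0, lr=0.1, rounds=50, local_steps=8):
    """Local SGD with periodic model averaging.

    grad_fns: one stochastic-gradient function per worker (hypothetical
    oracles over each worker's local data shard). Workers take `local_steps`
    independent SGD steps, then restart from the averaged model, cutting
    communication by a factor of `local_steps` versus per-step aggregation.
    """
    models = [w0.copy() for _ in grad_fns]
    for _ in range(rounds):
        for w, grad in zip(models, grad_fns):
            for _ in range(local_steps):
                w -= lr * grad(w)            # local, communication-free update
        avg = np.mean(models, axis=0)        # one round of communication
        models = [avg.copy() for _ in grad_fns]
    return models[0]
```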
Multi-Fidelity Recursive Behavior Prediction
Title | Multi-Fidelity Recursive Behavior Prediction |
Authors | Mihir Jain, Kyle Brown, Ahmed K. Sadek |
Abstract | Predicting the behavior of surrounding vehicles is a critical problem in automated driving. We present a novel game-theoretic behavior prediction model that achieves state-of-the-art prediction accuracy by explicitly reasoning about possible future interactions between agents. We evaluate our approach on the NGSIM vehicle trajectory dataset and demonstrate lower root mean square error than state-of-the-art methods. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1901.01831v1 |
PDF | http://arxiv.org/pdf/1901.01831v1.pdf |
PWC | https://paperswithcode.com/paper/multi-fidelity-recursive-behavior-prediction |
Repo | |
Framework | |
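
The abstract does not spell out the recursion, but "explicitly reasoning about possible future interaction" suggests a level-k-style scheme in which each agent's forecast is refined against the others' lower-depth forecasts. A purely illustrative sketch under that reading, with base_predict and interact_predict as hypothetical predictors:

```python
def recursive_predict(agents, base_predict, interact_predict, depth=2):
    """Level-k style recursive behavior prediction (an illustrative reading
    of the abstract, which does not detail the exact recursion).

    base_predict(agent):             trajectory forecast ignoring interaction
    interact_predict(agent, others): forecast given the others' predictions
    """
    preds = {a: base_predict(a) for a in agents}   # depth-0 (non-interactive)
    for _ in range(depth):                         # refine against the others
        preds = {
            a: interact_predict(a, {b: p for b, p in preds.items() if b is not a})
            for a in agents
        }
    return preds
```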
Theoretical Guarantees of Transfer Learning
Title | Theoretical Guarantees of Transfer Learning |
Authors | Zirui Wang |
Abstract | Transfer learning has proven effective when within-target labeled data is scarce. Many works have developed successful algorithms and empirically observed a positive transfer effect that improves target generalization error using source knowledge. However, theoretical analysis of transfer learning is more challenging due to the nature of the problem, and is thus less studied. In this report, we survey theoretical works on transfer learning and summarize key theoretical guarantees that prove its effectiveness. The theoretical bounds are derived using model complexity and learning-algorithm stability. As we shall see, these works exhibit a trade-off between tight bounds and restrictive assumptions. Moreover, we prove a new generalization bound for the multi-source transfer learning problem using VC theory, which is more informative than the one proved in previous work. |
Tasks | Transfer Learning |
Published | 2018-10-14 |
URL | http://arxiv.org/abs/1810.05986v2 |
PDF | http://arxiv.org/pdf/1810.05986v2.pdf |
PWC | https://paperswithcode.com/paper/theoretical-guarantees-of-transfer-learning |
Repo | |
Framework | |
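
Representative of the guarantees such surveys cover is the classic single-source bound of Ben-David et al. (2010), relating target risk to source risk plus a divergence term; it is shown here for context only, and the report's new multi-source VC bound is a different result:

```latex
% For a hypothesis h with source/target risks \epsilon_S, \epsilon_T:
\epsilon_T(h) \;\le\; \epsilon_S(h)
  \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  \;+\; \lambda
```

where d_HΔH measures the divergence between the source and target distributions and λ is the risk of the ideal joint hypothesis on both domains combined.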
SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network
Title | SREdgeNet: Edge Enhanced Single Image Super Resolution using Dense Edge Detection Network and Feature Merge Network |
Authors | Kwanyoung Kim, Se Young Chun |
Abstract | Deep-learning-based single-image super-resolution (SR) methods have evolved rapidly over the past few years and have yielded state-of-the-art performance over conventional methods. Since these methods usually minimize the l1 loss between the output SR image and the ground-truth image, they yield very high peak signal-to-noise ratio (PSNR), which is inversely proportional to this loss. Unfortunately, minimizing such losses inevitably leads to blurred edges due to the averaging of plausible solutions. Recently, SRGAN was proposed to avoid this averaging effect by minimizing perceptual losses instead of the l1 loss, and it yields perceptually better SR images (images with sharp edges) at the price of lower PSNR. In this paper, we propose SREdgeNet, an edge-enhanced single-image SR network, inspired by conventional SR theory, so that the averaging effect is avoided not by changing the loss but by changing the SR network’s properties under the same l1 loss. SREdgeNet consists of three sequential deep neural network modules: the first module is any state-of-the-art SR network, for which we selected a variant of EDSR. The second module is any edge detection network taking the output of the first SR module as input; we propose DenseEdgeNet for this module. Lastly, the third module merges the outputs of the first and second modules to yield an edge-enhanced SR image; we propose MergeNet for this module. Qualitatively, our proposed method yields images with sharper edges than other state-of-the-art SR methods. Quantitatively, SREdgeNet achieves state-of-the-art performance in terms of structural similarity (SSIM) while maintaining comparable PSNR for ×8 enlargement. |
Tasks | Edge Detection, Image Super-Resolution, Super-Resolution |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07174v1 |
PDF | http://arxiv.org/pdf/1812.07174v1.pdf |
PWC | https://paperswithcode.com/paper/sredgenet-edge-enhanced-single-image-super |
Repo | |
Framework | |
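
The three modules compose sequentially: super-resolve, detect edges on the result, then merge. A schematic sketch, with sr_net, edge_net, and merge_net as hypothetical stand-ins for the EDSR variant, DenseEdgeNet, and MergeNet, and channel concatenation as an assumed fusion interface:

```python
import torch

def sredgenet_forward(lr_image, sr_net, edge_net, merge_net):
    """Three-module composition described in the abstract.

    The module internals are stand-ins (sr_net ~ an EDSR variant,
    edge_net ~ DenseEdgeNet, merge_net ~ MergeNet); whether MergeNet
    consumes a channel concatenation is an assumption for illustration.
    """
    sr = sr_net(lr_image)                            # stage 1: upscale
    edges = edge_net(sr)                             # stage 2: edges on the SR output
    return merge_net(torch.cat([sr, edges], dim=1))  # stage 3: fuse into final SR image
```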
Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories
Title | Multi-target Unsupervised Domain Adaptation without Exactly Shared Categories |
Authors | Huanhuan Yu, Menglei Hu, Songcan Chen |
Abstract | Unsupervised domain adaptation (UDA) aims to learn an unlabeled target domain by transferring knowledge from a labeled source domain. To date, most existing works focus on the scenario of one source domain and one target domain (1S1T), and just a few concern the scenario of multiple source domains and one target domain (mS1T). To the best of our knowledge, however, almost no work concerns the scenario of one source domain and multiple target domains (1SmT), in which the unlabeled target domains do not necessarily share the same categories; in contrast to mS1T, 1SmT is therefore more challenging. Accordingly, for this new UDA scenario, we propose a UDA framework based on model parameter adaptation (PA-1SmT). A key ingredient of PA-1SmT is to transfer knowledge through adaptive learning of a common model-parameter dictionary, which is completely different from existing popular methods for UDA, such as subspace alignment and distribution matching, and which can also be used directly for privacy-preserving DA, since knowledge is transferred only via the model parameters rather than the data itself. Finally, our experimental results on three domain adaptation benchmark datasets demonstrate the superiority of our framework. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.00852v2 |
PDF | http://arxiv.org/pdf/1809.00852v2.pdf |
PWC | https://paperswithcode.com/paper/multi-target-unsupervised-domain-adaptation |
Repo | |
Framework | |
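
The key ingredient is a shared model-parameter dictionary: each target model's parameters are approximated as a sparse combination of common atoms, so knowledge crosses domains through the dictionary rather than the data. A sketch of encoding one target model against a given dictionary via a generic ISTA sparse-coding step; the paper's actual optimization differs, and all names here are illustrative:

```python
import numpy as np

def encode_target_model(target_params, dictionary, lam=0.1, steps=200, lr=0.05):
    """Represent a target model's flattened parameters as a sparse
    combination of shared dictionary atoms, w_t ~= D @ a_t.

    Knowledge then flows between domains through D alone, never through
    data. The ISTA update below is a generic sparse-coding step standing
    in for the paper's optimization.
    """
    D = dictionary                        # (param_dim, n_atoms), shared across domains
    a = np.zeros(D.shape[1])              # sparse code for this target model
    for _ in range(steps):
        grad = D.T @ (D @ a - target_params)          # gradient of 0.5*||Da - w||^2
        a -= lr * grad                                # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft-threshold
    return a
```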