July 30, 2019


Paper Group AWR 37


Dual Long Short-Term Memory Networks for Sub-Character Representation Learning. JamBot: Music Theory Aware Chord Based Generation of Polyphonic Music with LSTMs. Black-Box Data-efficient Policy Search for Robotics. Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization. An Architecture Combining Convolutional Neural Network (CN …

Dual Long Short-Term Memory Networks for Sub-Character Representation Learning

Title Dual Long Short-Term Memory Networks for Sub-Character Representation Learning
Authors Han He, Lei Wu, Xiaokun Yang, Hua Yan, Zhimin Gao, Yi Feng, George Townsend
Abstract Characters have commonly been regarded as the minimal processing unit in Natural Language Processing (NLP). But many non-Latin languages have hieroglyphic writing systems, involving a big alphabet with thousands or millions of characters. Each character is composed of even smaller parts, which are often ignored by previous work. In this paper, we propose a novel architecture employing two stacked Long Short-Term Memory Networks (LSTMs) to learn sub-character level representations and capture a deeper level of semantic meaning. To build a concrete study and substantiate the efficiency of our neural architecture, we take Chinese Word Segmentation as a research case example. Among those languages, Chinese is a typical case, for which every character contains several components called radicals. Our networks employ a shared radical level embedding to solve both Simplified and Traditional Chinese Word Segmentation, without extra Traditional to Simplified Chinese conversion. In such a highly end-to-end way, word segmentation can be significantly simplified compared to previous work. Radical level embeddings can also capture deeper semantic meaning below the character level and improve learning performance. By tying radical and character embeddings together, the parameter count is reduced while semantic knowledge is shared and transferred between the two levels, substantially boosting performance. On 3 out of 4 Bakeoff 2005 datasets, our method surpassed state-of-the-art results by up to 0.4%. Our results are reproducible; source code and corpora are available on GitHub.
Tasks Chinese Word Segmentation, Representation Learning
Published 2017-12-23
URL http://arxiv.org/abs/1712.08841v2
PDF http://arxiv.org/pdf/1712.08841v2.pdf
PWC https://paperswithcode.com/paper/dual-long-short-term-memory-networks-for-sub
Repo https://github.com/hankcs/sub-character-cws
Framework none

JamBot: Music Theory Aware Chord Based Generation of Polyphonic Music with LSTMs

Title JamBot: Music Theory Aware Chord Based Generation of Polyphonic Music with LSTMs
Authors Gino Brunner, Yuyi Wang, Roger Wattenhofer, Jonas Wiesendanger
Abstract We propose a novel approach for the generation of polyphonic music based on LSTMs. We generate music in two steps. First, a chord LSTM predicts a chord progression based on a chord embedding. A second LSTM then generates polyphonic music from the predicted chord progression. The generated music sounds pleasing and harmonic, with only few dissonant notes. It has clear long-term structure that is similar to what a musician would play during a jam session. We show that our approach is sensible from a music theory perspective by evaluating the learned chord embeddings. Surprisingly, our simple model managed to extract the circle of fifths, an important tool in music theory, from the dataset.
Tasks
Published 2017-11-21
URL http://arxiv.org/abs/1711.07682v1
PDF http://arxiv.org/pdf/1711.07682v1.pdf
PWC https://paperswithcode.com/paper/jambot-music-theory-aware-chord-based
Repo https://github.com/brunnergino/JamBot
Framework tf

Black-Box Data-efficient Policy Search for Robotics

Title Black-Box Data-efficient Policy Search for Robotics
Authors Konstantinos Chatzilygeroudis, Roberto Rama, Rituraj Kaushik, Dorian Goepp, Vassilis Vassiliades, Jean-Baptiste Mouret
Abstract The most data-efficient algorithms for reinforcement learning (RL) in robotics are based on uncertain dynamical models: after each episode, they first learn a dynamical model of the robot, then they use an optimization algorithm to find a policy that maximizes the expected return given the model and its uncertainties. It is often believed that this optimization can be tractable only if analytical, gradient-based algorithms are used; however, these algorithms require using specific families of reward functions and policies, which greatly limits the flexibility of the overall approach. In this paper, we introduce a novel model-based RL algorithm, called Black-DROPS (Black-box Data-efficient RObot Policy Search), that: (1) does not impose any constraint on the reward function or the policy (they are treated as black boxes), (2) is as data-efficient as the state-of-the-art algorithm for data-efficient RL in robotics, and (3) is as fast as (or faster than) analytical approaches when several cores are available. The key idea is to replace the gradient-based optimization algorithm with a parallel, black-box algorithm that takes into account the model uncertainties. We demonstrate the performance of our new algorithm on two standard control benchmark problems (in simulation) and a low-cost robotic manipulator (with a real robot).
Tasks Continuous Control
Published 2017-03-21
URL http://arxiv.org/abs/1703.07261v2
PDF http://arxiv.org/pdf/1703.07261v2.pdf
PWC https://paperswithcode.com/paper/black-box-data-efficient-policy-search-for
Repo https://github.com/resibots/blackdrops
Framework none

Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization

Title Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization
Authors Yinpeng Dong, Renkun Ni, Jianguo Li, Yurong Chen, Jun Zhu, Hang Su
Abstract Low-bit deep neural networks (DNNs) have become critical for embedded applications due to their low storage requirements and computing efficiency. However, they suffer from a non-negligible accuracy drop. This paper proposes the stochastic quantization (SQ) algorithm for learning accurate low-bit DNNs. The motivation is the following observation. Existing training algorithms approximate the real-valued elements/filters with a low-bit representation all together in each iteration. The quantization errors may be small for some elements/filters but large for others, which leads to inappropriate gradient directions during training and thus a notable accuracy drop. Instead, SQ quantizes a portion of the elements/filters to low-bit with a stochastic probability inversely proportional to the quantization error, while keeping the other portion unchanged at full precision. The quantized and full-precision portions are updated with their corresponding gradients separately in each iteration. The SQ ratio is gradually increased until the whole network is quantized. This procedure can greatly compensate for the quantization error and thus yield better accuracy for low-bit DNNs. Experiments show that SQ can consistently and significantly improve the accuracy of different low-bit DNNs on various datasets and various network structures.
Tasks Quantization
Published 2017-08-03
URL http://arxiv.org/abs/1708.01001v1
PDF http://arxiv.org/pdf/1708.01001v1.pdf
PWC https://paperswithcode.com/paper/learning-accurate-low-bit-deep-neural
Repo https://github.com/dongyp13/Stochastic-Quantization
Framework none
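
The selection step described in the abstract above can be illustrated with a short sketch. It is not the authors' implementation: the binary quantizer (sign times mean absolute value), the relative L1 error measure, and the helper names `binarize`, `relative_error`, `select_filters`, and `stochastic_quantize` are all assumptions made for illustration. Only the idea of choosing a fraction of filters with probability inversely proportional to their quantization error comes from the abstract.

```python
import random


def binarize(filt):
    """Assumed quantizer: sign of each weight times the mean absolute value."""
    scale = sum(abs(w) for w in filt) / len(filt)
    return [scale if w >= 0 else -scale for w in filt]


def relative_error(filt, quantized):
    """Assumed error measure: L1 quantization error relative to the filter's L1 norm."""
    num = sum(abs(w - q) for w, q in zip(filt, quantized))
    den = sum(abs(w) for w in filt) or 1.0
    return num / den


def select_filters(errors, ratio, rng, eps=1e-7):
    """Pick round(ratio * n) indices without replacement, weighted by 1 / (error + eps)."""
    n = len(errors)
    k = round(ratio * n)
    weights = [1.0 / (e + eps) for e in errors]
    candidates = list(range(n))
    chosen = []
    for _ in range(k):
        pick = rng.choices(candidates, weights=[weights[i] for i in candidates], k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return sorted(chosen)


def stochastic_quantize(filters, ratio, rng):
    """Return the mixed-precision filters and the indices that were quantized."""
    quantized = [binarize(f) for f in filters]
    errors = [relative_error(f, q) for f, q in zip(filters, quantized)]
    chosen = set(select_filters(errors, ratio, rng))
    mixed = [quantized[i] if i in chosen else list(filters[i]) for i in range(len(filters))]
    return mixed, sorted(chosen)
```

In a training loop, `ratio` would be raised step by step (for example 0.5, 0.75, 1.0) so that the whole network ends up quantized, matching the gradually increasing SQ ratio mentioned in the abstract; the specific schedule is again an assumption.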

An Architecture Combining Convolutional Neural Network (CNN) and Support Vector Machine (SVM) for Image Classification

Title An Architecture Combining Convolutional Neural Network (CNN) and Support Vector Machine (SVM) for Image Classification
Authors Abien Fred Agarap
Abstract Convolutional neural networks (CNNs) are similar to “ordinary” neural networks in the sense that they are made up of hidden layers consisting of neurons with “learnable” parameters. These neurons receive inputs, perform a dot product, and then follow it with a non-linearity. The whole network expresses the mapping between raw image pixels and their class scores. Conventionally, the Softmax function is the classifier used at the last layer of this network. However, there have been studies (Alalshekmubarak and Smith, 2013; Agarap, 2017; Tang, 2013) conducted to challenge this norm. The cited studies introduce the usage of a linear support vector machine (SVM) in an artificial neural network architecture. This project is yet another take on the subject, and is inspired by (Tang, 2013). Empirical data has shown that the CNN-SVM model was able to achieve a test accuracy of ~99.04% using the MNIST dataset (LeCun, Cortes, and Burges, 2010). On the other hand, the CNN-Softmax was able to achieve a test accuracy of ~99.23% using the same dataset. Both models were also tested on the recently published Fashion-MNIST dataset (Xiao, Rasul, and Vollgraf, 2017), which is supposed to be a more difficult image classification dataset than MNIST (Zalandoresearch, 2017). This proved to be the case, as CNN-SVM reached a test accuracy of ~90.72%, while the CNN-Softmax reached a test accuracy of ~91.86%. The said results may be improved if data preprocessing techniques were employed on the datasets, and if the base CNN model was relatively more sophisticated than the one used in this study.
Tasks Image Classification
Published 2017-12-10
URL http://arxiv.org/abs/1712.03541v2
PDF http://arxiv.org/pdf/1712.03541v2.pdf
PWC https://paperswithcode.com/paper/an-architecture-combining-convolutional
Repo https://github.com/AFAgarap/cnn-svm
Framework tf
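
The difference between the two classifier heads compared in the abstract above comes down to the loss applied to the final class scores. The sketch below contrasts softmax cross-entropy with a one-vs-rest squared hinge loss. Treating the SVM head as a squared hinge (L2-SVM) loss is an assumption made here for illustration, and the function names are hypothetical; the abstract itself only states that a linear SVM replaces Softmax at the last layer.

```python
import math


def softmax_cross_entropy(scores, label):
    """Negative log-probability of the true class under a softmax over the scores."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[label]


def squared_hinge_loss(scores, label):
    """One-vs-rest squared hinge loss: target +1 for the true class, -1 for the others."""
    total = 0.0
    for k, s in enumerate(scores):
        y = 1.0 if k == label else -1.0
        total += max(0.0, 1.0 - y * s) ** 2
    return total
```

Unlike cross-entropy, the hinge loss drops to exactly zero once every class score clears its margin, so confidently correct examples stop contributing gradient. A weight regularization term, which an SVM objective normally includes, is left out of this sketch.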

Negative Sampling Improves Hypernymy Extraction Based on Projection Learning

Title Negative Sampling Improves Hypernymy Extraction Based on Projection Learning
Authors Dmitry Ustalov, Nikolay Arefyev, Chris Biemann, Alexander Panchenko
Abstract We present a new approach to the extraction of hypernyms based on projection learning and word embeddings. In contrast to classification-based approaches, projection-based methods require no candidate hyponym-hypernym pairs. While it is natural to use both positive and negative training examples in supervised relation extraction, the impact of negative examples on hypernym prediction has not been studied so far. In this paper, we show that explicit negative examples used for regularization of the model significantly improve performance compared to the state-of-the-art approach of Fu et al. (2014) on three datasets from different languages.
Tasks Relation Extraction, Word Embeddings
Published 2017-07-12
URL http://arxiv.org/abs/1707.03903v2
PDF http://arxiv.org/pdf/1707.03903v2.pdf
PWC https://paperswithcode.com/paper/negative-sampling-improves-hypernymy
Repo https://github.com/nlpub/projlearn
Framework tf

Embodied Question Answering

Title Embodied Question Answering
Authors Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
Abstract We present a new AI task – Embodied Question Answering (EmbodiedQA) – where an agent is spawned at a random location in a 3D environment and asked a question (“What color is the car?”). In order to answer, the agent must first intelligently navigate to explore the environment, gather information through first-person (egocentric) vision, and then answer the question (“orange”). This challenging task requires a range of AI skills – active perception, language understanding, goal-driven navigation, commonsense reasoning, and grounding of language into actions. In this work, we develop the environments, end-to-end-trained reinforcement learning agents, and evaluation protocols for EmbodiedQA.
Tasks Embodied Question Answering, Question Answering
Published 2017-11-30
URL http://arxiv.org/abs/1711.11543v2
PDF http://arxiv.org/pdf/1711.11543v2.pdf
PWC https://paperswithcode.com/paper/embodied-question-answering
Repo https://github.com/abhshkdz/House3D
Framework none

Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward

Title Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward
Authors Kaiyang Zhou, Yu Qiao, Tao Xiang
Abstract Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of the original videos. In this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (DSN) to summarize videos. DSN predicts for each video frame a probability, which indicates how likely a frame is to be selected, and then takes actions based on the probability distributions to select frames, forming video summaries. To train our DSN, we propose an end-to-end, reinforcement learning-based framework, where we design a novel reward function that jointly accounts for diversity and representativeness of generated summaries and does not rely on labels or user interactions at all. During training, the reward function judges how diverse and representative the generated summaries are, while DSN strives to earn higher rewards by learning to produce more diverse and more representative summaries. Since labels are not required, our method can be fully unsupervised. Extensive experiments on two benchmark datasets show that our unsupervised method not only outperforms other state-of-the-art unsupervised methods, but is also comparable to or even superior to most published supervised approaches.
Tasks Decision Making, Supervised Video Summarization, Unsupervised Video Summarization, Video Summarization
Published 2017-12-29
URL http://arxiv.org/abs/1801.00054v3
PDF http://arxiv.org/pdf/1801.00054v3.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-unsupervised
Repo https://github.com/KaiyangZhou/vsumm-reinforce
Framework pytorch
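
The abstract above describes a reward that scores a selected set of frames for diversity and representativeness but gives no formulas. The sketch below shows one plausible way to compute such a reward. It assumes that diversity is the mean pairwise cosine dissimilarity among selected frames and that representativeness is exp(-d), where d is the mean distance from each frame to its nearest selected frame. These definitions and all function names are illustrative assumptions, not necessarily the authors' exact formulation. Feature vectors are assumed to be non-zero.

```python
import math


def cosine_dissimilarity(a, b):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def diversity_reward(features, selected):
    """Mean pairwise dissimilarity among selected frames (0 if fewer than two)."""
    if len(selected) < 2:
        return 0.0
    total = 0.0
    for i in selected:
        for j in selected:
            if i != j:
                total += cosine_dissimilarity(features[i], features[j])
    return total / (len(selected) * (len(selected) - 1))


def representativeness_reward(features, selected):
    """exp(-mean distance from every frame to its nearest selected frame)."""
    if not selected:
        return 0.0
    mean_min = sum(
        min(euclidean(f, features[j]) for j in selected) for f in features
    ) / len(features)
    return math.exp(-mean_min)


def summary_reward(features, selected):
    """Sum of the two terms; needs no labels, only frame features."""
    return diversity_reward(features, selected) + representativeness_reward(features, selected)
```

With three frames where the first and third are identical, selecting the first two frames scores higher than selecting the two duplicates, because the duplicate pair adds no diversity and leaves the middle frame uncovered.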

Automatic Keyword Extraction for Text Summarization: A Survey

Title Automatic Keyword Extraction for Text Summarization: A Survey
Authors Santosh Kumar Bharti, Korra Sathya Babu
Abstract In recent times, data has been growing rapidly in every domain, such as news, social media, banking, and education. Given the sheer volume of data, there is a need for an automatic summarizer capable of summarizing data, especially the textual data in an original document, without losing any critical content. Text summarization has emerged as an important research area in the recent past. In this regard, a review of existing work on the text summarization process is useful for carrying out further research. In this paper, recent literature on automatic keyword extraction and text summarization is presented, since the text summarization process is highly dependent on keyword extraction. This literature includes a discussion of the different methodologies used for keyword extraction and text summarization. It also discusses the different databases used for text summarization in several domains, along with evaluation metrics. Finally, it briefly discusses the issues and research challenges faced by researchers, along with future directions.
Tasks Keyword Extraction, Text Summarization
Published 2017-04-11
URL http://arxiv.org/abs/1704.03242v1
PDF http://arxiv.org/pdf/1704.03242v1.pdf
PWC https://paperswithcode.com/paper/automatic-keyword-extraction-for-text
Repo https://github.com/dddfgkl/paper
Framework none

Gray-box optimization and factorized distribution algorithms: where two worlds collide

Title Gray-box optimization and factorized distribution algorithms: where two worlds collide
Authors Roberto Santana
Abstract The concept of gray-box optimization, in juxtaposition to black-box optimization, revolves around the idea of exploiting the problem structure to implement more efficient evolutionary algorithms (EAs). Work on factorized distribution algorithms (FDAs), whose factorizations are directly derived from the problem structure, has also contributed to showing how exploiting the problem structure produces important gains in the efficiency of EAs. In this paper we analyze the general question of using problem structure in EAs, focusing on confronting work done in gray-box optimization with related research accomplished in FDAs. This contrasted analysis helps us to identify, in current studies on the use of problem structure in EAs, two distinct analytical characterizations of how these algorithms work. Moreover, we claim that these two characterizations collide and compete when it comes to providing a coherent framework to investigate this type of algorithm. To illustrate this claim, we present a contrasted analysis of formalisms, questions, and results produced in FDAs and gray-box optimization. Common underlying principles in the two approaches, which are usually overlooked, are identified and discussed. In addition, an extensive review of previous research related to different uses of the problem structure in EAs is presented. The paper also elaborates on some of the questions that arise when extending the use of problem structure in EAs, such as the question of evolvability, high cardinality of the variables and large definition sets, constrained and multi-objective problems, etc. Finally, emergent approaches that exploit neural models to capture the problem structure are covered.
Tasks
Published 2017-07-11
URL http://arxiv.org/abs/1707.03093v1
PDF http://arxiv.org/pdf/1707.03093v1.pdf
PWC https://paperswithcode.com/paper/gray-box-optimization-and-factorized
Repo https://github.com/rsantana-isg/graybox_fda_paper
Framework none

Dynamic Computational Time for Visual Attention

Title Dynamic Computational Time for Visual Attention
Authors Zhichao Li, Yi Yang, Xiao Liu, Feng Zhou, Shilei Wen, Wei Xu
Abstract We propose a dynamic computational time model to accelerate the average processing time for recurrent visual attention (RAM). Rather than attention with a fixed number of steps for each input image, the model learns to decide when to stop on the fly. To achieve this, we add an additional continue/stop action per time step to RAM and use reinforcement learning to learn both the optimal attention policy and stopping policy. The modification is simple but could dramatically reduce the average computational time while keeping the same recognition performance as RAM. Experimental results on the CUB-200-2011 and Stanford Cars datasets demonstrate that the dynamic computational model can work effectively for fine-grained image recognition. The source code of this paper can be obtained from https://github.com/baidu-research/DT-RAM
Tasks
Published 2017-03-30
URL http://arxiv.org/abs/1703.10332v3
PDF http://arxiv.org/pdf/1703.10332v3.pdf
PWC https://paperswithcode.com/paper/dynamic-computational-time-for-visual
Repo https://github.com/baidu-research/DT-RAM
Framework torch

Changing Fashion Cultures

Title Changing Fashion Cultures
Authors Kaori Abe, Teppei Suzuki, Shunya Ueta, Akio Nakamura, Yutaka Satoh, Hirokatsu Kataoka
Abstract The paper presents a novel concept that analyzes and visualizes worldwide fashion trends. Our goal is to reveal cutting-edge fashion trends without displaying an ordinary fashion style. To achieve the fashion-based analysis, we created a new fashion culture database (FCDB), which consists of 76 million geo-tagged images in 16 cosmopolitan cities. By grasping a fashion trend of mixed fashion styles, the paper also proposes an unsupervised fashion trend descriptor (FTD) using a fashion descriptor, a codeword vector, and temporal analysis. To unveil fashion trends in the FCDB, the temporal analysis in FTD effectively emphasizes consecutive features between two different times. In experiments, we clearly show the analysis of fashion trends and fashion-based city similarity. As the result of large-scale data collection and an unsupervised analyzer, the proposed approach achieves world-level fashion visualization in a time series. The code, model, and FCDB will be publicly available after the construction of the project page.
Tasks Time Series
Published 2017-03-23
URL http://arxiv.org/abs/1703.07920v1
PDF http://arxiv.org/pdf/1703.07920v1.pdf
PWC https://paperswithcode.com/paper/changing-fashion-cultures
Repo https://github.com/hurutoriya/hurutoriya.github.io
Framework none

Discovering topics in text datasets by visualizing relevant words

Title Discovering topics in text datasets by visualizing relevant words
Authors Franziska Horn, Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek
Abstract When dealing with large collections of documents, it is imperative to quickly get an overview of the texts’ contents. In this paper we show how this can be achieved by using a clustering algorithm to identify topics in the dataset and then selecting and visualizing relevant words, which distinguish a group of documents from the rest of the texts, to summarize the contents of the documents belonging to each topic. We demonstrate our approach by discovering trending topics in a collection of New York Times article snippets.
Tasks
Published 2017-07-18
URL http://arxiv.org/abs/1707.06100v1
PDF http://arxiv.org/pdf/1707.06100v1.pdf
PWC https://paperswithcode.com/paper/discovering-topics-in-text-datasets-by
Repo https://github.com/cod3licious/textcatvis
Framework tf

The N-Tuple Bandit Evolutionary Algorithm for Automatic Game Improvement

Title The N-Tuple Bandit Evolutionary Algorithm for Automatic Game Improvement
Authors Kamolwan Kunanusont, Raluca D. Gaina, Jialin Liu, Diego Perez-Liebana, Simon M. Lucas
Abstract This paper describes a new evolutionary algorithm that is especially well suited to AI-Assisted Game Design. The approach adopted in this paper is to use observations of AI agents playing the game to estimate the game’s quality. Some of the best agents for this purpose are General Video Game AI agents, since they can be deployed directly on a new game without game-specific tuning; these agents tend to be based on stochastic algorithms which give robust but noisy results and tend to be expensive to run. This motivates the main contribution of the paper: the development of the novel N-Tuple Bandit Evolutionary Algorithm, where a model is used to estimate the fitness of unsampled points and a bandit approach is used to balance exploration and exploitation of the search space. Initial results on optimising a Space Battle game variant suggest that the algorithm offers far more robust results than the Random Mutation Hill Climber and a Biased Mutation variant, which are themselves known to offer competitive performance across a range of problems. Subjective observations are also given by human players on the nature of the evolved games, which indicate a preference towards games generated by the N-Tuple algorithm.
Tasks
Published 2017-03-18
URL http://arxiv.org/abs/1705.01080v1
PDF http://arxiv.org/pdf/1705.01080v1.pdf
PWC https://paperswithcode.com/paper/the-n-tuple-bandit-evolutionary-algorithm-for-1
Repo https://github.com/Bam4d/NTBEA
Framework none
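
The two ingredients named in the abstract above, a model that estimates the fitness of unsampled points and a bandit rule that balances exploration and exploitation, can be sketched in a heavily simplified form. The class below keeps statistics for 1-tuples only (one entry per dimension and value) and scores candidates with a UCB-style bound. The class name, the restriction to 1-tuples, and the constants `k` and `eps` are assumptions for illustration; this is not the full algorithm from the paper.

```python
import math


class OneTupleBanditModel:
    """Per-(dimension, value) fitness statistics with a UCB-style score."""

    def __init__(self, k=2.0, eps=0.5):
        self.k = k        # exploration weight (assumed value)
        self.eps = eps    # keeps the bonus finite for unseen values (assumed value)
        self.counts = {}  # (dim, value) -> number of evaluations
        self.sums = {}    # (dim, value) -> summed fitness
        self.total = 0    # total number of evaluated points

    def add(self, point, fitness):
        """Record one noisy fitness evaluation of a point in the search space."""
        self.total += 1
        for key in enumerate(point):
            self.counts[key] = self.counts.get(key, 0) + 1
            self.sums[key] = self.sums.get(key, 0.0) + fitness

    def mean_estimate(self, point):
        """Estimate fitness of a possibly unsampled point from the values it shares."""
        means = [
            self.sums[key] / self.counts[key]
            for key in enumerate(point)
            if self.counts.get(key, 0) > 0
        ]
        return sum(means) / len(means) if means else 0.0

    def ucb(self, point):
        """Average over dimensions of mean fitness plus an exploration bonus."""
        total = 0.0
        for key in enumerate(point):
            n = self.counts.get(key, 0)
            mean = self.sums[key] / n if n else 0.0
            bonus = self.k * math.sqrt(math.log(1 + self.total) / (n + self.eps))
            total += mean + bonus
        return total / len(point)
```

A search loop built on this would evaluate the current point once, add the result to the model, generate a set of mutated neighbours, and move to the neighbour with the highest `ucb` score, so that expensive game simulations are spent only on promising or under-explored settings.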

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Title StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
Authors Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas
Abstract Although Generative Adversarial Networks (GANs) have shown remarkable success in various tasks, they still face challenges in generating high quality images. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aiming at generating high-resolution photo-realistic images. First, we propose a two-stage generative adversarial network architecture, StackGAN-v1, for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and colors of the object based on given text description, yielding low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. Second, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is proposed for both conditional and unconditional generative tasks. Our StackGAN-v2 consists of multiple generators and discriminators in a tree-like structure; images at multiple scales corresponding to the same scene are generated from different branches of the tree. StackGAN-v2 shows more stable training behavior than StackGAN-v1 by jointly approximating multiple distributions. Extensive experiments demonstrate that the proposed stacked generative adversarial networks significantly outperform other state-of-the-art methods in generating photo-realistic images.
Tasks Image Generation, Text-to-Image Generation
Published 2017-10-19
URL http://arxiv.org/abs/1710.10916v3
PDF http://arxiv.org/pdf/1710.10916v3.pdf
PWC https://paperswithcode.com/paper/stackgan-realistic-image-synthesis-with
Repo https://github.com/Maymaher/StackGANv2
Framework pytorch