February 1, 2020

Paper Group AWR 104

Gated-GAN: Adversarial Gated Networks for Multi-Collection Style Transfer

Title Gated-GAN: Adversarial Gated Networks for Multi-Collection Style Transfer
Authors Xinyuan Chen, Chang Xu, Xiaokang Yang, Li Song, Dacheng Tao
Abstract Style transfer describes the rendering of an image's semantic content as different artistic styles. Recently, generative adversarial networks (GANs) have emerged as an effective approach in style transfer by adversarially training the generator to synthesize convincing counterfeits. However, traditional GANs suffer from the mode collapse issue, resulting in unstable training and making style transfer quality difficult to guarantee. In addition, the GAN generator is only compatible with one style, so a series of GANs must be trained to provide users with choices to transfer more than one kind of style. In this paper, we focus on tackling these challenges and limitations to improve style transfer. We propose adversarial gated networks (Gated GAN) to transfer multiple styles in a single model. The generative networks have three modules: an encoder, a gated transformer, and a decoder. Different styles can be achieved by passing input images through different branches of the gated transformer. To stabilize training, the encoder and decoder are combined as an autoencoder to reconstruct the input images. The discriminative networks are used to distinguish whether the input image is a stylized or genuine image. An auxiliary classifier is used to recognize the style categories of transferred images, thereby helping the generative networks generate images in multiple styles. In addition, Gated GAN makes it possible to explore a new style by investigating styles learned from artists or genres. Our extensive experiments demonstrate the stability and effectiveness of the proposed model for multistyle transfer.
Tasks Style Transfer
Published 2019-04-04
URL http://arxiv.org/abs/1904.02296v1
PDF http://arxiv.org/pdf/1904.02296v1.pdf
PWC https://paperswithcode.com/paper/gated-gan-adversarial-gated-networks-for
Repo https://github.com/colemiller94/gatedgan
Framework pytorch
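
The routing described in the abstract (encoder, then one branch of the gated transformer per style, then decoder) can be illustrated with a toy sketch. This is not the authors' implementation: the class name `GatedGenerator`, the scalar "encoder" and "decoder", and the two branch functions are all hypothetical stand-ins chosen so the control flow is visible. Passing no style skips the gated transformer, which mirrors the autoencoder reconstruction path mentioned in the abstract.

```python
class GatedGenerator:
    """Toy generator: encoder -> selected style branch -> decoder."""

    def __init__(self, encoder, branches, decoder):
        self.encoder = encoder
        self.branches = branches  # style name -> function on the latent code
        self.decoder = decoder

    def __call__(self, image, style=None):
        latent = self.encoder(image)
        if style is not None:
            # The "gate": only the branch for the requested style is applied.
            latent = self.branches[style](latent)
        # With style=None this is a plain autoencoder pass (reconstruction).
        return self.decoder(latent)


# Hypothetical stand-ins: an "image" is a list of floats.
encode = lambda img: [0.5 * v for v in img]
decode = lambda z: [2.0 * v for v in z]
branches = {
    "invert": lambda z: [-v for v in z],
    "boost": lambda z: [1.5 * v for v in z],
}

generator = GatedGenerator(encode, branches, decode)
reconstruction = generator([1.0, 2.0])        # autoencoder path
inverted = generator([1.0, 2.0], "invert")    # one style branch
boosted = generator([1.0, 2.0], "boost")      # another style branch
```

In a real model each branch would be a learned sub-network sharing the same encoder and decoder, which is what lets a single generator serve several styles.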

Attention-aware Multi-stroke Style Transfer

Title Attention-aware Multi-stroke Style Transfer
Authors Yuan Yao, Jianqiang Ren, Xuansong Xie, Weidong Liu, Yong-Jin Liu, Jun Wang
Abstract Neural style transfer has drawn considerable attention from both academia and industry. Although visual effect and efficiency have been significantly improved, existing methods are unable to coordinate the spatial distribution of visual attention between the content image and the stylized image, or to render diverse levels of detail via different brush strokes. In this paper, we tackle these limitations by developing an attention-aware multi-stroke style transfer model. We first propose to incorporate a self-attention mechanism into a style-agnostic reconstruction autoencoder framework, from which the attention map of a content image can be derived. By performing multi-scale style swap on content features and style features, we produce multiple feature maps reflecting different stroke patterns. A flexible fusion strategy is further presented to incorporate the salient characteristics from the attention map, which allows integrating multiple stroke patterns into different spatial regions of the output image harmoniously. We demonstrate the effectiveness of our method and generate stylized images with multiple stroke patterns that are comparable to those of state-of-the-art methods.
Tasks Style Transfer
Published 2019-01-16
URL http://arxiv.org/abs/1901.05127v1
PDF http://arxiv.org/pdf/1901.05127v1.pdf
PWC https://paperswithcode.com/paper/attention-aware-multi-stroke-style-transfer
Repo https://github.com/JianqiangRen/AAMS
Framework tf

Polyglot Contextual Representations Improve Crosslingual Transfer

Title Polyglot Contextual Representations Improve Crosslingual Transfer
Authors Phoebe Mulcaire, Jungo Kasai, Noah A. Smith
Abstract We introduce Rosita, a method to produce multilingual contextual word representations by training a single language model on text from multiple languages. Our method combines the advantages of contextual word representations with those of multilingual representation learning. We produce language models from dissimilar language pairs (English/Arabic and English/Chinese) and use them in dependency parsing, semantic role labeling, and named entity recognition, with comparisons to monolingual and non-contextual variants. Our results provide further evidence for the benefits of polyglot learning, in which representations are shared across multiple languages.
Tasks Dependency Parsing, Language Modelling, Named Entity Recognition, Representation Learning, Semantic Role Labeling
Published 2019-02-26
URL http://arxiv.org/abs/1902.09697v2
PDF http://arxiv.org/pdf/1902.09697v2.pdf
PWC https://paperswithcode.com/paper/polyglot-contextual-representations-improve
Repo https://github.com/pmulcaire/rosita
Framework pytorch

Billion-scale semi-supervised learning for image classification

Title Billion-scale semi-supervised learning for image classification
Authors I. Zeki Yalniz, Hervé Jégou, Kan Chen, Manohar Paluri, Dhruv Mahajan
Abstract This paper presents a study of semi-supervised learning with large convolutional networks. We propose a pipeline, based on a teacher/student paradigm, that leverages a large collection of unlabelled images (up to 1 billion). Our main goal is to improve the performance for a given target architecture, like ResNet-50 or ResNext. We provide an extensive analysis of the success factors of our approach, which leads us to formulate some recommendations to produce high-accuracy models for image classification with semi-supervised learning. As a result, our approach brings important gains to standard architectures for image, video and fine-grained classification. For instance, by leveraging one billion unlabelled images, our learned vanilla ResNet-50 achieves 81.2% top-1 accuracy on the ImageNet benchmark.
Tasks Image Classification, Video Classification
Published 2019-05-02
URL http://arxiv.org/abs/1905.00546v1
PDF http://arxiv.org/pdf/1905.00546v1.pdf
PWC https://paperswithcode.com/paper/billion-scale-semi-supervised-learning-for
Repo https://github.com/leaderj1001/Billion-scale-semi-supervised-learning
Framework pytorch
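
One step that a teacher/student pipeline of this kind needs is turning teacher predictions on unlabelled images into a pseudo-labelled training set for the student. The sketch below assumes a top-K-per-class selection rule; the abstract does not spell out the selection rule, so treat that choice, the function name `select_top_k_per_class`, and the toy scores as assumptions rather than the paper's procedure.

```python
import heapq


def select_top_k_per_class(scores, k):
    """Pick, for each class, the k unlabelled images the teacher ranks highest.

    scores maps an image id to the teacher's list of class probabilities.
    Returns a dict: class index -> list of image ids (best first).
    """
    num_classes = len(next(iter(scores.values())))
    selected = {}
    for c in range(num_classes):
        top = heapq.nlargest(k, scores.items(), key=lambda item: item[1][c])
        selected[c] = [image_id for image_id, _ in top]
    return selected


# Hypothetical teacher outputs for four unlabelled images and two classes.
teacher_scores = {
    "img0": [0.90, 0.10],
    "img1": [0.60, 0.40],
    "img2": [0.20, 0.80],
    "img3": [0.55, 0.45],
}
pseudo_labelled = select_top_k_per_class(teacher_scores, k=2)
```

The student would then be trained on `pseudo_labelled` and fine-tuned on the original labelled data; an image may appear under more than one class with this rule.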

HELP: A Dataset for Identifying Shortcomings of Neural Models in Monotonicity Reasoning

Title HELP: A Dataset for Identifying Shortcomings of Neural Models in Monotonicity Reasoning
Authors Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, Kentaro Inui, Satoshi Sekine, Lasha Abzianidze, Johan Bos
Abstract Large crowdsourced datasets are widely used for training and evaluating neural models on natural language inference (NLI). Despite these efforts, neural models have a hard time capturing logical inferences, including those licensed by phrase replacements, so-called monotonicity reasoning. Since no large dataset has been developed for monotonicity reasoning, it is still unclear whether the main obstacle is the size of datasets or the model architectures themselves. To investigate this issue, we introduce a new dataset, called HELP, for handling entailments with lexical and logical phenomena. We add it to training data for the state-of-the-art neural models and evaluate them on test sets for monotonicity phenomena. The results showed that our data augmentation improved the overall accuracy. We also find that the improvement is better on monotonicity inferences with lexical replacements than on downward inferences with disjunction and modification. This suggests that some types of inferences can be improved by our data augmentation while others are immune to it.
Tasks Data Augmentation, Natural Language Inference
Published 2019-04-27
URL http://arxiv.org/abs/1904.12166v1
PDF http://arxiv.org/pdf/1904.12166v1.pdf
PWC https://paperswithcode.com/paper/help-a-dataset-for-identifying-shortcomings
Repo https://github.com/verypluming/HELP
Framework none

Recommender Systems with Heterogeneous Side Information

Title Recommender Systems with Heterogeneous Side Information
Authors Tianqiao Liu, Zhiwei Wang, Jiliang Tang, Songfan Yang, Gale Yan Huang, Zitao Liu
Abstract In modern recommender systems, both users and items are associated with rich side information, which can help understand users and items. Such information is typically heterogeneous and can be roughly categorized into flat and hierarchical side information. While side information has been proved to be valuable, the majority of existing systems have exploited either only flat side information or only hierarchical side information due to the challenges brought by the heterogeneity. In this paper, we investigate the problem of exploiting heterogeneous side information for recommendations. Specifically, we propose a novel framework that jointly captures flat and hierarchical side information with mathematical coherence. We demonstrate the effectiveness of the proposed framework via extensive experiments on various real-world datasets. Empirical results show that our approach leads to a significant performance gain over the state-of-the-art methods.
Tasks Recommendation Systems
Published 2019-07-18
URL https://arxiv.org/abs/1907.08679v1
PDF https://arxiv.org/pdf/1907.08679v1.pdf
PWC https://paperswithcode.com/paper/recommender-systems-with-heterogeneous-side
Repo https://github.com/tal-ai/Recommender-Systems-with-Heterogeneous-Side-Information
Framework none

Audio-Visual Model Distillation Using Acoustic Images

Title Audio-Visual Model Distillation Using Acoustic Images
Authors Andrés F. Pérez, Valentina Sanguineti, Pietro Morerio, Vittorio Murino
Abstract In this paper, we investigate how to learn rich and robust feature representations for audio classification from visual data and acoustic images, a novel audio data modality. Former models learn audio representations from raw signals or spectral data acquired by a single microphone, with remarkable results in classification and retrieval. However, such representations are not so robust towards variable environmental sound conditions. We tackle this drawback by exploiting a new multimodal labeled action recognition dataset acquired by a hybrid audio-visual sensor that provides RGB video, raw audio signals, and spatialized acoustic data, also known as acoustic images, where the visual and acoustic images are aligned in space and synchronized in time. Using this richer information, we train audio deep learning models in a teacher-student fashion. In particular, we distill knowledge into audio networks from both visual and acoustic image teachers. Our experiments suggest that the learned representations are more powerful and have better generalization capabilities than the features learned from models trained using just single-microphone audio data.
Tasks Audio Classification, Temporal Action Localization
Published 2019-04-16
URL https://arxiv.org/abs/1904.07933v2
PDF https://arxiv.org/pdf/1904.07933v2.pdf
PWC https://paperswithcode.com/paper/audio-visual-model-distillation-using
Repo https://github.com/afperezm/acoustic-images-distillation
Framework tf

MPC-Net: A First Principles Guided Policy Search

Title MPC-Net: A First Principles Guided Policy Search
Authors Jan Carius, Farbod Farshidian, Marco Hutter
Abstract We present an Imitation Learning approach for the control of dynamical systems with a known model. Our policy search method is guided by solutions from MPC. Typical policy search methods of this kind minimize a distance metric between the guiding demonstrations and the learned policy. Our loss function, however, corresponds to the minimization of the control Hamiltonian, which derives from the principle of optimality. Therefore, our algorithm directly attempts to solve the optimality conditions with a parameterized class of control laws. Additionally, the proposed loss function explicitly encodes the constraints of the optimal control problem and we provide numerical evidence that its minimization achieves improved constraint satisfaction. We train a mixture-of-expert neural network architecture for controlling a quadrupedal robot and show that this policy structure is well suited for such multimodal systems. The learned policy can successfully stabilize different gaits on the real walking robot from less than 10 min of demonstration data.
Tasks Imitation Learning
Published 2019-09-11
URL https://arxiv.org/abs/1909.05197v2
PDF https://arxiv.org/pdf/1909.05197v2.pdf
PWC https://paperswithcode.com/paper/mpc-net-a-first-principles-guided-policy
Repo https://github.com/leggedrobotics/MPC-Net
Framework pytorch

Targeted sampling from massive Blockmodel graphs with personalized PageRank

Title Targeted sampling from massive Blockmodel graphs with personalized PageRank
Authors Fan Chen, Yini Zhang, Karl Rohe
Abstract This paper provides statistical theory and intuition for Personalized PageRank (PPR), a popular technique that samples a small community from a massive network. We study a setting where the entire network is expensive to thoroughly obtain or maintain, but we can start from a seed node of interest and “crawl” the network to find other nodes through their connections. By crawling the graph in a designed way, the PPR vector can be approximated without querying the entire massive graph, making it an alternative to snowball sampling. Using the Degree-Corrected Stochastic Blockmodel, we study whether the PPR vector can select nodes that belong to the same block as the seed node. We provide a simple and interpretable form for the PPR vector, highlighting its biases towards high degree nodes outside of the target block. We examine a simple adjustment based on node degrees and establish consistency results for PPR clustering that allows for directed graphs. We illustrate the method with the Twitter friendship graph and find that (i) the adjusted and unadjusted PPR techniques are complementary approaches, where the adjustment makes the results particularly localized around the seed node and (ii) the bias adjustment greatly benefits from degree regularization.
Tasks
Published 2019-10-04
URL https://arxiv.org/abs/1910.12937v1
PDF https://arxiv.org/pdf/1910.12937v1.pdf
PWC https://paperswithcode.com/paper/targeted-sampling-from-massive-blockmodel
Repo https://github.com/RoheLab/aPPR
Framework none
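
To make the PPR vector and the degree-based adjustment concrete, here is a small sketch. It computes PPR by plain power iteration over the whole graph, which is simpler than the crawling-based approximation the abstract refers to, and it reads "a simple adjustment based on node degrees" as dividing each score by the node's degree. Both choices, the teleportation probability `alpha`, and the toy graph are assumptions for illustration, not the authors' code.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power iteration for the PPR vector with teleportation back to `seed`.

    adj maps each node to its list of neighbours (every node needs at
    least one). alpha is the probability of jumping back to the seed.
    """
    p = {v: 0.0 for v in adj}
    p[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[seed] += alpha
        for u, nbrs in adj.items():
            share = (1.0 - alpha) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        p = nxt
    return p


def degree_adjusted(p, adj):
    """Divide each PPR score by the node's degree (assumed adjustment)."""
    return {v: p[v] / len(adj[v]) for v in adj}


# Toy graph: a triangle {a, b, c}; c also links to a hub with three leaves.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "hub"],
    "hub": ["c", "x", "y", "z"],
    "x": ["hub"],
    "y": ["hub"],
    "z": ["hub"],
}
ppr = personalized_pagerank(graph, seed="a")
adjusted = degree_adjusted(ppr, graph)
```

In this toy graph the unadjusted scores rank `c` above `b` because `c` has the higher degree, while the adjusted scores rank `b` above `c`, a small-scale picture of the degree bias the abstract describes.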

Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks

Title Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks
Authors Saurabh Singh, Shankar Krishnan
Abstract Batch Normalization (BN) uses mini-batch statistics to normalize the activations during training, introducing dependence between mini-batch elements. This dependency can hurt the performance if the mini-batch size is too small, or if the elements are correlated. Several alternatives, such as Batch Renormalization and Group Normalization (GN), have been proposed to address this issue. However, they either do not match the performance of BN for large batches, or still exhibit degradation in performance for smaller batches, or introduce artificial constraints on the model architecture. In this paper we propose the Filter Response Normalization (FRN) layer, a novel combination of a normalization and an activation function, that can be used as a replacement for other normalizations and activations. Our method operates on each activation channel of each batch element independently, eliminating the dependency on other batch elements. Our method outperforms BN and other alternatives in a variety of settings for all batch sizes. FRN layer performs $\approx 0.7-1.0\%$ better than BN on top-1 validation accuracy with large mini-batch sizes for Imagenet classification using InceptionV3 and ResnetV2-50 architectures. Further, it performs $>1\%$ better than GN on the same problem in the small mini-batch size regime. For object detection problem on COCO dataset, FRN layer outperforms all other methods by at least $0.3-0.5\%$ in all batch size regimes.
Tasks Image Classification, Object Detection
Published 2019-11-21
URL https://arxiv.org/abs/1911.09737v2
PDF https://arxiv.org/pdf/1911.09737v2.pdf
PWC https://paperswithcode.com/paper/filter-response-normalization-layer
Repo https://github.com/CarloLepelaars/filter_response_normalization_keras
Framework none
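
The key property claimed in the abstract is that each channel of each batch element is normalized on its own, so no statistic crosses examples. The sketch below shows one way to realise that: divide a channel by the root of its mean squared activation, apply a scale and shift, then a thresholded activation. The exact formula is not given in the abstract, so the form used here, the parameter names `gamma`, `beta`, `tau`, and `eps`, and the flat-list data layout are assumptions.

```python
import math


def frn_channel(x, gamma=1.0, beta=0.0, tau=0.0, eps=1e-6):
    """Normalize and activate ONE channel of ONE example.

    x is the channel's spatial activations flattened into a list.
    gamma, beta and tau would be learned per channel in a real layer.
    """
    nu2 = sum(v * v for v in x) / len(x)   # mean squared activation
    scale = 1.0 / math.sqrt(nu2 + eps)     # note: no mean subtraction
    return [max(gamma * v * scale + beta, tau) for v in x]


def frn(batch, **params):
    """batch[n][c] is the flattened channel c of example n."""
    return [[frn_channel(channel, **params) for channel in example]
            for example in batch]


example_a = [[3.0, 4.0, 0.0, 0.0]]          # one channel
example_b = [[100.0, -50.0, 25.0, 0.0]]     # very different scale
alone = frn([example_a])
with_other = frn([example_a, example_b])
```

Because nothing is averaged over the batch dimension, the output for `example_a` is identical whether or not `example_b` is in the batch.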

Highly Parallelized Data-driven MPC for Minimal Intervention Shared Control

Title Highly Parallelized Data-driven MPC for Minimal Intervention Shared Control
Authors Alexander Broad, Todd Murphey, Brenna Argall
Abstract We present a shared control paradigm that improves a user’s ability to operate complex, dynamic systems in potentially dangerous environments without a priori knowledge of the user’s objective. In this paradigm, the role of the autonomous partner is to improve the general safety of the system without constraining the user’s ability to achieve unspecified behaviors. Our approach relies on a data-driven, model-based representation of the joint human-machine system to evaluate, in parallel, a significant number of potential inputs that the user may wish to provide. These samples are used to (1) predict the safety of the system over a receding horizon, and (2) minimize the influence of the autonomous partner. The resulting shared control algorithm maximizes the authority allocated to the human partner to improve their sense of agency, while improving safety. We evaluate the efficacy of our shared control algorithm with a human subjects study (n=20) conducted in two simulated environments: a balance bot and a race car. During the experiment, users are free to operate each system however they would like (i.e., there is no specified task) and are only asked to try to avoid unsafe regions of the state space. Using modern computational resources (i.e., GPUs) our approach is able to consider more than 10,000 potential trajectories at each time step in a control loop running at 100Hz for the balance bot and 60Hz for the race car. The results of the study show that our shared control paradigm improves system safety without knowledge of the user’s goal, while maintaining high-levels of user satisfaction and low-levels of frustration. Our code is available online at https://github.com/asbroad/mpmi_shared_control.
Tasks
Published 2019-06-05
URL https://arxiv.org/abs/1906.02318v1
PDF https://arxiv.org/pdf/1906.02318v1.pdf
PWC https://paperswithcode.com/paper/highly-parallelized-data-driven-mpc-for
Repo https://github.com/asbroad/mpmi_shared_control
Framework none
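
The two uses of the sampled inputs named in the abstract (predict safety over a receding horizon, then intervene as little as possible) can be sketched on a toy problem. Everything specific here is an assumption for illustration: a one-dimensional system `x <- x + u * dt`, a safe region `|x| <= limit`, a fixed grid of candidate inputs in place of large-scale parallel sampling, and a zero-input fallback when no candidate is safe. It is not the authors' algorithm.

```python
def rollout_is_safe(x, u, horizon, dt, limit):
    """Simulate holding input u for `horizon` steps; fail on leaving |x| <= limit."""
    for _ in range(horizon):
        x = x + u * dt
        if abs(x) > limit:
            return False
    return True


def shared_control(x, user_u, candidates, horizon=10, dt=0.1, limit=1.0):
    """Return the safe candidate input closest to what the user asked for."""
    safe = [u for u in candidates
            if rollout_is_safe(x, u, horizon, dt, limit)]
    if not safe:
        return 0.0  # hypothetical fallback when nothing is predicted safe
    return min(safe, key=lambda u: abs(u - user_u))


candidates = [i / 10 for i in range(-10, 11)]   # inputs from -1.0 to 1.0

# Far from the boundary: the user's input is safe and passes through.
passthrough = shared_control(x=0.0, user_u=0.5, candidates=candidates)

# Near the boundary: the input is reduced just enough to stay safe.
corrected = shared_control(x=0.75, user_u=1.0, candidates=candidates)
```

Choosing the closest safe input is one simple way to keep the autonomous partner's influence small; each candidate rollout is independent, which is what makes this kind of evaluation easy to parallelize.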

A Neural-Network-Based Model Predictive Control of Three-Phase Inverter With an Output LC Filter

Title A Neural-Network-Based Model Predictive Control of Three-Phase Inverter With an Output LC Filter
Authors Ihab S. Mohamed, Stefano Rovetta, Ton Duc Do, Tomislav Dragicevic, Ahmed A. Zaki Diab
Abstract Model predictive control (MPC) has become one of the well-established modern control methods for three-phase inverters with an output LC filter, where a high-quality voltage with low total harmonic distortion (THD) is needed. Although it is an intuitive controller, easy to understand and implement, it has the significant disadvantage of requiring a large number of online calculations for solving the optimization problem. On the other hand, the application of model-free approaches such as those based on artificial neural networks (ANNs) is currently growing rapidly in the area of power electronics and drives. This paper presents a new control scheme for a two-level converter based on combining MPC and a feed-forward ANN, with the aim of getting lower THD and improving the steady-state and dynamic performance of the system for different types of loads. First, MPC is used, as an expert, in the training phase to generate the data required for training the proposed neural network. Then, once the neural network is fine-tuned, it can be successfully used online for voltage tracking, without the need for MPC. The proposed ANN-based control strategy is validated through simulation, using MATLAB/Simulink tools, taking into account different load conditions. Moreover, the performance of the ANN-based controller is evaluated on several samples of linear and non-linear loads under various operating conditions and compared to that of MPC, demonstrating the excellent steady-state and dynamic performance of the proposed ANN-based control strategy.
Tasks
Published 2019-02-22
URL https://arxiv.org/abs/1902.09964v3
PDF https://arxiv.org/pdf/1902.09964v3.pdf
PWC https://paperswithcode.com/paper/a-neural-network-based-model-predictive
Repo https://github.com/IhabMohamed/ANN-MPC
Framework none

Relational Knowledge Distillation

Title Relational Knowledge Distillation
Authors Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho
Abstract Knowledge distillation aims at transferring knowledge acquired in one model (a teacher) to another model (a student) that is typically smaller. Previous approaches can be expressed as a form of training the student to mimic output activations of individual data examples represented by the teacher. We introduce a novel approach, dubbed relational knowledge distillation (RKD), that transfers mutual relations of data examples instead. For concrete realizations of RKD, we propose distance-wise and angle-wise distillation losses that penalize structural differences in relations. Experiments conducted on different tasks show that the proposed method improves educated student models by a significant margin. In particular for metric learning, it allows students to outperform their teachers’ performance, achieving state-of-the-art results on standard benchmark datasets.
Tasks Metric Learning
Published 2019-04-10
URL http://arxiv.org/abs/1904.05068v2
PDF http://arxiv.org/pdf/1904.05068v2.pdf
PWC https://paperswithcode.com/paper/relational-knowledge-distillation
Repo https://github.com/lenscloth/RKD
Framework pytorch
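
A distance-wise relational loss can be sketched as follows: compute all pairwise distances among the teacher's embeddings and among the student's, rescale each set by its mean so that only relative structure matters, and penalize the differences. The mean normalization and the Huber penalty are assumptions here (the abstract only says the loss penalizes structural differences in relations), and the function names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(embeddings):
    """Euclidean distance for every unordered pair of embeddings."""
    return [math.dist(a, b) for a, b in combinations(embeddings, 2)]


def mean_normalized(dists):
    """Rescale distances by their mean (assumes not all points coincide)."""
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


def huber(a, b, delta=1.0):
    diff = abs(a - b)
    if diff <= delta:
        return 0.5 * diff * diff
    return delta * (diff - 0.5 * delta)


def rkd_distance_loss(teacher, student):
    """Average Huber penalty between normalized pairwise distances."""
    t = mean_normalized(pairwise_distances(teacher))
    s = mean_normalized(pairwise_distances(student))
    return sum(huber(a, b) for a, b in zip(t, s)) / len(t)


teacher = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
same_shape = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]    # teacher scaled by 2
other_shape = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]   # different geometry

loss_same = rkd_distance_loss(teacher, same_shape)
loss_other = rkd_distance_loss(teacher, other_shape)
```

Unlike a loss that matches individual outputs, this one is unchanged when the student's embedding space is a rescaled copy of the teacher's, so the student only has to reproduce how examples relate to each other.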

Stabilizing Transformers for Reinforcement Learning

Title Stabilizing Transformers for Reinforcement Learning
Authors Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell
Abstract Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state-of-the-art results in domains such as language modeling and machine translation. Harnessing the transformer’s ability to process long time horizons of information could provide a similar performance boost in partially observable reinforcement learning (RL) domains, but the large-scale transformers used in NLP have yet to be successfully applied to the RL setting. In this work we demonstrate that the standard transformer architecture is difficult to optimize, which was previously observed in the supervised learning setting but becomes especially pronounced with RL objectives. We propose architectural modifications that substantially improve the stability and learning speed of the original Transformer and XL variant. The proposed architecture, the Gated Transformer-XL (GTrXL), surpasses LSTMs on challenging memory environments and achieves state-of-the-art results on the multi-task DMLab-30 benchmark suite, exceeding the performance of an external memory architecture. We show that the GTrXL, trained using the same losses, has stability and performance that consistently matches or exceeds a competitive LSTM baseline, including on more reactive tasks where memory is less critical. GTrXL offers an easy-to-train, simple-to-implement but substantially more expressive architectural alternative to the standard multi-layer LSTM ubiquitously used for RL agents in partially observable environments.
Tasks Language Modelling, Machine Translation
Published 2019-10-13
URL https://arxiv.org/abs/1910.06764v1
PDF https://arxiv.org/pdf/1910.06764v1.pdf
PWC https://paperswithcode.com/paper/stabilizing-transformers-for-reinforcement-1
Repo https://github.com/jdenalil/Gated-Transformer-XL
Framework pytorch

Hateful People or Hateful Bots? Detection and Characterization of Bots Spreading Religious Hatred in Arabic Social Media

Title Hateful People or Hateful Bots? Detection and Characterization of Bots Spreading Religious Hatred in Arabic Social Media
Authors Nuha Albadi, Maram Kurdi, Shivakant Mishra
Abstract Arabic Twitter space is crawling with bots that fuel political feuds, spread misinformation, and proliferate sectarian rhetoric. While efforts have long existed to analyze and detect English bots, Arabic bot detection and characterization remains largely understudied. In this work, we contribute new insights into the role of bots in spreading religious hatred on Arabic Twitter and introduce a novel regression model that can accurately identify Arabic language bots. Our assessment shows that existing tools that are highly accurate in detecting English bots don’t perform as well on Arabic bots. We identify the possible reasons for this poor performance, perform a thorough analysis of linguistic, content, behavioral and network features, and report on the most informative features that distinguish Arabic bots from humans as well as the differences between Arabic and English bots. Our results mark an important step toward understanding the behavior of malicious bots on Arabic Twitter and pave the way for more effective Arabic bot detection tools.
Tasks
Published 2019-08-01
URL https://arxiv.org/abs/1908.00153v2
PDF https://arxiv.org/pdf/1908.00153v2.pdf
PWC https://paperswithcode.com/paper/hateful-people-or-hateful-bots-detection-and
Repo https://github.com/nuhaalbadi/ArabicBots
Framework none