July 29, 2019


Paper Group AWR 170


Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones


Title	Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones
Authors	Zhenisbek Assylbekov, Rustem Takhanov, Bagdat Myrzakhmetov, Jonathan N. Washington
Abstract	Syllabification does not seem to improve word-level RNN language modeling quality when compared to character-based segmentation. However, our best syllable-aware language model, achieving performance comparable to the competitive character-aware model, has 18%-33% fewer parameters and is trained 1.2-2.2 times faster.
Tasks	Language Modelling
Published	2017-07-20
URL	http://arxiv.org/abs/1707.06480v1
PDF	http://arxiv.org/pdf/1707.06480v1.pdf
PWC	https://paperswithcode.com/paper/syllable-aware-neural-language-models-a
Repo	https://github.com/zh3nis/lstm-syl
Framework	tf

LSTM Fully Convolutional Networks for Time Series Classification


Title	LSTM Fully Convolutional Networks for Time Series Classification
Authors	Fazle Karim, Somshubra Majumdar, Houshang Darabi, Shun Chen
Abstract	Fully convolutional neural networks (FCN) have been shown to achieve state-of-the-art performance on the task of classifying time series sequences. We propose the augmentation of fully convolutional networks with long short term memory recurrent neural network (LSTM RNN) sub-modules for time series classification. Our proposed models significantly enhance the performance of fully convolutional networks with a nominal increase in model size and require minimal preprocessing of the dataset. The proposed Long Short Term Memory Fully Convolutional Network (LSTM-FCN) achieves state-of-the-art performance compared to others. We also explore the use of an attention mechanism to improve time series classification with the Attention Long Short Term Memory Fully Convolutional Network (ALSTM-FCN). The attention mechanism allows one to visualize the decision process of the LSTM cell. Furthermore, we propose fine-tuning as a method to enhance the performance of trained models. An overall analysis of the performance of our model is provided and compared to other techniques.
Tasks	Outlier Detection, Time Series, Time Series Classification
Published	2017-09-08
URL	http://arxiv.org/abs/1709.05206v1
PDF	http://arxiv.org/pdf/1709.05206v1.pdf
PWC	https://paperswithcode.com/paper/lstm-fully-convolutional-networks-for-time
Repo	https://github.com/phuijse/MATIC
Framework	pytorch

Morphable Face Models - An Open Framework


Title	Morphable Face Models - An Open Framework
Authors	Thomas Gerig, Andreas Morel-Forster, Clemens Blumer, Bernhard Egger, Marcel Lüthi, Sandro Schönborn, Thomas Vetter
Abstract	In this paper, we present a novel open-source pipeline for face registration based on Gaussian processes as well as an application to face image analysis. Non-rigid registration of faces is significant for many applications in computer vision, such as the construction of 3D Morphable face models (3DMMs). Gaussian Process Morphable Models (GPMMs) unify a variety of non-rigid deformation models with B-splines and PCA models as examples. GPMMs separate problem-specific requirements from the registration algorithm by incorporating domain-specific adaptations as a prior model. The novelties of this paper are the following: (i) We present a strategy and modeling technique for face registration that considers symmetry, multi-scale and spatially-varying details. The registration is applied to neutral faces and facial expressions. (ii) We release an open-source software framework for registration and model-building, demonstrated on the publicly available BU3D-FE database. The released pipeline also contains an implementation of an Analysis-by-Synthesis model adaptation of 2D face images, tested on the Multi-PIE and LFW databases. This enables the community to reproduce, evaluate and compare the individual steps of registration to model-building and 3D/2D model fitting. (iii) Along with the framework release, we publish a new version of the Basel Face Model (BFM-2017) with an improved age distribution and an additional facial expression model.
Tasks	Gaussian Processes
Published	2017-09-25
URL	http://arxiv.org/abs/1709.08398v2
PDF	http://arxiv.org/pdf/1709.08398v2.pdf
PWC	https://paperswithcode.com/paper/morphable-face-models-an-open-framework
Repo	https://github.com/unibas-gravis/parametric-face-image-generator
Framework	none

High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks


Title	High-Quality Facial Photo-Sketch Synthesis Using Multi-Adversarial Networks
Authors	Lidan Wang, Vishwanath A. Sindagi, Vishal M. Patel
Abstract	Synthesizing face sketches from real photos, and the inverse task, has many applications. However, photo/sketch synthesis remains a challenging problem because photos and sketches have different characteristics. In this work, we consider this task as an image-to-image translation problem and explore the recently popular generative models (GANs) to generate high-quality realistic photos from sketches and sketches from photos. Recent GAN-based methods have shown promising results on image-to-image translation problems and photo-to-sketch synthesis in particular; however, they are known to have limited abilities in generating high-resolution realistic images. To this end, we propose a novel synthesis framework called Photo-Sketch Synthesis using Multi-Adversarial Networks (PS2-MAN) that iteratively generates low resolution to high resolution images in an adversarial way. The hidden layers of the generator are supervised to first generate lower resolution images followed by implicit refinement in the network to generate higher resolution images. Furthermore, since photo-sketch synthesis is a coupled/paired translation problem, we leverage the pair information using the CycleGAN framework. Both Image Quality Assessment (IQA) and Photo-Sketch Matching experiments are conducted to demonstrate the superior performance of our framework in comparison to existing state-of-the-art solutions. Code available at: https://github.com/lidan1/PhotoSketchMAN.
Tasks	Face Sketch Synthesis, Image Quality Assessment, Image-to-Image Translation
Published	2017-10-27
URL	http://arxiv.org/abs/1710.10182v2
PDF	http://arxiv.org/pdf/1710.10182v2.pdf
PWC	https://paperswithcode.com/paper/high-quality-facial-photo-sketch-synthesis
Repo	https://github.com/lidan1/PhotoSketchMAN
Framework	pytorch

Listen, Interact and Talk: Learning to Speak via Interaction


Title	Listen, Interact and Talk: Learning to Speak via Interaction
Authors	Haichao Zhang, Haonan Yu, Wei Xu
Abstract	One of the long-term goals of artificial intelligence is to build an agent that can communicate intelligently with humans in natural language. Most existing work on natural language learning relies heavily on training over a pre-collected dataset with annotated labels, leading to an agent that essentially captures the statistics of the fixed external training data. As the training data is essentially a static snapshot representation of the knowledge from the annotator, the agent trained this way is limited in adaptiveness and generalization of its behavior. Moreover, this is very different from the language learning process of humans, where language is acquired during communication by taking speaking actions and learning from their consequences in an interactive manner. This paper presents an interactive setting for grounded natural language learning, where an agent learns natural language by interacting with a teacher and learning from feedback, thus learning and improving language skills while taking part in the conversation. To achieve this goal, we propose a model which incorporates both imitation and reinforcement by jointly leveraging sentence and reward feedback from the teacher. Experiments are conducted to validate the effectiveness of the proposed approach.
Tasks
Published	2017-05-28
URL	http://arxiv.org/abs/1705.09906v1
PDF	http://arxiv.org/pdf/1705.09906v1.pdf
PWC	https://paperswithcode.com/paper/listen-interact-and-talk-learning-to-speak
Repo	https://github.com/PaddlePaddle/XWorld
Framework	none

Robust importance-weighted cross-validation under sample selection bias


Title	Robust importance-weighted cross-validation under sample selection bias
Authors	Wouter M. Kouw, Jesse H. Krijthe, Marco Loog
Abstract	Cross-validation under sample selection bias can, in principle, be done by importance-weighting the empirical risk. However, the importance-weighted risk estimator produces sub-optimal hyperparameter estimates in problem settings where large weights arise with high probability. We study its sampling variance as a function of the training data distribution and introduce a control variate to increase its robustness to problematically large weights.
Tasks
Published	2017-10-17
URL	https://arxiv.org/abs/1710.06514v3
PDF	https://arxiv.org/pdf/1710.06514v3.pdf
PWC	https://paperswithcode.com/paper/reducing-variance-in-importance-weighted
Repo	https://github.com/wmkouw/ctrl-iwxval
Framework	none
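
The idea in the abstract above can be illustrated with a minimal sketch. It assumes the control variate is the importance weight itself, whose expectation under the training distribution is 1, and that the coefficient is estimated from the same sample; the paper's exact estimator may differ. The function names `iw_risk` and `iw_risk_control_variate` are hypothetical and do not come from the linked repository.

```python
from statistics import fmean


def iw_risk(losses, weights):
    """Plain importance-weighted risk: the mean of w_i * loss_i."""
    return fmean(w * l for w, l in zip(weights, losses))


def iw_risk_control_variate(losses, weights):
    """Importance-weighted risk with the weights used as a control variate.

    Assumption: E[w] = 1 under the training distribution, so subtracting
    beta * (mean(w) - 1) does not change the expectation for a fixed beta,
    but it can cancel much of the variance caused by large weights.
    """
    weighted = [w * l for w, l in zip(weights, losses)]
    mean_wl = fmean(weighted)
    mean_w = fmean(weights)
    var_w = fmean((w - mean_w) ** 2 for w in weights)
    if var_w == 0.0:
        # All weights are equal: there is nothing to correct.
        return mean_wl
    cov = fmean((a - mean_wl) * (w - mean_w) for a, w in zip(weighted, weights))
    beta = cov / var_w  # sample estimate of the variance-minimizing coefficient
    return mean_wl - beta * (mean_w - 1.0)
```

With a constant loss, the corrected estimator recovers that constant even when one weight is very large, whereas the plain estimator is pulled away from it by the sample mean of the weights.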

Replication issues in syntax-based aspect extraction for opinion mining


Title	Replication issues in syntax-based aspect extraction for opinion mining
Authors	Edison Marrese-Taylor, Yutaka Matsuo
Abstract	Reproducing experiments is an important instrument to validate previous work and build upon existing approaches. It has been tackled numerous times in different areas of science. In this paper, we introduce an empirical replicability study of three well-known algorithms for syntax-centric aspect-based opinion mining. We show that reproducing results continues to be a difficult endeavor, mainly due to the lack of details regarding preprocessing and parameter setting, as well as due to the absence of available implementations that clarify these details. We consider these to be important threats to the validity of research in the field, specifically when compared to other problems in NLP where public datasets and code availability are critical validity components. We conclude by encouraging code-based research, which we think has a key role in helping researchers to understand the meaning of the state-of-the-art better and to generate continuous advances.
Tasks	Aspect Extraction, Opinion Mining
Published	2017-01-06
URL	http://arxiv.org/abs/1701.01565v1
PDF	http://arxiv.org/pdf/1701.01565v1.pdf
PWC	https://paperswithcode.com/paper/replication-issues-in-syntax-based-aspect
Repo	https://github.com/epochx/opminreplicability
Framework	none

Low-Rank RNN Adaptation for Context-Aware Language Modeling


Title	Low-Rank RNN Adaptation for Context-Aware Language Modeling
Authors	Aaron Jaech, Mari Ostendorf
Abstract	A context-aware language model uses location, user and/or domain metadata (context) to adapt its predictions. In neural language models, context information is typically represented as an embedding and it is given to the RNN as an additional input, which has been shown to be useful in many applications. We introduce a more powerful mechanism for using context to adapt an RNN by letting the context vector control a low-rank transformation of the recurrent layer weight matrix. Experiments show that allowing a greater fraction of the model parameters to be adjusted has benefits in terms of perplexity and classification for several different types of context.
Tasks	Language Modelling
Published	2017-10-06
URL	http://arxiv.org/abs/1710.02603v2
PDF	http://arxiv.org/pdf/1710.02603v2.pdf
PWC	https://paperswithcode.com/paper/low-rank-rnn-adaptation-for-context-aware
Repo	https://github.com/ajaech/calm
Framework	tf
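
As a rough illustration of the mechanism the abstract describes, the sketch below forms a context-dependent low-rank update to a weight matrix. It assumes the update has the form W + L R^T, where L and R are each a context-weighted mix of learned basis matrices; this is one plausible reading, not a confirmed reproduction of the paper's parameterization. The names `contract` and `adapt_weights` are hypothetical.

```python
def contract(basis, context):
    """Mix k basis matrices with the context vector: sum_k c_k * basis[k]."""
    rows, cols = len(basis[0]), len(basis[0][0])
    return [
        [sum(c * b[i][j] for c, b in zip(context, basis)) for j in range(cols)]
        for i in range(rows)
    ]


def adapt_weights(W, left_basis, right_basis, context):
    """Return W + L R^T, a rank-r update controlled by the context vector.

    Assumed shapes: W is d_out x d_in, each left basis matrix is d_out x r,
    and each right basis matrix is d_in x r.
    """
    L = contract(left_basis, context)
    R = contract(right_basis, context)
    rank = len(L[0])
    return [
        [W[i][j] + sum(L[i][m] * R[j][m] for m in range(rank))
         for j in range(len(W[0]))]
        for i in range(len(W))
    ]
```

A zero context vector leaves W unchanged, and the rank of the update is bounded by r, so the number of extra parameters grows with r rather than with the full size of W.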

Text Annotation Graphs: Annotating Complex Natural Language Phenomena


Title	Text Annotation Graphs: Annotating Complex Natural Language Phenomena
Authors	Angus G. Forbes, Kristine Lee, Gus Hahn-Powell, Marco A. Valenzuela-Escárcega, Mihai Surdeanu
Abstract	This paper introduces a new web-based software tool for annotating text, Text Annotation Graphs, or TAG. It provides functionality for representing complex relationships between words and word phrases that are not available in other software tools, including the ability to define and visualize relationships between the relationships themselves (semantic hypergraphs). Additionally, we include an approach to representing text annotations in which annotation subgraphs, or semantic summaries, are used to show relationships outside of the sequential context of the text itself. Users can use these subgraphs to quickly find similar structures within the current document or external annotated documents. Initially, TAG was developed to support information extraction tasks on a large database of biomedical articles. However, our software is flexible enough to support a wide range of annotation tasks for any domain. Examples are provided that showcase TAG’s capabilities on morphological parsing and event extraction tasks. The TAG software is available at: https://github.com/CreativeCodingLab/TextAnnotationGraphs.
Tasks
Published	2017-11-01
URL	http://arxiv.org/abs/1711.00529v2
PDF	http://arxiv.org/pdf/1711.00529v2.pdf
PWC	https://paperswithcode.com/paper/text-annotation-graphs-annotating-complex
Repo	https://github.com/CreativeCodingLab/TextAnnotationGraphs
Framework	none

Imitating Driver Behavior with Generative Adversarial Networks


Title	Imitating Driver Behavior with Generative Adversarial Networks
Authors	Alex Kuefler, Jeremy Morton, Tim Wheeler, Mykel Kochenderfer
Abstract	The ability to accurately predict and simulate human driving behavior is critical for the development of intelligent transportation systems. Traditional modeling methods have employed simple parametric models and behavioral cloning. This paper adopts a method for overcoming the problem of cascading errors inherent in prior approaches, resulting in realistic behavior that is robust to trajectory perturbations. We extend Generative Adversarial Imitation Learning to the training of recurrent policies, and we demonstrate that our model outperforms rule-based controllers and maximum likelihood models in realistic highway simulations. Our model reproduces emergent behavior of human drivers, such as lane change rate, while maintaining realistic control over long time horizons.
Tasks	Imitation Learning
Published	2017-01-24
URL	http://arxiv.org/abs/1701.06699v1
PDF	http://arxiv.org/pdf/1701.06699v1.pdf
PWC	https://paperswithcode.com/paper/imitating-driver-behavior-with-generative
Repo	https://github.com/sisl/gail-driver
Framework	none

DeepBreath: Deep Learning of Breathing Patterns for Automatic Stress Recognition using Low-Cost Thermal Imaging in Unconstrained Settings


Title	DeepBreath: Deep Learning of Breathing Patterns for Automatic Stress Recognition using Low-Cost Thermal Imaging in Unconstrained Settings
Authors	Youngjun Cho, Nadia Bianchi-Berthouze, Simon J. Julier
Abstract	We propose DeepBreath, a deep learning model which automatically recognises people’s psychological stress level (mental overload) from their breathing patterns. Using a low-cost thermal camera, we track a person’s breathing patterns as temperature changes around his/her nostril. The paper’s technical contribution is threefold. First of all, instead of creating hand-crafted features to capture aspects of the breathing patterns, we transform the uni-dimensional breathing signals into two-dimensional respiration variability spectrogram (RVS) sequences. The spectrograms easily capture the complexity of the breathing dynamics. Second, a spatial pattern analysis based on a deep Convolutional Neural Network (CNN) is directly applied to the spectrogram sequences without the need for hand-crafted features. Finally, a data augmentation technique, inspired by solutions for over-fitting problems in deep learning, is applied to allow the CNN to learn with a small-scale dataset from short-term measurements (e.g., up to a few hours). The model is trained and tested with data collected from people exposed to two types of cognitive tasks (Stroop Colour Word Test, Mental Computation test) with sessions of different difficulty levels. Using normalised self-report as ground truth, the CNN reaches 84.59% accuracy in discriminating between two levels of stress and 56.52% in discriminating between three levels. In addition, the CNN outperformed powerful shallow learning methods based on a single layer neural network. Finally, the dataset of labelled thermal images will be open to the community.
Tasks	Data Augmentation
Published	2017-08-20
URL	http://arxiv.org/abs/1708.06026v1
PDF	http://arxiv.org/pdf/1708.06026v1.pdf
PWC	https://paperswithcode.com/paper/deepbreath-deep-learning-of-breathing
Repo	https://github.com/deepneuroscience/Paced-Math-Test
Framework	none

Automatic 3D Cardiovascular MR Segmentation with Densely-Connected Volumetric ConvNets


Title	Automatic 3D Cardiovascular MR Segmentation with Densely-Connected Volumetric ConvNets
Authors	Lequan Yu, Jie-Zhi Cheng, Qi Dou, Xin Yang, Hao Chen, Jing Qin, Pheng-Ann Heng
Abstract	Automatic and accurate whole-heart and great vessel segmentation from 3D cardiac magnetic resonance (MR) images plays an important role in the computer-assisted diagnosis and treatment of cardiovascular disease. However, this task is very challenging due to ambiguous cardiac borders and large anatomical variations among different subjects. In this paper, we propose a novel densely-connected volumetric convolutional neural network, referred to as DenseVoxNet, to automatically segment the cardiac and vascular structures from 3D cardiac MR images. The DenseVoxNet adopts the 3D fully convolutional architecture for effective volume-to-volume prediction. From the learning perspective, our DenseVoxNet has three compelling advantages. First, it preserves the maximum information flow between layers by a densely-connected mechanism and hence eases the network training. Second, it avoids learning redundant feature maps by encouraging feature reuse and hence requires fewer parameters to achieve high performance, which is essential for medical applications with limited training data. Third, we add auxiliary side paths to strengthen the gradient propagation and stabilize the learning process. We demonstrate the effectiveness of DenseVoxNet by comparing it with the state-of-the-art approaches from the HVSMR 2016 challenge held in conjunction with MICCAI, and our network achieves the best Dice coefficient. We also show that our network can achieve better performance than other 3D ConvNets but with fewer parameters.
Tasks
Published	2017-08-02
URL	http://arxiv.org/abs/1708.00573v1
PDF	http://arxiv.org/pdf/1708.00573v1.pdf
PWC	https://paperswithcode.com/paper/automatic-3d-cardiovascular-mr-segmentation
Repo	https://github.com/yulequan/HeartSeg
Framework	none
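
The abstract reports results in terms of the Dice coefficient. For reference, a minimal sketch of that metric on flattened binary masks follows; the function name `dice_coefficient` is hypothetical, and returning 1.0 when both masks are empty is an assumed convention rather than something stated in the source.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A 3D segmentation volume can be scored the same way after flattening it into a single sequence of voxel labels per structure.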

RRA: Recurrent Residual Attention for Sequence Learning


Title	RRA: Recurrent Residual Attention for Sequence Learning
Authors	Cheng Wang
Abstract	In this paper, we propose a recurrent neural network (RNN) with residual attention (RRA) to learn long-range dependencies from sequential data. We propose to add residual connections across timesteps to the RNN, which explicitly enhances the interaction between the current state and hidden states that are several timesteps apart. This also allows training errors to be directly back-propagated through residual connections and effectively alleviates the vanishing gradient problem. We further reformulate an attention mechanism over residual connections. An attention gate is defined to summarize the individual contribution from multiple previous hidden states in computing the current state. We evaluate RRA on three tasks: the adding problem, pixel-by-pixel MNIST classification and sentiment analysis on the IMDB dataset. Our experiments demonstrate that RRA yields better performance, faster convergence and more stable training compared to a standard LSTM network. Furthermore, RRA shows highly competitive performance compared to state-of-the-art methods.
Tasks	Sentiment Analysis
Published	2017-09-12
URL	http://arxiv.org/abs/1709.03714v1
PDF	http://arxiv.org/pdf/1709.03714v1.pdf
PWC	https://paperswithcode.com/paper/rra-recurrent-residual-attention-for-sequence
Repo	https://github.com/JRC1995/Abstractive-Summarization
Framework	tf

Multi-style Generative Network for Real-time Transfer


Title	Multi-style Generative Network for Real-time Transfer
Authors	Hang Zhang, Kristin Dana
Abstract	Despite the rapid progress in style transfer, existing approaches using a feed-forward generative network for multi-style or arbitrary-style transfer usually compromise image quality and model flexibility. We find it is fundamentally difficult to achieve comprehensive style modeling using a 1-dimensional style embedding. Motivated by this, we introduce the CoMatch Layer, which learns to match the second order feature statistics with the target styles. With the CoMatch Layer, we build a Multi-style Generative Network (MSG-Net), which achieves real-time performance. We also employ a specific strategy of upsampled convolution which avoids checkerboard artifacts caused by fractionally-strided convolution. Our method has achieved superior image quality compared to state-of-the-art approaches. The proposed MSG-Net as a general approach for real-time style transfer is compatible with most existing techniques including content-style interpolation, color-preserving, spatial control and brush stroke size control. MSG-Net is the first to achieve real-time brush-size control in a purely feed-forward manner for style transfer. Our implementations and pre-trained models for Torch, PyTorch and MXNet frameworks will be publicly available.
Tasks	Style Transfer
Published	2017-03-20
URL	http://arxiv.org/abs/1703.06953v2
PDF	http://arxiv.org/pdf/1703.06953v2.pdf
PWC	https://paperswithcode.com/paper/multi-style-generative-network-for-real-time
Repo	https://github.com/zhanghang1989/PyTorch-Multi-Style-Transfer
Framework	pytorch

Stochastic Training of Graph Convolutional Networks with Variance Reduction


Title	Stochastic Training of Graph Convolutional Networks with Variance Reduction
Authors	Jianfei Chen, Jun Zhu, Le Song
Abstract	Graph convolutional networks (GCNs) are powerful deep neural networks for graph-structured data. However, a GCN computes the representation of a node recursively from its neighbors, making the receptive field size grow exponentially with the number of layers. Previous attempts at reducing the receptive field size by subsampling neighbors do not have a convergence guarantee, and their receptive field size per node is still on the order of hundreds. In this paper, we develop control variate based algorithms which allow sampling an arbitrarily small neighbor size. Furthermore, we prove a new theoretical guarantee for our algorithms to converge to a local optimum of GCN. Empirical results show that our algorithms enjoy a similar convergence with the exact algorithm using only two neighbors per node. The runtime of our algorithms on a large Reddit dataset is only one seventh of previous neighbor sampling algorithms.
Tasks
Published	2017-10-29
URL	http://arxiv.org/abs/1710.10568v3
PDF	http://arxiv.org/pdf/1710.10568v3.pdf
PWC	https://paperswithcode.com/paper/stochastic-training-of-graph-convolutional
Repo	https://github.com/thu-ml/stochastic_gcn
Framework	tf
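
The control-variate idea from the abstract above can be sketched for a single node with scalar activations. The sketch assumes mean aggregation over neighbors and a stored table of historical activations; only the difference between current and historical values is estimated from a small sample. The names `exact_aggregate`, `cv_aggregate` and `h_hist` are hypothetical, and the paper's actual estimator and normalization may differ.

```python
import random


def exact_aggregate(neighbors, h):
    """Mean of the current activations over all neighbors (the exact value)."""
    return sum(h[v] for v in neighbors) / len(neighbors)


def cv_aggregate(neighbors, h, h_hist, sample_size, rng):
    """Control-variate estimate of the neighbor mean.

    The stored historical activations h_hist are averaged over all
    neighbors, which needs no recursive computation. Only the residual
    h - h_hist is estimated from sample_size sampled neighbors.
    """
    sampled = rng.sample(neighbors, sample_size)
    correction = sum(h[v] - h_hist[v] for v in sampled) / sample_size
    history = sum(h_hist[v] for v in neighbors) / len(neighbors)
    return correction + history
```

When the historical activations are close to the current ones, the sampled residual is small, so even two sampled neighbors give an estimate near the exact mean. If the history equals the current activations, the estimate is exact regardless of which neighbors are drawn.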