October 21, 2019

Paper Group AWR 113

Learning to Color from Language


Title	Learning to Color from Language
Authors	Varun Manjunatha, Mohit Iyyer, Jordan Boyd-Graber, Larry Davis
Abstract	Automatic colorization is the process of adding color to greyscale images. We condition this process on language, allowing end users to manipulate a colorized image by feeding in different captions. We present two different architectures for language-conditioned colorization, both of which produce more accurate and plausible colorizations than a language-agnostic version. Through this language-based framework, we can dramatically alter colorizations by manipulating descriptive color words in captions.
Tasks	Colorization
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06026v1
PDF	http://arxiv.org/pdf/1804.06026v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-color-from-language
Repo	https://github.com/superhans/colorfromlanguage
Framework	pytorch

RankME: Reliable Human Ratings for Natural Language Generation


Title	RankME: Reliable Human Ratings for Natural Language Generation
Authors	Jekaterina Novikova, Ondřej Dušek, Verena Rieser
Abstract	Human evaluation for natural language generation (NLG) often suffers from inconsistent user ratings. While previous research tends to attribute this problem to individual user preferences, we show that the quality of human judgements can also be improved by experimental design. We present a novel rank-based magnitude estimation method (RankME), which combines the use of continuous scales and relative assessments. We show that RankME significantly improves the reliability and consistency of human ratings compared to traditional evaluation methods. In addition, we show that it is possible to evaluate NLG systems according to multiple, distinct criteria, which is important for error analysis. Finally, we demonstrate that RankME, in combination with Bayesian estimation of system quality, is a cost-effective alternative for ranking multiple NLG systems.
Tasks	Text Generation
Published	2018-03-15
URL	http://arxiv.org/abs/1803.05928v1
PDF	http://arxiv.org/pdf/1803.05928v1.pdf
PWC	https://paperswithcode.com/paper/rankme-reliable-human-ratings-for-natural
Repo	https://github.com/jeknov/RankME
Framework	none

Adversarially Regularized Graph Autoencoder for Graph Embedding


Title	Adversarially Regularized Graph Autoencoder for Graph Embedding
Authors	Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang
Abstract	Graph embedding is an effective method to represent graph data in a low dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.
Tasks	Graph Clustering, Graph Embedding, Link Prediction
Published	2018-02-13
URL	http://arxiv.org/abs/1802.04407v2
PDF	http://arxiv.org/pdf/1802.04407v2.pdf
PWC	https://paperswithcode.com/paper/adversarially-regularized-graph-autoencoder
Repo	https://github.com/Ruiqi-Hu/ARGA
Framework	tf

Probabilistic Recurrent State-Space Models


Title	Probabilistic Recurrent State-Space Models
Authors	Andreas Doerr, Christian Daniel, Martin Schiegg, Duy Nguyen-Tuong, Stefan Schaal, Marc Toussaint, Sebastian Trimpe
Abstract	State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex time series data. Fully probabilistic SSMs, however, are often found hard to train, even for smaller problems. To overcome this limitation, we propose a novel model formulation and a scalable training algorithm based on doubly stochastic variational inference and Gaussian processes. In contrast to existing work, the proposed variational approximation allows one to fully capture the latent state temporal correlations. These correlations are the key to robust training. The effectiveness of the proposed PR-SSM is evaluated on a set of real-world benchmark datasets in comparison to state-of-the-art probabilistic model learning methods. Scalability and robustness are demonstrated on a high dimensional problem.
Tasks	Gaussian Processes, Time Series
Published	2018-01-31
URL	http://arxiv.org/abs/1801.10395v2
PDF	http://arxiv.org/pdf/1801.10395v2.pdf
PWC	https://paperswithcode.com/paper/probabilistic-recurrent-state-space-models
Repo	https://github.com/boschresearch/PR-SSM
Framework	tf

DARTS: Differentiable Architecture Search


Title	DARTS: Differentiable Architecture Search
Authors	Hanxiao Liu, Karen Simonyan, Yiming Yang
Abstract	This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
Tasks	Image Classification, Language Modelling, Neural Architecture Search
Published	2018-06-24
URL	http://arxiv.org/abs/1806.09055v2
PDF	http://arxiv.org/pdf/1806.09055v2.pdf
PWC	https://paperswithcode.com/paper/darts-differentiable-architecture-search
Repo	https://github.com/yochaiz/darts-UNIQ
Framework	pytorch
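
The continuous relaxation mentioned in the abstract can be illustrated with a toy sketch. It assumes the form used in the DARTS paper, where each edge computes a softmax-weighted mixture of candidate operations and the final architecture keeps the highest-weighted one. The three scalar operations and the names `mixed_op` and `discretize` are hypothetical stand-ins, not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy candidate operations on a scalar input (assumed for illustration only).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Relaxed edge: softmax(alphas)-weighted sum of all candidate ops."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

def discretize(alphas):
    """After search, keep only the op with the largest architecture weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]

relaxed_output = mixed_op(1.0, [0.0, 0.0, 0.0])  # equal mixture of the three ops
chosen = discretize([0.1, 2.0, -1.0])
```

Because `mixed_op` is differentiable in `alphas`, the architecture weights can in principle be trained by gradient descent alongside the network weights, which is the property the abstract highlights.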

Learning to Describe Differences Between Pairs of Similar Images


Title	Learning to Describe Differences Between Pairs of Similar Images
Authors	Harsh Jhamtani, Taylor Berg-Kirkpatrick
Abstract	In this paper, we introduce the task of automatically generating text to describe the differences between two similar images. We collect a new dataset by crowd-sourcing difference descriptions for pairs of image frames extracted from video-surveillance footage. Annotators were asked to succinctly describe all the differences in a short paragraph. As a result, our novel dataset provides an opportunity to explore models that align language and vision, and capture visual salience. The dataset may also be a useful benchmark for coherent multi-sentence generation. We perform a first-pass visual analysis that exposes clusters of differing pixels as a proxy for object-level differences. We propose a model that captures visual salience by using a latent variable to align clusters of differing pixels with output sentences. We find that, for both single-sentence and multi-sentence generation, the proposed model outperforms the models that use attention alone.
Tasks
Published	2018-08-31
URL	http://arxiv.org/abs/1808.10584v1
PDF	http://arxiv.org/pdf/1808.10584v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-describe-differences-between
Repo	https://github.com/harsh19/spot-the-diff
Framework	none

Federated Optimization in Heterogeneous Networks


Title	Federated Optimization in Heterogeneous Networks
Authors	Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith
Abstract	Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedProx, to tackle heterogeneity in federated networks. FedProx can be viewed as a generalization and re-parametrization of FedAvg, the current state-of-the-art method for federated learning. While FedProx makes only minor algorithmic modifications to FedAvg, these modifications have important ramifications both in theory and in practice. Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity). Practically, we demonstrate that FedProx allows for more robust convergence than FedAvg across a suite of federated datasets. In particular, in highly heterogeneous settings, FedProx demonstrates significantly more stable and accurate convergence behavior relative to FedAvg—improving absolute test accuracy by 22% on average.
Tasks	Distributed Optimization
Published	2018-12-14
URL	https://arxiv.org/abs/1812.06127v4
PDF	https://arxiv.org/pdf/1812.06127v4.pdf
PWC	https://paperswithcode.com/paper/federated-optimization-for-heterogeneous
Repo	https://github.com/litian96/FedProx
Framework	none
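
The abstract describes FedProx only as a minor modification of FedAvg that tolerates a variable amount of local work per device. In the paper, that modification is a proximal term added to each device's local objective, and the sketch below illustrates it on a toy scalar problem. The quadratic local loss, the names `local_update` and `federated_round`, and all parameter values are illustrative assumptions, not the authors' code.

```python
def local_update(w_global, target, mu, steps=10, lr=0.1):
    """Gradient steps on (w - target)^2 + (mu / 2) * (w - w_global)^2.

    The second term is the proximal penalty; mu = 0 gives a plain
    FedAvg-style local update on the toy loss.
    """
    w = w_global
    for _ in range(steps):
        grad = 2.0 * (w - target) + mu * (w - w_global)
        w -= lr * grad
    return w

def federated_round(w_global, targets, mu, steps_per_device):
    """One round: every device trains locally, the server averages the results.

    steps_per_device models systems heterogeneity (devices do different
    amounts of work); targets model statistical heterogeneity.
    """
    updates = [
        local_update(w_global, target, mu, steps=steps)
        for target, steps in zip(targets, steps_per_device)
    ]
    return sum(updates) / len(updates)

plain = local_update(0.0, 10.0, mu=0.0)    # drifts far towards the local optimum
proximal = local_update(0.0, 10.0, mu=1.0)  # stays closer to the global model
new_global = federated_round(0.0, [1.0, 5.0, 10.0], mu=1.0, steps_per_device=[2, 5, 10])
```

A larger `mu` keeps each local model closer to the current global model, which limits how far devices with very different data can pull the average apart.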

DeepScores and Deep Watershed Detection: current state and open issues


Title	DeepScores and Deep Watershed Detection: current state and open issues
Authors	Ismail Elezi, Lukas Tuggener, Marcello Pelillo, Thilo Stadelmann
Abstract	This paper gives an overview of our current Optical Music Recognition (OMR) research. We recently released the OMR dataset DeepScores as well as the object detection method Deep Watershed Detector. We are currently taking some additional steps to improve both of them. Here we summarize current and future efforts, aimed at improving usefulness on real-world tasks and tackling extreme class imbalance.
Tasks	Object Detection
Published	2018-10-12
URL	http://arxiv.org/abs/1810.05423v1
PDF	http://arxiv.org/pdf/1810.05423v1.pdf
PWC	https://paperswithcode.com/paper/deepscores-and-deep-watershed-detection
Repo	https://github.com/tuggeluk/DeepWatershedDetection
Framework	tf

Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning


Title	Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning
Authors	Ulysse Côté-Allard, Cheikh Latyr Fall, Alexandre Drouin, Alexandre Campeau-Lecours, Clément Gosselin, Kyrre Glette, François Laviolette, Benoit Gosselin
Abstract	In recent years, deep learning algorithms have become increasingly prominent for their unparalleled ability to automatically learn discriminant features from large amounts of data. However, within the field of electromyography-based gesture recognition, deep learning algorithms are seldom employed as they require an unreasonable amount of effort from a single person to generate tens of thousands of examples. This work’s hypothesis is that general, informative features can be learned from the large amounts of data generated by aggregating the signals of multiple users, thus reducing the recording burden while enhancing gesture recognition. Consequently, this paper proposes applying transfer learning on aggregated data from multiple users, while leveraging the capacity of deep learning algorithms to learn discriminant features from large datasets. Two datasets comprised of 19 and 17 able-bodied participants respectively (the first one is employed for pre-training) were recorded for this work, using the Myo Armband. A third Myo Armband dataset was taken from the NinaPro database and is comprised of 10 able-bodied participants. Three different deep learning networks employing three different modalities as input (raw EMG, Spectrograms and Continuous Wavelet Transform (CWT)) are tested on the second and third datasets. The proposed transfer learning scheme is shown to systematically and significantly enhance the performance for all three networks on the two datasets, achieving an offline accuracy of 98.31% for 7 gestures over 17 participants for the CWT-based ConvNet and 68.98% for 18 gestures over 10 participants for the raw EMG-based ConvNet. Finally, a use-case study employing eight able-bodied participants suggests that real-time feedback allows users to adapt their muscle activation strategy, which reduces the degradation in accuracy normally experienced over time.
Tasks	Gesture Recognition, Transfer Learning
Published	2018-01-10
URL	http://arxiv.org/abs/1801.07756v5
PDF	http://arxiv.org/pdf/1801.07756v5.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-electromyographic-hand
Repo	https://github.com/UlysseCoteAllard/MyoArmbandDataset
Framework	pytorch

TADAM: Task dependent adaptive metric for improved few-shot learning


Title	TADAM: Task dependent adaptive metric for improved few-shot learning
Authors	Boris N. Oreshkin, Pau Rodriguez, Alexandre Lacoste
Abstract	Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14% in accuracy for certain metrics on the mini-Imagenet 5-way 5-shot classification task. We further propose a simple and effective way of conditioning a learner on the task sample set, resulting in learning a task-dependent metric space. Moreover, we propose and empirically test a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space. The resulting few-shot learning model based on the task-dependent scaled metric achieves state of the art on mini-Imagenet. We confirm these results on another few-shot dataset that we introduce in this paper based on CIFAR100. Our code is publicly available at https://github.com/ElementAI/TADAM.
Tasks	Few-Shot Image Classification, Few-Shot Learning
Published	2018-05-23
URL	http://arxiv.org/abs/1805.10123v4
PDF	http://arxiv.org/pdf/1805.10123v4.pdf
PWC	https://paperswithcode.com/paper/tadam-task-dependent-adaptive-metric-for
Repo	https://github.com/yaoyao-liu/meta-transfer-learning
Framework	pytorch
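
The effect of metric scaling described in the abstract can be seen in a small sketch. It assumes a prototype-style classifier in which class probabilities are a softmax over negative scaled distances, as in the TADAM paper. The squared Euclidean distance, the toy 2-D prototypes, and the name `class_probabilities` are assumptions made for illustration.

```python
import math

def class_probabilities(query, prototypes, scale):
    """Softmax over -scale * squared Euclidean distance to each class prototype."""
    dists = [
        sum((q - p) ** 2 for q, p in zip(query, proto))
        for proto in prototypes
    ]
    logits = [-scale * d for d in dists]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

query = (0.0, 0.0)
prototypes = [(1.0, 0.0), (2.0, 0.0)]  # toy class centres

soft = class_probabilities(query, prototypes, scale=0.1)   # near-uniform output
sharp = class_probabilities(query, prototypes, scale=1.0)  # peaked on the nearest class
```

The same distances give a much more peaked distribution at the larger scale, so the scale factor changes the size of the gradients that reach the embedding, in line with the abstract's claim that scaling changes the nature of the parameter updates.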

On First-Order Meta-Learning Algorithms


Title	On First-Order Meta-Learning Algorithms
Authors	Alex Nichol, Joshua Achiam, John Schulman
Abstract	This paper considers meta-learning problems, where there is a distribution of tasks, and we would like to obtain an agent that performs well (i.e., learns quickly) when presented with a previously unseen task sampled from this distribution. We analyze a family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates. This family includes and generalizes first-order MAML, an approximation to MAML obtained by ignoring second-order derivatives. It also includes Reptile, a new algorithm that we introduce here, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task. We expand on the results from Finn et al. showing that first-order meta-learning algorithms perform well on some well-established benchmarks for few-shot classification, and we provide theoretical analysis aimed at understanding why these algorithms work.
Tasks	Few-Shot Image Classification, Few-Shot Learning, Meta-Learning
Published	2018-03-08
URL	http://arxiv.org/abs/1803.02999v3
PDF	http://arxiv.org/pdf/1803.02999v3.pdf
PWC	https://paperswithcode.com/paper/on-first-order-meta-learning-algorithms
Repo	https://github.com/peisungtsai/Reptile-Pytorch-Implementation
Framework	pytorch
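
The Reptile loop described in the abstract (sample a task, train on it, move the initialization towards the trained weights) can be sketched on a toy problem. Each task here is a scalar quadratic loss with its own optimum. The task family, step counts, learning rates, and the function names are illustrative assumptions, not the authors' setup.

```python
import random

def inner_sgd(w, target, steps=5, lr=0.1):
    """Train on one task: gradient steps on the loss (w - target)^2."""
    for _ in range(steps):
        grad = 2.0 * (w - target)
        w -= lr * grad
    return w

def reptile(task_targets, meta_steps=200, meta_lr=0.1, seed=0):
    """Learn an initialization using only first-order information."""
    rng = random.Random(seed)
    init = 0.0
    for _ in range(meta_steps):
        target = rng.choice(task_targets)    # sample a task
        trained = inner_sgd(init, target)    # train on it from the current init
        init += meta_lr * (trained - init)   # move init towards the trained weights
    return init

init = reptile([1.0, 2.0, 3.0])
```

No second-order derivatives appear anywhere in the loop. In this toy setting the learned initialization ends up inside the range spanned by the task optima, from which every task is reachable in a few gradient steps.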

Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?


Title	Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?
Authors	Nam Vo, James Hays
Abstract	This work studies deep metric learning under small to medium scale data as we believe that better generalization could be a contributing factor to the improvement of previous fine-grained image retrieval methods; it should be considered when designing future techniques. In particular, we investigate using other layers in a deep metric learning system (besides the embedding layer) for feature extraction and analyze how well they perform on training data and generalize to testing data. From this study, we suggest a new regularization practice where one can add or choose a more optimal layer for feature extraction. State-of-the-art performance is demonstrated on 3 fine-grained image retrieval benchmarks: Cars-196, CUB-200-2011, and Stanford Online Product.
Tasks	Image Retrieval, Metric Learning
Published	2018-03-08
URL	http://arxiv.org/abs/1803.03310v2
PDF	http://arxiv.org/pdf/1803.03310v2.pdf
PWC	https://paperswithcode.com/paper/generalization-in-metric-learning-should-the
Repo	https://github.com/lugiavn/generalization-dml
Framework	pytorch

How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks


Title	How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks
Authors	Linli Xu, Liang Jiang, Chuan Qin, Zhe Wang, Dongfang Du
Abstract	With the recent advances of neural models and natural language processing, automatic generation of classical Chinese poetry has drawn significant attention due to its artistic and cultural value. Previous works mainly focus on generating poetry given keywords or other text information, while visual inspirations for poetry have been rarely explored. Generating poetry from images is much more challenging than generating poetry from text, since images contain very rich visual information which cannot be described completely using several keywords, and a good poem should convey the image accurately. In this paper, we propose a memory based neural model which exploits images to generate poems. Specifically, an Encoder-Decoder model with a topic memory network is proposed to generate classical Chinese poetry from images. To the best of our knowledge, this is the first work attempting to generate classical Chinese poetry from images with neural networks. A comprehensive experimental investigation with both human evaluation and quantitative analysis demonstrates that the proposed model can generate poems which convey images accurately.
Tasks
Published	2018-03-08
URL	http://arxiv.org/abs/1803.02994v1
PDF	http://arxiv.org/pdf/1803.02994v1.pdf
PWC	https://paperswithcode.com/paper/how-images-inspire-poems-generating-classical
Repo	https://github.com/forrestbing/chinese-poetry-generation
Framework	none

Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling


Title	Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling
Authors	Zhiqing Sun, Zhi-Hong Deng
Abstract	Previous traditional approaches to unsupervised Chinese word segmentation (CWS) can be roughly classified into discriminative and generative models. The former use carefully designed goodness measures for candidate segmentation, while the latter focus on finding the optimal segmentation of the highest generative probability. However, while there exists a trivial way to extend the discriminative models into a neural version by using neural language models, extending the generative ones is non-trivial. In this paper, we propose the segmental language models (SLMs) for CWS. Our approach explicitly focuses on the segmental nature of Chinese, while preserving several properties of language models. In SLMs, a context encoder encodes the previous context and a segment decoder generates each segment incrementally. As far as we know, we are the first to propose a neural model for unsupervised CWS and achieve performance competitive with the state-of-the-art statistical models on four different datasets from the SIGHAN 2005 bakeoff.
Tasks	Chinese Word Segmentation, Language Modelling
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03167v1
PDF	http://arxiv.org/pdf/1810.03167v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-neural-word-segmentation-for
Repo	https://github.com/Edward-Sun/SLM
Framework	pytorch

Compressed Sensing Using Binary Matrices of Nearly Optimal Dimensions


Title	Compressed Sensing Using Binary Matrices of Nearly Optimal Dimensions
Authors	Mahsa Lotfi, Mathukumalli Vidyasagar
Abstract	In this paper, we study the problem of compressed sensing using binary measurement matrices, and $\ell_1$-norm minimization (basis pursuit) as the recovery algorithm. We derive new upper and lower bounds on the number of measurements to achieve robust sparse recovery with binary matrices. We establish sufficient conditions for a column-regular binary matrix to satisfy the robust null space property (RNSP), and show that the sparsity bounds for robust sparse recovery obtained using the RNSP are better by a factor of $(3 \sqrt{3})/2 \approx 2.6$ compared to the restricted isometry property (RIP). Next we derive universal lower bounds on the number of measurements that any binary matrix needs to have in order to satisfy the weaker sufficient condition based on the RNSP, and show that bipartite graphs of girth six are optimal. Then we display two classes of binary matrices, namely parity check matrices of array codes, and Euler squares, that have girth six and are nearly optimal in the sense of almost satisfying the lower bound. In principle, randomly generated Gaussian measurement matrices are "order-optimal." So we compare the phase transition behavior of the basis pursuit formulation using binary array code and Gaussian matrices, and show that (i) there is essentially no difference between the phase transition boundaries in the two cases, and (ii) basis pursuit with binary matrices is hundreds of times faster in CPU time than with Gaussian matrices, and the storage requirements are lower. Therefore it is suggested that binary matrices are a viable alternative to Gaussian matrices for compressed sensing using basis pursuit.
Tasks
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03001v2
PDF	http://arxiv.org/pdf/1808.03001v2.pdf
PWC	https://paperswithcode.com/paper/compressed-sensing-using-binary-matrices-of
Repo	https://github.com/monajemi/CompressedSensing
Framework	none