October 21, 2019

Paper Group AWR 113

Learning to Color from Language. RankME: Reliable Human Ratings for Natural Language Generation. Adversarially Regularized Graph Autoencoder for Graph Embedding. Probabilistic Recurrent State-Space Models. DARTS: Differentiable Architecture Search. Learning to Describe Differences Between Pairs of Similar Images. Federated Optimization in Heterogen …

Learning to Color from Language

Title Learning to Color from Language
Authors Varun Manjunatha, Mohit Iyyer, Jordan Boyd-Graber, Larry Davis
Abstract Automatic colorization is the process of adding color to greyscale images. We condition this process on language, allowing end users to manipulate a colorized image by feeding in different captions. We present two different architectures for language-conditioned colorization, both of which produce more accurate and plausible colorizations than a language-agnostic version. Through this language-based framework, we can dramatically alter colorizations by manipulating descriptive color words in captions.
Tasks Colorization
Published 2018-04-17
URL http://arxiv.org/abs/1804.06026v1
PDF http://arxiv.org/pdf/1804.06026v1.pdf
PWC https://paperswithcode.com/paper/learning-to-color-from-language
Repo https://github.com/superhans/colorfromlanguage
Framework pytorch

RankME: Reliable Human Ratings for Natural Language Generation

Title RankME: Reliable Human Ratings for Natural Language Generation
Authors Jekaterina Novikova, Ondřej Dušek, Verena Rieser
Abstract Human evaluation for natural language generation (NLG) often suffers from inconsistent user ratings. While previous research tends to attribute this problem to individual user preferences, we show that the quality of human judgements can also be improved by experimental design. We present a novel rank-based magnitude estimation method (RankME), which combines the use of continuous scales and relative assessments. We show that RankME significantly improves the reliability and consistency of human ratings compared to traditional evaluation methods. In addition, we show that it is possible to evaluate NLG systems according to multiple, distinct criteria, which is important for error analysis. Finally, we demonstrate that RankME, in combination with Bayesian estimation of system quality, is a cost-effective alternative for ranking multiple NLG systems.
Tasks Text Generation
Published 2018-03-15
URL http://arxiv.org/abs/1803.05928v1
PDF http://arxiv.org/pdf/1803.05928v1.pdf
PWC https://paperswithcode.com/paper/rankme-reliable-human-ratings-for-natural
Repo https://github.com/jeknov/RankME
Framework none

Adversarially Regularized Graph Autoencoder for Graph Embedding

Title Adversarially Regularized Graph Autoencoder for Graph Embedding
Authors Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang
Abstract Graph embedding is an effective method to represent graph data in a low dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.
Tasks Graph Clustering, Graph Embedding, Link Prediction
Published 2018-02-13
URL http://arxiv.org/abs/1802.04407v2
PDF http://arxiv.org/pdf/1802.04407v2.pdf
PWC https://paperswithcode.com/paper/adversarially-regularized-graph-autoencoder
Repo https://github.com/Ruiqi-Hu/ARGA
Framework tf

Probabilistic Recurrent State-Space Models

Title Probabilistic Recurrent State-Space Models
Authors Andreas Doerr, Christian Daniel, Martin Schiegg, Duy Nguyen-Tuong, Stefan Schaal, Marc Toussaint, Sebastian Trimpe
Abstract State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex time series data. Fully probabilistic SSMs, however, are often found hard to train, even for smaller problems. To overcome this limitation, we propose a novel model formulation and a scalable training algorithm based on doubly stochastic variational inference and Gaussian processes. In contrast to existing work, the proposed variational approximation allows one to fully capture the latent state temporal correlations. These correlations are the key to robust training. The effectiveness of the proposed PR-SSM is evaluated on a set of real-world benchmark datasets in comparison to state-of-the-art probabilistic model learning methods. Scalability and robustness are demonstrated on a high dimensional problem.
Tasks Gaussian Processes, Time Series
Published 2018-01-31
URL http://arxiv.org/abs/1801.10395v2
PDF http://arxiv.org/pdf/1801.10395v2.pdf
PWC https://paperswithcode.com/paper/probabilistic-recurrent-state-space-models
Repo https://github.com/boschresearch/PR-SSM
Framework tf

DARTS: Differentiable Architecture Search

Title DARTS: Differentiable Architecture Search
Authors Hanxiao Liu, Karen Simonyan, Yiming Yang
Abstract This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
Tasks Image Classification, Language Modelling, Neural Architecture Search
Published 2018-06-24
URL http://arxiv.org/abs/1806.09055v2
PDF http://arxiv.org/pdf/1806.09055v2.pdf
PWC https://paperswithcode.com/paper/darts-differentiable-architecture-search
Repo https://github.com/yochaiz/darts-UNIQ
Framework pytorch
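
Note (illustrative sketch, not taken from the linked repository): the abstract above describes a continuous relaxation of the architecture representation. One common way to read this, as formulated in the DARTS paper, is that each edge computes a softmax-weighted mixture of candidate operations, and the final architecture keeps the operation with the largest weight. The toy scalar operations and the names `CANDIDATE_OPS`, `mixed_op`, and `discretize` below are assumptions made for illustration only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a scalar input (stand-ins for conv, pool, etc.).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "zero": lambda x: 0.0,
    "double": lambda x: 2.0 * x,
}


def mixed_op(x, alphas):
    """Continuous relaxation: weight every candidate op by softmax(alpha)."""
    weights = softmax([alphas[name] for name in CANDIDATE_OPS])
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def discretize(alphas):
    """After search, keep only the op with the largest architecture weight."""
    return max(alphas, key=alphas.get)


uniform = {"identity": 0.0, "zero": 0.0, "double": 0.0}
mixture_output = mixed_op(3.0, uniform)
chosen = discretize({"identity": 0.1, "zero": -1.0, "double": 2.0})
```

Because `mixed_op` is differentiable in the `alphas`, they can in principle be tuned by gradient descent alongside the network weights; the gradient step itself is omitted here.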

Learning to Describe Differences Between Pairs of Similar Images

Title Learning to Describe Differences Between Pairs of Similar Images
Authors Harsh Jhamtani, Taylor Berg-Kirkpatrick
Abstract In this paper, we introduce the task of automatically generating text to describe the differences between two similar images. We collect a new dataset by crowd-sourcing difference descriptions for pairs of image frames extracted from video-surveillance footage. Annotators were asked to succinctly describe all the differences in a short paragraph. As a result, our novel dataset provides an opportunity to explore models that align language and vision, and capture visual salience. The dataset may also be a useful benchmark for coherent multi-sentence generation. We perform a first-pass visual analysis that exposes clusters of differing pixels as a proxy for object-level differences. We propose a model that captures visual salience by using a latent variable to align clusters of differing pixels with output sentences. We find that, for both single-sentence and multi-sentence generation, the proposed model outperforms the models that use attention alone.
Tasks
Published 2018-08-31
URL http://arxiv.org/abs/1808.10584v1
PDF http://arxiv.org/pdf/1808.10584v1.pdf
PWC https://paperswithcode.com/paper/learning-to-describe-differences-between
Repo https://github.com/harsh19/spot-the-diff
Framework none

Federated Optimization in Heterogeneous Networks

Title Federated Optimization in Heterogeneous Networks
Authors Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith
Abstract Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedProx, to tackle heterogeneity in federated networks. FedProx can be viewed as a generalization and re-parametrization of FedAvg, the current state-of-the-art method for federated learning. While FedProx makes only minor algorithmic modifications to FedAvg, these modifications have important ramifications both in theory and in practice. Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity). Practically, we demonstrate that FedProx allows for more robust convergence than FedAvg across a suite of federated datasets. In particular, in highly heterogeneous settings, FedProx demonstrates significantly more stable and accurate convergence behavior relative to FedAvg—improving absolute test accuracy by 22% on average.
Tasks Distributed Optimization
Published 2018-12-14
URL https://arxiv.org/abs/1812.06127v4
PDF https://arxiv.org/pdf/1812.06127v4.pdf
PWC https://paperswithcode.com/paper/federated-optimization-for-heterogeneous
Repo https://github.com/litian96/FedProx
Framework none
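
Note (illustrative sketch, not taken from the linked repository): the abstract describes FedProx as a small modification of FedAvg that tolerates a variable amount of local work per device. In the FedProx paper this modification is a proximal term, (mu / 2) * ||w - w_global||^2, added to each device's local objective; the abstract itself does not spell this out, so treat it as background knowledge. The toy quadratic local loss and the names `local_update` and `federated_round` are assumptions for illustration.

```python
def local_update(w_global, center, mu, steps, lr=0.1):
    """Local gradient steps on 0.5 * (w - center)**2 plus the proximal term.

    The proximal term (mu / 2) * (w - w_global)**2 pulls the local model
    back toward the current global model. With mu = 0 this reduces to a
    plain FedAvg-style local update.
    """
    w = w_global
    for _ in range(steps):
        grad = (w - center) + mu * (w - w_global)
        w -= lr * grad
    return w


def federated_round(w_global, centers, local_steps, mu):
    """One round: each device does its own number of steps, the server averages."""
    updates = [
        local_update(w_global, center, mu, steps)
        for center, steps in zip(centers, local_steps)
    ]
    return sum(updates) / len(updates)


# Devices with different data (centers) and different amounts of work (steps).
drift_without_prox = abs(local_update(0.0, 10.0, mu=0.0, steps=20))
drift_with_prox = abs(local_update(0.0, 10.0, mu=1.0, steps=20))
new_global = federated_round(0.0, centers=[1.0, 5.0, 10.0], local_steps=[1, 5, 20], mu=1.0)
```

In this toy setting a larger `mu` keeps each device's update closer to the global model, which is the intuition behind the more stable convergence the abstract reports.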

DeepScores and Deep Watershed Detection: current state and open issues

Title DeepScores and Deep Watershed Detection: current state and open issues
Authors Ismail Elezi, Lukas Tuggener, Marcello Pelillo, Thilo Stadelmann
Abstract This paper gives an overview of our current Optical Music Recognition (OMR) research. We recently released the OMR dataset DeepScores as well as the object detection method Deep Watershed Detector. We are currently taking some additional steps to improve both of them. Here we summarize current and future efforts, aimed at improving usefulness on real-world tasks and tackling extreme class imbalance.
Tasks Object Detection
Published 2018-10-12
URL http://arxiv.org/abs/1810.05423v1
PDF http://arxiv.org/pdf/1810.05423v1.pdf
PWC https://paperswithcode.com/paper/deepscores-and-deep-watershed-detection
Repo https://github.com/tuggeluk/DeepWatershedDetection
Framework tf

Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning

Title Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning
Authors Ulysse Côté-Allard, Cheikh Latyr Fall, Alexandre Drouin, Alexandre Campeau-Lecours, Clément Gosselin, Kyrre Glette, François Laviolette, Benoit Gosselin
Abstract In recent years, deep learning algorithms have become increasingly prominent for their unparalleled ability to automatically learn discriminant features from large amounts of data. However, within the field of electromyography-based gesture recognition, deep learning algorithms are seldom employed as they require an unreasonable amount of effort from a single person to generate tens of thousands of examples. This work’s hypothesis is that general, informative features can be learned from the large amounts of data generated by aggregating the signals of multiple users, thus reducing the recording burden while enhancing gesture recognition. Consequently, this paper proposes applying transfer learning on aggregated data from multiple users, while leveraging the capacity of deep learning algorithms to learn discriminant features from large datasets. Two datasets comprised of 19 and 17 able-bodied participants respectively (the first one is employed for pre-training) were recorded for this work, using the Myo Armband. A third Myo Armband dataset was taken from the NinaPro database and is comprised of 10 able-bodied participants. Three different deep learning networks employing three different modalities as input (raw EMG, Spectrograms and Continuous Wavelet Transform (CWT)) are tested on the second and third dataset. The proposed transfer learning scheme is shown to systematically and significantly enhance the performance for all three networks on the two datasets, achieving an offline accuracy of 98.31% for 7 gestures over 17 participants for the CWT-based ConvNet and 68.98% for 18 gestures over 10 participants for the raw EMG-based ConvNet. Finally, a use-case study employing eight able-bodied participants suggests that real-time feedback allows users to adapt their muscle activation strategy which reduces the degradation in accuracy normally experienced over time.
Tasks Gesture Recognition, Transfer Learning
Published 2018-01-10
URL http://arxiv.org/abs/1801.07756v5
PDF http://arxiv.org/pdf/1801.07756v5.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-electromyographic-hand
Repo https://github.com/UlysseCoteAllard/MyoArmbandDataset
Framework pytorch

TADAM: Task dependent adaptive metric for improved few-shot learning

Title TADAM: Task dependent adaptive metric for improved few-shot learning
Authors Boris N. Oreshkin, Pau Rodriguez, Alexandre Lacoste
Abstract Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14% in accuracy for certain metrics on the mini-Imagenet 5-way 5-shot classification task. We further propose a simple and effective way of conditioning a learner on the task sample set, resulting in learning a task-dependent metric space. Moreover, we propose and empirically test a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space. The resulting few-shot learning model based on the task-dependent scaled metric achieves state of the art on mini-Imagenet. We confirm these results on another few-shot dataset that we introduce in this paper based on CIFAR100. Our code is publicly available at https://github.com/ElementAI/TADAM.
Tasks Few-Shot Image Classification, Few-Shot Learning
Published 2018-05-23
URL http://arxiv.org/abs/1805.10123v4
PDF http://arxiv.org/pdf/1805.10123v4.pdf
PWC https://paperswithcode.com/paper/tadam-task-dependent-adaptive-metric-for
Repo https://github.com/yaoyao-liu/meta-transfer-learning
Framework pytorch
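
Note (illustrative sketch, not taken from the linked repository): the abstract's "metric scaling" can be read as multiplying the distances between a query embedding and the class prototypes by a scale factor before the softmax, as in the TADAM paper. The two-dimensional embeddings, the squared Euclidean distance, and the name `class_probabilities` are assumptions chosen to keep the example small.

```python
import math


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def class_probabilities(query, prototypes, scale):
    """Softmax over negative scaled distances to each class prototype."""
    logits = [-scale * squared_euclidean(query, p) for p in prototypes]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


query = [0.0, 0.0]
prototypes = [[0.0, 1.0], [2.0, 0.0]]  # distances 1 and 4 from the query

probs_unscaled = class_probabilities(query, prototypes, scale=0.0)
probs_soft = class_probabilities(query, prototypes, scale=0.1)
probs_sharp = class_probabilities(query, prototypes, scale=5.0)
```

A scale of zero ignores the distances entirely, while a large scale makes the prediction nearly one-hot, which suggests why the choice of scale changes the character of the parameter updates.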

On First-Order Meta-Learning Algorithms

Title On First-Order Meta-Learning Algorithms
Authors Alex Nichol, Joshua Achiam, John Schulman
Abstract This paper considers meta-learning problems, where there is a distribution of tasks, and we would like to obtain an agent that performs well (i.e., learns quickly) when presented with a previously unseen task sampled from this distribution. We analyze a family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates. This family includes and generalizes first-order MAML, an approximation to MAML obtained by ignoring second-order derivatives. It also includes Reptile, a new algorithm that we introduce here, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task. We expand on the results from Finn et al. showing that first-order meta-learning algorithms perform well on some well-established benchmarks for few-shot classification, and we provide theoretical analysis aimed at understanding why these algorithms work.
Tasks Few-Shot Image Classification, Few-Shot Learning, Meta-Learning
Published 2018-03-08
URL http://arxiv.org/abs/1803.02999v3
PDF http://arxiv.org/pdf/1803.02999v3.pdf
PWC https://paperswithcode.com/paper/on-first-order-meta-learning-algorithms
Repo https://github.com/peisungtsai/Reptile-Pytorch-Implementation
Framework pytorch
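
Note (illustrative sketch, not taken from the linked repository): the abstract describes Reptile as repeatedly sampling a task, training on it, and moving the initialization towards the trained weights. The loop below follows those three steps on a toy problem where each task is the scalar loss 0.5 * (w - center)^2. The task family, the step sizes, and the names `inner_train` and `reptile` are assumptions for illustration.

```python
import random


def inner_train(w, center, lr=0.1, steps=5):
    """Run a few gradient steps on the toy task loss 0.5 * (w - center)**2."""
    for _ in range(steps):
        grad = w - center
        w = w - lr * grad
    return w


def reptile(task_centers, meta_steps=2000, meta_lr=0.05, seed=0):
    """First-order meta-learning loop: sample, train, interpolate."""
    rng = random.Random(seed)
    init = 0.0
    for _ in range(meta_steps):
        center = rng.choice(task_centers)          # sample a task
        trained = inner_train(init, center)        # train on it
        init = init + meta_lr * (trained - init)   # move init toward trained weights
    return init


centers = [1.0, 2.0, 3.0]
learned_init = reptile(centers)
```

Only first-order gradients of the task loss are used; no derivative is taken through the inner training loop. On this toy task family the learned initialization settles near the middle of the task centers, from which any single task is a few gradient steps away.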

Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?

Title Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?
Authors Nam Vo, James Hays
Abstract This work studies deep metric learning under small to medium scale data as we believe that better generalization could be a contributing factor to the improvement of previous fine-grained image retrieval methods; it should be considered when designing future techniques. In particular, we investigate using other layers in a deep metric learning system (besides the embedding layer) for feature extraction and analyze how well they perform on training data and generalize to testing data. From this study, we suggest a new regularization practice where one can add or choose a more optimal layer for feature extraction. State-of-the-art performance is demonstrated on 3 fine-grained image retrieval benchmarks: Cars-196, CUB-200-2011, and Stanford Online Product.
Tasks Image Retrieval, Metric Learning
Published 2018-03-08
URL http://arxiv.org/abs/1803.03310v2
PDF http://arxiv.org/pdf/1803.03310v2.pdf
PWC https://paperswithcode.com/paper/generalization-in-metric-learning-should-the
Repo https://github.com/lugiavn/generalization-dml
Framework pytorch

How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks

Title How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks
Authors Linli Xu, Liang Jiang, Chuan Qin, Zhe Wang, Dongfang Du
Abstract With the recent advances of neural models and natural language processing, automatic generation of classical Chinese poetry has drawn significant attention due to its artistic and cultural value. Previous works mainly focus on generating poetry given keywords or other text information, while visual inspirations for poetry have been rarely explored. Generating poetry from images is much more challenging than generating poetry from text, since images contain very rich visual information which cannot be described completely using several keywords, and a good poem should convey the image accurately. In this paper, we propose a memory based neural model which exploits images to generate poems. Specifically, an Encoder-Decoder model with a topic memory network is proposed to generate classical Chinese poetry from images. To the best of our knowledge, this is the first work attempting to generate classical Chinese poetry from images with neural networks. A comprehensive experimental investigation with both human evaluation and quantitative analysis demonstrates that the proposed model can generate poems which convey images accurately.
Tasks
Published 2018-03-08
URL http://arxiv.org/abs/1803.02994v1
PDF http://arxiv.org/pdf/1803.02994v1.pdf
PWC https://paperswithcode.com/paper/how-images-inspire-poems-generating-classical
Repo https://github.com/forrestbing/chinese-poetry-generation
Framework none

Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling

Title Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling
Authors Zhiqing Sun, Zhi-Hong Deng
Abstract Previous traditional approaches to unsupervised Chinese word segmentation (CWS) can be roughly classified into discriminative and generative models. The former uses carefully designed goodness measures for candidate segmentation, while the latter focuses on finding the optimal segmentation of the highest generative probability. However, while there exists a trivial way to extend the discriminative models into a neural version by using neural language models, extending the generative ones is non-trivial. In this paper, we propose the segmental language models (SLMs) for CWS. Our approach explicitly focuses on the segmental nature of Chinese, while preserving several properties of language models. In SLMs, a context encoder encodes the previous context and a segment decoder generates each segment incrementally. As far as we know, we are the first to propose a neural model for unsupervised CWS and achieve competitive performance to the state-of-the-art statistical models on four different datasets from SIGHAN 2005 bakeoff.
Tasks Chinese Word Segmentation, Language Modelling
Published 2018-10-07
URL http://arxiv.org/abs/1810.03167v1
PDF http://arxiv.org/pdf/1810.03167v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-neural-word-segmentation-for
Repo https://github.com/Edward-Sun/SLM
Framework pytorch

Compressed Sensing Using Binary Matrices of Nearly Optimal Dimensions

Title Compressed Sensing Using Binary Matrices of Nearly Optimal Dimensions
Authors Mahsa Lotfi, Mathukumalli Vidyasagar
Abstract In this paper, we study the problem of compressed sensing using binary measurement matrices, and $\ell_1$-norm minimization (basis pursuit) as the recovery algorithm. We derive new upper and lower bounds on the number of measurements to achieve robust sparse recovery with binary matrices. We establish sufficient conditions for a column-regular binary matrix to satisfy the robust null space property (RNSP), and show that the sparsity bounds for robust sparse recovery obtained using the RNSP are better by a factor of $(3 \sqrt{3})/2 \approx 2.6$ compared to the restricted isometry property (RIP). Next we derive universal lower bounds on the number of measurements that any binary matrix needs to have in order to satisfy the weaker sufficient condition based on the RNSP, and show that bipartite graphs of girth six are optimal. Then we display two classes of binary matrices, namely parity check matrices of array codes, and Euler squares, that have girth six and are nearly optimal in the sense of almost satisfying the lower bound. In principle randomly generated Gaussian measurement matrices are "order-optimal." So we compare the phase transition behavior of the basis pursuit formulation using binary array code and Gaussian matrices, and show that (i) there is essentially no difference between the phase transition boundaries in the two cases, and (ii) the CPU time of basis pursuit with binary matrices is hundreds of times faster than with Gaussian matrices, and the storage requirements are less. Therefore it is suggested that binary matrices are a viable alternative to Gaussian matrices for compressed sensing using basis pursuit.
Tasks
Published 2018-08-09
URL http://arxiv.org/abs/1808.03001v2
PDF http://arxiv.org/pdf/1808.03001v2.pdf
PWC https://paperswithcode.com/paper/compressed-sensing-using-binary-matrices-of
Repo https://github.com/monajemi/CompressedSensing
Framework none