Paper Group AWR 113
Learning to Color from Language
Title | Learning to Color from Language |
Authors | Varun Manjunatha, Mohit Iyyer, Jordan Boyd-Graber, Larry Davis |
Abstract | Automatic colorization is the process of adding color to greyscale images. We condition this process on language, allowing end users to manipulate a colorized image by feeding in different captions. We present two different architectures for language-conditioned colorization, both of which produce more accurate and plausible colorizations than a language-agnostic version. Through this language-based framework, we can dramatically alter colorizations by manipulating descriptive color words in captions. |
Tasks | Colorization |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06026v1 |
http://arxiv.org/pdf/1804.06026v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-color-from-language |
Repo | https://github.com/superhans/colorfromlanguage |
Framework | pytorch |
RankME: Reliable Human Ratings for Natural Language Generation
Title | RankME: Reliable Human Ratings for Natural Language Generation |
Authors | Jekaterina Novikova, Ondřej Dušek, Verena Rieser |
Abstract | Human evaluation for natural language generation (NLG) often suffers from inconsistent user ratings. While previous research tends to attribute this problem to individual user preferences, we show that the quality of human judgements can also be improved by experimental design. We present a novel rank-based magnitude estimation method (RankME), which combines the use of continuous scales and relative assessments. We show that RankME significantly improves the reliability and consistency of human ratings compared to traditional evaluation methods. In addition, we show that it is possible to evaluate NLG systems according to multiple, distinct criteria, which is important for error analysis. Finally, we demonstrate that RankME, in combination with Bayesian estimation of system quality, is a cost-effective alternative for ranking multiple NLG systems. |
Tasks | Text Generation |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05928v1 |
http://arxiv.org/pdf/1803.05928v1.pdf | |
PWC | https://paperswithcode.com/paper/rankme-reliable-human-ratings-for-natural |
Repo | https://github.com/jeknov/RankME |
Framework | none |
Adversarially Regularized Graph Autoencoder for Graph Embedding
Title | Adversarially Regularized Graph Autoencoder for Graph Embedding |
Authors | Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang |
Abstract | Graph embedding is an effective method to represent graph data in a low dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks. |
Tasks | Graph Clustering, Graph Embedding, Link Prediction |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04407v2 |
http://arxiv.org/pdf/1802.04407v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarially-regularized-graph-autoencoder |
Repo | https://github.com/Ruiqi-Hu/ARGA |
Framework | tf |
Probabilistic Recurrent State-Space Models
Title | Probabilistic Recurrent State-Space Models |
Authors | Andreas Doerr, Christian Daniel, Martin Schiegg, Duy Nguyen-Tuong, Stefan Schaal, Marc Toussaint, Sebastian Trimpe |
Abstract | State-space models (SSMs) are a highly expressive model class for learning patterns in time series data and for system identification. Deterministic versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex time series data. Fully probabilistic SSMs, however, are often found hard to train, even for smaller problems. To overcome this limitation, we propose a novel model formulation and a scalable training algorithm based on doubly stochastic variational inference and Gaussian processes. In contrast to existing work, the proposed variational approximation allows one to fully capture the latent state temporal correlations. These correlations are the key to robust training. The effectiveness of the proposed PR-SSM is evaluated on a set of real-world benchmark datasets in comparison to state-of-the-art probabilistic model learning methods. Scalability and robustness are demonstrated on a high dimensional problem. |
Tasks | Gaussian Processes, Time Series |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10395v2 |
http://arxiv.org/pdf/1801.10395v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-recurrent-state-space-models |
Repo | https://github.com/boschresearch/PR-SSM |
Framework | tf |
DARTS: Differentiable Architecture Search
Title | DARTS: Differentiable Architecture Search |
Authors | Hanxiao Liu, Karen Simonyan, Yiming Yang |
Abstract | This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms. |
Tasks | Image Classification, Language Modelling, Neural Architecture Search |
Published | 2018-06-24 |
URL | http://arxiv.org/abs/1806.09055v2 |
http://arxiv.org/pdf/1806.09055v2.pdf | |
PWC | https://paperswithcode.com/paper/darts-differentiable-architecture-search |
Repo | https://github.com/yochaiz/darts-UNIQ |
Framework | pytorch |
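The continuous relaxation mentioned in the abstract can be illustrated with a toy sketch. The abstract does not spell out the formula, so this assumes the commonly described form in which each discrete choice among candidate operations is replaced by a softmax-weighted mixture. The operation set `CANDIDATE_OPS` and the helper names are hypothetical and act on scalars purely for illustration.

```python
import math

def softmax(values):
    # Numerically stable softmax over a list of floats.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations (stand-ins for conv, pooling, skip, ...).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    # Relaxed edge: a softmax-weighted sum of all candidate operations.
    # The architecture parameters `alphas` are continuous, so they can be
    # tuned by gradient descent instead of a discrete search.
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

def discretize(alphas):
    # After search, keep only the operation with the largest parameter.
    names = list(CANDIDATE_OPS)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]
```

With equal parameters every operation contributes equally; as one entry of `alphas` grows, the mixture approaches that single operation, which is what makes the final discretization step reasonable.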
Learning to Describe Differences Between Pairs of Similar Images
Title | Learning to Describe Differences Between Pairs of Similar Images |
Authors | Harsh Jhamtani, Taylor Berg-Kirkpatrick |
Abstract | In this paper, we introduce the task of automatically generating text to describe the differences between two similar images. We collect a new dataset by crowd-sourcing difference descriptions for pairs of image frames extracted from video-surveillance footage. Annotators were asked to succinctly describe all the differences in a short paragraph. As a result, our novel dataset provides an opportunity to explore models that align language and vision, and capture visual salience. The dataset may also be a useful benchmark for coherent multi-sentence generation. We perform a first-pass visual analysis that exposes clusters of differing pixels as a proxy for object-level differences. We propose a model that captures visual salience by using a latent variable to align clusters of differing pixels with output sentences. We find that, for both single-sentence and multi-sentence generation, the proposed model outperforms models that use attention alone. |
Tasks | |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1808.10584v1 |
http://arxiv.org/pdf/1808.10584v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-describe-differences-between |
Repo | https://github.com/harsh19/spot-the-diff |
Framework | none |
Federated Optimization in Heterogeneous Networks
Title | Federated Optimization in Heterogeneous Networks |
Authors | Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith |
Abstract | Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedProx, to tackle heterogeneity in federated networks. FedProx can be viewed as a generalization and re-parametrization of FedAvg, the current state-of-the-art method for federated learning. While FedProx makes only minor algorithmic modifications to FedAvg, these modifications have important ramifications both in theory and in practice. Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity). Practically, we demonstrate that FedProx allows for more robust convergence than FedAvg across a suite of federated datasets. In particular, in highly heterogeneous settings, FedProx demonstrates significantly more stable and accurate convergence behavior relative to FedAvg—improving absolute test accuracy by 22% on average. |
Tasks | Distributed Optimization |
Published | 2018-12-14 |
URL | https://arxiv.org/abs/1812.06127v4 |
https://arxiv.org/pdf/1812.06127v4.pdf | |
PWC | https://paperswithcode.com/paper/federated-optimization-for-heterogeneous |
Repo | https://github.com/litian96/FedProx |
Framework | none |
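A rough sketch of how a FedProx-style round might look on a toy problem. The abstract does not state the modification explicitly; this assumes the proximal form in which each device minimizes its local loss plus (mu/2)·(w − w_global)², so that `mu = 0` reduces to plain local SGD as in FedAvg. The scalar quadratic loss, the unweighted averaging, and all function names are illustrative assumptions.

```python
def local_update(w_global, data_mean, mu, steps, lr=0.1):
    # Toy local loss: 0.5 * (w - data_mean)^2, plus the assumed proximal
    # term 0.5 * mu * (w - w_global)^2 that discourages drifting far from
    # the current global model.
    w = w_global
    for _ in range(steps):
        grad = (w - data_mean) + mu * (w - w_global)
        w -= lr * grad
    return w

def federated_round(w_global, devices, mu):
    # devices: list of (data_mean, local_steps). Different data means model
    # statistical heterogeneity; different step counts model devices doing
    # a variable amount of work (systems heterogeneity).
    updates = [local_update(w_global, mean, mu, steps) for mean, steps in devices]
    return sum(updates) / len(updates)  # unweighted average, for simplicity

# Usage: two devices with different data and different amounts of local work.
w = 0.0
for _ in range(20):
    w = federated_round(w, [(1.0, 2), (3.0, 20)], mu=1.0)
```

A larger `mu` keeps each local solution closer to the global model, which limits how much a device that runs many local steps on unusual data can pull the average.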
DeepScores and Deep Watershed Detection: current state and open issues
Title | DeepScores and Deep Watershed Detection: current state and open issues |
Authors | Ismail Elezi, Lukas Tuggener, Marcello Pelillo, Thilo Stadelmann |
Abstract | This paper gives an overview of our current Optical Music Recognition (OMR) research. We recently released the OMR dataset DeepScores as well as the object detection method Deep Watershed Detector. We are currently taking some additional steps to improve both of them. Here we summarize current and future efforts, aimed at improving usefulness on real-world tasks and tackling extreme class imbalance. |
Tasks | Object Detection |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05423v1 |
http://arxiv.org/pdf/1810.05423v1.pdf | |
PWC | https://paperswithcode.com/paper/deepscores-and-deep-watershed-detection |
Repo | https://github.com/tuggeluk/DeepWatershedDetection |
Framework | tf |
Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning
Title | Deep Learning for Electromyographic Hand Gesture Signal Classification Using Transfer Learning |
Authors | Ulysse Côté-Allard, Cheikh Latyr Fall, Alexandre Drouin, Alexandre Campeau-Lecours, Clément Gosselin, Kyrre Glette, François Laviolette, Benoit Gosselin |
Abstract | In recent years, deep learning algorithms have become increasingly prominent for their unparalleled ability to automatically learn discriminant features from large amounts of data. However, within the field of electromyography-based gesture recognition, deep learning algorithms are seldom employed, as they require an unreasonable amount of effort from a single person to generate tens of thousands of examples. This work’s hypothesis is that general, informative features can be learned from the large amounts of data generated by aggregating the signals of multiple users, thus reducing the recording burden while enhancing gesture recognition. Consequently, this paper proposes applying transfer learning on aggregated data from multiple users, while leveraging the capacity of deep learning algorithms to learn discriminant features from large datasets. Two datasets comprised of 19 and 17 able-bodied participants respectively (the first one is employed for pre-training) were recorded for this work, using the Myo Armband. A third Myo Armband dataset was taken from the NinaPro database and is comprised of 10 able-bodied participants. Three different deep learning networks employing three different modalities as input (raw EMG, Spectrograms and Continuous Wavelet Transform (CWT)) are tested on the second and third datasets. The proposed transfer learning scheme is shown to systematically and significantly enhance the performance of all three networks on the two datasets, achieving an offline accuracy of 98.31% for 7 gestures over 17 participants for the CWT-based ConvNet and 68.98% for 18 gestures over 10 participants for the raw EMG-based ConvNet. Finally, a use-case study employing eight able-bodied participants suggests that real-time feedback allows users to adapt their muscle activation strategy, which reduces the degradation in accuracy normally experienced over time. |
Tasks | Gesture Recognition, Transfer Learning |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.07756v5 |
http://arxiv.org/pdf/1801.07756v5.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-electromyographic-hand |
Repo | https://github.com/UlysseCoteAllard/MyoArmbandDataset |
Framework | pytorch |
TADAM: Task dependent adaptive metric for improved few-shot learning
Title | TADAM: Task dependent adaptive metric for improved few-shot learning |
Authors | Boris N. Oreshkin, Pau Rodriguez, Alexandre Lacoste |
Abstract | Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14% in accuracy for certain metrics on the mini-Imagenet 5-way 5-shot classification task. We further propose a simple and effective way of conditioning a learner on the task sample set, resulting in learning a task-dependent metric space. Moreover, we propose and empirically test a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space. The resulting few-shot learning model based on the task-dependent scaled metric achieves state of the art on mini-Imagenet. We confirm these results on another few-shot dataset that we introduce in this paper based on CIFAR100. Our code is publicly available at https://github.com/ElementAI/TADAM. |
Tasks | Few-Shot Image Classification, Few-Shot Learning |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.10123v4 |
http://arxiv.org/pdf/1805.10123v4.pdf | |
PWC | https://paperswithcode.com/paper/tadam-task-dependent-adaptive-metric-for |
Repo | https://github.com/yaoyao-liu/meta-transfer-learning |
Framework | pytorch |
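To make the idea of metric scaling concrete, here is a small sketch. The abstract gives no formula, so this assumes a prototypical-network style classifier in which class probabilities are a softmax over negative distances to class prototypes, multiplied by a scale factor `alpha`. The function name and the example distances are hypothetical.

```python
import math

def class_probabilities(distances, alpha):
    # distances[k]: distance from the query embedding to the prototype of
    # class k. The scale `alpha` controls how sharply the softmax reacts to
    # differences between distances.
    logits = [-alpha * d for d in distances]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Same distances, two different scales.
soft = class_probabilities([1.0, 2.0], alpha=0.1)    # close to uniform
sharp = class_probabilities([1.0, 2.0], alpha=10.0)  # close to one-hot
```

Because the scale changes how peaked the output distribution is, it also changes the size and direction of the gradients that reach the embedding, which is one way to read the abstract's claim that scaling alters the nature of the parameter updates.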
On First-Order Meta-Learning Algorithms
Title | On First-Order Meta-Learning Algorithms |
Authors | Alex Nichol, Joshua Achiam, John Schulman |
Abstract | This paper considers meta-learning problems, where there is a distribution of tasks, and we would like to obtain an agent that performs well (i.e., learns quickly) when presented with a previously unseen task sampled from this distribution. We analyze a family of algorithms for learning a parameter initialization that can be fine-tuned quickly on a new task, using only first-order derivatives for the meta-learning updates. This family includes and generalizes first-order MAML, an approximation to MAML obtained by ignoring second-order derivatives. It also includes Reptile, a new algorithm that we introduce here, which works by repeatedly sampling a task, training on it, and moving the initialization towards the trained weights on that task. We expand on the results from Finn et al. showing that first-order meta-learning algorithms perform well on some well-established benchmarks for few-shot classification, and we provide theoretical analysis aimed at understanding why these algorithms work. |
Tasks | Few-Shot Image Classification, Few-Shot Learning, Meta-Learning |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.02999v3 |
http://arxiv.org/pdf/1803.02999v3.pdf | |
PWC | https://paperswithcode.com/paper/on-first-order-meta-learning-algorithms |
Repo | https://github.com/peisungtsai/Reptile-Pytorch-Implementation |
Framework | pytorch |
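The Reptile loop described in the abstract (sample a task, train on it, move the initialization towards the trained weights) can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the scalar parameter, the quadratic per-task loss, the step sizes, and the function names.

```python
import random

def train_on_task(w, target, inner_steps=5, inner_lr=0.1):
    # Inner loop: a few SGD steps on the toy task loss 0.5 * (w - target)^2.
    for _ in range(inner_steps):
        grad = w - target
        w = w - inner_lr * grad
    return w

def reptile(task_targets, meta_steps=2000, meta_lr=0.1, seed=0):
    rng = random.Random(seed)
    w = 0.0  # the shared initialization being meta-learned
    for _ in range(meta_steps):
        target = rng.choice(task_targets)   # sample a task
        w_task = train_on_task(w, target)   # train on it
        w = w + meta_lr * (w_task - w)      # move init towards trained weights
    return w

# Usage: with tasks centred at 1.0 and 3.0, the learned initialization
# ends up between the two task optima.
init = reptile([1.0, 3.0])
```

Only first-order information is used: the meta-update needs the trained weights but never differentiates through the inner training loop.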
Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?
Title | Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer? |
Authors | Nam Vo, James Hays |
Abstract | This work studies deep metric learning under small to medium scale data as we believe that better generalization could be a contributing factor to the improvement of previous fine-grained image retrieval methods; it should be considered when designing future techniques. In particular, we investigate using other layers in a deep metric learning system (besides the embedding layer) for feature extraction and analyze how well they perform on training data and generalize to testing data. From this study, we suggest a new regularization practice where one can add or choose a more optimal layer for feature extraction. State-of-the-art performance is demonstrated on 3 fine-grained image retrieval benchmarks: Cars-196, CUB-200-2011, and Stanford Online Product. |
Tasks | Image Retrieval, Metric Learning |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.03310v2 |
http://arxiv.org/pdf/1803.03310v2.pdf | |
PWC | https://paperswithcode.com/paper/generalization-in-metric-learning-should-the |
Repo | https://github.com/lugiavn/generalization-dml |
Framework | pytorch |
How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks
Title | How Images Inspire Poems: Generating Classical Chinese Poetry from Images with Memory Networks |
Authors | Linli Xu, Liang Jiang, Chuan Qin, Zhe Wang, Dongfang Du |
Abstract | With the recent advances of neural models and natural language processing, automatic generation of classical Chinese poetry has drawn significant attention due to its artistic and cultural value. Previous works mainly focus on generating poetry given keywords or other text information, while visual inspirations for poetry have been rarely explored. Generating poetry from images is much more challenging than generating poetry from text, since images contain very rich visual information which cannot be described completely using several keywords, and a good poem should convey the image accurately. In this paper, we propose a memory based neural model which exploits images to generate poems. Specifically, an Encoder-Decoder model with a topic memory network is proposed to generate classical Chinese poetry from images. To the best of our knowledge, this is the first work attempting to generate classical Chinese poetry from images with neural networks. A comprehensive experimental investigation with both human evaluation and quantitative analysis demonstrates that the proposed model can generate poems which convey images accurately. |
Tasks | |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.02994v1 |
http://arxiv.org/pdf/1803.02994v1.pdf | |
PWC | https://paperswithcode.com/paper/how-images-inspire-poems-generating-classical |
Repo | https://github.com/forrestbing/chinese-poetry-generation |
Framework | none |
Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling
Title | Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling |
Authors | Zhiqing Sun, Zhi-Hong Deng |
Abstract | Previous traditional approaches to unsupervised Chinese word segmentation (CWS) can be roughly classified into discriminative and generative models. The former use carefully designed goodness measures for candidate segmentations, while the latter focus on finding the optimal segmentation with the highest generative probability. However, while there exists a trivial way to extend the discriminative models into a neural version by using neural language models, extending the generative ones is non-trivial. In this paper, we propose segmental language models (SLMs) for CWS. Our approach explicitly focuses on the segmental nature of Chinese while preserving several properties of language models. In SLMs, a context encoder encodes the previous context and a segment decoder generates each segment incrementally. As far as we know, we are the first to propose a neural model for unsupervised CWS and to achieve performance competitive with the state-of-the-art statistical models on four different datasets from the SIGHAN 2005 bakeoff. |
Tasks | Chinese Word Segmentation, Language Modelling |
Published | 2018-10-07 |
URL | http://arxiv.org/abs/1810.03167v1 |
http://arxiv.org/pdf/1810.03167v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-neural-word-segmentation-for |
Repo | https://github.com/Edward-Sun/SLM |
Framework | pytorch |
Compressed Sensing Using Binary Matrices of Nearly Optimal Dimensions
Title | Compressed Sensing Using Binary Matrices of Nearly Optimal Dimensions |
Authors | Mahsa Lotfi, Mathukumalli Vidyasagar |
Abstract | In this paper, we study the problem of compressed sensing using binary measurement matrices, and $\ell_1$-norm minimization (basis pursuit) as the recovery algorithm. We derive new upper and lower bounds on the number of measurements to achieve robust sparse recovery with binary matrices. We establish sufficient conditions for a column-regular binary matrix to satisfy the robust null space property (RNSP), and show that the sparsity bounds for robust sparse recovery obtained using the RNSP are better by a factor of $(3 \sqrt{3})/2 \approx 2.6$ compared to the restricted isometry property (RIP). Next we derive universal lower bounds on the number of measurements that any binary matrix needs to have in order to satisfy the weaker sufficient condition based on the RNSP, and show that bipartite graphs of girth six are optimal. Then we display two classes of binary matrices, namely parity check matrices of array codes, and Euler squares, that have girth six and are nearly optimal in the sense of almost satisfying the lower bound. In principle, randomly generated Gaussian measurement matrices are 'order-optimal'. So we compare the phase transition behavior of the basis pursuit formulation using binary array code and Gaussian matrices, and show that (i) there is essentially no difference between the phase transition boundaries in the two cases, and (ii) basis pursuit with binary matrices is hundreds of times faster in CPU time than with Gaussian matrices, and requires less storage. Therefore it is suggested that binary matrices are a viable alternative to Gaussian matrices for compressed sensing using basis pursuit. |
Tasks | |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03001v2 |
http://arxiv.org/pdf/1808.03001v2.pdf | |
PWC | https://paperswithcode.com/paper/compressed-sensing-using-binary-matrices-of |
Repo | https://github.com/monajemi/CompressedSensing |
Framework | none |
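As an illustration of the girth condition, the sketch below builds an array-code parity check matrix and checks that no two columns share more than one row, which is equivalent to the associated bipartite graph having no 4-cycles. It assumes the usual array-code construction (a d × q grid of q × q cyclic-shift permutation blocks with shift i·j, q prime, d ≤ q); the paper's exact construction may differ in details, and the function names are hypothetical.

```python
from itertools import combinations

def array_code_matrix(q, d):
    # Binary matrix with d*q rows and q*q columns. Block (i, j) is a cyclic
    # shift of the q x q identity by i*j positions. Assumes q prime, d <= q.
    H = [[0] * (q * q) for _ in range(d * q)]
    for j in range(q):          # block column
        for c in range(q):      # column inside the block
            for i in range(d):  # block row
                r = (c + i * j) % q
                H[i * q + r][j * q + c] = 1
    return H

def max_column_overlap(H):
    # Largest number of rows in which two distinct columns both have a 1.
    # A value of at most 1 means the bipartite graph has no 4-cycles.
    supports = [{r for r, v in enumerate(col) if v} for col in zip(*H)]
    return max(len(a & b) for a, b in combinations(supports, 2))

# Usage: q = 5, d = 3 gives a 15 x 25 matrix with three ones per column.
H = array_code_matrix(5, 3)
overlap = max_column_overlap(H)
```

The matrix is column-regular by construction, since every column has exactly one 1 in each of the d block rows; the overlap check confirms the absence of 4-cycles for this small instance.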