Paper Group AWR 85
Automatic Building Extraction in Aerial Scenes Using Convolutional Networks
Title | Automatic Building Extraction in Aerial Scenes Using Convolutional Networks |
Authors | Jiangye Yuan |
Abstract | Automatic building extraction from aerial and satellite imagery is highly challenging due to extremely large variations of building appearances. To attack this problem, we design a convolutional network with a final stage that integrates activations from multiple preceding stages for pixel-wise prediction, and introduce the signed distance function of building boundaries as the output representation, which has an enhanced representation power. We leverage abundant building footprint data available from geographic information systems (GIS) to compile training data. The trained network achieves superior performance on datasets that are significantly larger and more complex than those used in prior work, demonstrating that the proposed method provides a promising and scalable solution for automating this labor-intensive task. |
Tasks | |
Published | 2016-02-21 |
URL | http://arxiv.org/abs/1602.06564v1 |
http://arxiv.org/pdf/1602.06564v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-building-extraction-in-aerial |
Repo | https://github.com/statisticalplumber/Building_detection |
Framework | none |
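The signed distance output representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the sign convention, truncation, or distance metric, so everything below is an assumption made for illustration: the hypothetical `signed_distance` helper takes positive values inside a footprint and negative values outside, and measures Euclidean distance to the nearest pixel of the opposite class by brute force.

```python
import math

def signed_distance(mask):
    """Brute-force signed distance map for a small binary building mask.

    Assumed convention (not taken from the paper): positive inside the
    footprint, negative outside; the magnitude is the Euclidean distance
    to the nearest pixel of the opposite class.
    """
    rows, cols = len(mask), len(mask[0])
    inside = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    outside = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    if not inside or not outside:
        raise ValueError("mask must contain both building and background pixels")
    sdf = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Building pixels look for the nearest background pixel and vice versa.
            targets = outside if mask[r][c] else inside
            d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            sdf[r][c] = d if mask[r][c] else -d
    return sdf

# Toy footprint: a 3x3 building centred in a 5x5 tile.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdf = signed_distance(mask)
```

Unlike a binary label, each pixel of such a map also says how far it is from the boundary, which is one reading of the "enhanced representation power" the abstract refers to.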
Tsallis Regularized Optimal Transport and Ecological Inference
Title | Tsallis Regularized Optimal Transport and Ecological Inference |
Authors | Boris Muzellec, Richard Nock, Giorgio Patrini, Frank Nielsen |
Abstract | Optimal transport is a powerful framework for computing distances between probability distributions. We unify the two main approaches to optimal transport, namely Monge-Kantorovitch and Sinkhorn-Cuturi, into what we define as Tsallis regularized optimal transport (TROT). TROT interpolates a rich family of distortions from Wasserstein to Kullback-Leibler, encompassing as well Pearson, Neyman and Hellinger divergences, to name a few. We show that metric properties known for Sinkhorn-Cuturi generalize to TROT, and provide efficient algorithms for finding the optimal transportation plan with formal convergence proofs. We also present the first application of optimal transport to the problem of ecological inference, that is, the reconstruction of joint distributions from their marginals, a problem of large interest in the social sciences. TROT provides a convenient framework for ecological inference by allowing one to compute the joint distribution (that is, the optimal transportation plan itself) when side information is available, which is, e.g., typically what a census represents in political science. Experiments on data from the 2012 US presidential elections display the potential of TROT in delivering a faithful reconstruction of the joint distribution of ethnic groups and voter preferences. |
Tasks | |
Published | 2016-09-15 |
URL | http://arxiv.org/abs/1609.04495v1 |
http://arxiv.org/pdf/1609.04495v1.pdf | |
PWC | https://paperswithcode.com/paper/tsallis-regularized-optimal-transport-and |
Repo | https://github.com/BorisMuzellec/TROT |
Framework | none |
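As a rough illustration of reconstructing a joint distribution from its marginals, the sketch below runs plain entropy-regularized Sinkhorn iterations, i.e. the Sinkhorn-Cuturi special case named in the abstract, not the Tsallis-regularized algorithm of the paper. The function name `sinkhorn_plan`, the cost matrix, and the marginals are all hypothetical; the cost matrix stands in for the "side information" the abstract mentions.

```python
import math

def sinkhorn_plan(cost, row_marginal, col_marginal, reg=0.5, iters=500):
    """Entropy-regularized optimal transport plan via Sinkhorn scaling.

    Returns a matrix whose row sums approach row_marginal and whose
    column sums match col_marginal, favouring low-cost cells.
    """
    n, m = len(row_marginal), len(col_marginal)
    # Gibbs kernel: cheap cells get large weights.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_marginal[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_marginal[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical toy problem: two groups (rows), two candidates (columns).
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(cost, row_marginal=[0.6, 0.4], col_marginal=[0.5, 0.5])
```

The returned `plan` plays the role of the reconstructed joint distribution; smaller `reg` values push it closer to the unregularized transport plan.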
Edinburgh Neural Machine Translation Systems for WMT 16
Title | Edinburgh Neural Machine Translation Systems for WMT 16 |
Authors | Rico Sennrich, Barry Haddow, Alexandra Birch |
Abstract | We participated in the WMT 2016 shared news translation task by building neural translation systems for four language pairs, each trained in both directions: English<->Czech, English<->German, English<->Romanian and English<->Russian. Our systems are based on an attentional encoder-decoder, using BPE subword segmentation for open-vocabulary translation with a fixed vocabulary. We experimented with using automatic back-translations of the monolingual News corpus as additional training data, pervasive dropout, and target-bidirectional models. All reported methods give substantial improvements, and we see improvements of 4.3–11.2 BLEU over our baseline systems. In the human evaluation, our systems were the (tied) best constrained system for 7 out of 8 translation directions in which we participated. |
Tasks | Machine Translation |
Published | 2016-06-09 |
URL | http://arxiv.org/abs/1606.02891v2 |
http://arxiv.org/pdf/1606.02891v2.pdf | |
PWC | https://paperswithcode.com/paper/edinburgh-neural-machine-translation-systems |
Repo | https://github.com/rsennrich/wmt16-scripts |
Framework | none |
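The BPE subword segmentation mentioned in the abstract can be sketched as a merge-learning loop: repeatedly count adjacent symbol pairs across a word-frequency table and merge the most frequent pair. This is a minimal sketch, not the code from the linked repository; the names `learn_bpe` and `merge_pair`, the `</w>` end-of-word marker, and the toy vocabulary are assumptions for illustration.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with its concatenation."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)

def learn_bpe(word_freqs, num_merges):
    """Learn `num_merges` BPE merge operations from a word -> frequency mapping."""
    # Start from characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Ties are broken by first occurrence (dict insertion order).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges

# Toy corpus statistics (hypothetical).
merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=3)
```

Because the number of merges is fixed in advance, the resulting symbol vocabulary has a fixed size while any unseen word can still be segmented into known subword units.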
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
Title | Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation |
Authors | Tejas D. Kulkarni, Karthik R. Narasimhan, Ardavan Saeedi, Joshua B. Tenenbaum |
Abstract | Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. The primary difficulty arises due to insufficient exploration, resulting in an agent being unable to learn robust value functions. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve problems. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning. A top-level value function learns a policy over intrinsic goals, and a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse, delayed feedback: (1) a complex discrete stochastic decision process, and (2) the classic ATARI game 'Montezuma’s Revenge'. |
Tasks | Montezuma’s Revenge |
Published | 2016-04-20 |
URL | http://arxiv.org/abs/1604.06057v2 |
http://arxiv.org/pdf/1604.06057v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-deep-reinforcement-learning |
Repo | https://github.com/EthanMacdonald/h-DQN |
Framework | tf |
Learning Convolutional Neural Networks using Hybrid Orthogonal Projection and Estimation
Title | Learning Convolutional Neural Networks using Hybrid Orthogonal Projection and Estimation |
Authors | Hengyue Pan, Hui Jiang |
Abstract | Convolutional neural networks (CNNs) have yielded excellent performance in a variety of computer vision tasks, where CNNs typically adopt a similar structure consisting of convolution layers, pooling layers and fully connected layers. In this paper, we propose to apply a novel method, namely Hybrid Orthogonal Projection and Estimation (HOPE), to CNNs in order to introduce orthogonality into the CNN structure. The HOPE model can be viewed as a hybrid model that combines feature extraction using orthogonal linear projection with mixture models. It is an effective model to extract useful information from the original high-dimension feature vectors and meanwhile filter out irrelevant noise. In this work, we present three different ways to apply the HOPE models to CNNs, i.e., HOPE-Input, single-HOPE-Block and multi-HOPE-Blocks. For HOPE-Input CNNs, a HOPE layer is directly used right after the input to de-correlate high-dimension input feature vectors. Alternatively, in single-HOPE-Block and multi-HOPE-Blocks CNNs, we consider using HOPE layers to replace one or more blocks in the CNNs, where one block may include several convolutional layers and one pooling layer. The experimental results on both CIFAR-10 and CIFAR-100 data sets have shown that the orthogonal constraints imposed by the HOPE layers can significantly improve the performance of CNNs in these image classification tasks (we have achieved one of the best performances when image augmentation has not been applied, and top 5 performance with image augmentation). |
Tasks | Image Augmentation, Image Classification |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.05929v4 |
http://arxiv.org/pdf/1606.05929v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-convolutional-neural-networks-using |
Repo | https://github.com/mowangphy/HOPE-CNN |
Framework | none |
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
Title | Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models |
Authors | Minh-Thang Luong, Christopher D. Manning |
Abstract | Nearly all previous work on neural machine translation (NMT) has used quite restricted vocabularies, perhaps with a subsequent method to patch in unknown words. This paper presents a novel word-character solution to achieving open vocabulary NMT. We build hybrid systems that translate mostly at the word level and consult the character components for rare words. Our character-level recurrent neural networks compute source word representations and recover unknown target words when needed. The twofold advantage of such a hybrid approach is that it is much faster and easier to train than character-based ones; at the same time, it never produces unknown words as in the case of word-based models. On the WMT’15 English to Czech translation task, this hybrid approach offers an additional boost of +2.1-11.4 BLEU points over models that already handle unknown words. Our best system achieves a new state-of-the-art result with 20.7 BLEU score. We demonstrate that our character models can successfully learn to not only generate well-formed words for Czech, a highly-inflected language with a very complex vocabulary, but also build correct representations for English source words. |
Tasks | Machine Translation |
Published | 2016-04-04 |
URL | http://arxiv.org/abs/1604.00788v2 |
http://arxiv.org/pdf/1604.00788v2.pdf | |
PWC | https://paperswithcode.com/paper/achieving-open-vocabulary-neural-machine |
Repo | https://github.com/yurayli/stanford-cs224n-sol |
Framework | pytorch |
Differentiable Functional Program Interpreters
Title | Differentiable Functional Program Interpreters |
Authors | John K. Feser, Marc Brockschmidt, Alexander L. Gaunt, Daniel Tarlow |
Abstract | Programming by Example (PBE) is the task of inducing computer programs from input-output examples. It can be seen as a type of machine learning where the hypothesis space is the set of legal programs in some programming language. Recent work on differentiable interpreters relaxes the discrete space of programs into a continuous space so that search over programs can be performed using gradient-based optimization. While conceptually powerful, so far differentiable interpreter-based program synthesis has only been capable of solving very simple problems. In this work, we study modeling choices that arise when constructing a differentiable programming language and their impact on the success of synthesis. The main motivation for the modeling choices comes from functional programming: we study the effect of memory allocation schemes, immutable data, type systems, and built-in control-flow structures. Empirically we show that incorporating functional programming ideas into differentiable programming languages allows us to learn much more complex programs than is possible with existing differentiable languages. |
Tasks | Program Synthesis |
Published | 2016-11-07 |
URL | http://arxiv.org/abs/1611.01988v2 |
http://arxiv.org/pdf/1611.01988v2.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-functional-program |
Repo | https://github.com/ethancaballero/neural-engineers-first-attempt |
Framework | tf |
Adaptive Neural Compilation
Title | Adaptive Neural Compilation |
Authors | Rudy Bunel, Alban Desmaison, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar |
Abstract | This paper proposes an adaptive neural-compilation framework to address the problem of efficient program learning. Traditional code optimisation strategies used in compilers are based on applying a pre-specified set of transformations that make the code faster to execute without changing its semantics. In contrast, our work involves adapting programs to make them more efficient while considering correctness only on a target input distribution. Our approach is inspired by recent work on differentiable representations of programs. We show that it is possible to compile programs written in a low-level language to a differentiable representation. We also show how programs in this representation can be optimised to make them efficient on a target distribution of inputs. Experimental results demonstrate that our approach enables learning specifically-tuned algorithms for given data distributions with a high success rate. |
Tasks | |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07969v2 |
http://arxiv.org/pdf/1605.07969v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-neural-compilation |
Repo | https://github.com/albanD/adaptive-neural-compilation |
Framework | none |
Beyond Deep Residual Learning for Image Restoration: Persistent Homology-Guided Manifold Simplification
Title | Beyond Deep Residual Learning for Image Restoration: Persistent Homology-Guided Manifold Simplification |
Authors | Woong Bae, Jaejun Yoo, Jong Chul Ye |
Abstract | The latest deep learning approaches perform better than the state-of-the-art signal processing approaches in various image restoration tasks. However, if an image contains many patterns and structures, the performance of these CNNs is still inferior. To address this issue, here we propose a novel feature space deep residual learning algorithm that outperforms the existing residual learning. The main idea originates from the observation that the performance of a learning algorithm can be improved if the input and/or label manifolds can be made topologically simpler by an analytic mapping to a feature space. Our extensive numerical studies using denoising experiments and the NTIRE single-image super-resolution (SISR) competition demonstrate that the proposed feature space residual learning outperforms the existing state-of-the-art approaches. Moreover, our algorithm was ranked third in the NTIRE competition with 5-10 times faster computational time compared to the top ranked teams. The source code is available at https://github.com/iorism/CNN.git |
Tasks | Denoising, Image Restoration, Image Super-Resolution, Super-Resolution |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06345v4 |
http://arxiv.org/pdf/1611.06345v4.pdf | |
PWC | https://paperswithcode.com/paper/beyond-deep-residual-learning-for-image |
Repo | https://github.com/iorism/CNN |
Framework | none |
Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network
Title | Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network |
Authors | Tong He, Weilin Huang, Yu Qiao, Jian Yao |
Abstract | We introduce a new top-down pipeline for scene text detection. We propose a novel Cascaded Convolutional Text Network (CCTN) that joins two customized convolutional networks for coarse-to-fine text localization. The CCTN quickly detects text regions roughly from a low-resolution image, and then accurately localizes text lines from each enlarged region. We cast previous character based detection into direct text region estimation, avoiding multiple bottom-up post-processing steps. It exhibits surprising robustness and discriminative power by considering the whole text region as the detection object, which provides strong semantic information. We customize the convolutional network by developing rectangle convolutions and multiple in-network fusions. This enables it to handle multi-shape and multi-scale text efficiently. Furthermore, the CCTN is computationally efficient by sharing convolutional computations, and its high-level property allows it to be invariant to various languages and multiple orientations. It achieves 0.84 and 0.86 F-measures on the ICDAR 2011 and ICDAR 2013, delivering substantial improvements over state-of-the-art results [23, 1]. |
Tasks | Scene Text Detection |
Published | 2016-03-31 |
URL | http://arxiv.org/abs/1603.09423v1 |
http://arxiv.org/pdf/1603.09423v1.pdf | |
PWC | https://paperswithcode.com/paper/accurate-text-localization-in-natural-image |
Repo | https://github.com/apekshapriya/Text-Localization-in-Image |
Framework | tf |
“What is Relevant in a Text Document?”: An Interpretable Machine Learning Approach
Title | “What is Relevant in a Text Document?”: An Interpretable Machine Learning Approach |
Authors | Leila Arras, Franziska Horn, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek |
Abstract | Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, making it possible to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text’s category very accurately, it is also highly desirable to understand how and why the categorization process takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task and adapt the LRP method to decompose the predictions of these models onto words. Resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores for generating novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability which makes it more comprehensible for humans and potentially more useful for other applications. |
Tasks | Interpretable Machine Learning |
Published | 2016-12-23 |
URL | http://arxiv.org/abs/1612.07843v1 |
http://arxiv.org/pdf/1612.07843v1.pdf | |
PWC | https://paperswithcode.com/paper/what-is-relevant-in-a-text-document-an |
Repo | https://github.com/sebastian-lapuschkin/lrp_toolbox |
Framework | none |
GENESIM: genetic extraction of a single, interpretable model
Title | GENESIM: genetic extraction of a single, interpretable model |
Authors | Gilles Vandewiele, Olivier Janssens, Femke Ongenae, Filip De Turck, Sofie Van Hoecke |
Abstract | Models obtained by decision tree induction techniques excel in being interpretable. However, they can be prone to overfitting, which results in a low predictive performance. Ensemble techniques are able to achieve a higher accuracy. However, this comes at a cost of losing interpretability of the resulting model. This makes ensemble techniques impractical in applications where decision support, instead of decision making, is crucial. To bridge this gap, we present the GENESIM algorithm that transforms an ensemble of decision trees to a single decision tree with an enhanced predictive performance by using a genetic algorithm. We compared GENESIM to prevalent decision tree induction and ensemble techniques using twelve publicly available data sets. The results show that GENESIM achieves a better predictive performance on most of these data sets than decision tree induction techniques and a predictive performance in the same order of magnitude as the ensemble techniques. Moreover, the resulting model of GENESIM has a very low complexity, making it very interpretable, in contrast to ensemble techniques. |
Tasks | Decision Making, Interpretable Machine Learning |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05722v1 |
http://arxiv.org/pdf/1611.05722v1.pdf | |
PWC | https://paperswithcode.com/paper/genesim-genetic-extraction-of-a-single |
Repo | https://github.com/IBCNServices/GENESIM |
Framework | none |
System Identification through Online Sparse Gaussian Process Regression with Input Noise
Title | System Identification through Online Sparse Gaussian Process Regression with Input Noise |
Authors | Hildo Bijl, Thomas B. Schön, Jan-Willem van Wingerden, Michel Verhaegen |
Abstract | There has been a growing interest in using non-parametric regression methods like Gaussian Process (GP) regression for system identification. GP regression does traditionally have three important downsides: (1) it is computationally intensive, (2) it cannot efficiently implement newly obtained measurements online, and (3) it cannot deal with stochastic (noisy) input points. In this paper we present an algorithm tackling all these three issues simultaneously. The resulting Sparse Online Noisy Input GP (SONIG) regression algorithm can incorporate new noisy measurements in constant runtime. A comparison has shown that it is more accurate than similar existing regression algorithms. When applied to non-linear black-box system modeling, its performance is competitive with existing non-linear ARX models. |
Tasks | |
Published | 2016-01-29 |
URL | http://arxiv.org/abs/1601.08068v3 |
http://arxiv.org/pdf/1601.08068v3.pdf | |
PWC | https://paperswithcode.com/paper/system-identification-through-online-sparse |
Repo | https://github.com/HildoBijl/SONIG |
Framework | none |
Crafting Adversarial Input Sequences for Recurrent Neural Networks
Title | Crafting Adversarial Input Sequences for Recurrent Neural Networks |
Authors | Nicolas Papernot, Patrick McDaniel, Ananthram Swami, Richard Harang |
Abstract | Machine learning models are frequently used to solve complex security problems, as well as to make decisions in sensitive situations like guiding autonomous vehicles or predicting financial market behaviors. Previous efforts have shown that numerous machine learning models were vulnerable to adversarial manipulations of their inputs taking the form of adversarial samples. Such inputs are crafted by adding carefully selected perturbations to legitimate inputs so as to force the machine learning model to misbehave, for instance by outputting a wrong class if the machine learning task of interest is classification. In fact, to the best of our knowledge, all previous work on adversarial sample crafting for neural networks considered models used to solve classification tasks, most frequently in computer vision applications. In this paper, we contribute to the field of adversarial machine learning by investigating adversarial input sequences for recurrent neural networks processing sequential data. We show that the classes of algorithms introduced previously to craft adversarial samples misclassified by feed-forward neural networks can be adapted to recurrent neural networks. In an experiment, we show that adversaries can craft adversarial sequences misleading both categorical and sequential recurrent neural networks. |
Tasks | Autonomous Vehicles |
Published | 2016-04-28 |
URL | http://arxiv.org/abs/1604.08275v1 |
http://arxiv.org/pdf/1604.08275v1.pdf | |
PWC | https://paperswithcode.com/paper/crafting-adversarial-input-sequences-for |
Repo | https://github.com/Bhushan-Jagtap-2013/Adversarial_Attack_on_RNN |
Framework | tf |
CuMF_SGD: Fast and Scalable Matrix Factorization
Title | CuMF_SGD: Fast and Scalable Matrix Factorization |
Authors | Xiaolong Xie, Wei Tan, Liana L. Fong, Yun Liang |
Abstract | Matrix factorization (MF) has been widely used in, e.g., recommender systems, topic modeling and word embedding. Stochastic gradient descent (SGD) is popular in solving MF problems because it can deal with large data sets and makes incremental learning easy. We observed that SGD for MF is memory bound. Meanwhile, single-node CPU systems with caching perform well only for small data sets; distributed systems have higher aggregated memory bandwidth but suffer from relatively slow network connection. This observation inspires us to accelerate MF by utilizing GPUs' high memory bandwidth and fast intra-node connection. We present cuMF_SGD, a CUDA-based SGD solution for large-scale MF problems. On a single GPU, we design two workload schedule schemes, i.e., batch-Hogwild! and wavefront-update, that fully exploit the massive amount of cores. In particular, batch-Hogwild!, as a vectorized version of Hogwild!, overcomes the issue of memory discontinuity. We also develop highly-optimized kernels for SGD update, leveraging cache, warp-shuffle instructions and half-precision floats. We also design a partition scheme to utilize multiple GPUs while addressing the well-known convergence issue when parallelizing SGD. On three data sets with only one Maxwell or Pascal GPU, cuMF_SGD runs 3.1X-28.2X as fast compared with state-of-art CPU solutions on 1-64 CPU nodes. Evaluations also show that cuMF_SGD scales well on multiple GPUs in large data sets. |
Tasks | Recommendation Systems |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.05838v3 |
http://arxiv.org/pdf/1610.05838v3.pdf | |
PWC | https://paperswithcode.com/paper/cumf_sgd-fast-and-scalable-matrix |
Repo | https://github.com/MehdiChelh/CuMF_SGD |
Framework | none |
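For readers unfamiliar with the SGD update that cuMF_SGD parallelizes, the following is a minimal sequential sketch in plain Python. It is not the CUDA implementation and contains none of the batch-Hogwild! or wavefront-update scheduling; the function names, hyperparameters, and toy ratings are assumptions chosen for illustration.

```python
import math
import random

def sgd_matrix_factorization(ratings, n_users, n_items, rank=2,
                             lr=0.02, reg=0.01, epochs=500, seed=0):
    """Factorize sparse ratings (user, item, value) into P (users) and Q (items)."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, r in samples:
            # Each update touches only one row of P and one row of Q,
            # so the work is dominated by memory access, not arithmetic.
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root-mean-square error of the factorization on the observed ratings."""
    sq = [(r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))) ** 2
          for u, i, r in ratings]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical toy data: (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = sgd_matrix_factorization(ratings, 3, 3, epochs=0)  # initial factors only
P, Q = sgd_matrix_factorization(ratings, 3, 3)
```

Since each rating touches only two short factor rows, many updates can run concurrently with few conflicts, which is the property that Hogwild!-style schemes rely on.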