July 30, 2019

2637 words 13 mins read

Paper Group AWR 7


Multimodal Word Distributions. Space-Time Graph Modeling of Ride Requests Based on Real-World Data. DGM: A deep learning algorithm for solving partial differential equations. Road Extraction by Deep Residual U-Net. mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions. Neural Optimizer Search with Reinforcement L …

Multimodal Word Distributions

Title Multimodal Word Distributions
Authors Ben Athiwaratkun, Andrew Gordon Wilson
Abstract Word embeddings provide point representations of words containing useful semantic information. We introduce multimodal word distributions formed from Gaussian mixtures, for multiple word meanings, entailment, and rich uncertainty information. To learn these distributions, we propose an energy-based max-margin objective. We show that the resulting approach captures uniquely expressive semantic information, and outperforms alternatives such as word2vec skip-grams and Gaussian embeddings on word similarity and entailment benchmarks.
Tasks Word Embeddings
Published 2017-04-27
URL https://arxiv.org/abs/1704.08424v2
PDF https://arxiv.org/pdf/1704.08424v2.pdf
PWC https://paperswithcode.com/paper/multimodal-word-distributions
Repo https://github.com/benathi/multisense-prob-fasttext
Framework none
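
As an informal aside, the energy optimized by the max-margin objective is based on a closed-form similarity between Gaussian mixtures, the expected likelihood kernel. The NumPy sketch below evaluates that kernel for two diagonal-covariance mixtures; the mixture parameters are invented for illustration and this is not the authors' implementation.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log density of a diagonal Gaussian N(x; mean, var)."""
    d = x.size
    return -0.5 * (d * np.log(2 * np.pi) + np.sum(np.log(var))
                   + np.sum((x - mean) ** 2 / var))

def log_expected_likelihood(pi_f, mu_f, var_f, pi_g, mu_g, var_g):
    """Log of the expected likelihood kernel between two diagonal Gaussian
    mixtures: sum_ij p_i q_j N(mu_i; nu_j, Sigma_i + Lambda_j)."""
    terms = []
    for p_i, m_i, v_i in zip(pi_f, mu_f, var_f):
        for q_j, n_j, w_j in zip(pi_g, mu_g, var_g):
            terms.append(np.log(p_i) + np.log(q_j)
                         + log_gauss(m_i, n_j, v_i + w_j))
    return np.logaddexp.reduce(terms)

# Toy 2-component mixtures in 5 dimensions (illustrative parameters only).
rng = np.random.default_rng(0)
pi = np.array([0.5, 0.5])
mu_a, mu_b = rng.normal(size=(2, 5)), rng.normal(size=(2, 5))
var_a, var_b = np.full((2, 5), 0.3), np.full((2, 5), 0.3)
print("log expected-likelihood kernel:",
      log_expected_likelihood(pi, mu_a, var_a, pi, mu_b, var_b))
```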

Space-Time Graph Modeling of Ride Requests Based on Real-World Data

Title Space-Time Graph Modeling of Ride Requests Based on Real-World Data
Authors Abhinav Jauhri, Brian Foo, Jerome Berclaz, Chih Chi Hu, Radek Grzeszczuk, Vasu Parameswaran, John Paul Shen
Abstract This paper focuses on modeling ride requests and their variations over location and time, based on analyzing extensive real-world data from a ride-sharing service. We introduce a graph model that captures the spatial and temporal variability of ride requests and the potential for ride pooling. We discover that these ride request graphs exhibit a well-known property, the densification power law, often found in real graphs modeling human behavior. We show that the pattern of ride requests and the potential for ride pooling in a city can be characterized by the densification factor of its ride request graphs. Previous works have shown that it is possible to automatically generate synthetic versions of these graphs that exhibit a given densification factor. We present an algorithm for the automatic generation of synthetic ride request graphs that closely match the densification factor of graphs built from actual ride request data.
Tasks
Published 2017-01-23
URL http://arxiv.org/abs/1701.06635v1
PDF http://arxiv.org/pdf/1701.06635v1.pdf
PWC https://paperswithcode.com/paper/space-time-graph-modeling-of-ride-requests
Repo https://github.com/ajauhri/mobility-modeling
Framework none
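
The densification power law mentioned in the abstract, |E| growing roughly as |V|^alpha, makes the densification factor alpha the slope of a log-log regression over graph snapshots. A small NumPy sketch with made-up node and edge counts:

```python
import numpy as np

# Hypothetical (node, edge) counts for ride-request graph snapshots of one
# city over successive time windows; real counts would come from the data.
nodes = np.array([120, 450, 1600, 5200, 18000])
edges = np.array([310, 1700, 8900, 41000, 210000])

# Densification power law: |E| ~ |V|**alpha, so alpha is the slope of a
# log-log least-squares fit.
alpha, _ = np.polyfit(np.log(nodes), np.log(edges), deg=1)
print(f"estimated densification factor alpha = {alpha:.2f}")
```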

DGM: A deep learning algorithm for solving partial differential equations

Title DGM: A deep learning algorithm for solving partial differential equations
Authors Justin Sirignano, Konstantinos Spiliopoulos
Abstract High-dimensional PDEs have been a longstanding computational challenge. We propose to solve high-dimensional PDEs by approximating the solution with a deep neural network which is trained to satisfy the differential operator, initial condition, and boundary conditions. Our algorithm is meshfree, which is key since meshes become infeasible in higher dimensions. Instead of forming a mesh, the neural network is trained on batches of randomly sampled time and space points. The algorithm is tested on a class of high-dimensional free boundary PDEs, which we are able to accurately solve in up to $200$ dimensions. The algorithm is also tested on a high-dimensional Hamilton-Jacobi-Bellman PDE and Burgers’ equation. The deep learning algorithm approximates the general solution to the Burgers’ equation for a continuum of different boundary conditions and physical conditions (which can be viewed as a high-dimensional space). We call the algorithm a “Deep Galerkin Method (DGM)” since it is similar in spirit to Galerkin methods, with the solution approximated by a neural network instead of a linear combination of basis functions. In addition, we prove a theorem regarding the approximation power of neural networks for a class of quasilinear parabolic PDEs.
Tasks
Published 2017-08-24
URL http://arxiv.org/abs/1708.07469v5
PDF http://arxiv.org/pdf/1708.07469v5.pdf
PWC https://paperswithcode.com/paper/dgm-a-deep-learning-algorithm-for-solving
Repo https://github.com/Plemeur/DGM
Framework pytorch
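
For readers curious how the meshfree training works in practice, here is a minimal PyTorch sketch of a DGM-style residual loss for a simple 1D heat equation u_t = u_xx; the small feed-forward network, the sampling scheme, and the example PDE are illustrative stand-ins, not the authors' architecture or code.

```python
import torch
import torch.nn as nn

# A small fully connected surrogate u_theta(t, x); the DGM paper uses a more
# elaborate LSTM-like architecture -- this is only a sketch of the idea.
net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual_loss(n=256):
    """Mesh-free loss for the 1D heat equation u_t = u_xx on (0,1)x(0,1),
    with illustrative initial condition u(0,x) = sin(pi x) and zero boundaries."""
    t = torch.rand(n, 1, requires_grad=True)
    x = torch.rand(n, 1, requires_grad=True)
    u = net(torch.cat([t, x], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    interior = ((u_t - u_xx) ** 2).mean()           # differential operator

    x0 = torch.rand(n, 1)                           # initial condition
    init = ((net(torch.cat([torch.zeros_like(x0), x0], 1))
             - torch.sin(torch.pi * x0)) ** 2).mean()
    tb = torch.rand(n, 1)                           # boundary conditions
    bound = (net(torch.cat([tb, torch.zeros_like(tb)], 1)) ** 2).mean() + \
            (net(torch.cat([tb, torch.ones_like(tb)], 1)) ** 2).mean()
    return interior + init + bound

for step in range(1000):        # train on fresh randomly sampled batches
    opt.zero_grad()
    loss = pde_residual_loss()
    loss.backward()
    opt.step()
```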

Road Extraction by Deep Residual U-Net

Title Road Extraction by Deep Residual U-Net
Authors Zhengxin Zhang, Qingjie Liu, Yunhong Wang
Abstract Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network that combines the strengths of residual learning and U-Net is proposed for road area extraction. The network is built with residual units and has an architecture similar to that of U-Net. The benefits of this model are two-fold: first, residual units ease the training of deep networks; second, the rich skip connections within the network facilitate information propagation, allowing us to design networks with fewer parameters yet better performance. We test our network on a public road dataset and compare it with U-Net and two other state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all the compared methods, demonstrating its superiority over recently developed alternatives.
Tasks Lesion Segmentation, Lung Nodule Segmentation, Retinal Vessel Segmentation, Semantic Segmentation, Skin Cancer Segmentation
Published 2017-11-29
URL http://arxiv.org/abs/1711.10684v1
PDF http://arxiv.org/pdf/1711.10684v1.pdf
PWC https://paperswithcode.com/paper/road-extraction-by-deep-residual-u-net
Repo https://github.com/Kaido0/Brain-Tissue-Segment-Keras
Framework tf
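
A sketch of the kind of pre-activation residual unit such a residual U-Net is assembled from (written in PyTorch here, although the linked repo is Keras/TensorFlow); layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Pre-activation residual unit (BN -> ReLU -> Conv, twice) with an
    identity or 1x1 projection shortcut."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
        )
        # Project the shortcut only when the shape changes.
        self.skip = (nn.Identity() if in_ch == out_ch and stride == 1
                     else nn.Conv2d(in_ch, out_ch, 1, stride=stride))

    def forward(self, x):
        return self.body(x) + self.skip(x)

# In the encoder, strided blocks replace pooling; the decoder upsamples,
# concatenates the matching encoder feature map (the U-Net skip), applies
# another residual block, and ends in a 1x1 conv + sigmoid for the road mask.
block = ResidualBlock(64, 128, stride=2)
print(block(torch.randn(1, 64, 128, 128)).shape)   # -> (1, 128, 64, 64)
```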

mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions

Title mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions
Authors Bernd Bischl, Jakob Richter, Jakob Bossek, Daniel Horn, Janek Thomas, Michel Lang
Abstract We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization (MBO), also known as Bayesian optimization, which addresses the problem of expensive black-box optimization by approximating the given objective function through a surrogate regression model. It is designed for both single- and multi-objective optimization with mixed continuous, categorical and conditional parameters. Additional features include multi-point batch proposal, parallelization, visualization, logging and error-handling. mlrMBO is implemented in a modular fashion, such that single components can be easily replaced or adapted by the user for specific use cases, e.g., any regression learner from the mlr toolbox for machine learning can be used, and infill criteria and infill optimizers are easily exchangeable. We empirically demonstrate that mlrMBO provides state-of-the-art performance by comparing it on different benchmark scenarios against a wide range of other optimizers, including DiceOptim, rBayesianOptimization, SPOT, SMAC, Spearmint, and Hyperopt.
Tasks
Published 2017-03-09
URL http://arxiv.org/abs/1703.03373v3
PDF http://arxiv.org/pdf/1703.03373v3.pdf
PWC https://paperswithcode.com/paper/mlrmbo-a-modular-framework-for-model-based
Repo https://github.com/cran/mlrMBO
Framework none
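
mlrMBO itself is an R package; the sketch below only illustrates, in Python, the sequential model-based optimization loop it modularizes (fit a surrogate, optimize an infill criterion such as expected improvement, evaluate, repeat), with a toy objective and a Gaussian-process surrogate standing in for the exchangeable components.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):                       # stand-in for an expensive black box
    return np.sin(3 * x) + 0.1 * x ** 2

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(5, 1))     # initial design
y = objective(X).ravel()
candidates = np.linspace(-3, 3, 400).reshape(-1, 1)

for _ in range(20):                     # sequential MBO iterations
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)   # surrogate
    mu, sd = gp.predict(candidates, return_std=True)
    best = y.min()
    imp = best - mu                     # minimization
    z = np.divide(imp, sd, out=np.zeros_like(imp), where=sd > 0)
    ei = imp * norm.cdf(z) + sd * norm.pdf(z)          # infill criterion (EI)
    x_new = candidates[np.argmax(ei)].reshape(1, -1)   # infill optimizer
    X = np.vstack([X, x_new])
    y = np.append(y, objective(x_new).ravel())

print("best point found:", X[np.argmin(y)].item(), "value:", y.min())
```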

Neural Optimizer Search with Reinforcement Learning

Title Neural Optimizer Search with Reinforcement Learning
Authors Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le
Abstract We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google’s neural machine translation system.
Tasks Machine Translation
Published 2017-09-21
URL http://arxiv.org/abs/1709.07417v2
PDF http://arxiv.org/pdf/1709.07417v2.pdf
PWC https://paperswithcode.com/paper/neural-optimizer-search-with-reinforcement
Repo https://github.com/calclavia/NOS
Framework pytorch
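
The two discovered update rules have compact closed forms. A NumPy sketch of single PowerSign and AddSign steps, with assumed default hyperparameters (base e for PowerSign, decay 0.9 for the running average of the gradient):

```python
import numpy as np

def powersign_step(w, g, m, lr=0.01, beta=0.9):
    """One PowerSign update: scale the gradient by e^(sign(g)*sign(m)),
    where m is the running average of past gradients."""
    m = beta * m + (1 - beta) * g
    w = w - lr * np.exp(np.sign(g) * np.sign(m)) * g
    return w, m

def addsign_step(w, g, m, lr=0.01, beta=0.9):
    """One AddSign update: scale the gradient by (1 + sign(g)*sign(m))."""
    m = beta * m + (1 - beta) * g
    w = w - lr * (1 + np.sign(g) * np.sign(m)) * g
    return w, m

# Both rules enlarge the step when the current gradient agrees in sign with
# its running average and shrink it when they disagree.
w, m = np.zeros(3), np.zeros(3)
w, m = powersign_step(w, np.array([0.5, -1.0, 0.2]), m)
print(w)
```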

Regular Boardgames

Title Regular Boardgames
Authors Jakub Kowalski, Maksymilian Mika, Jakub Sutowicz, Marek Szykuła
Abstract We propose a new General Game Playing (GGP) language called Regular Boardgames (RBG), which is based on the theory of regular languages. The objective of RBG is to combine key properties such as expressiveness, efficiency, and naturalness of description in one GGP formalism, compensating for certain drawbacks of the existing languages. This often makes RBG more suitable for various research and practical developments in GGP. While intended mostly for describing board games, RBG is universal for the class of all finite deterministic turn-based games with perfect information. We establish the foundations of RBG and analyze it theoretically and experimentally, focusing on the efficiency of reasoning. Regular Boardgames is the first GGP language that allows efficient encoding and playing of games with complex rules and large branching factors (e.g., amazons, arimaa, large chess variants, go, international checkers, paper soccer).
Tasks Board Games
Published 2017-06-08
URL http://arxiv.org/abs/1706.02462v2
PDF http://arxiv.org/pdf/1706.02462v2.pdf
PWC https://paperswithcode.com/paper/regular-boardgames
Repo https://github.com/marekesz/rbg1.0
Framework none
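
As a toy illustration of the underlying idea: if sequences of primitive actions form a regular language, move legality reduces to language membership, so a regular expression over a small action alphabet can stand in for a rule. The mini-game and alphabet below are invented and this is not RBG syntax.

```python
import re

# Invented alphabet: 'N','E','S','W' = step, 'x' = capture, 'p' = promote.
# Rule for this made-up piece: one or more steps in a single direction,
# optionally ending with a capture, optionally followed by a promotion.
TURN = re.compile(r"^(N+|E+|S+|W+)x?p?$")

for turn in ["NNN", "Ex", "NNxp", "NE", "xp"]:
    print(turn, "legal" if TURN.match(turn) else "illegal")
```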

Forward Thinking: Building Deep Random Forests

Title Forward Thinking: Building Deep Random Forests
Authors Kevin Miller, Chris Hettinger, Jeffrey Humpherys, Tyler Jarvis, David Kartchner
Abstract The success of deep neural networks has inspired many to wonder whether other learners could benefit from deep, layered architectures. We present a general framework called forward thinking for deep learning that generalizes the architectural flexibility and sophistication of deep neural networks while also allowing for (i) different types of learning functions in the network, other than neurons, and (ii) the ability to adaptively deepen the network as needed to improve results. This is done by training one layer at a time, and once a layer is trained, the input data are mapped forward through the layer to create a new learning problem. The process is then repeated, transforming the data through multiple layers, one at a time, rendering a new dataset, which is expected to be better behaved, and on which a final output layer can achieve good performance. In the case where the neurons of deep neural nets are replaced with decision trees, we call the result a Forward Thinking Deep Random Forest (FTDRF). We demonstrate a proof of concept by applying FTDRF on the MNIST dataset. We also provide a general mathematical formulation that allows for other types of deep learning problems to be considered.
Tasks
Published 2017-05-20
URL http://arxiv.org/abs/1705.07366v1
PDF http://arxiv.org/pdf/1705.07366v1.pdf
PWC https://paperswithcode.com/paper/forward-thinking-building-deep-random-forests
Repo https://github.com/tkchris93/ForwardThinking
Framework tf
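
A minimal scikit-learn sketch of the layer-by-layer procedure: train a forest, map the data forward as class probabilities concatenated with the input, repeat, then fit a final layer. The dataset, the depth, and the reuse of training-set probabilities are simplifications for illustration, not the authors' setup.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

layer_in_tr, layer_in_te = X_tr, X_te
for depth in range(2):                              # two hidden "layers"
    layer = RandomForestClassifier(n_estimators=100, random_state=depth)
    layer.fit(layer_in_tr, y_tr)
    # Map the data forward through the trained layer; concatenating with the
    # original input keeps information the probabilities alone might discard.
    layer_in_tr = np.hstack([X_tr, layer.predict_proba(layer_in_tr)])
    layer_in_te = np.hstack([X_te, layer.predict_proba(layer_in_te)])

final = RandomForestClassifier(n_estimators=200, random_state=7)
final.fit(layer_in_tr, y_tr)                        # output layer
print("test accuracy:", final.score(layer_in_te, y_te))
```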

Deep Sets

Title Deep Sets
Authors Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola
Abstract We study the problem of designing models for machine learning tasks defined on \emph{sets}. In contrast to the traditional approach of operating on fixed-dimensional vectors, we consider objective functions defined on sets that are invariant to permutations. Such problems are widespread, ranging from estimation of population statistics \cite{poczos13aistats}, to anomaly detection in piezometer data of embankment dams \cite{Jung15Exploration}, to cosmology \cite{Ntampaka16Dynamical,Ravanbakhsh16ICML1}. Our main theorem characterizes the permutation invariant functions and provides a family of functions to which any permutation invariant objective function must belong. This family of functions has a special structure which enables us to design a deep network architecture that can operate on sets and which can be deployed in a variety of scenarios, including both unsupervised and supervised learning tasks. We also derive the necessary and sufficient conditions for permutation equivariance in deep models. We demonstrate the applicability of our method on population statistic estimation, point cloud classification, set expansion, and outlier detection.
Tasks Anomaly Detection, Outlier Detection
Published 2017-03-10
URL http://arxiv.org/abs/1703.06114v3
PDF http://arxiv.org/pdf/1703.06114v3.pdf
PWC https://paperswithcode.com/paper/deep-sets
Repo https://github.com/MathieuCarriere/perslay
Framework tf
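
The core construction is easy to sketch: apply a shared network phi to each set element, sum the results, and pass the pooled vector through rho, which makes the output permutation invariant by design. A minimal PyTorch version with illustrative sizes:

```python
import torch
import torch.nn as nn

class DeepSet(nn.Module):
    """Permutation-invariant model rho(sum_i phi(x_i)); sizes are illustrative."""
    def __init__(self, in_dim=3, hid=64, out_dim=1):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU(),
                                 nn.Linear(hid, hid), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(hid, hid), nn.ReLU(),
                                 nn.Linear(hid, out_dim))

    def forward(self, x):                  # x: (batch, set_size, in_dim)
        return self.rho(self.phi(x).sum(dim=1))   # sum pooling => invariance

model = DeepSet()
s = torch.randn(2, 10, 3)                  # two sets of 10 elements each
perm = s[:, torch.randperm(10), :]         # same sets, elements reordered
print(torch.allclose(model(s), model(perm), atol=1e-5))   # True
```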

Deep Learning applied to NLP

Title Deep Learning applied to NLP
Authors Marc Moreno Lopez, Jugal Kalita
Abstract Convolutional Neural Networks (CNNs) are typically associated with Computer Vision. CNNs are responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today. More recently, CNNs have been applied to problems in Natural Language Processing and have produced some interesting results. In this paper, we try to explain the basics of CNNs, their different variations, and how they have been applied to NLP.
Tasks Image Classification
Published 2017-03-09
URL http://arxiv.org/abs/1703.03091v1
PDF http://arxiv.org/pdf/1703.03091v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-applied-to-nlp
Repo https://github.com/linghduoduo/NLP
Framework tf
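
A minimal PyTorch example of the kind of CNN text classifier the paper surveys: embed tokens, apply parallel 1D convolutions of several widths, max-pool over time, and classify. Vocabulary size, filter widths, and class count are illustrative.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Embedding -> parallel 1D convolutions -> max-over-time pooling -> linear."""
    def __init__(self, vocab=10000, emb=128, widths=(3, 4, 5), filters=100, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.convs = nn.ModuleList(nn.Conv1d(emb, filters, w) for w in widths)
        self.out = nn.Linear(filters * len(widths), classes)

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)        # -> (batch, emb, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.out(torch.cat(pooled, dim=1))   # class logits

logits = TextCNN()(torch.randint(0, 10000, (8, 40)))
print(logits.shape)                                 # (8, 2)
```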

Enabling Sparse Winograd Convolution by Native Pruning

Title Enabling Sparse Winograd Convolution by Native Pruning
Authors Sheng Li, Jongsoo Park, Ping Tak Peter Tang
Abstract Sparse methods and the use of Winograd convolutions are two orthogonal approaches, each of which significantly accelerates convolution computations in modern CNNs. Sparse Winograd merges these two and thus has the potential to offer a combined performance benefit. Nevertheless, training convolution layers so that the resulting Winograd kernels are sparse has not hitherto been very successful. By introducing a Winograd layer in place of a standard convolution layer, we can learn and prune Winograd coefficients “natively” and obtain sparsity levels beyond 90% with only 0.1% accuracy loss with AlexNet on the ImageNet dataset. Furthermore, we present a sparse Winograd convolution algorithm and implementation that exploits this sparsity, achieving up to 31.7 effective TFLOP/s in 32-bit precision on a recent Intel Xeon CPU, which corresponds to a 5.4x speedup over a state-of-the-art dense convolution implementation.
Tasks
Published 2017-02-28
URL http://arxiv.org/abs/1702.08597v2
PDF http://arxiv.org/pdf/1702.08597v2.pdf
PWC https://paperswithcode.com/paper/enabling-sparse-winograd-convolution-by
Repo https://github.com/IntelLabs/SkimCaffe
Framework tf
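
A NumPy sketch of the 1D Winograd transform F(2,3) that the paper builds on: two convolution outputs from four multiplications. It also shows why zeros in the spatial filter do not automatically survive the transform, which motivates the paper's "native" pruning of the Winograd-domain coefficients.

```python
import numpy as np

def winograd_f23(d, g):
    """1D Winograd F(2,3): two convolution outputs from a 4-element input tile
    d and a 3-tap filter g using 4 multiplications instead of 6."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 0.0, -1.0])               # spatial filter with one zero tap
direct = np.array([d[0:3] @ g, d[1:4] @ g])  # ordinary sliding-window result
print(winograd_f23(d, g), direct)            # identical outputs

# The multiplications involve the transformed coefficients g[0],
# (g[0]+g[1]+g[2])/2, (g[0]-g[1]+g[2])/2, g[2]; zeros in g generally do not
# stay zero after this transform, which is why the paper learns and prunes
# the Winograd-domain coefficients directly.
```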

Self-ensembling for visual domain adaptation

Title Self-ensembling for visual domain adaptation
Authors Geoffrey French, Michal Mackiewicz, Mark Fisher
Abstract This paper explores the use of self-ensembling for visual domain adaptation problems. Our technique is derived from the mean teacher variant (Tarvainen et al., 2017) of temporal ensembling (Laine et al., 2017), a technique that achieved state-of-the-art results in the area of semi-supervised learning. We introduce a number of modifications to their approach for challenging domain adaptation scenarios and evaluate its effectiveness. Our approach achieves state-of-the-art results on a variety of benchmarks, including our winning entry in the VISDA-2017 visual domain adaptation challenge. On small image benchmarks, our algorithm not only outperforms prior art, but can also achieve accuracy that is close to that of a classifier trained in a supervised fashion.
Tasks Domain Adaptation
Published 2017-06-16
URL http://arxiv.org/abs/1706.05208v4
PDF http://arxiv.org/pdf/1706.05208v4.pdf
PWC https://paperswithcode.com/paper/self-ensembling-for-visual-domain-adaptation
Repo https://github.com/Britefury/self-ensemble-visual-domain-adapt
Framework pytorch
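
A PyTorch sketch of the mean-teacher consistency step on an unlabeled target-domain batch; the supervised source-domain loss and the paper's additional modifications are omitted, and the backbone, augmentation, and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Teacher = exponential moving average of the student; student is trained to
# match the teacher's predictions under stochastic augmentation.
student = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
teacher = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(student.parameters(), lr=0.1)

def augment(x):                        # stand-in for the stochastic augmentation
    return x + 0.05 * torch.randn_like(x)

x_target = torch.randn(16, 3, 32, 32)  # unlabeled target-domain batch
for step in range(10):
    s_prob = F.softmax(student(augment(x_target)), dim=1)
    with torch.no_grad():
        t_prob = F.softmax(teacher(augment(x_target)), dim=1)
    loss = F.mse_loss(s_prob, t_prob)          # self-ensembling consistency loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                      # EMA update of the teacher
        for tp, sp in zip(teacher.parameters(), student.parameters()):
            tp.mul_(0.99).add_(sp, alpha=0.01)
```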

CSGNet: Neural Shape Parser for Constructive Solid Geometry

Title CSGNet: Neural Shape Parser for Constructive Solid Geometry
Authors Gopal Sharma, Rishabh Goyal, Difan Liu, Evangelos Kalogerakis, Subhransu Maji
Abstract We present a neural architecture that takes as input a 2D or 3D shape and outputs a program that generates the shape. The instructions in our program are based on constructive solid geometry principles, i.e., a set of boolean operations on shape primitives defined recursively. Bottom-up techniques for this shape parsing task rely on primitive detection and are inherently slow since the search space over possible primitive combinations is large. In contrast, our model uses a recurrent neural network that parses the input shape in a top-down manner, which is significantly faster and yields a compact and easy-to-interpret sequence of modeling instructions. Our model is also more effective as a shape detector compared to existing state-of-the-art detection techniques. We finally demonstrate that our network can be trained on novel datasets without ground-truth program annotations through policy gradient techniques.
Tasks
Published 2017-12-22
URL http://arxiv.org/abs/1712.08290v2
PDF http://arxiv.org/pdf/1712.08290v2.pdf
PWC https://paperswithcode.com/paper/csgnet-neural-shape-parser-for-constructive
Repo https://github.com/AN313/deformable
Framework pytorch
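
The boolean operations a generated CSG program applies are simple to demonstrate on 2D occupancy grids; the primitives and the toy "program" below are invented for illustration and are unrelated to the trained parser.

```python
import numpy as np

# CSG combines shape primitives with boolean operations; here the primitives
# are rasterized on a 64x64 occupancy grid.
yy, xx = np.mgrid[0:64, 0:64]

def circle(cx, cy, r):
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2

def square(cx, cy, half):
    return (np.abs(xx - cx) <= half) & (np.abs(yy - cy) <= half)

union        = lambda a, b: a | b
intersection = lambda a, b: a & b
subtraction  = lambda a, b: a & ~b

# A tiny hand-written "program" executed bottom-up:
# (square - circle) union circle.
shape = union(subtraction(square(32, 32, 20), circle(32, 32, 12)),
              circle(48, 16, 8))
print(shape.sum(), "occupied cells")
```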

Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders

Title Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders
Authors Fengfu Li, Hong Qiao, Bo Zhang, Xuanyang Xi
Abstract Traditional image clustering methods take a two-step approach, performing feature learning and clustering sequentially. However, recent research has demonstrated that combining the separate phases in a unified framework and training them jointly can achieve better performance. In this paper, we first introduce fully convolutional auto-encoders for image feature learning and then propose a unified clustering framework to learn image representations and cluster centers jointly, based on a fully convolutional auto-encoder and soft $k$-means scores. At the initial stages of the learning procedure, the representations extracted from the auto-encoder may not be very discriminative for the later clustering. We address this issue by adopting a boosted discriminative distribution, in which high-score assignments are highlighted and low-score ones are de-emphasized. With the gradually boosted discrimination, clustering assignment scores become more discriminative and cluster purity increases. Experiments on several vision benchmark datasets show that our method achieves state-of-the-art performance.
Tasks Image Clustering
Published 2017-03-23
URL http://arxiv.org/abs/1703.07980v1
PDF http://arxiv.org/pdf/1703.07980v1.pdf
PWC https://paperswithcode.com/paper/discriminatively-boosted-image-clustering
Repo https://github.com/waynezhanghk/gacluster
Framework pytorch
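
A NumPy sketch of the boosting idea on soft assignment scores: square the scores, normalize by cluster frequency, and renormalize per sample, so confident assignments are emphasized and weak ones suppressed (a DEC-style sharpening). The soft k-means scoring used here is a common choice assumed for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def soft_kmeans_scores(z, centers, alpha=1.0):
    """Soft assignments of encoder features z to cluster centers via a softmax
    over negative squared distances (illustrative scoring only)."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    e = np.exp(-alpha * (d2 - d2.min(axis=1, keepdims=True)))
    return e / e.sum(axis=1, keepdims=True)

def boosted_targets(q):
    """Boosted discriminative distribution: square the scores, divide by the
    per-cluster mass, and renormalize each row."""
    p = q ** 2 / q.sum(axis=0, keepdims=True)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
z = rng.normal(size=(6, 10))                # toy encoder outputs
centers = rng.normal(size=(3, 10))          # toy cluster centers
q = soft_kmeans_scores(z, centers)
p = boosted_targets(q)                      # training target for the next round
print(np.round(q[0], 3), np.round(p[0], 3))
```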

Dirichlet-vMF Mixture Model

Title Dirichlet-vMF Mixture Model
Authors Shaohua Li
Abstract This document is about the multi-document von Mises-Fisher mixture model with a Dirichlet prior, referred to as VMFMix. VMFMix is analogous to Latent Dirichlet Allocation (LDA) in that both can capture the co-occurrence patterns across multiple documents. The difference is that in VMFMix, the topic-word distribution is defined on a continuous n-dimensional hypersphere. Hence VMFMix is used to derive topic embeddings, i.e., representative vectors, from multiple sets of embedding vectors. An efficient variational Expectation-Maximization inference algorithm is derived. The performance of VMFMix on two document classification tasks is reported, together with some preliminary analysis.
Tasks Document Classification
Published 2017-02-24
URL http://arxiv.org/abs/1702.07495v1
PDF http://arxiv.org/pdf/1702.07495v1.pdf
PWC https://paperswithcode.com/paper/dirichlet-vmf-mixture-model
Repo https://github.com/askerlee/vmfmix
Framework none
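
A small sketch of the component-posterior computation inside such a vMF mixture, using the exponentially scaled Bessel function for a numerically stable normalizing constant. The dimensions, concentrations, and weights are invented, and this is only one local step of the kind used by the paper's variational EM, not its full inference.

```python
import numpy as np
from scipy.special import ive

def log_vmf(x, mu, kappa):
    """Log density of a von Mises-Fisher distribution on the unit sphere in
    R^p; ive (exponentially scaled Bessel I) keeps the constant stable."""
    p = mu.size
    log_c = ((p / 2 - 1) * np.log(kappa) - (p / 2) * np.log(2 * np.pi)
             - (np.log(ive(p / 2 - 1, kappa)) + kappa))
    return log_c + kappa * x @ mu

def responsibilities(x, weights, mus, kappas):
    """Posterior probability of each mixture component for a unit-norm
    embedding x."""
    logp = np.log(weights) + np.array([log_vmf(x, m, k)
                                       for m, k in zip(mus, kappas)])
    logp -= logp.max()                      # stabilize before exponentiating
    w = np.exp(logp)
    return w / w.sum()

rng = np.random.default_rng(0)
def unit(v): return v / np.linalg.norm(v)
x = unit(rng.normal(size=50))                         # a word embedding
mus = [unit(rng.normal(size=50)) for _ in range(3)]   # topic directions
print(responsibilities(x, np.array([0.4, 0.3, 0.3]), mus, [20.0, 20.0, 50.0]))
```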