Paper Group AWR 7
Multimodal Word Distributions. Space-Time Graph Modeling of Ride Requests Based on Real-World Data. DGM: A deep learning algorithm for solving partial differential equations. Road Extraction by Deep Residual U-Net. mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions. Neural Optimizer Search with Reinforcement Learning. Regular Boardgames. Forward Thinking: Building Deep Random Forests. Deep Sets. Deep Learning applied to NLP. Enabling Sparse Winograd Convolution by Native Pruning. Self-ensembling for visual domain adaptation. CSGNet: Neural Shape Parser for Constructive Solid Geometry. Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders. Dirichlet-vMF Mixture Model.
Multimodal Word Distributions
Title | Multimodal Word Distributions |
Authors | Ben Athiwaratkun, Andrew Gordon Wilson |
Abstract | Word embeddings provide point representations of words containing useful semantic information. We introduce multimodal word distributions formed from Gaussian mixtures, for multiple word meanings, entailment, and rich uncertainty information. To learn these distributions, we propose an energy-based max-margin objective. We show that the resulting approach captures uniquely expressive semantic information, and outperforms alternatives, such as word2vec skip-grams, and Gaussian embeddings, on benchmark datasets such as word similarity and entailment. |
Tasks | Word Embeddings |
Published | 2017-04-27 |
URL | https://arxiv.org/abs/1704.08424v2 |
https://arxiv.org/pdf/1704.08424v2.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-word-distributions |
Repo | https://github.com/benathi/multisense-prob-fasttext |
Framework | none |
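The paper models each word as a Gaussian mixture rather than a point vector. The sketch below is a minimal NumPy illustration (not the authors' released code) of one plausible similarity between two such distributions: the log expected likelihood kernel of two spherical-covariance mixtures. Dimensions, component counts, and weights are made-up assumptions.

```python
# Minimal sketch: a word as a K-component spherical Gaussian mixture, scored
# against another word with the log expected likelihood kernel (overlap of
# the two mixtures). Illustrative only; not the paper's training objective.
import numpy as np
from scipy.special import logsumexp

def log_expected_likelihood(means_a, vars_a, weights_a,
                            means_b, vars_b, weights_b):
    # log sum_ij p_i q_j * integral of N(x; mu_i, s_i I) * N(x; nu_j, t_j I) dx
    d = means_a.shape[1]
    logs = []
    for i in range(len(weights_a)):
        for j in range(len(weights_b)):
            v = vars_a[i] + vars_b[j]                  # combined variance
            diff = means_a[i] - means_b[j]
            log_gauss = -0.5 * (d * np.log(2 * np.pi * v) + diff @ diff / v)
            logs.append(np.log(weights_a[i] * weights_b[j]) + log_gauss)
    return logsumexp(logs)

# Two hypothetical words, each a mixture of K=2 components in 50 dimensions.
rng = np.random.default_rng(0)
K, d = 2, 50
word_a = (rng.normal(size=(K, d)), np.full(K, 0.5), np.full(K, 1.0 / K))
word_b = (rng.normal(size=(K, d)), np.full(K, 0.5), np.full(K, 1.0 / K))
print(log_expected_likelihood(*word_a, *word_b))
```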
Space-Time Graph Modeling of Ride Requests Based on Real-World Data
Title | Space-Time Graph Modeling of Ride Requests Based on Real-World Data |
Authors | Abhinav Jauhri, Brian Foo, Jerome Berclaz, Chih Chi Hu, Radek Grzeszczuk, Vasu Parameswaran, John Paul Shen |
Abstract | This paper focuses on modeling ride requests and their variations over location and time, based on analyzing extensive real-world data from a ride-sharing service. We introduce a graph model that captures the spatial and temporal variability of ride requests and the potential for ride pooling. We discover that these ride request graphs exhibit a well-known property called the densification power law, often found in real graphs modeling human behavior. We show that the pattern of ride requests and the potential of ride pooling for a city can be characterized by the densification factor of the ride request graphs. Previous works have shown that it is possible to automatically generate synthetic versions of such graphs that exhibit a given densification factor. We present an algorithm for automatic generation of synthetic ride request graphs that closely match the densification factor of graphs built from actual ride request data. |
Tasks | |
Published | 2017-01-23 |
URL | http://arxiv.org/abs/1701.06635v1 |
http://arxiv.org/pdf/1701.06635v1.pdf | |
PWC | https://paperswithcode.com/paper/space-time-graph-modeling-of-ride-requests |
Repo | https://github.com/ajauhri/mobility-modeling |
Framework | none |
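The densification power law says edge counts grow as a power of node counts, e ~ n^alpha. A hedged sketch of how one might estimate that exponent from graph snapshots (the counts below are invented for illustration):

```python
# Fit the densification exponent alpha from per-time-window graph sizes by
# regressing log(edges) on log(nodes). The numbers are hypothetical.
import numpy as np

nodes = np.array([120, 340, 910, 2500, 7100])     # nodes per time window
edges = np.array([310, 1200, 4600, 18000, 72000])  # edges per time window

alpha, log_c = np.polyfit(np.log(nodes), np.log(edges), 1)
print(f"densification factor alpha = {alpha:.2f}")  # alpha > 1 indicates densification
```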
DGM: A deep learning algorithm for solving partial differential equations
Title | DGM: A deep learning algorithm for solving partial differential equations |
Authors | Justin Sirignano, Konstantinos Spiliopoulos |
Abstract | High-dimensional PDEs have been a longstanding computational challenge. We propose to solve high-dimensional PDEs by approximating the solution with a deep neural network which is trained to satisfy the differential operator, initial condition, and boundary conditions. Our algorithm is meshfree, which is key since meshes become infeasible in higher dimensions. Instead of forming a mesh, the neural network is trained on batches of randomly sampled time and space points. The algorithm is tested on a class of high-dimensional free boundary PDEs, which we are able to accurately solve in up to $200$ dimensions. The algorithm is also tested on a high-dimensional Hamilton-Jacobi-Bellman PDE and Burgers’ equation. The deep learning algorithm approximates the general solution to the Burgers’ equation for a continuum of different boundary conditions and physical conditions (which can be viewed as a high-dimensional space). We call the algorithm a “Deep Galerkin Method (DGM)” since it is similar in spirit to Galerkin methods, with the solution approximated by a neural network instead of a linear combination of basis functions. In addition, we prove a theorem regarding the approximation power of neural networks for a class of quasilinear parabolic PDEs. |
Tasks | |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07469v5 |
http://arxiv.org/pdf/1708.07469v5.pdf | |
PWC | https://paperswithcode.com/paper/dgm-a-deep-learning-algorithm-for-solving |
Repo | https://github.com/Plemeur/DGM |
Framework | pytorch |
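The core of the method is meshfree training: sample random space-time points and penalize the PDE residual plus the initial and boundary conditions. Below is a hedged PyTorch sketch of that idea for the 1D heat equation u_t = u_xx on the unit square; the network size, sampling scheme, and loss weights are illustrative assumptions, not the paper's configuration.

```python
# Minimal DGM-style training loop (sketch): minimize PDE residual + IC + BC
# at randomly sampled points, using autograd for the derivatives of u(t, x).
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def grad(outputs, inputs):
    return torch.autograd.grad(outputs, inputs, torch.ones_like(outputs),
                               create_graph=True)[0]

for step in range(2000):
    # Interior points: PDE residual u_t - u_xx = 0.
    t = torch.rand(256, 1, requires_grad=True)
    x = torch.rand(256, 1, requires_grad=True)
    u = net(torch.cat([t, x], dim=1))
    u_t, u_x = grad(u, t), grad(u, x)
    u_xx = grad(u_x, x)
    loss_pde = ((u_t - u_xx) ** 2).mean()

    # Initial condition u(0, x) = sin(pi x); boundary u(t, 0) = u(t, 1) = 0.
    x0 = torch.rand(256, 1)
    u0 = net(torch.cat([torch.zeros_like(x0), x0], dim=1))
    loss_ic = ((u0 - torch.sin(torch.pi * x0)) ** 2).mean()
    tb = torch.rand(256, 1)
    ub = net(torch.cat([tb, torch.round(torch.rand(256, 1))], dim=1))
    loss_bc = (ub ** 2).mean()

    loss = loss_pde + loss_ic + loss_bc
    opt.zero_grad()
    loss.backward()
    opt.step()
```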
Road Extraction by Deep Residual U-Net
Title | Road Extraction by Deep Residual U-Net |
Authors | Zhengxin Zhang, Qingjie Liu, Yunhong Wang |
Abstract | Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network that combines the strengths of residual learning and U-Net is proposed for road area extraction. The network is built with residual units and has an architecture similar to that of U-Net. The benefits of this model are two-fold: first, residual units ease training of deep networks; second, the rich skip connections within the network facilitate information propagation, allowing us to design networks with fewer parameters but better performance. We test our network on a public road dataset and compare it with U-Net and two other state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all compared methods, demonstrating its superiority over recently developed state-of-the-art approaches. |
Tasks | Lesion Segmentation, Lung Nodule Segmentation, Retinal Vessel Segmentation, Semantic Segmentation, Skin Cancer Segmentation |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10684v1 |
http://arxiv.org/pdf/1711.10684v1.pdf | |
PWC | https://paperswithcode.com/paper/road-extraction-by-deep-residual-u-net |
Repo | https://github.com/Kaido0/Brain-Tissue-Segment-Keras |
Framework | tf |
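A hedged PyTorch sketch of the kind of pre-activation residual unit such a residual U-Net is assembled from (BN, ReLU, conv, twice, plus an identity-style shortcut). Channel counts and the 1x1 projection choice are illustrative, not the letter's exact configuration.

```python
# One residual unit: full-pre-activation body plus a shortcut that is
# projected with a 1x1 conv only when the shape changes.
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
        )
        self.shortcut = (nn.Identity() if in_ch == out_ch and stride == 1
                         else nn.Conv2d(in_ch, out_ch, 1, stride=stride))

    def forward(self, x):
        return self.body(x) + self.shortcut(x)

x = torch.randn(1, 3, 256, 256)
print(ResidualUnit(3, 64, stride=2)(x).shape)  # torch.Size([1, 64, 128, 128])
```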
mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions
Title | mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions |
Authors | Bernd Bischl, Jakob Richter, Jakob Bossek, Daniel Horn, Janek Thomas, Michel Lang |
Abstract | We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization (MBO), also known as Bayesian optimization, which addresses the problem of expensive black-box optimization by approximating the given objective function through a surrogate regression model. It is designed for both single- and multi-objective optimization with mixed continuous, categorical and conditional parameters. Additional features include multi-point batch proposal, parallelization, visualization, logging and error-handling. mlrMBO is implemented in a modular fashion, such that single components can be easily replaced or adapted by the user for specific use cases, e.g., any regression learner from the mlr toolbox for machine learning can be used, and infill criteria and infill optimizers are easily exchangeable. We empirically demonstrate that mlrMBO provides state-of-the-art performance by comparing it on different benchmark scenarios against a wide range of other optimizers, including DiceOptim, rBayesianOptimization, SPOT, SMAC, Spearmint, and Hyperopt. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03373v3 |
http://arxiv.org/pdf/1703.03373v3.pdf | |
PWC | https://paperswithcode.com/paper/mlrmbo-a-modular-framework-for-model-based |
Repo | https://github.com/cran/mlrMBO |
Framework | none |
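mlrMBO itself is an R package; the sketch below is not its API but a hedged Python illustration of the generic MBO loop it implements: fit a surrogate on the evaluated points, maximize an infill criterion (expected improvement here), evaluate the expensive function at the proposal, and repeat. The toy objective and candidate grid are made up.

```python
# Generic model-based optimization loop with a GP surrogate and expected
# improvement as the infill criterion (conceptual sketch only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_black_box(x):                  # stand-in for the real objective
    return np.sin(3 * x) + 0.1 * x ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5, 1))          # initial design
y = expensive_black_box(X).ravel()

for _ in range(20):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cand = np.linspace(-3, 3, 500).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_black_box(x_next)[0])

print("best found:", X[np.argmin(y)].item(), y.min())
```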
Neural Optimizer Search with Reinforcement Learning
Title | Neural Optimizer Search with Reinforcement Learning |
Authors | Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le |
Abstract | We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google’s neural machine translation system. |
Tasks | Machine Translation |
Published | 2017-09-21 |
URL | http://arxiv.org/abs/1709.07417v2 |
http://arxiv.org/pdf/1709.07417v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-optimizer-search-with-reinforcement |
Repo | https://github.com/calclavia/NOS |
Framework | pytorch |
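As described in the paper, both discovered rules scale the gradient up when its sign agrees with the sign of its running average and down when they disagree. A hedged NumPy sketch of single update steps (the moving-average decay and learning rate here are arbitrary choices, and the internal-decay variants from the paper are omitted):

```python
# PowerSign and AddSign update steps, where m is a running average of gradients.
import numpy as np

def powersign_step(w, g, m, lr=0.01, beta=0.9):
    m = beta * m + (1 - beta) * g                      # running gradient average
    w = w - lr * np.exp(np.sign(g) * np.sign(m)) * g   # scale by e^{+/-1}
    return w, m

def addsign_step(w, g, m, lr=0.01, beta=0.9):
    m = beta * m + (1 - beta) * g
    w = w - lr * (1 + np.sign(g) * np.sign(m)) * g     # scale by 0 or 2
    return w, m
```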
Regular Boardgames
Title | Regular Boardgames |
Authors | Jakub Kowalski, Maksymilian Mika, Jakub Sutowicz, Marek Szykuła |
Abstract | We propose a new General Game Playing (GGP) language called Regular Boardgames (RBG), which is based on the theory of regular languages. The objective of RBG is to combine key properties such as expressiveness, efficiency, and naturalness of description in one GGP formalism, compensating for certain drawbacks of the existing languages. This often makes RBG more suitable for various research and practical developments in GGP. While dedicated mostly to describing board games, RBG is universal for the class of all finite deterministic turn-based games with perfect information. We establish the foundations of RBG, and analyze it theoretically and experimentally, focusing on the efficiency of reasoning. Regular Boardgames is the first GGP language that allows efficient encoding and playing of games with complex rules and large branching factors (e.g., amazons, arimaa, large chess variants, go, international checkers, paper soccer). |
Tasks | Board Games |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02462v2 |
http://arxiv.org/pdf/1706.02462v2.pdf | |
PWC | https://paperswithcode.com/paper/regular-boardgames |
Repo | https://github.com/marekesz/rbg1.0 |
Framework | none |
Forward Thinking: Building Deep Random Forests
Title | Forward Thinking: Building Deep Random Forests |
Authors | Kevin Miller, Chris Hettinger, Jeffrey Humpherys, Tyler Jarvis, David Kartchner |
Abstract | The success of deep neural networks has inspired many to wonder whether other learners could benefit from deep, layered architectures. We present a general framework called forward thinking for deep learning that generalizes the architectural flexibility and sophistication of deep neural networks while also allowing for (i) different types of learning functions in the network, other than neurons, and (ii) the ability to adaptively deepen the network as needed to improve results. This is done by training one layer at a time, and once a layer is trained, the input data are mapped forward through the layer to create a new learning problem. The process is then repeated, transforming the data through multiple layers, one at a time, rendering a new dataset, which is expected to be better behaved, and on which a final output layer can achieve good performance. In the case where the neurons of deep neural nets are replaced with decision trees, we call the result a Forward Thinking Deep Random Forest (FTDRF). We demonstrate a proof of concept by applying FTDRF on the MNIST dataset. We also provide a general mathematical formulation that allows for other types of deep learning problems to be considered. |
Tasks | |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07366v1 |
http://arxiv.org/pdf/1705.07366v1.pdf | |
PWC | https://paperswithcode.com/paper/forward-thinking-building-deep-random-forests |
Repo | https://github.com/tkchris93/ForwardThinking |
Framework | tf |
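A hedged scikit-learn sketch of the layer-at-a-time scheme: train a forest, map the data forward through it (here, its class probabilities concatenated with the original inputs), and train the next layer on the transformed problem. Dataset, depth, and feature mapping are illustrative assumptions, not the paper's exact FTDRF configuration.

```python
# Layer-wise "forward thinking" with random forests: each layer is trained on
# the data as transformed by the previous layers.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

feat_tr, feat_te = X_tr, X_te
for depth in range(3):                      # deepen as long as it keeps helping
    layer = RandomForestClassifier(n_estimators=100, random_state=depth)
    layer.fit(feat_tr, y_tr)
    print(f"layer {depth}: test accuracy {layer.score(feat_te, y_te):.3f}")
    # Map the data forward: original inputs plus this layer's class probabilities.
    feat_tr = np.hstack([X_tr, layer.predict_proba(feat_tr)])
    feat_te = np.hstack([X_te, layer.predict_proba(feat_te)])
```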
Deep Sets
Title | Deep Sets |
Authors | Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola |
Abstract | We study the problem of designing models for machine learning tasks defined on sets. In contrast to the traditional approach of operating on fixed-dimensional vectors, we consider objective functions defined on sets that are invariant to permutations. Such problems are widespread, ranging from estimation of population statistics [poczos13aistats], to anomaly detection in piezometer data of embankment dams [Jung15Exploration], to cosmology [Ntampaka16Dynamical, Ravanbakhsh16ICML1]. Our main theorem characterizes the permutation-invariant functions and provides a family of functions to which any permutation-invariant objective function must belong. This family of functions has a special structure which enables us to design a deep network architecture that can operate on sets and which can be deployed on a variety of scenarios including both unsupervised and supervised learning tasks. We also derive the necessary and sufficient conditions for permutation equivariance in deep models. We demonstrate the applicability of our method on population statistic estimation, point cloud classification, set expansion, and outlier detection. |
Tasks | Anomaly Detection, Outlier Detection |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.06114v3 |
http://arxiv.org/pdf/1703.06114v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-sets |
Repo | https://github.com/MathieuCarriere/perslay |
Framework | tf |
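The structure characterized by the main theorem is a sum decomposition, f(X) = rho(sum over x of phi(x)), which is permutation invariant because the summation ignores element order. A hedged PyTorch sketch (layer sizes are illustrative, not the paper's architectures):

```python
# Minimal Deep Sets model: per-element encoder phi, sum pooling, set-level rho.
import torch
import torch.nn as nn

class DeepSet(nn.Module):
    def __init__(self, in_dim=3, hidden=64, out_dim=1):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):                     # x: (batch, set_size, in_dim)
        return self.rho(self.phi(x).sum(dim=1))

model = DeepSet()
x = torch.randn(2, 10, 3)
perm = x[:, torch.randperm(10), :]            # shuffle the set elements
print(torch.allclose(model(x), model(perm), atol=1e-5))  # True: order-invariant
```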
Deep Learning applied to NLP
Title | Deep Learning applied to NLP |
Authors | Marc Moreno Lopez, Jugal Kalita |
Abstract | Convolutional Neural Networks (CNNs) are typically associated with Computer Vision. CNNs are responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today. More recently, CNNs have been applied to problems in Natural Language Processing, with some interesting results. In this paper, we try to explain the basics of CNNs, their different variations, and how they have been applied to NLP. |
Tasks | Image Classification |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03091v1 |
http://arxiv.org/pdf/1703.03091v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-applied-to-nlp |
Repo | https://github.com/linghduoduo/NLP |
Framework | tf |
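A hedged PyTorch sketch of the basic CNN-for-text pattern such surveys describe: 1D convolutions of several widths over a sequence of word embeddings, max-pooled over time, feeding a classifier. Vocabulary size, filter widths, and the number of classes are made-up parameters.

```python
# A small text-CNN: embeddings -> multi-width 1D convolutions -> max-over-time
# pooling -> linear classifier.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab=10000, emb=128, n_filters=100, widths=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.convs = nn.ModuleList([nn.Conv1d(emb, n_filters, w) for w in widths])
        self.fc = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)      # (batch, emb, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))

logits = TextCNN()(torch.randint(0, 10000, (4, 50)))
print(logits.shape)  # torch.Size([4, 2])
```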
Enabling Sparse Winograd Convolution by Native Pruning
Title | Enabling Sparse Winograd Convolution by Native Pruning |
Authors | Sheng Li, Jongsoo Park, Ping Tak Peter Tang |
Abstract | Sparse methods and the use of Winograd convolutions are two orthogonal approaches, each of which significantly accelerates convolution computations in modern CNNs. Sparse Winograd merges these two and thus has the potential to offer a combined performance benefit. Nevertheless, training convolution layers so that the resulting Winograd kernels are sparse has not hitherto been very successful. By introducing a Winograd layer in place of a standard convolution layer, we can learn and prune Winograd coefficients “natively” and obtain sparsity levels beyond 90% with only 0.1% accuracy loss with AlexNet on the ImageNet dataset. Furthermore, we present a sparse Winograd convolution algorithm and implementation that exploits the sparsity, achieving up to 31.7 effective TFLOP/s in 32-bit precision on a recent Intel Xeon CPU, which corresponds to a 5.4x speedup over a state-of-the-art dense convolution implementation. |
Tasks | |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08597v2 |
http://arxiv.org/pdf/1702.08597v2.pdf | |
PWC | https://paperswithcode.com/paper/enabling-sparse-winograd-convolution-by |
Repo | https://github.com/IntelLabs/SkimCaffe |
Framework | tf |
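For context, a hedged NumPy sketch of 1D Winograd convolution F(2,3) using the standard Lavin-Gray transform matrices. The kernel is moved into the Winograd domain as U = G g, and it is this transformed kernel that "native" pruning makes sparse, so the elementwise-product stage can skip zero coefficients; the data and filter values are illustrative.

```python
# 1D Winograd F(2,3): two outputs of a valid correlation computed via the
# transform matrices BT, G, AT. Pruning acts on the transformed kernel U.
import numpy as np

BT = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)

d = np.array([1.0, 2.0, 3.0, 4.0])      # 4 input samples
g = np.array([0.5, 0.0, -0.5])          # 3-tap filter
U = G @ g                                # Winograd-domain kernel -- prune zeros here
m = AT @ (U * (BT @ d))                  # 2 outputs of the valid correlation
print(m, np.correlate(d, g, mode="valid"))  # both give the same 2 values
```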
Self-ensembling for visual domain adaptation
Title | Self-ensembling for visual domain adaptation |
Authors | Geoffrey French, Michal Mackiewicz, Mark Fisher |
Abstract | This paper explores the use of self-ensembling for visual domain adaptation problems. Our technique is derived from the mean teacher variant (Tarvainen et al., 2017) of temporal ensembling (Laine et al., 2017), a technique that achieved state-of-the-art results in the area of semi-supervised learning. We introduce a number of modifications to their approach for challenging domain adaptation scenarios and evaluate its effectiveness. Our approach achieves state-of-the-art results in a variety of benchmarks, including our winning entry in the VISDA-2017 visual domain adaptation challenge. In small image benchmarks, our algorithm not only outperforms prior art, but can also achieve accuracy close to that of a classifier trained in a supervised fashion. |
Tasks | Domain Adaptation |
Published | 2017-06-16 |
URL | http://arxiv.org/abs/1706.05208v4 |
http://arxiv.org/pdf/1706.05208v4.pdf | |
PWC | https://paperswithcode.com/paper/self-ensembling-for-visual-domain-adaptation |
Repo | https://github.com/Britefury/self-ensemble-visual-domain-adapt |
Framework | pytorch |
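A hedged PyTorch sketch of the mean-teacher core the paper builds on: the teacher is an exponential moving average of the student's weights, and the student is trained with a supervised loss on source data plus a consistency loss between student and teacher predictions on (differently perturbed) target inputs. The model, data, loss weighting, and augmentations are dummies, omitting the paper's domain-adaptation-specific modifications.

```python
# Self-ensembling / mean-teacher sketch: EMA teacher + consistency loss.
import copy
import torch
import torch.nn.functional as F

student = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, 10))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(student.parameters(), lr=0.1)

def ema_update(teacher, student, alpha=0.99):
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s.detach(), alpha=1 - alpha)

for step in range(100):
    x_src, y_src = torch.randn(16, 32), torch.randint(0, 10, (16,))  # labeled source
    x_tgt = torch.randn(16, 32)                                      # unlabeled target
    aug_a = x_tgt + 0.1 * torch.randn_like(x_tgt)                    # two perturbations
    aug_b = x_tgt + 0.1 * torch.randn_like(x_tgt)

    loss = F.cross_entropy(student(x_src), y_src)
    with torch.no_grad():
        target_probs = F.softmax(teacher(aug_b), dim=1)
    loss = loss + F.mse_loss(F.softmax(student(aug_a), dim=1), target_probs)

    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        ema_update(teacher, student)
```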
CSGNet: Neural Shape Parser for Constructive Solid Geometry
Title | CSGNet: Neural Shape Parser for Constructive Solid Geometry |
Authors | Gopal Sharma, Rishabh Goyal, Difan Liu, Evangelos Kalogerakis, Subhransu Maji |
Abstract | We present a neural architecture that takes as input a 2D or 3D shape and outputs a program that generates the shape. The instructions in our program are based on constructive solid geometry principles, i.e., a set of boolean operations on shape primitives defined recursively. Bottom-up techniques for this shape parsing task rely on primitive detection and are inherently slow since the search space over possible primitive combinations is large. In contrast, our model uses a recurrent neural network that parses the input shape in a top-down manner, which is significantly faster and yields a compact and easy-to-interpret sequence of modeling instructions. Our model is also more effective as a shape detector compared to existing state-of-the-art detection techniques. We finally demonstrate that our network can be trained on novel datasets without ground-truth program annotations through policy gradient techniques. |
Tasks | |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08290v2 |
http://arxiv.org/pdf/1712.08290v2.pdf | |
PWC | https://paperswithcode.com/paper/csgnet-neural-shape-parser-for-constructive |
Repo | https://github.com/AN313/deformable |
Framework | pytorch |
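To make the target representation concrete, here is a hedged NumPy sketch of what executing a tiny CSG-style program on a 2D grid looks like: primitives combined with boolean union, intersection, and subtraction. The primitives, program, and grid size are purely illustrative and not the paper's grammar or parser output.

```python
# Evaluate a toy 2D CSG program as boolean operations on binary masks.
import numpy as np

N = 64
yy, xx = np.mgrid[0:N, 0:N]

def circle(cx, cy, r):
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2

def square(cx, cy, s):
    return (np.abs(xx - cx) <= s) & (np.abs(yy - cy) <= s)

# Program: subtract a circle from the union of a square and another circle.
shape = (square(24, 32, 12) | circle(44, 32, 10)) & ~circle(30, 32, 6)
print(shape.sum(), "pixels set")   # area of the resulting binary shape
```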
Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders
Title | Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders |
Authors | Fengfu Li, Hong Qiao, Bo Zhang, Xuanyang Xi |
Abstract | Traditional image clustering methods take a two-step approach, performing feature learning and clustering sequentially. However, recent research results have demonstrated that combining the separated phases in a unified framework and training them jointly can achieve better performance. In this paper, we first introduce fully convolutional auto-encoders for image feature learning and then propose a unified clustering framework to learn image representations and cluster centers jointly, based on a fully convolutional auto-encoder and soft $k$-means scores. At the initial stages of the learning procedure, the representations extracted from the auto-encoder may not be very discriminative for later clustering. We address this issue by adopting a boosted discriminative distribution, where high-score assignments are highlighted and low-score ones are de-emphasized. With the gradually boosted discrimination, clustering assignment scores are discriminated and cluster purities are enlarged. Experiments on several vision benchmark datasets show that our method achieves state-of-the-art performance. |
Tasks | Image Clustering |
Published | 2017-03-23 |
URL | http://arxiv.org/abs/1703.07980v1 |
http://arxiv.org/pdf/1703.07980v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminatively-boosted-image-clustering |
Repo | https://github.com/waynezhanghk/gacluster |
Framework | pytorch |
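A hedged NumPy sketch of the "boosted discrimination" idea: compute soft k-means assignment scores from embeddings, then form a sharpened target distribution that highlights confident assignments and de-emphasizes weak ones, to be matched in the next training round. The sharpening used here (squaring and renormalizing, as in DEC-style targets) is one common choice and not necessarily the paper's exact transform; embeddings and centers are random stand-ins.

```python
# Soft k-means scores q and a boosted/sharpened target distribution p.
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=(200, 10))                 # embeddings from the conv auto-encoder
centers = z[rng.choice(200, 5, replace=False)]  # initial cluster centers

dist2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
q = np.exp(-dist2)                             # soft k-means assignment scores
q /= q.sum(axis=1, keepdims=True)

p = q ** 2 / q.sum(axis=0)                     # boost: sharpen and balance clusters
p /= p.sum(axis=1, keepdims=True)              # training target for the next round
```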
Dirichlet-vMF Mixture Model
Title | Dirichlet-vMF Mixture Model |
Authors | Shaohua Li |
Abstract | This document describes the multi-document von Mises-Fisher mixture model with a Dirichlet prior, referred to as VMFMix. VMFMix is analogous to Latent Dirichlet Allocation (LDA) in that both can capture co-occurrence patterns across multiple documents. The difference is that in VMFMix, the topic-word distribution is defined on a continuous n-dimensional hypersphere. Hence VMFMix is used to derive topic embeddings, i.e., representative vectors, from multiple sets of embedding vectors. An efficient Variational Expectation-Maximization inference algorithm is derived. The performance of VMFMix on two document classification tasks is reported, with some preliminary analysis. |
Tasks | Document Classification |
Published | 2017-02-24 |
URL | http://arxiv.org/abs/1702.07495v1 |
http://arxiv.org/pdf/1702.07495v1.pdf | |
PWC | https://paperswithcode.com/paper/dirichlet-vmf-mixture-model |
Repo | https://github.com/askerlee/vmfmix |
Framework | none |
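A hedged sketch of the distribution VMFMix mixes over: the von Mises-Fisher log-density on the unit hypersphere, log f(x; mu, kappa) = log C_d(kappa) + kappa * mu.x with C_d(kappa) = kappa^(d/2-1) / ((2*pi)^(d/2) * I_{d/2-1}(kappa)). The dimension and concentration below are arbitrary; this is not the package's inference code.

```python
# von Mises-Fisher log-density on the unit sphere, using the exponentially
# scaled Bessel function for numerical stability.
import numpy as np
from scipy.special import ive

def vmf_logpdf(x, mu, kappa):
    d = x.shape[-1]
    log_bessel = np.log(ive(d / 2 - 1, kappa)) + kappa   # log I_{d/2-1}(kappa)
    log_c = (d / 2 - 1) * np.log(kappa) - (d / 2) * np.log(2 * np.pi) - log_bessel
    return log_c + kappa * (x @ mu)

mu = np.ones(50) / np.sqrt(50)            # mean direction on the unit sphere
x = np.random.default_rng(0).normal(size=50)
x /= np.linalg.norm(x)                    # project the sample onto the sphere
print(vmf_logpdf(x, mu, kappa=20.0))
```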