Paper Group AWR 7
Multimodal Word Distributions. Space-Time Graph Modeling of Ride Requests Based on Real-World Data. DGM: A deep learning algorithm for solving partial differential equations. Road Extraction by Deep Residual U-Net. mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions. Neural Optimizer Search with Reinforcement Learning. Regular Boardgames. Forward Thinking: Building Deep Random Forests. Deep Sets. Deep Learning applied to NLP. Enabling Sparse Winograd Convolution by Native Pruning. Self-ensembling for visual domain adaptation. CSGNet: Neural Shape Parser for Constructive Solid Geometry. Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders. Dirichlet-vMF Mixture Model.
Multimodal Word Distributions
Title | Multimodal Word Distributions |
Authors | Ben Athiwaratkun, Andrew Gordon Wilson |
Abstract | Word embeddings provide point representations of words containing useful semantic information. We introduce multimodal word distributions formed from Gaussian mixtures, for multiple word meanings, entailment, and rich uncertainty information. To learn these distributions, we propose an energy-based max-margin objective. We show that the resulting approach captures uniquely expressive semantic information, and outperforms alternatives, such as word2vec skip-grams, and Gaussian embeddings, on benchmark datasets such as word similarity and entailment. |
Tasks | Word Embeddings |
Published | 2017-04-27 |
URL | https://arxiv.org/abs/1704.08424v2 |
https://arxiv.org/pdf/1704.08424v2.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-word-distributions |
Repo | https://github.com/benathi/multisense-prob-fasttext |
Framework | none |
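The paper models each word as a Gaussian mixture rather than a point vector. The sketch below is a minimal NumPy illustration (not the authors' released code) of one plausible similarity between two such distributions: the log expected likelihood kernel of two spherical-covariance mixtures. Dimensions, component counts, and weights are made-up assumptions.

```python
# Minimal sketch: a word as a K-component spherical Gaussian mixture, scored
# against another word with the log expected likelihood kernel (overlap of
# the two mixtures). Illustrative only; not the paper's training objective.
import numpy as np
from scipy.special import logsumexp

def log_expected_likelihood(means_a, vars_a, weights_a,
                            means_b, vars_b, weights_b):
    # log sum_ij p_i q_j * integral of N(x; mu_i, s_i I) * N(x; nu_j, t_j I) dx
    d = means_a.shape[1]
    logs = []
    for i in range(len(weights_a)):
        for j in range(len(weights_b)):
            v = vars_a[i] + vars_b[j]                  # combined variance
            diff = means_a[i] - means_b[j]
            log_gauss = -0.5 * (d * np.log(2 * np.pi * v) + diff @ diff / v)
            logs.append(np.log(weights_a[i] * weights_b[j]) + log_gauss)
    return logsumexp(logs)

# Two hypothetical words, each a mixture of K=2 components in 50 dimensions.
rng = np.random.default_rng(0)
K, d = 2, 50
word_a = (rng.normal(size=(K, d)), np.full(K, 0.5), np.full(K, 1.0 / K))
word_b = (rng.normal(size=(K, d)), np.full(K, 0.5), np.full(K, 1.0 / K))
print(log_expected_likelihood(*word_a, *word_b))
```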
Space-Time Graph Modeling of Ride Requests Based on Real-World Data
Title | Space-Time Graph Modeling of Ride Requests Based on Real-World Data |
Authors | Abhinav Jauhri, Brian Foo, Jerome Berclaz, Chih Chi Hu, Radek Grzeszczuk, Vasu Parameswaran, John Paul Shen |
Abstract | This paper focuses on modeling ride requests and their variations over location and time, based on analyzing extensive real-world data from a ride-sharing service. We introduce a graph model that captures the spatial and temporal variability of ride requests and the potential for ride pooling. We discover that these ride request graphs exhibit a well-known property called the densification power law, often found in real graphs modeling human behavior. We show that the pattern of ride requests and the potential of ride pooling for a city can be characterized by the densification factor of the ride request graphs. Previous works have shown that it is possible to automatically generate synthetic versions of such graphs that exhibit a given densification factor. We present an algorithm for automatic generation of synthetic ride request graphs that closely match the densification factor of graphs built from actual ride request data. |
Tasks | |
Published | 2017-01-23 |
URL | http://arxiv.org/abs/1701.06635v1 |
http://arxiv.org/pdf/1701.06635v1.pdf | |
PWC | https://paperswithcode.com/paper/space-time-graph-modeling-of-ride-requests |
Repo | https://github.com/ajauhri/mobility-modeling |
Framework | none |
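The densification power law says edge counts grow as a power of node counts, e ~ n^alpha. A hedged sketch of how one might estimate that exponent from graph snapshots (the counts below are invented for illustration):

```python
# Fit the densification exponent alpha from per-time-window graph sizes by
# regressing log(edges) on log(nodes). The numbers are hypothetical.
import numpy as np

nodes = np.array([120, 340, 910, 2500, 7100])     # nodes per time window
edges = np.array([310, 1200, 4600, 18000, 72000])  # edges per time window

alpha, log_c = np.polyfit(np.log(nodes), np.log(edges), 1)
print(f"densification factor alpha = {alpha:.2f}")  # alpha > 1 indicates densification
```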
DGM: A deep learning algorithm for solving partial differential equations
Title | DGM: A deep learning algorithm for solving partial differential equations |
Authors | Justin Sirignano, Konstantinos Spiliopoulos |
Abstract | High-dimensional PDEs have been a longstanding computational challenge. We propose to solve high-dimensional PDEs by approximating the solution with a deep neural network which is trained to satisfy the differential operator, initial condition, and boundary conditions. Our algorithm is meshfree, which is key since meshes become infeasible in higher dimensions. Instead of forming a mesh, the neural network is trained on batches of randomly sampled time and space points. The algorithm is tested on a class of high-dimensional free boundary PDEs, which we are able to accurately solve in up to $200$ dimensions. The algorithm is also tested on a high-dimensional Hamilton-Jacobi-Bellman PDE and Burgers’ equation. The deep learning algorithm approximates the general solution to the Burgers’ equation for a continuum of different boundary conditions and physical conditions (which can be viewed as a high-dimensional space). We call the algorithm a “Deep Galerkin Method (DGM)” since it is similar in spirit to Galerkin methods, with the solution approximated by a neural network instead of a linear combination of basis functions. In addition, we prove a theorem regarding the approximation power of neural networks for a class of quasilinear parabolic PDEs. |
Tasks | |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07469v5 |
http://arxiv.org/pdf/1708.07469v5.pdf | |
PWC | https://paperswithcode.com/paper/dgm-a-deep-learning-algorithm-for-solving |
Repo | https://github.com/Plemeur/DGM |
Framework | pytorch |
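The core of the method is meshfree training: sample random space-time points and penalize the PDE residual plus the initial and boundary conditions. Below is a hedged PyTorch sketch of that idea for the 1D heat equation u_t = u_xx on the unit square; the network size, sampling scheme, and loss weights are illustrative assumptions, not the paper's configuration.

```python
# Minimal DGM-style training loop (sketch): minimize PDE residual + IC + BC
# at randomly sampled points, using autograd for the derivatives of u(t, x).
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def grad(outputs, inputs):
    return torch.autograd.grad(outputs, inputs, torch.ones_like(outputs),
                               create_graph=True)[0]

for step in range(2000):
    # Interior points: PDE residual u_t - u_xx = 0.
    t = torch.rand(256, 1, requires_grad=True)
    x = torch.rand(256, 1, requires_grad=True)
    u = net(torch.cat([t, x], dim=1))
    u_t, u_x = grad(u, t), grad(u, x)
    u_xx = grad(u_x, x)
    loss_pde = ((u_t - u_xx) ** 2).mean()

    # Initial condition u(0, x) = sin(pi x); boundary u(t, 0) = u(t, 1) = 0.
    x0 = torch.rand(256, 1)
    u0 = net(torch.cat([torch.zeros_like(x0), x0], dim=1))
    loss_ic = ((u0 - torch.sin(torch.pi * x0)) ** 2).mean()
    tb = torch.rand(256, 1)
    ub = net(torch.cat([tb, torch.round(torch.rand(256, 1))], dim=1))
    loss_bc = (ub ** 2).mean()

    loss = loss_pde + loss_ic + loss_bc
    opt.zero_grad()
    loss.backward()
    opt.step()
```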
Road Extraction by Deep Residual U-Net
Title | Road Extraction by Deep Residual U-Net |
Authors | Zhengxin Zhang, Qingjie Liu, Yunhong Wang |
Abstract | Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network that combines the strengths of residual learning and U-Net is proposed for road area extraction. The network is built with residual units and has an architecture similar to that of U-Net. The benefits of this model are two-fold: first, residual units ease training of deep networks; second, the rich skip connections within the network facilitate information propagation, allowing us to design networks with fewer parameters but better performance. We test our network on a public road dataset and compare it with U-Net and two other state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all compared methods, demonstrating its superiority over recently developed state-of-the-art approaches. |
Tasks | Lesion Segmentation, Lung Nodule Segmentation, Retinal Vessel Segmentation, Semantic Segmentation, Skin Cancer Segmentation |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10684v1 |
http://arxiv.org/pdf/1711.10684v1.pdf | |
PWC | https://paperswithcode.com/paper/road-extraction-by-deep-residual-u-net |
Repo | https://github.com/Kaido0/Brain-Tissue-Segment-Keras |
Framework | tf |
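A hedged PyTorch sketch of the kind of pre-activation residual unit such a residual U-Net is assembled from (BN, ReLU, conv, twice, plus an identity-style shortcut). Channel counts and the 1x1 projection choice are illustrative, not the letter's exact configuration.

```python
# One residual unit: full-pre-activation body plus a shortcut that is
# projected with a 1x1 conv only when the shape changes.
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
        )
        self.shortcut = (nn.Identity() if in_ch == out_ch and stride == 1
                         else nn.Conv2d(in_ch, out_ch, 1, stride=stride))

    def forward(self, x):
        return self.body(x) + self.shortcut(x)

x = torch.randn(1, 3, 256, 256)
print(ResidualUnit(3, 64, stride=2)(x).shape)  # torch.Size([1, 64, 128, 128])
```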
mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions
Title | mlrMBO: A Modular Framework for Model-Based Optimization of Expensive Black-Box Functions |
Authors | Bernd Bischl, Jakob Richter, Jakob Bossek, Daniel Horn, Janek Thomas, Michel Lang |
Abstract | We present mlrMBO, a flexible and comprehensive R toolbox for model-based optimization (MBO), also known as Bayesian optimization, which addresses the problem of expensive black-box optimization by approximating the given objective function through a surrogate regression model. It is designed for both single- and multi-objective optimization with mixed continuous, categorical and conditional parameters. Additional features include multi-point batch proposal, parallelization, visualization, logging and error-handling. mlrMBO is implemented in a modular fashion, such that single components can be easily replaced or adapted by the user for specific use cases, e.g., any regression learner from the mlr toolbox for machine learning can be used, and infill criteria and infill optimizers are easily exchangeable. We empirically demonstrate that mlrMBO provides state-of-the-art performance by comparing it on different benchmark scenarios against a wide range of other optimizers, including DiceOptim, rBayesianOptimization, SPOT, SMAC, Spearmint, and Hyperopt. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03373v3 |
http://arxiv.org/pdf/1703.03373v3.pdf | |
PWC | https://paperswithcode.com/paper/mlrmbo-a-modular-framework-for-model-based |
Repo | https://github.com/cran/mlrMBO |
Framework | none |
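mlrMBO itself is an R package; the sketch below is not its API but a hedged Python illustration of the generic MBO loop it implements: fit a surrogate on the evaluated points, maximize an infill criterion (expected improvement here), evaluate the expensive function at the proposal, and repeat. The toy objective and candidate grid are made up.

```python
# Generic model-based optimization loop with a GP surrogate and expected
# improvement as the infill criterion (conceptual sketch only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_black_box(x):                  # stand-in for the real objective
    return np.sin(3 * x) + 0.1 * x ** 2

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5, 1))          # initial design
y = expensive_black_box(X).ravel()

for _ in range(20):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    cand = np.linspace(-3, 3, 500).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # expected improvement
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, expensive_black_box(x_next)[0])

print("best found:", X[np.argmin(y)].item(), y.min())
```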
Neural Optimizer Search with Reinforcement Learning
Title | Neural Optimizer Search with Reinforcement Learning |
Authors | Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le |
Abstract | We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathematical update equation based on a list of primitive functions, such as the gradient, running average of the gradient, etc. The controller is trained with Reinforcement Learning to maximize the performance of a model after a few epochs. On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. We introduce two new optimizers, named PowerSign and AddSign, which we show transfer well and improve training on a variety of different tasks and architectures, including ImageNet classification and Google’s neural machine translation system. |
Tasks | Machine Translation |
Published | 2017-09-21 |
URL | http://arxiv.org/abs/1709.07417v2 |
http://arxiv.org/pdf/1709.07417v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-optimizer-search-with-reinforcement |
Repo | https://github.com/calclavia/NOS |
Framework | pytorch |
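As described in the paper, both discovered rules scale the gradient up when its sign agrees with the sign of its running average and down when they disagree. A hedged NumPy sketch of single update steps (the moving-average decay and learning rate here are arbitrary choices, and the internal-decay variants from the paper are omitted):

```python
# PowerSign and AddSign update steps, where m is a running average of gradients.
import numpy as np

def powersign_step(w, g, m, lr=0.01, beta=0.9):
    m = beta * m + (1 - beta) * g                      # running gradient average
    w = w - lr * np.exp(np.sign(g) * np.sign(m)) * g   # scale by e^{+/-1}
    return w, m

def addsign_step(w, g, m, lr=0.01, beta=0.9):
    m = beta * m + (1 - beta) * g
    w = w - lr * (1 + np.sign(g) * np.sign(m)) * g     # scale by 0 or 2
    return w, m
```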
Regular Boardgames
Title | Regular Boardgames |
Authors | Jakub Kowalski, Maksymilian Mika, Jakub Sutowicz, Marek Szykuła |
Abstract | We propose a new General Game Playing (GGP) language called Regular Boardgames (RBG), which is based on the theory of regular languages. The objective of RBG is to combine key properties such as expressiveness, efficiency, and naturalness of description in one GGP formalism, compensating for certain drawbacks of the existing languages. This often makes RBG more suitable for various research and practical developments in GGP. While dedicated mostly to describing board games, RBG is universal for the class of all finite deterministic turn-based games with perfect information. We establish the foundations of RBG, and analyze it theoretically and experimentally, focusing on the efficiency of reasoning. Regular Boardgames is the first GGP language that allows efficient encoding and playing of games with complex rules and large branching factors (e.g., amazons, arimaa, large chess variants, go, international checkers, paper soccer). |
Tasks | Board Games |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02462v2 |
http://arxiv.org/pdf/1706.02462v2.pdf | |
PWC | https://paperswithcode.com/paper/regular-boardgames |
Repo | https://github.com/marekesz/rbg1.0 |
Framework | none |
Forward Thinking: Building Deep Random Forests
Title | Forward Thinking: Building Deep Random Forests |
Authors | Kevin Miller, Chris Hettinger, Jeffrey Humpherys, Tyler Jarvis, David Kartchner |
Abstract | The success of deep neural networks has inspired many to wonder whether other learners could benefit from deep, layered architectures. We present a general framework called forward thinking for deep learning that generalizes the architectural flexibility and sophistication of deep neural networks while also allowing for (i) different types of learning functions in the network, other than neurons, and (ii) the ability to adaptively deepen the network as needed to improve results. This is done by training one layer at a time, and once a layer is trained, the input data are mapped forward through the layer to create a new learning problem. The process is then repeated, transforming the data through multiple layers, one at a time, rendering a new dataset, which is expected to be better behaved, and on which a final output layer can achieve good performance. In the case where the neurons of deep neural nets are replaced with decision trees, we call the result a Forward Thinking Deep Random Forest (FTDRF). We demonstrate a proof of concept by applying FTDRF on the MNIST dataset. We also provide a general mathematical formulation that allows for other types of deep learning problems to be considered. |
Tasks | |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07366v1 |
http://arxiv.org/pdf/1705.07366v1.pdf | |
PWC | https://paperswithcode.com/paper/forward-thinking-building-deep-random-forests |
Repo | https://github.com/tkchris93/ForwardThinking |
Framework | tf |
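A hedged scikit-learn sketch of the layer-at-a-time scheme: train a forest, map the data forward through it (here, its class probabilities concatenated with the original inputs), and train the next layer on the transformed problem. Dataset, depth, and feature mapping are illustrative assumptions, not the paper's exact FTDRF configuration.

```python
# Layer-wise "forward thinking" with random forests: each layer is trained on
# the data as transformed by the previous layers.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

feat_tr, feat_te = X_tr, X_te
for depth in range(3):                      # deepen as long as it keeps helping
    layer = RandomForestClassifier(n_estimators=100, random_state=depth)
    layer.fit(feat_tr, y_tr)
    print(f"layer {depth}: test accuracy {layer.score(feat_te, y_te):.3f}")
    # Map the data forward: original inputs plus this layer's class probabilities.
    feat_tr = np.hstack([X_tr, layer.predict_proba(feat_tr)])
    feat_te = np.hstack([X_te, layer.predict_proba(feat_te)])
```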
Deep Sets
Title | Deep Sets |
Authors | Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola |
Abstract | We study the problem of designing models for machine learning tasks defined on sets. In contrast to the traditional approach of operating on fixed-dimensional vectors, we consider objective functions defined on sets that are invariant to permutations. Such problems are widespread, ranging from estimation of population statistics [poczos13aistats], to anomaly detection in piezometer data of embankment dams [Jung15Exploration], to cosmology [Ntampaka16Dynamical, Ravanbakhsh16ICML1]. Our main theorem characterizes the permutation-invariant functions and provides a family of functions to which any permutation-invariant objective function must belong. This family of functions has a special structure which enables us to design a deep network architecture that can operate on sets and which can be deployed on a variety of scenarios including both unsupervised and supervised learning tasks. We also derive the necessary and sufficient conditions for permutation equivariance in deep models. We demonstrate the applicability of our method on population statistic estimation, point cloud classification, set expansion, and outlier detection. |
Tasks | Anomaly Detection, Outlier Detection |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.06114v3 |
http://arxiv.org/pdf/1703.06114v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-sets |
Repo | https://github.com/MathieuCarriere/perslay |
Framework | tf |
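The structure characterized by the main theorem is a sum decomposition, f(X) = rho(sum over x of phi(x)), which is permutation invariant because the summation ignores element order. A hedged PyTorch sketch (layer sizes are illustrative, not the paper's architectures):

```python
# Minimal Deep Sets model: per-element encoder phi, sum pooling, set-level rho.
import torch
import torch.nn as nn

class DeepSet(nn.Module):
    def __init__(self, in_dim=3, hidden=64, out_dim=1):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):                     # x: (batch, set_size, in_dim)
        return self.rho(self.phi(x).sum(dim=1))

model = DeepSet()
x = torch.randn(2, 10, 3)
perm = x[:, torch.randperm(10), :]            # shuffle the set elements
print(torch.allclose(model(x), model(perm), atol=1e-5))  # True: order-invariant
```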
Deep Learning applied to NLP
Title | Deep Learning applied to NLP |
Authors | Marc Moreno Lopez, Jugal Kalita |
Abstract | Convolutional Neural Networks (CNNs) are typically associated with Computer Vision. CNNs are responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today. More recently, CNNs have been applied to problems in Natural Language Processing, with some interesting results. In this paper, we try to explain the basics of CNNs, their different variations, and how they have been applied to NLP. |
Tasks | Image Classification |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03091v1 |
http://arxiv.org/pdf/1703.03091v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-applied-to-nlp |
Repo | https://github.com/linghduoduo/NLP |
Framework | tf |
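A hedged PyTorch sketch of the basic CNN-for-text pattern such surveys describe: 1D convolutions of several widths over a sequence of word embeddings, max-pooled over time, feeding a classifier. Vocabulary size, filter widths, and the number of classes are made-up parameters.

```python
# A small text-CNN: embeddings -> multi-width 1D convolutions -> max-over-time
# pooling -> linear classifier.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab=10000, emb=128, n_filters=100, widths=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.convs = nn.ModuleList([nn.Conv1d(emb, n_filters, w) for w in widths])
        self.fc = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)      # (batch, emb, seq_len)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))

logits = TextCNN()(torch.randint(0, 10000, (4, 50)))
print(logits.shape)  # torch.Size([4, 2])
```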
Enabling Sparse Winograd Convolution by Native Pruning
Title | Enabling Sparse Winograd Convolution by Native Pruning |
Authors | Sheng Li, Jongsoo Park, Ping Tak Peter Tang |
Abstract | Sparse methods and the use of Winograd convolutions are two orthogonal approaches, each of which significantly accelerates convolution computations in modern CNNs. Sparse Winograd merges these two and thus has the potential to offer a combined performance benefit. Nevertheless, training convolution layers so that the resulting Winograd kernels are sparse has not hitherto been very successful. By introducing a Winograd layer in place of a standard convolution layer, we can learn and prune Winograd coefficients “natively” and obtain sparsity levels beyond 90% with only 0.1% accuracy loss with AlexNet on the ImageNet dataset. Furthermore, we present a sparse Winograd convolution algorithm and implementation that exploits the sparsity, achieving up to 31.7 effective TFLOP/s in 32-bit precision on a recent Intel Xeon CPU, which corresponds to a 5.4x speedup over a state-of-the-art dense convolution implementation. |
Tasks | |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08597v2 |
http://arxiv.org/pdf/1702.08597v2.pdf | |
PWC | https://paperswithcode.com/paper/enabling-sparse-winograd-convolution-by |
Repo | https://github.com/IntelLabs/SkimCaffe |
Framework | tf |
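For context, a hedged NumPy sketch of 1D Winograd convolution F(2,3) using the standard Lavin-Gray transform matrices. The kernel is moved into the Winograd domain as U = G g, and it is this transformed kernel that "native" pruning makes sparse, so the elementwise-product stage can skip zero coefficients; the data and filter values are illustrative.

```python
# 1D Winograd F(2,3): two outputs of a valid correlation computed via the
# transform matrices BT, G, AT. Pruning acts on the transformed kernel U.
import numpy as np

BT = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)

d = np.array([1.0, 2.0, 3.0, 4.0])      # 4 input samples
g = np.array([0.5, 0.0, -0.5])          # 3-tap filter
U = G @ g                                # Winograd-domain kernel -- prune zeros here
m = AT @ (U * (BT @ d))                  # 2 outputs of the valid correlation
print(m, np.correlate(d, g, mode="valid"))  # both give the same 2 values
```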
Self-ensembling for visual domain adaptation
Title | Self-ensembling for visual domain adaptation |
Authors | Geoffrey French, Michal Mackiewicz, Mark Fisher |
Abstract | This paper explores the use of self-ensembling for visual domain adaptation problems. Our technique is derived from the mean teacher variant (Tarvainen et al., 2017) of temporal ensembling (Laine et al., 2017), a technique that achieved state-of-the-art results in the area of semi-supervised learning. We introduce a number of modifications to their approach for challenging domain adaptation scenarios and evaluate its effectiveness. Our approach achieves state-of-the-art results in a variety of benchmarks, including our winning entry in the VISDA-2017 visual domain adaptation challenge. In small image benchmarks, our algorithm not only outperforms prior art, but can also achieve accuracy close to that of a classifier trained in a supervised fashion. |
Tasks | Domain Adaptation |
Published | 2017-06-16 |
URL | http://arxiv.org/abs/1706.05208v4 |
http://arxiv.org/pdf/1706.05208v4.pdf | |
PWC | https://paperswithcode.com/paper/self-ensembling-for-visual-domain-adaptation |
Repo | https://github.com/Britefury/self-ensemble-visual-domain-adapt |
Framework | pytorch |
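A hedged PyTorch sketch of the mean-teacher core the paper builds on: the teacher is an exponential moving average of the student's weights, and the student is trained with a supervised loss on source data plus a consistency loss between student and teacher predictions on (differently perturbed) target inputs. The model, data, loss weighting, and augmentations are dummies, omitting the paper's domain-adaptation-specific modifications.

```python
# Self-ensembling / mean-teacher sketch: EMA teacher + consistency loss.
import copy
import torch
import torch.nn.functional as F

student = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, 10))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD(student.parameters(), lr=0.1)

def ema_update(teacher, student, alpha=0.99):
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s.detach(), alpha=1 - alpha)

for step in range(100):
    x_src, y_src = torch.randn(16, 32), torch.randint(0, 10, (16,))  # labeled source
    x_tgt = torch.randn(16, 32)                                      # unlabeled target
    aug_a = x_tgt + 0.1 * torch.randn_like(x_tgt)                    # two perturbations
    aug_b = x_tgt + 0.1 * torch.randn_like(x_tgt)

    loss = F.cross_entropy(student(x_src), y_src)
    with torch.no_grad():
        target_probs = F.softmax(teacher(aug_b), dim=1)
    loss = loss + F.mse_loss(F.softmax(student(aug_a), dim=1), target_probs)

    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        ema_update(teacher, student)
```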
CSGNet: Neural Shape Parser for Constructive Solid Geometry
Title | CSGNet: Neural Shape Parser for Constructive Solid Geometry |
Authors | Gopal Sharma, Rishabh Goyal, Difan Liu, Evangelos Kalogerakis, Subhransu Maji |
Abstract | We present a neural architecture that takes as input a 2D or 3D shape and outputs a program that generates the shape. The instructions in our program are based on constructive solid geometry principles, i.e., a set of boolean operations on shape primitives defined recursively. Bottom-up techniques for this shape parsing task rely on primitive detection and are inherently slow since the search space over possible primitive combinations is large. In contrast, our model uses a recurrent neural network that parses the input shape in a top-down manner, which is significantly faster and yields a compact and easy-to-interpret sequence of modeling instructions. Our model is also more effective as a shape detector compared to existing state-of-the-art detection techniques. We finally demonstrate that our network can be trained on novel datasets without ground-truth program annotations through policy gradient techniques. |
Tasks | |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08290v2 |
http://arxiv.org/pdf/1712.08290v2.pdf | |
PWC | https://paperswithcode.com/paper/csgnet-neural-shape-parser-for-constructive |
Repo | https://github.com/AN313/deformable |
Framework | pytorch |
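To make the target representation concrete, here is a hedged NumPy sketch of what executing a tiny CSG-style program on a 2D grid looks like: primitives combined with boolean union, intersection, and subtraction. The primitives, program, and grid size are purely illustrative and not the paper's grammar or parser output.

```python
# Evaluate a toy 2D CSG program as boolean operations on binary masks.
import numpy as np

N = 64
yy, xx = np.mgrid[0:N, 0:N]

def circle(cx, cy, r):
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2

def square(cx, cy, s):
    return (np.abs(xx - cx) <= s) & (np.abs(yy - cy) <= s)

# Program: subtract a circle from the union of a square and another circle.
shape = (square(24, 32, 12) | circle(44, 32, 10)) & ~circle(30, 32, 6)
print(shape.sum(), "pixels set")   # area of the resulting binary shape
```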
Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders
Title | Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders |
Authors | Fengfu Li, Hong Qiao, Bo Zhang, Xuanyang Xi |
Abstract | Traditional image clustering methods take a two-step approach, performing feature learning and clustering sequentially. However, recent research results have demonstrated that combining the separated phases in a unified framework and training them jointly can achieve better performance. In this paper, we first introduce fully convolutional auto-encoders for image feature learning and then propose a unified clustering framework to learn image representations and cluster centers jointly, based on a fully convolutional auto-encoder and soft $k$-means scores. At the initial stages of the learning procedure, the representations extracted from the auto-encoder may not be very discriminative for later clustering. We address this issue by adopting a boosted discriminative distribution, where high-score assignments are highlighted and low-score ones are de-emphasized. With the gradually boosted discrimination, clustering assignment scores are discriminated and cluster purities are enlarged. Experiments on several vision benchmark datasets show that our method achieves state-of-the-art performance. |
Tasks | Image Clustering |
Published | 2017-03-23 |
URL | http://arxiv.org/abs/1703.07980v1 |
http://arxiv.org/pdf/1703.07980v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminatively-boosted-image-clustering |
Repo | https://github.com/waynezhanghk/gacluster |
Framework | pytorch |
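A hedged NumPy sketch of the "boosted discrimination" idea: compute soft k-means assignment scores from embeddings, then form a sharpened target distribution that highlights confident assignments and de-emphasizes weak ones, to be matched in the next training round. The sharpening used here (squaring and renormalizing, as in DEC-style targets) is one common choice and not necessarily the paper's exact transform; embeddings and centers are random stand-ins.

```python
# Soft k-means scores q and a boosted/sharpened target distribution p.
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=(200, 10))                 # embeddings from the conv auto-encoder
centers = z[rng.choice(200, 5, replace=False)]  # initial cluster centers

dist2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
q = np.exp(-dist2)                             # soft k-means assignment scores
q /= q.sum(axis=1, keepdims=True)

p = q ** 2 / q.sum(axis=0)                     # boost: sharpen and balance clusters
p /= p.sum(axis=1, keepdims=True)              # training target for the next round
```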
Dirichlet-vMF Mixture Model
Title | Dirichlet-vMF Mixture Model |
Authors | Shaohua Li |
Abstract | This document describes the multi-document von Mises-Fisher mixture model with a Dirichlet prior, referred to as VMFMix. VMFMix is analogous to Latent Dirichlet Allocation (LDA) in that both can capture co-occurrence patterns across multiple documents. The difference is that in VMFMix, the topic-word distribution is defined on a continuous n-dimensional hypersphere. Hence VMFMix is used to derive topic embeddings, i.e., representative vectors, from multiple sets of embedding vectors. An efficient Variational Expectation-Maximization inference algorithm is derived. The performance of VMFMix on two document classification tasks is reported, with some preliminary analysis. |
Tasks | Document Classification |
Published | 2017-02-24 |
URL | http://arxiv.org/abs/1702.07495v1 |
http://arxiv.org/pdf/1702.07495v1.pdf | |
PWC | https://paperswithcode.com/paper/dirichlet-vmf-mixture-model |
Repo | https://github.com/askerlee/vmfmix |
Framework | none |
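A hedged sketch of the distribution VMFMix mixes over: the von Mises-Fisher log-density on the unit hypersphere, log f(x; mu, kappa) = log C_d(kappa) + kappa * mu.x with C_d(kappa) = kappa^(d/2-1) / ((2*pi)^(d/2) * I_{d/2-1}(kappa)). The dimension and concentration below are arbitrary; this is not the package's inference code.

```python
# von Mises-Fisher log-density on the unit sphere, using the exponentially
# scaled Bessel function for numerical stability.
import numpy as np
from scipy.special import ive

def vmf_logpdf(x, mu, kappa):
    d = x.shape[-1]
    log_bessel = np.log(ive(d / 2 - 1, kappa)) + kappa   # log I_{d/2-1}(kappa)
    log_c = (d / 2 - 1) * np.log(kappa) - (d / 2) * np.log(2 * np.pi) - log_bessel
    return log_c + kappa * (x @ mu)

mu = np.ones(50) / np.sqrt(50)            # mean direction on the unit sphere
x = np.random.default_rng(0).normal(size=50)
x /= np.linalg.norm(x)                    # project the sample onto the sphere
print(vmf_logpdf(x, mu, kappa=20.0))
```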