October 21, 2019

3102 words 15 mins read

Paper Group AWR 85

Maximizing acquisition functions for Bayesian optimization. MnasNet: Platform-Aware Neural Architecture Search for Mobile. Learning Word Vectors for 157 Languages. Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction. ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network. Eval …

Maximizing acquisition functions for Bayesian optimization

Title Maximizing acquisition functions for Bayesian optimization
Authors James T. Wilson, Frank Hutter, Marc Peter Deisenroth
Abstract Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide its search process. Fully maximizing acquisition functions produces the Bayes’ decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We first show that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization. Subsequently, we identify a common family of acquisition functions, including EI and UCB, whose properties not only facilitate but justify use of greedy approaches for their maximization.
Tasks
Published 2018-05-25
URL http://arxiv.org/abs/1805.10196v2
PDF http://arxiv.org/pdf/1805.10196v2.pdf
PWC https://paperswithcode.com/paper/maximizing-acquisition-functions-for-bayesian
Repo https://github.com/j-wilson/MaximizingAcquisitionFunctions
Framework tf
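
A minimal NumPy sketch of the reparameterized Monte Carlo estimator discussed above, for parallel Expected Improvement (q-EI). The GP posterior mean and covariance Cholesky factor are assumed to be given; all names are illustrative, not the authors' API.

```python
import numpy as np

def mc_qei(mean, chol, best_f, n_samples=256, seed=0):
    """Monte Carlo estimate of parallel Expected Improvement (q-EI) for a
    batch of q candidate points, given the GP posterior mean (shape (q,))
    and the Cholesky factor of its covariance (shape (q, q)).

    Reparameterization: f = mean + L @ z with z ~ N(0, I), so the estimator
    is a smooth function of the candidate locations (through mean and L)
    and is amenable to gradient-based or greedy maximization."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, len(mean)))
    f = mean + z @ chol.T                            # posterior samples, (S, q)
    improvement = np.maximum(f.max(axis=1) - best_f, 0.0)
    return improvement.mean()
```

Greedy batch construction would then repeatedly add the candidate that most increases this estimate, the strategy the paper argues is justified for the EI/UCB family.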

MnasNet: Platform-Aware Neural Architecture Search for Mobile

Title MnasNet: Platform-Aware Neural Architecture Search for Mobile
Authors Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le
Abstract Designing convolutional neural networks (CNN) for mobile devices is challenging because mobile models need to be small and fast, yet still accurate. Although significant efforts have been dedicated to designing and improving mobile CNNs on all dimensions, it is very difficult to manually balance these trade-offs when there are so many architectural possibilities to consider. In this paper, we propose an automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. Unlike previous work, where latency is considered via another, often inaccurate proxy (e.g., FLOPS), our approach directly measures real-world inference latency by executing the model on mobile phones. To further strike the right balance between flexibility and search space size, we propose a novel factorized hierarchical search space that encourages layer diversity throughout the network. Experimental results show that our approach consistently outperforms state-of-the-art mobile CNN models across multiple vision tasks. On the ImageNet classification task, our MnasNet achieves 75.2% top-1 accuracy with 78ms latency on a Pixel phone, which is 1.8x faster than MobileNetV2 [29] with 0.5% higher accuracy and 2.3x faster than NASNet [36] with 1.2% higher accuracy. Our MnasNet also achieves better mAP quality than MobileNets for COCO object detection. Code is at https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet
Tasks Image Classification, Neural Architecture Search, Object Detection
Published 2018-07-31
URL https://arxiv.org/abs/1807.11626v3
PDF https://arxiv.org/pdf/1807.11626v3.pdf
PWC https://paperswithcode.com/paper/mnasnet-platform-aware-neural-architecture
Repo https://github.com/mingxingtan/mnasnet
Framework tf
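
The core of the search described above is a latency-aware objective. A hedged one-liner of that reward shape (the paper distinguishes exponents for models below and above the latency target; a single exponent is used here for brevity, with placeholder numbers drawn from the abstract):

```python
def mnas_reward(accuracy, latency_ms, target_ms=78.0, w=-0.07):
    """Latency-aware search reward in the spirit of MnasNet:
    ACC(m) * (LAT(m) / T) ** w, which softly discounts models whose
    measured on-device latency exceeds the target T."""
    return accuracy * (latency_ms / target_ms) ** w

# A 75.2%-accurate model at the 78 ms target keeps its full reward,
# while the same accuracy at twice the latency is discounted by 2 ** -0.07.
print(mnas_reward(0.752, 78.0), mnas_reward(0.752, 156.0))
```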

Learning Word Vectors for 157 Languages

Title Learning Word Vectors for 157 Languages
Authors Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, Tomas Mikolov
Abstract Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. A key ingredient to the successful application of these representations is to train them on very large corpora, and use these pre-trained models in downstream tasks. In this paper, we describe how we trained such high-quality word representations for 157 languages. We used two sources of data to train these models: the free online encyclopedia Wikipedia and data from the Common Crawl project. We also introduce three new word analogy datasets to evaluate these word vectors, for French, Hindi and Polish. Finally, we evaluate our pre-trained word vectors on 10 languages for which evaluation datasets exist, showing very strong performance compared to previous models.
Tasks
Published 2018-02-19
URL http://arxiv.org/abs/1802.06893v2
PDF http://arxiv.org/pdf/1802.06893v2.pdf
PWC https://paperswithcode.com/paper/learning-word-vectors-for-157-languages
Repo https://github.com/dzieciou/lemmatizer-pl
Framework tf
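
The released vectors come in fastText's plain-text .vec format. A small sketch of loading them and answering a word-analogy query of the kind the new evaluation sets contain (file path and vocabulary limit are placeholders):

```python
import numpy as np

def load_vec(path, limit=50000):
    """Load fastText's text .vec format: a header line with vocabulary size
    and dimension, then one word per line followed by its vector values."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        next(f)                                   # skip header line
        for _, line in zip(range(limit), f):
            word, *vals = line.rstrip().split(" ")
            vectors[word] = np.array(vals, dtype=np.float32)
    return vectors

def analogy(vectors, a, b, c, topn=1):
    """Solve 'a is to b as c is to ?' by cosine similarity to b - a + c."""
    target = vectors[b] - vectors[a] + vectors[c]
    target /= np.linalg.norm(target)
    scores = {w: v @ target / np.linalg.norm(v)
              for w, v in vectors.items() if w not in (a, b, c)}
    return sorted(scores, key=scores.get, reverse=True)[:topn]
```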

Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction

Title Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction
Authors Lifeng Jin, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz
Abstract There have been several recent attempts to improve the accuracy of grammar induction systems by bounding the recursive complexity of the induction model (Ponvert et al., 2011; Noji and Johnson, 2016; Shain et al., 2016; Jin et al., 2018). Modern depth-bounded grammar inducers have been shown to be more accurate than early unbounded PCFG inducers, but this technique has never been compared against unbounded induction within the same system, in part because most previous depth-bounding models are built around sequence models, the complexity of which grows exponentially with the maximum allowed depth. The present work instead applies depth bounds within a chart-based Bayesian PCFG inducer (Johnson et al., 2007b), where bounding can be switched on and off, and then samples trees with and without bounding. Results show that depth-bounding is indeed significantly effective in limiting the search space of the inducer and thereby increasing the accuracy of the resulting parsing model. Moreover, parsing results on English, Chinese and German show that this bounded model with a new inference technique is able to produce parse trees more accurately than or competitively with state-of-the-art constituency-based grammar induction models.
Tasks
Published 2018-09-10
URL http://arxiv.org/abs/1809.03112v1
PDF http://arxiv.org/pdf/1809.03112v1.pdf
PWC https://paperswithcode.com/paper/depth-bounding-is-effective-improvements-and
Repo https://github.com/lifengjin/dimi_emnlp18
Framework none
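
For context, a chart-based inducer of the kind described above works over CKY-style inside charts. The sketch below computes inside probabilities for a PCFG in Chomsky normal form; depth-bounding can be understood as an additional restriction on which analyses the chart admits, which is not implemented here. The rule-table formats are assumptions.

```python
from collections import defaultdict

def inside_probs(sent, lex_rules, bin_rules, nonterms):
    """Inside (CKY) pass for a PCFG in Chomsky normal form.
    lex_rules[(A, word)] and bin_rules[(A, B, C)] hold rule probabilities;
    chart[(i, j, A)] is the inside probability of A spanning words i..j."""
    n = len(sent)
    chart = defaultdict(float)
    for i, w in enumerate(sent):                       # lexical cells
        for A in nonterms:
            chart[(i, i + 1, A)] = lex_rules.get((A, w), 0.0)
    for span in range(2, n + 1):                       # larger spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in bin_rules.items():
                    chart[(i, j, A)] += p * chart[(i, k, B)] * chart[(k, j, C)]
    return chart
```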

ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network

Title ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network
Authors Sachin Mehta, Mohammad Rastegari, Linda Shapiro, Hannaneh Hajishirzi
Abstract We introduce a light-weight, power efficient, and general purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise dilated separable convolutions to learn representations from a large effective receptive field with fewer FLOPs and parameters. The performance of our network is evaluated on four different tasks: (1) object classification, (2) semantic segmentation, (3) object detection, and (4) language modeling. Experiments on these tasks, including image classification on ImageNet and language modeling on the Penn Treebank dataset, demonstrate the superior performance of our method over the state-of-the-art methods. Our network outperforms ESPNet by 4-5% and has 2-4x fewer FLOPs on the PASCAL VOC and Cityscapes datasets. Compared to YOLOv2 on MS-COCO object detection, ESPNetv2 delivers 4.4% higher accuracy with 6x fewer FLOPs. Our experiments show that ESPNetv2 is much more power efficient than existing state-of-the-art efficient methods including ShuffleNets and MobileNets. Our code is open-source and available at https://github.com/sacmehta/ESPNetv2
Tasks Image Classification, Language Modelling, Object Classification, Object Detection, Real-Time Object Detection, Real-Time Semantic Segmentation, Semantic Segmentation
Published 2018-11-28
URL http://arxiv.org/abs/1811.11431v3
PDF http://arxiv.org/pdf/1811.11431v3.pdf
PWC https://paperswithcode.com/paper/espnetv2-a-light-weight-power-efficient-and
Repo https://github.com/adichaloo/EdgeNet
Framework pytorch
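
A PyTorch sketch of the basic building block named in the abstract, a depth-wise dilated separable convolution (a per-channel dilated 3x3 conv followed by a 1x1 point-wise projection). This is a simplified stand-in, not the full EESP unit from the repository.

```python
import torch
import torch.nn as nn

class DWDilatedSeparableConv(nn.Module):
    """Depth-wise dilated separable convolution: a dilated 3x3 conv applied
    per channel (groups=in_ch) to enlarge the receptive field cheaply,
    followed by a 1x1 point-wise conv to mix channels."""
    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=dilation,
                                   dilation=dilation, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 56, 56)
print(DWDilatedSeparableConv(32, 64)(x).shape)   # torch.Size([1, 64, 56, 56])
```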

Evaluating Scoped Meaning Representations

Title Evaluating Scoped Meaning Representations
Authors Rik van Noord, Lasha Abzianidze, Hessel Haagsma, Johan Bos
Abstract Semantic parsing offers many opportunities to improve natural language understanding. We present a semantically annotated parallel corpus for English, German, Italian, and Dutch where sentences are aligned with scoped meaning representations in order to capture the semantics of negation, modals, quantification, and presupposition triggers. The semantic formalism is based on Discourse Representation Theory, but concepts are represented by WordNet synsets and thematic roles by VerbNet relations. Translating scoped meaning representations to sets of clauses enables us to compare them for the purpose of semantic parser evaluation and checking translations. This is done by computing precision and recall on matching clauses, in a similar way as is done for Abstract Meaning Representations. We show that our matching tool for evaluating scoped meaning representations is both accurate and efficient. Applying this matching tool to three baseline semantic parsers yields F-scores between 43% and 54%. A pilot study is performed to automatically find changes in meaning by comparing meaning representations of translations. This comparison turns out to be an additional way of (i) finding annotation mistakes and (ii) finding instances where our semantic analysis needs to be improved.
Tasks Semantic Parsing
Published 2018-02-23
URL http://arxiv.org/abs/1802.08599v2
PDF http://arxiv.org/pdf/1802.08599v2.pdf
PWC https://paperswithcode.com/paper/evaluating-scoped-meaning-representations
Repo https://github.com/RikVN/DRS_parsing
Framework none
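
The evaluation boils down to precision and recall over matching clauses. A toy version that treats clauses as plain tuples and skips the variable-alignment search the real matching tool performs:

```python
def clause_f1(pred_clauses, gold_clauses):
    """Precision, recall, and F1 over matching clauses, the core of the
    clause-matching evaluation described above (here without the search
    over variable mappings)."""
    pred, gold = set(pred_clauses), set(gold_clauses)
    matched = len(pred & gold)
    p = matched / len(pred) if pred else 0.0
    r = matched / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```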

Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning

Title Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning
Authors Sainbayar Sukhbaatar, Emily Denton, Arthur Szlam, Rob Fergus
Abstract In hierarchical reinforcement learning a major challenge is determining appropriate low-level policies. We propose an unsupervised learning scheme, based on asymmetric self-play from Sukhbaatar et al. (2018), that automatically learns a good representation of sub-goals in the environment and a low-level policy that can execute them. A high-level policy can then direct the lower one by generating a sequence of continuous sub-goal vectors. We evaluate our model using Mazebase and Mujoco environments, including the challenging AntGather task. Visualizations of the sub-goal embeddings reveal a logical decomposition of tasks within the environment. Quantitatively, our approach obtains compelling performance gains over non-hierarchical approaches.
Tasks Hierarchical Reinforcement Learning
Published 2018-11-22
URL http://arxiv.org/abs/1811.09083v1
PDF http://arxiv.org/pdf/1811.09083v1.pdf
PWC https://paperswithcode.com/paper/learning-goal-embeddings-via-self-play-for
Repo https://github.com/tesatory/hsp
Framework pytorch
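
A minimal sketch of the control flow the abstract describes: every k steps the high-level policy emits a continuous sub-goal vector, and the low-level policy conditions on it to choose primitive actions. The policy and environment interfaces (gym-style reset/step) are assumptions, not the authors' code.

```python
def hierarchical_rollout(env, high_policy, low_policy, horizon=200, k=10):
    """Run one episode with a two-level policy: the high level re-plans a
    continuous sub-goal every k steps, the low level acts on (obs, goal)."""
    obs = env.reset()
    total_reward, goal = 0.0, None
    for t in range(horizon):
        if t % k == 0:                       # re-plan a sub-goal
            goal = high_policy(obs)          # continuous goal embedding
        action = low_policy(obs, goal)       # goal-conditioned primitive action
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```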

Diversity-Driven Extensible Hierarchical Reinforcement Learning

Title Diversity-Driven Extensible Hierarchical Reinforcement Learning
Authors Yuhang Song, Jianyi Wang, Thomas Lukasiewicz, Zhenghua Xu, Mai Xu
Abstract Hierarchical reinforcement learning (HRL) has recently shown promising advances in speeding up learning, improving exploration, and discovering inter-task transferable skills. Most recent works focus on HRL with two levels, i.e., a master policy manipulates subpolicies, which in turn manipulate primitive actions. However, HRL with multiple levels is usually needed in many real-world scenarios, whose ultimate goals are highly abstract, while their actions are very primitive. Therefore, in this paper, we propose a diversity-driven extensible HRL (DEHRL), where an extensible and scalable framework is built and learned levelwise to realize HRL with multiple levels. DEHRL follows a popular assumption: diverse subpolicies are useful, i.e., subpolicies are believed to be more useful if they are more diverse. However, existing implementations of this diversity assumption usually have their own drawbacks, which makes them inapplicable to HRL with multiple levels. Consequently, we further propose a novel diversity-driven solution to achieve this assumption in DEHRL. Experimental studies evaluate DEHRL with five baselines from four perspectives in two domains; the results show that DEHRL outperforms the state-of-the-art baselines in all four aspects.
Tasks Hierarchical Reinforcement Learning
Published 2018-11-10
URL http://arxiv.org/abs/1811.04324v2
PDF http://arxiv.org/pdf/1811.04324v2.pdf
PWC https://paperswithcode.com/paper/diversity-driven-extensible-hierarchical
Repo https://github.com/YuhangSong/DEHRL
Framework tf
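
To make the "diverse sub-policies are useful" assumption concrete, here is a generic diversity bonus that rewards each sub-policy in proportion to how different its induced state change is from the others'. This is an illustrative stand-in, not DEHRL's own diversity-driven objective.

```python
import numpy as np

def diversity_bonus(state_deltas):
    """Per-sub-policy diversity bonus: mean pairwise distance between the
    state change each sub-policy produces and those of the other sub-policies.
    `state_deltas` has shape (n_subpolicies, state_dim)."""
    deltas = np.asarray(state_deltas, dtype=float)
    dists = np.linalg.norm(deltas[:, None] - deltas[None, :], axis=-1)
    n = len(deltas)
    return dists.sum(axis=1) / max(n - 1, 1)
```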

Task-Aware Compressed Sensing with Generative Adversarial Networks

Title Task-Aware Compressed Sensing with Generative Adversarial Networks
Authors Maya Kabkab, Pouya Samangouei, Rama Chellappa
Abstract In recent years, neural network approaches have been widely adopted for machine learning tasks, with applications in computer vision. More recently, unsupervised generative models based on neural networks have been successfully applied to model data distributions via low-dimensional latent spaces. In this paper, we use Generative Adversarial Networks (GANs) to impose structure in compressed sensing problems, replacing the usual sparsity constraint. We propose to train the GANs in a task-aware fashion, specifically for reconstruction tasks. We also show that it is possible to train our model without using any (or much) non-compressed data. Finally, we show that the latent space of the GAN carries discriminative information and can further be regularized to generate input features for general inference tasks. We demonstrate the effectiveness of our method on a variety of reconstruction and classification problems.
Tasks
Published 2018-02-05
URL http://arxiv.org/abs/1802.01284v1
PDF http://arxiv.org/pdf/1802.01284v1.pdf
PWC https://paperswithcode.com/paper/task-aware-compressed-sensing-with-generative
Repo https://github.com/po0ya/csgan
Framework tf
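
A hedged PyTorch sketch of GAN-based compressed-sensing recovery of the kind described above: search the latent space for a code whose generated signal matches the measurements, then return the generated signal. G, A, and y are assumed inputs; the task-aware training of G itself is not shown.

```python
import torch

def csgan_reconstruct(G, A, y, z_dim=100, steps=500, lr=0.05):
    """Recover a signal from measurements y = A x (+ noise) using a GAN
    prior: minimize ||A G(z) - y||^2 over the latent code z by gradient
    descent, replacing the usual sparsity constraint with the generator's
    range. G maps a (1, z_dim) code to a signal tensor (shape assumed)."""
    z = torch.zeros(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((A @ G(z).flatten() - y) ** 2).sum()
        loss.backward()
        opt.step()
    return G(z).detach()
```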

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

Title FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
Authors Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer
Abstract Designing accurate and efficient ConvNets for mobile devices is challenging because the design space is combinatorially large. Due to this, previous neural architecture search (NAS) methods are computationally expensive. ConvNet architecture optimality depends on factors such as input resolution and target devices. However, existing approaches are too expensive for case-by-case redesigns. Also, previous work focuses primarily on reducing FLOPs, but FLOP count does not always reflect actual latency. To address these, we propose a differentiable neural architecture search (DNAS) framework that uses gradient-based methods to optimize ConvNet architectures, avoiding enumerating and training individual architectures separately as in previous methods. FBNets, a family of models discovered by DNAS, surpass state-of-the-art models both designed manually and generated automatically. FBNet-B achieves 74.1% top-1 accuracy on ImageNet with 295M FLOPs and 23.1 ms latency on a Samsung S8 phone, 2.4x smaller and 1.5x faster than MobileNetV2-1.3 with similar accuracy. Despite higher accuracy and lower latency than MnasNet, we estimate FBNet-B’s search cost is 420x smaller than MnasNet’s, at only 216 GPU-hours. Searched for different resolutions and channel sizes, FBNets achieve 1.5% to 6.4% higher accuracy than MobileNetV2. The smallest FBNet achieves 50.2% accuracy and 2.9 ms latency (345 frames per second) on a Samsung S8. Over a Samsung-optimized FBNet, the iPhone-X-optimized model achieves a 1.4x speedup on an iPhone X.
Tasks Image Classification, Neural Architecture Search
Published 2018-12-09
URL https://arxiv.org/abs/1812.03443v3
PDF https://arxiv.org/pdf/1812.03443v3.pdf
PWC https://paperswithcode.com/paper/fbnet-hardware-aware-efficient-convnet-design
Repo https://github.com/hpnair/18663_Project_FBNet
Framework pytorch
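
A sketch of the differentiable "mixed op" idea at the heart of DNAS-style search: candidate blocks are weighted by a Gumbel-Softmax sample over architecture logits, and the expected latency of that choice (from a lookup table of measured per-op latencies) is returned so it can be added to the training loss. Names and the latency table are assumptions.

```python
import torch
import torch.nn.functional as F

def dnas_mixed_op(x, ops, theta, latencies, tau=5.0):
    """Differentiable mixed op: soft Gumbel-Softmax weights over candidate
    blocks select a mixture of their outputs, and the same weights give an
    expected latency term for a hardware-aware loss.
    ops: list of nn.Module candidates; theta: architecture logits (n_ops,);
    latencies: tensor of measured per-op latencies (n_ops,)."""
    weights = F.gumbel_softmax(theta, tau=tau)              # (n_ops,)
    out = sum(w * op(x) for w, op in zip(weights, ops))     # mixed feature map
    expected_latency = (weights * latencies).sum()
    return out, expected_latency
```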

Decreasing the size of the Restricted Boltzmann machine

Title Decreasing the size of the Restricted Boltzmann machine
Authors Yohei Saito, Takuya Kato
Abstract We propose a method to decrease the number of hidden units of the restricted Boltzmann machine while avoiding a decrease in performance as measured by the Kullback-Leibler divergence. We then demonstrate our algorithm using numerical simulations.
Tasks
Published 2018-07-09
URL http://arxiv.org/abs/1807.02999v2
PDF http://arxiv.org/pdf/1807.02999v2.pdf
PWC https://paperswithcode.com/paper/decreasing-the-size-of-the-restricted
Repo https://github.com/snsiorssb/RBM
Framework none
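
For toy sizes, the effect of such pruning can be checked exactly: enumerate the visible configurations, compute the RBM marginal p(v) before and after removing hidden units, and measure the KL divergence between the two. A brute-force sketch (parameter shapes and names are assumptions):

```python
import itertools
import numpy as np

def rbm_marginal(W, b, c):
    """Exact p(v) of a small binary RBM by enumeration.
    W: weights (n_visible, n_hidden); b: visible bias; c: hidden bias.
    Uses the free energy F(v) = -b.v - sum_j log(1 + exp(c_j + v.W_j))."""
    nv = W.shape[0]
    vs = np.array(list(itertools.product([0, 1], repeat=nv)))
    free_energy = -(vs @ b) - np.logaddexp(0.0, vs @ W + c).sum(axis=1)
    p = np.exp(-free_energy)
    return vs, p / p.sum()

def kl_after_pruning(W, b, c, keep):
    """KL(p_full || p_pruned) when only the hidden units in `keep` remain."""
    _, p = rbm_marginal(W, b, c)
    _, q = rbm_marginal(W[:, keep], b, c[keep])
    return np.sum(p * np.log(p / q))
```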

SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories

Title SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories
Authors Sweta Karlekar, Mohit Bansal
Abstract With the recent rise of #MeToo, an increasing number of personal stories about sexual harassment and sexual abuse have been shared online. In order to push forward the fight against such harassment and abuse, we present the task of automatically categorizing and analyzing various forms of sexual harassment, based on stories shared on the online forum SafeCity. For the labels of groping, ogling, and commenting, our single-label CNN-RNN model achieves an accuracy of 86.5%, and our multi-label model achieves a Hamming score of 82.5%. Furthermore, we present analysis using LIME, first-derivative saliency heatmaps, activation clustering, and embedding visualization to interpret neural model predictions and demonstrate how this extracts features that can help automatically fill out incident reports, identify unsafe areas, avoid unsafe practices, and ‘pin the creeps’.
Tasks
Published 2018-09-13
URL http://arxiv.org/abs/1809.04739v2
PDF http://arxiv.org/pdf/1809.04739v2.pdf
PWC https://paperswithcode.com/paper/safecity-understanding-diverse-forms-of
Repo https://github.com/Shubhammawa/ML-DL-papers-articles
Framework tf
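
The multi-label result above is reported as a Hamming score. One common definition, the per-example Jaccard overlap of true and predicted label sets averaged over examples, is sketched below; treating this as the exact metric used is an assumption.

```python
import numpy as np

def hamming_score(y_true, y_pred):
    """Multi-label Hamming score: for each example, |intersection| / |union|
    of the true and predicted label sets, averaged over examples.
    Inputs are binary indicator arrays of shape (n_examples, n_labels)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    inter = (y_true & y_pred).sum(axis=1)
    union = (y_true | y_pred).sum(axis=1)
    return float(np.mean(np.where(union == 0, 1.0,
                                  inter / np.maximum(union, 1))))
```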

ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning

Title ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning
Authors Maarten Sap, Ronan LeBras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, Hannah Rashkin, Brendan Roof, Noah A. Smith, Yejin Choi
Abstract We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge. Compared to existing resources that center around taxonomic knowledge, ATOMIC focuses on inferential knowledge organized as typed if-then relations with variables (e.g., “if X pays Y a compliment, then Y will likely return the compliment”). We propose nine if-then relation types to distinguish causes vs. effects, agents vs. themes, voluntary vs. involuntary events, and actions vs. mental states. By generatively training on the rich inferential knowledge described in ATOMIC, we show that neural models can acquire simple commonsense capabilities and reason about previously unseen events. Experimental results demonstrate that multitask models that incorporate the hierarchical structure of if-then relation types lead to more accurate inference compared to models trained in isolation, as measured by both automatic and human evaluation.
Tasks
Published 2018-10-31
URL http://arxiv.org/abs/1811.00146v3
PDF http://arxiv.org/pdf/1811.00146v3.pdf
PWC https://paperswithcode.com/paper/atomic-an-atlas-of-machine-commonsense-for-if
Repo https://github.com/shengyp/Temporal-and-Evolving-KG
Framework none
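
A toy illustration of the if-then structure described above: (event, relation, inference) triples over the nine relation types, indexed for lookup. The example rows paraphrase the compliment example from the abstract and are illustrative only.

```python
from collections import defaultdict

# ATOMIC-style if-then triples over the nine relation types
# (xIntent, xNeed, xAttr, xEffect, xWant, xReact, oEffect, oWant, oReact).
triples = [
    ("PersonX pays PersonY a compliment", "xIntent", "to be nice"),
    ("PersonX pays PersonY a compliment", "oReact", "flattered"),
    ("PersonX pays PersonY a compliment", "oWant", "to return the compliment"),
]

index = defaultdict(list)
for event, relation, inference in triples:
    index[(event, relation)].append(inference)

print(index[("PersonX pays PersonY a compliment", "oWant")])
# ['to return the compliment']
```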

Classification Using Link Prediction

Title Classification Using Link Prediction
Authors Seyed Amin Fadaee, Maryam Amir Haeri
Abstract Link prediction in a graph is the problem of detecting the missing links that would be formed in the near future. Using a graph representation of the data, we can convert the problem of classification to the problem of link prediction which aims at finding the missing links between the unlabeled data (unlabeled nodes) and their classes. To our knowledge, despite the fact that numerous algorithms use the graph representation of the data for classification, none are using link prediction as the heart of their classifying procedure. In this work, we propose a novel algorithm called CULP (Classification Using Link Prediction) which uses a new structure, namely the Label Embedded Graph or LEG, and a link predictor to find the class of the unlabeled data. Different link predictors, along with Compatibility Score - a new link predictor we propose that is designed specifically for our setting - have been used and show promising results for classifying different datasets. This paper further improves CULP by designing an extension called CULM which uses a majority vote (hence the M in the acronym) procedure with weights proportional to the predictions’ confidences to use the predictive power of multiple link predictors and also exploits the low-level features of the data. Extensive experimental evaluations show that both CULP and CULM are highly accurate and competitive with cutting-edge graph classifiers and general classifiers.
Tasks Link Prediction
Published 2018-10-01
URL http://arxiv.org/abs/1810.00717v1
PDF http://arxiv.org/pdf/1810.00717v1.pdf
PWC https://paperswithcode.com/paper/classification-using-link-prediction
Repo https://github.com/aminfadaee/culp
Framework none
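
A toy rendering of the CULP idea: build a Label Embedded Graph (data nodes linked to their k nearest neighbours, class nodes linked to the points carrying that label) and classify each unlabeled point by a common-neighbours link-prediction score to every class node. A sketch under those assumptions, not the paper's full algorithm or its Compatibility Score.

```python
import numpy as np

def culp_predict(X_labeled, y_labeled, X_unlabeled, k=3):
    """Classify unlabeled points via link prediction on a Label Embedded
    Graph: the score of linking an unlabeled node to a class node is the
    number of common neighbours (here, shared k-NN data nodes)."""
    X = np.vstack([X_labeled, X_unlabeled])
    n_lab = len(X_labeled)
    # k-nearest-neighbour adjacency over all data nodes
    dists = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)
    neighbours = [set(np.argsort(d)[:k]) for d in dists]
    # class nodes: each is linked to the labeled nodes carrying that label
    classes = {c: {i for i in range(n_lab) if y_labeled[i] == c}
               for c in set(y_labeled)}
    preds = []
    for u in range(n_lab, len(X)):
        scores = {c: len(neighbours[u] & members)
                  for c, members in classes.items()}
        preds.append(max(scores, key=scores.get))
    return preds
```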

Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds

Title Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds
Authors Pedro Hermosilla, Tobias Ritschel, Pere-Pau Vázquez, Àlvar Vinacua, Timo Ropinski
Abstract Deep learning systems extensively use convolution operations to process input data. Though convolution is clearly defined for structured data such as 2D images or 3D volumes, this is not true for other data types such as sparse point clouds. Previous techniques have developed approximations to convolutions for restricted conditions. Unfortunately, their applicability is limited and cannot be used for general point clouds. We propose an efficient and effective method to learn convolutions for non-uniformly sampled point clouds, as they are obtained with modern acquisition techniques. Learning is enabled by four key novelties: first, representing the convolution kernel itself as a multilayer perceptron; second, phrasing convolution as a Monte Carlo integration problem; third, using this notion to combine information from multiple samplings at different levels; and fourth, using Poisson disk sampling as a scalable means of hierarchical point cloud learning. The key idea across all these contributions is to guarantee adequate consideration of the underlying non-uniform sample distribution function from a Monte Carlo perspective. To make the proposed concepts applicable to real-world tasks, we furthermore propose an efficient implementation which significantly reduces the GPU memory required during the training process. By employing our method in hierarchical network architectures we can outperform most of the state-of-the-art networks on established point cloud segmentation, classification and normal estimation benchmarks. Furthermore, in contrast to most existing approaches, we also demonstrate the robustness of our method with respect to sampling variations, even when training with uniformly sampled data only. To support the direct application of these concepts, we provide a ready-to-use TensorFlow implementation of these layers at https://github.com/viscom-ulm/MCCNN
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01759v2
PDF http://arxiv.org/pdf/1806.01759v2.pdf
PWC https://paperswithcode.com/paper/monte-carlo-convolution-for-learning-on-non
Repo https://github.com/viscom-ulm/MCCNN
Framework tf
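
A NumPy sketch of the central idea: treat the point convolution as a Monte Carlo integral, with the kernel given by a learned function of the normalized offset and each neighbour's contribution divided by a local density estimate so that non-uniform sampling does not bias the result. `kernel_mlp` and `density` are assumed to be provided and are not the paper's exact formulation.

```python
import numpy as np

def mc_point_conv(points, feats, radius, kernel_mlp, density):
    """Monte Carlo point-cloud convolution sketch.
    points: (n, 3) positions; feats: (n, f) per-point features;
    kernel_mlp(offsets): maps (m, 3) normalized offsets to (m,) kernel weights;
    density: (n,) local sample-density estimates used as importance weights."""
    n = len(points)
    out = np.zeros((n, feats.shape[1]))
    for i in range(n):
        offsets = (points - points[i]) / radius
        mask = np.linalg.norm(offsets, axis=1) <= 1.0       # neighbourhood
        w = kernel_mlp(offsets[mask]) / density[mask]       # MC importance weights
        out[i] = (w[:, None] * feats[mask]).sum(axis=0) / mask.sum()
    return out
```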