July 28, 2019

2862 words 14 mins read

Paper Group ANR 378

Learning Network Structures from Contagion. Revisiting Activation Regularization for Language RNNs. Distral: Robust Multitask Reinforcement Learning. Concurrent Pump Scheduling and Storage Level Optimization Using Meta-Models and Evolutionary Algorithms. Open Loop Hyperparameter Optimization and Determinantal Point Processes. Representation Learnin …

Learning Network Structures from Contagion

Title Learning Network Structures from Contagion
Authors Adisak Supeesun, Jittat Fakcharoenphol
Abstract In 2014, Amin, Heidari, and Kearns proved that tree networks can be learned by observing only the infected set of vertices of the contagion process under the independent cascade model, in both the active and passive query models. They also showed empirically that simple extensions of their algorithms work on sparse networks. In this work, we focus on the active model. We prove that a simple modification of Amin et al.’s algorithm works on more general classes of networks, namely (i) networks with large girth and low path growth rate, and (ii) networks with bounded degree. This also provides partial theoretical explanation for Amin et al.’s experiments on sparse networks.
Tasks
Published 2017-05-29
URL http://arxiv.org/abs/1705.10051v1
PDF http://arxiv.org/pdf/1705.10051v1.pdf
PWC https://paperswithcode.com/paper/learning-network-structures-from-contagion
Repo
Framework
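The independent cascade model that supplies the learner's only observations is easy to simulate. Below is a minimal sketch of that observation model (not the authors' reconstruction algorithm); the graph, seed set, and activation probability are all illustrative:

```python
import random

def independent_cascade(adj, seeds, p=0.3, rng=random):
    """Simulate one independent-cascade contagion and return the infected set.

    adj   -- dict mapping each vertex to a list of its neighbors
    seeds -- initially infected vertices
    p     -- probability that a newly infected vertex activates each neighbor
    """
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                # each edge gets exactly one activation attempt
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    nxt.append(v)
        frontier = nxt
    return infected

# Example: a small tree; the learner only ever sees sets like this one.
tree = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(independent_cascade(tree, seeds={0}))
```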

Revisiting Activation Regularization for Language RNNs

Title Revisiting Activation Regularization for Language RNNs
Authors Stephen Merity, Bryan McCann, Richard Socher
Abstract Recurrent neural networks (RNNs) serve as a fundamental building block for many sequence tasks across natural language processing. Recent research has focused on recurrent dropout techniques or custom RNN cells in order to improve performance. Both of these can require substantial modifications to the machine learning model or to the underlying RNN configurations. We revisit traditional regularization techniques, specifically L2 regularization on RNN activations and slowness regularization over successive hidden states, to improve the performance of RNNs on the task of language modeling. Both of these techniques require minimal modification to existing RNN architectures and result in performance improvements comparable or superior to more complicated regularization techniques or custom cell architectures. These regularization techniques can be used without any modification on optimized LSTM implementations such as the NVIDIA cuDNN LSTM.
Tasks L2 Regularization, Language Modelling
Published 2017-08-03
URL http://arxiv.org/abs/1708.01009v1
PDF http://arxiv.org/pdf/1708.01009v1.pdf
PWC https://paperswithcode.com/paper/revisiting-activation-regularization-for
Repo
Framework
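Both regularizers amount to a single extra penalty term added to the training loss. A minimal PyTorch sketch, with illustrative coefficients (the paper applies AR to the dropout-masked activations, which is omitted here for brevity):

```python
import torch

def activation_regularization(hidden, alpha=2.0):
    # AR: L2 penalty on the RNN's output activations
    return alpha * hidden.pow(2).mean()

def temporal_activation_regularization(hidden, beta=1.0):
    # TAR: slowness penalty on differences between successive hidden states
    return beta * (hidden[1:] - hidden[:-1]).pow(2).mean()

# hidden: (seq_len, batch, hidden_size) output of any RNN, e.g. a cuDNN LSTM
hidden = torch.randn(35, 20, 650, requires_grad=True)
task_loss = torch.tensor(0.0)   # placeholder for the language-modeling loss
loss = (task_loss
        + activation_regularization(hidden)
        + temporal_activation_regularization(hidden))
loss.backward()
```

Because both terms act only on the RNN's output tensor, they need no change to the cell internals, which is why they work unmodified on fused implementations.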

Distral: Robust Multitask Reinforcement Learning

Title Distral: Robust Multitask Reinforcement Learning
Authors Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu
Abstract Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (Distill & transfer learning). Instead of sharing parameters between the different workers, we propose to share a “distilled” policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable—attributes that are critical in deep reinforcement learning.
Tasks Transfer Learning
Published 2017-07-13
URL http://arxiv.org/abs/1707.04175v1
PDF http://arxiv.org/pdf/1707.04175v1.pdf
PWC https://paperswithcode.com/paper/distral-robust-multitask-reinforcement
Repo
Framework
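The heart of the method is a KL term pulling each task policy toward the shared distilled policy. A schematic sketch of the per-task regularized loss follows; the function name, coefficients, and discrete-action setting are illustrative, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def distral_policy_loss(task_logits, distilled_logits, advantages, actions,
                        c_kl=0.5, c_ent=0.1):
    """Schematic per-task Distral-style loss: a policy-gradient term plus a
    KL pull toward the shared distilled policy and an entropy bonus.
    Coefficients here are illustrative, not the paper's settings."""
    logp = F.log_softmax(task_logits, dim=-1)
    pg = -(advantages * logp.gather(-1, actions.unsqueeze(-1)).squeeze(-1)).mean()
    # KL(task || distilled): keeps the worker close to the shared policy
    kl = F.kl_div(F.log_softmax(distilled_logits, dim=-1),
                  logp.exp(), reduction='batchmean')
    entropy = -(logp.exp() * logp).sum(-1).mean()
    return pg + c_kl * kl - c_ent * entropy

B, A = 32, 6
task_logits = torch.randn(B, A, requires_grad=True)
loss = distral_policy_loss(task_logits, torch.randn(B, A),
                           torch.randn(B), torch.randint(0, A, (B,)))
loss.backward()
```

The same KL terms, optimized with respect to the distilled policy's parameters, are what train the shared policy toward the centroid of the task policies.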

Concurrent Pump Scheduling and Storage Level Optimization Using Meta-Models and Evolutionary Algorithms

Title Concurrent Pump Scheduling and Storage Level Optimization Using Meta-Models and Evolutionary Algorithms
Authors Morad Behandish, Zheng Yi Wu
Abstract In spite of the growing computational power offered by commodity hardware, fast pump scheduling of complex water distribution systems is still a challenge. In this paper, the Artificial Neural Network (ANN) meta-modeling technique is employed with a Genetic Algorithm (GA) to simultaneously optimize pump operation and tank levels at the ends of the cycle. The generalized GA+ANN algorithm has been tested on a real system in the UK. Compared to the existing operation, the daily cost is reduced by about 10-15%, while the number of pump switches is kept below 4 per day. In addition, tank levels are optimized to ensure periodic behavior, which results in predictable and stable performance over repeated cycles.
Tasks
Published 2017-11-14
URL http://arxiv.org/abs/1711.04988v1
PDF http://arxiv.org/pdf/1711.04988v1.pdf
PWC https://paperswithcode.com/paper/concurrent-pump-scheduling-and-storage-level
Repo
Framework
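The meta-modeling pattern here is generic: train a cheap surrogate of the expensive hydraulic simulator, then let the GA evaluate candidates against the surrogate. A toy sketch under invented names (the real decision encoding, constraints, and ANN are not given in the abstract):

```python
import numpy as np

def ga_with_surrogate(surrogate_cost, n_vars, pop=60, gens=200, rng=None):
    """Toy real-coded GA minimizing a surrogate of the expensive simulator.
    Decision vector: pump settings per period plus end-of-cycle tank levels."""
    rng = np.random.default_rng(rng)
    P = rng.random((pop, n_vars))
    for _ in range(gens):
        cost = np.array([surrogate_cost(x) for x in P])
        elite = P[np.argsort(cost)[: pop // 2]]            # truncation selection
        pairs = elite[rng.integers(len(elite), size=(pop - len(elite), 2))]
        kids = pairs.mean(axis=1)                          # arithmetic crossover
        kids += rng.normal(0, 0.05, kids.shape)            # Gaussian mutation
        P = np.clip(np.vstack([elite, kids]), 0, 1)
    return P[np.argmin([surrogate_cost(x) for x in P])]

# Stand-in for an ANN trained on simulator runs: energy cost plus a penalty
# for tank levels that fail to return to their start (the periodicity goal).
best = ga_with_surrogate(lambda x: x[:20].sum() + 5 * abs(x[20:].mean() - 0.5),
                         n_vars=24)
```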

Open Loop Hyperparameter Optimization and Determinantal Point Processes

Title Open Loop Hyperparameter Optimization and Determinantal Point Processes
Authors Jesse Dodge, Kevin Jamieson, Noah A. Smith
Abstract Driven by the need for parallelizable hyperparameter optimization methods, this paper studies \emph{open loop} search methods: sequences that are predetermined and can be generated before a single configuration is evaluated. Examples include grid search, uniform random search, low discrepancy sequences, and other sampling distributions. In particular, we propose the use of $k$-determinantal point processes in hyperparameter optimization via random search. Compared to conventional uniform random search where hyperparameter settings are sampled independently, a $k$-DPP promotes diversity. We describe an approach that transforms hyperparameter search spaces for efficient use with a $k$-DPP. In addition, we introduce a novel Metropolis-Hastings algorithm which can sample from $k$-DPPs defined over any space from which uniform samples can be drawn, including spaces with a mixture of discrete and continuous dimensions or tree structure. Our experiments show significant benefits in realistic scenarios with a limited budget for training supervised learners, whether in serial or parallel.
Tasks Hyperparameter Optimization, Point Processes
Published 2017-06-06
URL https://arxiv.org/abs/1706.01566v4
PDF https://arxiv.org/pdf/1706.01566v4.pdf
PWC https://paperswithcode.com/paper/open-loop-hyperparameter-optimization-and
Repo
Framework
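The Metropolis-Hastings construction for k-DPPs admits a compact sketch: propose swapping one selected point for one point outside the set and accept with the ratio of kernel-submatrix determinants. The kernel choice and step count below are illustrative:

```python
import numpy as np

def kdpp_mh_sample(L, k, n_steps=5000, rng=None):
    """Metropolis-Hastings sampler for a k-DPP with kernel L.
    State: a size-k subset Y; proposal: swap one member for one non-member;
    acceptance ratio: det(L_Y') / det(L_Y)."""
    rng = np.random.default_rng(rng)
    n = L.shape[0]
    Y = list(rng.choice(n, size=k, replace=False))
    _, logdet = np.linalg.slogdet(L[np.ix_(Y, Y)])
    for _ in range(n_steps):
        i = rng.integers(k)
        outside = rng.choice(list(set(range(n)) - set(Y)))
        Y_new = Y[:i] + [outside] + Y[i + 1:]
        s_new, logdet_new = np.linalg.slogdet(L[np.ix_(Y_new, Y_new)])
        if s_new > 0 and np.log(rng.random()) < logdet_new - logdet:
            Y, logdet = Y_new, logdet_new
    return Y

# Diverse subset of 8 hyperparameter configurations from 200 uniform samples
X = np.random.default_rng(0).random((200, 3))   # e.g. 3 rescaled hyperparameters
D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
L = np.exp(-5.0 * D2)                           # RBF similarity kernel
print(sorted(kdpp_mh_sample(L, k=8)))
```

Because the proposal only needs uniform draws from the ground set, the same chain runs over mixed discrete/continuous spaces, which is the flexibility the paper exploits.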

Representation Learning and Pairwise Ranking for Implicit Feedback in Recommendation Systems

Title Representation Learning and Pairwise Ranking for Implicit Feedback in Recommendation Systems
Authors Sumit Sidana, Mikhail Trofimov, Oleg Horodnitskii, Charlotte Laclau, Yury Maximov, Massih-Reza Amini
Abstract In this paper, we propose a novel ranking framework for collaborative filtering with the overall aim of learning user preferences over items by minimizing a pairwise ranking loss. We show the minimization problem involves dependent random variables and provide a theoretical analysis by proving the consistency of the empirical risk minimization in the worst case where all users choose a minimal number of positive and negative items. We further derive a Neural-Network model that jointly learns a new representation of users and items in an embedded space as well as the preference relation of users over the pairs of items. The learning objective is based on three scenarios of ranking losses that control the ability of the model to maintain the ordering over the items induced from the users’ preferences, as well as the capacity of the dot-product defined in the learned embedded space to produce the ordering. The proposed model is by nature suitable for implicit feedback and involves the estimation of only very few parameters. Through extensive experiments on several real-world benchmarks on implicit data, we show the interest of learning the preference and the embedding simultaneously when compared to learning those separately. We also demonstrate that our approach is very competitive with the best state-of-the-art collaborative filtering techniques proposed for implicit feedback.
Tasks Recommendation Systems, Representation Learning
Published 2017-04-29
URL http://arxiv.org/abs/1705.00105v4
PDF http://arxiv.org/pdf/1705.00105v4.pdf
PWC https://paperswithcode.com/paper/representation-learning-and-pairwise-ranking
Repo
Framework
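One natural instantiation of a pairwise ranking loss over a learned dot-product space is the logistic (BPR-style) form; a minimal PyTorch sketch, with embedding sizes and the negative-sampling scheme invented for illustration (not necessarily the paper's exact losses):

```python
import torch
import torch.nn as nn

class PairwiseRanker(nn.Module):
    """Users and items share an embedded space; the dot product scores items,
    and a logistic pairwise loss orders positives above sampled negatives."""
    def __init__(self, n_users, n_items, dim=32):
        super().__init__()
        self.U = nn.Embedding(n_users, dim)
        self.V = nn.Embedding(n_items, dim)

    def forward(self, u, i_pos, i_neg):
        s_pos = (self.U(u) * self.V(i_pos)).sum(-1)
        s_neg = (self.U(u) * self.V(i_neg)).sum(-1)
        # softplus(s_neg - s_pos) = -log sigmoid(s_pos - s_neg)
        return nn.functional.softplus(s_neg - s_pos).mean()

model = PairwiseRanker(n_users=1000, n_items=5000)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
u = torch.randint(0, 1000, (256,))
i_pos = torch.randint(0, 5000, (256,))   # items with implicit feedback
i_neg = torch.randint(0, 5000, (256,))   # sampled unobserved items
loss = model(u, i_pos, i_neg)
loss.backward(); opt.step()
```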

Applying MAPP Algorithm for Cooperative Path Finding in Urban Environments

Title Applying MAPP Algorithm for Cooperative Path Finding in Urban Environments
Authors Anton Andreychuk, Konstantin Yakovlev
Abstract The paper considers the problem of planning a set of conflict-free trajectories for a coalition of intelligent agents (mobile robots). Two divergent approaches, centralized and decentralized, are surveyed and analyzed. A decentralized planner, MAPP, is described and applied to the task of finding trajectories for dozens of UAVs performing nap-of-the-earth flight in urban environments. Results of the experimental studies support the claim that MAPP is a highly efficient planner for the considered types of tasks.
Tasks
Published 2017-07-20
URL http://arxiv.org/abs/1707.06607v1
PDF http://arxiv.org/pdf/1707.06607v1.pdf
PWC https://paperswithcode.com/paper/applying-mapp-algorithm-for-cooperative-path
Repo
Framework

Multiple-Source Adaptation for Regression Problems

Title Multiple-Source Adaptation for Regression Problems
Authors Judy Hoffman, Mehryar Mohri, Ningshan Zhang
Abstract We present a detailed theoretical analysis of the problem of multiple-source adaptation in the general stochastic scenario, extending known results that assume a single target labeling function. Our results cover a more realistic scenario and show the existence of a single robust predictor accurate for \emph{any} target mixture of the source distributions. Moreover, we present an efficient and practical optimization solution to determine the robust predictor in the important case of squared loss, by casting the problem as an instance of DC-programming. We report the results of experiments with both an artificial task and a sentiment analysis task. We find that our algorithm outperforms competing approaches by producing a single robust model that performs well on any target mixture distribution.
Tasks Sentiment Analysis
Published 2017-11-14
URL http://arxiv.org/abs/1711.05037v1
PDF http://arxiv.org/pdf/1711.05037v1.pdf
PWC https://paperswithcode.com/paper/multiple-source-adaptation-for-regression
Repo
Framework
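The robust predictor in this line of work is a distribution-weighted combination of the source predictors: each source is trusted where its distribution places mass. A sketch of that combining rule, assuming density estimates are available; the DC-programming step that actually chooses the weights z is omitted:

```python
import numpy as np

def distribution_weighted_predictor(x, predictors, densities, z):
    """h_z(x) = sum_k z_k D_k(x) h_k(x) / sum_k z_k D_k(x).
    `z` is the mixture weight vector found by the optimization (given here)."""
    d = np.array([z_k * D(x) for z_k, D in zip(z, densities)])
    h = np.array([h_k(x) for h_k in predictors])
    return (d * h).sum() / d.sum()

# Two sources: trust h_0 near x=0 and h_1 near x=1
pred = distribution_weighted_predictor(
    0.3,
    predictors=[lambda x: 2 * x, lambda x: x + 1],
    densities=[lambda x: np.exp(-x ** 2), lambda x: np.exp(-(x - 1) ** 2)],
    z=[0.5, 0.5])
```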

Active classification with comparison queries

Title Active classification with comparison queries
Authors Daniel M. Kane, Shachar Lovett, Shay Moran, Jiapeng Zhang
Abstract We study an extension of active learning in which the learning algorithm may ask the annotator to compare the distances of two examples from the boundary of their label-class. For example, in a recommendation system application (say for restaurants), the annotator may be asked whether she liked or disliked a specific restaurant (a label query); or which one of two restaurants she liked more (a comparison query). We focus on the class of half spaces, and show that under natural assumptions, such as large margin or bounded bit-description of the input examples, it is possible to reveal all the labels of a sample of size $n$ using approximately $O(\log n)$ queries. This implies an exponential improvement over classical active learning, where only label queries are allowed. We complement these results by showing that if any of these assumptions is removed then, in the worst case, $\Omega(n)$ queries are required. Our results follow from a new general framework of active learning with additional queries. We identify a combinatorial dimension, called the \emph{inference dimension}, that captures the query complexity when each additional query is determined by $O(1)$ examples (such as comparison queries, each of which is determined by the two compared examples). Our results for half spaces follow by bounding the inference dimension in the cases discussed above.
Tasks Active Learning
Published 2017-04-11
URL http://arxiv.org/abs/1704.03564v2
PDF http://arxiv.org/pdf/1704.03564v2.pdf
PWC https://paperswithcode.com/paper/active-classification-with-comparison-queries
Repo
Framework
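The one-dimensional intuition behind the $O(\log n)$ bound: once points are ordered along the normal direction of the halfspace, a binary search with label queries pins down the boundary and every remaining label is inferred for free. A toy sketch of that step only (in higher dimensions, comparison queries are what supply such an ordering):

```python
def infer_labels_sorted(points, label_query):
    """points: sorted along the normal direction of a (1-D) halfspace.
    Locates the boundary with O(log n) label queries, then infers all labels."""
    left = label_query(points[0])
    if label_query(points[-1]) == left:
        return [left] * len(points)          # all one class
    lo, hi = 0, len(points)                  # invariant: boundary in (lo, hi)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if label_query(points[mid]) == left:
            lo = mid
        else:
            hi = mid
    return [left] * hi + [1 - left] * (len(points) - hi)

labels = infer_labels_sorted(sorted([-3, -1, 0.5, 2, 4]),
                             label_query=lambda x: int(x > 1.2))
print(labels)   # [0, 0, 0, 1, 1] after ~log n queries
```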

Be Careful What You Backpropagate: A Case For Linear Output Activations & Gradient Boosting

Title Be Careful What You Backpropagate: A Case For Linear Output Activations & Gradient Boosting
Authors Anders Oland, Aayush Bansal, Roger B. Dannenberg, Bhiksha Raj
Abstract In this work, we show that saturating output activation functions, such as the softmax, impede learning on a number of standard classification tasks. Moreover, we present results showing that the utility of softmax does not stem from the normalization, as some have speculated. In fact, the normalization makes things worse. Rather, the advantage is in the exponentiation of error gradients. This exponential gradient boosting is shown to speed up convergence and improve generalization. To this end, we demonstrate faster convergence and better performance on diverse classification tasks: image classification using CIFAR-10 and ImageNet, and semantic segmentation using PASCAL VOC 2012. In the latter case, using the state-of-the-art neural network architecture, the model converged 33% faster with our method (roughly two days of training less) than with the standard softmax activation, and with a slightly better performance to boot.
Tasks Image Classification, Semantic Segmentation
Published 2017-07-13
URL http://arxiv.org/abs/1707.04199v1
PDF http://arxiv.org/pdf/1707.04199v1.pdf
PWC https://paperswithcode.com/paper/be-careful-what-you-backpropagate-a-case-for
Repo
Framework

Estimating the error variance in a high-dimensional linear model

Title Estimating the error variance in a high-dimensional linear model
Authors Guo Yu, Jacob Bien
Abstract The lasso has been studied extensively as a tool for estimating the coefficient vector in the high-dimensional linear model; however, considerably less is known about estimating the error variance in this context. In this paper, we propose the natural lasso estimator for the error variance, which maximizes a penalized likelihood objective. A key aspect of the natural lasso is that the likelihood is expressed in terms of the natural parameterization of the multiparameter exponential family of a Gaussian with unknown mean and variance. The result is a remarkably simple estimator of the error variance with provably good performance in terms of mean squared error. These theoretical results do not require placing any assumptions on the design matrix or the true regression coefficients. We also propose a companion estimator, called the organic lasso, which theoretically does not require tuning of the regularization parameter. Both estimators do well empirically compared to preexisting methods, especially in settings where successful recovery of the true support of the coefficient vector is hard. Finally, we show that existing methods can do well under fewer assumptions than previously known, thus providing a fuller story about the problem of estimating the error variance in high-dimensional linear models.
Tasks
Published 2017-12-06
URL https://arxiv.org/abs/1712.02412v3
PDF https://arxiv.org/pdf/1712.02412v3.pdf
PWC https://paperswithcode.com/paper/estimating-the-error-variance-in-a-high
Repo
Framework
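As I recall the paper, the natural lasso's variance estimate is remarkably simple: the optimal value of a (rescaled) lasso objective itself. A sketch using scikit-learn, treating that identity as an assumption to verify against the paper:

```python
import numpy as np
from sklearn.linear_model import Lasso

def natural_lasso_variance(X, y, lam):
    """Assumed identity: sigma^2_hat = min_b (1/n)||y - Xb||^2 + 2*lam*||b||_1.
    sklearn's Lasso minimizes (1/(2n))||y - Xb||^2 + alpha*||b||_1, which has
    the same minimizer when alpha = lam, so we can reuse its solution."""
    n = X.shape[0]
    fit = Lasso(alpha=lam, fit_intercept=False).fit(X, y)
    resid = y - X @ fit.coef_
    return resid @ resid / n + 2 * lam * np.abs(fit.coef_).sum()
```

Notably, this needs no assumptions on the design matrix, in line with the theory described above.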

CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation

Title CEVO: Comprehensive EVent Ontology Enhancing Cognitive Annotation
Authors Saeedeh Shekarpour, Faisal Alshargi, Valerie Shalin, Krishnaprasad Thirunarayan, Amit P. Sheth
Abstract While the general analysis of named entities has received substantial research attention on unstructured as well as structured data, the analysis of relations among named entities has received limited focus. In fact, a review of the literature revealed a deficiency in research on the abstract conceptualization required to organize relations. We believe that such an abstract conceptualization can benefit various communities and applications such as natural language processing, information extraction, machine learning, and ontology engineering. In this paper, we present Comprehensive EVent Ontology (CEVO), built on Levin’s conceptual hierarchy of English verbs that categorizes verbs with shared meaning, and syntactic behavior. We present the fundamental concepts and requirements for this ontology. Furthermore, we present three use cases employing the CEVO ontology on annotation tasks: (i) annotating relations in plain text, (ii) annotating ontological properties, and (iii) linking textual relations to ontological properties. These use-cases demonstrate the benefits of using CEVO for annotation: (i) annotating English verbs from an abstract conceptualization, (ii) playing the role of an upper ontology for organizing ontological properties, and (iii) facilitating the annotation of text relations using any underlying vocabulary. This resource is available at https://shekarpour.github.io/cevo.io/ using https://w3id.org/cevo namespace.
Tasks
Published 2017-01-19
URL http://arxiv.org/abs/1701.05625v2
PDF http://arxiv.org/pdf/1701.05625v2.pdf
PWC https://paperswithcode.com/paper/cevo-comprehensive-event-ontology-enhancing
Repo
Framework

Balancing Interpretability and Predictive Accuracy for Unsupervised Tensor Mining

Title Balancing Interpretability and Predictive Accuracy for Unsupervised Tensor Mining
Authors Ishmam Zabir, Evangelos E. Papalexakis
Abstract The PARAFAC tensor decomposition has enjoyed an increasing success in exploratory multi-aspect data mining scenarios. A major challenge remains the estimation of the number of latent factors (i.e., the rank) of the decomposition, which yields high-quality, interpretable results. Previously, we have proposed an automated tensor mining method which leverages a well-known quality heuristic from the field of Chemometrics, the Core Consistency Diagnostic (CORCONDIA), in order to automatically determine the rank for the PARAFAC decomposition. In this work we set out to explore the trade-off between 1) the interpretability/quality of the results (as expressed by CORCONDIA), and 2) the predictive accuracy of the results, in order to further improve the rank estimation quality. Our preliminary results indicate that striking a good balance in that trade-off benefits rank estimation.
Tasks
Published 2017-09-04
URL http://arxiv.org/abs/1709.01147v1
PDF http://arxiv.org/pdf/1709.01147v1.pdf
PWC https://paperswithcode.com/paper/balancing-interpretability-and-predictive
Repo
Framework
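CORCONDIA fits an unconstrained core to the estimated factors and measures how far it sits from the superdiagonal (identity) core that the PARAFAC model implies. A sketch with tensorly, assuming a third-order tensor and unit weights:

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

def corcondia(T, rank):
    """Core Consistency Diagnostic: near 100 means the rank-`rank` CP model's
    implicit identity core explains the data; low or negative values suggest
    the chosen rank is too high."""
    weights, factors = parafac(tl.tensor(T), rank=rank, normalize_factors=False)
    # Least-squares core given the factors: G = T x1 A^+ x2 B^+ x3 C^+
    G = np.asarray(tl.tenalg.multi_mode_dot(
        tl.tensor(T), [np.linalg.pinv(f) for f in factors]))
    ideal = np.zeros((rank,) * T.ndim)
    for i in range(rank):
        ideal[(i,) * T.ndim] = 1.0           # superdiagonal core
    return 100 * (1 - ((G - ideal) ** 2).sum() / rank)

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((10, 3)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A, B, C)
print(corcondia(T, rank=3))   # near 100 at the true rank
```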

Learning how to Active Learn: A Deep Reinforcement Learning Approach

Title Learning how to Active Learn: A Deep Reinforcement Learning Approach
Authors Meng Fang, Yuan Li, Trevor Cohn
Abstract Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. This is usually done using heuristic selection methods, however the effectiveness of such methods is limited and moreover, the performance of heuristics varies between datasets. To address these shortcomings, we introduce a novel formulation by reframing the active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes the role of the active learning heuristic. Importantly, our method allows the selection policy learned using simulation on one language to be transferred to other languages. We demonstrate our method using cross-lingual named entity recognition, observing uniform improvements over traditional active learning.
Tasks Active Learning, Named Entity Recognition
Published 2017-08-08
URL http://arxiv.org/abs/1708.02383v1
PDF http://arxiv.org/pdf/1708.02383v1.pdf
PWC https://paperswithcode.com/paper/learning-how-to-active-learn-a-deep
Repo
Framework
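Schematically, the heuristic becomes a trainable policy that scores candidate points and is updated by policy gradient on the classifier's held-out improvement. A toy REINFORCE sketch with invented state features and reward (the paper's actual state, Q-learning setup, and cross-lingual transfer are richer):

```python
import numpy as np
import torch
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
pool, val = np.arange(0, 400), np.arange(400, 600)
policy = torch.nn.Linear(10, 1)          # scores candidates; state = features
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

labeled, acc_prev = list(pool[:10]), 0.0
for step in range(50):
    cand = [i for i in pool if i not in labeled]
    scores = policy(torch.tensor(X[cand], dtype=torch.float32)).squeeze(-1)
    dist = torch.distributions.Categorical(logits=scores)
    a = dist.sample()                    # pick one point to annotate
    labeled.append(cand[a.item()])
    clf = LogisticRegression(max_iter=200).fit(X[labeled], y[labeled])
    acc = clf.score(X[val], y[val])
    reward = acc - acc_prev              # reward = held-out improvement
    acc_prev = acc
    loss = -reward * dist.log_prob(a)    # REINFORCE update of the selector
    opt.zero_grad(); loss.backward(); opt.step()
```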

TensorFlow-Serving: Flexible, High-Performance ML Serving

Title TensorFlow-Serving: Flexible, High-Performance ML Serving
Authors Christopher Olston, Noah Fiedel, Kiril Gorovoy, Jeremiah Harmsen, Li Lao, Fangwei Li, Vinu Rajashekhar, Sukriti Ramesh, Jordan Soyke
Abstract We describe TensorFlow-Serving, a system to serve machine learning models inside Google, which is also available in the cloud and as open source. It is extremely flexible in terms of the types of ML platforms it supports, and ways to integrate with systems that convey new models and updated versions from training to serving. At the same time, the core code paths around model lookup and inference have been carefully optimized to avoid performance pitfalls observed in naive implementations. Google uses it in many production deployments, including a multi-tenant model hosting service called TFS^2.
Tasks
Published 2017-12-17
URL http://arxiv.org/abs/1712.06139v2
PDF http://arxiv.org/pdf/1712.06139v2.pdf
PWC https://paperswithcode.com/paper/tensorflow-serving-flexible-high-performance
Repo
Framework
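On the client side the surface is small: TensorFlow Serving exposes a REST predict endpoint (port 8501 by default). A minimal sketch, with the model name `my_model` and the input shape assumed for illustration:

```python
import json
import requests   # assumes a TensorFlow Serving instance is already running

# POST /v1/models/<name>:predict with a JSON list of input instances
resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    data=json.dumps({"instances": [[1.0, 2.0, 5.0]]}),
)
print(resp.json()["predictions"])
```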