July 28, 2019


Paper Group ANR 185

Pyramid Vector Quantization for Deep Learning. A Generic Framework for Interesting Subspace Cluster Detection in Multi-attributed Networks. Large-Scale Domain Adaptation via Teacher-Student Learning. Learning to Embed Words in Context for Syntactic Tasks. Simplified Long Short-term Memory Recurrent Neural Networks: part II. Machine Learning Approac …

Pyramid Vector Quantization for Deep Learning

Title Pyramid Vector Quantization for Deep Learning
Authors Vincenzo Liguori
Abstract This paper explores the use of Pyramid Vector Quantization (PVQ) to reduce the computational cost for a variety of neural networks (NNs) while, at the same time, compressing the weights that describe them. This is based on the fact that the dot product between an N-dimensional vector of real numbers and an N-dimensional PVQ vector can be calculated with only additions and subtractions and one multiplication. This is advantageous since tensor products, commonly used in NNs, can be reduced to a dot product or a set of dot products. Finally, it is stressed that any NN architecture that is based on an operation that can be reduced to a dot product can benefit from the techniques described here.
Tasks Quantization
Published 2017-04-10
URL http://arxiv.org/abs/1704.02681v1
PDF http://arxiv.org/pdf/1704.02681v1.pdf
PWC https://paperswithcode.com/paper/pyramid-vector-quantization-for-deep-learning
Repo
Framework
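
The dot-product claim in the abstract above can be illustrated with a minimal sketch. It assumes the usual PVQ convention that a PVQ vector has integer components whose absolute values sum to a fixed K, and that the quantized weights are that integer vector times one real scale factor. The names `pvq_dot`, `q`, and `scale` are hypothetical and not taken from the paper.

```python
def pvq_dot(x, q, scale=1.0):
    """Dot product of a real vector x with scale * q, where q is an integer PVQ vector."""
    acc = 0.0
    for xi, qi in zip(x, q):
        # A component of magnitude |qi| costs |qi| additions or subtractions of xi.
        for _ in range(abs(qi)):
            if qi > 0:
                acc += xi
            else:
                acc -= xi
    # The only multiplication: apply the shared scale factor once at the end.
    return scale * acc


x = [0.5, -1.25, 2.0, 0.75]   # real-valued inputs
q = [2, -1, 0, 1]             # integer PVQ vector with sum(|q_i|) = K = 4
result = pvq_dot(x, q, scale=0.5)
```

The inner loop runs K times in total, so the cost is tied to K rather than to N multiplications, which is the property the abstract relies on.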

A Generic Framework for Interesting Subspace Cluster Detection in Multi-attributed Networks

Title A Generic Framework for Interesting Subspace Cluster Detection in Multi-attributed Networks
Authors Feng Chen, Baojian Zhou, Adil Alim, Liang Zhao
Abstract Detection of interesting (e.g., coherent or anomalous) clusters has been studied extensively on plain or univariate networks, with various applications. Recently, algorithms have been extended to real-world networks with multiple attributes for each node. In a multi-attributed network, a cluster of nodes is often interesting only for a subset (subspace) of attributes, and clusters of this type are called subspace clusters. However, in the current literature, few methods are capable of detecting subspace clusters, which involves concurrent feature selection and network cluster detection. These relevant methods are mostly heuristic-driven and customized for specific application scenarios. In this work, we present a generic and theoretical framework for detection of interesting subspace clusters in large multi-attributed networks. Specifically, we propose a subspace graph-structured matching pursuit algorithm, namely, SG-Pursuit, to address a broad class of such problems for different score functions (e.g., coherence or anomalous functions) and topology constraints (e.g., connected subgraphs and dense subgraphs). We prove that our algorithm 1) runs in nearly-linear time on the network size and the total number of attributes and 2) enjoys rigorous guarantees (geometrical convergence rate and tight error bound) analogous to those of the state-of-the-art algorithms for sparse feature selection problems and subgraph detection problems. As a case study, we specialize SG-Pursuit to optimize a number of well-known score functions for two typical tasks, including detection of coherent dense and anomalous connected subspace clusters in real-world networks. Empirical evidence demonstrates that our proposed generic algorithm SG-Pursuit outperforms state-of-the-art methods that are designed specifically for these two tasks.
Tasks Feature Selection
Published 2017-09-15
URL http://arxiv.org/abs/1709.05246v2
PDF http://arxiv.org/pdf/1709.05246v2.pdf
PWC https://paperswithcode.com/paper/a-generic-framework-for-interesting-subspace
Repo
Framework

Large-Scale Domain Adaptation via Teacher-Student Learning

Title Large-Scale Domain Adaptation via Teacher-Student Learning
Authors Jinyu Li, Michael L. Seltzer, Xi Wang, Rui Zhao, Yifan Gong
Abstract High accuracy speech recognition requires a large amount of transcribed data for supervised training. In the absence of such data, domain adaptation of a well-trained acoustic model can be performed, but even here, high accuracy usually requires significant labeled data from the target domain. In this work, we propose an approach to domain adaptation that does not require transcriptions but instead uses a corpus of unlabeled parallel data, consisting of pairs of samples from the source domain of the well-trained model and the desired target domain. To perform adaptation, we employ teacher/student (T/S) learning, in which the posterior probabilities generated by the source-domain model can be used in lieu of labels to train the target-domain model. We evaluate the proposed approach in two scenarios: adapting a clean acoustic model to noisy speech, and adapting an adult speech acoustic model to children's speech. Significant improvements in accuracy are obtained, with reductions in word error rate of up to 44% over the original source model without the need for transcribed data in the target domain. Moreover, we show that increasing the amount of unlabeled data results in additional model robustness, which is particularly beneficial when using simulated training data in the target domain.
Tasks Domain Adaptation, Speech Recognition
Published 2017-08-17
URL http://arxiv.org/abs/1708.05466v1
PDF http://arxiv.org/pdf/1708.05466v1.pdf
PWC https://paperswithcode.com/paper/large-scale-domain-adaptation-via-teacher
Repo
Framework
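
The teacher/student idea in the abstract above, using source-model posteriors in place of labels, can be sketched as a per-frame loss. This is an illustration under assumptions, not the authors' training code: the loss is taken to be the cross-entropy between teacher and student posteriors, and the function names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into posterior probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ts_loss(teacher_logits, student_logits):
    """Cross-entropy of student posteriors against teacher posteriors (soft labels)."""
    p_teacher = softmax(teacher_logits)
    p_student = softmax(student_logits)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))


# The teacher scores a source-domain frame (e.g. clean speech); the student
# scores the parallel target-domain frame (e.g. the same speech with noise).
teacher_logits = [2.0, 0.5, -1.0]
matched = ts_loss(teacher_logits, [2.0, 0.5, -1.0])
mismatched = ts_loss(teacher_logits, [-1.0, 0.5, 2.0])
```

No transcription appears anywhere in the loss: the only supervision is the teacher's output on the paired source sample, so the loss is smallest when the student reproduces the teacher's posteriors.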

Learning to Embed Words in Context for Syntactic Tasks

Title Learning to Embed Words in Context for Syntactic Tasks
Authors Lifu Tu, Kevin Gimpel, Karen Livescu
Abstract We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes.
Tasks
Published 2017-06-09
URL http://arxiv.org/abs/1706.02807v2
PDF http://arxiv.org/pdf/1706.02807v2.pdf
PWC https://paperswithcode.com/paper/learning-to-embed-words-in-context-for
Repo
Framework

Simplified Long Short-term Memory Recurrent Neural Networks: part II

Title Simplified Long Short-term Memory Recurrent Neural Networks: part II
Authors Atra Akandeh, Fathi M. Salem
Abstract This is part II of a three-part work. Here, we present a second set of five inter-related variants of simplified Long Short-term Memory (LSTM) recurrent neural networks, obtained by further reducing adaptive parameters. Two of these models were introduced in part I of this work. We evaluate and verify our model variants on the benchmark MNIST dataset and assert that these models are comparable to the base LSTM model while using progressively fewer parameters. Moreover, we observe that, when the ReLU activation is used, the test accuracy of the standard LSTM drops after a number of epochs when the learning parameter becomes larger. However, all of the new model variants sustain their performance.
Tasks
Published 2017-07-14
URL http://arxiv.org/abs/1707.04623v1
PDF http://arxiv.org/pdf/1707.04623v1.pdf
PWC https://paperswithcode.com/paper/simplified-long-short-term-memory-recurrent-2
Repo
Framework

Machine Learning Approaches for Traffic Volume Forecasting: A Case Study of the Moroccan Highway Network

Title Machine Learning Approaches for Traffic Volume Forecasting: A Case Study of the Moroccan Highway Network
Authors Abderrahim Khalifa, Younes Idsouguou, Loubna Benabbou, Mourad Zirari
Abstract In this paper, we illustrate the different approaches we followed while developing a forecasting tool for highway traffic in Morocco. Two main approaches were adopted. The first is statistical analysis, as a step of data exploration and data wrangling, in which a beta model is fitted for a better understanding of traffic behavior. Next, we moved to machine learning, where we worked with several algorithms such as Random Forest, Artificial Neural Networks, and Extra Trees. Because we were convinced that state-of-the-art models are still underused in this field of study, we also cover an application of Long Short-Term Memory neural networks.
Tasks
Published 2017-11-18
URL http://arxiv.org/abs/1711.06779v1
PDF http://arxiv.org/pdf/1711.06779v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-approaches-for-traffic
Repo
Framework

Group Recommendations: Axioms, Impossibilities, and Random Walks

Title Group Recommendations: Axioms, Impossibilities, and Random Walks
Authors Omer Lev, Moshe Tennenholtz
Abstract We introduce an axiomatic approach to group recommendations, in line with previous work on the axiomatic treatment of trust-based recommendation systems, ranking systems, and other foundational work on the axiomatic approach to internet mechanisms in social choice settings. In group recommendations we wish to recommend to a group of agents, consisting of both opinionated and undecided members, a joint choice that would be acceptable to them. Such a system has many applications, such as choosing a movie or a restaurant to go to with a group of friends, recommending games for online game players, and other communal activities. Our method utilizes a given social graph to extract information on the undecided, relying on the agents influencing them. We first show that a set of fairly natural desired requirements (a.k.a. axioms) leads to an impossibility, rendering mutual satisfaction of them unreachable. However, we also show a modified set of axioms that fully axiomatize a group variant of the random-walk recommendation system, expanding a previous result from the individual recommendation case.
Tasks Recommendation Systems
Published 2017-07-27
URL http://arxiv.org/abs/1707.08755v1
PDF http://arxiv.org/pdf/1707.08755v1.pdf
PWC https://paperswithcode.com/paper/group-recommendations-axioms-impossibilities
Repo
Framework

Low Impact Artificial Intelligences

Title Low Impact Artificial Intelligences
Authors Stuart Armstrong, Benjamin Levinstein
Abstract There are many goals for an AI that could become dangerous if the AI becomes superintelligent or otherwise powerful. Much work on the AI control problem has been focused on constructing AI goals that are safe even for such AIs. This paper looks at an alternative approach: defining a general concept of 'low impact'. The aim is to ensure that a powerful AI which implements low impact will not modify the world extensively, even if it is given a simple or dangerous goal. The paper proposes various ways of defining and grounding low impact, and discusses methods for ensuring that the AI can still be allowed to have a (desired) impact despite the restriction. The end of the paper addresses known issues with this approach and avenues for future research.
Tasks
Published 2017-05-30
URL http://arxiv.org/abs/1705.10720v1
PDF http://arxiv.org/pdf/1705.10720v1.pdf
PWC https://paperswithcode.com/paper/low-impact-artificial-intelligences
Repo
Framework

Multi-Objective Software Suite of Two-Dimensional Shape Descriptors for Object-Based Image Analysis

Title Multi-Objective Software Suite of Two-Dimensional Shape Descriptors for Object-Based Image Analysis
Authors Andrea Baraldi, João V. B. Soares
Abstract In recent years two sets of planar (2D) shape attributes, provided with an intuitive physical meaning, were proposed to the remote sensing community by, respectively, Nagao & Matsuyama and Shackelford & Davis in their seminal works on the increasingly popular geographic object based image analysis (GEOBIA) paradigm. These two published sets of intuitive geometric features were selected as initial conditions by the present R&D software project, whose multi-objective goal was to accomplish: (i) a minimally dependent and maximally informative design (knowledge/information representation) of a general purpose, user and application independent dictionary of 2D shape terms provided with a physical meaning intuitive to understand by human end users and (ii) an effective (accurate, scale invariant, easy to use) and efficient implementation of 2D shape descriptors. To comply with the Quality Assurance Framework for Earth Observation guidelines, the proposed suite of geometric functions is validated by means of a novel quantitative quality assurance policy, centered on inter feature dependence (causality) assessment. This innovative multivariate feature validation strategy is alternative to traditional feature selection procedures based on either inductive data learning classification accuracy estimation, which is inherently case specific, or cross correlation estimation, because statistical cross correlation does not imply causation. The project deliverable is an original general purpose software suite of seven validated off the shelf 2D shape descriptors intuitive to use. Alternative to existing commercial or open source software libraries of tens of planar shape functions whose informativeness remains unknown, it is eligible for use in (GE)OBIA systems in operating mode, expected to mimic human reasoning based on a convergence of evidence approach.
Tasks Feature Selection
Published 2017-01-08
URL http://arxiv.org/abs/1701.01941v2
PDF http://arxiv.org/pdf/1701.01941v2.pdf
PWC https://paperswithcode.com/paper/multi-objective-software-suite-of-two
Repo
Framework

A Diversified Multi-Start Algorithm for Unconstrained Binary Quadratic Problems Leveraging the Graphics Processor Unit

Title A Diversified Multi-Start Algorithm for Unconstrained Binary Quadratic Problems Leveraging the Graphics Processor Unit
Authors Mark W. Lewis
Abstract Multi-start algorithms are a common and effective tool for metaheuristic searches. In this paper we amplify multi-start capabilities by employing the parallel processing power of the graphics processor unit (GPU) to quickly generate a diverse starting set of solutions for the Unconstrained Binary Quadratic Optimization Problem. These solutions are evaluated and used to implement screening methods that select solutions for further optimization. This method is implemented as an initial high-quality solution generation phase prior to a secondary steepest ascent search. A comparison of results to the best known approaches on benchmark unconstrained binary quadratic problems demonstrates that GPU-enabled diversified multi-start with screening quickly yields very good results.
Tasks
Published 2017-05-31
URL http://arxiv.org/abs/1706.00037v1
PDF http://arxiv.org/pdf/1706.00037v1.pdf
PWC https://paperswithcode.com/paper/a-diversified-multi-start-algorithm-for
Repo
Framework
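
The pipeline described in the abstract above (generate many starts, screen them, then refine with steepest ascent) can be sketched on the CPU. This is a toy illustration under assumptions: the GPU generation step is replaced by a plain random generator, screening is assumed to mean "keep the best few starts by objective value", and the matrix `Q` and all function names are hypothetical.

```python
import random


def objective(Q, x):
    """Value of x^T Q x for a binary vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def steepest_ascent(Q, x):
    """Repeatedly flip the single bit with the largest positive gain."""
    x = list(x)
    while True:
        base = objective(Q, x)
        best_gain, best_i = 0, None
        for i in range(len(x)):
            x[i] ^= 1                      # try flipping bit i
            gain = objective(Q, x) - base
            x[i] ^= 1                      # undo the trial flip
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:                 # no improving flip: local optimum
            return x
        x[best_i] ^= 1


def multi_start(Q, starts, keep, rng):
    """Generate random starts, screen the best `keep`, refine each one."""
    n = len(Q)
    pool = [[rng.randint(0, 1) for _ in range(n)] for _ in range(starts)]
    pool.sort(key=lambda x: objective(Q, x), reverse=True)   # screening step
    refined = [steepest_ascent(Q, x) for x in pool[:keep]]
    return max(refined, key=lambda x: objective(Q, x))


Q = [[-1, 2, 0],
     [2, -1, -3],
     [0, -3, 2]]
best = multi_start(Q, starts=8, keep=3, rng=random.Random(0))
```

Recomputing the full objective for every trial flip keeps the sketch short; a practical implementation would update gains incrementally.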

Revisiting L21-norm Robustness with Vector Outlier Regularization

Title Revisiting L21-norm Robustness with Vector Outlier Regularization
Authors Bo Jiang, Chris Ding
Abstract In many real-world applications, data usually contain outliers. One popular approach is to use the L2,1 norm function as a robust error/loss function. However, the robustness of the L2,1 norm function is not well understood so far. In this paper, we propose a new Vector Outlier Regularization (VOR) framework to understand and analyze the robustness of the L2,1 norm function. Our VOR function defines a data point to be an outlier if it is outside a threshold with respect to a theoretical prediction, and regularizes it, i.e., pulls it back to the threshold line. We then prove that the L2,1 function is the limiting case of this VOR with the usual least square/L2 error function as the threshold shrinks to zero. One interesting property of VOR is that how far an outlier lies away from its theoretically predicted value does not affect the final regularization and analysis results. This VOR property unmasks one of the most peculiar properties of the L2,1 norm function: the effects of outliers seem to be independent of how outlying they are. If an outlier is moved further away from the intrinsic manifold/subspace, the final analysis results do not change. VOR provides a new way to understand and analyze the robustness of the L2,1 norm function. Applying VOR to matrix factorization leads to a new VORPCA model. We give a comprehensive comparison with trace-norm based L21-norm PCA to demonstrate the advantages of VORPCA.
Tasks
Published 2017-06-20
URL https://arxiv.org/abs/1706.06409v2
PDF https://arxiv.org/pdf/1706.06409v2.pdf
PWC https://paperswithcode.com/paper/outlier-regularization-for-vector-data-and
Repo
Framework
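
For readers unfamiliar with the L2,1 norm discussed in the abstract above, a small sketch shows why it is used as a robust loss. It assumes the common convention that the norm is the sum of the Euclidean norms of the per-point residual vectors (the abstract does not spell out the convention). The data values and function names are hypothetical, and the sketch does not implement VOR itself.

```python
import math


def l21_norm(residuals):
    """Sum over data points of the Euclidean norm of each residual vector."""
    return sum(math.sqrt(sum(v * v for v in r)) for r in residuals)


def squared_l2(residuals):
    """Usual least-squares error: sum of all squared residual entries."""
    return sum(v * v for r in residuals for v in r)


inliers = [[3.0, 4.0], [0.0, 5.0]]    # two residual vectors of length 5
outlier = [[300.0, 400.0]]            # one residual vector of length 500

l21_clean, l21_dirty = l21_norm(inliers), l21_norm(inliers + outlier)
sq_clean, sq_dirty = squared_l2(inliers), squared_l2(inliers + outlier)
```

The outlier's contribution grows linearly with its distance under the L2,1 norm but quadratically under the squared L2 error, so a single far-away point dominates the least-squares objective much more strongly.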

On the diffusion approximation of nonconvex stochastic gradient descent

Title On the diffusion approximation of nonconvex stochastic gradient descent
Authors Wenqing Hu, Chris Junchi Li, Lei Li, Jian-Guo Liu
Abstract We study the Stochastic Gradient Descent (SGD) method in nonconvex optimization problems from the point of view of approximating diffusion processes. We prove rigorously that the diffusion process can approximate the SGD algorithm weakly using the weak form of master equation for probability evolution. In the small step size regime and the presence of omnidirectional noise, our weak approximating diffusion process suggests the following dynamics for the SGD iteration starting from a local minimizer (resp. saddle point): it escapes in a number of iterations exponentially (resp. almost linearly) dependent on the inverse stepsize. The results are obtained using the theory for random perturbations of dynamical systems (theory of large deviations for local minimizers and theory of exiting for unstable stationary points). In addition, we discuss the effects of batch size for the deep neural networks, and we find that small batch size is helpful for SGD algorithms to escape unstable stationary points and sharp minimizers. Our theory indicates that one should increase the batch size at later stage for the SGD to be trapped in flat minimizers for better generalization.
Tasks
Published 2017-05-22
URL http://arxiv.org/abs/1705.07562v2
PDF http://arxiv.org/pdf/1705.07562v2.pdf
PWC https://paperswithcode.com/paper/on-the-diffusion-approximation-of-nonconvex
Repo
Framework

Non-Markovian Control with Gated End-to-End Memory Policy Networks

Title Non-Markovian Control with Gated End-to-End Memory Policy Networks
Authors Julien Perez, Tomi Silander
Abstract Partially observable environments present an important open challenge in the domain of sequential control learning with delayed rewards. Despite numerous attempts during the last two decades, the majority of reinforcement learning algorithms and associated approximate models, applied to this context, still assume Markovian state transitions. In this paper, we explore the use of a recently proposed attention-based model, the Gated End-to-End Memory Network, for sequential control. We call the resulting model the Gated End-to-End Memory Policy Network. More precisely, we use a model-free value-based algorithm to learn policies for partially observed domains using this memory-enhanced neural network. This model is end-to-end learnable and it features unbounded memory. Indeed, because of its attention mechanism and associated non-parametric memory, the proposed model allows us to define an attention mechanism over the observation stream unlike recurrent models. We show encouraging results that illustrate the capability of our attention-based model in the context of the continuous-state non-stationary control problem of stock trading. We also present an OpenAI Gym environment for simulated stock exchange and explain its relevance as a benchmark for the field of non-Markovian decision process learning.
Tasks
Published 2017-05-31
URL http://arxiv.org/abs/1705.10993v1
PDF http://arxiv.org/pdf/1705.10993v1.pdf
PWC https://paperswithcode.com/paper/non-markovian-control-with-gated-end-to-end
Repo
Framework

Hyperprofile-based Computation Offloading for Mobile Edge Networks

Title Hyperprofile-based Computation Offloading for Mobile Edge Networks
Authors Andrew Crutcher, Caleb Koch, Kyle Coleman, Jon Patman, Flavio Esposito, Prasad Calyam
Abstract In recent studies, researchers have developed various computation offloading frameworks for bringing cloud services closer to the user via edge networks. Specifically, an edge device needs to offload computationally intensive tasks because of energy and processing constraints. These constraints present the challenge of identifying which edge nodes should receive tasks to reduce overall resource consumption. We propose a unique solution to this problem which incorporates elements from Knowledge-Defined Networking (KDN) to make intelligent predictions about offloading costs based on historical data. Each server instance can be represented in a multidimensional feature space where each dimension corresponds to a predicted metric. We compute features for a "hyperprofile" and position nodes based on the predicted costs of offloading a particular task. We then perform a k-Nearest Neighbor (kNN) query within the hyperprofile to select nodes for offloading computation. This paper formalizes our hyperprofile-based solution and explores the viability of using machine learning (ML) techniques to predict metrics useful for computation offloading. We also investigate the effects of using different distance metrics for the queries. Our results show that various network metrics can be modeled accurately with regression, and that there are circumstances where kNN queries using Euclidean distance, as opposed to rectilinear distance, are more favorable.
Tasks
Published 2017-07-28
URL http://arxiv.org/abs/1707.09422v1
PDF http://arxiv.org/pdf/1707.09422v1.pdf
PWC https://paperswithcode.com/paper/hyperprofile-based-computation-offloading-for
Repo
Framework
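
The kNN query described in the abstract above, and the way the distance metric can change which node is selected, can be sketched as follows. The node names, the two feature dimensions, and the use of the origin as the query point are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rectilinear(a, b):
    """Rectilinear (L1, Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn(nodes, query, k, distance):
    """Return the names of the k nodes closest to `query` under `distance`."""
    ranked = sorted(nodes, key=lambda name: distance(nodes[name], query))
    return ranked[:k]


# Hypothetical hyperprofile: each node is positioned by two predicted costs,
# e.g. (predicted latency, predicted energy), in arbitrary units.
nodes = {"edge-a": (3.0, 3.0), "edge-b": (0.0, 5.0), "edge-c": (6.0, 6.0)}
ideal = (0.0, 0.0)   # query point: zero predicted cost in every dimension

nearest_l2 = knn(nodes, ideal, 1, euclidean)
nearest_l1 = knn(nodes, ideal, 1, rectilinear)
```

With these values the two metrics disagree: Euclidean distance prefers the node with balanced costs, while rectilinear distance prefers the node with the smaller total cost, which is why the choice of metric matters for the query.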

Forecasting Player Behavioral Data and Simulating in-Game Events

Title Forecasting Player Behavioral Data and Simulating in-Game Events
Authors Anna Guitart, Pei Pei Chen, Paul Bertens, África Periáñez
Abstract Understanding player behavior is fundamental in game data science. Video games evolve as players interact with the game, so being able to foresee player experience would help to ensure a successful game development. In particular, game developers need to evaluate beforehand the impact of in-game events. Simulation optimization of these events is crucial to increase player engagement and maximize monetization. We present an experimental analysis of several methods to forecast game-related variables, with two main aims: to obtain accurate predictions of in-app purchases and playtime in an operational production environment, and to perform simulations of in-game events in order to maximize sales and playtime. Our ultimate purpose is to take a step towards the data-driven development of games. The results suggest that, even though the performance of traditional approaches such as ARIMA is still better, the outcomes of state-of-the-art techniques like deep learning are promising. Deep learning comes up as a well-suited general model that could be used to forecast a variety of time series with different dynamic behaviors.
Tasks Time Series
Published 2017-10-05
URL http://arxiv.org/abs/1710.01931v1
PDF http://arxiv.org/pdf/1710.01931v1.pdf
PWC https://paperswithcode.com/paper/forecasting-player-behavioral-data-and
Repo
Framework