July 27, 2019

Paper Group ANR 729

Fast spatial inference in the homogeneous Ising model


Title	Fast spatial inference in the homogeneous Ising model
Authors	Alejandro Murua, Ranjan Maitra
Abstract	The Ising model is important in statistical modeling and inference in many applications; however, its normalizing constant, mean number of active vertices, and mean spin interaction are intractable. We provide accurate approximations that make it possible to calculate these quantities numerically. Simulation studies indicate good performance when compared to Markov Chain Monte Carlo methods, at a tiny fraction of the time. The methodology is also used to perform Bayesian inference in a functional Magnetic Resonance Imaging activation detection experiment.
Tasks	Bayesian Inference
Published	2017-12-06
URL	http://arxiv.org/abs/1712.02195v2
PDF	http://arxiv.org/pdf/1712.02195v2.pdf
PWC	https://paperswithcode.com/paper/fast-spatial-inference-in-the-homogeneous
Repo
Framework
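To illustrate why the quantities named in the abstract are intractable, the sketch below computes them by brute force on a tiny lattice. It is not the authors' approximation. The parameterization (0/1 vertex states, weight exp(alpha * active + beta * agreeing neighbour pairs), free boundary) and the names `grid_edges` and `ising_brute_force` are assumptions made for illustration only.

```python
from itertools import product
from math import exp


def grid_edges(rows, cols):
    """Nearest-neighbour edges of a rows x cols lattice (free boundary)."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                edges.append((i, i + 1))      # right neighbour
            if r + 1 < rows:
                edges.append((i, i + cols))   # neighbour below
    return edges


def ising_brute_force(rows, cols, alpha, beta):
    """Return (Z, mean active vertices, mean agreeing-neighbour pairs).

    Assumed model (hypothetical parameterization):
        p(x) proportional to exp(alpha * sum_i x_i + beta * #{i~j : x_i == x_j})
    with x_i in {0, 1}. The loop visits all 2**(rows*cols) configurations.
    """
    n = rows * cols
    edges = grid_edges(rows, cols)
    z = 0.0
    sum_active = 0.0
    sum_agree = 0.0
    for x in product((0, 1), repeat=n):
        active = sum(x)
        agree = sum(1 for i, j in edges if x[i] == x[j])
        w = exp(alpha * active + beta * agree)
        z += w
        sum_active += w * active
        sum_agree += w * agree
    return z, sum_active / z, sum_agree / z
```

A 3x3 lattice already needs 512 terms and a 6x6 lattice about 6.9e10, which is why exact enumeration stops being an option almost immediately.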

Real-time convolutional networks for sonar image classification in low-power embedded systems


Title	Real-time convolutional networks for sonar image classification in low-power embedded systems
Authors	Matias Valdenegro-Toro
Abstract	Deep Neural Networks have impressive classification performance, but this comes at the expense of significant computational resources at inference time. Autonomous Underwater Vehicles use low-power embedded systems for sonar image perception, and cannot execute large neural networks in real-time. We propose using max-pooling aggressively, and we demonstrate it with a Fire-based module and a new Tiny module that includes max-pooling in each module. By stacking them we build networks that achieve the same accuracy as bigger ones, while reducing the number of parameters and considerably increasing computational performance. Our networks can classify a 96x96 sonar image with 98.8 - 99.7% accuracy in only 41 to 61 milliseconds on a Raspberry Pi 2, which corresponds to speedups of 28.6 - 19.7.
Tasks	Image Classification
Published	2017-09-07
URL	http://arxiv.org/abs/1709.02153v1
PDF	http://arxiv.org/pdf/1709.02153v1.pdf
PWC	https://paperswithcode.com/paper/real-time-convolutional-networks-for-sonar
Repo
Framework

Inferring Narrative Causality between Event Pairs in Films


Title	Inferring Narrative Causality between Event Pairs in Films
Authors	Zhichao Hu, Marilyn A. Walker
Abstract	To understand narrative, humans draw inferences about the underlying relations between narrative events. Cognitive theories of narrative understanding define these inferences as four different types of causality, ranging from pairs of events A, B where A physically causes B (X drop, X break) to pairs of events where A causes emotional state B (Y saw X, Y felt fear). Previous work on learning narrative relations from text has either focused on “strict” physical causality, or has been vague about what relation is being learned. This paper learns pairs of causal events from a corpus of film scene descriptions which are action rich and tend to be told in chronological order. We show that event pairs induced using our methods are of high quality and are judged to have a stronger causal relation than event pairs from Rel-grams.
Tasks
Published	2017-08-30
URL	http://arxiv.org/abs/1708.09496v1
PDF	http://arxiv.org/pdf/1708.09496v1.pdf
PWC	https://paperswithcode.com/paper/inferring-narrative-causality-between-event
Repo
Framework

Revisiting the Design Issues of Local Models for Japanese Predicate-Argument Structure Analysis


Title	Revisiting the Design Issues of Local Models for Japanese Predicate-Argument Structure Analysis
Authors	Yuichiroh Matsubayashi, Kentaro Inui
Abstract	The research trend in Japanese predicate-argument structure (PAS) analysis is shifting from pointwise prediction models with local features to global models designed to search for globally optimal solutions. However, the existing global models tend to employ only relatively simple local features; therefore, the overall performance gains are rather limited. The importance of designing a local model is demonstrated in this study by showing that the performance of a sophisticated local model can be considerably improved with recent feature embedding methods and a feature combination learning based on a neural network, outperforming the state-of-the-art global models in $F_1$ on a common benchmark dataset.
Tasks
Published	2017-10-12
URL	http://arxiv.org/abs/1710.04437v1
PDF	http://arxiv.org/pdf/1710.04437v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-the-design-issues-of-local-models
Repo
Framework

DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging


Title	DocTag2Vec: An Embedding Based Multi-label Learning Approach for Document Tagging
Authors	Sheng Chen, Akshay Soni, Aasish Pappu, Yashar Mehdad
Abstract	Tagging news articles or blog posts with relevant tags from a collection of predefined ones is termed document tagging in this work. Accurate tagging of articles can benefit several downstream applications such as recommendation and search. In this work, we propose a novel yet simple approach called DocTag2Vec to accomplish this task. We substantially extend Word2Vec and Doc2Vec, two popular models for learning distributed representations of words and documents. In DocTag2Vec, we simultaneously learn the representation of words, documents, and tags in a joint vector space during training, and employ the simple $k$-nearest neighbor search to predict tags for unseen documents. In contrast to previous multi-label learning methods, DocTag2Vec directly deals with raw text instead of a provided feature vector, and in addition, enjoys advantages like the learning of tag representations and the ability to handle newly created tags. To demonstrate the effectiveness of our approach, we conduct experiments on several datasets and show promising results against state-of-the-art methods.
Tasks	Multi-Label Learning
Published	2017-07-14
URL	http://arxiv.org/abs/1707.04596v1
PDF	http://arxiv.org/pdf/1707.04596v1.pdf
PWC	https://paperswithcode.com/paper/doctag2vec-an-embedding-based-multi-label
Repo
Framework
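The prediction step described in the abstract, a $k$-nearest neighbor search in a joint embedding space, can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes the document vector has already been inferred, that neighbours are searched among tag vectors, and that cosine similarity is the metric. The abstract specifies none of these details, and the names `cosine` and `predict_tags` are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_tags(doc_vec, tag_vecs, k):
    """Return the k tags whose embeddings are most similar to doc_vec.

    tag_vecs maps tag name -> embedding in the same (assumed) joint space.
    """
    ranked = sorted(
        tag_vecs,
        key=lambda tag: cosine(doc_vec, tag_vecs[tag]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings, made up for illustration.
tags = {
    "sports": [1.0, 0.0],
    "politics": [0.0, 1.0],
    "football": [0.9, 0.1],
}
top_two = predict_tags([1.0, 0.0], tags, k=2)
```

Because tags live in the same space as documents, a newly created tag only needs an embedding to become predictable, which matches the abstract's remark about handling new tags.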

Aggregating Algorithm for Prediction of Packs


Title	Aggregating Algorithm for Prediction of Packs
Authors	Dmitry Adamskiy, Tony Bellotti, Raisa Dzhamtyrova, Yuri Kalnishkan
Abstract	This paper formulates the protocol for prediction of packs, which is a special case of prediction under delayed feedback. Under this protocol, the learner must make a few predictions without seeing the outcomes and then the outcomes are revealed. We develop the theory of prediction with expert advice for packs. By applying Vovk’s Aggregating Algorithm to this problem we obtain a number of algorithms with tight upper bounds. We carry out empirical experiments on housing data.
Tasks
Published	2017-10-23
URL	http://arxiv.org/abs/1710.08114v1
PDF	http://arxiv.org/pdf/1710.08114v1.pdf
PWC	https://paperswithcode.com/paper/aggregating-algorithm-for-prediction-of-packs
Repo
Framework

Numerical Integration and Dynamic Discretization in Heuristic Search Planning over Hybrid Domains


Title	Numerical Integration and Dynamic Discretization in Heuristic Search Planning over Hybrid Domains
Authors	Miquel Ramirez, Enrico Scala, Patrik Haslum, Sylvie Thiebaux
Abstract	In this paper we look into the problem of planning over hybrid domains, where change can be both discrete and instantaneous, or continuous over time. In addition, it is required that each state on the trajectory induced by the execution of plans complies with a given set of global constraints. We approach the computation of plans for such domains as the problem of searching over a deterministic state model. In this model, some of the successor states are obtained by solving numerically the so-called initial value problem over a set of ordinary differential equations (ODE) given by the current plan prefix. These equations hold over time intervals whose duration is determined dynamically, according to whether zero crossing events take place for a set of invariant conditions. The resulting planner, FS+, incorporates these features together with effective heuristic guidance. FS+ does not impose any of the syntactic restrictions on process effects often found in the existing literature on Hybrid Planning. A key concept of our approach is that a clear separation is struck between planning and simulation time steps. The former is the time allowed to observe the evolution of a given dynamical system before committing to a future course of action, whilst the latter is part of the model of the environment. FS+ is shown to be a robust planner over a diverse set of hybrid domains, taken from the existing literature on hybrid planning and systems.
Tasks
Published	2017-03-13
URL	http://arxiv.org/abs/1703.04232v1
PDF	http://arxiv.org/pdf/1703.04232v1.pdf
PWC	https://paperswithcode.com/paper/numerical-integration-and-dynamic
Repo
Framework

Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo


Title	Fractional Langevin Monte Carlo: Exploring Lévy Driven Stochastic Differential Equations for Markov Chain Monte Carlo
Authors	Umut Şimşekli
Abstract	Along with the recent advances in scalable Markov Chain Monte Carlo methods, sampling techniques that are based on Langevin diffusions have started receiving increasing attention. These so-called Langevin Monte Carlo (LMC) methods are based on diffusions driven by a Brownian motion, which gives rise to Gaussian proposal distributions in the resulting algorithms. Even though these approaches have proven successful in many applications, their performance can be limited by the light-tailed nature of the Gaussian proposals. In this study, we extend classical LMC and develop a novel Fractional LMC (FLMC) framework that is based on a family of heavy-tailed distributions, called $\alpha$-stable Lévy distributions. As opposed to classical approaches, the proposed approach can possess large jumps while targeting the correct distribution, which would be beneficial for efficient exploration of the state space. We develop novel computational methods that can scale up to large-scale problems and we provide formal convergence analysis of the proposed scheme. Our experiments support our theory: FLMC can provide superior performance in multi-modal settings, improved convergence rates, and robustness to algorithm parameters.
Tasks	Efficient Exploration
Published	2017-06-12
URL	http://arxiv.org/abs/1706.03649v1
PDF	http://arxiv.org/pdf/1706.03649v1.pdf
PWC	https://paperswithcode.com/paper/fractional-langevin-monte-carlo-exploring-1
Repo
Framework
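For context, the classical LMC baseline that the abstract extends can be sketched as the unadjusted Langevin update x <- x + step * grad log p(x) + sqrt(2 * step) * N(0, 1). The code below implements only this Gaussian-driven version in one dimension. It does not implement FLMC, whose $\alpha$-stable noise and modified drift are not described in enough detail in the abstract to reproduce. The function name `langevin_samples` and its parameters are hypothetical.

```python
import random
from math import sqrt


def langevin_samples(grad_log_p, x0, step, n_steps, seed=0):
    """Run the unadjusted Langevin algorithm in 1-D and return the visited states.

    grad_log_p: derivative of the log target density.
    step: discretization step size (introduces a small bias when held fixed).
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)  # Gaussian increment of Brownian motion
        x = x + step * grad_log_p(x) + sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


# Target: standard normal, so log p(x) = -x**2 / 2 + const and its gradient is -x.
chain = langevin_samples(lambda x: -x, x0=0.0, step=0.1, n_steps=20000, seed=1)
kept = chain[2000:]  # discard burn-in
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
```

Each move here is a light-tailed Gaussian step, so crossing between well-separated modes is rare; that limitation is what motivates the heavy-tailed proposals in the abstract.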

Robust Frequent Directions with Application in Online Learning


Title	Robust Frequent Directions with Application in Online Learning
Authors	Luo Luo, Cheng Chen, Zhihua Zhang, Wu-Jun Li, Tong Zhang
Abstract	The frequent directions (FD) technique is a deterministic approach for online sketching that has many applications in machine learning. The conventional FD is a heuristic procedure that often outputs rank deficient matrices. To overcome the rank deficiency problem, we propose a new sketching strategy called robust frequent directions (RFD) by introducing a regularization term. RFD can be derived from an optimization problem. It updates the sketch matrix and the regularization term adaptively and jointly. RFD reduces the approximation error of FD without increasing the computational cost. We also apply RFD to online learning and propose an effective hyperparameter-free online Newton algorithm. We derive a regret bound for our online Newton algorithm based on RFD, which guarantees the robustness of the algorithm. The experimental studies demonstrate that the proposed method outperforms state-of-the-art second order online learning algorithms.
Tasks
Published	2017-05-15
URL	http://arxiv.org/abs/1705.05067v3
PDF	http://arxiv.org/pdf/1705.05067v3.pdf
PWC	https://paperswithcode.com/paper/robust-frequent-directions-with-application
Repo
Framework

On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning


Title	On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning
Authors	Huizhen Yu
Abstract	We consider off-policy temporal-difference (TD) learning methods for policy evaluation in Markov decision processes with finite spaces and discounted reward criteria, and we present a collection of convergence results for several gradient-based TD algorithms with linear function approximation. The algorithms we analyze include: (i) two basic forms of two-time-scale gradient-based TD algorithms, which we call GTD and which minimize the mean squared projected Bellman error using stochastic gradient-descent; (ii) their “robustified” biased variants; (iii) their mirror-descent versions which combine the mirror-descent idea with TD learning; and (iv) a single-time-scale version of GTD that solves minimax problems formulated for approximate policy evaluation. We derive convergence results for three types of stepsizes: constant stepsize, slowly diminishing stepsize, as well as the standard type of diminishing stepsize with a square-summable condition. For the first two types of stepsizes, we apply the weak convergence method from stochastic approximation theory to characterize the asymptotic behavior of the algorithms, and for the standard type of stepsize, we analyze the algorithmic behavior with respect to a stronger mode of convergence, almost sure convergence. Our convergence results are for the aforementioned TD algorithms with three general ways of setting their $\lambda$-parameters: (i) state-dependent $\lambda$; (ii) a recently proposed scheme of using history-dependent $\lambda$ to keep the eligibility traces of the algorithms bounded while allowing for relatively large values of $\lambda$; and (iii) a composite scheme of setting the $\lambda$-parameters that combines the preceding two schemes and allows a broader class of generalized Bellman operators to be used for approximate policy evaluation with TD methods.
Tasks
Published	2017-12-27
URL	http://arxiv.org/abs/1712.09652v2
PDF	http://arxiv.org/pdf/1712.09652v2.pdf
PWC	https://paperswithcode.com/paper/on-convergence-of-some-gradient-based
Repo
Framework

Cnvlutin2: Ineffectual-Activation-and-Weight-Free Deep Neural Network Computing


Title	Cnvlutin2: Ineffectual-Activation-and-Weight-Free Deep Neural Network Computing
Authors	Patrick Judd, Alberto Delmas, Sayeh Sharify, Andreas Moshovos
Abstract	We discuss several modifications and extensions over the previously proposed Cnvlutin (CNV) accelerator for convolutional and fully-connected layers of Deep Learning Networks. We first describe different encodings of the activations that are deemed ineffectual. The encodings have different memory overhead and energy characteristics. We propose using a level of indirection when accessing activations from memory to reduce their memory footprint by storing only the effectual activations. We also present a modified organization that detects the activations that are deemed ineffectual while fetching them from memory. This is different from the original design, which instead detected them at the output of the preceding layer. Finally, we present an extended CNV that can also skip ineffectual weights.
Tasks
Published	2017-04-29
URL	http://arxiv.org/abs/1705.00125v1
PDF	http://arxiv.org/pdf/1705.00125v1.pdf
PWC	https://paperswithcode.com/paper/cnvlutin2-ineffectual-activation-and-weight
Repo
Framework

Mapping higher-order network flows in memory and multilayer networks with Infomap


Title	Mapping higher-order network flows in memory and multilayer networks with Infomap
Authors	Daniel Edler, Ludvig Bohlin, Martin Rosvall
Abstract	Comprehending complex systems by simplifying and highlighting important dynamical patterns requires modeling and mapping higher-order network flows. However, complex systems come in many forms and demand a range of representations, including memory and multilayer networks, which in turn call for versatile community-detection algorithms to reveal important modular regularities in the flows. Here we show that various forms of higher-order network flows can be represented in a unified way with networks that distinguish physical nodes for representing a complex system’s objects from state nodes for describing flows between the objects. Moreover, these so-called sparse memory networks allow the information-theoretic community detection method known as the map equation to identify overlapping and nested flow modules in data from a range of different higher-order interactions such as multistep, multi-source, and temporal data. We derive the map equation applied to sparse memory networks and describe its search algorithm Infomap, which can exploit the flexibility of sparse memory networks. Together they provide a general solution to reveal overlapping modular patterns in higher-order flows through complex systems.
Tasks	Community Detection
Published	2017-06-15
URL	http://arxiv.org/abs/1706.04792v2
PDF	http://arxiv.org/pdf/1706.04792v2.pdf
PWC	https://paperswithcode.com/paper/mapping-higher-order-network-flows-in-memory
Repo
Framework

Exhaustive search for sparse variable selection in linear regression


Title	Exhaustive search for sparse variable selection in linear regression
Authors	Yasuhiko Igarashi, Hikaru Takenaka, Yoshinori Nakanishi-Ohno, Makoto Uemura, Shiro Ikeda, Masato Okada
Abstract	We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as density of states. With this density of states, we can compare different methods for selecting sparse variables such as relaxation and sampling. For large problems where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found it difficult to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.
Tasks
Published	2017-07-07
URL	http://arxiv.org/abs/1707.02050v1
PDF	http://arxiv.org/pdf/1707.02050v1.pdf
PWC	https://paperswithcode.com/paper/exhaustive-search-for-sparse-variable
Repo
Framework
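The exhaustive K-sparse idea in the abstract can be sketched as follows: fit ordinary least squares on every size-K subset of columns and rank the subsets by residual sum of squares. This is an illustration only, not the authors' ES-K code. Using the residual sum of squares as the score, omitting an intercept, and the names `solve`, `rss_for_subset`, and `exhaustive_search` are assumptions; the abstract does not state the evaluation criterion.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear columns)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def rss_for_subset(X, y, subset):
    """Residual sum of squares of the no-intercept least-squares fit on `subset`."""
    cols = [[row[j] for row in X] for j in subset]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    coef = solve(gram, rhs)  # normal equations
    resid = [
        yi - sum(c * row[j] for c, j in zip(coef, subset))
        for row, yi in zip(X, y)
    ]
    return sum(r * r for r in resid)


def exhaustive_search(X, y, k):
    """Score all C(p, k) column subsets; return (rss, subset) pairs, best first."""
    p = len(X[0])
    scored = [(rss_for_subset(X, y, s), s) for s in combinations(range(p), k)]
    return sorted(scored)


# Toy data: y depends only on columns 0 and 2 (y = 2 * x0 - x2).
X = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
]
y = [2.0 * row[0] - row[2] for row in X]
results = exhaustive_search(X, y, k=2)
```

The full list of scores over all subsets, rather than just the winner, is the raw material for a histogram in the spirit of the density of states mentioned in the abstract. The number of subsets grows as C(p, K), which is the combinatorial explosion that motivates the approximate AES-K variant.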

Learning Representations of Emotional Speech with Deep Convolutional Generative Adversarial Networks


Title	Learning Representations of Emotional Speech with Deep Convolutional Generative Adversarial Networks
Authors	Jonathan Chang, Stefan Scherer
Abstract	Automatically assessing emotional valence in human speech has historically been a difficult task for machine learning algorithms. The subtle changes in the voice of the speaker that are indicative of positive or negative emotional states are often “overshadowed” by voice characteristics relating to emotional intensity or emotional activation. In this work we explore a representation learning approach that automatically derives discriminative representations of emotional speech. In particular, we investigate two machine learning strategies to improve classifier performance: (1) utilization of unlabeled data using a deep convolutional generative adversarial network (DCGAN), and (2) multitask learning. Within our extensive experiments we leverage a multitask annotated emotional corpus as well as a large unlabeled meeting corpus (around 100 hours). Our speaker-independent classification experiments show that, in particular, the use of unlabeled data in our investigations improves the performance of the classifiers, and both fully supervised baseline approaches are outperformed considerably. We improve the classification of emotional valence on a discrete 5-point scale to 43.88% and on a 3-point scale to 49.80%, which is competitive with state-of-the-art performance.
Tasks	Representation Learning
Published	2017-04-22
URL	http://arxiv.org/abs/1705.02394v1
PDF	http://arxiv.org/pdf/1705.02394v1.pdf
PWC	https://paperswithcode.com/paper/learning-representations-of-emotional-speech
Repo
Framework

Exploration of Large Networks with Covariates via Fast and Universal Latent Space Model Fitting


Title	Exploration of Large Networks with Covariates via Fast and Universal Latent Space Model Fitting
Authors	Zhuang Ma, Zongming Ma
Abstract	Latent space models are effective tools for statistical modeling and exploration of network data. These models can effectively model real world network characteristics such as degree heterogeneity, transitivity, homophily, etc. Due to their close connection to generalized linear models, it is also natural to incorporate covariate information in them. The current paper presents two universal fitting algorithms for networks with edge covariates: one based on nuclear norm penalization and the other based on projected gradient descent. Both algorithms are motivated by maximizing likelihood for a special class of inner-product models while working simultaneously for a wide range of different latent space models, such as distance models, which allow latent vectors to affect edge formation in flexible ways. These fitting methods, especially the one based on projected gradient descent, are fast and scalable to large networks. We obtain their rates of convergence for both inner-product models and beyond. The effectiveness of the modeling approach and fitting algorithms is demonstrated on five real world network datasets for different statistical tasks, including community detection with and without edge covariates, and network assisted learning.
Tasks	Community Detection
Published	2017-05-05
URL	http://arxiv.org/abs/1705.02372v2
PDF	http://arxiv.org/pdf/1705.02372v2.pdf
PWC	https://paperswithcode.com/paper/exploration-of-large-networks-with-covariates
Repo
Framework