May 6, 2019

2995 words 15 mins read

Paper Group ANR 244

Paper Group ANR 244

A network that learns Strassen multiplication. Sparse Linear Regression via Generalized Orthogonal Least-Squares. Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly. Improved Parsing for Argument-Clusters Coordination. Reinforcement Learning approach for Real Time Strategy Games Battle city and S3. Zero- …

A network that learns Strassen multiplication

Title A network that learns Strassen multiplication
Authors Veit Elser
Abstract We study neural networks whose only non-linear components are multipliers, to test a new training rule in a context where the precise representation of data is paramount. These networks are challenged to discover the rules of matrix multiplication, given many examples. By limiting the number of multipliers, the network is forced to discover the Strassen multiplication rules. This is the mathematical equivalent of finding low rank decompositions of the $n\times n$ matrix multiplication tensor, $M_n$. We train these networks with the conservative learning rule, which makes minimal changes to the weights so as to give the correct output for each input at the time the input-output pair is received. Conservative learning needs a few thousand examples to find the rank 7 decomposition of $M_2$, and $10^5$ for the rank 23 decomposition of $M_3$ (the lowest known). High precision is critical, especially for $M_3$, to discriminate between true decompositions and “border approximations”.
Tasks
Published 2016-01-26
URL http://arxiv.org/abs/1601.07227v1
PDF http://arxiv.org/pdf/1601.07227v1.pdf
PWC https://paperswithcode.com/paper/a-network-that-learns-strassen-multiplication
Repo
Framework

Sparse Linear Regression via Generalized Orthogonal Least-Squares

Title Sparse Linear Regression via Generalized Orthogonal Least-Squares
Authors Abolfazl Hashemi, Haris Vikalo
Abstract Sparse linear regression, which entails finding a sparse solution to an underdetermined system of linear equations, can formally be expressed as an $l_0$-constrained least-squares problem. The Orthogonal Least-Squares (OLS) algorithm sequentially selects the features (i.e., columns of the coefficient matrix) to greedily find an approximate sparse solution. In this paper, a generalization of Orthogonal Least-Squares which relies on a recursive relation between the components of the optimal solution to select L features at each step and solve the resulting overdetermined system of equations is proposed. Simulation results demonstrate that the generalized OLS algorithm is computationally efficient and achieves performance superior to that of existing greedy algorithms broadly used in the literature.
Tasks
Published 2016-02-22
URL http://arxiv.org/abs/1602.06916v2
PDF http://arxiv.org/pdf/1602.06916v2.pdf
PWC https://paperswithcode.com/paper/sparse-linear-regression-via-generalized
Repo
Framework

Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly

Title Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly
Authors Hao Jiang, Kristen Grauman
Abstract Today’s person detection methods work best when people are in common upright poses and appear reasonably well spaced out in the image. However, in many real images, that’s not what people do. People often appear quite close to each other, e.g., with limbs linked or heads touching, and their poses are often not pedestrian-like. We propose an approach to detangle people in multi-person images. We formulate the task as a region assembly problem. Starting from a large set of overlapping regions from body part semantic segmentation and generic object proposals, our optimization approach reassembles those pieces together into multiple person instances. It enforces that the composed body part regions of each person instance obey constraints on relative sizes, mutual spatial relationships, foreground coverage, and exclusive label assignments when overlapping. Since optimal region assembly is a challenging combinatorial problem, we present a Lagrangian relaxation method to accelerate the lower bound estimation, thereby enabling a fast branch and bound solution for the global optimum. As output, our method produces a pixel-level map indicating both 1) the body part labels (arm, leg, torso, and head), and 2) which parts belong to which individual person. Our results on three challenging datasets show our method is robust to clutter, occlusion, and complex poses. It outperforms a variety of competing methods, including existing detector CRF methods and region CNN approaches. In addition, we demonstrate its impact on a proxemics recognition task, which demands a precise representation of “whose body part is where” in crowded images.
Tasks Human Detection, Semantic Segmentation
Published 2016-04-13
URL http://arxiv.org/abs/1604.03880v1
PDF http://arxiv.org/pdf/1604.03880v1.pdf
PWC https://paperswithcode.com/paper/detangling-people-individuating-multiple
Repo
Framework

Improved Parsing for Argument-Clusters Coordination

Title Improved Parsing for Argument-Clusters Coordination
Authors Jessica Ficler, Yoav Goldberg
Abstract Syntactic parsers perform poorly in prediction of Argument-Cluster Coordination (ACC). We change the PTB representation of ACC to be more suitable for learning by a statistical PCFG parser, affecting 125 trees in the training set. Training on the modified trees yields a slight improvement in EVALB scores on sections 22 and 23. The main evaluation is on a corpus of 4th grade science exams, in which ACC structures are prevalent. On this corpus, we obtain an impressive x2.7 improvement in recovering ACC structures compared to a parser trained on the original PTB trees.
Tasks
Published 2016-06-01
URL http://arxiv.org/abs/1606.00294v1
PDF http://arxiv.org/pdf/1606.00294v1.pdf
PWC https://paperswithcode.com/paper/improved-parsing-for-argument-clusters
Repo
Framework

Reinforcement Learning approach for Real Time Strategy Games Battle city and S3

Title Reinforcement Learning approach for Real Time Strategy Games Battle city and S3
Authors Harshit Sethy, Amit Patel
Abstract In this paper we proposed reinforcement learning algorithms with the generalized reward function. In our proposed method we use Q-learning and SARSA algorithms with generalised reward function to train the reinforcement learning agent. We evaluated the performance of our proposed algorithms on two real-time strategy games called BattleCity and S3. There are two main advantages of having such an approach as compared to other works in RTS. (1) We can ignore the concept of a simulator which is often game specific and is usually hard coded in any type of RTS games (2) our system can learn from interaction with any opponents and quickly change the strategy according to the opponents and do not need any human traces as used in previous works. Keywords : Reinforcement learning, Machine learning, Real time strategy, Artificial intelligence.
Tasks Q-Learning, Real-Time Strategy Games
Published 2016-02-16
URL http://arxiv.org/abs/1602.04936v1
PDF http://arxiv.org/pdf/1602.04936v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-approach-for-real-time
Repo
Framework

Zero-Shot Learning with Multi-Battery Factor Analysis

Title Zero-Shot Learning with Multi-Battery Factor Analysis
Authors Zhong Ji, Yuzhong Xie, Yanwei Pang, Lei Chen, Zhongfei Zhang
Abstract Zero-shot learning (ZSL) extends the conventional image classification technique to a more challenging situation where the test image categories are not seen in the training samples. Most studies on ZSL utilize side information such as attributes or word vectors to bridge the relations between the seen classes and the unseen classes. However, existing approaches on ZSL typically exploit a shared space for each type of side information independently, which cannot make full use of the complementary knowledge of different types of side information. To this end, this paper presents an MBFA-ZSL approach to embed different types of side information as well as the visual feature into one shared space. Specifically, we first develop an algorithm named Multi-Battery Factor Analysis (MBFA) to build a unified semantic space, and then employ multiple types of side information in it to achieve the ZSL. The close-form solution makes MBFA-ZSL simple to implement and efficient to run on large datasets. Extensive experiments on the popular AwA, CUB, and SUN datasets show its significant superiority over the state-of-the-art approaches.
Tasks Image Classification, Zero-Shot Learning
Published 2016-06-30
URL http://arxiv.org/abs/1606.09349v1
PDF http://arxiv.org/pdf/1606.09349v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-learning-with-multi-battery-factor
Repo
Framework

Recalling Holistic Information for Semantic Segmentation

Title Recalling Holistic Information for Semantic Segmentation
Authors Hexiang Hu, Zhiwei Deng, Guang-tong Zhou, Fei Sha, Greg Mori
Abstract Semantic segmentation requires a detailed labeling of image pixels by object category. Information derived from local image patches is necessary to describe the detailed shape of individual objects. However, this information is ambiguous and can result in noisy labels. Global inference of image content can instead capture the general semantic concepts present. We advocate that high-recall holistic inference of image concepts provides valuable information for detailed pixel labeling. We build a two-stream neural network architecture that facilitates information flow from holistic information to local pixels, while keeping common image features shared among the low-level layers of both the holistic analysis and segmentation branches. We empirically evaluate our network on four standard semantic segmentation datasets. Our network obtains state-of-the-art performance on PASCAL-Context and NYUDv2, and ablation studies verify its effectiveness on ADE20K and SIFT-Flow.
Tasks Semantic Segmentation
Published 2016-11-24
URL http://arxiv.org/abs/1611.08061v1
PDF http://arxiv.org/pdf/1611.08061v1.pdf
PWC https://paperswithcode.com/paper/recalling-holistic-information-for-semantic
Repo
Framework

A Spectral Series Approach to High-Dimensional Nonparametric Regression

Title A Spectral Series Approach to High-Dimensional Nonparametric Regression
Authors Ann B. Lee, Rafael Izbicki
Abstract A key question in modern statistics is how to make fast and reliable inferences for complex, high-dimensional data. While there has been much interest in sparse techniques, current methods do not generalize well to data with nonlinear structure. In this work, we present an orthogonal series estimator for predictors that are complex aggregate objects, such as natural images, galaxy spectra, trajectories, and movies. Our series approach ties together ideas from kernel machine learning, and Fourier methods. We expand the unknown regression on the data in terms of the eigenfunctions of a kernel-based operator, and we take advantage of orthogonality of the basis with respect to the underlying data distribution, P, to speed up computations and tuning of parameters. If the kernel is appropriately chosen, then the eigenfunctions adapt to the intrinsic geometry and dimension of the data. We provide theoretical guarantees for a radial kernel with varying bandwidth, and we relate smoothness of the regression function with respect to P to sparsity in the eigenbasis. Finally, using simulated and real-world data, we systematically compare the performance of the spectral series approach with classical kernel smoothing, k-nearest neighbors regression, kernel ridge regression, and state-of-the-art manifold and local regression methods.
Tasks
Published 2016-02-01
URL http://arxiv.org/abs/1602.00355v1
PDF http://arxiv.org/pdf/1602.00355v1.pdf
PWC https://paperswithcode.com/paper/a-spectral-series-approach-to-high
Repo
Framework

PCANet: An energy perspective

Title PCANet: An energy perspective
Authors Jiasong Wu, Shijie Qiu, Youyong Kong, Longyu Jiang, Lotfi Senhadji, Huazhong Shu
Abstract The principal component analysis network (PCANet), which is one of the recently proposed deep learning architectures, achieves the state-of-the-art classification accuracy in various databases. However, the explanation of the PCANet is lacked. In this paper, we try to explain why PCANet works well from energy perspective point of view based on a set of experiments. The impact of various parameters on the error rate of PCANet is analyzed in depth. It was found that this error rate is correlated with the logarithm of energy of image. The proposed energy explanation approach can be used as a testing method for checking if every step of the constructed networks is necessary.
Tasks
Published 2016-03-03
URL http://arxiv.org/abs/1603.00944v1
PDF http://arxiv.org/pdf/1603.00944v1.pdf
PWC https://paperswithcode.com/paper/pcanet-an-energy-perspective
Repo
Framework

Connectome Smoothing via Low-rank Approximations

Title Connectome Smoothing via Low-rank Approximations
Authors Runze Tang, Michael Ketcha, Alexandra Badea, Evan D. Calabrese, Daniel S. Margulies, Joshua T. Vogelstein, Carey E. Priebe, Daniel L. Sussman
Abstract In statistical connectomics, the quantitative study of brain networks, estimating the mean of a population of graphs based on a sample is a core problem. Often, this problem is especially difficult because the sample or cohort size is relatively small, sometimes even a single subject. While using the element-wise sample mean of the adjacency matrices is a common approach, this method does not exploit any underlying structural properties of the graphs. We propose using a low-rank method which incorporates tools for dimension selection and diagonal augmentation to smooth the estimates and improve performance over the naive methodology for small sample sizes. Theoretical results for the stochastic blockmodel show that this method offers major improvements when there are many vertices. Similarly, we demonstrate that the low-rank methods outperform the standard sample mean for a variety of independent edge distributions as well as human connectome data derived from magnetic resonance imaging, especially when sample sizes are small. Moreover, the low-rank methods yield “eigen-connectomes”, which correlate with the lobe-structure of the human brain and superstructures of the mouse brain. These results indicate that low-rank methods are an important part of the tool box for researchers studying populations of graphs in general, and statistical connectomics in particular.
Tasks
Published 2016-09-06
URL http://arxiv.org/abs/1609.01672v3
PDF http://arxiv.org/pdf/1609.01672v3.pdf
PWC https://paperswithcode.com/paper/connectome-smoothing-via-low-rank
Repo
Framework

Regression-based reduced-order models to predict transient thermal output for enhanced geothermal systems

Title Regression-based reduced-order models to predict transient thermal output for enhanced geothermal systems
Authors M. K. Mudunuru, S. Karra, D. R. Harp, G. D. Guthrie, H. S. Viswanathan
Abstract The goal of this paper is to assess the utility of Reduced-Order Models (ROMs) developed from 3D physics-based models for predicting transient thermal power output for an enhanced geothermal reservoir while explicitly accounting for uncertainties in the subsurface system and site-specific details. Numerical simulations are performed based on Latin Hypercube Sampling (LHS) of model inputs drawn from uniform probability distributions. Key sensitive parameters are identified from these simulations, which are fracture zone permeability, well/skin factor, bottom hole pressure, and injection flow rate. The inputs for ROMs are based on these key sensitive parameters. The ROMs are then used to evaluate the influence of subsurface attributes on thermal power production curves. The resulting ROMs are compared with field-data and the detailed physics-based numerical simulations. We propose three different ROMs with different levels of model parsimony, each describing key and essential features of the power production curves. ROM-1 is able to accurately reproduce the power output of numerical simulations for low values of permeabilities and certain features of the field-scale data, and is relatively parsimonious. ROM-2 is a more complex model than ROM-1 but it accurately describes the field-data. At higher permeabilities, ROM-2 reproduces numerical results better than ROM-1, however, there is a considerable deviation at low fracture zone permeabilities. ROM-3 is developed by taking the best aspects of ROM-1 and ROM-2 and provides a middle ground for model parsimony. It is able to describe various features of numerical simulations and field-data. From the proposed workflow, we demonstrate that the proposed simple ROMs are able to capture various complex features of the power production curves of Fenton Hill HDR system. For typical EGS applications, ROM-2 and ROM-3 outperform ROM-1.
Tasks
Published 2016-06-14
URL http://arxiv.org/abs/1606.04567v3
PDF http://arxiv.org/pdf/1606.04567v3.pdf
PWC https://paperswithcode.com/paper/regression-based-reduced-order-models-to
Repo
Framework

Solving General Arithmetic Word Problems

Title Solving General Arithmetic Word Problems
Authors Subhro Roy, Dan Roth
Abstract This paper presents a novel approach to automatically solving arithmetic word problems. This is the first algorithmic approach that can handle arithmetic problems with multiple steps and operations, without depending on additional annotations or predefined templates. We develop a theory for expression trees that can be used to represent and evaluate the target arithmetic expressions; we use it to uniquely decompose the target arithmetic problem to multiple classification problems; we then compose an expression tree, combining these with world knowledge through a constrained inference framework. Our classifiers gain from the use of {\em quantity schemas} that supports better extraction of features. Experimental results show that our method outperforms existing systems, achieving state of the art performance on benchmark datasets of arithmetic word problems.
Tasks
Published 2016-08-04
URL http://arxiv.org/abs/1608.01413v2
PDF http://arxiv.org/pdf/1608.01413v2.pdf
PWC https://paperswithcode.com/paper/solving-general-arithmetic-word-problems
Repo
Framework

Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method

Title Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method
Authors Lihua Lei, Michael I. Jordan
Abstract We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales, with the number of updates at the faster scale being governed by a geometric random variable. Unlike most existing algorithms in this family, both the computation cost and the communication cost of SCSG do not necessarily scale linearly with the sample size $n$; indeed, these costs are independent of $n$ when the target accuracy is low. An experimental evaluation on real datasets confirms the effectiveness of SCSG.
Tasks
Published 2016-09-12
URL https://arxiv.org/abs/1609.03261v3
PDF https://arxiv.org/pdf/1609.03261v3.pdf
PWC https://paperswithcode.com/paper/less-than-a-single-pass-stochastically
Repo
Framework

Efficient Optimization for Rank-based Loss Functions

Title Efficient Optimization for Rank-based Loss Functions
Authors Pritish Mohapatra, Michal Rolinek, C. V. Jawahar, Vladimir Kolmogorov, M. Pawan Kumar
Abstract The accuracy of information retrieval systems is often measured using complex loss functions such as the average precision (AP) or the normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions does not allow for simple gradient based optimization algorithms. This issue is generally circumvented by either optimizing a structured hinge-loss upper bound to the loss function or by using asymptotic methods like the direct-loss minimization framework. Yet, the high computational complexity of loss-augmented inference, which is necessary for both the frameworks, prohibits its use in large training data sets. To alleviate this deficiency, we present a novel quicksort flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP and NDCG based loss functions. Furthermore, we prove that no comparison based algorithm can improve upon the computational complexity of our approach asymptotically. We demonstrate the effectiveness of our approach in the context of optimizing the structured hinge loss upper bound of AP and NDCG loss for learning models for a variety of vision tasks. We show that our approach provides significantly better results than simpler decomposable loss functions, while requiring a comparable training time.
Tasks Information Retrieval
Published 2016-04-27
URL http://arxiv.org/abs/1604.08269v3
PDF http://arxiv.org/pdf/1604.08269v3.pdf
PWC https://paperswithcode.com/paper/efficient-optimization-for-rank-based-loss
Repo
Framework

The Mondrian Kernel

Title The Mondrian Kernel
Authors Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh
Abstract We introduce the Mondrian kernel, a fast random feature approximation to the Laplace kernel. It is suitable for both batch and online learning, and admits a fast kernel-width-selection procedure as the random features can be re-used efficiently for all kernel widths. The features are constructed by sampling trees via a Mondrian process [Roy and Teh, 2009], and we highlight the connection to Mondrian forests [Lakshminarayanan et al., 2014], where trees are also sampled via a Mondrian process, but fit independently. This link provides a new insight into the relationship between kernel methods and random forests.
Tasks
Published 2016-06-16
URL http://arxiv.org/abs/1606.05241v1
PDF http://arxiv.org/pdf/1606.05241v1.pdf
PWC https://paperswithcode.com/paper/the-mondrian-kernel
Repo
Framework
comments powered by Disqus