January 26, 2020

3075 words 15 mins read

Paper Group ANR 1538

A Memoization Framework for Scaling Submodular Optimization to Large Scale Problems

Title A Memoization Framework for Scaling Submodular Optimization to Large Scale Problems
Authors Rishabh Iyer, Jeff Bilmes
Abstract We are motivated by large scale submodular optimization problems, where standard algorithms that treat the submodular functions in the “value oracle model” do not scale. In this paper, we present a model called the “precomputational complexity model”, along with a unifying memoization based framework, which looks at the specific form of the given submodular function. A key ingredient in this framework is the notion of a “precomputed statistic”, which is maintained in the course of the algorithms. We show that we can easily integrate this idea into a large class of submodular optimization problems including constrained and unconstrained submodular maximization, minimization, difference of submodular optimization, optimization with submodular constraints and several other related optimization problems. Moreover, memoization can be integrated in both discrete and continuous relaxation flavors of algorithms for these problems. We demonstrate this idea for several commonly occurring submodular functions, and show how the precomputational model provides significant speedups compared to the value oracle model. Finally, we empirically demonstrate this for large scale machine learning problems of data subset selection and summarization.
Tasks
Published 2019-02-26
URL http://arxiv.org/abs/1902.10176v1
PDF http://arxiv.org/pdf/1902.10176v1.pdf
PWC https://paperswithcode.com/paper/a-memoization-framework-for-scaling
Repo
Framework
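
To make the precomputed-statistic idea concrete, here is a minimal sketch (not taken from the paper's code) for one commonly used submodular function, facility location. The running statistic `best` stores each point's best similarity to the selected set, so each marginal gain costs O(n) instead of a fresh value-oracle evaluation over the whole selected set; the function name and toy similarity kernel are illustrative assumptions.

```python
import numpy as np

def greedy_facility_location(sim, k):
    """Greedy maximization of the facility-location function
    f(S) = sum_i max_{j in S} sim[i, j] under a cardinality constraint.

    The precomputed statistic `best` holds max_{j in S} sim[i, j] for
    every i, so a marginal gain is O(n) rather than O(n * |S|)."""
    n = sim.shape[0]
    best = np.zeros(n)              # the precomputed statistic
    selected = []
    for _ in range(k):
        # gain of adding j: sum_i max(best[i], sim[i, j]) - sum_i best[i]
        gains = np.maximum(best[None, :], sim.T).sum(axis=1) - best.sum()
        gains[selected] = -np.inf   # never re-pick an element
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[:, j])   # O(n) statistic update
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
sim = X @ X.T                        # toy similarity kernel
print(greedy_facility_location(sim, k=5))
```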

Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy

Title Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy
Authors Ke Sun, Bing Yu, Zhouchen Lin, Zhanxing Zhu
Abstract Regularization plays a crucial role in machine learning models, especially for deep neural networks. The existing regularization techniques mainly rely on the i.i.d. assumption and only employ the information of the current sample, without leveraging the neighboring information between samples. In this work, we propose a general regularizer called Patch-level Neighborhood Interpolation (Pani) that fully exploits the relationship between samples. Furthermore, by explicitly constructing a patch-level graph in the different network layers and interpolating the neighborhood features to refine the representation of the current sample, our Patch-level Neighborhood Interpolation can be applied to enhance two popular regularization strategies, namely Virtual Adversarial Training (VAT) and MixUp, yielding their neighborhood versions. The first derived method, Pani VAT, presents a novel way to construct non-local adversarial smoothness by incorporating patch-level interpolated perturbations. In addition, the Pani MixUp method extends the original MixUp regularization to the patch level and can be further developed into MixMatch, achieving state-of-the-art performance. Finally, extensive experiments are conducted to verify the effectiveness of Patch-level Neighborhood Interpolation in both supervised and semi-supervised settings.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09307v1
PDF https://arxiv.org/pdf/1911.09307v1.pdf
PWC https://paperswithcode.com/paper/patch-level-neighborhood-interpolation-a
Repo
Framework
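
A rough sketch of what patch-level interpolation between a sample and a neighbor might look like, under our reading of the abstract: each patch of one image is mixed with its most similar patch from the other. The patch size, fixed mixing coefficient, and nearest-neighbour matching are assumptions; in actual MixUp-style training the labels would be mixed with the same coefficient.

```python
import numpy as np

def pani_mixup(x_a, x_b, patch=4, lam=0.7):
    """Hypothetical patch-level interpolation between two images of
    shape (C, H, W): each patch of x_a is mixed with its nearest patch
    of x_b, instead of mixing whole images as in standard MixUp."""
    C, H, W = x_a.shape

    def patches(x):
        # split into (num_patches, C * patch * patch) rows
        p = x.reshape(C, H // patch, patch, W // patch, patch)
        return p.transpose(1, 3, 0, 2, 4).reshape(-1, C * patch * patch)

    pa, pb = patches(x_a), patches(x_b)
    # patch-level "graph": nearest x_b patch for every x_a patch
    d = ((pa[:, None, :] - pb[None, :, :]) ** 2).sum(-1)
    mixed = lam * pa + (1 - lam) * pb[d.argmin(axis=1)]
    # fold patches back into an image
    g = mixed.reshape(H // patch, W // patch, C, patch, patch)
    return g.transpose(2, 0, 3, 1, 4).reshape(C, H, W)

rng = np.random.default_rng(0)
out = pani_mixup(rng.normal(size=(3, 32, 32)), rng.normal(size=(3, 32, 32)))
print(out.shape)  # (3, 32, 32)
```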

Visual Context-aware Convolution Filters for Transformation-invariant Neural Network

Title Visual Context-aware Convolution Filters for Transformation-invariant Neural Network
Authors Suraj Tripathi, Abhay Kumar, Chirag Singh
Abstract We propose a novel visual context-aware filter generation module which incorporates contextual information present in images into Convolutional Neural Networks (CNNs). In contrast to traditional CNNs, we do not employ the same set of learned convolution filters for all input image instances. Our proposed input-conditioned convolution filters, when combined with techniques inspired by multi-instance learning and max-pooling, result in a transformation-invariant neural network. We investigated the performance of our proposed framework on three MNIST variations, which cover both rotation and scaling variance, and achieved 1.13% error on MNIST-rot-12k, 1.12% error on Half-rotated MNIST and 0.68% error on Scaling MNIST, which is significantly better than the state-of-the-art results. We use visualization to further demonstrate the effectiveness of our visual context-aware convolution filters. Our proposed visual context-aware convolution filter generation framework can also serve as a plugin for any CNN-based architecture and enhance its modeling capacity.
Tasks
Published 2019-06-15
URL https://arxiv.org/abs/1906.09986v1
PDF https://arxiv.org/pdf/1906.09986v1.pdf
PWC https://paperswithcode.com/paper/visual-context-aware-convolution-filters-for
Repo
Framework
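
A hedged sketch of input-conditioned filter generation, which is not the authors' exact module: a small network maps a pooled summary of each image to that image's own convolution filters, applied per sample via a grouped convolution. All layer sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextAwareConv(nn.Module):
    """Illustrative sketch: a filter-generating network produces a
    separate set of convolution filters for every input instance."""
    def __init__(self, in_ch=1, out_ch=8, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.gen = nn.Linear(in_ch, out_ch * in_ch * k * k)  # context -> filters

    def forward(self, x):                     # x: (B, C, H, W)
        B, C, H, W = x.shape
        ctx = x.mean(dim=(2, 3))              # global context vector, (B, C)
        w = self.gen(ctx).view(B * self.out_ch, C, self.k, self.k)
        # grouped conv folds the batch into channels, so each sample is
        # convolved only with its own generated filters
        y = F.conv2d(x.reshape(1, B * C, H, W), w, padding=self.k // 2, groups=B)
        return y.view(B, self.out_ch, H, W)

x = torch.randn(4, 1, 28, 28)                 # MNIST-sized toy input
print(ContextAwareConv()(x).shape)            # torch.Size([4, 8, 28, 28])
```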

Language Transfer for Early Warning of Epidemics from Social Media

Title Language Transfer for Early Warning of Epidemics from Social Media
Authors Mattias Appelgren, Patrick Schrempf, Matúš Falis, Satoshi Ikeda, Alison Q O’Neil
Abstract Statements on social media can be analysed to identify individuals who are experiencing red flag medical symptoms, allowing early detection of the spread of diseases such as influenza. Since disease does not respect cultural borders and may spread between populations speaking different languages, we would like to build multilingual models. However, the data required to train models for every language may be difficult, expensive and time-consuming to obtain, particularly for low-resource languages. Taking Japanese as our target language, we explore methods by which data in one language might be used to build models for a different language. We evaluate strategies of training on machine-translated data and of zero-shot transfer through the use of multilingual models. We find that the choice of source language impacts performance, with Chinese-Japanese being a better language pair than English-Japanese. Training on machine-translated data shows promise, especially when used in conjunction with a small amount of target-language data.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04519v1
PDF https://arxiv.org/pdf/1910.04519v1.pdf
PWC https://paperswithcode.com/paper/language-transfer-for-early-warning-of
Repo
Framework
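
As a toy illustration of the translate-train strategy (pooling machine-translated source data with a small target-language set), here is a minimal sketch. The example sentences, labels, and character n-gram classifier are all invented and stand in for the paper's actual models and data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical data: source-language posts machine-translated into
# Japanese, plus a small labelled Japanese set (1 = red-flag symptom).
mt_texts = ["インフルエンザで熱がある", "今日は映画を見た"]   # translated source data
mt_labels = [1, 0]
ja_texts = ["喉が痛くて咳が出る", "新しい靴を買った"]        # small target-language set
ja_labels = [1, 0]

# Translate-train: pool both corpora, as the best-performing strategy
# in the abstract suggests. Character n-grams sidestep the lack of
# whitespace tokenisation in Japanese.
model = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
                      LogisticRegression())
model.fit(mt_texts + ja_texts, mt_labels + ja_labels)
print(model.predict(["熱と咳がひどい"]))
```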

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

Title Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning
Authors Harsh Gupta, R. Srikant, Lei Ying
Abstract We study two time-scale linear stochastic approximation algorithms, which can be used to model well-known reinforcement learning algorithms such as GTD, GTD2, and TDC. We present finite-time performance bounds for the case where the learning rate is fixed. The key idea in obtaining these bounds is to use a Lyapunov function motivated by singular perturbation theory for linear differential equations. We use the bound to design an adaptive learning rate scheme which significantly improves the convergence rate over the known optimal polynomial decay rule in our experiments, and can be used to potentially improve the performance of any other schedule where the learning rate is changed at pre-determined time instants.
Tasks
Published 2019-07-14
URL https://arxiv.org/abs/1907.06290v1
PDF https://arxiv.org/pdf/1907.06290v1.pdf
PWC https://paperswithcode.com/paper/finite-time-performance-bounds-and-adaptive
Repo
Framework
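
For context, here is a minimal sketch of one of the named algorithms, TDC, as a two time-scale linear stochastic approximation on a toy Markov reward process. The feature matrix, transition model, and polynomial step-size decays are assumptions, and the paper's adaptive learning-rate scheme is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, d, gamma = 5, 3, 0.9
phi = rng.normal(size=(n_states, d))                # fixed linear features
P = np.full((n_states, n_states), 1.0 / n_states)   # toy uniform-transition MRP
r = rng.normal(size=n_states)

theta = np.zeros(d)     # slow iterate (value-function weights)
w = np.zeros(d)         # fast auxiliary iterate
s = 0
for t in range(1, 20001):
    alpha, beta = 0.5 / t, 0.5 / t ** 0.6          # two time scales: beta >> alpha
    s2 = rng.choice(n_states, p=P[s])
    delta = r[s] + gamma * phi[s2] @ theta - phi[s] @ theta   # TD error
    # TDC updates: theta on the slow scale, w on the fast scale
    theta += alpha * (delta * phi[s] - gamma * phi[s2] * (phi[s] @ w))
    w += beta * (delta - phi[s] @ w) * phi[s]
    s = s2
print(theta)
```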

Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions

Title Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions
Authors Rui Zhang, Tao Yu, He Yang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming Xiong, Richard Socher, Dragomir Radev
Abstract We focus on the cross-domain context-dependent text-to-SQL generation task. Based on the observation that adjacent natural language questions are often linguistically dependent and their corresponding SQL queries tend to overlap, we utilize the interaction history by editing the previously predicted query to improve the generation quality. Our editing mechanism views SQL as sequences and reuses generation results at the token level in a simple manner. It is flexible to change individual tokens and robust to error propagation. Furthermore, to deal with complex table structures in different domains, we employ an utterance-table encoder and a table-aware decoder to incorporate the context of the user utterance and the table schema. We evaluate our approach on the SParC dataset and demonstrate the benefit of editing compared with the state-of-the-art baselines which generate SQL from scratch. Our code is available at https://github.com/ryanzhumich/sparc_atis_pytorch.
Tasks Text-To-Sql
Published 2019-09-02
URL https://arxiv.org/abs/1909.00786v2
PDF https://arxiv.org/pdf/1909.00786v2.pdf
PWC https://paperswithcode.com/paper/editing-based-sql-query-generation-for-cross
Repo
Framework
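
To see why token-level editing pays off, here is a model-free sketch using Python's difflib to diff two consecutive SQL queries as token sequences; the example queries are invented.

```python
import difflib

prev_sql = "SELECT name FROM students WHERE age > 18".split()
gold_sql = "SELECT name FROM students WHERE age > 18 ORDER BY name".split()

# Token-level view of editing: most of the previous query's tokens can
# be kept, and only the few new ones need to be generated.
sm = difflib.SequenceMatcher(a=prev_sql, b=gold_sql)
for op, i1, i2, j1, j2 in sm.get_opcodes():
    print(op, prev_sql[i1:i2], "->", gold_sql[j1:j2])
kept = sum(i2 - i1 for op, i1, i2, _, _ in sm.get_opcodes() if op == "equal")
print(f"{kept}/{len(gold_sql)} tokens reusable from the previous query")
```

In the actual model, the decoder learns these keep/generate decisions rather than computing them from the gold query.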

Augmenting Variational Autoencoders with Sparse Labels: A Unified Framework for Unsupervised, Semi-(un)supervised, and Supervised Learning

Title Augmenting Variational Autoencoders with Sparse Labels: A Unified Framework for Unsupervised, Semi-(un)supervised, and Supervised Learning
Authors Felix Berkhahn, Richard Keys, Wajih Ouertani, Nikhil Shetty, Dominik Geißler
Abstract We present a new flavor of Variational Autoencoder (VAE) that interpolates seamlessly between unsupervised, semi-supervised and fully supervised learning domains. We show that unlabeled datapoints not only boost unsupervised tasks, but also improve classification performance. Vice versa, every label not only improves classification, but also benefits unsupervised tasks. The proposed architecture is simple: a classification layer is connected to the topmost encoder layer, and then combined with the resampled latent layer for the decoder. The usual evidence lower bound (ELBO) loss is supplemented with a supervised loss term on this classification layer that is only applied for labeled datapoints. This simplicity allows any existing VAE model to be extended to our proposed semi-supervised framework with minimal effort. In the context of classification, we found that this approach even outperforms a direct supervised setup.
Tasks
Published 2019-08-08
URL https://arxiv.org/abs/1908.03015v2
PDF https://arxiv.org/pdf/1908.03015v2.pdf
PWC https://paperswithcode.com/paper/one-model-to-rule-them-all
Repo
Framework
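
The described architecture is simple enough to sketch directly. Below is a minimal PyTorch version under assumed layer sizes (MLP encoder/decoder, MSE reconstruction), with the supervised cross-entropy applied only when labels are passed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemiVAE(nn.Module):
    """Minimal sketch: a classifier head on the topmost encoder layer,
    concatenated with the resampled latent code before decoding."""
    def __init__(self, x_dim=784, h_dim=256, z_dim=32, n_classes=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu, self.logvar = nn.Linear(h_dim, z_dim), nn.Linear(h_dim, z_dim)
        self.cls = nn.Linear(h_dim, n_classes)           # classification layer
        self.dec = nn.Sequential(nn.Linear(z_dim + n_classes, h_dim),
                                 nn.ReLU(), nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        logits = self.cls(h)
        x_hat = self.dec(torch.cat([z, logits.softmax(-1)], dim=-1))
        return x_hat, mu, logvar, logits

def loss(model, x, y=None):
    x_hat, mu, logvar, logits = model(x)
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()
    elbo = recon + kl
    if y is not None:                 # supervised term, labelled data only
        elbo = elbo + F.cross_entropy(logits, y, reduction="sum")
    return elbo

m = SemiVAE()
x = torch.rand(8, 784)
print(loss(m, x).item(), loss(m, x, torch.randint(0, 10, (8,))).item())
```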

Quasi-Newton Optimization Methods For Deep Learning Applications

Title Quasi-Newton Optimization Methods For Deep Learning Applications
Authors Jacob Rafati, Roummel F. Marcia
Abstract Deep learning algorithms often require solving a highly non-linear and nonconvex unconstrained optimization problem. Methods for solving optimization problems in large-scale machine learning, such as deep learning and deep reinforcement learning (RL), are generally restricted to the class of first-order algorithms, like stochastic gradient descent (SGD). While SGD iterates are inexpensive to compute, they have slow theoretical convergence rates. Furthermore, they require exhaustive trial-and-error to fine-tune many learning parameters. Using second-order curvature information to find search directions can help with more robust convergence for non-convex optimization problems. However, computing Hessian matrices for large-scale problems is not computationally practical. Alternatively, quasi-Newton methods construct an approximation of the Hessian matrix to build a quadratic model of the objective function. Quasi-Newton methods, like SGD, require only first-order gradient information, but they can result in superlinear convergence, which makes them attractive alternatives to SGD. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) approach is one of the most popular quasi-Newton methods, constructing positive definite Hessian approximations. In this chapter, we propose efficient optimization methods based on L-BFGS quasi-Newton methods using line search and trust-region strategies. Our methods bridge the disparity between first- and second-order methods by using gradient information to calculate low-rank updates to Hessian approximations. We provide formal convergence analysis of these methods as well as empirical results on deep learning applications, such as image classification tasks and deep reinforcement learning on a set of ATARI 2600 video games. Our results show a robust convergence with preferred generalization characteristics as well as fast training time.
Tasks Image Classification
Published 2019-09-04
URL https://arxiv.org/abs/1909.01994v1
PDF https://arxiv.org/pdf/1909.01994v1.pdf
PWC https://paperswithcode.com/paper/quasi-newton-optimization-methods-for-deep
Repo
Framework
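
As background for the chapter's building blocks, here is a compact sketch of the standard L-BFGS two-loop recursion with an Armijo backtracking line search on a toy quadratic. The memory size, step constants, and test problem are assumptions, not the chapter's experimental setup.

```python
import numpy as np

def lbfgs_direction(grad, s_hist, y_hist):
    """Two-loop recursion: returns -H @ grad, where H is the L-BFGS
    inverse-Hessian approximation built from curvature pairs
    s_i = x_{i+1} - x_i and y_i = g_{i+1} - g_i."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append((a, rho, s, y))
    if s_hist:   # scale by gamma_k * I as the initial Hessian guess
        q *= (s_hist[-1] @ y_hist[-1]) / (y_hist[-1] @ y_hist[-1])
    for a, rho, s, y in reversed(alphas):
        q += (a - rho * (y @ q)) * s
    return -q

# Toy problem: ill-conditioned quadratic f(x) = 0.5 x^T A x.
rng = np.random.default_rng(0)
A = np.diag(np.linspace(1, 100, 20))
f = lambda x: 0.5 * x @ A @ x
x = rng.normal(size=20)
s_hist, y_hist, g_prev = [], [], None
for it in range(30):
    g = A @ x
    if g_prev is not None:
        s_hist.append(step)
        y_hist.append(g - g_prev)
        s_hist, y_hist = s_hist[-10:], y_hist[-10:]   # memory m = 10
    d = lbfgs_direction(g, s_hist, y_hist)
    t = 1.0
    while f(x + t * d) > f(x) + 1e-4 * t * (g @ d):   # Armijo line search
        t *= 0.5
    step = t * d
    x, g_prev = x + step, g
print(f(x))   # should be near 0 after a few dozen iterations
```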

Logical Segmentation of Source Code

Title Logical Segmentation of Source Code
Authors Jacob Dormuth, Ben Gelman, Jessica Moore, David Slater
Abstract Many software analysis methods have come to rely on machine learning approaches. Code segmentation - the process of decomposing source code into meaningful blocks - can augment these methods by featurizing code, reducing noise, and limiting the problem space. Traditionally, code segmentation has been done using syntactic cues; current approaches do not intentionally capture logical content. We develop a novel deep learning approach to generate logical code segments regardless of the language or syntactic correctness of the code. Due to the lack of logically segmented source code, we introduce a unique data set construction technique to approximate ground truth for logically segmented code. Logical code segmentation can improve tasks such as automatically commenting code, detecting software vulnerabilities, repairing bugs, labeling code functionality, and synthesizing new code.
Tasks
Published 2019-07-18
URL https://arxiv.org/abs/1907.08615v1
PDF https://arxiv.org/pdf/1907.08615v1.pdf
PWC https://paperswithcode.com/paper/logical-segmentation-of-source-code
Repo
Framework

End-to-End Adversarial Shape Learning for Abdomen Organ Deep Segmentation

Title End-to-End Adversarial Shape Learning for Abdomen Organ Deep Segmentation
Authors Jinzheng Cai, Yingda Xia, Dong Yang, Daguang Xu, Lin Yang, Holger Roth
Abstract Automatic segmentation of abdominal organs using medical imaging has many potential applications in clinical workflows. Recently, the state-of-the-art performance for organ segmentation has been achieved by deep learning models, i.e., convolutional neural networks (CNNs). However, it is challenging to train conventional CNN-based segmentation models to be aware of the shape and topology of organs. In this work, we tackle this problem by introducing a novel end-to-end shape learning architecture, the organ point-network. It takes deep learning features as inputs and generates organ shape representations as points located on the organ surface. We then present a novel adversarial shape learning objective function to optimize the point-network to better capture shape information. We train the point-network together with a CNN-based segmentation model in a multi-task fashion so that the shared network parameters can benefit from both the shape learning and segmentation tasks. We demonstrate our method on three challenging abdominal organs: liver, spleen, and pancreas. The point-network generates surface points with fine-grained details, which proves critical for improving organ segmentation. Consequently, the deep segmentation model is improved by the introduced shape learning, as significantly better Dice scores are observed for spleen and pancreas segmentation.
Tasks Pancreas Segmentation
Published 2019-10-15
URL https://arxiv.org/abs/1910.06474v1
PDF https://arxiv.org/pdf/1910.06474v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-adversarial-shape-learning-for
Repo
Framework
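
A sketch of the multi-task combination described above, with one loud caveat: the paper's adversarial shape objective is not reproduced here, so a plain Chamfer distance between point sets stands in as a generic shape-matching term, and the Dice term, task weighting, and toy data are assumptions.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a binary organ mask (pred in [0, 1])."""
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def chamfer(p, q):
    """Symmetric Chamfer distance between two surface point sets, a
    common generic stand-in for a shape term on point predictions."""
    d = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

rng = np.random.default_rng(0)
pred_mask = rng.random((32, 32, 32))                 # toy soft segmentation
gt_mask = (rng.random((32, 32, 32)) > 0.5) * 1.0     # toy ground truth
pred_pts, gt_pts = rng.random((200, 3)), rng.random((200, 3))
lam = 0.1   # assumed weighting between the two tasks
print(dice_loss(pred_mask, gt_mask) + lam * chamfer(pred_pts, gt_pts))
```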

Toxicity Prediction by Multimodal Deep Learning

Title Toxicity Prediction by Multimodal Deep Learning
Authors Abdul Karim, Jaspreet Singh, Avinash Mishra, Abdollah Dehzangi, M. A. Hakim Newton, Abdul Sattar
Abstract Prediction of toxicity levels of chemical compounds is an important issue in Quantitative Structure-Activity Relationship (QSAR) modeling. Although toxicity prediction has achieved significant progress in recent times through deep learning, prediction accuracy levels obtained by even very recent methods are not yet very high. We propose a multimodal deep learning method using multiple heterogeneous neural network types and data representations. We represent chemical compounds by strings, images, and numerical features. We train fully connected, convolutional, and recurrent neural networks and their ensembles. Each data representation or neural network type has its own strengths and weaknesses. Our motivation is to obtain a collective performance that could go beyond the individual performance of each data representation or each neural network type. On a standard toxicity benchmark, our proposed method obtains significantly better accuracy levels than those of the state-of-the-art toxicity prediction methods.
Tasks
Published 2019-07-19
URL https://arxiv.org/abs/1907.08333v1
PDF https://arxiv.org/pdf/1907.08333v1.pdf
PWC https://paperswithcode.com/paper/toxicity-prediction-by-multimodal-deep
Repo
Framework
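
The collective-performance idea reduces, in its simplest form, to soft voting over the modality-specific networks. The sketch below uses invented probabilities, and the paper's actual ensembling scheme may differ.

```python
import numpy as np

# Hypothetical per-model toxicity probabilities for 4 compounds, one
# row per modality-specific network.
preds = np.array([
    [0.91, 0.12, 0.55, 0.40],   # recurrent net on SMILES strings
    [0.80, 0.25, 0.61, 0.35],   # CNN on 2D structure images
    [0.95, 0.08, 0.40, 0.70],   # fully connected net on numeric features
])
ensemble = preds.mean(axis=0)           # simple soft-voting ensemble
print((ensemble > 0.5).astype(int))     # [1 0 1 0]
```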

Power of the Few: Analyzing the Impact of Influential Users in Collaborative Recommender Systems

Title Power of the Few: Analyzing the Impact of Influential Users in Collaborative Recommender Systems
Authors Farzad Eskandanian, Nasim Sonboli, Bamshad Mobasher
Abstract Like other social systems, in collaborative filtering a small number of “influential” users may have a large impact on the recommendations of other users, thus affecting the overall behavior of the system. Identifying influential users and studying their impact on other users is an important problem because it provides insight into how small groups can inadvertently or intentionally affect the behavior of the system as a whole. Modeling these influences can also shed light on patterns and relationships that would otherwise be difficult to discern, hopefully leading to more transparency in how the system generates personalized content. In this work we first formalize the notion of “influence” in collaborative filtering using an Influence Discrimination Model. We then empirically identify and characterize influential users and analyze their impact on the system under different underlying recommendation algorithms and across three different recommendation domains: job, movie and book recommendations. Insights from these experiments can help in designing systems that are not only optimized for accuracy, but are also tuned to mitigate the impact of influential users when it might lead to potential imbalance or unfairness in the system’s outcomes.
Tasks Recommendation Systems
Published 2019-05-14
URL https://arxiv.org/abs/1905.08031v1
PDF https://arxiv.org/pdf/1905.08031v1.pdf
PWC https://paperswithcode.com/paper/power-of-the-few-analyzing-the-impact-of
Repo
Framework

Bit Efficient Quantization for Deep Neural Networks

Title Bit Efficient Quantization for Deep Neural Networks
Authors Prateeth Nayak, David Zhang, Sek Chai
Abstract Quantization for deep neural networks has afforded models for edge devices that use less on-board memory and enable efficient low-power inference. In this paper, we present a comparison of model-parameter-driven quantization approaches that can achieve as low as 3-bit precision without affecting accuracy. The post-training quantization approaches are data-free, and the resulting weight values are closely tied to the distribution of the dataset on which the model converged to optimality. We show quantization results for a number of state-of-the-art deep neural networks (DNNs) using large datasets like ImageNet. To better analyze the quantization results, we describe the overall range and local sparsity of values afforded by various quantization schemes. We show methods to lower bit precision beyond quantization limits with object class clustering.
Tasks Quantization
Published 2019-10-07
URL https://arxiv.org/abs/1910.04877v1
PDF https://arxiv.org/pdf/1910.04877v1.pdf
PWC https://paperswithcode.com/paper/bit-efficient-quantization-for-deep-neural
Repo
Framework
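
As a reference point for what post-training, data-free quantization does, here is a sketch of plain uniform symmetric quantization to k bits; the paper compares several model-parameter-driven schemes, of which this is only the simplest instance.

```python
import numpy as np

def quantize(w, bits=3):
    """Uniform symmetric post-training quantization to 2^bits levels;
    data-free, driven only by the converged weight range."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(w / scale), -2 ** (bits - 1), 2 ** (bits - 1) - 1)
    return q * scale                      # dequantized weights for inference

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=10000)    # toy converged layer weights
for b in (8, 4, 3):
    err = np.abs(quantize(w, b) - w).mean()
    print(f"{b}-bit mean abs error: {err:.5f}")
```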

Infeasibility and structural bias in Differential Evolution

Title Infeasibility and structural bias in Differential Evolution
Authors Fabio Caraffini, Anna V. Kononova, David Corne
Abstract This paper thoroughly investigates a range of popular DE configurations to identify the components responsible for the emergence of structural bias, a recently identified tendency of the algorithm to prefer some regions of the search space for reasons unrelated to the objective function values. This tendency was already studied in GA and PSO, where a connection was established between the strength of structural bias and population size, and potential weaknesses of these algorithms were highlighted. For DE, this study goes further and extends the range of aspects that can contribute to the presence of structural bias by including an algorithmic component which is usually overlooked: the constraint handling technique. A wide range of DE configurations were subjected to the protocol for testing for bias. Results suggest that the triggering mechanism for the bias in DE differs from the one previously found for GA and PSO: no clear dependency on population size exists. The setting of DE parameters is based on a separate study, which on its own leads to interesting directions for new research. Overall, DE turned out to be robust against structural bias: only DE/current-to-best/1/bin is clearly biased, and this effect is mitigated by the use of a penalty constraint handling technique.
Tasks
Published 2019-01-18
URL http://arxiv.org/abs/1901.06153v1
PDF http://arxiv.org/pdf/1901.06153v1.pdf
PWC https://paperswithcode.com/paper/infeasibility-and-structural-bias-in
Repo
Framework
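
For readers unfamiliar with the named configuration, here is a sketch of one generation of DE/current-to-best/1/bin with a penalty-style constraint handler (infeasible trials are kept but heavily penalised). The parameter values, toy objective, and random-index details are assumptions rather than the paper's protocol.

```python
import numpy as np

sphere = lambda x: ((x - 0.3) ** 2).sum()        # toy objective on [0, 1]^d

def de_current_to_best(pop, fit, F=0.8, CR=0.9, rng=np.random.default_rng(0)):
    """One generation of DE/current-to-best/1/bin with penalty handling:
    out-of-bounds trials keep their coordinates but are penalised."""
    n, d = pop.shape
    best = pop[np.argmin(fit)]
    new_pop, new_fit = pop.copy(), fit.copy()
    for i in range(n):
        r1, r2 = rng.choice(n, size=2, replace=False)
        mutant = pop[i] + F * (best - pop[i]) + F * (pop[r1] - pop[r2])
        cross = rng.random(d) < CR
        cross[rng.integers(d)] = True            # binomial crossover
        trial = np.where(cross, mutant, pop[i])
        f = sphere(trial)
        if ((trial < 0) | (trial > 1)).any():    # penalty constraint handling
            f += 1e6
        if f < fit[i]:                           # greedy selection
            new_pop[i], new_fit[i] = trial, f
    return new_pop, new_fit

rng = np.random.default_rng(0)
pop = rng.random((20, 5))
fit = np.array([sphere(p) for p in pop])
for _ in range(100):
    pop, fit = de_current_to_best(pop, fit)
print(fit.min())
```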

Mixed Variational Inference

Title Mixed Variational Inference
Authors Nikolaos Gianniotis
Abstract The Laplace approximation has been one of the workhorses of Bayesian inference. It often delivers good approximations in practice despite the fact that it does not strictly take into account where the volume of posterior density lies. Variational approaches avoid this issue by explicitly minimising the Kullback-Leibler divergence D_KL between a postulated posterior and the true (unnormalised) logarithmic posterior. However, they rely on a closed-form D_KL in order to update the variational parameters. To address this, stochastic versions of variational inference have been devised that approximate the intractable D_KL with a Monte Carlo average. This approximation allows calculating gradients with respect to the variational parameters. However, variational methods often postulate a factorised Gaussian approximating posterior. In doing so, they sacrifice a-posteriori correlations. In this work, we propose a method that combines the Laplace approximation with the variational approach. The advantages are that we maintain applicability to non-conjugate models, posterior correlations, and a reduced number of free variational parameters. Numerical experiments demonstrate improvement over the Laplace approximation and variational inference with factorised Gaussian posteriors.
Tasks Bayesian Inference
Published 2019-01-15
URL https://arxiv.org/abs/1901.04791v3
PDF https://arxiv.org/pdf/1901.04791v3.pdf
PWC https://paperswithcode.com/paper/mixed-variational-inference
Repo
Framework
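
To ground the two ingredients being combined, here is a sketch on a one-dimensional toy posterior: a Laplace fit at the mode, next to a crude Monte Carlo minimisation of D_KL standing in for the stochastic variational updates the abstract mentions. The grid search, sample size, and toy target are assumptions.

```python
import numpy as np

# Unnormalised log-posterior of a toy skewed target (a Gamma(3, 1)).
log_post = lambda t: 2 * np.log(t) - t
grad = lambda t: 2 / t - 1
hess = lambda t: -2 / t ** 2

# Laplace approximation: a Gaussian fitted at the mode via Newton steps,
# with variance from the local curvature.
t = 1.0
for _ in range(50):
    t -= grad(t) / hess(t)
lap_mu, lap_sd = t, (-1 / hess(t)) ** 0.5

# Variational side: minimise a Monte Carlo estimate of KL(q || p) over a
# grid of (mu, sigma), a crude stand-in for stochastic variational updates.
rng = np.random.default_rng(0)
eps = rng.normal(size=4000)
best = (np.inf, None, None)
for mu in np.linspace(1.0, 4.0, 61):
    for s in np.exp(np.linspace(-1.0, 1.0, 41)):
        z = np.clip(mu + s * eps, 1e-3, None)    # reparameterised samples
        kl = -np.log(s) - log_post(z).mean()     # E_q[log q - log p] + const
        if kl < best[0]:
            best = (kl, mu, s)
print("Laplace  :", lap_mu, lap_sd)              # mode-centred fit
print("KL-fitted:", best[1], best[2])            # mass-aware fit
```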