February 1, 2020

3121 words 15 mins read

Paper Group AWR 171

Improving Generalization of Deep Networks for Inverse Reconstruction of Image Sequences

Title Improving Generalization of Deep Networks for Inverse Reconstruction of Image Sequences
Authors Sandesh Ghimire, Prashnna Kumar Gyawali, Jwala Dhamala, John L Sapp, Milan Horacek, Linwei Wang
Abstract Deep learning networks have shown state-of-the-art performance in many image reconstruction problems. However, it is not well understood what properties of representation and learning may improve the generalization ability of the network. In this paper, we propose that the generalization ability of an encoder-decoder network for inverse reconstruction can be improved in two ways. First, drawing from analytical learning theory, we theoretically show that a stochastic latent space will improve the ability of a network to generalize to test data outside the training distribution. Second, following the information bottleneck principle, we show that a latent representation minimally informative of the input data will help a network generalize to unseen input variations that are irrelevant to the output reconstruction. Therefore, we present a sequence image reconstruction network optimized by a variational approximation of the information bottleneck principle with stochastic latent space. In the application setting of reconstructing the sequence of cardiac transmembrane potential from body-surface potential, we assess the two types of generalization abilities of the presented network against its deterministic counterpart. The results demonstrate that the generalization ability of an inverse reconstruction network can be improved by stochasticity as well as the information bottleneck.
Tasks Image Reconstruction
Published 2019-03-05
URL http://arxiv.org/abs/1903.02948v1
PDF http://arxiv.org/pdf/1903.02948v1.pdf
PWC https://paperswithcode.com/paper/improving-generalization-of-deep-networks-for
Repo https://github.com/sandeshgh/Improving-Generalization
Framework pytorch
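
The two ingredients described above, a stochastic latent space and an information-bottleneck penalty, combine into a familiar objective: reconstruct the output while penalizing the KL divergence between the latent posterior and a prior. A minimal PyTorch sketch in that spirit (not the authors' architecture; layer sizes and the weight `beta` are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticEncoderDecoder(nn.Module):
    """Encoder-decoder with a stochastic latent space (illustrative sizes)."""
    def __init__(self, in_dim=256, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, 128)
        self.mu = nn.Linear(128, latent_dim)      # posterior mean
        self.logvar = nn.Linear(128, latent_dim)  # posterior log-variance
        self.dec = nn.Linear(latent_dim, in_dim)

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def vib_loss(recon, target, mu, logvar, beta=1e-3):
    # Reconstruction term plus KL(q(z|x) || N(0, I)): the IB trade-off.
    rec = F.mse_loss(recon, target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl
```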

YOLACT: Real-time Instance Segmentation

Title YOLACT: Real-time Instance Segmentation
Authors Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee
Abstract We present a simple, fully-convolutional model for real-time instance segmentation that achieves 29.8 mAP on MS COCO at 33.5 fps evaluated on a single Titan Xp, which is significantly faster than any previous competitive approach. Moreover, we obtain this result after training on only one GPU. We accomplish this by breaking instance segmentation into two parallel subtasks: (1) generating a set of prototype masks and (2) predicting per-instance mask coefficients. Then we produce instance masks by linearly combining the prototypes with the mask coefficients. We find that because this process doesn’t depend on repooling, this approach produces very high-quality masks and exhibits temporal stability for free. Furthermore, we analyze the emergent behavior of our prototypes and show they learn to localize instances on their own in a translation variant manner, despite being fully-convolutional. Finally, we also propose Fast NMS, a drop-in 12 ms faster replacement for standard NMS that only has a marginal performance penalty.
Tasks Instance Segmentation, Real-time Instance Segmentation, Semantic Segmentation
Published 2019-04-04
URL https://arxiv.org/abs/1904.02689v2
PDF https://arxiv.org/pdf/1904.02689v2.pdf
PWC https://paperswithcode.com/paper/yolact-real-time-instance-segmentation
Repo https://github.com/BigThreeMI/Utils
Framework tf
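
The mask-assembly step the abstract describes is a single linear combination: each instance mask is a coefficient-weighted sum of shared prototype masks, passed through a sigmoid. A minimal sketch (shapes are illustrative, not the paper's exact configuration):

```python
import numpy as np

def assemble_masks(prototypes, coefficients):
    """Combine k prototype masks (h, w, k) with per-instance
    coefficients (n, k) into n instance masks (n, h, w)."""
    # Linear combination, then sigmoid; no repooling is involved,
    # which is why the masks stay high-resolution.
    logits = np.einsum('hwk,nk->nhw', prototypes, coefficients)
    return 1.0 / (1.0 + np.exp(-logits))

# Example: 32 prototypes at 138x138, 5 detected instances (toy values).
masks = assemble_masks(np.random.randn(138, 138, 32),
                       np.random.randn(5, 32))
```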

A Unified Linear-Time Framework for Sentence-Level Discourse Parsing

Title A Unified Linear-Time Framework for Sentence-Level Discourse Parsing
Authors Xiang Lin, Shafiq Joty, Prathyusha Jwalapuram, M Saiful Bari
Abstract We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). Our framework comprises a discourse segmenter to identify the elementary discourse units (EDU) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. Both the segmenter and the parser are based on Pointer Networks and operate in linear time. Our segmenter yields an $F_1$ score of 95.4, and our parser achieves an $F_1$ score of 81.7 on the aggregated labeled (relation) metric, surpassing previous approaches by a good margin and approaching human agreement on both tasks (98.3 and 83.0 $F_1$).
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.05682v2
PDF https://arxiv.org/pdf/1905.05682v2.pdf
PWC https://paperswithcode.com/paper/a-unified-linear-time-framework-for-sentence
Repo https://github.com/shawnlimn/UnifiedParser_RST
Framework pytorch
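
Both components rest on the same pointer mechanism: a decoder state attends over encoder states and "points" at one input position (a boundary for the segmenter, a split point for the top-down parser). A rough sketch, with dot-product scoring and all dimensions as simplifying assumptions:

```python
import torch
import torch.nn.functional as F

def pointer_step(decoder_state, encoder_states):
    """decoder_state: (d,); encoder_states: (seq_len, d).
    Returns a distribution over input positions and the argmax pointer."""
    scores = encoder_states @ decoder_state   # (seq_len,) attention scores
    probs = F.softmax(scores, dim=0)
    return probs, int(torch.argmax(probs))

# Pointing to a split position in a 10-token span (toy tensors).
probs, split = pointer_step(torch.randn(64), torch.randn(10, 64))
```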

DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding

Title DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding
Authors Hyungro Lee, Heng Ma, Matteo Turilli, Debsindhu Bhowmik, Shantenu Jha, Arvind Ramanathan
Abstract Simulations of biological macromolecules play an important role in understanding the physical basis of a number of complex processes such as protein folding. Even with increasing computational power and the evolution of specialized architectures, the ability to simulate protein folding at atomistic scales remains challenging. This stems from the dual aspects of the high dimensionality of protein conformational landscapes, and the inability of atomistic molecular dynamics (MD) simulations to sufficiently sample these landscapes to observe folding events. Machine learning/deep learning (ML/DL) techniques, when combined with atomistic MD simulations, offer the opportunity to potentially overcome these limitations by: (1) effectively reducing the dimensionality of MD simulations to automatically build latent representations that correspond to biophysically relevant reaction coordinates (RCs), and (2) driving MD simulations to automatically sample potentially novel conformational states based on these RCs. We examine how coupling DL approaches with MD simulations can fold small proteins effectively on supercomputers. In particular, we study the computational costs and effectiveness of scaling DL-coupled MD workflows by folding two prototypical systems, viz., Fs-peptide and the fast-folding variant of the villin headpiece protein. We demonstrate that a DL-driven MD workflow is able to effectively learn latent representations and drive adaptive simulations. Compared to traditional MD-based approaches, our approach achieves an effective performance gain in sampling the folded states of at least 2.3x. Our study provides a quantitative basis for understanding how DL-driven MD simulations can lead to effective performance gains and reduced times to solution on supercomputing resources.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07817v1
PDF https://arxiv.org/pdf/1909.07817v1.pdf
PWC https://paperswithcode.com/paper/deepdrivemd-deep-learning-driven-adaptive
Repo https://github.com/braceal/DeepDriveMD
Framework pytorch
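
The workflow couples MD sampling with representation learning in a loop: simulate, learn a latent space, find under-sampled states, respawn. The sketch below mirrors that control flow only; every stage is a toy stub (a random-walk "MD" engine, an SVD projection in place of the autoencoder), not the DeepDriveMD API.

```python
import numpy as np

def run_md(states):                      # stub MD: random walk in R^3
    return states + 0.1 * np.random.randn(*states.shape)

def encode(confs):                       # stub "autoencoder": PCA-like projection
    centered = confs - confs.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T           # 2-D latent space

def pick_outliers(latent, k=4):          # stub outlier detection in latent space
    dist = np.linalg.norm(latent - latent.mean(axis=0), axis=1)
    return np.argsort(dist)[-k:]

states = np.random.randn(16, 3)          # 16 walkers, toy 3-D "conformations"
for _ in range(10):
    states = run_md(states)              # 1. run an ensemble of simulations
    latent = encode(states)              # 2. learn a low-dim representation
    idx = pick_outliers(latent)          # 3. pick novel / under-sampled states
    states = states[idx].repeat(4, axis=0)  # 4. respawn simulations from them
```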

Accelerated Experimental Design for Pairwise Comparisons

Title Accelerated Experimental Design for Pairwise Comparisons
Authors Yuan Guo, Jennifer Dy, Deniz Erdogmus, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Stratis Ioannidis
Abstract Pairwise comparison labels are more informative and less variable than class labels, but generating them poses a challenge: their number grows quadratically in the dataset size. We study a natural experimental design objective, namely, D-optimality, that can be used to identify which $K$ pairwise comparisons to generate. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naïve greedy implementation has $O(N^2d^2K)$ complexity, where $N$ is the dataset size, $d$ is the feature space dimension, and $K$ is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset, namely that it consists of pairwise comparisons, the greedy algorithm’s complexity can be reduced to $O(N^2(K+d)+N(dK+d^2)+d^2K)$. We apply the same acceleration also to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with $10^8$ comparisons; the naïve greedy algorithm on the same dataset would require more than 10 days to terminate.
Tasks
Published 2019-01-18
URL http://arxiv.org/abs/1901.06080v1
PDF http://arxiv.org/pdf/1901.06080v1.pdf
PWC https://paperswithcode.com/paper/accelerated-experimental-design-for-pairwise
Repo https://github.com/neu-spiral/AcceleratedExperimentalDesign
Framework none
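
The greedy step is cheap to state even before the paper's accelerations: by the matrix determinant lemma, adding a comparison with feature vector $v = x_i - x_j$ to information matrix $A$ increases $\log\det A$ by $\log(1 + v^T A^{-1} v)$, so greedy selection only needs the quadratic form. A sketch of exactly the naïve $O(N^2d^2K)$ baseline the paper speeds up, not the accelerated algorithm:

```python
import numpy as np
from itertools import combinations

def greedy_d_optimal(X, K, reg=1e-3):
    """Naive greedy D-optimal selection of K pairwise comparisons.
    X: (N, d) item features."""
    N, d = X.shape
    A = reg * np.eye(d)                   # regularized information matrix
    pairs = list(combinations(range(N), 2))
    chosen = []
    for _ in range(K):
        Ainv = np.linalg.inv(A)
        # Determinant-lemma gain of pair (i, j) is log(1 + v^T A^{-1} v)
        # with v = x_i - x_j; log is monotone, so rank by the quadratic form.
        V = np.array([X[i] - X[j] for i, j in pairs])
        gains = np.einsum('nd,dk,nk->n', V, Ainv, V)
        i, j = pairs.pop(int(np.argmax(gains)))
        v = X[i] - X[j]
        A += np.outer(v, v)               # rank-one update of the design
        chosen.append((i, j))
    return chosen

picks = greedy_d_optimal(np.random.randn(30, 5), K=10)
```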

Rethinking the CSC Model for Natural Images

Title Rethinking the CSC Model for Natural Images
Authors Dror Simon, Michael Elad
Abstract Sparse representation with respect to an overcomplete dictionary is often used when regularizing inverse problems in signal and image processing. In recent years, the Convolutional Sparse Coding (CSC) model, in which the dictionary consists of shift-invariant filters, has gained renewed interest. While this model has been successfully used in some image processing problems, it still falls behind traditional patch-based methods on simple tasks such as denoising. In this work we provide new insights regarding the CSC model and its capability to represent natural images, and suggest a Bayesian connection between this model and its patch-based ancestor. Armed with these observations, we suggest a novel feed-forward network that follows an MMSE approximation process for the CSC model, using strided convolutions. The performance of this supervised architecture is shown to be on par with state-of-the-art methods while using far fewer parameters.
Tasks Denoising
Published 2019-09-12
URL https://arxiv.org/abs/1909.05742v1
PDF https://arxiv.org/pdf/1909.05742v1.pdf
PWC https://paperswithcode.com/paper/rethinking-the-csc-model-for-natural-images
Repo https://github.com/drorsimon/CSCNet
Framework pytorch
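
For orientation, one ISTA iteration for the CSC model: the image is synthesized by a (here strided) convolutional dictionary, and codes are updated by a gradient step plus soft-thresholding. Filter count, stride, and step size below are illustrative; the paper's network is a learned feed-forward MMSE approximation, not plain ISTA.

```python
import torch
import torch.nn.functional as F

def ista_step(x, z, D, lam=0.1, alpha=0.1, stride=2):
    """One ISTA iteration for x ~= conv_transpose(z, D).
    x: (1, 1, H, W) image; z: (1, m, h, w) codes; D: (m, 1, k, k) filters."""
    residual = x - F.conv_transpose2d(z, D, stride=stride)  # x - Dz
    grad = F.conv2d(residual, D, stride=stride)             # D^T (x - Dz)
    z = z + alpha * grad
    # Soft-threshold: the proximal step for the l1 sparsity penalty.
    return torch.sign(z) * torch.clamp(z.abs() - alpha * lam, min=0)

x = torch.randn(1, 1, 32, 32)
D = torch.randn(8, 1, 4, 4)        # 8 shift-invariant filters
z = torch.zeros(1, 8, 15, 15)
for _ in range(20):
    z = ista_step(x, z, D)
```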

Modulation of early visual processing alleviates capacity limits in solving multiple tasks

Title Modulation of early visual processing alleviates capacity limits in solving multiple tasks
Authors Sushrut Thorat, Giacomo Aldegheri, Marcel A. J. van Gerven, Marius V. Peelen
Abstract In daily life situations, we have to perform multiple tasks given a visual stimulus, which requires task-relevant information to be transmitted through our visual system. When it is not possible to transmit all the possibly relevant information to higher layers, due to a bottleneck, task-based modulation of early visual processing might be necessary. In this work, we report how the effectiveness of modulating the early processing stage of an artificial neural network depends on the information bottleneck faced by the network. The bottleneck is quantified by the number of tasks the network has to perform and the neural capacity of the later stage of the network. The effectiveness is gauged by the performance on multiple object detection tasks, where the network is trained with a recent multi-task optimization scheme. By associating neural modulations with task-based switching of the state of the network and characterizing when such switching is helpful in early processing, our results provide a functional perspective towards understanding why task-based modulation of early neural processes might be observed in the primate visual cortex.
Tasks Object Detection
Published 2019-07-29
URL https://arxiv.org/abs/1907.12309v3
PDF https://arxiv.org/pdf/1907.12309v3.pdf
PWC https://paperswithcode.com/paper/modulation-of-early-visual-processing
Repo https://github.com/novelmartis/early-vs-late-multi-task
Framework tf
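
The task-based modulation studied here can be pictured as feature-wise gains on an early conv layer, switched by the current task (broadly in the spirit of FiLM-style modulation). A toy sketch, with all sizes and the multiplicative form as assumptions:

```python
import torch
import torch.nn as nn

class ModulatedEarlyStage(nn.Module):
    """Early conv features multiplicatively modulated by a task embedding."""
    def __init__(self, n_tasks=4, channels=16):
        super().__init__()
        self.conv = nn.Conv2d(3, channels, 3, padding=1)
        self.gains = nn.Embedding(n_tasks, channels)  # one gain vector per task

    def forward(self, x, task_id):
        feats = torch.relu(self.conv(x))
        g = self.gains(task_id).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        return feats * g   # task-dependent switching of early features

m = ModulatedEarlyStage()
out = m(torch.randn(2, 3, 64, 64), torch.tensor([0, 2]))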

NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings

Title NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings
Authors Oscar J. Romero, Ankit Dangi, Sushma A. Akoju
Abstract Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services, which restrict the vocabulary for their descriptions, and (b) translation mechanisms for service retrieval to convert unstructured user requests into strongly-typed semantic representations. In our work, we argue that the effort of developing service descriptions, request translations, and matching mechanisms could be reduced by using unrestricted natural language, allowing both: (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although there are some natural language-based service composition approaches, they restrict service retrieval to syntactic/semantic matching. With recent developments in machine learning and natural language processing, we motivate the use of sentence embeddings, leveraging richer semantic representations of sentences for service description, matching and retrieval. Experimental results show that service composition development effort may be reduced by more than 44% while keeping high precision/recall when matching high-level user requests with low-level service method invocations.
Tasks Sentence Embeddings
Published 2019-01-23
URL https://arxiv.org/abs/1901.07910v3
PDF https://arxiv.org/pdf/1901.07910v3.pdf
PWC https://paperswithcode.com/paper/190107910
Repo https://github.com/ojrlopez27/nl-service-composition
Framework none
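
The retrieval idea reduces to embedding the user request and each service description in a shared vector space and ranking by cosine similarity. A self-contained sketch with a bag-of-words embedding standing in for the learned sentence encoder the paper uses:

```python
import numpy as np

def embed(sentence, vocab):
    """Toy bag-of-words embedding standing in for a sentence encoder."""
    v = np.zeros(len(vocab))
    for w in sentence.lower().split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def best_service(request, services):
    vocab = {w: i for i, w in enumerate(
        sorted({w for s in services + [request] for w in s.lower().split()}))}
    q = embed(request, vocab)
    scores = []
    for s in services:
        d = embed(s, vocab)
        denom = np.linalg.norm(q) * np.linalg.norm(d) or 1.0
        scores.append(q @ d / denom)       # cosine similarity
    return services[int(np.argmax(scores))]

print(best_service("send an email to my manager",
                   ["book a meeting room", "send email message", "play music"]))
```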

Dynamic Local Regret for Non-convex Online Forecasting

Title Dynamic Local Regret for Non-convex Online Forecasting
Authors Sergul Aydore, Tianhao Zhu, Dean Foster
Abstract We consider online forecasting problems for non-convex machine learning models. Forecasting introduces several challenges: (i) frequent updates are necessary to deal with concept drift, since the dynamics of the environment change over time, and (ii) state-of-the-art models are non-convex. We address these challenges with a novel regret framework. Standard regret measures commonly do not consider both a dynamic environment and non-convex models. We introduce a local regret for non-convex models in a dynamic environment. We present an update rule incurring a cost, according to our proposed local regret, that is sublinear in the time horizon $T$. Our update uses time-smoothed gradients. Using a real-world dataset we show that our time-smoothed approach yields several benefits when compared with state-of-the-art competitors: results are more stable against new data; training is more robust to hyperparameter selection; and our approach is more computationally efficient than the alternatives.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.07927v2
PDF https://arxiv.org/pdf/1910.07927v2.pdf
PWC https://paperswithcode.com/paper/dynamic-local-regret-for-non-convex-online
Repo https://github.com/Timbasa/Dynamic_Local_Regret_for_Non-convex_Online_Forecasting_NeurIPS2019
Framework pytorch
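
The key ingredient of the update rule is the time-smoothed gradient: step along a weighted average of the last few gradients rather than only the newest one. A minimal sketch; the exponential weighting and window below are assumptions, not necessarily the paper's exact scheme:

```python
import numpy as np

def time_smoothed_sgd(grad_fn, w0, T, window=5, alpha=0.9, lr=0.1):
    """Online updates along a weighted average of recent gradients.
    grad_fn(t, w) returns the gradient of the loss arriving at time t."""
    w, history = w0, []
    for t in range(T):
        history.append(grad_fn(t, w))
        recent = history[-window:]
        # Most recent gradient gets the largest weight.
        weights = np.array([alpha ** i for i in range(len(recent) - 1, -1, -1)])
        weights /= weights.sum()
        g = sum(wt * gr for wt, gr in zip(weights, recent))  # smoothed gradient
        w = w - lr * g
    return w

# Toy non-stationary quadratic: the minimizer drifts over time.
w = time_smoothed_sgd(lambda t, w: 2 * (w - 0.01 * t), np.array([1.0]), T=100)
```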

Co-training for Policy Learning

Title Co-training for Policy Learning
Authors Jialin Song, Ravi Lanka, Yisong Yue, Masahiro Ono
Abstract We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.
Tasks Combinatorial Optimization, Continuous Control, Decision Making, Imitation Learning
Published 2019-07-03
URL https://arxiv.org/abs/1907.04484v1
PDF https://arxiv.org/pdf/1907.04484v1.pdf
PWC https://paperswithcode.com/paper/co-training-for-policy-learning
Repo https://github.com/ravi-lanka-4/CoPiEr
Framework none

A New Look at an Old Problem: A Universal Learning Approach to Linear Regression

Title A New Look at an Old Problem: A Universal Learning Approach to Linear Regression
Authors Koby Bibas, Yaniv Fogel, Meir Feder
Abstract Linear regression is a classical paradigm in statistics. A new look at it is provided via the lens of universal learning. In applying universal learning to linear regression, the hypothesis class represents the label $y\in {\cal R}$ as a linear combination $x^T\theta$ of the feature vector $x\in {\cal R}^M$, within a Gaussian error. The Predictive Normalized Maximum Likelihood (pNML) solution for universal learning of individual data can be expressed analytically in this case, as well as its associated learnability measure. Interestingly, the situation where the number of parameters $M$ may even be larger than the number of training samples $N$ can be examined. As expected, in this case learnability cannot be attained in every situation; nevertheless, if the test vector resides mostly in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, linear regression can generalize despite the fact that it uses an “over-parametrized” model. We demonstrate the results with a simulation of fitting a polynomial to data with a possibly large polynomial degree.
Tasks
Published 2019-05-12
URL https://arxiv.org/abs/1905.04708v1
PDF https://arxiv.org/pdf/1905.04708v1.pdf
PWC https://paperswithcode.com/paper/a-new-look-at-an-old-problem-a-universal
Repo https://github.com/kobybibas/pnml_linear_regression_simulation
Framework none
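
The geometric quantity behind the learnability claim can be probed numerically: the quadratic form $x^T(X^TX)^{+}x$ is small when the test vector lies along the training data's large-eigenvalue directions and large otherwise (the pseudo-inverse covers the over-parametrized case $M > N$). The paper's exact analytic pNML regret is built from this ingredient; the sketch below computes only the quadratic form:

```python
import numpy as np

def leverage(x, X):
    """Quadratic form x^T (X^T X)^+ x: large when x points along
    directions the training data barely covers."""
    return float(x @ np.linalg.pinv(X.T @ X) @ x)

rng = np.random.default_rng(0)
N, M = 20, 50                         # over-parametrized: more features than samples
X = rng.standard_normal((N, M))
eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # ascending; only the top N are nonzero
v_large = eigvecs[:, -1]              # direction of the largest eigenvalue
v_small = eigvecs[:, -N]              # direction of the smallest nonzero eigenvalue
print(leverage(v_large, X), leverage(v_small, X))   # small vs. large
```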

Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping

Title Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping
Authors Suhas Lohit, Qiao Wang, Pavan Turaga
Abstract Many time-series classification problems involve developing metrics that are invariant to temporal misalignment. In human activity analysis, temporal misalignment arises due to various reasons including differing initial phase, sensor sampling rates, and elastic time-warps due to subject-specific biomechanics. Past work in this area has only looked at reducing intra-class variability by elastic temporal alignment. In this paper, we propose a hybrid model-based and data-driven approach to learn warping functions that not just reduce intra-class variability, but also increase inter-class separation. We call this a temporal transformer network (TTN). TTN is an interpretable differentiable module, which can be easily integrated at the front end of a classification network. The module is capable of reducing intra-class variance by generating input-dependent warping functions which lead to rate-robust representations. At the same time, it increases inter-class variance by learning warping functions that are more discriminative. We show improvements over strong baselines in 3D action recognition on challenging datasets using the proposed framework. The improvements are especially pronounced when training sets are smaller.
Tasks 3D Human Action Recognition, Time Series, Time Series Classification
Published 2019-06-13
URL https://arxiv.org/abs/1906.05947v1
PDF https://arxiv.org/pdf/1906.05947v1.pdf
PWC https://paperswithcode.com/paper/temporal-transformer-networks-joint-learning-1
Repo https://github.com/suhaslohit/TTN
Framework tf
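
What makes a learned warp usable inside a classifier is monotonicity, and a common differentiable construction is to predict nonnegative increments, cumulatively sum them into an increasing warping function, and resample the sequence along it. A minimal sketch of that generic recipe (the paper's warp parameterization is more principled):

```python
import torch
import torch.nn.functional as F

def apply_warp(x, increments):
    """x: (B, C, T) sequence; increments: (B, T) unnormalized warp increments.
    Builds a monotonic warp in [0, 1] and resamples x along it."""
    steps = F.softplus(increments)                 # nonnegative increments
    warp = torch.cumsum(steps, dim=1)
    warp = warp / warp[:, -1:].clamp(min=1e-8)     # normalize to end at 1
    # Resample with linear interpolation at the warped time stamps.
    B, C, T = x.shape
    idx = warp * (T - 1)
    lo = idx.floor().long().clamp(0, T - 1)
    hi = (lo + 1).clamp(0, T - 1)
    frac = (idx - lo.float()).unsqueeze(1)         # (B, 1, T)
    gather = lambda i: torch.gather(x, 2, i.unsqueeze(1).expand(B, C, T))
    return (1 - frac) * gather(lo) + frac * gather(hi)

warped = apply_warp(torch.randn(2, 3, 50), torch.randn(2, 50))
```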

ROC movies – a new generalization to a popular classic

Title ROC movies – a new generalization to a popular classic
Authors Tilmann Gneiting, Eva-Maria Walz
Abstract Throughout science and technology, receiver operating characteristic (ROC) curves and associated area under the curve (AUC) measures constitute powerful tools for assessing the predictive abilities of features, markers and tests in binary classification problems. Despite its immense popularity, ROC analysis has been subject to a fundamental restriction, in that it applies to dichotomous (yes or no) outcomes only. We introduce ROC movies and universal ROC (UROC) curves that apply to any ordinal or real-valued outcome, along with a new, asymmetric coefficient of predictive ability (CPA) measure. CPA equals the area under the UROC curve and admits appealing interpretations in terms of probabilities and rank based covariances. ROC movies, UROC curves and CPA nest and generalize the classical ROC curve and AUC, and are bound to supersede them in a wealth of applications.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1912.01956v1
PDF https://arxiv.org/pdf/1912.01956v1.pdf
PWC https://paperswithcode.com/paper/191201956
Repo https://github.com/evwalz/CPA_Example_NWP
Framework none
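
CPA equals the area under the UROC curve and collapses to the classical AUC for binary outcomes. As a touchstone, here is that special case computed via the rank-sum (Mann-Whitney) identity, the quantity CPA generalizes; the full UROC/CPA construction for ordinal outcomes lives in the paper and the linked repo:

```python
import numpy as np

def auc_from_ranks(scores, labels):
    """Classical AUC via the rank-sum identity:
    AUC = (R1 - n1(n1+1)/2) / (n1 * n0), R1 = rank sum of the positives."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # 1-based ranks (ignoring ties)
    n1 = int(labels.sum())
    n0 = len(labels) - n1
    r1 = ranks[labels == 1].sum()
    return (r1 - n1 * (n1 + 1) / 2) / (n1 * n0)

y = np.array([0, 0, 1, 1, 1, 0])
s = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])
print(auc_from_ranks(s, y))   # 8 of 9 positive-negative pairs correctly ordered
```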

Self-adaptive Single and Multi-illuminant Estimation Framework based on Deep Learning

Title Self-adaptive Single and Multi-illuminant Estimation Framework based on Deep Learning
Authors Yongjie Liu, Sijie Shen
Abstract Illuminant estimation plays a key role in the digital camera pipeline; it aims at reducing the color cast caused by a non-white illuminant. Recent research handles this task by using a Convolutional Neural Network (CNN) as a mapping function from an input image to a single illumination vector. However, such global mapping approaches struggle with scenes lit by multiple light sources. In this paper, we propose a self-adaptive single- and multi-illuminant estimation framework, which includes the following novelties: (1) learning local self-adaptive kernels from the entire image for illuminant estimation with an encoder-decoder CNN structure; (2) providing a confidence measure for the prediction; (3) clustering-based iterative fitting for computing single- and multi-illumination vectors. The proposed global-to-local aggregation predicts multiple illuminants regionally by utilizing global information instead of training on patches, and also brings a significant improvement to single-illuminant estimation. We outperform state-of-the-art methods on standard benchmarks with a largest relative improvement of 16%. In addition, we collect a dataset containing over 13k images for illuminant estimation and evaluation. The code and dataset are available at https://github.com/LiamLYJ/KPF_WB
Tasks
Published 2019-02-13
URL http://arxiv.org/abs/1902.04705v1
PDF http://arxiv.org/pdf/1902.04705v1.pdf
PWC https://paperswithcode.com/paper/self-adaptive-single-and-multi-illuminant
Repo https://github.com/LiamLYJ/KPF_WB
Framework tf
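
Downstream of any illuminant estimator, the correction itself is a per-channel rescaling (a diagonal, von Kries-style model); with a per-pixel illuminant map the same division is applied spatially, which is exactly what makes the multi-illuminant case matter. A minimal sketch:

```python
import numpy as np

def correct_white_balance(img, illuminant):
    """Diagonal (von Kries) correction. img: (H, W, 3) in [0, 1];
    illuminant: (3,) global RGB estimate or (H, W, 3) per-pixel map."""
    illum = np.asarray(illuminant, dtype=float)
    illum = illum / illum.max(axis=-1, keepdims=True)   # preserve brightness
    return np.clip(img / np.maximum(illum, 1e-6), 0.0, 1.0)

img = np.random.rand(64, 64, 3)
single = correct_white_balance(img, np.array([0.9, 0.7, 0.5]))        # one illuminant
multi = correct_white_balance(img, np.random.rand(64, 64, 3) + 0.5)   # per-pixel map
```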

Counterfactual Depth from a Single RGB Image

Title Counterfactual Depth from a Single RGB Image
Authors Theerasit Issaranon, Chuhang Zou, David Forsyth
Abstract We describe a method that predicts, from a single RGB image, a depth map that describes the scene when a masked object is removed; we call this “counterfactual depth”, as it models hidden scene geometry together with the observations. Our method works for the same reason that scene completion works: the spatial structure of objects is simple. But we offer a much higher-resolution representation of space than current scene completion methods, as we operate at pixel-level precision and do not rely on a voxel representation. Furthermore, we do not require RGBD inputs. Our method uses a standard encoder-decoder architecture, with the decoder modified to accept an object mask. We describe a small evaluation dataset that we have collected, which allows inference about what factors affect reconstruction most strongly. Using this dataset, we show that our depth predictions for masked objects are better than other baselines.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.00915v1
PDF https://arxiv.org/pdf/1909.00915v1.pdf
PWC https://paperswithcode.com/paper/counterfactual-depth-from-a-single-rgb-image
Repo https://github.com/Theerasit/CounterfactualDepth
Framework none
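
The architectural change the abstract describes is small: a standard encoder-decoder for depth, with the object mask fed to the decoder. A toy sketch where the mask is simply downsampled and concatenated at the bottleneck; layer sizes are assumptions, and the paper's exact injection point may differ.

```python
import torch
import torch.nn as nn

class MaskConditionedDepth(nn.Module):
    """Encoder-decoder depth predictor that also consumes an object mask."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64 + 1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))   # 1-channel depth

    def forward(self, rgb, mask):
        feats = self.encoder(rgb)
        # Downsample the mask to the bottleneck resolution and concatenate,
        # so the decoder knows which object to "remove".
        m = nn.functional.interpolate(mask, size=feats.shape[-2:])
        return self.decoder(torch.cat([feats, m], dim=1))

net = MaskConditionedDepth()
depth = net(torch.randn(1, 3, 64, 64), torch.ones(1, 1, 64, 64))
```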