February 1, 2020

Paper Group AWR 171

Improving Generalization of Deep Networks for Inverse Reconstruction of Image Sequences


Title	Improving Generalization of Deep Networks for Inverse Reconstruction of Image Sequences
Authors	Sandesh Ghimire, Prashnna Kumar Gyawali, Jwala Dhamala, John L Sapp, Milan Horacek, Linwei Wang
Abstract	Deep learning networks have shown state-of-the-art performance in many image reconstruction problems. However, it is not well understood what properties of representation and learning may improve the generalization ability of the network. In this paper, we propose that the generalization ability of an encoder-decoder network for inverse reconstruction can be improved in two ways. First, drawing from analytical learning theory, we theoretically show that a stochastic latent space will improve the ability of a network to generalize to test data outside the training distribution. Second, following the information bottleneck principle, we show that a latent representation minimally informative of the input data will help a network generalize to unseen input variations that are irrelevant to the output reconstruction. Therefore, we present a sequence image reconstruction network optimized by a variational approximation of the information bottleneck principle with stochastic latent space. In the application setting of reconstructing the sequence of cardiac transmembrane potential from body-surface potential, we assess the two types of generalization abilities of the presented network against its deterministic counterpart. The results demonstrate that the generalization ability of an inverse reconstruction network can be improved by stochasticity as well as the information bottleneck.
Tasks	Image Reconstruction
Published	2019-03-05
URL	http://arxiv.org/abs/1903.02948v1
PDF	http://arxiv.org/pdf/1903.02948v1.pdf
PWC	https://paperswithcode.com/paper/improving-generalization-of-deep-networks-for
Repo	https://github.com/sandeshgh/Improving-Generalization
Framework	pytorch

YOLACT: Real-time Instance Segmentation


Title	YOLACT: Real-time Instance Segmentation
Authors	Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee
Abstract	We present a simple, fully-convolutional model for real-time instance segmentation that achieves 29.8 mAP on MS COCO at 33.5 fps evaluated on a single Titan Xp, which is significantly faster than any previous competitive approach. Moreover, we obtain this result after training on only one GPU. We accomplish this by breaking instance segmentation into two parallel subtasks: (1) generating a set of prototype masks and (2) predicting per-instance mask coefficients. Then we produce instance masks by linearly combining the prototypes with the mask coefficients. We find that because this process doesn’t depend on repooling, this approach produces very high-quality masks and exhibits temporal stability for free. Furthermore, we analyze the emergent behavior of our prototypes and show they learn to localize instances on their own in a translation variant manner, despite being fully-convolutional. Finally, we also propose Fast NMS, a drop-in 12 ms faster replacement for standard NMS that only has a marginal performance penalty.
Tasks	Instance Segmentation, Real-time Instance Segmentation, Semantic Segmentation
Published	2019-04-04
URL	https://arxiv.org/abs/1904.02689v2
PDF	https://arxiv.org/pdf/1904.02689v2.pdf
PWC	https://paperswithcode.com/paper/yolact-real-time-instance-segmentation
Repo	https://github.com/BigThreeMI/Utils
Framework	tf
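
The mask assembly step described in the YOLACT abstract (linearly combining prototype masks with per-instance coefficients) can be illustrated with a minimal pure-Python sketch. The function names, the nested-list layout, and the toy values are assumptions made for illustration; the final sigmoid squashing is a detail not stated in the abstract and is marked as such in the code. This is not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def assemble_masks(prototypes, coefficients):
    """Combine k prototype masks into one mask per instance.

    prototypes:   list of k masks, each an H x W nested list of floats.
    coefficients: list of n instances, each a list of k floats.
    Returns a list of n masks, each H x W, with values in (0, 1).
    """
    k = len(prototypes)
    height = len(prototypes[0])
    width = len(prototypes[0][0])
    masks = []
    for coeffs in coefficients:
        if len(coeffs) != k:
            raise ValueError("each instance needs one coefficient per prototype")
        mask = [
            [
                # Linear combination of the prototypes at pixel (y, x).
                # Assumption: a sigmoid is applied afterwards to squash the
                # result into (0, 1); the abstract only states the linear step.
                sigmoid(sum(c * prototypes[p][y][x] for p, c in enumerate(coeffs)))
                for x in range(width)
            ]
            for y in range(height)
        ]
        masks.append(mask)
    return masks


# Toy example: two 2x2 prototypes and a single instance.
toy_prototypes = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
toy_masks = assemble_masks(toy_prototypes, [[2.0, -2.0]])
```

A positive coefficient switches a prototype's region on for that instance and a negative one suppresses it, which is why a small shared set of prototypes can serve many instances.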

A Unified Linear-Time Framework for Sentence-Level Discourse Parsing


Title	A Unified Linear-Time Framework for Sentence-Level Discourse Parsing
Authors	Xiang Lin, Shafiq Joty, Prathyusha Jwalapuram, M Saiful Bari
Abstract	We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). Our framework comprises a discourse segmenter to identify the elementary discourse units (EDU) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. Both the segmenter and the parser are based on Pointer Networks and operate in linear time. Our segmenter yields an $F_1$ score of 95.4, and our parser achieves an $F_1$ score of 81.7 on the aggregated labeled (relation) metric, surpassing previous approaches by a good margin and approaching human agreement on both tasks (98.3 and 83.0 $F_1$).
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05682v2
PDF	https://arxiv.org/pdf/1905.05682v2.pdf
PWC	https://paperswithcode.com/paper/a-unified-linear-time-framework-for-sentence
Repo	https://github.com/shawnlimn/UnifiedParser_RST
Framework	pytorch

DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding


Title	DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding
Authors	Hyungro Lee, Heng Ma, Matteo Turilli, Debsindhu Bhowmik, Shantenu Jha, Arvind Ramanathan
Abstract	Simulations of biological macromolecules play an important role in understanding the physical basis of a number of complex processes such as protein folding. Even with increasing computational power and evolution of specialized architectures, the ability to simulate protein folding at atomistic scales still remains challenging. This stems from the dual aspects of high dimensionality of protein conformational landscapes, and the inability of atomistic molecular dynamics (MD) simulations to sufficiently sample these landscapes to observe folding events. Machine learning/deep learning (ML/DL) techniques, when combined with atomistic MD simulations, offer the opportunity to potentially overcome these limitations by: (1) effectively reducing the dimensionality of MD simulations to automatically build latent representations that correspond to biophysically relevant reaction coordinates (RCs), and (2) driving MD simulations to automatically sample potentially novel conformational states based on these RCs. We examine how coupling DL approaches with MD simulations can fold small proteins effectively on supercomputers. In particular, we study the computational costs and effectiveness of scaling DL-coupled MD workflows by folding two prototypical systems, viz., Fs-peptide and the fast-folding variant of the villin head piece protein. We demonstrate that a DL-driven MD workflow is able to effectively learn latent representations and drive adaptive simulations. Compared to traditional MD-based approaches, our approach achieves an effective performance gain in sampling the folded states by at least 2.3x. Our study provides a quantitative basis to understand how DL-driven MD simulations can lead to effective performance gains and reduced times to solution on supercomputing resources.
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07817v1
PDF	https://arxiv.org/pdf/1909.07817v1.pdf
PWC	https://paperswithcode.com/paper/deepdrivemd-deep-learning-driven-adaptive
Repo	https://github.com/braceal/DeepDriveMD
Framework	pytorch

Accelerated Experimental Design for Pairwise Comparisons


Title	Accelerated Experimental Design for Pairwise Comparisons
Authors	Yuan Guo, Jennifer Dy, Deniz Erdogmus, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Stratis Ioannidis
Abstract	Pairwise comparison labels are more informative and less variable than class labels, but generating them poses a challenge: their number grows quadratically in the dataset size. We study a natural experimental design objective, namely, D-optimality, that can be used to identify which $K$ pairwise comparisons to generate. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naïve greedy implementation has $O(N^2d^2K)$ complexity, where $N$ is the dataset size, $d$ is the feature space dimension, and $K$ is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset (namely, that it consists of pairwise comparisons), the greedy algorithm’s complexity can be reduced to $O(N^2(K+d)+N(dK+d^2)+d^2K)$. We apply the same acceleration also to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with $10^8$ comparisons; the naïve greedy algorithm on the same dataset would require more than 10 days to terminate.
Tasks
Published	2019-01-18
URL	http://arxiv.org/abs/1901.06080v1
PDF	http://arxiv.org/pdf/1901.06080v1.pdf
PWC	https://paperswithcode.com/paper/accelerated-experimental-design-for-pairwise
Repo	https://github.com/neu-spiral/AcceleratedExperimentalDesign
Framework	none
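
The naïve greedy baseline mentioned in the abstract above can be sketched as follows, to show where the $O(N^2d^2K)$ cost comes from: each of the $K$ rounds scans all $O(N^2)$ candidate pairs and evaluates a quadratic form in $O(d^2)$. Several details are assumptions made for illustration and are not taken from the abstract: the objective is written as $\log\det(\lambda I + \sum x_{ij}x_{ij}^T)$ with $x_{ij} = x_i - x_j$, the ridge term $\lambda$ (`ridge`) keeps the matrix invertible, and the function name is hypothetical. This is the slow baseline only, not the accelerated algorithm the paper proposes.

```python
import math
from itertools import combinations


def greedy_d_optimal_pairs(features, k, ridge=1.0):
    """Greedily pick k pairs (i, j) that maximize a log-det design objective.

    features: list of N feature vectors, each a list of d floats.
    Assumed objective: log det(ridge * I + sum of x x^T over chosen pairs),
    where x = features[i] - features[j].
    """
    d = len(features[0])
    # Inverse of the current design matrix; it starts as (ridge * I)^-1.
    a_inv = [[(1.0 / ridge if r == c else 0.0) for c in range(d)] for r in range(d)]
    candidates = list(combinations(range(len(features)), 2))
    chosen = []
    for _ in range(min(k, len(candidates))):
        best = None
        for i, j in candidates:  # O(N^2) candidate pairs per round
            x = [features[i][t] - features[j][t] for t in range(d)]
            # O(d^2) work: a_inv @ x, then the quadratic form x^T a_inv x.
            ax = [sum(a_inv[r][c] * x[c] for c in range(d)) for r in range(d)]
            quad = sum(x[r] * ax[r] for r in range(d))
            # Matrix determinant lemma: adding x x^T raises log det by log(1 + quad).
            gain = math.log1p(quad)
            if best is None or gain > best[0]:
                best = (gain, (i, j), ax, quad)
        _, pair, ax, quad = best
        chosen.append(pair)
        candidates.remove(pair)
        # Sherman-Morrison rank-one update of the inverse (a_inv is symmetric).
        for r in range(d):
            for c in range(d):
                a_inv[r][c] -= ax[r] * ax[c] / (1.0 + quad)
    return chosen


toy_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 0.0]]
toy_selection = greedy_d_optimal_pairs(toy_features, k=2)
```

After the first pair is chosen, pairs pointing in nearly the same direction gain little, which is the diminishing-returns behaviour that submodularity formalizes.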

Rethinking the CSC Model for Natural Images


Title	Rethinking the CSC Model for Natural Images
Authors	Dror Simon, Michael Elad
Abstract	Sparse representation with respect to an overcomplete dictionary is often used when regularizing inverse problems in signal and image processing. In recent years, the Convolutional Sparse Coding (CSC) model, in which the dictionary consists of shift-invariant filters, has gained renewed interest. While this model has been successfully used in some image processing problems, it still falls behind traditional patch-based methods on simple tasks such as denoising. In this work we provide new insights regarding the CSC model and its capability to represent natural images, and suggest a Bayesian connection between this model and its patch-based ancestor. Armed with these observations, we suggest a novel feed-forward network that follows an MMSE approximation process to the CSC model, using strided convolutions. The performance of this supervised architecture is shown to be on par with state-of-the-art methods while using far fewer parameters.
Tasks	Denoising
Published	2019-09-12
URL	https://arxiv.org/abs/1909.05742v1
PDF	https://arxiv.org/pdf/1909.05742v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-the-csc-model-for-natural-images
Repo	https://github.com/drorsimon/CSCNet
Framework	pytorch

Modulation of early visual processing alleviates capacity limits in solving multiple tasks


Title	Modulation of early visual processing alleviates capacity limits in solving multiple tasks
Authors	Sushrut Thorat, Giacomo Aldegheri, Marcel A. J. van Gerven, Marius V. Peelen
Abstract	In daily life situations, we have to perform multiple tasks given a visual stimulus, which requires task-relevant information to be transmitted through our visual system. When it is not possible to transmit all the possibly relevant information to higher layers, due to a bottleneck, task-based modulation of early visual processing might be necessary. In this work, we report how the effectiveness of modulating the early processing stage of an artificial neural network depends on the information bottleneck faced by the network. The bottleneck is quantified by the number of tasks the network has to perform and the neural capacity of the later stage of the network. The effectiveness is gauged by the performance on multiple object detection tasks, where the network is trained with a recent multi-task optimization scheme. By associating neural modulations with task-based switching of the state of the network and characterizing when such switching is helpful in early processing, our results provide a functional perspective towards understanding why task-based modulation of early neural processes might be observed in the primate visual cortex.
Tasks	Object Detection
Published	2019-07-29
URL	https://arxiv.org/abs/1907.12309v3
PDF	https://arxiv.org/pdf/1907.12309v3.pdf
PWC	https://paperswithcode.com/paper/modulation-of-early-visual-processing
Repo	https://github.com/novelmartis/early-vs-late-multi-task
Framework	tf

NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings


Title	NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings
Authors	Oscar J. Romero, Ankit Dangi, Sushma A. Akoju
Abstract	Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services that restrict the vocabulary for their descriptions, and (b) translation mechanisms for service retrieval to convert unstructured user requests to strongly-typed semantic representations. In our work, we argue that the effort of developing service descriptions, request translations, and matching mechanisms could be reduced using unrestricted natural language, allowing both: (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although there are some natural language-based service composition approaches, they restrict service retrieval to syntactic/semantic matching. With recent developments in machine learning and natural language processing, we motivate the use of sentence embeddings by leveraging richer semantic representations of sentences for service description, matching and retrieval. Experimental results show that service composition development effort may be reduced by more than 44% while keeping a high precision/recall when matching high-level user requests with low-level service method invocations.
Tasks	Sentence Embeddings
Published	2019-01-23
URL	https://arxiv.org/abs/1901.07910v3
PDF	https://arxiv.org/pdf/1901.07910v3.pdf
PWC	https://paperswithcode.com/paper/190107910
Repo	https://github.com/ojrlopez27/nl-service-composition
Framework	none

Dynamic Local Regret for Non-convex Online Forecasting


Title	Dynamic Local Regret for Non-convex Online Forecasting
Authors	Sergul Aydore, Tianhao Zhu, Dean Foster
Abstract	We consider online forecasting problems for non-convex machine learning models. Forecasting introduces several challenges: (i) frequent updates are necessary to deal with concept drift, since the dynamics of the environment change over time, and (ii) state-of-the-art models are non-convex. We address these challenges with a novel regret framework. Standard regret measures commonly do not consider both dynamic environment and non-convex models. We introduce a local regret for non-convex models in a dynamic environment. We present an update rule incurring a cost, according to our proposed local regret, which is sublinear in time T. Our update uses time-smoothed gradients. Using a real-world dataset we show that our time-smoothed approach yields several benefits when compared with state-of-the-art competitors: results are more stable against new data; training is more robust to hyperparameter selection; and our approach is more computationally efficient than the alternatives.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07927v2
PDF	https://arxiv.org/pdf/1910.07927v2.pdf
PWC	https://paperswithcode.com/paper/dynamic-local-regret-for-non-convex-online
Repo	https://github.com/Timbasa/Dynamic_Local_Regret_for_Non-convex_Online_Forecasting_NeurIPS2019
Framework	pytorch

Co-training for Policy Learning


Title	Co-training for Policy Learning
Authors	Jialin Song, Ravi Lanka, Yisong Yue, Masahiro Ono
Abstract	We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.
Tasks	Combinatorial Optimization, Continuous Control, Decision Making, Imitation Learning
Published	2019-07-03
URL	https://arxiv.org/abs/1907.04484v1
PDF	https://arxiv.org/pdf/1907.04484v1.pdf
PWC	https://paperswithcode.com/paper/co-training-for-policy-learning
Repo	https://github.com/ravi-lanka-4/CoPiEr
Framework	none

A New Look at an Old Problem: A Universal Learning Approach to Linear Regression


Title	A New Look at an Old Problem: A Universal Learning Approach to Linear Regression
Authors	Koby Bibas, Yaniv Fogel, Meir Feder
Abstract	Linear regression is a classical paradigm in statistics. A new look at it is provided via the lens of universal learning. In applying universal learning to linear regression the hypotheses class represents the label $y\in {\cal R}$ as a linear combination of the feature vector $x^T\theta$ where $x\in {\cal R}^M$, within a Gaussian error. The Predictive Normalized Maximum Likelihood (pNML) solution for universal learning of individual data can be expressed analytically in this case, as well as its associated learnability measure. Interestingly, the situation where the number of parameters $M$ may even be larger than the number of training samples $N$ can be examined. As expected, in this case learnability cannot be attained in every situation; nevertheless, if the test vector resides mostly in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, linear regression can generalize despite the fact that it uses an "over-parametrized" model. We demonstrate the results with a simulation of fitting a polynomial to data with a possibly large polynomial degree.
Tasks
Published	2019-05-12
URL	https://arxiv.org/abs/1905.04708v1
PDF	https://arxiv.org/pdf/1905.04708v1.pdf
PWC	https://paperswithcode.com/paper/a-new-look-at-an-old-problem-a-universal
Repo	https://github.com/kobybibas/pnml_linear_regression_simulation
Framework	none

Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping


Title	Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping
Authors	Suhas Lohit, Qiao Wang, Pavan Turaga
Abstract	Many time-series classification problems involve developing metrics that are invariant to temporal misalignment. In human activity analysis, temporal misalignment arises due to various reasons including differing initial phase, sensor sampling rates, and elastic time-warps due to subject-specific biomechanics. Past work in this area has only looked at reducing intra-class variability by elastic temporal alignment. In this paper, we propose a hybrid model-based and data-driven approach to learn warping functions that not just reduce intra-class variability, but also increase inter-class separation. We call this a temporal transformer network (TTN). TTN is an interpretable differentiable module, which can be easily integrated at the front end of a classification network. The module is capable of reducing intra-class variance by generating input-dependent warping functions which lead to rate-robust representations. At the same time, it increases inter-class variance by learning warping functions that are more discriminative. We show improvements over strong baselines in 3D action recognition on challenging datasets using the proposed framework. The improvements are especially pronounced when training sets are smaller.
Tasks	3D Human Action Recognition, Time Series, Time Series Classification
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05947v1
PDF	https://arxiv.org/pdf/1906.05947v1.pdf
PWC	https://paperswithcode.com/paper/temporal-transformer-networks-joint-learning-1
Repo	https://github.com/suhaslohit/TTN
Framework	tf

ROC movies – a new generalization to a popular classic


Title	ROC movies – a new generalization to a popular classic
Authors	Tilmann Gneiting, Eva-Maria Walz
Abstract	Throughout science and technology, receiver operating characteristic (ROC) curves and associated area under the curve (AUC) measures constitute powerful tools for assessing the predictive abilities of features, markers and tests in binary classification problems. Despite its immense popularity, ROC analysis has been subject to a fundamental restriction, in that it applies to dichotomous (yes or no) outcomes only. We introduce ROC movies and universal ROC (UROC) curves that apply to just any ordinal or real-valued outcome, along with a new, asymmetric coefficient of predictive ability (CPA) measure. CPA equals the area under the UROC curve and admits appealing interpretations in terms of probabilities and rank based covariances. ROC movies, UROC curves and CPA nest and generalize the classical ROC curve and AUC, and are bound to supersede them in a wealth of applications.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1912.01956v1
PDF	https://arxiv.org/pdf/1912.01956v1.pdf
PWC	https://paperswithcode.com/paper/191201956
Repo	https://github.com/evwalz/CPA_Example_NWP
Framework	none
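
For orientation, the classical binary AUC that the ROC movies abstract sets out to generalize can be computed from its probabilistic interpretation: the chance that a randomly drawn positive case scores higher than a randomly drawn negative one, with ties counted as one half. The sketch below covers only this classical quantity. It does not implement UROC curves or CPA, whose exact definitions are not given in the abstract, and the function name is an assumption made for illustration.

```python
def auc_pairwise(scores, labels):
    """Classical AUC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher score, counting ties as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


toy_auc = auc_pairwise([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The restriction to dichotomous outcomes is visible in the code: the labels must split the cases into exactly two groups before any pair can be compared.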

Self-adaptive Single and Multi-illuminant Estimation Framework based on Deep Learning


Title	Self-adaptive Single and Multi-illuminant Estimation Framework based on Deep Learning
Authors	Yongjie Liu, Sijie Shen
Abstract	Illuminant estimation plays a key role in the digital camera pipeline; it aims at reducing the color cast caused by non-white illuminants. Recent research handles this task by using a Convolutional Neural Network (CNN) as a mapping function from an input image to a single illumination vector. However, global mapping approaches have difficulty dealing with scenes under multiple light sources. In this paper, we propose a self-adaptive single and multi-illuminant estimation framework, which includes the following novelties: (1) Learning local self-adaptive kernels from the entire image for illuminant estimation with an encoder-decoder CNN structure; (2) Providing confidence measurement for the prediction; (3) Clustering-based iterative fitting for computing single and multi-illumination vectors. The proposed global-to-local aggregation is able to predict multiple illuminants regionally by utilizing global information instead of training on patches, and also brings significant improvement for single illuminant estimation. We outperform the state-of-the-art methods on standard benchmarks with the largest relative improvement of 16%. In addition, we collect a dataset that contains over 13k images for illuminant estimation and evaluation. The code and dataset are available at https://github.com/LiamLYJ/KPF_WB
Tasks
Published	2019-02-13
URL	http://arxiv.org/abs/1902.04705v1
PDF	http://arxiv.org/pdf/1902.04705v1.pdf
PWC	https://paperswithcode.com/paper/self-adaptive-single-and-multi-illuminant
Repo	https://github.com/LiamLYJ/KPF_WB
Framework	tf

Counterfactual Depth from a Single RGB Image


Title	Counterfactual Depth from a Single RGB Image
Authors	Theerasit Issaranon, Chuhang Zou, David Forsyth
Abstract	We describe a method that predicts, from a single RGB image, a depth map that describes the scene when a masked object is removed. We call this “counterfactual depth”; it models hidden scene geometry together with the observations. Our method works for the same reason that scene completion works: the spatial structure of objects is simple. But we offer a much higher resolution representation of space than current scene completion methods, as we operate at pixel-level precision and do not rely on a voxel representation. Furthermore, we do not require RGBD inputs. Our method uses a standard encoder-decoder architecture, with a decoder modified to accept an object mask. We describe a small evaluation dataset that we have collected, which allows inference about what factors affect reconstruction most strongly. Using this dataset, we show that our depth predictions for masked objects are better than other baselines.
Tasks
Published	2019-09-03
URL	https://arxiv.org/abs/1909.00915v1
PDF	https://arxiv.org/pdf/1909.00915v1.pdf
PWC	https://paperswithcode.com/paper/counterfactual-depth-from-a-single-rgb-image
Repo	https://github.com/Theerasit/CounterfactualDepth
Framework	none