Paper Group AWR 171
Improving Generalization of Deep Networks for Inverse Reconstruction of Image Sequences
Title | Improving Generalization of Deep Networks for Inverse Reconstruction of Image Sequences |
Authors | Sandesh Ghimire, Prashnna Kumar Gyawali, Jwala Dhamala, John L Sapp, Milan Horacek, Linwei Wang |
Abstract | Deep learning networks have shown state-of-the-art performance in many image reconstruction problems. However, it is not well understood what properties of representation and learning may improve the generalization ability of the network. In this paper, we propose that the generalization ability of an encoder-decoder network for inverse reconstruction can be improved in two means. First, drawing from analytical learning theory, we theoretically show that a stochastic latent space will improve the ability of a network to generalize to test data outside the training distribution. Second, following the information bottleneck principle, we show that a latent representation minimally informative of the input data will help a network generalize to unseen input variations that are irrelevant to the output reconstruction. Therefore, we present a sequence image reconstruction network optimized by a variational approximation of the information bottleneck principle with stochastic latent space. In the application setting of reconstructing the sequence of cardiac transmembrane potential from bodysurface potential, we assess the two types of generalization abilities of the presented network against its deterministic counterpart. The results demonstrate that the generalization ability of an inverse reconstruction network can be improved by stochasticity as well as the information bottleneck. |
Tasks | Image Reconstruction |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.02948v1 |
http://arxiv.org/pdf/1903.02948v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-generalization-of-deep-networks-for |
Repo | https://github.com/sandeshgh/Improving-Generalization |
Framework | pytorch |
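The two ingredients named in the abstract above, a stochastic latent space and an information-bottleneck penalty, can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the diagonal Gaussian posterior, the standard normal prior, the `beta` weight, and all function names are assumptions introduced here.

```python
import math
import random

def sample_latent(mu, log_var, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), used as the compression penalty."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))

def bottleneck_loss(reconstruction_error, mu, log_var, beta=0.01):
    """Reconstruction term plus a beta-weighted bottleneck term (beta is assumed)."""
    return reconstruction_error + beta * kl_to_standard_normal(mu, log_var)

# A posterior equal to the prior carries no information about the input.
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))  # 0.0
```

A larger `beta` pushes the latent code toward the prior, which is one way to make it less informative about input variations that do not matter for the reconstruction.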
YOLACT: Real-time Instance Segmentation
Title | YOLACT: Real-time Instance Segmentation |
Authors | Daniel Bolya, Chong Zhou, Fanyi Xiao, Yong Jae Lee |
Abstract | We present a simple, fully-convolutional model for real-time instance segmentation that achieves 29.8 mAP on MS COCO at 33.5 fps evaluated on a single Titan Xp, which is significantly faster than any previous competitive approach. Moreover, we obtain this result after training on only one GPU. We accomplish this by breaking instance segmentation into two parallel subtasks: (1) generating a set of prototype masks and (2) predicting per-instance mask coefficients. Then we produce instance masks by linearly combining the prototypes with the mask coefficients. We find that because this process doesn’t depend on repooling, this approach produces very high-quality masks and exhibits temporal stability for free. Furthermore, we analyze the emergent behavior of our prototypes and show they learn to localize instances on their own in a translation variant manner, despite being fully-convolutional. Finally, we also propose Fast NMS, a drop-in 12 ms faster replacement for standard NMS that only has a marginal performance penalty. |
Tasks | Instance Segmentation, Real-time Instance Segmentation, Semantic Segmentation |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02689v2 |
https://arxiv.org/pdf/1904.02689v2.pdf | |
PWC | https://paperswithcode.com/paper/yolact-real-time-instance-segmentation |
Repo | https://github.com/BigThreeMI/Utils |
Framework | tf |
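The mask assembly step described in the abstract above, linearly combining prototype masks with per-instance coefficients, can be sketched as follows. The sigmoid, the 0.5 threshold, and the function name `assemble_mask` are assumptions made for illustration; the abstract itself only states that the combination is linear.

```python
import math

def assemble_mask(prototypes, coefficients, threshold=0.5):
    """Linearly combine k prototype masks (each H x W) with k coefficients.

    The sigmoid and the threshold are illustrative assumptions.
    """
    if len(prototypes) != len(coefficients):
        raise ValueError("need one coefficient per prototype")
    height, width = len(prototypes[0]), len(prototypes[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Weighted sum of the prototype responses at this pixel.
            logit = sum(c * p[y][x] for c, p in zip(coefficients, prototypes))
            prob = 1.0 / (1.0 + math.exp(-logit))
            row.append(1 if prob > threshold else 0)
        mask.append(row)
    return mask

prototypes = [
    [[4.0, 4.0], [-4.0, -4.0]],  # responds to the top row
    [[4.0, -4.0], [4.0, -4.0]],  # responds to the left column
]
print(assemble_mask(prototypes, [1.0, 0.0]))  # [[1, 1], [0, 0]]
print(assemble_mask(prototypes, [1.0, 1.0]))  # [[1, 0], [0, 0]]
```

In this toy case the second instance activates only where both prototypes agree, which hints at how a small shared set of prototypes can express many different instance masks.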
A Unified Linear-Time Framework for Sentence-Level Discourse Parsing
Title | A Unified Linear-Time Framework for Sentence-Level Discourse Parsing |
Authors | Xiang Lin, Shafiq Joty, Prathyusha Jwalapuram, M Saiful Bari |
Abstract | We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). Our framework comprises a discourse segmenter to identify the elementary discourse units (EDU) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. Both the segmenter and the parser are based on Pointer Networks and operate in linear time. Our segmenter yields an $F_1$ score of 95.4, and our parser achieves an $F_1$ score of 81.7 on the aggregated labeled (relation) metric, surpassing previous approaches by a good margin and approaching human agreement on both tasks (98.3 and 83.0 $F_1$). |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05682v2 |
https://arxiv.org/pdf/1905.05682v2.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-linear-time-framework-for-sentence |
Repo | https://github.com/shawnlimn/UnifiedParser_RST |
Framework | pytorch |
DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding
Title | DeepDriveMD: Deep-Learning Driven Adaptive Molecular Simulations for Protein Folding |
Authors | Hyungro Lee, Heng Ma, Matteo Turilli, Debsindhu Bhowmik, Shantenu Jha, Arvind Ramanathan |
Abstract | Simulations of biological macromolecules play an important role in understanding the physical basis of a number of complex processes such as protein folding. Even with increasing computational power and evolution of specialized architectures, the ability to simulate protein folding at atomistic scales still remains challenging. This stems from the dual aspects of high dimensionality of protein conformational landscapes, and the inability of atomistic molecular dynamics (MD) simulations to sufficiently sample these landscapes to observe folding events. Machine learning/deep learning (ML/DL) techniques, when combined with atomistic MD simulations offer the opportunity to potentially overcome these limitations by: (1) effectively reducing the dimensionality of MD simulations to automatically build latent representations that correspond to biophysically relevant reaction coordinates (RCs), and (2) driving MD simulations to automatically sample potentially novel conformational states based on these RCs. We examine how coupling DL approaches with MD simulations can fold small proteins effectively on supercomputers. In particular, we study the computational costs and effectiveness of scaling DL-coupled MD workflows by folding two prototypical systems, viz., Fs-peptide and the fast-folding variant of the villin head piece protein. We demonstrate that a DL driven MD workflow is able to effectively learn latent representations and drive adaptive simulations. Compared to traditional MD-based approaches, our approach achieves an effective performance gain in sampling the folded states by at least 2.3x. Our study provides a quantitative basis to understand how DL driven MD simulations, can lead to effective performance gains and reduced times to solution on supercomputing resources. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07817v1 |
https://arxiv.org/pdf/1909.07817v1.pdf | |
PWC | https://paperswithcode.com/paper/deepdrivemd-deep-learning-driven-adaptive |
Repo | https://github.com/braceal/DeepDriveMD |
Framework | pytorch |
Accelerated Experimental Design for Pairwise Comparisons
Title | Accelerated Experimental Design for Pairwise Comparisons |
Authors | Yuan Guo, Jennifer Dy, Deniz Erdogmus, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Stratis Ioannidis |
Abstract | Pairwise comparison labels are more informative and less variable than class labels, but generating them poses a challenge: their number grows quadratically in the dataset size. We study a natural experimental design objective, namely, D-optimality, that can be used to identify which $K$ pairwise comparisons to generate. This objective is known to perform well in practice, and is submodular, making the selection approximable via the greedy algorithm. A naïve greedy implementation has $O(N^2d^2K)$ complexity, where $N$ is the dataset size, $d$ is the feature space dimension, and $K$ is the number of generated comparisons. We show that, by exploiting the inherent geometry of the dataset–namely, that it consists of pairwise comparisons–the greedy algorithm’s complexity can be reduced to $O(N^2(K+d)+N(dK+d^2)+d^2K)$. We apply the same acceleration also to the so-called lazy greedy algorithm. When combined, the above improvements lead to an execution time of less than 1 hour for a dataset with $10^8$ comparisons; the naïve greedy algorithm on the same dataset would require more than 10 days to terminate. |
Tasks | |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06080v1 |
http://arxiv.org/pdf/1901.06080v1.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-experimental-design-for-pairwise |
Repo | https://github.com/neu-spiral/AcceleratedExperimentalDesign |
Framework | none |
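To make the $O(N^2d^2K)$ cost of the naïve greedy baseline in the abstract above concrete, here is a minimal sketch of greedy D-optimal selection. It is not the authors' accelerated algorithm. Representing a comparison by the feature difference $x_i - x_j$, the ridge term that keeps the matrix invertible, and the function names are all assumptions made for this illustration.

```python
import math
from itertools import combinations

def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def greedy_d_optimal(features, k, ridge=1.0):
    """Naive greedy choice of k pairs maximizing log det(ridge * I + sum z z^T)."""
    d = len(features[0])
    # Inverse of the current information matrix, starting from (ridge * I)^-1.
    a_inv = [[(1.0 / ridge if r == c else 0.0) for c in range(d)]
             for r in range(d)]
    # One candidate per unordered pair: O(N^2) candidates in total.
    candidates = {
        (i, j): [a - b for a, b in zip(features[i], features[j])]
        for i, j in combinations(range(len(features)), 2)
    }
    chosen = []
    for _ in range(min(k, len(candidates))):
        best_pair, best_gain, best_u = None, -1.0, None
        for pair, z in candidates.items():
            u = mat_vec(a_inv, z)  # O(d^2) work per candidate
            # Matrix determinant lemma: gain = log(1 + z^T A^-1 z).
            gain = math.log1p(sum(a * b for a, b in zip(z, u)))
            if gain > best_gain:
                best_pair, best_gain, best_u = pair, gain, u
        z = candidates.pop(best_pair)
        denom = 1.0 + sum(a * b for a, b in zip(z, best_u))
        # Sherman-Morrison rank-one update of the inverse.
        for r in range(d):
            for c in range(d):
                a_inv[r][c] -= best_u[r] * best_u[c] / denom
        chosen.append(best_pair)
    return chosen

print(greedy_d_optimal([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]], k=2))
# [(1, 2), (0, 1)]
```

Each of the `k` rounds scans every remaining pair and spends $O(d^2)$ on it, which is where the $N^2 d^2 K$ product comes from.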
Rethinking the CSC Model for Natural Images
Title | Rethinking the CSC Model for Natural Images |
Authors | Dror Simon, Michael Elad |
Abstract | Sparse representation with respect to an overcomplete dictionary is often used when regularizing inverse problems in signal and image processing. In recent years, the Convolutional Sparse Coding (CSC) model, in which the dictionary consists of shift-invariant filters, has gained renewed interest. While this model has been successfully used in some image processing problems, it still falls behind traditional patch-based methods on simple tasks such as denoising. In this work we provide new insights regarding the CSC model and its capability to represent natural images, and suggest a Bayesian connection between this model and its patch-based ancestor. Armed with these observations, we suggest a novel feed-forward network that follows an MMSE approximation process to the CSC model, using strided convolutions. The performance of this supervised architecture is shown to be on par with state of the art methods while using much fewer parameters. |
Tasks | Denoising |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05742v1 |
https://arxiv.org/pdf/1909.05742v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-the-csc-model-for-natural-images |
Repo | https://github.com/drorsimon/CSCNet |
Framework | pytorch |
Modulation of early visual processing alleviates capacity limits in solving multiple tasks
Title | Modulation of early visual processing alleviates capacity limits in solving multiple tasks |
Authors | Sushrut Thorat, Giacomo Aldegheri, Marcel A. J. van Gerven, Marius V. Peelen |
Abstract | In daily life situations, we have to perform multiple tasks given a visual stimulus, which requires task-relevant information to be transmitted through our visual system. When it is not possible to transmit all the possibly relevant information to higher layers, due to a bottleneck, task-based modulation of early visual processing might be necessary. In this work, we report how the effectiveness of modulating the early processing stage of an artificial neural network depends on the information bottleneck faced by the network. The bottleneck is quantified by the number of tasks the network has to perform and the neural capacity of the later stage of the network. The effectiveness is gauged by the performance on multiple object detection tasks, where the network is trained with a recent multi-task optimization scheme. By associating neural modulations with task-based switching of the state of the network and characterizing when such switching is helpful in early processing, our results provide a functional perspective towards understanding why task-based modulation of early neural processes might be observed in the primate visual cortex. |
Tasks | Object Detection |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12309v3 |
https://arxiv.org/pdf/1907.12309v3.pdf | |
PWC | https://paperswithcode.com/paper/modulation-of-early-visual-processing |
Repo | https://github.com/novelmartis/early-vs-late-multi-task |
Framework | tf |
NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings
Title | NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings |
Authors | Oscar J. Romero, Ankit Dangi, Sushma A. Akoju |
Abstract | Current approaches for service composition (assemblies of atomic services) require developers to use: (a) domain-specific semantics to formalize services that restrict the vocabulary for their descriptions, and (b) translation mechanisms for service retrieval to convert unstructured user requests to strongly-typed semantic representations. In our work, we argue that the effort of developing service descriptions, request translations, and matching mechanisms could be reduced using unrestricted natural language, allowing both: (1) end-users to intuitively express their needs using natural language, and (2) service developers to develop services without relying on syntactic/semantic description languages. Although there are some natural language-based service composition approaches, they restrict service retrieval to syntactic/semantic matching. With recent developments in Machine Learning and Natural Language Processing, we motivate the use of Sentence Embeddings by leveraging richer semantic representations of sentences for service description, matching and retrieval. Experimental results show that service composition development effort may be reduced by more than 44% while keeping a high precision/recall when matching high-level user requests with low-level service method invocations. |
Tasks | Sentence Embeddings |
Published | 2019-01-23 |
URL | https://arxiv.org/abs/1901.07910v3 |
https://arxiv.org/pdf/1901.07910v3.pdf | |
PWC | https://paperswithcode.com/paper/190107910 |
Repo | https://github.com/ojrlopez27/nl-service-composition |
Framework | none |
Dynamic Local Regret for Non-convex Online Forecasting
Title | Dynamic Local Regret for Non-convex Online Forecasting |
Authors | Sergul Aydore, Tianhao Zhu, Dean Foster |
Abstract | We consider online forecasting problems for non-convex machine learning models. Forecasting introduces several challenges such as (i) frequent updates are necessary to deal with concept drift issues since the dynamics of the environment change over time, and (ii) the state of the art models are non-convex models. We address these challenges with a novel regret framework. Standard regret measures commonly do not consider both dynamic environment and non-convex models. We introduce a local regret for non-convex models in a dynamic environment. We present an update rule incurring a cost, according to our proposed local regret, which is sublinear in time T. Our update uses time-smoothed gradients. Using a real-world dataset we show that our time-smoothed approach yields several benefits when compared with state-of-the-art competitors: results are more stable against new data; training is more robust to hyperparameter selection; and our approach is more computationally efficient than the alternatives. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07927v2 |
https://arxiv.org/pdf/1910.07927v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-local-regret-for-non-convex-online |
Repo | https://github.com/Timbasa/Dynamic_Local_Regret_for_Non-convex_Online_Forecasting_NeurIPS2019 |
Framework | pytorch |
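The abstract above says the update rule uses time-smoothed gradients without giving its form. The sketch below shows one plausible reading: online gradient descent that steps along a decay-weighted average of the most recent gradients. The window length, the exponential `decay` weighting, the scalar parameter, and the function name are all assumptions for illustration and may differ from the paper's rule.

```python
from collections import deque

def time_smoothed_descent(grad_fns, x0, step=0.1, window=3, decay=0.9):
    """Online gradient descent on a decay-weighted average of recent gradients.

    grad_fns[t] is the gradient of the loss revealed at round t.
    """
    x = x0
    recent = deque(maxlen=window)  # most recent gradient first
    trajectory = [x]
    for grad in grad_fns:
        # Evaluate the newest loss gradient at the current iterate and store it;
        # the oldest stored gradient is dropped once the window is full.
        recent.appendleft(grad(x))
        weights = [decay ** i for i in range(len(recent))]
        smoothed = sum(w * g for w, g in zip(weights, recent)) / sum(weights)
        x = x - step * smoothed
        trajectory.append(x)
    return trajectory

# Two rounds of the same quadratic loss (x - 1)^2, averaged over a window of 2.
losses = [lambda x: 2.0 * (x - 1.0)] * 2
print(time_smoothed_descent(losses, 0.0, step=0.5, window=2, decay=1.0))
# [0.0, 1.0, 1.5]
```

With `window=1` this reduces to plain online gradient descent; larger windows damp the reaction to any single new loss, at the price of the overshoot visible in the example.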
Co-training for Policy Learning
Title | Co-training for Policy Learning |
Authors | Jialin Song, Ravi Lanka, Yisong Yue, Masahiro Ono |
Abstract | We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization. |
Tasks | Combinatorial Optimization, Continuous Control, Decision Making, Imitation Learning |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.04484v1 |
https://arxiv.org/pdf/1907.04484v1.pdf | |
PWC | https://paperswithcode.com/paper/co-training-for-policy-learning |
Repo | https://github.com/ravi-lanka-4/CoPiEr |
Framework | none |
A New Look at an Old Problem: A Universal Learning Approach to Linear Regression
Title | A New Look at an Old Problem: A Universal Learning Approach to Linear Regression |
Authors | Koby Bibas, Yaniv Fogel, Meir Feder |
Abstract | Linear regression is a classical paradigm in statistics. A new look at it is provided via the lens of universal learning. In applying universal learning to linear regression the hypotheses class represents the label $y\in {\cal R}$ as a linear combination of the feature vector $x^T\theta$ where $x\in {\cal R}^M$, within a Gaussian error. The Predictive Normalized Maximum Likelihood (pNML) solution for universal learning of individual data can be expressed analytically in this case, as well as its associated learnability measure. Interestingly, the situation where the number of parameters $M$ may even be larger than the number of training samples $N$ can be examined. As expected, in this case learnability cannot be attained in every situation; nevertheless, if the test vector resides mostly in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, linear regression can generalize despite the fact that it uses an “over-parametrized” model. We demonstrate the results with a simulation of fitting a polynomial to data with a possibly large polynomial degree. |
Tasks | |
Published | 2019-05-12 |
URL | https://arxiv.org/abs/1905.04708v1 |
https://arxiv.org/pdf/1905.04708v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-look-at-an-old-problem-a-universal |
Repo | https://github.com/kobybibas/pnml_linear_regression_simulation |
Framework | none |
Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping
Title | Temporal Transformer Networks: Joint Learning of Invariant and Discriminative Time Warping |
Authors | Suhas Lohit, Qiao Wang, Pavan Turaga |
Abstract | Many time-series classification problems involve developing metrics that are invariant to temporal misalignment. In human activity analysis, temporal misalignment arises due to various reasons including differing initial phase, sensor sampling rates, and elastic time-warps due to subject-specific biomechanics. Past work in this area has only looked at reducing intra-class variability by elastic temporal alignment. In this paper, we propose a hybrid model-based and data-driven approach to learn warping functions that not just reduce intra-class variability, but also increase inter-class separation. We call this a temporal transformer network (TTN). TTN is an interpretable differentiable module, which can be easily integrated at the front end of a classification network. The module is capable of reducing intra-class variance by generating input-dependent warping functions which lead to rate-robust representations. At the same time, it increases inter-class variance by learning warping functions that are more discriminative. We show improvements over strong baselines in 3D action recognition on challenging datasets using the proposed framework. The improvements are especially pronounced when training sets are smaller. |
Tasks | 3D Human Action Recognition, Time Series, Time Series Classification |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05947v1 |
https://arxiv.org/pdf/1906.05947v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-transformer-networks-joint-learning-1 |
Repo | https://github.com/suhaslohit/TTN |
Framework | tf |
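The abstract above describes a module that generates input-dependent warping functions. One way to picture such a function is sketched below: unconstrained values are squared, normalized, and cumulatively summed into a monotone warp, which is then used to resample the sequence by linear interpolation. The construction and the function names are assumptions for illustration; the abstract does not specify how the warp is parameterized.

```python
def warping_function(raw):
    """Map unconstrained values to a monotone warp gamma ending at 1."""
    squares = [v * v for v in raw]
    total = sum(squares)
    if total == 0:
        raise ValueError("raw values must not all be zero")
    gamma, running = [], 0.0
    for s in squares:
        # Non-negative increments guarantee that gamma never decreases.
        running += s / total
        gamma.append(running)
    return gamma

def warp_signal(signal, gamma):
    """Resample signal at positions gamma * (len(signal) - 1), linearly interpolated."""
    last = len(signal) - 1
    warped = []
    for g in gamma:
        pos = min(max(g, 0.0), 1.0) * last
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        warped.append((1 - frac) * signal[lo] + frac * signal[hi])
    return warped

gamma = warping_function([1.0, 1.0, 1.0, 1.0])
print(gamma)                                    # [0.25, 0.5, 0.75, 1.0]
print(warp_signal([0, 10, 20, 30, 40], gamma))  # [10.0, 20.0, 30.0, 40.0]
```

Because every step is a simple arithmetic operation, a warp built this way stays differentiable with respect to the raw values, which is what allows a module like this to sit at the front of a classifier and be trained end to end.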
ROC movies – a new generalization to a popular classic
Title | ROC movies – a new generalization to a popular classic |
Authors | Tilmann Gneiting, Eva-Maria Walz |
Abstract | Throughout science and technology, receiver operating characteristic (ROC) curves and associated area under the curve (AUC) measures constitute powerful tools for assessing the predictive abilities of features, markers and tests in binary classification problems. Despite its immense popularity, ROC analysis has been subject to a fundamental restriction, in that it applies to dichotomous (yes or no) outcomes only. We introduce ROC movies and universal ROC (UROC) curves that apply to just any ordinal or real-valued outcome, along with a new, asymmetric coefficient of predictive ability (CPA) measure. CPA equals the area under the UROC curve and admits appealing interpretations in terms of probabilities and rank based covariances. ROC movies, UROC curves and CPA nest and generalize the classical ROC curve and AUC, and are bound to supersede them in a wealth of applications. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1912.01956v1 |
https://arxiv.org/pdf/1912.01956v1.pdf | |
PWC | https://paperswithcode.com/paper/191201956 |
Repo | https://github.com/evwalz/CPA_Example_NWP |
Framework | none |
Self-adaptive Single and Multi-illuminant Estimation Framework based on Deep Learning
Title | Self-adaptive Single and Multi-illuminant Estimation Framework based on Deep Learning |
Authors | Yongjie Liu, Sijie Shen |
Abstract | Illuminant estimation plays a key role in the digital camera pipeline: it aims to reduce the color cast caused by non-white illuminants. Recent work handles this task by using a Convolutional Neural Network (CNN) as a mapping function from the input image to a single illumination vector. However, global mapping approaches struggle with scenes lit by multiple light sources. In this paper, we propose a self-adaptive single and multi-illuminant estimation framework, which includes the following novelties: (1) learning local self-adaptive kernels from the entire image for illuminant estimation with an encoder-decoder CNN structure; (2) providing a confidence measurement for the prediction; (3) clustering-based iterative fitting for computing single and multi-illumination vectors. The proposed global-to-local aggregation is able to predict multiple illuminants regionally by utilizing global information instead of training on patches, and it brings significant improvement for single-illuminant estimation. We outperform the state-of-the-art methods on standard benchmarks, with the largest relative improvement being 16%. In addition, we collect a dataset containing over 13k images for illuminant estimation and evaluation. The code and dataset are available at https://github.com/LiamLYJ/KPF_WB |
Tasks | |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.04705v1 |
http://arxiv.org/pdf/1902.04705v1.pdf | |
PWC | https://paperswithcode.com/paper/self-adaptive-single-and-multi-illuminant |
Repo | https://github.com/LiamLYJ/KPF_WB |
Framework | tf |
Counterfactual Depth from a Single RGB Image
Title | Counterfactual Depth from a Single RGB Image |
Authors | Theerasit Issaranon, Chuhang Zou, David Forsyth |
Abstract | We describe a method that predicts, from a single RGB image, a depth map that describes the scene when a masked object is removed - we call this “counterfactual depth” that models hidden scene geometry together with the observations. Our method works for the same reason that scene completion works: the spatial structure of objects is simple. But we offer a much higher resolution representation of space than current scene completion methods, as we operate at pixel-level precision and do not rely on a voxel representation. Furthermore, we do not require RGBD inputs. Our method uses a standard encoder-decoder architecture, and with a decoder modified to accept an object mask. We describe a small evaluation dataset that we have collected, which allows inference about what factors affect reconstruction most strongly. Using this dataset, we show that our depth predictions for masked objects are better than other baselines. |
Tasks | |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.00915v1 |
https://arxiv.org/pdf/1909.00915v1.pdf | |
PWC | https://paperswithcode.com/paper/counterfactual-depth-from-a-single-rgb-image |
Repo | https://github.com/Theerasit/CounterfactualDepth |
Framework | none |