January 30, 2020

Paper Group ANR 243

Physics Informed Extreme Learning Machine (PIELM) – A rapid method for the numerical solution of partial differential equations. Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks. GraphNAS: Graph Neural Architecture Search with Reinforcement Learning. Phoneme Level Language Models for Sequence Based Low Resource A …

Physics Informed Extreme Learning Machine (PIELM) – A rapid method for the numerical solution of partial differential equations

Title Physics Informed Extreme Learning Machine (PIELM) – A rapid method for the numerical solution of partial differential equations
Authors Vikas Dwivedi, Balaji Srinivasan
Abstract There has been rapid progress recently on the application of deep networks to the solution of partial differential equations, collectively labelled as Physics Informed Neural Networks (PINNs). In this paper, we develop Physics Informed Extreme Learning Machine (PIELM), a rapid version of PINNs which can be applied to stationary and time dependent linear partial differential equations. We demonstrate that PIELM matches or exceeds the accuracy of PINNs on a range of problems. We also discuss the limitations of neural network based approaches, including our PIELM, in the solution of PDEs on large domains and suggest an extension, a distributed version of our algorithm called DPIELM. We show that DPIELM produces excellent results comparable to conventional numerical techniques in the solution of time-dependent problems. Collectively, this work contributes towards making the use of neural networks in the solution of partial differential equations in complex domains a competitive alternative to conventional discretization techniques.
Tasks
Published 2019-07-08
URL https://arxiv.org/abs/1907.03507v1
PDF https://arxiv.org/pdf/1907.03507v1.pdf
PWC https://paperswithcode.com/paper/physics-informed-extreme-learning-machine
Repo
Framework
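
To make the idea of an extreme learning machine applied to a linear differential equation concrete, here is a minimal sketch. It is not the authors' code: the test problem (u'' = f on [0, 1] with zero boundary values), the function name `pielm_poisson_1d`, the tanh activation, and the weight range are all assumptions chosen for illustration. The hidden-layer weights are random and fixed, so only the output weights are unknown, and for a linear equation they follow from a single least-squares solve.

```python
import numpy as np

def pielm_poisson_1d(n_hidden=40, n_colloc=100, seed=0):
    """Sketch: solve u''(x) = f(x) on [0, 1], u(0) = u(1) = 0 (assumed test problem)."""
    rng = np.random.default_rng(seed)
    # Random, fixed hidden-layer parameters (never trained).
    w = rng.uniform(-4.0, 4.0, n_hidden)
    b = rng.uniform(-4.0, 4.0, n_hidden)

    def phi(x):
        # Hidden features tanh(w * x + b), shape (len(x), n_hidden).
        return np.tanh(np.outer(x, w) + b)

    def phi_xx(x):
        # Second x-derivative of each feature: -2 t (1 - t^2) w^2.
        t = np.tanh(np.outer(x, w) + b)
        return -2.0 * t * (1.0 - t**2) * w**2

    x = np.linspace(0.0, 1.0, n_colloc)
    f = -np.pi**2 * np.sin(np.pi * x)  # chosen so the exact solution is sin(pi x)

    # Stack the equation rows and the two boundary rows into one linear system.
    A = np.vstack([phi_xx(x), phi(np.array([0.0, 1.0]))])
    rhs = np.concatenate([f, [0.0, 0.0]])
    c, *_ = np.linalg.lstsq(A, rhs, rcond=None)  # output weights

    return lambda xq: phi(np.asarray(xq, dtype=float)) @ c

u = pielm_poisson_1d()
x_test = np.linspace(0.0, 1.0, 201)
max_error = np.max(np.abs(u(x_test) - np.sin(np.pi * x_test)))
```

Because no gradient descent is involved, the cost is dominated by one dense least-squares problem, which is the source of the speed the abstract refers to.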

Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks

Title Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks
Authors Xiaoliang Dai, Hongxu Yin, Niraj K. Jha
Abstract Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications. However, their fixed architecture, substantial training cost, and significant model redundancy make it difficult to efficiently update them to accommodate previously unseen data. To solve these problems, we propose an incremental learning framework based on a grow-and-prune neural network synthesis paradigm. When new data arrive, the neural network first grows new connections based on the gradients to increase the network capacity to accommodate new data. Then, the framework iteratively prunes away connections based on the magnitude of weights to enhance network compactness, and hence recover efficiency. Finally, the model rests at a lightweight DNN that is both ready for inference and suitable for future grow-and-prune updates. The proposed framework improves accuracy, shrinks network size, and significantly reduces the additional training cost for incoming data compared to conventional approaches, such as training from scratch and network fine-tuning. For the LeNet-300-100 and LeNet-5 neural network architectures derived for the MNIST dataset, the framework reduces training cost by up to 64% (63%) and 67% (63%) compared to training from scratch (network fine-tuning), respectively. For the ResNet-18 architecture derived for the ImageNet dataset and DeepSpeech2 for the AN4 dataset, the corresponding training cost reductions against training from scratch (network fine-tuning) are 64% (60%) and 67% (62%), respectively. Our derived models contain fewer network parameters but achieve higher accuracy relative to conventional baselines.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1905.10952v1
PDF https://arxiv.org/pdf/1905.10952v1.pdf
PWC https://paperswithcode.com/paper/incremental-learning-using-a-grow-and-prune
Repo
Framework
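
The two steps named in the abstract, gradient-based growth and magnitude-based pruning, can be sketched on a single weight matrix with a boolean connection mask. This is an illustrative simplification, not the authors' implementation: the function names `grow` and `prune`, the mask representation, and the fixed counts `n_grow` and `n_prune` are assumptions, and the training that happens between the two steps is omitted.

```python
import numpy as np

def grow(mask, grad, n_grow):
    """Activate the n_grow inactive connections with the largest |gradient|."""
    # Already-active connections get -inf so they are never selected first.
    score = np.where(mask, -np.inf, np.abs(grad))
    idx = np.argsort(score, axis=None)[::-1][:n_grow]
    new_mask = mask.copy()
    new_mask.flat[idx] = True
    return new_mask

def prune(weights, mask, n_prune):
    """Deactivate the n_prune active connections with the smallest |weight|."""
    # Inactive connections get +inf so they are never selected first.
    score = np.where(mask, np.abs(weights), np.inf)
    idx = np.argsort(score, axis=None)[:n_prune]
    new_mask = mask.copy()
    new_mask.flat[idx] = False
    return new_mask

# Toy example: one connection is missing and has a large gradient.
mask = np.array([[True, True], [False, True]])
grad = np.array([[0.0, 0.0], [3.0, 0.0]])
grown = grow(mask, grad, n_grow=1)        # activates position (1, 0)

weights = np.array([[0.5, -0.1], [0.3, 2.0]])
pruned = prune(weights, grown, n_prune=1)  # removes position (0, 1)
```

In a full pipeline the mask would be applied to the weights during the forward pass, and the grow and prune steps would alternate with training on the new data.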

GraphNAS: Graph Neural Architecture Search with Reinforcement Learning

Title GraphNAS: Graph Neural Architecture Search with Reinforcement Learning
Authors Yang Gao, Hong Yang, Peng Zhang, Chuan Zhou, Yue Hu
Abstract Graph Neural Networks (GNNs) have been popularly used for analyzing non-Euclidean data such as social network data and biological data. Despite their success, the design of graph neural networks requires a lot of manual work and domain knowledge. In this paper, we propose a Graph Neural Architecture Search method (GraphNAS for short) that enables automatic search of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS first uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and then trains the recurrent network with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation data set. Extensive experimental results on node classification tasks in both transductive and inductive learning settings demonstrate that GraphNAS can achieve consistently better performance on the Cora, Citeseer, Pubmed citation network, and protein-protein interaction network. On node classification tasks, GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy.
Tasks Neural Architecture Search, Node Classification
Published 2019-04-22
URL https://arxiv.org/abs/1904.09981v2
PDF https://arxiv.org/pdf/1904.09981v2.pdf
PWC https://paperswithcode.com/paper/graphnas-graph-neural-architecture-search
Repo
Framework

Phoneme Level Language Models for Sequence Based Low Resource ASR

Title Phoneme Level Language Models for Sequence Based Low Resource ASR
Authors Siddharth Dalmia, Xinjian Li, Alan W Black, Florian Metze
Abstract Building multilingual and crosslingual models helps bring different languages together in a language universal space. It allows models to share parameters and transfer knowledge across languages, enabling faster and better adaptation to a new language. These approaches are particularly useful for low resource languages. In this paper, we propose a phoneme-level language model that can be used multilingually and for crosslingual adaptation to a target language. We show that our model performs almost as well as the monolingual models by using six times fewer parameters, and is capable of better adaptation to languages not seen during training in a low resource scenario. We show that these phoneme-level language models can be used to decode sequence based Connectionist Temporal Classification (CTC) acoustic model outputs to obtain comparable word error rates with Weighted Finite State Transducer (WFST) based decoding in Babel languages. We also show that these phoneme-level language models outperform WFST decoding in various low-resource conditions like adapting to a new language and domain mismatch between training and testing data.
Tasks Language Modelling
Published 2019-02-20
URL http://arxiv.org/abs/1902.07613v1
PDF http://arxiv.org/pdf/1902.07613v1.pdf
PWC https://paperswithcode.com/paper/phoneme-level-language-models-for-sequence
Repo
Framework

PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction

Title PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction
Authors Sangdon Park, Osbert Bastani, Nikolai Matni, Insup Lee
Abstract We propose an algorithm combining calibrated prediction and generalization bounds from learning theory to construct confidence sets for deep neural networks with PAC guarantees—i.e., the confidence set for a given input contains the true label with high probability. We demonstrate how our approach can be used to construct PAC confidence sets on ResNet for ImageNet, a visual object tracking model, and a dynamics model for the half-cheetah reinforcement learning problem.
Tasks Object Tracking, Visual Object Tracking
Published 2019-12-31
URL https://arxiv.org/abs/2001.00106v2
PDF https://arxiv.org/pdf/2001.00106v2.pdf
PWC https://paperswithcode.com/paper/pac-confidence-sets-for-deep-neural-networks-1
Repo
Framework
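
As a rough illustration of what a PAC-style confidence set looks like, the sketch below picks a probability threshold on a held-out calibration set and returns every label whose predicted probability clears it. This is not the paper's algorithm: it is a simple binomial-tail construction in the same spirit, and the names `pac_threshold`, `confidence_set`, `eps` (target error rate), and `delta` (failure probability) are assumptions for the example. It also assumes the predicted probabilities are already calibrated.

```python
from math import comb

def binom_cdf(k, n, p):
    """P[Binomial(n, p) <= k], computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def pac_threshold(true_label_probs, eps, delta):
    """Choose a threshold T from calibration data (hypothetical helper).

    true_label_probs: predicted probability of the true label, one per
    calibration example. Returns (T, k_max), where at most k_max
    calibration examples fall below T and k_max is the largest count
    with P[Binomial(n, eps) <= k_max] <= delta. If no such count exists,
    returns T = 0.0 (every label is included) and k_max = -1.
    """
    scores = sorted(true_label_probs)
    n = len(scores)
    k_max = -1
    while k_max + 1 < n and binom_cdf(k_max + 1, n, eps) <= delta:
        k_max += 1
    if k_max < 0:
        return 0.0, k_max
    return scores[k_max], k_max

def confidence_set(probs, threshold):
    """All labels whose predicted probability is at least the threshold."""
    return [label for label, p in enumerate(probs) if p >= threshold]

# Toy calibration scores; real ones would come from a trained model.
calibration = [i / 100 for i in range(1, 101)]
T, k = pac_threshold(calibration, eps=0.1, delta=0.05)
labels = confidence_set([0.7, 0.2, 0.1], threshold=0.15)
```

The design trade-off is visible in the threshold: a larger calibration set permits a higher threshold, and therefore smaller sets, for the same `eps` and `delta`.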

Scoot: A Perceptual Metric for Facial Sketches

Title Scoot: A Perceptual Metric for Facial Sketches
Authors Deng-Ping Fan, ShengChuan Zhang, Yu-Huan Wu, Yun Liu, Ming-Ming Cheng, Bo Ren, Paul L. Rosin, Rongrong Ji
Abstract The human visual system has a strong ability to quickly assess the perceptual similarity between two facial sketches. However, existing widely used facial sketch metrics, e.g., FSIM and SSIM, fail to capture this perceptual similarity. A recent study in facial modeling has verified that the inclusion of both structure and texture has a significant positive benefit for face sketch synthesis (FSS). But which statistics are more important, and are helpful for their success? In this paper, we design a perceptual metric, called Structure Co-Occurrence Texture (Scoot), which simultaneously considers the block-level spatial structure and co-occurrence texture statistics. To test the quality of metrics, we propose three novel meta-measures based on various reliable properties. Extensive experiments demonstrate that our Scoot metric exceeds the performance of prior work. Besides, we built the first large scale (152k judgments) human-perception-based sketch database that can evaluate how well a metric is consistent with human perception. Our results suggest that “spatial structure” and “co-occurrence texture” are two generally applicable perceptual features in face sketch synthesis.
Tasks Face Sketch Synthesis
Published 2019-08-21
URL https://arxiv.org/abs/1908.08433v2
PDF https://arxiv.org/pdf/1908.08433v2.pdf
PWC https://paperswithcode.com/paper/scoot-a-perceptual-metric-for-facial-sketches
Repo
Framework

Skin Cancer Recognition using Deep Residual Network

Title Skin Cancer Recognition using Deep Residual Network
Authors Brij Rokad, Dr. Sureshkumar Nagarajan
Abstract The advances in technology have enabled people to access the internet from every part of the world. But to date, access to healthcare in remote areas is sparse. This proposed solution aims to bridge the gap between specialist doctors and patients. This prototype will be able to detect skin cancer from an image captured by a phone or any other camera. The network is deployed in the cloud, with server-side processing for an even more accurate result. A deep residual learning model is used on the server side to predict the probability of cancer. The ResNet has three parametric layers. Each layer comprises a convolutional layer, batch normalization, max pooling, and ReLU. Currently the model achieves an accuracy of 77% on the ISIC 2017 challenge.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.08610v1
PDF https://arxiv.org/pdf/1905.08610v1.pdf
PWC https://paperswithcode.com/paper/190508610
Repo
Framework

Bi-stream Pose Guided Region Ensemble Network for Fingertip Localization from Stereo Images

Title Bi-stream Pose Guided Region Ensemble Network for Fingertip Localization from Stereo Images
Authors Guijin Wang, Cairong Zhang, Xinghao Chen, Xiangyang Ji, Jing-Hao Xue, Hang Wang
Abstract In human-computer interaction, it is important to accurately estimate the hand pose, especially the fingertips. However, traditional approaches for fingertip localization mainly rely on depth images and thus suffer considerably from noise and missing values. Instead of depth images, stereo images can also provide 3D information of hands and promote 3D hand pose estimation. There are nevertheless limitations on the dataset size, global viewpoints, hand articulations and hand shapes in the publicly available stereo-based hand pose datasets. To mitigate these limitations and promote further research on hand pose estimation from stereo images, we propose a new large-scale binocular hand pose dataset called THU-Bi-Hand, offering a new perspective for fingertip localization. In the THU-Bi-Hand dataset, there are 447k pairs of stereo images of different hand shapes from 10 subjects with accurate 3D location annotations of the wrist and five fingertips. Captured with minimal restriction on the range of hand motion, the dataset covers large global viewpoint space and hand articulation space. To better present the performance of fingertip localization on THU-Bi-Hand, we propose a novel scheme termed Bi-stream Pose Guided Region Ensemble Network (Bi-Pose-REN). It extracts more representative feature regions around joint points in the feature maps under the guidance of the previously estimated pose. The feature regions are integrated hierarchically according to the topology of hand joints to regress the refined hand pose. Bi-Pose-REN and several existing methods are evaluated on THU-Bi-Hand so that benchmarks are provided for further research. Experimental results show that our new method has achieved the best performance on THU-Bi-Hand.
Tasks Hand Pose Estimation, Pose Estimation
Published 2019-02-26
URL http://arxiv.org/abs/1902.09795v1
PDF http://arxiv.org/pdf/1902.09795v1.pdf
PWC https://paperswithcode.com/paper/bi-stream-pose-guided-region-ensemble-network
Repo
Framework

Learning Independently-Obtainable Reward Functions

Title Learning Independently-Obtainable Reward Functions
Authors Christopher Grimm, Satinder Singh
Abstract We present a novel method for learning a set of disentangled reward functions that sum to the original environment reward and are constrained to be independently obtainable. We define independent obtainability in terms of value functions with respect to obtaining one learned reward while pursuing another learned reward. Empirically, we illustrate that our method can learn meaningful reward decompositions in a variety of domains and that these decompositions exhibit some form of generalization performance when the environment’s reward is modified. Theoretically, we derive results about the effect of maximizing our method’s objective on the resulting reward functions and their corresponding optimal policies.
Tasks
Published 2019-01-24
URL http://arxiv.org/abs/1901.08649v3
PDF http://arxiv.org/pdf/1901.08649v3.pdf
PWC https://paperswithcode.com/paper/learning-independently-obtainable-reward
Repo
Framework

Feature Boosting Network For 3D Pose Estimation

Title Feature Boosting Network For 3D Pose Estimation
Authors Jun Liu, Henghui Ding, Amir Shahroudy, Ling-Yu Duan, Xudong Jiang, Gang Wang, Alex C. Kot
Abstract In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to the ambiguities, texture and illumination variation, and self-occlusion in the real application of 3D pose estimation. To improve the reliability of the features for representing each body part and enhance the LSTD module, we further introduce a context consistency gate (CCG) in this paper, with which the convolutional feature maps are modulated according to their consistency with the context representations. We evaluate the proposed method on challenging benchmark datasets for 3D hand pose estimation and 3D full body pose estimation. Experimental results show the effectiveness of our method that achieves state-of-the-art performance on both of the tasks.
Tasks 3D Pose Estimation, Hand Pose Estimation, Pose Estimation
Published 2019-01-15
URL https://arxiv.org/abs/1901.04877v2
PDF https://arxiv.org/pdf/1901.04877v2.pdf
PWC https://paperswithcode.com/paper/feature-boosting-network-for-3d-pose
Repo
Framework

Robust Visual Object Tracking with Natural Language Region Proposal Network

Title Robust Visual Object Tracking with Natural Language Region Proposal Network
Authors Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff
Abstract Tracking with natural-language (NL) specification is a powerful new paradigm to yield trackers that initialize without a manually-specified bounding box, stay on target in spite of occlusions, and auto-recover when diverged. These advantages stem in part from visual appearance and NL having distinct and complementary invariance properties. However, realizing these advantages is technically challenging: the two modalities have incompatible representations. In this paper, we present the first practical and competitive solution to the challenge of tracking with NL specification. Our first novelty is an NL region proposal network (NL-RPN) that transforms an NL description into a convolutional kernel and shares the search branch with siamese trackers; the combined network can be trained end-to-end. Secondly, we propose a novel formulation to represent the history of past visual exemplars and use those exemplars to automatically reset the tracker together with our NL-RPN. Empirical results over tracking benchmarks with NL annotations demonstrate the effectiveness of our approach.
Tasks Object Tracking, Visual Object Tracking
Published 2019-12-04
URL https://arxiv.org/abs/1912.02048v1
PDF https://arxiv.org/pdf/1912.02048v1.pdf
PWC https://paperswithcode.com/paper/robust-visual-object-tracking-with-natural
Repo
Framework

The ALOS Dataset for Advert Localization in Outdoor Scenes

Title The ALOS Dataset for Advert Localization in Outdoor Scenes
Authors Soumyabrata Dev, Murhaf Hossari, Matthew Nicholson, Killian McCabe, Atul Nautiyal, Clare Conran, Jian Tang, Wei Xu, François Pitié
Abstract The rapid increase in the number of online videos provides the marketing and advertising agents ample opportunities to reach out to their audience. One of the most widely used strategies is product placement, or embedded marketing, wherein new advertisements are integrated seamlessly into existing advertisements in videos. Such strategies involve accurately localizing the position of the advert in the image frame, either manually in the video editing phase, or by using machine learning frameworks. However, these machine learning techniques and deep neural networks need a massive amount of data for training. In this paper, we propose and release the first large-scale dataset of advertisement billboards, captured in outdoor scenes. We also benchmark several state-of-the-art semantic segmentation algorithms on our proposed dataset.
Tasks Semantic Segmentation
Published 2019-04-16
URL http://arxiv.org/abs/1904.07776v1
PDF http://arxiv.org/pdf/1904.07776v1.pdf
PWC https://paperswithcode.com/paper/the-alos-dataset-for-advert-localization-in
Repo
Framework

Deep Learning Models for Digital Pathology

Title Deep Learning Models for Digital Pathology
Authors Aïcha BenTaieb, Ghassan Hamarneh
Abstract Histopathology images, i.e., microscopy images of stained tissue biopsies, contain fundamental prognostic information that forms the foundation of pathological analysis and diagnostic medicine. However, diagnostics from histopathology images generally rely on a visual cognitive assessment of tissue slides, which implies an inherent element of interpretation and hence subjectivity. Access to digitized histopathology images enabled the development of computational systems aiming at reducing manual intervention and automating parts of pathologists’ workflow. Specifically, applications of deep learning to histopathology image analysis now offer opportunities for better quantitative modeling of disease appearance and hence possibly improved prediction of disease aggressiveness and patient outcome. However, digitized histopathology tissue slides are unique in a variety of ways and come with their own set of computational challenges. In this survey, we summarize the different challenges facing computational systems for digital pathology and provide a review of state-of-the-art works that developed deep learning-based solutions for the predictive modeling of histopathology images from a detection, stain normalization, segmentation, and tissue classification perspective. We then discuss the challenges facing the validation and integration of such deep learning-based computational systems in clinical workflow and reflect on future opportunities for histopathology derived image measurements and better predictive modeling.
Tasks
Published 2019-10-27
URL https://arxiv.org/abs/1910.12329v2
PDF https://arxiv.org/pdf/1910.12329v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-models-for-digital-pathology
Repo
Framework

Machine Learning with Clos Networks

Title Machine Learning with Clos Networks
Authors Timothy Whithing, Thiam Khean Hah
Abstract We present a new methodology for improving the accuracy of small neural networks by applying the concept of a Clos network to achieve maximum expression in a smaller network. We explore the design space to show that more layers are beneficial, given the same number of parameters. We also present findings on how the ReLU nonlinearity affects accuracy in separable networks. We present results from early work with the CIFAR-10 dataset.
Tasks
Published 2019-01-18
URL http://arxiv.org/abs/1901.06433v1
PDF http://arxiv.org/pdf/1901.06433v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-with-clos-networks
Repo
Framework

Horseshoe Regularization for Machine Learning in Complex and Deep Models

Title Horseshoe Regularization for Machine Learning in Complex and Deep Models
Authors Anindya Bhadra, Jyotishka Datta, Yunfan Li, Nicholas G. Polson
Abstract Since the advent of the horseshoe priors for regularization, global-local shrinkage methods have proved to be a fertile ground for the development of Bayesian methodology in machine learning, specifically for high-dimensional regression and classification problems. They have achieved remarkable success in computation, and enjoy strong theoretical support. Most of the existing literature has focused on the linear Gaussian case; see Bhadra et al. (2019b) for a systematic survey. The purpose of the current article is to demonstrate that the horseshoe regularization is useful far more broadly, by reviewing both methodological and computational developments in complex models that are more relevant to machine learning applications. Specifically, we focus on methodological challenges in horseshoe regularization in nonlinear and non-Gaussian models; multivariate models; and deep neural networks. We also outline the recent computational developments in horseshoe shrinkage for complex models along with a list of available software implementations that allows one to venture out beyond the comfort zone of the canonical linear regression problems.
Tasks
Published 2019-04-24
URL https://arxiv.org/abs/1904.10939v2
PDF https://arxiv.org/pdf/1904.10939v2.pdf
PWC https://paperswithcode.com/paper/horseshoe-regularization-for-machine-learning
Repo
Framework
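
For readers unfamiliar with the horseshoe prior, the following sketch draws from its standard hierarchical form: each coefficient is normal with a scale that is the product of a global parameter tau and a local half-Cauchy parameter. The function name `sample_horseshoe` and the fixed value of `tau` are assumptions for illustration; in practice tau is usually given its own prior or estimated. The returned `kappa` is the shrinkage weight 1 / (1 + tau^2 * lambda^2) for a normal-means model with unit noise variance, which for tau = 1 follows a Beta(1/2, 1/2) distribution.

```python
import numpy as np

def sample_horseshoe(n, tau=1.0, seed=0):
    """Draw n coefficients from a horseshoe prior with fixed global scale tau."""
    rng = np.random.default_rng(seed)
    # Local scales: half-Cauchy(0, 1), i.e. |standard Cauchy|.
    lam = np.abs(rng.standard_cauchy(n))
    # Coefficients: beta_j | lam_j, tau ~ Normal(0, (lam_j * tau)^2).
    beta = rng.normal(0.0, 1.0, n) * lam * tau
    # Shrinkage weight: near 1 means shrunk to zero, near 0 means left alone.
    kappa = 1.0 / (1.0 + (tau * lam) ** 2)
    return beta, lam, kappa

beta, lam, kappa = sample_horseshoe(200_000)
# Share of draws that are either almost fully shrunk or almost untouched.
extreme_share = np.mean((kappa < 0.1) | (kappa > 0.9))
```

The U-shaped density of `kappa`, with mass piled near 0 and near 1, is what gives the prior its name: signals are largely left alone while noise is shrunk hard toward zero.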