Paper Group ANR 243
Physics Informed Extreme Learning Machine (PIELM) – A rapid method for the numerical solution of partial differential equations
Title | Physics Informed Extreme Learning Machine (PIELM) – A rapid method for the numerical solution of partial differential equations |
Authors | Vikas Dwivedi, Balaji Srinivasan |
Abstract | There has been rapid progress recently on the application of deep networks to the solution of partial differential equations, collectively labelled as Physics Informed Neural Networks (PINNs). In this paper, we develop Physics Informed Extreme Learning Machine (PIELM), a rapid version of PINNs which can be applied to stationary and time dependent linear partial differential equations. We demonstrate that PIELM matches or exceeds the accuracy of PINNs on a range of problems. We also discuss the limitations of neural network based approaches, including our PIELM, in the solution of PDEs on large domains and suggest an extension, a distributed version of our algorithm called DPIELM. We show that DPIELM produces excellent results comparable to conventional numerical techniques in the solution of time-dependent problems. Collectively, this work contributes towards making the use of neural networks in the solution of partial differential equations in complex domains a competitive alternative to conventional discretization techniques. |
Tasks | |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03507v1 |
https://arxiv.org/pdf/1907.03507v1.pdf | |
PWC | https://paperswithcode.com/paper/physics-informed-extreme-learning-machine |
Repo | |
Framework | |
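The core idea described in the PIELM abstract (a network whose hidden layer is random and fixed, so that a linear PDE reduces to one linear least-squares solve for the output weights) can be illustrated with a minimal sketch. This is not the authors' code: the test problem (1D Poisson equation u'' = -π² sin(πx) with u(0) = u(1) = 0), the function name `pielm_poisson_1d`, the tanh activation, and the sampling ranges for the hidden weights are all assumptions chosen for illustration.

```python
import numpy as np

def pielm_poisson_1d(n_hidden=40, n_colloc=60, seed=0):
    """Solve u''(x) = -pi^2 sin(pi x), u(0) = u(1) = 0, with a random-feature network.

    Hypothetical example problem; the exact solution is u(x) = sin(pi x).
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once and never trained.
    w = rng.uniform(-4.0, 4.0, n_hidden)
    b = rng.uniform(-4.0, 4.0, n_hidden)

    def phi(x):
        # Hidden-layer outputs, shape (len(x), n_hidden).
        return np.tanh(np.outer(x, w) + b)

    def phi_xx(x):
        # Second x-derivative of tanh(w x + b): -2 t (1 - t^2) w^2.
        t = np.tanh(np.outer(x, w) + b)
        return -2.0 * t * (1.0 - t**2) * w**2

    x = np.linspace(0.0, 1.0, n_colloc)          # collocation points
    f = -np.pi**2 * np.sin(np.pi * x)            # PDE right-hand side
    xb = np.array([0.0, 1.0])                    # boundary points

    # Stack PDE rows and boundary rows into one linear system for the output weights c.
    A = np.vstack([phi_xx(x), phi(xb)])
    rhs = np.concatenate([f, np.zeros(2)])
    c, *_ = np.linalg.lstsq(A, rhs, rcond=None)

    # Compare against the exact solution on a finer grid.
    x_test = np.linspace(0.0, 1.0, 201)
    err = float(np.max(np.abs(phi(x_test) @ c - np.sin(np.pi * x_test))))
    return c, err

if __name__ == "__main__":
    _, max_err = pielm_poisson_1d()
    print("max abs error:", max_err)
```

Because only the output weights are unknown and the PDE is linear, no iterative gradient descent is needed, which is the source of the speed-up over PINNs that the abstract refers to.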
Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks
Title | Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks |
Authors | Xiaoliang Dai, Hongxu Yin, Niraj K. Jha |
Abstract | Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications. However, their fixed architecture, substantial training cost, and significant model redundancy make it difficult to efficiently update them to accommodate previously unseen data. To solve these problems, we propose an incremental learning framework based on a grow-and-prune neural network synthesis paradigm. When new data arrive, the neural network first grows new connections based on the gradients to increase the network capacity to accommodate new data. Then, the framework iteratively prunes away connections based on the magnitude of weights to enhance network compactness, and hence recover efficiency. Finally, the model rests at a lightweight DNN that is both ready for inference and suitable for future grow-and-prune updates. The proposed framework improves accuracy, shrinks network size, and significantly reduces the additional training cost for incoming data compared to conventional approaches, such as training from scratch and network fine-tuning. For the LeNet-300-100 and LeNet-5 neural network architectures derived for the MNIST dataset, the framework reduces training cost by up to 64% (63%) and 67% (63%) compared to training from scratch (network fine-tuning), respectively. For the ResNet-18 architecture derived for the ImageNet dataset and DeepSpeech2 for the AN4 dataset, the corresponding training cost reductions against training from scratch (network fine-tuning) are 64% (60%) and 67% (62%), respectively. Our derived models contain fewer network parameters but achieve higher accuracy relative to conventional baselines. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10952v1 |
https://arxiv.org/pdf/1905.10952v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-learning-using-a-grow-and-prune |
Repo | |
Framework | |
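The two steps named in the abstract above, gradient-based growth and magnitude-based pruning, can be sketched on a single weight matrix with a boolean connectivity mask. This is a simplified illustration, not the paper's implementation: the function names `grow` and `prune`, the mask representation, and the fixed per-step budget `k` are assumptions.

```python
import numpy as np

def grow(mask, grad, k):
    """Activate the k dormant connections with the largest gradient magnitude."""
    new_mask = mask.copy()
    # Connections that are already active are excluded from the ranking.
    scores = np.where(mask, -np.inf, np.abs(grad))
    k = min(k, int((~mask).sum()))
    if k > 0:
        idx = np.argsort(scores, axis=None)[-k:]
        new_mask.flat[idx] = True
    return new_mask

def prune(mask, weights, k):
    """Deactivate the k active connections with the smallest weight magnitude."""
    new_mask = mask.copy()
    # Dormant connections are excluded from the ranking.
    scores = np.where(mask, np.abs(weights), np.inf)
    k = min(k, int(mask.sum()))
    if k > 0:
        idx = np.argsort(scores, axis=None)[:k]
        new_mask.flat[idx] = False
    return new_mask

if __name__ == "__main__":
    mask = np.array([[True, False], [False, True]])
    grad = np.array([[0.1, 0.9], [0.2, 0.3]])
    weights = np.array([[0.5, 0.01], [0.0, 2.0]])
    grown = grow(mask, grad, 1)
    print(grown)
    print(prune(grown, weights, 1))
```

In a full training loop the masked weights would be retrained between growth and pruning steps; that part is omitted here.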
GraphNAS: Graph Neural Architecture Search with Reinforcement Learning
Title | GraphNAS: Graph Neural Architecture Search with Reinforcement Learning |
Authors | Yang Gao, Hong Yang, Peng Zhang, Chuan Zhou, Yue Hu |
Abstract | Graph Neural Networks (GNNs) have been popularly used for analyzing non-Euclidean data such as social network data and biological data. Despite their success, the design of graph neural networks requires a lot of manual work and domain knowledge. In this paper, we propose a Graph Neural Architecture Search method (GraphNAS for short) that enables automatic search of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS first uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and then trains the recurrent network with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation data set. Extensive experimental results on node classification tasks in both transductive and inductive learning settings demonstrate that GraphNAS can achieve consistently better performance on the Cora, Citeseer, Pubmed citation network, and protein-protein interaction network. On node classification tasks, GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. |
Tasks | Neural Architecture Search, Node Classification |
Published | 2019-04-22 |
URL | https://arxiv.org/abs/1904.09981v2 |
https://arxiv.org/pdf/1904.09981v2.pdf | |
PWC | https://paperswithcode.com/paper/graphnas-graph-neural-architecture-search |
Repo | |
Framework | |
Phoneme Level Language Models for Sequence Based Low Resource ASR
Title | Phoneme Level Language Models for Sequence Based Low Resource ASR |
Authors | Siddharth Dalmia, Xinjian Li, Alan W Black, Florian Metze |
Abstract | Building multilingual and crosslingual models helps bring different languages together in a language universal space. It allows models to share parameters and transfer knowledge across languages, enabling faster and better adaptation to a new language. These approaches are particularly useful for low resource languages. In this paper, we propose a phoneme-level language model that can be used multilingually and for crosslingual adaptation to a target language. We show that our model performs almost as well as the monolingual models by using six times fewer parameters, and is capable of better adaptation to languages not seen during training in a low resource scenario. We show that these phoneme-level language models can be used to decode sequence based Connectionist Temporal Classification (CTC) acoustic model outputs to obtain comparable word error rates with Weighted Finite State Transducer (WFST) based decoding in Babel languages. We also show that these phoneme-level language models outperform WFST decoding in various low-resource conditions like adapting to a new language and domain mismatch between training and testing data. |
Tasks | Language Modelling |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07613v1 |
http://arxiv.org/pdf/1902.07613v1.pdf | |
PWC | https://paperswithcode.com/paper/phoneme-level-language-models-for-sequence |
Repo | |
Framework | |
PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction
Title | PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction |
Authors | Sangdon Park, Osbert Bastani, Nikolai Matni, Insup Lee |
Abstract | We propose an algorithm combining calibrated prediction and generalization bounds from learning theory to construct confidence sets for deep neural networks with PAC guarantees—i.e., the confidence set for a given input contains the true label with high probability. We demonstrate how our approach can be used to construct PAC confidence sets on ResNet for ImageNet, a visual object tracking model, and a dynamics model for the half-cheetah reinforcement learning problem. |
Tasks | Object Tracking, Visual Object Tracking |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/2001.00106v2 |
https://arxiv.org/pdf/2001.00106v2.pdf | |
PWC | https://paperswithcode.com/paper/pac-confidence-sets-for-deep-neural-networks-1 |
Repo | |
Framework | |
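The abstract above describes confidence sets that contain the true label with high probability. A simplified sketch of that idea follows. It is not the paper's algorithm: it assumes the model's probabilities are already calibrated, and it replaces the paper's generalization bound with a plain Hoeffding bound plus a union bound over a finite grid of thresholds. The names `pick_threshold` and `confidence_set` and the default `eps`, `delta`, and grid are hypothetical.

```python
import math
import numpy as np

def pick_threshold(probs, labels, eps=0.1, delta=0.05, grid=None):
    """Return the largest grid threshold t whose bounded miss rate is at most eps.

    probs: (n, K) array of predicted class probabilities on held-out data.
    labels: (n,) integer array of true labels.
    With probability at least 1 - delta over the held-out sample, every threshold
    in the grid has true miss rate <= empirical miss rate + slack (Hoeffding + union bound).
    Returns None if no threshold in the grid satisfies the bound.
    """
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    n = len(labels)
    slack = math.sqrt(math.log(len(grid) / delta) / (2.0 * n))
    true_p = probs[np.arange(n), labels]   # probability assigned to the true label
    best = None
    for t in grid:                          # grid is ascending
        emp_err = float(np.mean(true_p < t))  # fraction of sets that miss the true label
        if emp_err + slack <= eps:
            best = float(t)                 # larger t means smaller confidence sets
    return best

def confidence_set(p, t):
    """All labels whose predicted probability is at least t."""
    return [int(k) for k in np.flatnonzero(np.asarray(p) >= t)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = rng.dirichlet(np.ones(5), size=2000)
    labels = np.array([rng.choice(5, p=p) for p in probs])
    t = pick_threshold(probs, labels)
    print("threshold:", t, "example set:", confidence_set(probs[0], t))
```

A larger threshold gives smaller (more informative) sets, so the sketch keeps the largest threshold that still satisfies the bound.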
Scoot: A Perceptual Metric for Facial Sketches
Title | Scoot: A Perceptual Metric for Facial Sketches |
Authors | Deng-Ping Fan, ShengChuan Zhang, Yu-Huan Wu, Yun Liu, Ming-Ming Cheng, Bo Ren, Paul L. Rosin, Rongrong Ji |
Abstract | The human visual system has a strong ability to quickly assess the perceptual similarity between two facial sketches. However, existing widely used facial sketch metrics, e.g., FSIM and SSIM, fail to capture this perceptual similarity. A recent study in the facial modeling area has verified that the inclusion of both structure and texture has a significant positive benefit for face sketch synthesis (FSS). But which statistics are more important, and which are helpful for their success? In this paper, we design a perceptual metric, called Structure Co-Occurrence Texture (Scoot), which simultaneously considers the block-level spatial structure and co-occurrence texture statistics. To test the quality of metrics, we propose three novel meta-measures based on various reliable properties. Extensive experiments demonstrate that our Scoot metric exceeds the performance of prior work. Besides, we built the first large scale (152k judgments) human-perception-based sketch database that can evaluate how well a metric is consistent with human perception. Our results suggest that “spatial structure” and “co-occurrence texture” are two generally applicable perceptual features in face sketch synthesis. |
Tasks | Face Sketch Synthesis |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.08433v2 |
https://arxiv.org/pdf/1908.08433v2.pdf | |
PWC | https://paperswithcode.com/paper/scoot-a-perceptual-metric-for-facial-sketches |
Repo | |
Framework | |
Skin Cancer Recognition using Deep Residual Network
Title | Skin Cancer Recognition using Deep Residual Network |
Authors | Brij Rokad, Dr. Sureshkumar Nagarajan |
Abstract | The advances in technology have enabled people to access the internet from every part of the world. But to date, access to healthcare in remote areas is sparse. This proposed solution aims to bridge the gap between specialist doctors and patients. This prototype will be able to detect skin cancer from an image captured by the phone or any other camera. The network is deployed on cloud server-side processing for an even more accurate result. The Deep Residual learning model has been used for predicting the probability of cancer on the server side. The ResNet has three parametric layers. Each layer has Convolutional Neural Network, Batch Normalization, Maxpool and ReLU. Currently the model achieves an accuracy of 77% on the ISIC 2017 challenge. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.08610v1 |
https://arxiv.org/pdf/1905.08610v1.pdf | |
PWC | https://paperswithcode.com/paper/190508610 |
Repo | |
Framework | |
Bi-stream Pose Guided Region Ensemble Network for Fingertip Localization from Stereo Images
Title | Bi-stream Pose Guided Region Ensemble Network for Fingertip Localization from Stereo Images |
Authors | Guijin Wang, Cairong Zhang, Xinghao Chen, Xiangyang Ji, Jing-Hao Xue, Hang Wang |
Abstract | In human-computer interaction, it is important to accurately estimate the hand pose, especially the fingertips. However, traditional approaches for fingertip localization mainly rely on depth images and thus suffer considerably from noise and missing values. Instead of depth images, stereo images can also provide 3D information of hands and promote 3D hand pose estimation. There are nevertheless limitations on the dataset size, global viewpoints, hand articulations and hand shapes in the publicly available stereo-based hand pose datasets. To mitigate these limitations and promote further research on hand pose estimation from stereo images, we propose a new large-scale binocular hand pose dataset called THU-Bi-Hand, offering a new perspective for fingertip localization. In the THU-Bi-Hand dataset, there are 447k pairs of stereo images of different hand shapes from 10 subjects with accurate 3D location annotations of the wrist and five fingertips. Captured with minimal restriction on the range of hand motion, the dataset covers large global viewpoint space and hand articulation space. To better present the performance of fingertip localization on THU-Bi-Hand, we propose a novel scheme termed Bi-stream Pose Guided Region Ensemble Network (Bi-Pose-REN). It extracts more representative feature regions around joint points in the feature maps under the guidance of the previously estimated pose. The feature regions are integrated hierarchically according to the topology of hand joints to regress the refined hand pose. Bi-Pose-REN and several existing methods are evaluated on THU-Bi-Hand so that benchmarks are provided for further research. Experimental results show that our new method has achieved the best performance on THU-Bi-Hand. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09795v1 |
http://arxiv.org/pdf/1902.09795v1.pdf | |
PWC | https://paperswithcode.com/paper/bi-stream-pose-guided-region-ensemble-network |
Repo | |
Framework | |
Learning Independently-Obtainable Reward Functions
Title | Learning Independently-Obtainable Reward Functions |
Authors | Christopher Grimm, Satinder Singh |
Abstract | We present a novel method for learning a set of disentangled reward functions that sum to the original environment reward and are constrained to be independently obtainable. We define independent obtainability in terms of value functions with respect to obtaining one learned reward while pursuing another learned reward. Empirically, we illustrate that our method can learn meaningful reward decompositions in a variety of domains and that these decompositions exhibit some form of generalization performance when the environment’s reward is modified. Theoretically, we derive results about the effect of maximizing our method’s objective on the resulting reward functions and their corresponding optimal policies. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08649v3 |
http://arxiv.org/pdf/1901.08649v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-independently-obtainable-reward |
Repo | |
Framework | |
Feature Boosting Network For 3D Pose Estimation
Title | Feature Boosting Network For 3D Pose Estimation |
Authors | Jun Liu, Henghui Ding, Amir Shahroudy, Ling-Yu Duan, Xudong Jiang, Gang Wang, Alex C. Kot |
Abstract | In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to the ambiguities, texture and illumination variation, and self-occlusion in the real application of 3D pose estimation. To improve the reliability of the features for representing each body part and enhance the LSTD module, we further introduce a context consistency gate (CCG) in this paper, with which the convolutional feature maps are modulated according to their consistency with the context representations. We evaluate the proposed method on challenging benchmark datasets for 3D hand pose estimation and 3D full body pose estimation. Experimental results show the effectiveness of our method that achieves state-of-the-art performance on both of the tasks. |
Tasks | 3D Pose Estimation, Hand Pose Estimation, Pose Estimation |
Published | 2019-01-15 |
URL | https://arxiv.org/abs/1901.04877v2 |
https://arxiv.org/pdf/1901.04877v2.pdf | |
PWC | https://paperswithcode.com/paper/feature-boosting-network-for-3d-pose |
Repo | |
Framework | |
Robust Visual Object Tracking with Natural Language Region Proposal Network
Title | Robust Visual Object Tracking with Natural Language Region Proposal Network |
Authors | Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff |
Abstract | Tracking with natural-language (NL) specification is a powerful new paradigm to yield trackers that initialize without a manually-specified bounding box, stay on target in spite of occlusions, and auto-recover when diverged. These advantages stem in part from visual appearance and NL having distinct and complementary invariance properties. However, realizing these advantages is technically challenging: the two modalities have incompatible representations. In this paper, we present the first practical and competitive solution to the challenge of tracking with NL specification. Our first novelty is an NL region proposal network (NL-RPN) that transforms an NL description into a convolutional kernel and shares the search branch with siamese trackers; the combined network can be trained end-to-end. Secondly, we propose a novel formulation to represent the history of past visual exemplars and use those exemplars to automatically reset the tracker together with our NL-RPN. Empirical results over tracking benchmarks with NL annotations demonstrate the effectiveness of our approach. |
Tasks | Object Tracking, Visual Object Tracking |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02048v1 |
https://arxiv.org/pdf/1912.02048v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-visual-object-tracking-with-natural |
Repo | |
Framework | |
The ALOS Dataset for Advert Localization in Outdoor Scenes
Title | The ALOS Dataset for Advert Localization in Outdoor Scenes |
Authors | Soumyabrata Dev, Murhaf Hossari, Matthew Nicholson, Killian McCabe, Atul Nautiyal, Clare Conran, Jian Tang, Wei Xu, François Pitié |
Abstract | The rapid increase in the number of online videos provides the marketing and advertising agents ample opportunities to reach out to their audience. One of the most widely used strategies is product placement, or embedded marketing, wherein new advertisements are integrated seamlessly into existing advertisements in videos. Such strategies involve accurately localizing the position of the advert in the image frame, either manually in the video editing phase, or by using machine learning frameworks. However, these machine learning techniques and deep neural networks need a massive amount of data for training. In this paper, we propose and release the first large-scale dataset of advertisement billboards, captured in outdoor scenes. We also benchmark several state-of-the-art semantic segmentation algorithms on our proposed dataset. |
Tasks | Semantic Segmentation |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07776v1 |
http://arxiv.org/pdf/1904.07776v1.pdf | |
PWC | https://paperswithcode.com/paper/the-alos-dataset-for-advert-localization-in |
Repo | |
Framework | |
Deep Learning Models for Digital Pathology
Title | Deep Learning Models for Digital Pathology |
Authors | Aïcha BenTaieb, Ghassan Hamarneh |
Abstract | Histopathology images, i.e., microscopy images of stained tissue biopsies, contain fundamental prognostic information that forms the foundation of pathological analysis and diagnostic medicine. However, diagnostics from histopathology images generally rely on a visual cognitive assessment of tissue slides, which implies an inherent element of interpretation and hence subjectivity. Access to digitized histopathology images enabled the development of computational systems aiming at reducing manual intervention and automating parts of pathologists’ workflow. Specifically, applications of deep learning to histopathology image analysis now offer opportunities for better quantitative modeling of disease appearance and hence possibly improved prediction of disease aggressiveness and patient outcome. However, digitized histopathology tissue slides are unique in a variety of ways and come with their own set of computational challenges. In this survey, we summarize the different challenges facing computational systems for digital pathology and provide a review of state-of-the-art works that developed deep learning-based solutions for the predictive modeling of histopathology images from a detection, stain normalization, segmentation, and tissue classification perspective. We then discuss the challenges facing the validation and integration of such deep learning-based computational systems in clinical workflow and reflect on future opportunities for histopathology derived image measurements and better predictive modeling. |
Tasks | |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12329v2 |
https://arxiv.org/pdf/1910.12329v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-models-for-digital-pathology |
Repo | |
Framework | |
Machine Learning with Clos Networks
Title | Machine Learning with Clos Networks |
Authors | Timothy Whithing, Thiam Khean Hah |
Abstract | We present a new methodology for improving the accuracy of small neural networks by applying the concept of a Clos network to achieve maximum expression in a smaller network. We explore the design space to show that more layers are beneficial, given the same number of parameters. We also present findings on how the ReLU nonlinearity affects accuracy in separable networks. We present results on early work with the CIFAR-10 dataset. |
Tasks | |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06433v1 |
http://arxiv.org/pdf/1901.06433v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-with-clos-networks |
Repo | |
Framework | |
Horseshoe Regularization for Machine Learning in Complex and Deep Models
Title | Horseshoe Regularization for Machine Learning in Complex and Deep Models |
Authors | Anindya Bhadra, Jyotishka Datta, Yunfan Li, Nicholas G. Polson |
Abstract | Since the advent of the horseshoe priors for regularization, global-local shrinkage methods have proved to be a fertile ground for the development of Bayesian methodology in machine learning, specifically for high-dimensional regression and classification problems. They have achieved remarkable success in computation, and enjoy strong theoretical support. Most of the existing literature has focused on the linear Gaussian case; see Bhadra et al. (2019b) for a systematic survey. The purpose of the current article is to demonstrate that the horseshoe regularization is useful far more broadly, by reviewing both methodological and computational developments in complex models that are more relevant to machine learning applications. Specifically, we focus on methodological challenges in horseshoe regularization in nonlinear and non-Gaussian models; multivariate models; and deep neural networks. We also outline the recent computational developments in horseshoe shrinkage for complex models along with a list of available software implementations that allows one to venture out beyond the comfort zone of the canonical linear regression problems. |
Tasks | |
Published | 2019-04-24 |
URL | https://arxiv.org/abs/1904.10939v2 |
https://arxiv.org/pdf/1904.10939v2.pdf | |
PWC | https://paperswithcode.com/paper/horseshoe-regularization-for-machine-learning |
Repo | |
Framework | |
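For readers unfamiliar with the global-local shrinkage structure mentioned in the horseshoe abstract, here is a small sketch that samples from the standard horseshoe prior: each coefficient is normal with scale given by a local half-Cauchy scale times a global scale. The function name `sample_horseshoe` and its parameters are assumptions for illustration. With unit noise variance and a global scale of 1, the shrinkage weight κ = 1/(1 + λ²) follows a Beta(1/2, 1/2) distribution, whose U shape (mass near 0 and near 1) gives the prior its name.

```python
import numpy as np

def sample_horseshoe(n, tau=1.0, seed=0):
    """Draw n coefficients from a horseshoe prior with global scale tau.

    Returns the coefficients beta and the shrinkage weights kappa,
    where kappa near 1 means strong shrinkage toward zero and
    kappa near 0 means the coefficient is left almost unshrunk.
    """
    rng = np.random.default_rng(seed)
    lam = np.abs(rng.standard_cauchy(n))        # local scales, half-Cauchy(0, 1)
    beta = rng.normal(0.0, 1.0, n) * lam * tau  # beta_j ~ N(0, lam_j^2 tau^2)
    kappa = 1.0 / (1.0 + (tau * lam) ** 2)      # shrinkage weight in (0, 1]
    return beta, kappa

if __name__ == "__main__":
    beta, kappa = sample_horseshoe(100_000)
    print("mean shrinkage weight:", kappa.mean())
    print("share of weights below 0.1 or above 0.9:",
          np.mean((kappa < 0.1) | (kappa > 0.9)))
```

The large share of weights near the two extremes reflects the prior's behavior: most coefficients are shrunk almost entirely to zero while a few large signals pass through nearly untouched.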