Paper Group ANR 1679
Deep Reasoning with Multi-Scale Context for Salient Object Detection
Title | Deep Reasoning with Multi-Scale Context for Salient Object Detection |
Authors | Zun Li, Congyan Lang, Yunpeng Chen, Junhao Liew, Jiashi Feng |
Abstract | To detect salient objects accurately, existing methods usually design complex backbone network architectures to learn and fuse powerful features. However, the saliency inference module that performs saliency prediction from the fused features receives much less attention in its architecture design and typically adopts only a few fully convolutional layers. In this paper, we find that the limited capacity of the saliency inference module is indeed a fundamental performance bottleneck, and that enhancing its capacity is critical for obtaining better saliency prediction. Correspondingly, we propose a deep yet light-weight saliency inference module that adopts a multi-dilated depth-wise convolution architecture. Such a deep inference module, despite its simple architecture, can quickly reason about salient objects directly from the multi-scale convolutional features and give superior salient object detection performance at lower computational cost. To the best of our knowledge, we are the first to reveal the importance of the inference module for salient object detection and to present a novel architecture design with attractive efficiency and accuracy. Extensive experimental evaluations demonstrate that our simple framework performs favorably compared with state-of-the-art methods with complex backbone designs. |
Tasks | Object Detection, Saliency Detection, Saliency Prediction, Salient Object Detection |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08362v2 |
http://arxiv.org/pdf/1901.08362v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reasoning-with-multi-scale-context-for |
Repo | |
Framework | |
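The abstract above names a multi-dilated depth-wise convolution but gives no layer details. The following is only a minimal sketch of the general idea, under stated assumptions: it works on 1-D signals rather than images, uses one kernel per channel (depth-wise), and sums the branches computed at several dilation rates. The function names, the zero padding, and the summation of branches are illustrative choices, not the authors' design.

```python
def dilated_depthwise_conv1d(x, kernels, dilation):
    """Apply one kernel per channel, with kernel taps spaced `dilation` apart.

    x: list of channels, each a list of floats.
    kernels: one odd-length kernel per channel (hypothetical toy weights).
    Zero padding keeps the output as long as the input.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        center = len(kernel) // 2
        n = len(channel)
        y = []
        for i in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + (j - center) * dilation
                if 0 <= idx < n:  # positions outside the signal count as zero
                    acc += w * channel[idx]
            y.append(acc)
        out.append(y)
    return out


def multi_dilated_depthwise(x, kernels, dilations=(1, 2, 4)):
    """Sum the depth-wise outputs computed at several dilation rates (assumed fusion)."""
    branches = [dilated_depthwise_conv1d(x, kernels, d) for d in dilations]
    return [[sum(vals) for vals in zip(*chs)] for chs in zip(*branches)]
```

Larger dilation rates widen the receptive field without adding weights, which is one reason such a module can stay light-weight while mixing context from several scales.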
Mixed Precision DNNs: All you need is a good parametrization
Title | Mixed Precision DNNs: All you need is a good parametrization |
Authors | Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, Akira Nakamura |
Abstract | Efficient deep neural network (DNN) inference on mobile or embedded devices typically involves quantization of the network parameters and activations. In particular, mixed precision networks achieve better performance than networks with homogeneous bitwidth for the same size constraint. Since choosing the optimal bitwidths is not straightforward, training methods that can learn them are desirable. Differentiable quantization with straight-through gradients allows the quantizer’s parameters to be learned using gradient methods. We show that a suitable parametrization of the quantizer is the key to achieving stable training and good final performance. Specifically, we propose to parametrize the quantizer with the step size and dynamic range. The bitwidth can then be inferred from them. Other parametrizations, which explicitly use the bitwidth, consistently perform worse. We confirm our findings with experiments on CIFAR-10 and ImageNet and we obtain mixed precision DNNs with learned quantization parameters, achieving state-of-the-art performance. |
Tasks | Quantization |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11452v2 |
https://arxiv.org/pdf/1905.11452v2.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-quantization-of-deep-neural |
Repo | |
Framework | |
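The abstract above states that the quantizer is parametrized by step size and dynamic range, with the bitwidth inferred from them. A minimal sketch of that relationship follows, assuming a symmetric, signed, uniform quantizer; the function names and the exact bitwidth formula (one sign bit plus enough bits to index the magnitude levels) are assumptions made for illustration, not taken from the listing.

```python
import math


def quantize(x, step, q_max):
    """Clip x to [-q_max, q_max], then snap it to the nearest multiple of step."""
    clipped = max(-q_max, min(q_max, x))
    # floor(v + 0.5) rounds half up; Python's round() would round half to even.
    return step * math.floor(clipped / step + 0.5)


def inferred_bitwidth(step, q_max):
    """Bits needed for a signed uniform grid with the given step and range (assumed formula)."""
    return math.ceil(math.log2(q_max / step + 1)) + 1
```

For example, `step=0.25` and `q_max=1.75` give 7 positive levels, zero, and 7 negative levels, which fit in 4 bits. In training, `step` and `q_max` would be the learned quantities and the bitwidth a derived one.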
Neural network gradient-based learning of black-box function interfaces
Title | Neural network gradient-based learning of black-box function interfaces |
Authors | Alon Jacovi, Guy Hadash, Einat Kermany, Boaz Carmeli, Ofer Lavi, George Kour, Jonathan Berant |
Abstract | Deep neural networks work well at approximating complicated functions when provided with data and trained by gradient descent methods. At the same time, there is a vast number of existing functions that programmatically solve different tasks in a precise manner, eliminating the need for training. In many cases, it is possible to decompose a task into a series of functions, for some of which we may prefer to use a neural network to learn the functionality, while for others the preferred method would be to use existing black-box functions. We propose a method for end-to-end training of a base neural network that integrates calls to existing black-box functions. We do so by approximating the black-box functionality with a differentiable neural network in a way that drives the base network to comply with the black-box function interface during the end-to-end optimization process. At inference time, we replace the differentiable estimator with its external black-box non-differentiable counterpart such that the base network output matches the input arguments of the black-box function. Using this “Estimate and Replace” paradigm, we train a neural network, end to end, to compute the input to black-box functionality while eliminating the need for intermediate labels. We show that by leveraging the existing precise black-box function during inference, the integrated model generalizes better than a fully differentiable model, and learns more efficiently compared to RL-based methods. |
Tasks | |
Published | 2019-01-13 |
URL | http://arxiv.org/abs/1901.03995v1 |
http://arxiv.org/pdf/1901.03995v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-gradient-based-learning-of |
Repo | |
Framework | |
Attentive CT Lesion Detection Using Deep Pyramid Inference with Multi-Scale Booster
Title | Attentive CT Lesion Detection Using Deep Pyramid Inference with Multi-Scale Booster |
Authors | Qingbin Shao, Lijun Gong, Kai Ma, Hualuo Liu, Yefeng Zheng |
Abstract | Accurate lesion detection in computed tomography (CT) slices benefits pathologic organ analysis in the medical diagnosis process. More recently, it has been tackled as an object detection problem using Convolutional Neural Networks (CNNs). Despite the achievements of off-the-shelf CNN models, the current detection accuracy is limited by the inability of CNNs to handle lesions at vastly different scales. In this paper, we propose a Multi-Scale Booster (MSB) with channel and spatial attention integrated into the backbone Feature Pyramid Network (FPN). In each pyramid level, the proposed MSB captures fine-grained scale variations by using Hierarchically Dilated Convolutions (HDC). Meanwhile, the proposed channel and spatial attention modules increase the network’s capability of selecting relevant feature responses for lesion detection. Extensive experiments on the DeepLesion benchmark dataset demonstrate that the proposed method outperforms state-of-the-art approaches. |
Tasks | Medical Diagnosis, Object Detection |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.03958v1 |
https://arxiv.org/pdf/1907.03958v1.pdf | |
PWC | https://paperswithcode.com/paper/attentive-ct-lesion-detection-using-deep |
Repo | |
Framework | |
Salient Object Detection with Lossless Feature Reflection and Weighted Structural Loss
Title | Salient Object Detection with Lossless Feature Reflection and Weighted Structural Loss |
Authors | Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen |
Abstract | Salient object detection (SOD), which aims to identify and locate the most salient pixels or regions in images, has been attracting more and more interest due to its various real-world applications. However, this vision task is quite challenging, especially under complex image scenes. Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection. Specifically, we design a symmetrical fully convolutional network (SFCN) to effectively learn complementary saliency features under the guidance of lossless feature reflection. The location information of salient objects, together with contextual and semantic information, is jointly utilized to supervise the proposed network for more accurate saliency predictions. In addition, to overcome the blurry boundary problem, we propose a new weighted structural loss function to ensure clear object boundaries and spatially consistent saliency. The coarse prediction results are effectively refined by this structural information for performance improvements. Extensive experiments on seven saliency detection datasets demonstrate that our approach achieves consistently superior performance and outperforms the very recent state-of-the-art methods by a large margin. |
Tasks | Object Detection, Saliency Detection, Salient Object Detection |
Published | 2019-01-21 |
URL | http://arxiv.org/abs/1901.06823v1 |
http://arxiv.org/pdf/1901.06823v1.pdf | |
PWC | https://paperswithcode.com/paper/salient-object-detection-with-lossless |
Repo | |
Framework | |
Spectral-based Graph Convolutional Network for Directed Graphs
Title | Spectral-based Graph Convolutional Network for Directed Graphs |
Authors | Yi Ma, Jianye Hao, Yaodong Yang, Han Li, Junqi Jin, Guangyong Chen |
Abstract | Graph convolutional networks (GCNs) have become the most popular approaches for graph data these days because of their powerful ability to extract features from graphs. GCN approaches are divided into two categories, spectral-based and spatial-based. As the earliest convolutional networks for graph data, spectral-based GCNs have achieved impressive results in many graph-related analytics tasks. However, spectral-based models cannot directly work on directed graphs. In this paper, we propose an improved spectral-based GCN for directed graphs by leveraging redefined Laplacians to improve its propagation model. Our approach can work directly on directed graph data in semi-supervised node classification tasks. Experiments on a number of directed graph datasets demonstrate that our approach outperforms the state-of-the-art methods. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08990v1 |
https://arxiv.org/pdf/1907.08990v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-based-graph-convolutional-network |
Repo | |
Framework | |
Pretrained language model transfer on neural named entity recognition in Indonesian conversational texts
Title | Pretrained language model transfer on neural named entity recognition in Indonesian conversational texts |
Authors | Rezka Leonandya, Fariz Ikhwantri |
Abstract | Named entity recognition (NER) is an important task in NLP, which is all the more challenging in the conversational domain with its noisy facets. Moreover, conversational texts are often available only in limited amounts, making supervised tasks infeasible. To learn from small data, strong inductive biases are required. Previous work relied on hand-crafted features to encode these biases until transfer learning emerged. Here, we explore a transfer learning method, namely language model pretraining, on the NER task in Indonesian conversational texts. We utilize large unlabeled data (generic domain) to be transferred to conversational texts, enabling supervised training on limited in-domain data. We report two transfer learning variants, namely supervised model fine-tuning and unsupervised pretrained LM fine-tuning. Our experiments show that both variants outperform baseline neural models when trained on small data (100 sentences), yielding an absolute improvement of 32 points of test F1 score. Furthermore, we find that the pretrained LM encodes part-of-speech information, which is a strong predictor for NER. |
Tasks | Language Modelling, Named Entity Recognition, Transfer Learning |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.07938v1 |
http://arxiv.org/pdf/1902.07938v1.pdf | |
PWC | https://paperswithcode.com/paper/pretrained-language-model-transfer-on-neural |
Repo | |
Framework | |
An Experimental Evaluation of Large Scale GBDT Systems
Title | An Experimental Evaluation of Large Scale GBDT Systems |
Authors | Fangcheng Fu, Jiawei Jiang, Yingxia Shao, Bin Cui |
Abstract | Gradient boosting decision tree (GBDT) is a widely-used machine learning algorithm in both data analytics competitions and real-world industrial applications. Further, driven by the rapid increase in data volume, efforts have been made to train GBDT in a distributed setting to support large-scale workloads. However, we find it surprising that the existing systems manage the training dataset in different ways, but none of them have studied the impact of data management. To that end, this paper aims to study the pros and cons of different data management methods regarding the performance of distributed GBDT. We first introduce a quadrant categorization of data management policies based on data partitioning and data storage. Then we conduct an in-depth systematic analysis and summarize the advantageous scenarios of the quadrants. Based on the analysis, we further propose a novel distributed GBDT system named Vero, which adopts the unexplored composition of vertical partitioning and row-store and suits many large-scale cases. To validate our analysis empirically, we implement different quadrants in the same code base and compare them under extensive workloads, and finally compare Vero with other state-of-the-art systems over a wide range of datasets. Our theoretical and experimental results provide a guideline on choosing a proper data management policy for a given workload. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01882v2 |
https://arxiv.org/pdf/1907.01882v2.pdf | |
PWC | https://paperswithcode.com/paper/an-experimental-evaluation-of-large-scale |
Repo | |
Framework | |
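The quadrant categorization in the abstract above combines a partitioning choice (horizontal or vertical) with a storage choice (row-store or column-store). The sketch below shows those four combinations on a toy dataset. It is an illustration only: the round-robin assignment of rows and features to workers and all function names are assumptions, not a description of Vero or any other system.

```python
def partition_horizontal(rows, n_workers):
    """Each worker gets a subset of the instances, with all their features."""
    return [rows[w::n_workers] for w in range(n_workers)]


def partition_vertical(rows, n_workers):
    """Each worker gets every instance, restricted to a subset of the features."""
    n_features = len(rows[0])
    shards = []
    for w in range(n_workers):
        cols = list(range(w, n_features, n_workers))  # round-robin (assumed)
        shards.append([[row[c] for c in cols] for row in rows])
    return shards


def to_column_store(shard):
    """Transpose a row-store shard so that each inner list holds one feature."""
    return [list(col) for col in zip(*shard)]


# Toy data: 3 instances with 4 features each.
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
vertical_row_store = partition_vertical(data, 2)  # the combination the abstract attributes to Vero
vertical_col_store = [to_column_store(s) for s in vertical_row_store]
horizontal_row_store = partition_horizontal(data, 2)
horizontal_col_store = [to_column_store(s) for s in horizontal_row_store]
```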
StrokeSave: A Novel, High-Performance Mobile Application for Stroke Diagnosis using Deep Learning and Computer Vision
Title | StrokeSave: A Novel, High-Performance Mobile Application for Stroke Diagnosis using Deep Learning and Computer Vision |
Authors | Ankit Gupta |
Abstract | According to the WHO, Cerebrovascular Stroke, or CS, is the second largest cause of death worldwide. Current diagnosis of CS relies on labor- and cost-intensive neuroimaging techniques, unsuitable for areas with inadequate access to quality medical facilities. Thus, there is a great need for an efficient diagnosis alternative. StrokeSave is a platform for users to self-diagnose their susceptibility to stroke. The mobile app is continuously updated with heart rate, blood pressure, and blood oxygen data from sensors on the patient’s wrist. Once these measurements reach a threshold for possible stroke, the patient takes facial images and vocal recordings to screen for paralysis attributed to stroke. A custom-designed lens attached to a phone’s camera then takes retinal images for the deep learning model to classify based on presence of retinopathy and sends a comprehensive diagnosis. The deep learning model, which consists of an RNN trained on 100 slurred-voice audio files, an SVM trained on 410 vascular data points, and a CNN trained on 520 retinopathy images, achieved a holistic accuracy of 95.0 percent when validated on 327 samples. This value exceeds clinical examination accuracy, which is around 40 to 89 percent, further demonstrating the vital utility of such a medical device. Through this automated platform, users receive efficient, highly accurate diagnosis without professional medical assistance, revolutionizing medical diagnosis of CS and potentially saving millions of lives. |
Tasks | Medical Diagnosis |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.05358v1 |
https://arxiv.org/pdf/1907.05358v1.pdf | |
PWC | https://paperswithcode.com/paper/strokesave-a-novel-high-performance-mobile |
Repo | |
Framework | |
MimicGAN: Robust Projection onto Image Manifolds with Corruption Mimicking
Title | MimicGAN: Robust Projection onto Image Manifolds with Corruption Mimicking |
Authors | Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Timo Bremer |
Abstract | In the past few years, Generative Adversarial Networks (GANs) have dramatically advanced our ability to represent and parameterize high-dimensional, non-linear image manifolds. As a result, they have been widely adopted across a variety of applications, ranging from challenging inverse problems like image completion, to problems such as anomaly detection and adversarial defense. A recurring theme in many of these applications is the notion of projecting an image observation onto the manifold that is inferred by the generator. In this context, Projected Gradient Descent (PGD) has been the most popular approach, which essentially optimizes for a latent vector that minimizes the discrepancy between a generated image and the given observation. However, PGD is a brittle optimization technique that fails to identify the right projection (or latent vector) when the observation is corrupted, or perturbed even by a small amount. Such corruptions are common in the real world; for example, images in the wild come with unknown crops, rotations, missing pixels, or other kinds of non-linear distributional shifts which break current encoding methods, rendering downstream applications unusable. To address this, we propose corruption mimicking – a new robust projection technique that utilizes a surrogate network to approximate the unknown corruption directly at test time, without the need for additional supervision or data augmentation. The proposed method is significantly more robust than PGD and other competing methods under a wide variety of corruptions, thereby enabling a more effective use of GANs in real-world applications. More importantly, we show that our approach produces state-of-the-art performance in several GAN-based applications – anomaly detection, domain adaptation, and adversarial defense – that benefit from an accurate projection. |
Tasks | Adversarial Defense, Anomaly Detection, Data Augmentation, Domain Adaptation |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07748v2 |
https://arxiv.org/pdf/1912.07748v2.pdf | |
PWC | https://paperswithcode.com/paper/mimicgan-robust-projection-onto-image |
Repo | |
Framework | |
Automatic Detection of Satire in Bangla Documents: A CNN Approach Based on Hybrid Feature Extraction Model
Title | Automatic Detection of Satire in Bangla Documents: A CNN Approach Based on Hybrid Feature Extraction Model |
Authors | Arnab Sen Sharma, Maruf Ahmed Mridul, Md Saiful Islam |
Abstract | The spread of satirical news in online communities is an ongoing trend. The nature of satire is so inherently ambiguous that sometimes it’s too hard even for humans to understand whether it’s actually satire or not. So, research interest has grown in this field. The purpose of this research is to detect Bangla satirical news spread in online news portals as well as social media. In this paper, we propose a hybrid technique for extracting features from text documents combining Word2Vec and TF-IDF. Using our proposed feature extraction technique with a standard CNN architecture, we could detect whether a Bangla text document is satire or not with an accuracy of more than 96%. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.11062v1 |
https://arxiv.org/pdf/1911.11062v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-detection-of-satire-in-bangla |
Repo | |
Framework | |
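The abstract above says the features combine Word2Vec and TF-IDF but does not say how. One common way to combine them, assumed here purely for illustration, is a TF-IDF-weighted average of word vectors. The toy embeddings, the `+ 1.0` smoothing in the IDF, and the function names are all hypothetical; a real pipeline would use trained Word2Vec vectors for Bangla tokens.

```python
import math
from collections import Counter


def idf_table(docs):
    """Inverse document frequency per token; docs is a list of token lists."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log(n / count) + 1.0 for tok, count in df.items()}


def tfidf_weighted_vector(doc, idf, embeddings):
    """Average the word vectors of doc, weighting each token by its TF-IDF score."""
    tf = Counter(doc)
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in tf.items():
        if tok not in embeddings or tok not in idf:
            continue  # skip tokens with no vector or no IDF entry
        w = (count / len(doc)) * idf[tok]
        weight_sum += w
        for k in range(dim):
            total[k] += w * embeddings[tok][k]
    if weight_sum == 0.0:
        return total  # no known tokens: all-zero vector
    return [v / weight_sum for v in total]
```

The resulting fixed-length vector (or a sequence of weighted token vectors) could then be fed to a classifier such as a CNN; that step is outside this sketch.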
Domain adaptation for holistic skin detection
Title | Domain adaptation for holistic skin detection |
Authors | Aloisio Dourado, Frederico Guth, Teofilo Emidio de Campos, Li Weigang |
Abstract | Human skin detection in images is a widely studied topic of Computer Vision for which it is commonly accepted that analysis of pixel color or local patches may suffice. This is because skin regions appear to be relatively uniform and many argue that there is a small chromatic variation among different samples. However, we found that there are strong biases in the datasets commonly used to train or tune skin detection methods. Furthermore, the lack of contextual information may hinder the performance of local approaches. In this paper we present a comprehensive evaluation of holistic and local Convolutional Neural Network (CNN) approaches on in-domain and cross-domain experiments and compare with state-of-the-art pixel-based approaches. We also propose a combination of inductive transfer learning and unsupervised domain adaptation methods, which are evaluated on different domains under varying amounts of available labelled data. We show a clear superiority of CNN over pixel-based approaches even without labelled training samples on the target domain. Furthermore, we provide experimental support for the counter-intuitive superiority of holistic over local approaches for human skin detection. |
Tasks | Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2019-03-16 |
URL | https://arxiv.org/abs/1903.06969v2 |
https://arxiv.org/pdf/1903.06969v2.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptation-for-holistic-skin-detection |
Repo | |
Framework | |
Requirements-driven Test Generation for Autonomous Vehicles with Machine Learning Components
Title | Requirements-driven Test Generation for Autonomous Vehicles with Machine Learning Components |
Authors | Cumhur Erkan Tuncali, Georgios Fainekos, Danil Prokhorov, Hisahiro Ito, James Kapinski |
Abstract | Autonomous vehicles are complex systems that are challenging to test and debug. A requirements-driven approach to the development process can decrease the resources required to design and test these systems, while simultaneously increasing the reliability. We present a testing framework that uses signal temporal logic (STL), which is a precise and unambiguous requirements language. Our framework evaluates test cases against the STL formulae and additionally uses the requirements to automatically identify test cases that fail to satisfy the requirements. One of the key features of our tool is the support for machine learning (ML) components in the system design, such as deep neural networks. The framework allows evaluation of the control algorithms, including the ML components, and it also includes models of CCD camera, lidar, and radar sensors, as well as the vehicle environment. We use multiple methods to generate test cases, including covering arrays, which are an efficient way to search discrete variable spaces. The resulting test cases can be used to debug the controller design by identifying controller behaviors that do not satisfy requirements. The test cases can also enhance the testing phase of development by identifying critical corner cases that correspond to the limits of the system’s allowed behaviors. We present STL requirements for an autonomous vehicle system, which capture both component-level and system-level behaviors. Additionally, we present three driving scenarios and demonstrate how our requirements-driven testing framework can be used to identify critical system behaviors, which can be used to support the development process. |
Tasks | Autonomous Vehicles |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.01094v1 |
https://arxiv.org/pdf/1908.01094v1.pdf | |
PWC | https://paperswithcode.com/paper/requirements-driven-test-generation-for |
Repo | |
Framework | |
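The abstract above describes evaluating test cases against STL formulae and identifying those that fail. A standard way to do this is through quantitative (robustness) semantics, where a negative value means the requirement is violated. The sketch below covers only two simple formula shapes over discrete-time traces; the function names, the speed-limit requirement, and the toy traces are hypothetical and are not taken from the framework in the listing.

```python
def rob_always_less_than(signal, bound):
    """Robustness of 'always (x < bound)': the worst-case margin over the trace."""
    return min(bound - x for x in signal)


def rob_eventually_greater_than(signal, bound):
    """Robustness of 'eventually (x > bound)': the best-case margin over the trace."""
    return max(x - bound for x in signal)


def failing_cases(test_cases, bound):
    """Names of test cases whose trace violates 'always (x < bound)'."""
    return [name for name, sig in test_cases.items()
            if rob_always_less_than(sig, bound) < 0]


# Hypothetical speed traces (m/s) checked against an assumed limit of 30 m/s.
traces = {"ok": [10, 20, 29], "bad": [10, 35, 20]}
violations = failing_cases(traces, 30)
```

Because robustness is a number rather than a pass/fail flag, a test generator can search for inputs that minimize it, which steers the search toward corner cases near the boundary of allowed behavior.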
Deep neural network for Wannier function centers
Title | Deep neural network for Wannier function centers |
Authors | Linfeng Zhang, Mohan Chen, Xifan Wu, Han Wang, Weinan E, Roberto Car |
Abstract | We introduce a deep neural network (DNN) model that assigns the position of the centers of the electronic charge in each atomic configuration on a molecular dynamics trajectory. The electronic centers are uniquely specified by the unitary transformation that maps the occupied eigenstates onto maximally localized Wannier functions. In combination with deep potential molecular dynamics, a DNN approach to represent the potential energy surface of a multi-atom system at the ab-initio density functional level of theory, the scheme makes it possible to predict the dielectric properties of insulators using samples and trajectories inaccessible to direct ab-initio molecular dynamics simulation, while retaining the accuracy of that approach. As an example, we report calculations of the infrared absorption spectra of light and heavy water at a dispersion-inclusive hybrid functional level of theory, finding good agreement with experiment. Extensions to other spectroscopies, like Raman and sum frequency generation, are discussed. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11434v1 |
https://arxiv.org/pdf/1906.11434v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-network-for-wannier-function |
Repo | |
Framework | |
Predicting Global Variations in Outdoor PM2.5 Concentrations using Satellite Images and Deep Convolutional Neural Networks
Title | Predicting Global Variations in Outdoor PM2.5 Concentrations using Satellite Images and Deep Convolutional Neural Networks |
Authors | Kris Y. Hong, Pedro O. Pinheiro, Scott Weichenthal |
Abstract | Here we present a new method of estimating global variations in outdoor PM$_{2.5}$ concentrations using satellite images combined with ground-level measurements and deep convolutional neural networks. Specifically, new deep learning models were trained over the global PM$_{2.5}$ concentration range ($<$1-436 $\mu$g/m$^3$) using a large database of satellite images paired with ground level PM$_{2.5}$ measurements available from the World Health Organization. Final model selection was based on a systematic evaluation of well-known architectures for the convolutional base including InceptionV3, Xception, and VGG16. The Xception architecture performed best and the final global model had a root mean square error (RMSE) value of 13.01 $\mu$g/m$^3$ (R$^2$=0.75) in the disjoint test set. The predictive performance of our new global model (called IMAGE-PM$_{2.5}$) is similar to the current state-of-the-art model used in the Global Burden of Disease study but relies only on satellite images as input. As a result, the IMAGE-PM$_{2.5}$ model offers a fast, cost-effective means of estimating global variations in long-term average PM$_{2.5}$ concentrations and may be particularly useful for regions without ground monitoring data or detailed emissions inventories. The IMAGE-PM$_{2.5}$ model can be used as a stand-alone method of global exposure estimation or incorporated into more complex hierarchical model structures. |
Tasks | Model Selection |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.03975v1 |
https://arxiv.org/pdf/1906.03975v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-global-variations-in-outdoor-pm25 |
Repo | |
Framework | |
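The abstract above reports model quality as RMSE and R$^2$ on a disjoint test set. For reference, a minimal sketch of both metrics follows. It assumes R$^2$ is the coefficient of determination, 1 - SS_res / SS_tot; the abstract does not say which definition was used (a squared correlation coefficient is another common choice), and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

RMSE is in the units of the target (here it would be $\mu$g/m$^3$), while R$^2$ is unitless, so the two numbers answer different questions about the same predictions.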