October 18, 2019

3246 words 16 mins read

Paper Group ANR 472

Bridgeout: stochastic bridge regularization for deep neural networks. Web Applicable Computer-aided Diagnosis of Glaucoma Using Deep Learning. Robust Adaptive Median Binary Pattern for noisy texture classification and retrieval. Double Deep Q-Learning for Optimal Execution. Memory Warps for Learning Long-Term Online Video Representations. Multi-View Bayesian Correlated Component Analysis …

Bridgeout: stochastic bridge regularization for deep neural networks

Title Bridgeout: stochastic bridge regularization for deep neural networks
Authors Najeeb Khan, Jawad Shah, Ian Stavness
Abstract A major challenge in training deep neural networks is overfitting, i.e., inferior performance on unseen test examples compared to performance on training examples. To reduce overfitting, stochastic regularization methods have shown superior performance compared to deterministic weight penalties on a number of image recognition tasks. Stochastic methods such as Dropout and Shakeout, in expectation, are equivalent to imposing a ridge and elastic-net penalty on the model parameters, respectively. However, the choice of the norm of the weight penalty is problem dependent and is not restricted to ${L_1,L_2}$. Therefore, in this paper we propose the Bridgeout stochastic regularization technique and prove that it is equivalent to an $L_q$ penalty on the weights, where the norm $q$ can be learned as a hyperparameter from data. Experimental results show that Bridgeout results in sparse model weights, improved gradients and superior classification performance compared to Dropout and Shakeout on synthetic and real datasets. (See the illustrative sketch after this entry.)
Tasks
Published 2018-04-21
URL http://arxiv.org/abs/1804.08042v1
PDF http://arxiv.org/pdf/1804.08042v1.pdf
PWC https://paperswithcode.com/paper/bridgeout-stochastic-bridge-regularization
Repo
Framework
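
The abstract describes a dropout-style perturbation whose expectation matches an $L_q$ weight penalty. Below is a hypothetical sketch of such a layer, assuming a zero-mean Bernoulli perturbation scaled by $|w|^{q/2}$; it is not the paper's exact update rule, and the class name `BridgeoutLinear` and all constants are illustrative.

```python
# Hypothetical Bridgeout-style linear layer (illustrative, not the paper's
# exact formulation): weights receive zero-mean Bernoulli noise whose
# magnitude scales with |w|^(q/2), so in expectation the objective behaves
# like an L_q weight penalty, with q exposed as a hyperparameter.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BridgeoutLinear(nn.Module):
    def __init__(self, in_features, out_features, p=0.5, q=1.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.p = p  # keep probability of the Bernoulli mask
        self.q = q  # norm exponent of the implied penalty (hyperparameter)

    def forward(self, x):
        if self.training:
            mask = torch.bernoulli(torch.full_like(self.weight, self.p))
            noise = (mask / self.p - 1.0) * self.weight.abs().pow(self.q / 2)
            w = self.weight + noise * self.weight.sign()
        else:
            w = self.weight  # no perturbation at test time
        return F.linear(x, w, self.bias)

# usage: y = BridgeoutLinear(784, 256, p=0.8, q=1.2)(torch.randn(32, 784))
```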

Web Applicable Computer-aided Diagnosis of Glaucoma Using Deep Learning

Title Web Applicable Computer-aided Diagnosis of Glaucoma Using Deep Learning
Authors Mijung Kim, Olivier Janssens, Ho-min Park, Jasper Zuallaert, Sofie Van Hoecke, Wesley De Neve
Abstract Glaucoma is a major eye disease, leading to vision loss in the absence of proper medical treatment. Current diagnosis of glaucoma is performed by ophthalmologists, who often analyze several types of medical images generated by different types of medical equipment. Capturing and analyzing these medical images is labor-intensive and expensive. In this paper, we present a novel computational approach towards glaucoma diagnosis and localization, making use only of eye fundus images that are analyzed by state-of-the-art deep learning techniques. Specifically, our approach leverages Convolutional Neural Networks (CNNs) and Gradient-weighted Class Activation Mapping (Grad-CAM) for glaucoma diagnosis and localization, respectively. Quantitative and qualitative results, as obtained for a small-sized dataset with no segmentation ground truth, demonstrate that the proposed approach is promising, for instance achieving an accuracy of 0.91$\pm0.02$ and an ROC-AUC score of 0.94 for the diagnosis task. Furthermore, we present a publicly available prototype web application that integrates our predictive model, with the goal of making effective glaucoma diagnosis available to a wide audience. (See the illustrative sketch after this entry.)
Tasks
Published 2018-12-06
URL http://arxiv.org/abs/1812.02405v2
PDF http://arxiv.org/pdf/1812.02405v2.pdf
PWC https://paperswithcode.com/paper/web-applicable-computer-aided-diagnosis-of
Repo
Framework
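
Grad-CAM is a standard technique; the sketch below shows the generic recipe on a stand-in, untrained ResNet-18 backbone rather than the authors' trained glaucoma model or preprocessing pipeline. A forward hook captures the last convolutional block's activations and their gradients, and the localization map is the ReLU of their channel-weighted sum.

```python
# Minimal Grad-CAM sketch on a stand-in (untrained) ResNet-18; the paper's
# trained glaucoma classifier and fundus-image preprocessing are not shown.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18().eval()
feats, grads = {}, {}

def fwd_hook(module, inp, out):
    feats["value"] = out                                      # activations of the last conv block
    out.register_hook(lambda g: grads.update({"value": g}))   # and their gradients

model.layer4.register_forward_hook(fwd_hook)

x = torch.randn(1, 3, 224, 224)                               # a fundus image would go here
logits = model(x)
logits[0, logits.argmax()].backward()                         # backprop the top predicted class

weights = grads["value"].mean(dim=(2, 3), keepdim=True)       # channel importance (GAP of grads)
cam = F.relu((weights * feats["value"]).sum(dim=1))           # weighted activation sum, (1, 7, 7)
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[2:], mode="bilinear",
                    align_corners=False)[0, 0]
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)      # heat map in [0, 1] for overlay
```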

Robust Adaptive Median Binary Pattern for noisy texture classification and retrieval

Title Robust Adaptive Median Binary Pattern for noisy texture classification and retrieval
Authors Mohammad Alkhatib, Adel Hafiane
Abstract Texture is an important cue for different computer vision tasks and applications. Local Binary Pattern (LBP) is considered one of the best yet most efficient texture descriptors. However, LBP has some notable limitations, most importantly its sensitivity to noise. In this paper, we address this limitation by introducing a novel texture descriptor, the Robust Adaptive Median Binary Pattern (RAMBP). RAMBP is based on a classification process for noisy pixels, an adaptive analysis window, scale analysis, and median comparison of image regions. The proposed method handles images with highly noisy textures and increases the discriminative properties by capturing both microstructure and macrostructure texture information. It has been evaluated on popular texture datasets for classification and retrieval tasks under different high-noise conditions. Without any training or prior knowledge of the noise type, RAMBP achieved the best classification performance compared to state-of-the-art techniques: it scored more than 90% under 50% impulse noise densities, more than 95% on Gaussian-noised textures with standard deviation $\sigma = 5$, and more than 99% on Gaussian-blurred textures with standard deviation $\sigma = 1.25$. The proposed method also yielded competitive results and high performance as one of the best descriptors for noise-free texture classification. Furthermore, RAMBP showed high performance on noisy texture retrieval, providing high recall and precision scores for textures with high levels of noise. (See the illustrative sketch after this entry.)
Tasks Texture Classification
Published 2018-05-15
URL http://arxiv.org/abs/1805.05732v1
PDF http://arxiv.org/pdf/1805.05732v1.pdf
PWC https://paperswithcode.com/paper/robust-adaptive-median-binary-pattern-for
Repo
Framework
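
The full RAMBP pipeline (noisy-pixel classification, adaptive windows, multi-scale and region analysis) is beyond a short snippet, but the core noise-robust idea, thresholding each neighborhood against its local median instead of the center pixel, can be sketched as follows. The function name and the plain 3x3 neighborhood are illustrative assumptions.

```python
# Minimal median-binary-pattern sketch (illustrative, not the full RAMBP
# pipeline): each 3x3 neighborhood is thresholded against its median rather
# than its center pixel, which is the noise-robustness idea the descriptor
# builds on; RAMBP adds noisy-pixel classification, adaptive windows, and
# multi-scale/region analysis on top of this.
import numpy as np

def median_binary_pattern(img):
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint16)
    # 8 neighbors in clockwise order around the center of each 3x3 patch
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2]
            med = np.median(patch)
            code = 0
            for bit, (di, dj) in enumerate(offsets):
                if img[i + di, j + dj] >= med:
                    code |= 1 << bit
            codes[i - 1, j - 1] = code
    # the 256-bin histogram of codes is the texture feature vector
    hist, _ = np.histogram(codes, bins=256, range=(0, 256), density=True)
    return hist

# usage: feat = median_binary_pattern(np.random.randint(0, 256, (64, 64)).astype(float))
```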

Double Deep Q-Learning for Optimal Execution

Title Double Deep Q-Learning for Optimal Execution
Authors Brian Ning, Franco Ho Ting Ling, Sebastian Jaimungal
Abstract Optimal trade execution is an important problem faced by essentially all traders. Much research into optimal execution relies on stringent model assumptions and applies continuous-time stochastic control to solve the resulting problems. Here, we instead take a model-free approach and develop a variation of Deep Q-Learning to estimate the optimal actions of a trader. The model is a fully connected neural network trained using Experience Replay and Double DQN, with input features given by the current state of the limit order book, other trading signals, and available execution actions, while the output is the Q-value function estimating the future rewards under an arbitrary action. We apply our model to nine different stocks and find that it outperforms the standard benchmark approach on most stocks using the measures of (i) mean and median out-performance, (ii) probability of out-performance, and (iii) gain-loss ratios. (See the illustrative sketch after this entry.)
Tasks Q-Learning
Published 2018-12-17
URL http://arxiv.org/abs/1812.06600v1
PDF http://arxiv.org/pdf/1812.06600v1.pdf
PWC https://paperswithcode.com/paper/double-deep-q-learning-for-optimal-execution
Repo
Framework
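
The sketch below shows the generic Double DQN target the abstract refers to, with the online network selecting the next action and the target network evaluating it; the paper's limit-order-book state features, action set, and reward shaping are not reproduced.

```python
# Generic Double DQN target computation (illustrative; not the paper's
# state/action/reward design for execution).
import torch

def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Target = r + gamma * Q_target(s', argmax_a Q_online(s', a)) for non-terminal s'."""
    with torch.no_grad():
        next_q_online = online_net(next_states)               # (batch, n_actions)
        best_actions = next_q_online.argmax(dim=1, keepdim=True)
        next_q_target = target_net(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q_target

# training step (sketch):
# q = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
# loss = torch.nn.functional.smooth_l1_loss(
#     q, double_dqn_targets(online_net, target_net, rewards, next_states, dones))
```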

Memory Warps for Learning Long-Term Online Video Representations

Title Memory Warps for Learning Long-Term Online Video Representations
Authors Tuan-Hung Vu, Wongun Choi, Samuel Schulter, Manmohan Chandraker
Abstract This paper proposes a novel memory-based online video representation that is efficient, accurate and predictive. This is in contrast to prior works that often rely on computationally heavy 3D convolutions, ignore actual motion when aligning features over time, or operate in an off-line mode to utilize future frames. In particular, our memory (i) holds the feature representation, (ii) is spatially warped over time to compensate for observer and scene motions, (iii) can carry long-term information, and (iv) enables predicting feature representations in future frames. By exploring a variant that operates at multiple temporal scales, we efficiently learn across even longer time horizons. We apply our online framework to object detection in videos, obtaining a large 2.3-times speed-up and losing only 0.9% mAP on the ImageNet-VID dataset, compared to prior works that even use future frames. Finally, we demonstrate the predictive property of our representation in two novel detection setups, where features are propagated over time to (i) significantly enhance a real-time detector by more than 10% mAP in a multi-threaded online setup and to (ii) anticipate objects in future frames. (See the illustrative sketch after this entry.)
Tasks Object Detection
Published 2018-03-28
URL http://arxiv.org/abs/1803.10861v1
PDF http://arxiv.org/pdf/1803.10861v1.pdf
PWC https://paperswithcode.com/paper/memory-warps-for-learning-long-term-online
Repo
Framework
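
A minimal sketch of the basic operation behind "spatially warped over time": resampling a memory feature map along a 2D displacement field. It assumes a generic flow tensor and standard bilinear sampling; the paper's full memory module and multi-scale variant are not shown.

```python
# Warping a memory feature map with a 2D flow field via bilinear sampling
# (illustrative building block only).
import torch
import torch.nn.functional as F

def warp_features(memory, flow):
    """memory: (N, C, H, W) features; flow: (N, 2, H, W) pixel displacements (x, y)."""
    n, _, h, w = memory.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0)   # (1, 2, H, W) pixel grid
    coords = base + flow                                        # displaced sampling locations
    # normalize to [-1, 1] as required by grid_sample, (x, y) order
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((coords_x, coords_y), dim=-1)            # (N, H, W, 2)
    return F.grid_sample(memory, grid, align_corners=True)

# usage: with zero flow the output equals the input memory
# warped = warp_features(torch.randn(1, 256, 32, 32), torch.zeros(1, 2, 32, 32))
```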

Multi-View Bayesian Correlated Component Analysis

Title Multi-View Bayesian Correlated Component Analysis
Authors Simon Kamronn, Andreas Trier Poulsen, Lars Kai Hansen
Abstract Correlated component analysis as proposed by Dmochowski et al. (2012) is a tool for investigating brain process similarity in the responses to multiple views of a given stimulus. Correlated components are identified under the assumption that the involved spatial networks are identical. Here we propose a hierarchical probabilistic model that can infer the level of universality in such multi-view data, from completely unrelated representations, corresponding to canonical correlation analysis, to identical representations as in correlated component analysis. This new model, which we denote Bayesian correlated component analysis, evaluates favourably against three relevant algorithms on simulated data. A well-established benchmark EEG dataset is used to further validate the new model and infer the variability of spatial representations across multiple subjects. (See the illustrative sketch after this entry.)
Tasks EEG
Published 2018-02-07
URL http://arxiv.org/abs/1802.02343v1
PDF http://arxiv.org/pdf/1802.02343v1.pdf
PWC https://paperswithcode.com/paper/multi-view-bayesian-correlated-component
Repo
Framework
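
For context, the "identical spatial networks" special case the abstract builds on (classical correlated component analysis) can be written as a generalized eigenvalue problem, as sketched below; the hierarchical Bayesian model that interpolates between this and CCA is not reproduced.

```python
# Classical correlated component analysis sketch (shared-spatial-filter case):
# shared filters maximize the ratio of between-view to pooled within-view
# covariance, solved as a generalized symmetric eigenproblem.
import numpy as np
from scipy.linalg import eigh

def correlated_components(x1, x2, n_components=3):
    """x1, x2: (channels, samples) responses of two views/subjects to the same stimulus."""
    x1 = x1 - x1.mean(axis=1, keepdims=True)
    x2 = x2 - x2.mean(axis=1, keepdims=True)
    r11, r22, r12 = x1 @ x1.T, x2 @ x2.T, x1 @ x2.T
    rb = r12 + r12.T                                   # between-view covariance, symmetrized
    rw = r11 + r22 + 1e-9 * np.eye(x1.shape[0])        # pooled within-view covariance (ridged)
    evals, evecs = eigh(rb, rw)                        # generalized eigendecomposition
    order = np.argsort(evals)[::-1]                    # strongest correlated components first
    return evecs[:, order[:n_components]]

# usage: w = correlated_components(np.random.randn(32, 1000), np.random.randn(32, 1000))
```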

Multilayer Complex Network Descriptors for Color-Texture Characterization

Title Multilayer Complex Network Descriptors for Color-Texture Characterization
Authors Leonardo F S Scabini, Rayner H M Condori, Wesley N Gonçalves, Odemir M Bruno
Abstract A new method based on complex networks is proposed for color-texture analysis. The proposal consists of modeling the image as a multilayer complex network where each color channel is a layer and each pixel (in each color channel) is represented as a network vertex. The dynamic evolution of the network is assessed using a set of modeling parameters (radii and thresholds), and new characterization techniques are introduced to capture information regarding within- and between-channel spatial interactions. An automatic and adaptive approach for threshold selection is also proposed. We conduct classification experiments on 5 well-known datasets: Vistex, Usptex, Outex13, CUReT and MBT. Results are compared with various methods from the literature, including deep convolutional neural networks with pre-trained architectures. The proposed method presented the highest overall performance over the 5 datasets, with a mean accuracy of 97.7 against 97.0 achieved by the ResNet convolutional neural network with 50 layers. (See the illustrative sketch after this entry.)
Tasks Texture Classification
Published 2018-04-02
URL http://arxiv.org/abs/1804.00501v1
PDF http://arxiv.org/pdf/1804.00501v1.pdf
PWC https://paperswithcode.com/paper/multilayer-complex-network-descriptors-for
Repo
Framework
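
The sketch below is a deliberately simplified version of the modeling idea: pixels of each color channel are vertices, vertices within a radius are linked when their intensity difference is below a threshold, and simple degree statistics act as descriptors. The paper's exact connection rule, adaptive thresholding, and characterization measures differ.

```python
# Simplified multilayer pixel-network descriptor: per layer pair (color
# channels), count how many nearby pixels are "connected" (similar intensity)
# and summarize the resulting degree maps. Borders wrap around (np.roll) for
# brevity.
import numpy as np

def degree_descriptor(img, radius=2, threshold=0.1):
    """img: (H, W, 3) array in [0, 1]; returns mean/std of vertex degrees per layer pair."""
    h, w, c = img.shape
    offsets = [(dy, dx) for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if (dy, dx) != (0, 0) and dy * dy + dx * dx <= radius * radius]
    feats = []
    for ci in range(c):            # source layer (color channel)
        for cj in range(c):        # target layer: cj == ci is the within-channel case
            degree = np.zeros((h, w))
            for dy, dx in offsets:
                shifted = np.roll(np.roll(img[:, :, cj], dy, axis=0), dx, axis=1)
                degree += (np.abs(img[:, :, ci] - shifted) <= threshold)
            feats.extend([degree.mean(), degree.std()])
    return np.array(feats)

# usage: f = degree_descriptor(np.random.rand(64, 64, 3))
```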

Abdominal multi-organ segmentation with organ-attention networks and statistical fusion

Title Abdominal multi-organ segmentation with organ-attention networks and statistical fusion
Authors Yan Wang, Yuyin Zhou, Wei Shen, Seyoun Park, Elliot K. Fishman, Alan L. Yuille
Abstract Accurate and robust segmentation of abdominal organs on CT is essential for many clinical applications such as computer-aided diagnosis and computer-aided surgery. But this task is challenging due to the weak boundaries of organs, the complexity of the background, and the variable sizes of different organs. To address these challenges, we introduce a novel framework for multi-organ segmentation using organ-attention networks with reverse connections (OAN-RCs), which are applied to 2D views of the 3D CT volume and whose output estimates are combined by statistical fusion exploiting structural similarity. OAN is a two-stage deep convolutional network, where deep network features from the first stage are combined with the original image in a second stage to reduce the complex background and enhance the discriminative information for the target organs. RCs are added to the first stage to give the lower layers semantic information, thereby enabling them to adapt to the sizes of different organs. Our networks are trained on 2D views, enabling us to use holistic information and allowing efficient computation. To compensate for the limited cross-sectional information of the original 3D volumetric CT, multi-sectional images are reconstructed from the three different 2D view directions. Then we combine the segmentation results from the different views using statistical fusion, with a novel term relating the structural similarity of the 2D views to the original 3D structure. To train the network and evaluate results, 13 structures were manually annotated by four human raters and confirmed by a senior expert on 236 normal cases. We tested our algorithm and computed Dice-Sorensen similarity coefficients and surface distances for evaluating our estimates of the 13 structures. Our experiments show that the proposed approach outperforms 2D- and 3D-patch based state-of-the-art methods. (See the illustrative sketch after this entry.)
Tasks
Published 2018-04-23
URL http://arxiv.org/abs/1804.08414v1
PDF http://arxiv.org/pdf/1804.08414v1.pdf
PWC https://paperswithcode.com/paper/abdominal-multi-organ-segmentation-with-organ
Repo
Framework
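
A simplified view of the fusion step: per-voxel organ probabilities predicted from the three 2D view directions are averaged (optionally with weights) before taking the argmax. The paper's statistical fusion additionally uses a structural-similarity term, which is omitted in this sketch.

```python
# Weighted averaging of per-view probability volumes as a simplified stand-in
# for the paper's statistical fusion.
import numpy as np

def fuse_views(prob_axial, prob_coronal, prob_sagittal, weights=(1.0, 1.0, 1.0)):
    """Each input: (n_labels, D, H, W) probability volume resampled to the CT grid."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    fused = w[0] * prob_axial + w[1] * prob_coronal + w[2] * prob_sagittal
    return fused.argmax(axis=0)          # (D, H, W) label map over the annotated structures

# usage: labels = fuse_views(p_ax, p_co, p_sa, weights=(1.0, 0.8, 0.8))
```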

Quantization Mimic: Towards Very Tiny CNN for Object Detection

Title Quantization Mimic: Towards Very Tiny CNN for Object Detection
Authors Yi Wei, Xinyu Pan, Hongwei Qin, Wanli Ouyang, Junjie Yan
Abstract In this paper, we propose a simple and general framework for training very tiny CNNs for object detection. Due to their limited representation ability, it is challenging to train very tiny networks for complicated tasks like detection. To the best of our knowledge, our method, called Quantization Mimic, is the first one focusing on very tiny networks. We utilize two types of acceleration methods: mimic and quantization. Mimic improves the performance of a student network by transferring knowledge from a teacher network. Quantization converts a full-precision network to a quantized one without large degradation of performance. If the teacher network is quantized, the search scope of the student network will be smaller. Using this feature of quantization, we propose Quantization Mimic: it first quantizes the large network, then trains a quantized small network to mimic it. The quantization operation helps the student network better match the feature maps of the teacher network. To evaluate our approach, we carry out experiments on various popular CNNs including VGG and ResNet, as well as different detection frameworks including Faster R-CNN and R-FCN. Experiments on Pascal VOC and WIDER FACE verify that our Quantization Mimic algorithm can be applied in various settings and outperforms state-of-the-art model acceleration methods given limited computing resources. (See the illustrative sketch after this entry.)
Tasks Object Detection, Quantization
Published 2018-05-06
URL http://arxiv.org/abs/1805.02152v3
PDF http://arxiv.org/pdf/1805.02152v3.pdf
PWC https://paperswithcode.com/paper/quantization-mimic-towards-very-tiny-cnn-for
Repo
Framework
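
The two ingredients the abstract combines can be sketched as a uniform feature quantizer plus an L2 "mimic" loss that pulls student features toward the quantized teacher features. The quantizer, loss form, and the choice to quantize only the frozen teacher are simplifying assumptions, not the paper's exact training recipe.

```python
# Uniform feature quantization plus a mimic (feature-matching) loss,
# illustrating the two acceleration ingredients in simplified form.
import torch

def uniform_quantize(x, n_levels=16):
    """Quantize features to n_levels evenly spaced values over their range."""
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (n_levels - 1) + 1e-8
    return lo + torch.round((x - lo) / step) * step

def mimic_loss(student_feat, teacher_feat, n_levels=16):
    """L2 distance between student features and quantized, frozen teacher features.
    (Quantizing only the teacher keeps gradients simple in this sketch.)"""
    t_q = uniform_quantize(teacher_feat.detach(), n_levels)
    return torch.mean((student_feat - t_q) ** 2)

# usage (sketch): total_loss = detection_loss + lambda_mimic * mimic_loss(student_fm, teacher_fm)
```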

Explaining Explanations: An Overview of Interpretability of Machine Learning

Title Explaining Explanations: An Overview of Interpretability of Machine Learning
Authors Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, Lalana Kagal
Abstract There has recently been a surge of work in explanatory artificial intelligence (XAI). This research area tackles the important problem that complex machines and algorithms often cannot provide insights into their behavior and thought processes. XAI allows users and parts of the internal system to be more transparent, providing explanations of their decisions in some level of detail. These explanations are important to ensure algorithmic fairness, identify potential bias or problems in the training data, and to ensure that the algorithms perform as expected. However, explanations produced by these systems are neither standardized nor systematically assessed. In an effort to create best practices and identify open challenges, we provide our definition of explainability and show how it can be used to classify existing literature. We discuss why current approaches to explanatory methods, especially for deep neural networks, are insufficient. Finally, based on our survey, we conclude with suggested future research directions for explanatory artificial intelligence.
Tasks
Published 2018-05-31
URL http://arxiv.org/abs/1806.00069v3
PDF http://arxiv.org/pdf/1806.00069v3.pdf
PWC https://paperswithcode.com/paper/explaining-explanations-an-overview-of
Repo
Framework

A novel method for predicting and mapping the presence of sun glare using Google Street View

Title A novel method for predicting and mapping the presence of sun glare using Google Street View
Authors Xiaojiang Li, Bill Yang Cai, Waishan Qiu, Jinhua Zhao, Carlo Ratti
Abstract Sun glare is one of the major environmental hazards that cause traffic accidents, and every year many people are killed or injured in glare-related crashes. Providing accurate information about when and where sun glare happens would help prevent such accidents and save lives. In this study, we propose to use publicly accessible Google Street View (GSV) panorama images to estimate and predict the occurrence of sun glare. GSV images have a viewing perspective similar to that of drivers, which makes them suitable for estimating the visibility of sun glare to drivers. A recently developed convolutional neural network algorithm is used to segment GSV images and predict obstructions to sun glare. Based on the predicted obstructions for given locations, we further estimate the time windows of sun glare by computing the sun positions and the relative angles between drivers and the sun for those locations. We conducted a case study in Cambridge, Massachusetts, USA. Results show that the method can predict the presence of sun glare precisely. The proposed method provides an important tool for drivers and traffic planners to mitigate sun glare and decrease the potential traffic accidents it causes. (See the illustrative sketch after this entry.)
Tasks
Published 2018-08-05
URL http://arxiv.org/abs/1808.04436v1
PDF http://arxiv.org/pdf/1808.04436v1.pdf
PWC https://paperswithcode.com/paper/a-novel-method-for-predicting-and-mapping-the
Repo
Framework
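
Once the sun position and the skyline obstruction in the sun's direction are known, the glare decision reduces to a few angle comparisons, as sketched below. The threshold values and the assumption that obstructions are summarized by a single elevation angle per direction are illustrative.

```python
# Glare test from sun geometry and a per-direction obstruction elevation
# (e.g. read from a segmented street-view panorama). Thresholds are illustrative.
import math

def angular_diff(a, b):
    """Smallest absolute difference between two compass bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def has_sun_glare(sun_azimuth, sun_elevation, driving_bearing,
                  obstruction_elevation, max_elev=25.0, max_azim_diff=30.0):
    """All angles in degrees; obstruction_elevation is the skyline height toward the sun."""
    if sun_elevation <= 0.0 or sun_elevation > max_elev:
        return False                                   # night-time, or sun too high to blind
    if angular_diff(sun_azimuth, driving_bearing) > max_azim_diff:
        return False                                   # sun not within the driver's view cone
    return sun_elevation > obstruction_elevation       # sun visible above the skyline

# usage: has_sun_glare(sun_azimuth=265.0, sun_elevation=8.0,
#                      driving_bearing=270.0, obstruction_elevation=5.0)  # -> True
```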

RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment

Title RED-Net: A Recurrent Encoder-Decoder Network for Video-based Face Alignment
Authors Xi Peng, Rogerio S. Feris, Xiaoyu Wang, Dimitris N. Metaxas
Abstract We propose a novel method for real-time face alignment in videos based on a recurrent encoder-decoder network model. Our proposed model predicts 2D facial point heat maps regularized by both detection and regression loss, while uniquely exploiting recurrent learning at both spatial and temporal dimensions. At the spatial level, we add a feedback loop connection between the combined output response map and the input, in order to enable iterative coarse-to-fine face alignment using a single network model, instead of relying on traditional cascaded model ensembles. At the temporal level, we first decouple the features in the bottleneck of the network into temporal-variant factors, such as pose and expression, and temporal-invariant factors, such as identity information. Temporal recurrent learning is then applied to the decoupled temporal-variant features. We show that such feature disentangling yields better generalization and significantly more accurate results at test time. We perform a comprehensive experimental analysis, showing the importance of each component of our proposed model, as well as superior results over the state of the art and several variations of our method on standard datasets. (See the illustrative sketch after this entry.)
Tasks Face Alignment
Published 2018-01-17
URL http://arxiv.org/abs/1801.06066v1
PDF http://arxiv.org/pdf/1801.06066v1.pdf
PWC https://paperswithcode.com/paper/red-net-a-recurrent-encoder-decoder-network
Repo
Framework
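
The spatial feedback loop can be sketched as repeatedly feeding the predicted heat maps back in with the image through the same network; the toy encoder-decoder below is a stand-in, not the RED-Net architecture, and the temporal recurrence and feature decoupling are omitted.

```python
# Toy spatial-feedback refinement loop: the heat-map output is concatenated
# back with the input and refined by the same network for a few iterations.
import torch
import torch.nn as nn

class IterativeAligner(nn.Module):
    def __init__(self, n_points=68, n_steps=3):
        super().__init__()
        self.n_points, self.n_steps = n_points, n_steps
        self.net = nn.Sequential(                       # stand-in encoder-decoder
            nn.Conv2d(3 + n_points, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_points, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, image):
        n, _, h, w = image.shape
        heatmaps = image.new_zeros(n, self.n_points, h, w)   # start from empty heat maps
        for _ in range(self.n_steps):                        # coarse-to-fine refinement
            heatmaps = self.net(torch.cat([image, heatmaps], dim=1))
        return heatmaps

# usage: hm = IterativeAligner()(torch.randn(2, 3, 128, 128))   # -> (2, 68, 128, 128)
```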

Innovative Texture Database Collecting Approach and Feature Extraction Method based on Combination of Gray Tone Difference Matrixes, Local Binary Patterns, and K-means Clustering

Title Innovative Texture Database Collecting Approach and Feature Extraction Method based on Combination of Gray Tone Difference Matrixes, Local Binary Patterns, and K-means Clustering
Authors Shervan Fekri-Ershad
Abstract Texture analysis and classification are problems that have received much attention from image processing scientists since the late 80s. If texture analysis is done accurately, it can be used in many applications such as object tracking, visual pattern recognition, and face recognition. Many methods have been offered to solve this problem. Despite their technical differences, all of them use the same popular databases, such as Brodatz or Outex, to evaluate their performance, which may bias their reported performance toward these databases. In this paper, an approach is proposed to collect more efficient databases of texture images. The proposed approach includes two stages: the first develops a feature representation based on gray tone difference matrices and local binary pattern features, and the second consists of an innovative algorithm based on K-means clustering that collects images according to the evaluated features. In order to evaluate the performance of the proposed approach, a texture database is collected and the Fisher rate is computed for the collected database and for well-known databases. Texture classification is also evaluated based on the offered feature extraction, and the accuracy is compared with some state-of-the-art texture classification methods. (See the illustrative sketch after this entry.)
Tasks Object Tracking, Texture Classification
Published 2018-03-12
URL http://arxiv.org/abs/1803.04125v1
PDF http://arxiv.org/pdf/1803.04125v1.pdf
PWC https://paperswithcode.com/paper/innovative-texture-database-collecting
Repo
Framework
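
The collection stage can be sketched as clustering per-image feature vectors (the paper uses gray tone difference matrix and LBP features) with K-means and keeping the image closest to each centroid, so the database covers distinct texture classes; feature extraction itself is not shown, and scikit-learn's KMeans is assumed.

```python
# K-means based selection of representative texture images from candidate
# feature vectors (simplified collection stage).
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(features, n_clusters=10, random_state=0):
    """features: (n_images, n_dims) array; returns indices of the selected images."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(features)
    selected = []
    for k in range(n_clusters):
        members = np.where(km.labels_ == k)[0]
        dists = np.linalg.norm(features[members] - km.cluster_centers_[k], axis=1)
        selected.append(members[np.argmin(dists)])     # image closest to the centroid
    return np.array(selected)

# usage: idx = select_representatives(np.random.rand(200, 64), n_clusters=12)
```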

Discriminative training of RNNLMs with the average word error criterion

Title Discriminative training of RNNLMs with the average word error criterion
Authors Rémi Francis, Tom Ash, Will Williams
Abstract In automatic speech recognition (ASR), recurrent neural language models (RNNLM) are typically used to refine hypotheses in the form of lattices or n-best lists, which are generated by a beam search decoder with a weaker language model. The RNNLMs are usually trained generatively using the perplexity (PPL) criterion on large corpora of grammatically correct text. However, the hypotheses are noisy, and the RNNLM doesn't always make the choices that minimise the metric we optimise for, the word error rate (WER). To address this mismatch we propose to use a task-specific loss to train an RNNLM to discriminate between multiple hypotheses within a lattice rescoring scenario. By fine-tuning the RNNLM on lattices with the average edit distance loss, we show that we obtain a 1.9% relative improvement in word error rate over a purely generatively trained model. (See the illustrative sketch after this entry.)
Tasks Language Modelling, Speech Recognition
Published 2018-11-06
URL http://arxiv.org/abs/1811.02528v2
PDF http://arxiv.org/pdf/1811.02528v2.pdf
PWC https://paperswithcode.com/paper/discriminative-training-of-rnnlms-with-the
Repo
Framework
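
A common way to realize an average-word-error criterion is the expected edit distance over rescored hypotheses under a softmax of their scores; the sketch below uses an n-best approximation for clarity, whereas the paper works on lattices.

```python
# Expected word-error loss over an n-best list, differentiable w.r.t. the
# (RNNLM-augmented) hypothesis scores.
import torch

def expected_wer_loss(hyp_scores, hyp_edit_distances):
    """hyp_scores: (n_hyps,) total log-scores per hypothesis (e.g. acoustic + RNNLM);
    hyp_edit_distances: (n_hyps,) word edit distance of each hypothesis to the reference."""
    probs = torch.softmax(hyp_scores, dim=0)
    # subtracting the mean distance is a standard baseline; for a fixed n-best list
    # it does not change the gradient with respect to the scores
    errs = hyp_edit_distances - hyp_edit_distances.mean()
    return torch.sum(probs * errs)

# usage: loss = expected_wer_loss(torch.tensor([2.1, 1.8, 0.3], requires_grad=True),
#                                 torch.tensor([1.0, 3.0, 4.0]))
```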

Information content of coevolutionary game landscapes

Title Information content of coevolutionary game landscapes
Authors Hendrik Richter
Abstract Coevolutionary game dynamics is the result of players that may change both their strategies and their network of interaction. For such games, and by interpreting strategies as configurations, strategy-to-payoff maps can be defined for every interaction network, which opens the way to deriving game landscapes. This paper presents an analysis of these game landscapes by their information content. With this analysis, we study in particular the effect of a rescaled payoff matrix generalizing social dilemmas, as well as differences between well-mixed and structured populations. (See the illustrative sketch after this entry.)
Tasks
Published 2018-03-20
URL http://arxiv.org/abs/1803.07307v1
PDF http://arxiv.org/pdf/1803.07307v1.pdf
PWC https://paperswithcode.com/paper/information-content-of-coevolutionary-game
Repo
Framework
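
A strategy-to-payoff map on a fixed interaction network can be enumerated directly for small systems, as sketched below for a 2x2 game; the rescaled payoff matrix and the coevolution of the network itself are not reproduced.

```python
# Brute-force strategy-to-payoff map on a fixed interaction network for a 2x2
# game (feasible only for small player counts, since there are 2^N configurations).
import itertools
import numpy as np

def game_landscape(adjacency, payoff):
    """adjacency: (N, N) 0/1 symmetric matrix; payoff: 2x2 matrix, rows = own strategy."""
    n = adjacency.shape[0]
    landscape = {}
    for config in itertools.product((0, 1), repeat=n):     # one strategy per player
        total = 0.0
        for i in range(n):
            for j in range(n):
                if adjacency[i, j]:
                    total += payoff[config[i], config[j]]
        landscape[config] = total / n                       # average payoff per player
    return landscape

# usage with a prisoner's-dilemma-like payoff matrix on a ring of 4 players:
# A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])
# P = np.array([[3.0, 0.0], [5.0, 1.0]])
# landscape = game_landscape(A, P)
```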