October 16, 2019

Paper Group ANR 1022

Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder. Multi-Branch Siamese Networks with Online Selection for Object Tracking. MotifNet: a motif-based Graph Convolutional Network for directed graphs. Hierarchical LSTMs with Adaptive Attention for Visual Captioning. Model-based Hand Pose Estimation for Generalized Hand S …

Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder


Title	Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder
Authors	Rogelio Andrade Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen
Abstract	Identifying customer segments in retail banking portfolios with different risk profiles can improve the accuracy of credit scoring. The Variational Autoencoder (VAE) has shown promising results in different research domains, and the powerful information embedded in its latent space has been documented. We use the VAE and show that, by transforming the input data into a meaningful representation, it is possible to steer configurations in the latent space of the VAE. Specifically, the Weight of Evidence (WoE) transformation encapsulates the propensity to fall into financial distress, and the latent space in the VAE preserves this characteristic in a well-defined clustering structure. These clusters have considerably different risk profiles and therefore are suitable not only for credit scoring but also for marketing and customer purposes. This new clustering methodology offers solutions to some of the challenges in existing clustering algorithms: for example, it suggests the number of clusters, assigns cluster labels to new customers, enables cluster visualization, scales to large datasets, and captures non-linear relationships, among others. Finally, for portfolios with a large number of customers in each cluster, developing one classifier model per cluster can improve the credit scoring assessment.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02538v1
PDF	http://arxiv.org/pdf/1806.02538v1.pdf
PWC	https://paperswithcode.com/paper/segment-based-credit-scoring-using-latent
Repo
Framework
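
The abstract above relies on the Weight of Evidence (WoE) transformation. A minimal sketch of one common WoE convention, ln(share of non-defaulters in a bin / share of defaulters in a bin), is given here. The function name `weight_of_evidence`, the toy data, and the sign convention are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def weight_of_evidence(categories, defaulted):
    """Return {category: WoE} using WoE = ln(%good / %bad) per category.

    `categories` is a list of bin labels and `defaulted` a parallel list
    of 0/1 flags (1 = the customer fell into financial distress).
    Assumes every category holds at least one good and one bad customer.
    """
    good = Counter(c for c, d in zip(categories, defaulted) if d == 0)
    bad = Counter(c for c, d in zip(categories, defaulted) if d == 1)
    total_good = sum(good.values())
    total_bad = sum(bad.values())
    return {
        c: math.log((good[c] / total_good) / (bad[c] / total_bad))
        for c in set(categories)
    }

# Toy, made-up data: income band and default flag for ten customers.
bands = ["low", "low", "low", "low", "mid", "mid", "mid", "high", "high", "high"]
flags = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
woe = weight_of_evidence(bands, flags)
```

Under this convention a negative WoE marks a bin with a higher than average share of defaulters, so the "low" band in the toy data receives a negative value and the other two bands positive ones.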

Multi-Branch Siamese Networks with Online Selection for Object Tracking


Title	Multi-Branch Siamese Networks with Online Selection for Object Tracking
Authors	Zhenxi Li, Guillaume-Alexandre Bilodeau, Wassim Bouachir
Abstract	In this paper, we propose a robust object tracking algorithm based on a branch selection mechanism to choose the most efficient object representations from multi-branch siamese networks. While most deep learning trackers use a single CNN for target representation, the proposed Multi-Branch Siamese Tracker (MBST) employs multiple branches of CNNs pre-trained for different tasks, and used for various target representations in our tracking method. With our branch selection mechanism, the appropriate CNN branch is selected depending on the target characteristics in an online manner. By using the most adequate target representation with respect to the tracked object, our method achieves real-time tracking, while obtaining improved performance compared to standard Siamese network trackers on object tracking benchmarks.
Tasks	Object Tracking
Published	2018-08-22
URL	http://arxiv.org/abs/1808.07349v3
PDF	http://arxiv.org/pdf/1808.07349v3.pdf
PWC	https://paperswithcode.com/paper/multi-branch-siamese-networks-with-online
Repo
Framework

MotifNet: a motif-based Graph Convolutional Network for directed graphs


Title	MotifNet: a motif-based Graph Convolutional Network for directed graphs
Authors	Federico Monti, Karl Otness, Michael M. Bronstein
Abstract	Deep learning on graphs, and in particular graph convolutional neural networks, has recently attracted significant attention in the machine learning community. Many such techniques explore the analogy between the graph Laplacian eigenvectors and the classical Fourier basis, allowing convolution to be formulated as a multiplication in the spectral domain. One of the key drawbacks of spectral CNNs is their explicit assumption of an undirected graph, leading to a symmetric Laplacian matrix with orthogonal eigendecomposition. In this work we propose MotifNet, a graph CNN capable of dealing with directed graphs by exploiting local graph motifs. We present experimental evidence showing the advantage of our approach on real data.
Tasks
Published	2018-02-04
URL	http://arxiv.org/abs/1802.01572v1
PDF	http://arxiv.org/pdf/1802.01572v1.pdf
PWC	https://paperswithcode.com/paper/motifnet-a-motif-based-graph-convolutional
Repo
Framework
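
To illustrate how local motifs can turn a directed graph into a symmetric structure that spectral methods accept, the sketch below builds a motif-induced adjacency matrix for a single motif, the directed 3-cycle. The abstract does not say which motifs or which construction MotifNet uses, so the choice of motif, the function name `cycle_motif_adjacency`, and the toy graph are assumptions.

```python
from itertools import permutations

def cycle_motif_adjacency(n, edges):
    """Symmetric n x n matrix counting, for each node pair, the directed
    3-cycles (i -> j -> k -> i) in which both nodes take part."""
    edge_set = set(edges)
    weights = [[0] * n for _ in range(n)]
    for i, j, k in permutations(range(n), 3):
        # Count each cycle once by requiring i to be its smallest node.
        if i < j and i < k and (i, j) in edge_set \
                and (j, k) in edge_set and (k, i) in edge_set:
            for a, b in ((i, j), (j, k), (k, i)):
                weights[a][b] += 1
                weights[b][a] += 1
    return weights

# Toy directed graph: one 3-cycle 0 -> 1 -> 2 -> 0 plus a dangling edge 2 -> 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
W = cycle_motif_adjacency(4, edges)
```

The resulting matrix is symmetric by construction even though the input edges are directed, and node 3, which takes part in no 3-cycle, ends up isolated in the motif graph.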

Hierarchical LSTMs with Adaptive Attention for Visual Captioning


Title	Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Authors	Jingkuan Song, Xiangpeng Li, Lianli Gao, Heng Tao Shen
Abstract	Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., “gun” and “shooting”) and non-visual words (e.g., “the”, “a”). However, these non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing an attention mechanism on non-visual words could mislead and decrease the overall performance of visual captioning. Furthermore, a hierarchy of LSTMs enables more complex representations of visual data, capturing information at different scales. To address these issues, we propose a hierarchical LSTM with adaptive attention (hLSTMat) approach for image and video captioning. Specifically, the proposed framework utilizes spatial or temporal attention to select specific regions or frames for predicting the related words, while the adaptive attention decides whether to depend on the visual information or the language context information. Also, a hierarchical LSTM is designed to simultaneously consider both low-level visual information and high-level language context information to support the caption generation. We initially design our hLSTMat for the video captioning task. Then, we further refine it and apply it to the image captioning task. To demonstrate the effectiveness of our proposed framework, we test our method on both video and image captioning tasks. Experimental results show that our approach achieves state-of-the-art performance for most of the evaluation metrics on both tasks. The effect of important components is also well explored in the ablation study.
Tasks	Image Captioning, Language Modelling, Video Captioning
Published	2018-12-26
URL	http://arxiv.org/abs/1812.11004v1
PDF	http://arxiv.org/pdf/1812.11004v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-lstms-with-adaptive-attention
Repo
Framework
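
The adaptive attention idea in the abstract above, deciding per word whether to rely on visual information or on language context, can be sketched as a gate that blends two context vectors. This is a simplified illustration only: in the paper the gate would be computed from LSTM states, whereas here `gate_logit` is passed in by hand, and all names and values are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, scores):
    """Weighted average of feature vectors using softmax(scores)."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

def adaptive_context(visual_ctx, language_ctx, gate_logit):
    """Blend visual and language context with a sigmoid gate in (0, 1)."""
    beta = 1.0 / (1.0 + math.exp(-gate_logit))
    return [beta * v + (1.0 - beta) * l for v, l in zip(visual_ctx, language_ctx)]

# Two toy frame features, attended with equal scores.
frames = [[1.0, 0.0], [0.0, 1.0]]
visual = attend(frames, scores=[0.0, 0.0])
language = [2.0, 2.0]

# A large positive logit stands in for a visual word, a large negative
# logit for a non-visual word such as "the".
ctx_visual_word = adaptive_context(visual, language, gate_logit=10.0)
ctx_function_word = adaptive_context(visual, language, gate_logit=-10.0)
```

With the gate near 1 the blended context stays close to the attended visual features, and with the gate near 0 it falls back to the language context.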

Model-based Hand Pose Estimation for Generalized Hand Shape with Appearance Normalization


Title	Model-based Hand Pose Estimation for Generalized Hand Shape with Appearance Normalization
Authors	Jan Wöhlke, Shile Li, Dongheui Lee
Abstract	Since the emergence of large annotated datasets, state-of-the-art hand pose estimation methods have been mostly based on discriminative learning. Recently, a hybrid approach has embedded a kinematic layer into the deep learning structure in such a way that the pose estimates obey the physical constraints of human hand kinematics. However, the existing approach relies on a single person’s hand shape parameters, which are fixed constants. Therefore, the existing hybrid method has difficulty generalizing to new, unseen hands. In this work, we extend the kinematic layer to make the hand shape parameters learnable. In this way, the learnt network can generalize towards arbitrary hand shapes. Furthermore, inspired by the idea of Spatial Transformer Networks, we apply a cascade of appearance normalization networks to decrease the variance in the input data. The input images are shifted, rotated, and globally scaled to a similar appearance. The effectiveness and limitations of our proposed approach are extensively evaluated on the Hands 2017 challenge dataset and the NYU dataset.
Tasks	Hand Pose Estimation, Pose Estimation
Published	2018-07-02
URL	http://arxiv.org/abs/1807.00898v1
PDF	http://arxiv.org/pdf/1807.00898v1.pdf
PWC	https://paperswithcode.com/paper/model-based-hand-pose-estimation-for
Repo
Framework
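
A kinematic layer maps joint angles to joint positions through a fixed skeleton. The planar sketch below shows why treating the bone lengths as inputs rather than constants lets the same layer describe hands of different sizes. It is a two-dimensional toy with hypothetical names, not the paper's hand model.

```python
import math

def forward_kinematics(bone_lengths, joint_angles):
    """Joint positions of a planar chain rooted at the origin.

    Each angle is relative to the previous bone; lengths and angles are
    parallel lists, one entry per bone.
    """
    x = y = heading = 0.0
    positions = []
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions

angles = [0.0, math.pi / 2]                      # same pose for both hands
small_hand = forward_kinematics([1.0, 1.0], angles)
large_hand = forward_kinematics([2.0, 2.0], angles)
```

The same angles yield different joint positions for the two sets of bone lengths, which is the degree of freedom a fixed-shape layer lacks.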

Using NLP on news headlines to predict index trends


Title	Using NLP on news headlines to predict index trends
Authors	Marc Velay, Fabrice Daniel
Abstract	This paper attempts to provide a state of the art in trend prediction using news headlines. We present the research done on predicting DJIA trends using Natural Language Processing. We will explain the different algorithms we have used as well as the various embedding techniques attempted. We rely on statistical and deep learning models in order to extract information from the corpora.
Tasks
Published	2018-06-22
URL	http://arxiv.org/abs/1806.09533v1
PDF	http://arxiv.org/pdf/1806.09533v1.pdf
PWC	https://paperswithcode.com/paper/using-nlp-on-news-headlines-to-predict-index
Repo
Framework

Coupled Recurrent Models for Polyphonic Music Composition


Title	Coupled Recurrent Models for Polyphonic Music Composition
Authors	John Thickstun, Zaid Harchaoui, Dean P. Foster, Sham M. Kakade
Abstract	This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music. We propose an efficient new conditional probabilistic factorization of musical scores, viewing a score as a collection of concurrent, coupled sequences: i.e. voices. To model the conditional distributions, we borrow ideas from both convolutional and recurrent neural models; we argue that these ideas are natural for capturing music’s pitch invariances, temporal structure, and polyphony. We train models for single-voice and multi-voice composition on 2,300 scores from the KernScores dataset.
Tasks	Time Series
Published	2018-11-20
URL	https://arxiv.org/abs/1811.08045v2
PDF	https://arxiv.org/pdf/1811.08045v2.pdf
PWC	https://paperswithcode.com/paper/coupled-recurrent-models-for-polyphonic-music
Repo
Framework

Fast, Better Training Trick — Random Gradient


Title	Fast, Better Training Trick — Random Gradient
Authors	Jiakai Wei
Abstract	In this paper, we present an unprecedented method to accelerate training and improve performance, called random gradient (RG). The method can be applied to the training of any model without extra computational cost; we use image classification, semantic segmentation, and GANs to confirm that it can speed up the training of computer vision models. The central idea is to multiply the loss by a random number in order to randomly reduce the back-propagation gradient. With this method we obtain better results on the Pascal VOC, CIFAR, and Cityscapes datasets.
Tasks	Image Classification, Semantic Segmentation
Published	2018-08-13
URL	http://arxiv.org/abs/1808.04293v1
PDF	http://arxiv.org/pdf/1808.04293v1.pdf
PWC	https://paperswithcode.com/paper/fast-better-training-trick-random-gradient
Repo
Framework
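
The central idea of the abstract above, multiplying the loss by a random number so that the back-propagated gradient is randomly reduced, can be sketched on a one-parameter regression problem. The uniform distribution on [0, 1), the toy data, and the function names are assumptions; the abstract does not specify how the random number is drawn.

```python
import random

def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, steps=500, lr=0.05, use_random_gradient=True, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Scaling the loss by r scales its gradient by the same factor r.
        r = rng.random() if use_random_gradient else 1.0
        w -= lr * r * mse_grad(w, xs, ys)
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]   # generated by y = 2x
w_rg = train(xs, ys)
w_plain = train(xs, ys, use_random_gradient=False)
```

Both runs recover a weight close to 2 on this toy problem. The sketch only shows the mechanics of the trick and makes no claim about the speed or accuracy gains reported in the paper.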

Disentangling Latent Hands for Image Synthesis and Pose Estimation


Title	Disentangling Latent Hands for Image Synthesis and Pose Estimation
Authors	Linlin Yang, Angela Yao
Abstract	Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation ranging from image background content to camera viewpoint. To better analyze these factors of variation, we propose the use of disentangled representations and a disentangled variational autoencoder (dVAE) that allows for specific sampling and inference of these factors. The derived objective from the variational lower bound as well as the proposed training strategy are highly flexible, allowing us to handle cross-modal encoders and decoders as well as semi-supervised learning scenarios. Experiments show that our dVAE can synthesize highly realistic images of the hand specifiable by both pose and image background content and also estimate 3D hand poses from RGB images with accuracy competitive with state-of-the-art on two public benchmarks.
Tasks	Image Generation, Pose Estimation
Published	2018-12-03
URL	http://arxiv.org/abs/1812.01002v2
PDF	http://arxiv.org/pdf/1812.01002v2.pdf
PWC	https://paperswithcode.com/paper/disentangling-latent-hands-for-image
Repo
Framework

Dynamic Measurement Scheduling for Adverse Event Forecasting using Deep RL


Title	Dynamic Measurement Scheduling for Adverse Event Forecasting using Deep RL
Authors	Chun-Hao Chang, Mingjie Mai, Anna Goldenberg
Abstract	Current clinical practice to monitor patients’ health follows either regular or heuristic-based lab test (e.g. blood test) scheduling. Such practice not only gives rise to redundant measurements accruing cost, but may even lead to unnecessary patient discomfort. From the computational perspective, heuristic-based test scheduling might lead to reduced accuracy of clinical forecasting models. Computationally learning an optimal schedule for clinical tests and measurement collection is likely to lead to both better predictive models and improved patient outcomes. We address the scheduling problem using deep reinforcement learning (RL) to achieve high predictive gain and low measurement cost by scheduling fewer, but strategically timed, tests. We first show that, in simulation, our policy outperforms heuristic-based measurement scheduling with higher predictive gain or lower cost, as measured by accumulated reward. We then learn a scheduling policy for mortality forecasting in a real-world clinical dataset (MIMIC3); our learned policy is able to provide useful clinical insights. To our knowledge, this is the first RL application to the multi-measurement scheduling problem in the clinical setting.
Tasks
Published	2018-12-01
URL	http://arxiv.org/abs/1812.00268v1
PDF	http://arxiv.org/pdf/1812.00268v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-measurement-scheduling-for-adverse
Repo
Framework

From Machine to Machine: An OCT-trained Deep Learning Algorithm for Objective Quantification of Glaucomatous Damage in Fundus Photographs


Title	From Machine to Machine: An OCT-trained Deep Learning Algorithm for Objective Quantification of Glaucomatous Damage in Fundus Photographs
Authors	Felipe A. Medeiros, Alessandro A. Jammal, Atalie C. Thompson
Abstract	Previous approaches using deep learning algorithms to classify glaucomatous damage on fundus photographs have been limited by the requirement for human labeling of a reference training set. We propose a new approach using spectral-domain optical coherence tomography (SDOCT) data to train a deep learning algorithm to quantify glaucomatous structural damage on optic disc photographs. The dataset included 32,820 pairs of optic disc photos and SDOCT retinal nerve fiber layer (RNFL) scans from 2,312 eyes of 1,198 subjects. A deep learning convolutional neural network was trained to assess optic disc photographs and predict SDOCT average RNFL thickness. The performance of the algorithm was evaluated in an independent test sample. The mean prediction of average RNFL thickness from all 6,292 optic disc photos in the test set was 83.3 ± 14.5 μm, whereas the mean average RNFL thickness from all corresponding SDOCT scans was 82.5 ± 16.8 μm (P = 0.164). There was a very strong correlation between predicted and observed RNFL thickness values (r = 0.832; P < 0.001), with mean absolute error of the predictions of 7.39 μm. The areas under the receiver operating characteristic curves for discriminating glaucoma from healthy eyes with the deep learning predictions and actual SDOCT measurements were 0.944 (95% CI: 0.912-0.966) and 0.940 (95% CI: 0.902-0.966), respectively (P = 0.724). In conclusion, we introduced a novel deep learning approach to assess optic disc photographs and provide quantitative information about the amount of neural damage. This approach could potentially be used to diagnose and stage glaucomatous damage from optic disc photographs.
Tasks
Published	2018-10-20
URL	http://arxiv.org/abs/1810.10343v1
PDF	http://arxiv.org/pdf/1810.10343v1.pdf
PWC	https://paperswithcode.com/paper/from-machine-to-machine-an-oct-trained-deep
Repo
Framework

Using Additional Indexes for Fast Full-Text Search of Phrases That Contain Frequently Used Words


Title	Using Additional Indexes for Fast Full-Text Search of Phrases That Contain Frequently Used Words
Authors	A. B. Veretennikov
Abstract	Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.
Tasks
Published	2018-01-27
URL	http://arxiv.org/abs/1801.09079v2
PDF	http://arxiv.org/pdf/1801.09079v2.pdf
PWC	https://paperswithcode.com/paper/using-additional-indexes-for-fast-full-text
Repo
Framework

Learning by Playing - Solving Sparse Reward Tasks from Scratch


Title	Learning by Playing - Solving Sparse Reward Tasks from Scratch
Authors	Martin Riedmiller, Roland Hafner, Thomas Lampe, Michael Neunert, Jonas Degrave, Tom Van de Wiele, Volodymyr Mnih, Nicolas Heess, Jost Tobias Springenberg
Abstract	We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors - from scratch - in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks, that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to efficiently explore its environment - enabling it to excel at sparse reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10567v1
PDF	http://arxiv.org/pdf/1802.10567v1.pdf
PWC	https://paperswithcode.com/paper/learning-by-playing-solving-sparse-reward
Repo
Framework

Joint Optimization Framework for Learning with Noisy Labels


Title	Joint Optimization Framework for Learning with Noisy Labels
Authors	Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa
Abstract	Deep neural networks (DNNs) trained on large-scale datasets have exhibited significant performance in image classification. Many large-scale datasets are collected from websites; however, they tend to contain inaccurate labels, termed noisy labels. Training on such noisy labeled datasets causes performance degradation because DNNs easily overfit to noisy labels. To overcome this problem, we propose a joint optimization framework for learning DNN parameters and estimating true labels. Our framework can correct labels during training by alternately updating network parameters and labels. We conduct experiments on the noisy CIFAR-10 datasets and the Clothing1M dataset. The results indicate that our approach significantly outperforms other state-of-the-art methods.
Tasks	Image Classification
Published	2018-03-30
URL	http://arxiv.org/abs/1803.11364v1
PDF	http://arxiv.org/pdf/1803.11364v1.pdf
PWC	https://paperswithcode.com/paper/joint-optimization-framework-for-learning
Repo
Framework
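
The alternating scheme described in the abstract above, updating model parameters and then re-estimating the labels, can be sketched on a toy problem. Here a single-weight logistic model without a bias term stands in for the DNN, labels are updated by hard thresholding, and the regularization a real system would need is omitted. All of these are simplifying assumptions, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_weight(w, xs, labels, steps=50, lr=0.5):
    """Gradient descent on mean cross-entropy for p = sigmoid(w * x)."""
    for _ in range(steps):
        grad = sum((sigmoid(w * x) - y) * x for x, y in zip(xs, labels)) / len(xs)
        w -= lr * grad
    return w

def joint_optimize(xs, noisy_labels, rounds=3):
    w, labels = 0.0, list(noisy_labels)
    for _ in range(rounds):
        # Step 1: update the model parameter with the labels held fixed.
        w = fit_weight(w, xs, labels)
        # Step 2: update the labels with the model parameter held fixed.
        labels = [1 if sigmoid(w * x) >= 0.5 else 0 for x in xs]
    return w, labels

xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
noisy = [0, 0, 1, 1, 1, 1]   # the label of x = -1.0 has been flipped
w, corrected = joint_optimize(xs, noisy)
```

Because the restricted model cannot fit the flipped label, its prediction for that point disagrees with the noisy label, and the label update step restores the clean labelling.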

Robust Bhattacharyya bound linear discriminant analysis through adaptive algorithm


Title	Robust Bhattacharyya bound linear discriminant analysis through adaptive algorithm
Authors	Chun-Na Li, Yuan-Hai Shao, Zhen Wang, Nai-Yang Deng
Abstract	In this paper, we propose a novel linear discriminant analysis criterion via the Bhattacharyya error bound estimation based on a novel L1-norm (L1BLDA) and L2-norm (L2BLDA). Both L1BLDA and L2BLDA maximize the between-class scatters, which are measured by the weighted pairwise distances of class means, while minimizing the within-class scatters under the L1-norm and L2-norm, respectively. The proposed models avoid the small sample size (SSS) problem and are free of the rank limit that may be encountered in LDA. It is worth mentioning that the use of the L1-norm makes L1BLDA robust, and that L1BLDA is solved through an effective non-greedy alternating direction method of multipliers (ADMM), where all the projection vectors can be obtained at once. In addition, the weighting constants of L1BLDA and L2BLDA between the between-class and within-class terms are determined by the data set involved, which makes L1BLDA and L2BLDA adaptive. The experimental results on both benchmark data sets and handwritten digit databases demonstrate the effectiveness of the proposed methods.
Tasks
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02384v1
PDF	http://arxiv.org/pdf/1811.02384v1.pdf
PWC	https://paperswithcode.com/paper/robust-bhattacharyya-bound-linear
Repo
Framework