Paper Group ANR 1022
Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder
Title | Segment-Based Credit Scoring Using Latent Clusters in the Variational Autoencoder |
Authors | Rogelio Andrade Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen |
Abstract | Identifying customer segments in retail banking portfolios with different risk profiles can improve the accuracy of credit scoring. The Variational Autoencoder (VAE) has shown promising results in different research domains, and the richness of the information embedded in its latent space is well documented. We use the VAE and show that, by transforming the input data into a meaningful representation, it is possible to steer the configurations in the latent space of the VAE. Specifically, the Weight of Evidence (WoE) transformation encapsulates the propensity to fall into financial distress, and the latent space of the VAE preserves this characteristic in a well-defined clustering structure. These clusters have considerably different risk profiles and are therefore suitable not only for credit scoring but also for marketing and customer purposes. This new clustering methodology addresses several challenges of existing clustering algorithms: it suggests the number of clusters, assigns cluster labels to new customers, enables cluster visualization, scales to large datasets, and captures non-linear relationships, among others. Finally, for portfolios with a large number of customers in each cluster, developing one classifier model per cluster can improve the credit scoring assessment. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02538v1 |
http://arxiv.org/pdf/1806.02538v1.pdf | |
PWC | https://paperswithcode.com/paper/segment-based-credit-scoring-using-latent |
Repo | |
Framework | |
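The abstract above builds on the Weight of Evidence (WoE) transformation. Below is a minimal sketch of the standard WoE computation for a binned feature against a binary default flag; the binning, column names, and smoothing constant are illustrative and not taken from the paper, which applies WoE as a preprocessing step before the VAE.

```python
import numpy as np
import pandas as pd

def woe_transform(df, feature, target):
    """Map each bin of `feature` to its Weight of Evidence:
    WoE = ln(P(bin | good) / P(bin | bad)), with target 1 = default (bad)."""
    stats = df.groupby(feature)[target].agg(["count", "sum"])
    bad = stats["sum"]                      # defaults per bin
    good = stats["count"] - bad             # non-defaults per bin
    # small constant avoids division by zero in empty bins (illustrative choice)
    dist_good = (good + 0.5) / (good.sum() + 0.5)
    dist_bad = (bad + 0.5) / (bad.sum() + 0.5)
    woe = np.log(dist_good / dist_bad)
    return df[feature].map(woe)

# toy usage: an income bin versus a default flag
data = pd.DataFrame({"income_bin": ["low", "low", "mid", "high", "mid", "high"],
                     "default":    [1,     0,     0,     0,      1,     0]})
data["income_woe"] = woe_transform(data, "income_bin", "default")
print(data)
```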
Multi-Branch Siamese Networks with Online Selection for Object Tracking
Title | Multi-Branch Siamese Networks with Online Selection for Object Tracking |
Authors | Zhenxi Li, Guillaume-Alexandre Bilodeau, Wassim Bouachir |
Abstract | In this paper, we propose a robust object tracking algorithm based on a branch selection mechanism that chooses the most efficient object representations from multi-branch siamese networks. While most deep learning trackers use a single CNN for target representation, the proposed Multi-Branch Siamese Tracker (MBST) employs multiple branches of CNNs pre-trained for different tasks, which provide various target representations in our tracking method. With our branch selection mechanism, the appropriate CNN branch is selected online depending on the target characteristics. By using the most suitable target representation for the tracked object, our method achieves real-time tracking while obtaining improved performance compared to standard Siamese network trackers on object tracking benchmarks. |
Tasks | Object Tracking |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07349v3 |
http://arxiv.org/pdf/1808.07349v3.pdf | |
PWC | https://paperswithcode.com/paper/multi-branch-siamese-networks-with-online |
Repo | |
Framework | |
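The branch selection mechanism described above picks, online, the CNN branch whose representation is most discriminative for the current target. A minimal sketch under a simple assumption: each branch yields a cross-correlation response map and the branch with the highest peak response is selected; the paper's actual selection criterion and set of branches may differ.

```python
import numpy as np

def select_branch(response_maps):
    """Pick the branch whose response map has the highest peak value,
    as a simple stand-in for an online branch-selection criterion."""
    scores = [float(r.max()) for r in response_maps]
    best = int(np.argmax(scores))
    return best, scores

# toy usage: three branches, each producing a 17x17 similarity map
rng = np.random.default_rng(0)
maps = [rng.random((17, 17)) for _ in range(3)]
best_branch, branch_scores = select_branch(maps)
print(best_branch, branch_scores)
```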
MotifNet: a motif-based Graph Convolutional Network for directed graphs
Title | MotifNet: a motif-based Graph Convolutional Network for directed graphs |
Authors | Federico Monti, Karl Otness, Michael M. Bronstein |
Abstract | Deep learning on graphs, and in particular graph convolutional neural networks, has recently attracted significant attention in the machine learning community. Many such techniques explore the analogy between the graph Laplacian eigenvectors and the classical Fourier basis, allowing the convolution to be formulated as a multiplication in the spectral domain. One of the key drawbacks of spectral CNNs is their explicit assumption of an undirected graph, which leads to a symmetric Laplacian matrix with an orthogonal eigendecomposition. In this work we propose MotifNet, a graph CNN capable of dealing with directed graphs by exploiting local graph motifs. We present experimental evidence showing the advantage of our approach on real data. |
Tasks | |
Published | 2018-02-04 |
URL | http://arxiv.org/abs/1802.01572v1 |
http://arxiv.org/pdf/1802.01572v1.pdf | |
PWC | https://paperswithcode.com/paper/motifnet-a-motif-based-graph-convolutional |
Repo | |
Framework | |
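MotifNet builds graph convolutions from motif-induced adjacency matrices, which are symmetric even when the underlying graph is directed. A minimal NumPy sketch for one simple motif (the reciprocal edge i→j, j→i); the paper uses a richer family of directed motifs and learnable spectral filters that are not reproduced here.

```python
import numpy as np

def mutual_motif_adjacency(A):
    """Motif adjacency for the 'reciprocal edge' motif:
    M[i, j] = 1 iff both i->j and j->i exist; M is symmetric by construction."""
    return A * A.T

def normalized_laplacian(M):
    """Symmetric normalized Laplacian L = I - D^{-1/2} M D^{-1/2}."""
    d = M.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    D = np.diag(d_inv_sqrt)
    return np.eye(M.shape[0]) - D @ M @ D

# toy directed graph on 4 nodes
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
L = normalized_laplacian(mutual_motif_adjacency(A))
print(L)
```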
Hierarchical LSTMs with Adaptive Attention for Visual Captioning
Title | Hierarchical LSTMs with Adaptive Attention for Visual Captioning |
Authors | Jingkuan Song, Xiangpeng Li, Lianli Gao, Heng Tao Shen |
Abstract | Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., “gun” and “shooting”) and non-visual words (e.g., “the”, “a”). However, these non-visual words can be easily predicted using a natural language model without considering visual signals or attention, and imposing an attention mechanism on non-visual words could mislead and decrease the overall performance of visual captioning. Furthermore, a hierarchy of LSTMs enables a more complex representation of visual data, capturing information at different scales. To address these issues, we propose a hierarchical LSTM with adaptive attention (hLSTMat) approach for image and video captioning. Specifically, the proposed framework utilizes spatial or temporal attention for selecting specific regions or frames to predict the related words, while the adaptive attention decides whether to depend on the visual information or the language context information. Also, hierarchical LSTMs are designed to simultaneously consider both low-level visual information and high-level language context information to support caption generation. We initially design our hLSTMat for the video captioning task. Then, we further refine it and apply it to the image captioning task. To demonstrate the effectiveness of our proposed framework, we test our method on both video and image captioning tasks. Experimental results show that our approach achieves state-of-the-art performance on most of the evaluation metrics for both tasks. The effect of important components is also analyzed in the ablation study. |
Tasks | Image Captioning, Language Modelling, Video Captioning |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.11004v1 |
http://arxiv.org/pdf/1812.11004v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-lstms-with-adaptive-attention |
Repo | |
Framework | |
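The adaptive attention described above decides, per word, whether to rely on visual features or on the language context. A minimal sketch of a sentinel-style gate in PyTorch with illustrative dimensions; the exact gating architecture in hLSTMat may differ.

```python
import torch
import torch.nn as nn

class AdaptiveGate(nn.Module):
    """Mix an attended visual vector v_t with a language-context vector s_t
    via a scalar gate beta in [0, 1] computed from the decoder hidden state."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, 1)

    def forward(self, h_t, v_t, s_t):
        beta = torch.sigmoid(self.gate(h_t))      # (batch, 1)
        return beta * s_t + (1.0 - beta) * v_t    # context used to predict the next word

# toy usage with illustrative 512-dim features
h = torch.randn(2, 512)   # decoder hidden state
v = torch.randn(2, 512)   # attended visual feature
s = torch.randn(2, 512)   # language-context ("sentinel") feature
context = AdaptiveGate(512)(h, v, s)
print(context.shape)
```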
Model-based Hand Pose Estimation for Generalized Hand Shape with Appearance Normalization
Title | Model-based Hand Pose Estimation for Generalized Hand Shape with Appearance Normalization |
Authors | Jan Wöhlke, Shile Li, Dongheui Lee |
Abstract | Since the emergence of large annotated datasets, state-of-the-art hand pose estimation methods have been mostly based on discriminative learning. Recently, a hybrid approach has embedded a kinematic layer into the deep learning structure so that the pose estimates obey the physical constraints of human hand kinematics. However, the existing approach relies on a single person’s hand shape parameters, which are fixed constants, and therefore has difficulty generalizing to new, unseen hands. In this work, we extend the kinematic layer to make the hand shape parameters learnable. In this way, the learnt network can generalize to arbitrary hand shapes. Furthermore, inspired by the idea of Spatial Transformer Networks, we apply a cascade of appearance normalization networks to decrease the variance in the input data. The input images are shifted, rotated, and globally scaled to a similar appearance. The effectiveness and limitations of our proposed approach are extensively evaluated on the Hands 2017 challenge dataset and the NYU dataset. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00898v1 |
http://arxiv.org/pdf/1807.00898v1.pdf | |
PWC | https://paperswithcode.com/paper/model-based-hand-pose-estimation-for |
Repo | |
Framework | |
Using NLP on news headlines to predict index trends
Title | Using NLP on news headlines to predict index trends |
Authors | Marc Velay, Fabrice Daniel |
Abstract | This paper provides an overview of the state of the art in trend prediction using news headlines. We present our research on predicting DJIA trends using Natural Language Processing, explaining the different algorithms we used as well as the various embedding techniques we attempted. We rely on statistical and deep learning models to extract information from the corpora. |
Tasks | |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.09533v1 |
http://arxiv.org/pdf/1806.09533v1.pdf | |
PWC | https://paperswithcode.com/paper/using-nlp-on-news-headlines-to-predict-index |
Repo | |
Framework | |
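As a concrete illustration of the statistical models mentioned above, the sketch below fits a bag-of-words baseline mapping headlines to an up/down index label, assuming scikit-learn; the headlines, labels, and model choice are placeholders rather than the paper's actual setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# placeholder data: one concatenated headline string per trading day, label 1 = index up
headlines = ["stocks rally on strong earnings", "oil prices fall amid supply glut",
             "fed signals rate hike", "tech shares surge to record high"]
labels = [1, 0, 0, 1]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(headlines, labels)
print(model.predict(["markets slide on weak economic data"]))
```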
Coupled Recurrent Models for Polyphonic Music Composition
Title | Coupled Recurrent Models for Polyphonic Music Composition |
Authors | John Thickstun, Zaid Harchaoui, Dean P. Foster, Sham M. Kakade |
Abstract | This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music. We propose an efficient new conditional probabilistic factorization of musical scores, viewing a score as a collection of concurrent, coupled sequences: i.e. voices. To model the conditional distributions, we borrow ideas from both convolutional and recurrent neural models; we argue that these ideas are natural for capturing music’s pitch invariances, temporal structure, and polyphony. We train models for single-voice and multi-voice composition on 2,300 scores from the KernScores dataset. |
Tasks | Time Series |
Published | 2018-11-20 |
URL | https://arxiv.org/abs/1811.08045v2 |
https://arxiv.org/pdf/1811.08045v2.pdf | |
PWC | https://paperswithcode.com/paper/coupled-recurrent-models-for-polyphonic-music |
Repo | |
Framework | |
Fast, Better Training Trick — Random Gradient
Title | Fast, Better Training Trick — Random Gradient |
Authors | Jiakai Wei |
Abstract | In this paper, we present a method to accelerate training and improve performance, called random gradient (RG). This method can ease the training of any model without extra computational cost. We use image classification, semantic segmentation, and GANs to confirm that the method speeds up the training of computer vision models. The central idea is to multiply the loss by a random number, which randomly scales down the back-propagated gradient. Using this method, we obtain better results on the Pascal VOC, CIFAR, and Cityscapes datasets. |
Tasks | Image Classification, Semantic Segmentation |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04293v1 |
http://arxiv.org/pdf/1808.04293v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-better-training-trick-random-gradient |
Repo | |
Framework | |
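The central idea above, multiplying the loss by a random number before back-propagation, is easy to sketch. A minimal PyTorch example on a throwaway linear model; the distribution of the random factor (uniform on (0, 1) here) is an assumption, not necessarily the paper's choice.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 10)
y = torch.randint(0, 2, (32,))

for step in range(5):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    # random gradient: scale the loss by a random factor in (0, 1) before backward,
    # which rescales every gradient by the same factor for this step
    scaled_loss = loss * torch.rand(1).item()
    scaled_loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.4f}")
```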
Disentangling Latent Hands for Image Synthesis and Pose Estimation
Title | Disentangling Latent Hands for Image Synthesis and Pose Estimation |
Authors | Linlin Yang, Angela Yao |
Abstract | Hand image synthesis and pose estimation from RGB images are both highly challenging tasks due to the large discrepancy between factors of variation ranging from image background content to camera viewpoint. To better analyze these factors of variation, we propose the use of disentangled representations and a disentangled variational autoencoder (dVAE) that allows for specific sampling and inference of these factors. The derived objective from the variational lower bound as well as the proposed training strategy are highly flexible, allowing us to handle cross-modal encoders and decoders as well as semi-supervised learning scenarios. Experiments show that our dVAE can synthesize highly realistic images of the hand specifiable by both pose and image background content and also estimate 3D hand poses from RGB images with accuracy competitive with state-of-the-art on two public benchmarks. |
Tasks | Image Generation, Pose Estimation |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.01002v2 |
http://arxiv.org/pdf/1812.01002v2.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-latent-hands-for-image |
Repo | |
Framework | |
Dynamic Measurement Scheduling for Adverse Event Forecasting using Deep RL
Title | Dynamic Measurement Scheduling for Adverse Event Forecasting using Deep RL |
Authors | Chun-Hao Chang, Mingjie Mai, Anna Goldenberg |
Abstract | Current clinical practice for monitoring patients’ health follows either regular or heuristic-based lab test (e.g. blood test) scheduling. Such practice not only gives rise to redundant measurements that accrue cost, but may even lead to unnecessary patient discomfort. From the computational perspective, heuristic-based test scheduling might lead to reduced accuracy of clinical forecasting models. Computationally learning an optimal clinical test scheduling and measurement collection is likely to lead to both better predictive models and improved patient outcomes. We address the scheduling problem using deep reinforcement learning (RL) to achieve high predictive gain and low measurement cost by scheduling fewer, but strategically timed, tests. We first show that, in simulation, our policy outperforms heuristic-based measurement scheduling with higher predictive gain or lower cost as measured by accumulated reward. We then learn a scheduling policy for mortality forecasting on a real-world clinical dataset (MIMIC-III); our learned policy is able to provide useful clinical insights. To our knowledge, this is the first application of RL to the multi-measurement scheduling problem in the clinical setting. |
Tasks | |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00268v1 |
http://arxiv.org/pdf/1812.00268v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-measurement-scheduling-for-adverse |
Repo | |
Framework | |
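The policy described above trades predictive gain against measurement cost. A minimal sketch of such a reward and a greedy decision rule; the gain estimate, cost weight, and decision rule are illustrative stand-ins for the deep RL policy trained in the paper.

```python
def measurement_reward(info_gain, cost, cost_weight=0.1):
    """Reward for ordering a lab test: predictive gain minus weighted cost."""
    return info_gain - cost_weight * cost

def decide(info_gain_estimate, cost, cost_weight=0.1):
    """Order the test only when the expected reward is positive
    (a greedy stand-in for a learned RL policy)."""
    return measurement_reward(info_gain_estimate, cost, cost_weight) > 0

# toy usage: expected drop in forecasting loss versus test cost
print(decide(info_gain_estimate=0.3, cost=1.0))   # True: gain outweighs cost
print(decide(info_gain_estimate=0.05, cost=1.0))  # False: not worth measuring
```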
From Machine to Machine: An OCT-trained Deep Learning Algorithm for Objective Quantification of Glaucomatous Damage in Fundus Photographs
Title | From Machine to Machine: An OCT-trained Deep Learning Algorithm for Objective Quantification of Glaucomatous Damage in Fundus Photographs |
Authors | Felipe A. Medeiros, Alessandro A. Jammal, Atalie C. Thompson |
Abstract | Previous approaches using deep learning algorithms to classify glaucomatous damage on fundus photographs have been limited by the requirement for human labeling of a reference training set. We propose a new approach using spectral-domain optical coherence tomography (SDOCT) data to train a deep learning algorithm to quantify glaucomatous structural damage on optic disc photographs. The dataset included 32,820 pairs of optic disc photos and SDOCT retinal nerve fiber layer (RNFL) scans from 2,312 eyes of 1,198 subjects. A deep learning convolutional neural network was trained to assess optic disc photographs and predict SDOCT average RNFL thickness. The performance of the algorithm was evaluated in an independent test sample. The mean prediction of average RNFL thickness from all 6,292 optic disc photos in the test set was 83.3 ± 14.5 μm, whereas the mean average RNFL thickness from all corresponding SDOCT scans was 82.5 ± 16.8 μm (P = 0.164). There was a very strong correlation between predicted and observed RNFL thickness values (r = 0.832; P < 0.001), with a mean absolute error of the predictions of 7.39 μm. The areas under the receiver operating characteristic curves for discriminating glaucoma from healthy eyes with the deep learning predictions and actual SDOCT measurements were 0.944 (95% CI: 0.912-0.966) and 0.940 (95% CI: 0.902-0.966), respectively (P = 0.724). In conclusion, we introduced a novel deep learning approach to assess optic disc photographs and provide quantitative information about the amount of neural damage. This approach could potentially be used to diagnose and stage glaucomatous damage from optic disc photographs. |
Tasks | |
Published | 2018-10-20 |
URL | http://arxiv.org/abs/1810.10343v1 |
http://arxiv.org/pdf/1810.10343v1.pdf | |
PWC | https://paperswithcode.com/paper/from-machine-to-machine-an-oct-trained-deep |
Repo | |
Framework | |
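The machine-to-machine idea above amounts to regressing the SDOCT average RNFL thickness from a fundus photograph with a CNN. A minimal PyTorch/torchvision sketch with a ResNet-18 backbone and placeholder data; the paper's actual architecture, preprocessing, and training protocol are not reproduced here.

```python
import torch
import torch.nn as nn
from torchvision import models

# CNN regressor: fundus photo in, predicted average RNFL thickness (in microns) out
backbone = models.resnet18(weights=None)          # random init; torchvision >= 0.13 API
backbone.fc = nn.Linear(backbone.fc.in_features, 1)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)

# placeholder batch: 4 fundus photos and their SDOCT average RNFL thickness labels
photos = torch.randn(4, 3, 224, 224)
rnfl = torch.tensor([[83.1], [75.4], [96.2], [68.0]])

pred = backbone(photos)
loss = criterion(pred, rnfl)
loss.backward()
optimizer.step()
print(loss.item())
```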
Using Additional Indexes for Fast Full-Text Search of Phrases That Contain Frequently Used Words
Title | Using Additional Indexes for Fast Full-Text Search of Phrases That Contain Frequently Used Words |
Authors | A. B. Veretennikov |
Abstract | Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files. |
Tasks | |
Published | 2018-01-27 |
URL | http://arxiv.org/abs/1801.09079v2 |
http://arxiv.org/pdf/1801.09079v2.pdf | |
PWC | https://paperswithcode.com/paper/using-additional-indexes-for-fast-full-text |
Repo | |
Framework | |
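The additional indexes referred to above store postings keyed on word combinations involving frequently used words, so that a phrase query such as “to be” avoids merging two very long posting lists. A minimal sketch of a two-word index with a hard-coded frequent-word set; the data structures, thresholds, and index layout in the paper differ.

```python
from collections import defaultdict

FREQUENT = {"to", "be", "the", "of", "a", "in"}   # illustrative frequent-word set

def build_pair_index(docs):
    """Extra index: for every adjacent word pair containing a frequent word,
    store (doc_id, position) so phrase lookups skip long single-word posting lists."""
    pair_index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for pos in range(len(words) - 1):
            pair = (words[pos], words[pos + 1])
            if pair[0] in FREQUENT or pair[1] in FREQUENT:
                pair_index[pair].append((doc_id, pos))
    return pair_index

docs = ["to be or not to be", "the index of a phrase", "word sets in large text"]
index = build_pair_index(docs)
print(index[("to", "be")])   # [(0, 0), (0, 4)]
```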
Learning by Playing - Solving Sparse Reward Tasks from Scratch
Title | Learning by Playing - Solving Sparse Reward Tasks from Scratch |
Authors | Martin Riedmiller, Roland Hafner, Thomas Lampe, Michael Neunert, Jonas Degrave, Tom Van de Wiele, Volodymyr Mnih, Nicolas Heess, Jost Tobias Springenberg |
Abstract | We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors - from scratch - in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to efficiently explore its environment - enabling it to excel at sparse reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach. |
Tasks | |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10567v1 |
http://arxiv.org/pdf/1802.10567v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-by-playing-solving-sparse-reward |
Repo | |
Framework | |
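SAC-X relies on actively scheduling which auxiliary intention to execute next. A minimal sketch of a scheduler that favors auxiliary tasks which have recently yielded main-task return, as an epsilon-greedy stand-in for the learned scheduler described in the abstract; the intention names and scoring rule are illustrative.

```python
import random
from collections import defaultdict

class SimpleScheduler:
    """Pick the next auxiliary intention to execute, favoring those that
    historically led to main-task return (a stand-in for SAC-X's learned scheduler)."""
    def __init__(self, intentions, epsilon=0.2):
        self.intentions = intentions
        self.epsilon = epsilon
        self.returns = defaultdict(list)

    def choose(self):
        if random.random() < self.epsilon or not self.returns:
            return random.choice(self.intentions)
        return max(self.intentions,
                   key=lambda i: sum(self.returns[i]) / max(len(self.returns[i]), 1))

    def update(self, intention, main_task_return):
        self.returns[intention].append(main_task_return)

# toy usage with illustrative intention names
sched = SimpleScheduler(["reach", "grasp", "lift", "main"])
for episode in range(10):
    task = sched.choose()
    sched.update(task, main_task_return=random.random())
print({k: round(sum(v) / len(v), 2) for k, v in sched.returns.items()})
```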
Joint Optimization Framework for Learning with Noisy Labels
Title | Joint Optimization Framework for Learning with Noisy Labels |
Authors | Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa |
Abstract | Deep neural networks (DNNs) trained on large-scale datasets have exhibited significant performance in image classification. Many large-scale datasets are collected from websites; however, they tend to contain inaccurate labels, termed noisy labels. Training on such noisily labeled datasets causes performance degradation because DNNs easily overfit to noisy labels. To overcome this problem, we propose a joint optimization framework for learning DNN parameters and estimating true labels. Our framework can correct labels during training by alternately updating the network parameters and the labels. We conduct experiments on the noisy CIFAR-10 datasets and the Clothing1M dataset. The results indicate that our approach significantly outperforms other state-of-the-art methods. |
Tasks | Image Classification |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1803.11364v1 |
http://arxiv.org/pdf/1803.11364v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-optimization-framework-for-learning |
Repo | |
Framework | |
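The alternating update described above can be sketched directly: train the network on the current label estimates, then periodically replace the labels with the network's predictions. A minimal PyTorch sketch on toy data; the warm-up schedule and the use of hard labels are simplifications (the paper also considers soft labels and regularization terms).

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

x = torch.randn(256, 20)
labels = torch.randint(0, 10, (256,))        # possibly noisy labels

for epoch in range(20):
    # step 1: update network parameters on the current label estimates
    optimizer.zero_grad()
    loss = criterion(model(x), labels)
    loss.backward()
    optimizer.step()
    # step 2: after a warm-up, update the label estimates from the predictions
    if epoch >= 10:
        with torch.no_grad():
            labels = model(x).argmax(dim=1)
print(loss.item())
```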
Robust Bhattacharyya bound linear discriminant analysis through adaptive algorithm
Title | Robust Bhattacharyya bound linear discriminant analysis through adaptive algorithm |
Authors | Chun-Na Li, Yuan-Hai Shao, Zhen Wang, Nai-Yang Deng |
Abstract | In this paper, we propose novel linear discriminant analysis criteria via Bhattacharyya error bound estimation based on the L1-norm (L1BLDA) and the L2-norm (L2BLDA). Both L1BLDA and L2BLDA maximize the between-class scatter, measured by the weighted pairwise distances of class means, while minimizing the within-class scatter under the L1-norm and L2-norm, respectively. The proposed models avoid the small sample size (SSS) problem and have no rank limit, which may be encountered in LDA. It is worth mentioning that the use of the L1-norm gives L1BLDA robust performance, and L1BLDA is solved through an effective non-greedy alternating direction method of multipliers (ADMM), where all the projection vectors can be obtained at once. In addition, the weighting constants between the between-class and within-class terms of L1BLDA and L2BLDA are determined by the data set involved, which makes L1BLDA and L2BLDA adaptive. Experimental results on benchmark data sets as well as handwritten digit databases demonstrate the effectiveness of the proposed methods. |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02384v1 |
http://arxiv.org/pdf/1811.02384v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-bhattacharyya-bound-linear |
Repo | |
Framework | |
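The L2BLDA criterion described above maximizes a weighted pairwise between-class scatter while penalizing within-class scatter. A minimal NumPy sketch of one simplified reading of that trade-off, solved as an eigenvalue problem; the pair weights, the adaptive weighting constant, and the L1 variant from the paper are not reproduced here.

```python
import numpy as np

def pairwise_lda_directions(X, y, lam=1.0, n_components=2):
    """Projection directions maximizing prior-weighted pairwise between-class
    scatter minus lam * within-class scatter (a simplified reading of the L2
    criterion; the paper derives lam and the pair weights from the data)."""
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / len(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    d = X.shape[1]
    S_b = np.zeros((d, d))
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            diff = (means[i] - means[j])[:, None]
            S_b += priors[i] * priors[j] * (diff @ diff.T)
    S_w = np.zeros((d, d))
    for c, m in zip(classes, means):
        Xc = X[y == c] - m
        S_w += Xc.T @ Xc / len(y)
    # maximizing tr(W^T (S_b - lam * S_w) W) over orthonormal W -> top eigenvectors
    vals, vecs = np.linalg.eigh(S_b - lam * S_w)
    return vecs[:, np.argsort(vals)[::-1][:n_components]]

# toy usage on two Gaussian classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(2, 1, (30, 5))])
y = np.array([0] * 30 + [1] * 30)
print(pairwise_lda_directions(X, y).shape)   # (5, 2)
```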