January 31, 2020

3473 words 17 mins read

Paper Group ANR 161

Learning to Select, Track, and Generate for Data-to-Text. Conditional Hierarchical Bayesian Tucker Decomposition. Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework. Clustering by Directly Disentangling Latent Space. Topology Based Scalable Graph Kernels. Locally Accelerated Conditional Gradients. Label Aware Grap …

Learning to Select, Track, and Generate for Data-to-Text

Title Learning to Select, Track, and Generate for Data-to-Text
Authors Hayate Iso, Yui Uehara, Tatsuya Ishigaki, Hiroshi Noji, Eiji Aramaki, Ichiro Kobayashi, Yusuke Miyao, Naoaki Okazaki, Hiroya Takamura
Abstract We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. Our tracking module selects and keeps track of salient information and memorizes which records have been mentioned. Our generation module generates a summary conditioned on the state of the tracking module. Our model simulates a human-like writing process that gradually selects information by determining intermediate variables while writing the summary. In addition, we explore the effectiveness of writer information for generation. Experimental results show that our model outperforms existing models on all evaluation metrics even without writer information. Incorporating writer information further improves the performance, contributing to content planning and surface realization.
Tasks Data-to-Text Generation, Text Generation
Published 2019-07-23
URL https://arxiv.org/abs/1907.09699v1
PDF https://arxiv.org/pdf/1907.09699v1.pdf
PWC https://paperswithcode.com/paper/learning-to-select-track-and-generate-for
Repo
Framework
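
A minimal PyTorch sketch of the select-track-generate idea described above, under the assumption of a soft (attention-based) record selection; the module names, dimensions, and GRU cells are illustrative choices, not the authors' implementation.

```python
# Hedged sketch: a tracking state scores table records, the softly selected
# record updates the tracker, and the generator is conditioned on both.
import torch
import torch.nn as nn

class TrackerGenerator(nn.Module):
    def __init__(self, record_dim=64, hidden_dim=128, vocab_size=1000):
        super().__init__()
        self.salience = nn.Bilinear(hidden_dim, record_dim, 1)  # score each record
        self.track_rnn = nn.GRUCell(record_dim, hidden_dim)     # tracking module
        self.gen_rnn = nn.GRUCell(record_dim + hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def step(self, records, track_h, gen_h):
        # records: (num_records, record_dim); track_h, gen_h: (hidden_dim,)
        scores = self.salience(track_h.expand(records.size(0), -1), records).squeeze(-1)
        attn = torch.softmax(scores, dim=0)          # which record to mention next
        selected = attn @ records                    # soft selection of a record
        track_h = self.track_rnn(selected.unsqueeze(0), track_h.unsqueeze(0)).squeeze(0)
        gen_h = self.gen_rnn(torch.cat([selected, track_h]).unsqueeze(0),
                             gen_h.unsqueeze(0)).squeeze(0)
        return self.out(gen_h), track_h, gen_h       # next-token logits + updated states
```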

Conditional Hierarchical Bayesian Tucker Decomposition

Title Conditional Hierarchical Bayesian Tucker Decomposition
Authors Adam Sandler, Diego Klabjan, Yuan Luo
Abstract Our research focuses on studying and developing methods for reducing the dimensionality of large datasets, common in biomedical applications. A major problem when learning information about patients based on genetic sequencing data is that there are often more feature variables (genetic data) than observations (patients). This makes direct supervised learning difficult. One way of reducing the feature space is to use latent Dirichlet allocation in order to group genetic variants in an unsupervised manner. Latent Dirichlet allocation is a common model in natural language processing, which describes a document as a mixture of topics, each with a probability of generating certain words. This can be generalized as a Bayesian tensor decomposition to account for multiple feature variables. While we made some progress improving and modifying these methods, our significant contributions are in hierarchical topic modeling. We developed distinct methods of incorporating hierarchical topic modeling, based on nested Chinese restaurant processes and the Pachinko Allocation Machine, into Bayesian tensor decompositions. We apply these models to predict whether or not patients have autism spectrum disorder based on genetic sequencing data. We examine a dataset from the National Database for Autism Research consisting of paired siblings – one with autism, and the other without – and counts of their genetic variants. Additionally, we linked the genes with their Reactome biological pathways. We combine this information into a tensor of patients, counts of their genetic variants, and the membership of these genes in pathways. Once we decompose this tensor, we use logistic regression on the reduced features in order to predict if patients have autism. We also perform a similar analysis of a dataset of patients with one of four common types of cancer (breast, lung, prostate, and colorectal).
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.12426v2
PDF https://arxiv.org/pdf/1911.12426v2.pdf
PWC https://paperswithcode.com/paper/conditional-hierarchical-bayesian-tucker
Repo
Framework
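
The downstream pipeline (unsupervised reduction of the feature space followed by logistic regression) can be illustrated with a flat, non-Bayesian stand-in. The sketch below uses plain scikit-learn LDA on a flattened synthetic count matrix; it is not the hierarchical Bayesian Tucker model itself, and all data are synthetic.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic stand-in for the real tensor: patients x (variant, pathway) counts,
# flattened so plain LDA can reduce the feature space.
counts = rng.poisson(1.0, size=(200, 50 * 30))
labels = rng.integers(0, 2, size=200)          # diagnosis label (synthetic)

topics = LatentDirichletAllocation(n_components=10, random_state=0).fit_transform(counts)
clf = LogisticRegression(max_iter=1000).fit(topics, labels)
print("training accuracy on reduced features:", clf.score(topics, labels))
```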

Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework

Title Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework
Authors Haoran Wang, Xun Yu Zhou
Abstract We approach the continuous-time mean-variance (MV) portfolio selection problem with reinforcement learning (RL). The problem is to achieve the best tradeoff between exploration and exploitation, and is formulated as an entropy-regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time-decaying variance. We then establish connections between the entropy-regularized MV and the classical MV, including the solvability equivalence and the convergence as the exploration weighting parameter decays to zero. Finally, we prove a policy improvement theorem, based on which we devise an implementable RL algorithm. We find that our algorithm outperforms both an adaptive control-based method and a deep neural network-based algorithm by a large margin in our simulations.
Tasks Continuous Control, Portfolio Optimization
Published 2019-04-25
URL https://arxiv.org/abs/1904.11392v2
PDF https://arxiv.org/pdf/1904.11392v2.pdf
PWC https://paperswithcode.com/paper/continuous-time-mean-variance-portfolio
Repo
Framework
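
The paper's key structural result is that the optimal exploratory policy is Gaussian with time-decaying variance. The sketch below merely samples from such a policy; the specific mean and variance parametrizations (linear feedback in the wealth gap, exponential decay over time) are illustrative assumptions, not the paper's closed-form solution.

```python
import numpy as np

def exploratory_allocation(t, wealth, target, k=1.0, temperature=0.5, decay=0.1, rng=None):
    """Sample a risky-asset allocation from a Gaussian feedback policy."""
    rng = rng or np.random.default_rng()
    mean = -k * (wealth - target)            # feedback mean, linear in the wealth gap (assumed form)
    var = temperature * np.exp(-decay * t)   # exploration variance decays over time (assumed form)
    return rng.normal(mean, np.sqrt(var))

# usage: one sampled action at time t = 0.5 with current wealth 1.02 and target 1.10
u = exploratory_allocation(t=0.5, wealth=1.02, target=1.10)
```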

Clustering by Directly Disentangling Latent Space

Title Clustering by Directly Disentangling Latent Space
Authors Fei Ding, Feng Luo
Abstract To overcome the high dimensionality of data, learning latent feature representations for clustering has been widely studied recently. However, it is still challenging to learn “cluster-friendly” latent representations due to the unsupervised nature of clustering. In this paper, we propose Disentangling Latent Space Clustering (DLS-Clustering), a new clustering mechanism that directly learns cluster assignments while disentangling the latent space, without constructing a “cluster-friendly” latent representation or relying on additional clustering methods. We achieve the bidirectional mapping by enforcing an inference network (i.e. encoder) and the generator of a GAN to form a deterministic encoder-decoder pair with a maximum mean discrepancy (MMD)-based regularization. We utilize a weight-sharing procedure to disentangle the latent space into one-hot discrete latent variables and continuous latent variables. The disentangling process actually performs the clustering operation. Eventually the one-hot discrete latent variables can be directly expressed as clusters, and the continuous latent variables represent the remaining unspecified factors. Experiments on six benchmark datasets of different types demonstrate that our method outperforms existing state-of-the-art methods. We further show that the latent representations from DLS-Clustering also maintain the ability to generate diverse and high-quality images, which can support more promising application scenarios.
Tasks
Published 2019-11-13
URL https://arxiv.org/abs/1911.05210v1
PDF https://arxiv.org/pdf/1911.05210v1.pdf
PWC https://paperswithcode.com/paper/clustering-by-directly-disentangling-latent
Repo
Framework
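
The latent-prior side of the mechanism can be sketched as follows, under the assumption of a standard RBF-kernel MMD and a prior that concatenates a one-hot cluster code with continuous noise; this is only the regularization ingredient, not the authors' full GAN with weight sharing.

```python
import torch

def rbf_mmd(x, y, sigma=1.0):
    """Biased RBF-kernel MMD estimate between two batches of latent codes."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def sample_prior(batch, n_clusters=10, cont_dim=30):
    """Prior latent: one-hot cluster code concatenated with Gaussian noise."""
    onehot = torch.eye(n_clusters)[torch.randint(n_clusters, (batch,))]
    return torch.cat([onehot, torch.randn(batch, cont_dim)], dim=1)

# usage with a hypothetical encoder:
#   z = encoder(images)                                     # (batch, 40) latent codes
#   loss = recon_loss + lam * rbf_mmd(z, sample_prior(images.size(0)))
```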

Topology Based Scalable Graph Kernels

Title Topology Based Scalable Graph Kernels
Authors Kin Sum Liu, Chien-Chun Ni, Yu-Yao Lin, Jie Gao
Abstract We propose a new graph kernel for graph classification and comparison using Ollivier-Ricci curvature. The Ricci curvature of an edge in a graph describes the connectivity in the local neighborhood. An edge in a densely connected neighborhood has positive curvature, and an edge serving as a local bridge has negative curvature. We use the edge curvature distribution to form a graph kernel, which is then used to compare and cluster graphs. The curvature kernel uses purely the graph topology and thereby works in settings where node attributes are not available.
Tasks Graph Classification
Published 2019-07-15
URL https://arxiv.org/abs/1907.07129v1
PDF https://arxiv.org/pdf/1907.07129v1.pdf
PWC https://paperswithcode.com/paper/topology-based-scalable-graph-kernels
Repo
Framework
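
The kernel step itself is straightforward once edge curvatures are available (e.g. from an Ollivier-Ricci implementation). A hedged sketch: summarize each graph by a histogram of its edge curvatures and compare the histograms with an RBF kernel; the bin count and bandwidth below are arbitrary choices.

```python
import numpy as np

def curvature_kernel(curvatures_per_graph, bins=20, gamma=1.0):
    """RBF kernel between per-graph histograms of edge Ricci curvatures."""
    all_c = np.concatenate(curvatures_per_graph)
    edges = np.linspace(all_c.min(), all_c.max(), bins + 1)   # shared binning
    hists = np.stack([np.histogram(c, bins=edges, density=True)[0]
                      for c in curvatures_per_graph])
    d2 = ((hists[:, None, :] - hists[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)                                # graphs x graphs kernel matrix

# usage with made-up curvature arrays for two graphs:
K = curvature_kernel([np.random.uniform(-1, 1, 40), np.random.uniform(-1, 1, 55)])
```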

Locally Accelerated Conditional Gradients

Title Locally Accelerated Conditional Gradients
Authors Jelena Diakonikolas, Alejandro Carderera, Sebastian Pokutta
Abstract Conditional gradients constitute a class of projection-free first-order algorithms for smooth convex optimization. As such, they are frequently used in solving smooth convex optimization problems over polytopes, for which the computational cost of orthogonal projections would be prohibitive. However, they do not enjoy the optimal convergence rates achieved by projection-based accelerated methods; moreover, achieving such globally accelerated rates is information-theoretically impossible for these methods. To address this issue, we present Locally Accelerated Conditional Gradients – an algorithmic framework that couples accelerated steps with conditional gradient steps to achieve local acceleration on smooth strongly convex problems. Our approach does not require projections onto the feasible set, but only onto (typically low-dimensional) simplices, thus keeping the computational cost of projections at bay. Further, it achieves the optimal accelerated local convergence rate. Our theoretical results are supported by numerical experiments, which demonstrate significant speedups of our framework over state-of-the-art methods in both per-iteration progress and wall-clock time.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.07867v2
PDF https://arxiv.org/pdf/1906.07867v2.pdf
PWC https://paperswithcode.com/paper/locally-accelerated-conditional-gradients
Repo
Framework
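
For reference, the projection-free building block that the framework couples with accelerated steps is the classic Frank-Wolfe (conditional gradient) method. The sketch below is plain Frank-Wolfe over the probability simplex, not the locally accelerated scheme itself.

```python
import numpy as np

def frank_wolfe_simplex(grad_f, x0, iters=200):
    """Plain Frank-Wolfe over the probability simplex (no acceleration)."""
    x = x0.copy()
    for t in range(iters):
        v = np.zeros_like(x)
        v[np.argmin(grad_f(x))] = 1.0     # linear minimization oracle: a simplex vertex
        gamma = 2.0 / (t + 2.0)           # standard open-loop step size
        x = (1 - gamma) * x + gamma * v
    return x

# example: project b onto the simplex by minimizing ||x - b||^2
b = np.array([0.7, 0.2, 0.1, -0.3])
x_star = frank_wolfe_simplex(lambda x: 2 * (x - b), np.ones(4) / 4)
```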

Label Aware Graph Convolutional Network – Not All Edges Deserve Your Attention

Title Label Aware Graph Convolutional Network – Not All Edges Deserve Your Attention
Authors Hao Chen, Lu Wang, Senzhang Wang, Dijun Luo, Wenbing Huang, Zhoujun Li
Abstract Graph classification is practically important in many domains. To solve this problem, one usually calculates a low-dimensional representation for each node in the graph with supervised or unsupervised approaches. Most existing approaches consider all the edges between nodes while overlooking whether an edge brings positive or negative influence to node representation learning. In many real-world applications, however, some connections among the nodes can be noisy for graph convolution, and not all the edges deserve your attention. In this work, we distinguish the positive and negative impacts of neighbors on a node in graph node classification, and propose to enhance the graph convolutional network by considering the labels associated with neighboring edges. We present a novel GCN framework, called Label-aware Graph Convolutional Network (LAGCN), which combines supervised and unsupervised learning by introducing an edge label predictor. As a general model, LAGCN can be easily adapted to various existing GCNs and enhances their performance with some theoretical guarantees. Experimental results on multiple real-world datasets show that LAGCN is competitive against various state-of-the-art methods in graph classification.
Tasks Graph Classification, Node Classification, Representation Learning
Published 2019-07-10
URL https://arxiv.org/abs/1907.04707v1
PDF https://arxiv.org/pdf/1907.04707v1.pdf
PWC https://paperswithcode.com/paper/label-aware-graph-convolutional-network-not
Repo
Framework
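
One way to picture the edge-label idea is a layer in which a small edge predictor scores each edge and the score reweights neighbor aggregation. The sketch below is a generic PyTorch illustration under that reading (self-loops and normalization details omitted), not the authors' LAGCN code.

```python
import torch
import torch.nn as nn

class EdgeAwareGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        # small MLP that predicts how trustworthy each edge is
        self.edge_mlp = nn.Sequential(nn.Linear(2 * in_dim, 32), nn.ReLU(),
                                      nn.Linear(32, 1), nn.Sigmoid())
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index):
        # x: (N, in_dim) node features; edge_index: (2, E) with rows (src, dst)
        src, dst = edge_index
        w = self.edge_mlp(torch.cat([x[src], x[dst]], dim=1)).squeeze(-1)  # per-edge weight
        agg = torch.zeros_like(x).index_add_(0, dst, w.unsqueeze(-1) * x[src])
        deg = torch.zeros(x.size(0), device=x.device).index_add_(0, dst, w) + 1e-6
        return torch.relu(self.lin(agg / deg.unsqueeze(-1)))   # weighted mean aggregation
```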

Improving Image Classification Robustness through Selective CNN-Filters Fine-Tuning

Title Improving Image Classification Robustness through Selective CNN-Filters Fine-Tuning
Authors Alessandro Bianchi, Moreno Raimondo Vendra, Pavlos Protopapas, Marco Brambilla
Abstract Image quality plays a big role in CNN-based image classification performance. Fine-tuning the network with distorted samples may be too costly for large networks. To solve this issue, we propose a transfer learning approach optimized to take into account that, in each layer of a CNN, some filters are more susceptible to image distortion than others. Our method identifies the most susceptible filters and applies retraining only to the filters that show the largest activation-map distance between clean and distorted images. Filters are ranked using the Borda count election method, and then only the most affected filters are fine-tuned. This significantly reduces the number of parameters to retrain. We evaluate this approach on the CIFAR-10 and CIFAR-100 datasets, testing it on two different models and two different types of distortion. Results show that the proposed transfer learning technique recovers most of the performance lost due to input data distortion, considerably faster than existing methods thanks to the reduced number of parameters to fine-tune. When few noisy samples are provided for training, our filter-level fine-tuning performs particularly well, also outperforming state-of-the-art layer-level transfer learning approaches.
Tasks Image Classification, Transfer Learning
Published 2019-04-08
URL http://arxiv.org/abs/1904.03949v1
PDF http://arxiv.org/pdf/1904.03949v1.pdf
PWC https://paperswithcode.com/paper/improving-image-classification-robustness
Repo
Framework
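
The filter-selection step can be sketched as follows: compute a per-filter distance between activation maps on clean and distorted inputs, aggregate several rankings with a Borda count, and fine-tune only the top-ranked filters. Function names and the gradient-masking detail are assumptions, not the authors' code.

```python
import numpy as np
import torch

def filter_distances(acts_clean, acts_noisy):
    """Per-filter L2 distance between activation maps; acts: (batch, filters, H, W)."""
    diff = (acts_clean - acts_noisy) ** 2
    return diff.mean(dim=(0, 2, 3)).sqrt().cpu().numpy()

def borda_rank(distance_lists):
    """Aggregate several per-filter distance rankings with a Borda count."""
    n = len(distance_lists[0])
    points = np.zeros(n)
    for d in distance_lists:
        points[np.argsort(d)] += np.arange(n)   # most-affected filter earns n-1 points
    return np.argsort(-points)                  # filter indices, most affected first

# usage sketch (hypothetical model/layer names):
#   for p in model.parameters(): p.requires_grad = False
#   top = borda_rank(distances_per_batch)[:k]
#   conv.weight.requires_grad = True   # then mask gradients so only rows in `top` update
```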

Estimating sex and age for forensic applications using machine learning based on facial measurements from frontal cephalometric landmarks

Title Estimating sex and age for forensic applications using machine learning based on facial measurements from frontal cephalometric landmarks
Authors Lucas F. Porto, Laise N. Correia Lima, Ademir Franco, Donald M. Pianto, Carlos Eduardo Machado Palhares, Flavio de Barros Vidal
Abstract Facial analysis permits many investigations, some of the most important of which are craniofacial identification, facial recognition, and age and sex estimation. In forensics, photo-anthropometry describes the study of facial growth and allows the identification of patterns in facial skull development by using a group of cephalometric landmarks to estimate anthropological information. In several areas, the automation of manual procedures has achieved advantages over forensic experts and comparable measurement confidence. This manuscript presents an approach using photo-anthropometric indexes, generated from frontal-face cephalometric landmarks, to create an artificial neural network classifier that allows the estimation of anthropological information, in this specific case age and sex. The work is focused on four tasks: i) sex estimation over ages from 5 to 22 years old, evaluating the interference of age on sex estimation; ii) age estimation from photo-anthropometric indexes for four age intervals (1 year, 2 years, 4 years and 5 years); iii) age group estimation for thresholds of over 14 and over 18 years old; and iv) the provision of a new data set, available for academic purposes only, with a large and complete set of facial photo-anthropometric points marked and checked by forensic experts, measured from over 18,000 faces of individuals from Brazil over the last 4 years. The proposed classifier obtained significant results, using this new data set, for the sex estimation of individuals over 14 years old, achieving values greater than 0.85 on the F_1 measure. For age estimation, the accuracy is 0.72 for the 5-year age interval. For age group estimation, the accuracy is greater than 0.93 and 0.83 for the 14- and 18-year thresholds, respectively.
Tasks Age Estimation
Published 2019-08-06
URL https://arxiv.org/abs/1908.02353v1
PDF https://arxiv.org/pdf/1908.02353v1.pdf
PWC https://paperswithcode.com/paper/estimating-sex-and-age-for-forensic
Repo
Framework
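
The modeling setup reduces to a small classifier over photo-anthropometric indexes. A hedged scikit-learn sketch on synthetic data (the real feature count, architecture, and evaluation protocol are in the paper):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 40))      # 40 photo-anthropometric indexes (assumed count), synthetic
y = rng.integers(0, 2, size=2000)    # sex label, synthetic

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0).fit(Xtr, ytr)
print("F_1 on held-out synthetic data:", f1_score(yte, clf.predict(Xte)))
```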

SenseBERT: Driving Some Sense into BERT

Title SenseBERT: Driving Some Sense into BERT
Authors Yoav Levine, Barak Lenz, Or Dagan, Dan Padnos, Or Sharir, Shai Shalev-Shwartz, Amnon Shashua, Yoav Shoham
Abstract Self-supervision techniques have allowed neural language models to advance the frontier in Natural Language Understanding. However, existing self-supervision techniques operate at the word-form level, which serves as a surrogate for the underlying semantic content. This paper proposes a method to employ self-supervision directly at the word-sense level. Our model, named SenseBERT, is pre-trained to predict not only the masked words but also their WordNet supersenses. Accordingly, we attain a lexical-semantic level language model, without the use of human annotation. SenseBERT achieves significantly improved lexical understanding, as we demonstrate by experimenting on SemEval, and by attaining a state-of-the-art result on the Word in Context (WiC) task. Our approach is extendable to other linguistic signals, which can be similarly integrated into the pre-training process, leading to increasingly semantically informed language models.
Tasks Language Modelling
Published 2019-08-15
URL https://arxiv.org/abs/1908.05646v1
PDF https://arxiv.org/pdf/1908.05646v1.pdf
PWC https://paperswithcode.com/paper/sensebert-driving-some-sense-into-bert
Repo
Framework
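
The training signal can be summarized as a masked-LM loss plus a second cross-entropy over WordNet supersenses for the same masked positions. The sketch below shows that joint loss only; the head names, the 45-supersense count, and the weighting are assumptions rather than the SenseBERT implementation.

```python
import torch
import torch.nn as nn

hidden, vocab_size, n_supersenses = 768, 30522, 45   # 45 WordNet lexicographer classes (assumed)
word_head = nn.Linear(hidden, vocab_size)
sense_head = nn.Linear(hidden, n_supersenses)
ce = nn.CrossEntropyLoss()

def joint_masked_loss(masked_hidden, word_targets, supersense_targets, alpha=1.0):
    # masked_hidden: (num_masked, hidden) encoder states at the masked positions
    word_loss = ce(word_head(masked_hidden), word_targets)          # standard masked-LM term
    sense_loss = ce(sense_head(masked_hidden), supersense_targets)  # supersense prediction term
    return word_loss + alpha * sense_loss
```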

Biometric Fish Classification of Temperate Species Using Convolutional Neural Network with Squeeze-and-Excitation

Title Biometric Fish Classification of Temperate Species Using Convolutional Neural Network with Squeeze-and-Excitation
Authors Erlend Olsvik, Christian M. D. Trinh, Kristian Muri Knausgård, Arne Wiklund, Tonje Knutsen Sørdalen, Alf Ring Kleiven, Lei Jiao, Morten Goodwin
Abstract Our understanding and ability to effectively monitor and manage coastal ecosystems are severely limited by observation methods. Automatic recognition of species in their natural environment is a promising tool which would revolutionize video and image analysis for a wide range of applications in marine ecology. However, classifying fish from images captured by underwater cameras is in general very challenging due to noise and illumination variations in water. Previous classification methods in the literature rely on filtering the images to separate the fish from the background or on sharpening the images by removing background noise. This pre-filtering process may negatively impact the classification accuracy. In this work, we propose a Convolutional Neural Network (CNN) using the Squeeze-and-Excitation (SE) architecture for classifying images of fish without pre-filtering. Unlike conventional schemes, our scheme is divided into two steps. The first step is to train the fish classifier on a public data set, Fish4Knowledge, without image augmentation; we refer to this as pre-training. The second step is to train the classifier on a new data set consisting of the species we are interested in classifying; we refer to this as post-training. The weights obtained from pre-training are used to initialize post-training, which is a form of transfer learning. Our solution achieves a state-of-the-art accuracy of 99.27% on pre-training. The accuracy on post-training is 83.68%. Experiments on post-training with image augmentation yield an accuracy of 87.74%, indicating that the solution is viable with a larger data set.
Tasks Image Augmentation, Transfer Learning
Published 2019-04-04
URL http://arxiv.org/abs/1904.02768v1
PDF http://arxiv.org/pdf/1904.02768v1.pdf
PWC https://paperswithcode.com/paper/biometric-fish-classification-of-temperate
Repo
Framework
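
The channel-attention ingredient is the standard Squeeze-and-Excitation block, sketched generically below (not the authors' exact network or training setup).

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard Squeeze-and-Excitation channel attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))                 # squeeze: global average pooling
        w = self.fc(s).view(x.size(0), x.size(1), 1, 1)
        return x * w                           # excite: channel-wise reweighting
```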

Less Memory, Faster Speed: Refining Self-Attention Module for Image Reconstruction

Title Less Memory, Faster Speed: Refining Self-Attention Module for Image Reconstruction
Authors Zheng Wang, Jianwu Li, Ge Song, Tieling Li
Abstract Self-attention (SA) mechanisms can effectively capture global dependencies in deep neural networks, and have been applied successfully to natural language processing and image processing. However, SA modules for image reconstruction have high time and space complexity, which restricts their application to higher-resolution images. In this paper, we refine the SA module in self-attention generative adversarial networks (SAGAN) by adapting a non-local operation, revising the connectivity among the units in the SA module, and re-implementing its computational pattern, such that its time and space complexity is reduced from $\text{O}(n^2)$ to $\text{O}(n)$ while remaining equivalent to the original SA module. Further, we explore the principles behind the module and discover that our module is a special kind of channel attention mechanism. Experimental results on two benchmark image reconstruction datasets verify that, under the same computational environment, the two models achieve comparable effectiveness for image reconstruction, but the proposed one runs faster and takes up less memory.
Tasks Image Reconstruction
Published 2019-05-20
URL https://arxiv.org/abs/1905.08008v1
PDF https://arxiv.org/pdf/1905.08008v1.pdf
PWC https://paperswithcode.com/paper/less-memory-faster-speed-refining-self
Repo
Framework
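
One common way to obtain the O(n) behavior described above is to apply the softmax to queries and keys separately so the key-value product is formed before any n x n map is built; this also makes the module read as a form of channel attention. The sketch below shows that generic reformulation, which may differ in detail from the paper's revised module.

```python
import torch

def linear_attention(q, k, v):
    # q, k: (B, n, d_k); v: (B, n, d_v); memory is O(n*d) rather than O(n^2)
    q = torch.softmax(q, dim=-1)           # normalize each query over channels
    k = torch.softmax(k, dim=1)            # normalize keys over positions
    context = k.transpose(1, 2) @ v        # (B, d_k, d_v) global context, no n x n map
    return q @ context                     # (B, n, d_v)
```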

Gradient Boosts the Approximate Vanishing Ideal

Title Gradient Boosts the Approximate Vanishing Ideal
Authors Hiroshi Kera, Yoshihiko Hasegawa
Abstract In the last decade, the approximate vanishing ideal and its basis construction algorithms have been extensively studied in computer algebra and machine learning as a general model to reconstruct the algebraic variety on which noisy data approximately lie. In particular, the basis construction algorithms developed in machine learning are widely used in applications across many fields because of their monomial-order-free property; however, they lose many of the theoretical properties of computer-algebraic algorithms. In this paper, we propose general methods that equip monomial-order-free algorithms with several advantageous theoretical properties. Specifically, we exploit the gradient to (i) sidestep the spurious vanishing problem in polynomial time to remove symbolically trivial redundant bases, (ii) achieve consistent output with respect to the translation and scaling of input, and (iii) remove nontrivially redundant bases. The proposed methods work in a fully numerical manner, whereas existing algorithms require the awkward monomial order or exponentially costly (and mostly symbolic) computation to realize properties (i) and (iii). To our knowledge, property (ii) has not been achieved by any existing basis construction algorithm of the approximate vanishing ideal.
Tasks
Published 2019-11-11
URL https://arxiv.org/abs/1911.04174v1
PDF https://arxiv.org/pdf/1911.04174v1.pdf
PWC https://paperswithcode.com/paper/gradient-boosts-the-approximate-vanishing
Repo
Framework
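
The core trick can be illustrated at the level of a single polynomial: judge its extent of vanishing on the data after normalizing by a gradient-based (semi-)norm, which is invariant to coefficient scaling and thus sidesteps spurious vanishing. The sketch below is an assumption-level illustration, not the paper's basis-construction algorithm.

```python
import numpy as np

def gradient_normalized_extent(poly_vals, grad_vals):
    # poly_vals: (N,) values of a polynomial g at the data points
    # grad_vals: (N, d) gradients of g at the same points
    grad_seminorm = np.linalg.norm(grad_vals) + 1e-12   # sqrt(sum_i ||grad g(x_i)||^2)
    return np.linalg.norm(poly_vals) / grad_seminorm

# Scaling g by a constant scales numerator and denominator identically, so a
# polynomial cannot look "vanishing" merely because its coefficients are tiny.
```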

Competing Ratio Loss for Discriminative Multi-class Image Classification

Title Competing Ratio Loss for Discriminative Multi-class Image Classification
Authors Ke Zhang, Xinsheng Wang, Yurong Guo, Zhenbing Zhao, Zhanyu Ma, Tony X. Han
Abstract The development of deep convolutional neural network architectures is critical to improving image classification performance. Many studies of image classification based on deep convolutional neural networks focus on the network structure to improve performance. In contrast, we focus on the loss function. Cross-entropy Loss (CEL) is widely used for training multi-class classification deep convolutional neural networks. While CEL has been successfully applied to image classification tasks, it only focuses on the posterior probability of the correct class when the labels of training images are one-hot; it cannot directly discriminate against the classes that do not belong to the correct class (wrong classes). To address this limitation of CEL, we propose the Competing Ratio Loss (CRL), which computes the posterior probability ratio between the correct class and the competing wrong classes to better discriminate the correct class from them. This increases the difference between the negative log-likelihood of the correct class and that of the competing wrong classes, widening the gap between the probability of the correct class and the probabilities of the wrong classes. To demonstrate the effectiveness of our loss function, we perform several sets of experiments on different types of image classification datasets, including the CIFAR, SVHN, CUB200-2011, Adience and ImageNet datasets. The experimental results show the effectiveness and robustness of our loss function across different deep convolutional neural network architectures and different image classification tasks, such as fine-grained image classification, hard face age estimation and large-scale image classification.
Tasks Age Estimation, Fine-Grained Image Classification, Image Classification
Published 2019-07-31
URL https://arxiv.org/abs/1907.13349v2
PDF https://arxiv.org/pdf/1907.13349v2.pdf
PWC https://paperswithcode.com/paper/competing-ratio-loss-for-discriminative-multi
Repo
Framework
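
One plausible reading of the ratio idea is a loss on the log-ratio between the correct-class posterior and the total posterior mass of the competing wrong classes. The sketch below implements that reading; the exact CRL definition (and any margin or weighting terms) is in the paper.

```python
import torch
import torch.nn.functional as F

def competing_ratio_style_loss(logits, targets):
    logp = F.log_softmax(logits, dim=1)
    correct = logp.gather(1, targets.unsqueeze(1)).squeeze(1)            # log p_correct
    mask = torch.ones_like(logp).scatter_(1, targets.unsqueeze(1), 0.0)  # zero out the correct class
    competing = torch.logsumexp(logp + torch.log(mask + 1e-12), dim=1)   # log sum of wrong-class probs
    return -(correct - competing).mean()   # push p_correct up relative to the competing mass
```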

Deep Differentiable Random Forests for Age Estimation

Title Deep Differentiable Random Forests for Age Estimation
Authors Wei Shen, Yilu Guo, Yan Wang, Kai Zhao, Bo Wang, Alan Yuille
Abstract Age estimation from facial images is typically cast as a label distribution learning or regression problem, since aging is a gradual process. Its main challenge is that the facial feature space w.r.t. ages is inhomogeneous, due to the large variation in facial appearance across different persons of the same age and the non-stationary property of aging. In this paper, we propose two Deep Differentiable Random Forests methods, Deep Label Distribution Learning Forest (DLDLF) and Deep Regression Forest (DRF), for age estimation. Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes. This joint learning follows an alternating strategy: (1) fixing the leaf nodes and optimizing the split nodes and the CNN parameters by back-propagation; (2) fixing the split nodes and optimizing the leaf nodes by variational bounding. Two deterministic annealing processes are introduced into the learning of the split and leaf nodes, respectively, to avoid poor local optima and obtain better estimates of tree parameters free of initial values. Experimental results show that DLDLF and DRF achieve state-of-the-art performance on three age estimation datasets.
Tasks Age Estimation
Published 2019-07-23
URL https://arxiv.org/abs/1907.10665v2
PDF https://arxiv.org/pdf/1907.10665v2.pdf
PWC https://paperswithcode.com/paper/deep-differentiable-random-forests-for-age
Repo
Framework
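
The differentiable-forest head can be sketched generically: sigmoid split nodes route a CNN feature vector softly through a binary tree, and the prediction mixes leaf distributions by the routing probabilities. The sketch below is in that spirit (a single tree, arbitrary depth and output size), not the authors' DLDLF/DRF code with its alternating optimization and deterministic annealing.

```python
import torch
import torch.nn as nn

class SoftTree(nn.Module):
    """A single differentiable decision tree head over CNN features."""
    def __init__(self, feat_dim, depth=4, n_out=70):    # n_out: e.g. number of age bins (assumed)
        super().__init__()
        self.depth = depth
        self.splits = nn.Linear(feat_dim, 2 ** depth - 1)           # one unit per split node
        self.leaves = nn.Parameter(torch.randn(2 ** depth, n_out))  # leaf distributions (logits)

    def forward(self, feats):
        d = torch.sigmoid(self.splits(feats))           # (B, n_splits): go-left probabilities
        mu = torch.ones(feats.size(0), 1, device=feats.device)
        idx = 0
        for level in range(self.depth):
            n = 2 ** level
            dl = d[:, idx:idx + n]                      # split nodes at this level
            mu = torch.stack([mu * dl, mu * (1 - dl)], dim=2).reshape(feats.size(0), 2 * n)
            idx += n
        return mu @ torch.softmax(self.leaves, dim=1)   # (B, n_out) mixture of leaf distributions
```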