October 17, 2019

Paper Group ANR 781

Internal node bagging. Distance Measure Machines. On PAC-Bayesian Bounds for Random Forests. Geometric Median Shapes. Degree based Classification of Harmful Speech using Twitter Data. Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners. Iterative Recursive Attention Model for Interpretable Sequence Classific …

Internal node bagging

Title Internal node bagging
Authors Shun Yi
Abstract We introduce a novel view of how dropout works as an implicit ensemble learning method, one that does not specify how many or which nodes learn a given feature. We propose a new training method named internal node bagging, which explicitly forces a group of nodes to learn a given feature at training time and combines those nodes into one node at inference time. This means we can use many more parameters to improve the model’s fitting ability at training time while keeping the model small at inference time. We test our method on several benchmark datasets and find that it performs significantly better than dropout on small models.
Tasks
Published 2018-05-01
URL http://arxiv.org/abs/1805.00215v5
PDF http://arxiv.org/pdf/1805.00215v5.pdf
PWC https://paperswithcode.com/paper/internal-node-bagging
Repo
Framework
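
The abstract does not spell out how a group of nodes is merged, so the following is only a minimal sketch under stated assumptions: each node in a group is a linear unit, nodes are kept independently with probability keep_prob during training, the group output is the sum of the kept nodes, and merging means summing the weights and scaling by keep_prob. All function names are hypothetical. Under these assumptions the merged node reproduces the expected training-time output exactly.

```python
from itertools import product


def node_output(weights, x):
    """Linear pre-activation of a single node."""
    return sum(w * xi for w, xi in zip(weights, x))


def group_train_output(group, x, mask):
    """Training-time output of a group: sum of the nodes kept by the mask."""
    return sum(m * node_output(w, x) for w, m in zip(group, mask))


def merge_group(group, keep_prob):
    """Assumed merge rule: one node whose weights are keep_prob * sum of group weights."""
    n_inputs = len(group[0])
    return [keep_prob * sum(w[j] for w in group) for j in range(n_inputs)]


def expected_train_output(group, x, keep_prob):
    """Exact expectation over all 2^k keep/drop masks of the group."""
    total = 0.0
    for mask in product([0, 1], repeat=len(group)):
        prob = 1.0
        for m in mask:
            prob *= keep_prob if m else (1.0 - keep_prob)
        total += prob * group_train_output(group, x, mask)
    return total


# Toy group of three nodes with two inputs each (values are illustrative only).
group = [[0.5, -1.0], [1.5, 0.25], [-0.5, 2.0]]
x = [2.0, 1.0]
keep_prob = 0.5

merged = merge_group(group, keep_prob)
merged_out = node_output(merged, x)
expected_out = expected_train_output(group, x, keep_prob)
```

The point of the sketch is only that three trained nodes collapse into one inference node with the same number of inputs; whether the paper merges before or after the nonlinearity is not stated in the abstract.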

Distance Measure Machines

Title Distance Measure Machines
Authors Alain Rakotomamonjy, Abraham Traoré, Maxime Berar, Rémi Flamary, Nicolas Courty
Abstract This paper presents a distance-based discriminative framework for learning with probability distributions. Instead of using kernel mean embeddings or generalized radial basis kernels, we introduce embeddings based on dissimilarity of distributions to some reference distributions denoted as templates. Our framework extends the theory of similarity of Balcan et al. (2008) to the population distribution case and we show that, for some learning problems, some dissimilarity on distribution achieves low-error linear decision functions with high probability. Our key result is to prove that the theory also holds for empirical distributions. Algorithmically, the proposed approach consists in computing a mapping based on pairwise dissimilarity where learning a linear decision function is amenable. Our experimental results show that the Wasserstein distance embedding performs better than kernel mean embeddings and computing Wasserstein distance is far more tractable than estimating pairwise Kullback-Leibler divergence of empirical distributions.
Tasks
Published 2018-03-01
URL http://arxiv.org/abs/1803.00250v3
PDF http://arxiv.org/pdf/1803.00250v3.pdf
PWC https://paperswithcode.com/paper/distance-measure-machines
Repo
Framework
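
As a rough illustration of the template-based embedding described in the abstract, the sketch below maps an empirical distribution to the vector of its distances to a few template samples. It assumes one-dimensional samples of equal size with uniform weights, for which the Wasserstein-1 distance reduces to the mean absolute difference of the sorted values. The names wasserstein_1d and embed are hypothetical, and the linear classifier that would be trained on the embeddings is omitted.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes samples of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def embed(sample, templates):
    """Represent a sample by its distances to each template distribution."""
    return [wasserstein_1d(sample, t) for t in templates]


# Two illustrative templates and one sample to embed.
templates = [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]
sample = [3.0, 1.0, 2.0]
features = embed(sample, templates)
```

Each distribution becomes a fixed-length feature vector, one coordinate per template, which is what makes a linear decision function applicable.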

On PAC-Bayesian Bounds for Random Forests

Title On PAC-Bayesian Bounds for Random Forests
Authors Stephan Sloth Lorenzen, Christian Igel, Yevgeny Seldin
Abstract Existing guarantees in terms of rigorous upper bounds on the generalization error for the original random forest algorithm, one of the most frequently used machine learning methods, are unsatisfying. We discuss and evaluate various PAC-Bayesian approaches to derive such bounds. The bounds do not require additional hold-out data, because the out-of-bag samples from the bagging in the training process can be exploited. A random forest predicts by taking a majority vote of an ensemble of decision trees. The first approach is to bound the error of the vote by twice the error of the corresponding Gibbs classifier (classifying with a single member of the ensemble selected at random). However, this approach does not take into account the effect of averaging out of errors of individual classifiers when taking the majority vote. This effect provides a significant boost in performance when the errors are independent or negatively correlated, but when the correlations are strong the advantage from taking the majority vote is small. The second approach based on PAC-Bayesian C-bounds takes dependencies between ensemble members into account, but it requires estimating correlations between the errors of the individual classifiers. When the correlations are high or the estimation is poor, the bounds degrade. In our experiments, we compute generalization bounds for random forests on various benchmark data sets. Because the individual decision trees already perform well, their predictions are highly correlated and the C-bounds do not lead to satisfactory results. For the same reason, the bounds based on the analysis of Gibbs classifiers are typically superior and often reasonably tight. Bounds based on a validation set coming at the cost of a smaller training set gave better performance guarantees, but worse performance in most experiments.
Tasks
Published 2018-10-23
URL http://arxiv.org/abs/1810.09746v2
PDF http://arxiv.org/pdf/1810.09746v2.pdf
PWC https://paperswithcode.com/paper/on-pac-bayesian-bounds-for-random-forests
Repo
Framework
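
The first approach in the abstract rests on the fact that the majority-vote error is at most twice the Gibbs error. The sketch below checks this relation on a toy set of binary predictions. It computes empirical errors only; the PAC-Bayesian bounds in the paper add a complexity term and use out-of-bag samples, neither of which is modelled here. The data and function names are illustrative assumptions.

```python
def gibbs_error(preds, labels):
    """Average error of a single ensemble member drawn uniformly at random."""
    errors = [
        sum(p != y for p, y in zip(tree, labels)) / len(labels)
        for tree in preds
    ]
    return sum(errors) / len(errors)


def majority_vote_error(preds, labels):
    """Error of the uniform majority vote over binary (0/1) predictions."""
    n_trees = len(preds)
    wrong = 0
    for i, y in enumerate(labels):
        votes_for_one = sum(tree[i] for tree in preds)
        vote = 1 if 2 * votes_for_one > n_trees else 0
        wrong += vote != y
    return wrong / len(labels)


# Three toy "trees" predicting five binary labels.
labels = [1, 0, 1, 1, 0]
preds = [
    [1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
]

gibbs = gibbs_error(preds, labels)
vote = majority_vote_error(preds, labels)
```

The factor of two is loose when the members' errors cancel in the vote, which is the effect the second, C-bound-based approach tries to capture.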

Geometric Median Shapes

Title Geometric Median Shapes
Authors Alexandre Cunha
Abstract We present an algorithm to compute the geometric median of shapes which is based on the extension of median to high dimensions. The median finding problem is formulated as an optimization over distances and it is solved directly using the watershed method as an optimizer. We show that computing the geometric median of shapes is robust in the presence of outliers and it is superior to the mean shape which can easily be affected by the presence of outliers. The geometric median shape thus faithfully represents the true central tendency of the data, contaminated or not. Our approach can be applied to manifold and non-manifold shapes, with connected or disconnected shapes. The application of distance transforms and the watershed algorithm, two well-established constructs of image processing, leads to an algorithm that can be quickly implemented to generate fast solutions with linear storage requirements. We demonstrate our methods on synthetic and natural shapes and compare median and mean results under increasing contamination by strong outliers.
Tasks
Published 2018-10-29
URL http://arxiv.org/abs/1810.12445v3
PDF http://arxiv.org/pdf/1810.12445v3.pdf
PWC https://paperswithcode.com/paper/geometric-median-shapes
Repo
Framework
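
To illustrate why a geometric median resists outliers better than a mean, the sketch below computes both for a small set of 2-D points. Note that it uses Weiszfeld's iteration on points, which is a swapped-in technique chosen for brevity; the paper works on shapes and uses distance transforms with a watershed optimizer. The point set and function names are assumptions.

```python
import math


def mean_point(points):
    """Arithmetic mean of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def geometric_median(points, iterations=200, eps=1e-12):
    """Weiszfeld's iteration: minimise the sum of Euclidean distances."""
    x, y = mean_point(points)
    for _ in range(iterations):
        wsum = xs = ys = 0.0
        for px, py in points:
            # Weight each point by the inverse of its distance to the estimate;
            # eps guards against division by zero at a data point.
            w = 1.0 / max(math.hypot(px - x, py - y), eps)
            wsum += w
            xs += w * px
            ys += w * py
        x, y = xs / wsum, ys / wsum
    return (x, y)


# Four corners of a unit square plus one strong outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
mean = mean_point(points)
median = geometric_median(points)
```

With this data the mean is dragged far outside the square by the single outlier, while the geometric median stays inside it.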

Degree based Classification of Harmful Speech using Twitter Data

Title Degree based Classification of Harmful Speech using Twitter Data
Authors Sanjana Sharma, Saksham Agrawal, Manish Shrivastava
Abstract Harmful speech takes various forms and has been plaguing social media in different ways. To crack down on different degrees of hate speech and abusive behavior within it, classification needs to be based on complex ramifications that must be defined and accounted for, beyond whether the speech is racist, sexist, or directed against a particular group or community. This paper primarily describes how we created an ontological classification of harmful speech based on degree of hateful intent and used it to annotate Twitter data accordingly. The key contribution of this paper is the new dataset of tweets we created based on ontological classes and degrees of harmful speech found in the text. We also propose a supervised classification system for recognizing these harmful speech classes in text.
Tasks
Published 2018-06-11
URL http://arxiv.org/abs/1806.04197v1
PDF http://arxiv.org/pdf/1806.04197v1.pdf
PWC https://paperswithcode.com/paper/degree-based-classification-of-harmful-speech
Repo
Framework

Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners

Title Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners
Authors Yuxin Chen, Adish Singla, Oisin Mac Aodha, Pietro Perona, Yisong Yue
Abstract In real-world applications of education, an effective teacher adaptively chooses the next example to teach based on the learner’s current state. However, most existing work in algorithmic machine teaching focuses on the batch setting, where adaptivity plays no role. In this paper, we study the case of teaching consistent, version space learners in an interactive setting. At any time step, the teacher provides an example, the learner performs an update, and the teacher observes the learner’s new state. We highlight that adaptivity does not speed up the teaching process when considering existing models of version space learners, such as “worst-case” (the learner picks the next hypothesis randomly from the version space) and “preference-based” (the learner picks a hypothesis according to some global preference). Inspired by human teaching, we propose a new model where the learner picks hypotheses according to some local preference defined by the current hypothesis. We show that our model exhibits several desirable properties, e.g., adaptivity plays a key role, and the learner’s transitions over hypotheses are smooth/interpretable. We develop efficient teaching algorithms and demonstrate our results via simulation and user studies.
Tasks
Published 2018-02-14
URL http://arxiv.org/abs/1802.05190v3
PDF http://arxiv.org/pdf/1802.05190v3.pdf
PWC https://paperswithcode.com/paper/understanding-the-role-of-adaptivity-in
Repo
Framework
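
The version space update that the abstract refers to can be sketched as follows. The example assumes a toy hypothesis class of integer thresholds and a greedy teacher that shows whichever example removes the most inconsistent hypotheses. This is a generic illustration of teaching a version space learner, not the local-preference model or the algorithms proposed in the paper; all names are hypothetical.

```python
def predict(threshold, x):
    """Threshold hypothesis: label 1 if x is at or above the threshold."""
    return 1 if x >= threshold else 0


def update_version_space(version_space, example):
    """Keep only the hypotheses consistent with the labelled example."""
    x, label = example
    return [h for h in version_space if predict(h, x) == label]


def greedy_teach(hypotheses, target, pool):
    """Repeatedly show the example that shrinks the version space the most."""
    version_space = list(hypotheses)
    shown = []
    while len(version_space) > 1:
        best_x = min(
            pool,
            key=lambda x: len(
                update_version_space(version_space, (x, predict(target, x)))
            ),
        )
        example = (best_x, predict(target, best_x))
        remaining = update_version_space(version_space, example)
        if len(remaining) == len(version_space):
            break  # no example in the pool helps any further
        version_space = remaining
        shown.append(example)
    return shown, version_space


# Thresholds 0..5 as hypotheses, inputs 0..4 as the teaching pool, target 3.
shown, final_space = greedy_teach(range(6), target=3, pool=range(5))
```

Two examples on either side of the target threshold are enough here, which matches the intuition that a good teacher picks examples near the decision boundary.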

Iterative Recursive Attention Model for Interpretable Sequence Classification

Title Iterative Recursive Attention Model for Interpretable Sequence Classification
Authors Martin Tutek, Jan Šnajder
Abstract Natural language processing has greatly benefited from the introduction of the attention mechanism. However, standard attention models are of limited interpretability for tasks that involve a series of inference steps. We describe an iterative recursive attention model, which constructs incremental representations of input data through reusing results of previously computed queries. We train our model on sentiment classification datasets and demonstrate its capacity to identify and combine different aspects of the input in an easily interpretable manner, while obtaining performance close to the state of the art.
Tasks Sentiment Analysis
Published 2018-08-30
URL http://arxiv.org/abs/1808.10503v1
PDF http://arxiv.org/pdf/1808.10503v1.pdf
PWC https://paperswithcode.com/paper/iterative-recursive-attention-model-for
Repo
Framework

Machine Learning CICY Threefolds

Title Machine Learning CICY Threefolds
Authors Kieran Bull, Yang-Hui He, Vishnu Jejjala, Challenger Mishra
Abstract The latest techniques from Neural Networks and Support Vector Machines (SVM) are used to investigate geometric properties of Complete Intersection Calabi-Yau (CICY) threefolds, a class of manifolds that facilitate string model building. An advanced neural network classifier and SVM are employed to (1) learn Hodge numbers and report a remarkable improvement over previous efforts, (2) query for favourability, and (3) predict discrete symmetries, a highly imbalanced problem to which both Synthetic Minority Oversampling Technique (SMOTE) and permutations of the CICY matrix are used to decrease the class imbalance and improve performance. In each case study, we employ a genetic algorithm to optimise the hyperparameters of the neural network. We demonstrate that our approach provides quick diagnostic tools capable of shortlisting quasi-realistic string models based on compactification over smooth CICYs and further supports the paradigm that classes of problems in algebraic geometry can be machine learned.
Tasks
Published 2018-06-08
URL http://arxiv.org/abs/1806.03121v3
PDF http://arxiv.org/pdf/1806.03121v3.pdf
PWC https://paperswithcode.com/paper/machine-learning-cicy-threefolds
Repo
Framework
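
The abstract mentions SMOTE as one of the two ways class imbalance is reduced. A minimal sketch of the core interpolation step is given below, assuming small tuples of floats as minority-class samples. The data, the parameter defaults, and the function name are illustrative; the permutations of the CICY matrix that the paper also uses are not shown.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create synthetic minority samples by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + u * (o - b) for b, o in zip(base, other)))
    return synthetic


# Four illustrative minority-class points on the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6)
```

Every synthetic point lies on a segment between two existing minority samples, so the oversampled class stays within the region the real samples already occupy.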

Deep Multimodal Learning for Emotion Recognition in Spoken Language

Title Deep Multimodal Learning for Emotion Recognition in Spoken Language
Authors Yue Gu, Shuhong Chen, Ivan Marsic
Abstract In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language. Our architecture has two distinctive characteristics. First, it extracts the high-level features from both text and audio via a hybrid deep multimodal structure, which considers the spatial information from text, temporal information from audio, and high-level associations from low-level handcrafted features. Second, we fuse all features by using a three-layer deep neural network to learn the correlations across modalities and train the feature extraction and fusion modules together, allowing optimal global fine-tuning of the entire structure. We evaluated the proposed framework on the IEMOCAP dataset. Our result shows promising performance, achieving 60.4% in weighted accuracy for five emotion categories.
Tasks Emotion Recognition
Published 2018-02-22
URL http://arxiv.org/abs/1802.08332v1
PDF http://arxiv.org/pdf/1802.08332v1.pdf
PWC https://paperswithcode.com/paper/deep-multimodal-learning-for-emotion
Repo
Framework

The Mismatch Principle: The Generalized Lasso Under Large Model Uncertainties

Title The Mismatch Principle: The Generalized Lasso Under Large Model Uncertainties
Authors Martin Genzel, Gitta Kutyniok
Abstract We study the estimation capacity of the generalized Lasso, i.e., least squares minimization combined with a (convex) structural constraint. While Lasso-type estimators were originally designed for noisy linear regression problems, it has recently turned out that they are in fact robust against various types of model uncertainties and misspecifications, most notably, non-linearly distorted observation models. This work provides more theoretical evidence for this somewhat astonishing phenomenon. At the heart of our analysis stands the mismatch principle, which is a simple recipe to establish theoretical error bounds for the generalized Lasso. The associated estimation guarantees are of independent interest and are formulated in a fairly general setup, permitting arbitrary sub-Gaussian data, possibly with strongly correlated feature designs; in particular, we do not assume a specific observation model which connects the input and output variables. Although the mismatch principle is conceived based on ideas from statistical learning theory, its actual application areas are (high-dimensional) estimation tasks for semi-parametric models. In this context, the benefits of the mismatch principle are demonstrated for a variety of popular problem classes, such as single-index models, generalized linear models, and variable selection. Apart from that, our findings are also relevant to recent advances in quantized and distributed compressed sensing.
Tasks
Published 2018-08-20
URL https://arxiv.org/abs/1808.06329v2
PDF https://arxiv.org/pdf/1808.06329v2.pdf
PWC https://paperswithcode.com/paper/the-mismatch-principle-statistical-learning
Repo
Framework
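
For reference, "least squares minimization combined with a (convex) structural constraint" is commonly written as below. The notation is an assumption, not taken from the paper: (x_i, y_i) are the n observed input-output pairs, beta is the parameter vector, and K is a convex constraint set such as a scaled l1-ball.

```latex
\hat{\beta} \in \operatorname*{arg\,min}_{\beta \in K}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - \langle x_i, \beta \rangle \bigr)^2,
\qquad K \subset \mathbb{R}^p \ \text{convex}.
```

Choosing K as an l1-ball gives the classical constrained Lasso; the estimator itself makes no reference to how y_i was generated from x_i, which is why it can be studied without assuming a specific observation model.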

Imagination Based Sample Construction for Zero-Shot Learning

Title Imagination Based Sample Construction for Zero-Shot Learning
Authors Gang Yang, Jinlu Liu, Xirong Li
Abstract Zero-shot learning (ZSL), which aims to recognize unseen classes with no labeled training samples, efficiently tackles the problem of missing labeled data in image retrieval. Nowadays there are mainly two types of popular methods for ZSL to recognize images of unseen classes: probabilistic reasoning and feature projection. Different from these existing types of methods, we propose a new method, sample construction, to deal with the problem of ZSL. Our proposed method, called Imagination Based Sample Construction (IBSC), innovatively constructs image samples of target classes in feature space by mimicking the human associative cognition process. Based on an association between attribute and feature, target samples are constructed from different parts of various samples. Furthermore, dissimilarity representation is employed to select high-quality constructed samples which are used as labeled data to train a specific classifier for those unseen classes. In this way, zero-shot learning is turned into a supervised learning problem. As far as we know, it is the first work to construct samples for ZSL; thus, our work can be viewed as a baseline for future sample construction methods. Experiments on four benchmark datasets show the superiority of our proposed method.
Tasks Image Retrieval, Zero-Shot Learning
Published 2018-10-29
URL http://arxiv.org/abs/1810.12145v1
PDF http://arxiv.org/pdf/1810.12145v1.pdf
PWC https://paperswithcode.com/paper/imagination-based-sample-construction-for
Repo
Framework

FaceFeat-GAN: a Two-Stage Approach for Identity-Preserving Face Synthesis

Title FaceFeat-GAN: a Two-Stage Approach for Identity-Preserving Face Synthesis
Authors Yujun Shen, Bolei Zhou, Ping Luo, Xiaoou Tang
Abstract The advance of Generative Adversarial Networks (GANs) enables realistic face image synthesis. However, synthesizing face images that preserve facial identity as well as have high diversity within each identity remains challenging. To address this problem, we present FaceFeat-GAN, a novel generative model that improves both image quality and diversity by using two stages. Unlike existing single-stage models that map random noise to an image directly, our two-stage synthesis includes the first stage of diverse feature generation and the second stage of feature-to-image rendering. The competitions between generators and discriminators are carefully designed in both stages with different objective functions. Specifically, in the first stage, they compete in the feature domain to synthesize various facial features rather than images. In the second stage, they compete in the image domain to render photo-realistic images that contain high diversity but preserve identity. Extensive experiments show that FaceFeat-GAN generates images that not only retain identity information but also have high diversity and quality, significantly outperforming previous methods.
Tasks Face Generation, Image Generation
Published 2018-12-04
URL http://arxiv.org/abs/1812.01288v1
PDF http://arxiv.org/pdf/1812.01288v1.pdf
PWC https://paperswithcode.com/paper/facefeat-gan-a-two-stage-approach-for
Repo
Framework

In-Session Personalization for Talent Search

Title In-Session Personalization for Talent Search
Authors Sahin Cem Geyik, Vijay Dialani, Meng Meng, Ryan Smith
Abstract Previous efforts in recommending candidates for talent search followed the general pattern of receiving initial search criteria and generating a set of candidates using a pre-trained model. Traditionally, the generated recommendations are final; that is, the list of potential candidates is not modified unless the user explicitly changes his/her search criteria. In this paper, we propose a candidate recommendation model which takes into account the immediate feedback of the user and updates the candidate recommendations at each step. This setting also allows for very uninformative initial search queries, since we pinpoint the user’s intent from the feedback during the search session. To achieve our goal, we employ an intent clustering method based on topic modeling which separates the candidate space into meaningful, possibly overlapping, subsets (which we call intent clusters) for each position. On top of the candidate segments, we apply a multi-armed bandit approach to choose which intent cluster is more appropriate for the current session. We also present an online learning scheme which updates the intent clusters within the session, based on user feedback, to achieve further personalization. Our offline experiments as well as the results from the online deployment of our solution demonstrate the benefits of our proposed methodology.
Tasks
Published 2018-09-18
URL http://arxiv.org/abs/1809.06488v1
PDF http://arxiv.org/pdf/1809.06488v1.pdf
PWC https://paperswithcode.com/paper/in-session-personalization-for-talent-search
Repo
Framework
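
The abstract does not say which multi-armed bandit algorithm is used, so the sketch below picks UCB1 purely as an illustration of choosing among intent clusters from in-session feedback. Each arm stands for one intent cluster, and a simulated click with a fixed probability stands in for user feedback. The click probabilities, the function name, and the simulation itself are assumptions.

```python
import math
import random


def ucb1(click_probs, rounds, seed=0):
    """Run UCB1 against simulated Bernoulli feedback; return pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(click_probs)
    pulls = [0] * n_arms
    rewards = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once before using the index
        else:
            # Pick the arm with the highest mean reward plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: rewards[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < click_probs[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward
    return pulls


# Three hypothetical intent clusters; the second matches the user's intent best.
click_probs = [0.1, 0.8, 0.3]
rounds = 2000
pulls = ucb1(click_probs, rounds)
```

After a short exploration phase most recommendations come from the cluster with the best feedback, which is the behaviour the abstract describes for narrowing down an uninformative initial query.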

Zero and Few Shot Learning with Semantic Feature Synthesis and Competitive Learning

Title Zero and Few Shot Learning with Semantic Feature Synthesis and Competitive Learning
Authors Zhiwu Lu, Jiechao Guan, Aoxue Li, Tao Xiang, An Zhao, Ji-Rong Wen
Abstract Zero-shot learning (ZSL) is made possible by learning a projection function between a feature space and a semantic space (e.g., an attribute space). Key to ZSL is thus to learn a projection that is robust against the often large domain gap between the seen and unseen class domains. In this work, this is achieved by unseen class data synthesis and robust projection function learning. Specifically, a novel semantic data synthesis strategy is proposed, by which semantic class prototypes (e.g., attribute vectors) are used to simply perturb seen class data for generating unseen class ones. As in any data synthesis/hallucination approach, there are ambiguities and uncertainties on how well the synthesised data can capture the targeted unseen class data distribution. To cope with this, the second contribution of this work is a novel projection learning model termed competitive bidirectional projection learning (BPL) designed to best utilise the ambiguous synthesised data. Specifically, we assume that each synthesised data point can belong to any unseen class; and the most likely two class candidates are exploited to learn a robust projection function in a competitive fashion. As a third contribution, we show that the proposed ZSL model can be easily extended to few-shot learning (FSL) by again exploiting semantic (class prototype guided) feature synthesis and competitive BPL. Extensive experiments show that our model achieves the state-of-the-art results on both problems.
Tasks Few-Shot Learning, Zero-Shot Learning
Published 2018-10-19
URL http://arxiv.org/abs/1810.08332v1
PDF http://arxiv.org/pdf/1810.08332v1.pdf
PWC https://paperswithcode.com/paper/zero-and-few-shot-learning-with-semantic
Repo
Framework

Boundary Optimizing Network (BON)

Title Boundary Optimizing Network (BON)
Authors Marco Singh, Akshay Pai
Abstract Despite all the success that deep neural networks have seen in classifying certain datasets, the challenge of finding optimal solutions that generalize still remains. In this paper, we propose the Boundary Optimizing Network (BON), a new approach to generalization for deep neural networks when used for supervised learning. Given a classification network, we propose to use a collaborative generative network that produces new synthetic data points in the form of perturbations of original data points. In this way, we create a data support around each original data point which prevents decision boundaries from passing too close to the original data points, i.e. prevents overfitting. We show that BON improves convergence on CIFAR-10 using the state-of-the-art Densenet. We do however observe that the generative network suffers from catastrophic forgetting during training, and we therefore propose to use a variation of Memory Aware Synapses to optimize the generative network (called BON++). On the Iris dataset, we visualize the effect of BON++ when the generator does not suffer from catastrophic forgetting and conclude that the approach has the potential to create better boundaries in a higher dimensional space.
Tasks
Published 2018-01-08
URL http://arxiv.org/abs/1801.02642v3
PDF http://arxiv.org/pdf/1801.02642v3.pdf
PWC https://paperswithcode.com/paper/boundary-optimizing-network-bon
Repo
Framework