Paper Group ANR 1023
How to evaluate sentiment classifiers for Twitter time-ordered data?
Title | How to evaluate sentiment classifiers for Twitter time-ordered data? |
Authors | Igor Mozetič, Luis Torgo, Vitor Cerqueira, Jasmina Smailović |
Abstract | Social media are becoming an increasingly important source of information about the public mood regarding issues such as elections, Brexit, stock market, etc. In this paper we focus on sentiment classification of Twitter data. Construction of sentiment classifiers is a standard text mining task, but here we address the question of how to properly evaluate them as there is no settled way to do so. Sentiment classes are ordered and unbalanced, and Twitter produces a stream of time-ordered data. The problem we address concerns the procedures used to obtain reliable estimates of performance measures, and whether the temporal ordering of the training and test data matters. We collected a large set of 1.5 million tweets in 13 European languages. We created 138 sentiment models and out-of-sample datasets, which are used as a gold standard for evaluations. The corresponding 138 in-sample datasets are used to empirically compare six different estimation procedures: three variants of cross-validation, and three variants of sequential validation (where test set always follows the training set). We find no significant difference between the best cross-validation and sequential validation. However, we observe that all cross-validation variants tend to overestimate the performance, while the sequential methods tend to underestimate it. Standard cross-validation with random selection of examples is significantly worse than the blocked cross-validation, and should not be used to evaluate classifiers in time-ordered data scenarios. |
Tasks | Sentiment Analysis |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05160v1 (PDF: http://arxiv.org/pdf/1803.05160v1.pdf) |
PWC | https://paperswithcode.com/paper/how-to-evaluate-sentiment-classifiers-for |
Repo | |
Framework | |
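To make the compared estimation procedures concrete, here is a minimal scikit-learn sketch of the three families: random cross-validation, blocked cross-validation, and sequential validation. The classifier, feature matrix, and fold counts are placeholders rather than the paper's setup; the only fixed idea is that the rows are sorted by tweet timestamp.

```python
# Sketch: random CV vs. blocked CV vs. sequential validation on time-ordered data.
# Assumes the rows of X, y are already sorted by tweet timestamp.
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))          # placeholder features (e.g. n-gram counts)
y = rng.integers(0, 3, size=1000)        # ordered sentiment classes: 0=neg, 1=neu, 2=pos

clf = LogisticRegression(max_iter=1000)

# 1) Standard CV: examples shuffled at random -- leaks future tweets into training.
random_cv = KFold(n_splits=10, shuffle=True, random_state=0)
# 2) Blocked CV: contiguous, unshuffled folds preserve local temporal structure.
blocked_cv = KFold(n_splits=10, shuffle=False)
# 3) Sequential validation: the test fold always follows the training data in time.
sequential = TimeSeriesSplit(n_splits=10)

for name, cv in [("random CV", random_cv), ("blocked CV", blocked_cv),
                 ("sequential", sequential)]:
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name:12s} mean accuracy = {scores.mean():.3f}")
```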
Automatic Coding for Neonatal Jaundice From Free Text Data Using Ensemble Methods
Title | Automatic Coding for Neonatal Jaundice From Free Text Data Using Ensemble Methods |
Authors | Scott Werwath |
Abstract | This study explores the creation of a machine learning model to automatically identify whether a Neonatal Intensive Care Unit (NICU) patient was diagnosed with neonatal jaundice during a particular hospitalization based on their associated clinical notes. We develop a number of techniques for text preprocessing and feature selection and compare the effectiveness of different classification models. We show that using ensemble decision tree classification, both with AdaBoost and with bagging, outperforms support vector machines (SVM), the current state-of-the-art technique for neonatal jaundice coding. |
Tasks | Feature Selection |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.01054v1 (PDF: http://arxiv.org/pdf/1805.01054v1.pdf) |
PWC | https://paperswithcode.com/paper/automatic-coding-for-neonatal-jaundice-from |
Repo | |
Framework | |
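A hedged sketch of the reported comparison: TF-IDF features with ensemble decision-tree classifiers (AdaBoost and bagging) against a linear SVM. The toy notes, features, and hyperparameters are illustrative assumptions, not the study's pipeline.

```python
# Sketch: bag-of-words features + AdaBoost / bagging of decision trees vs. a linear SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Placeholder corpus: one concatenated note per hospitalization, with a binary jaundice label.
notes = ["infant treated with phototherapy for hyperbilirubinemia",
         "routine feeding, no icterus noted"] * 50
labels = [1, 0] * 50

models = {
    "AdaBoost (trees)": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=200),
    "Bagging (trees)": BaggingClassifier(DecisionTreeClassifier(), n_estimators=200),
    "Linear SVM": LinearSVC(),
}
for name, model in models.items():
    pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), model)
    scores = cross_val_score(pipe, notes, labels, cv=5, scoring="f1")
    print(f"{name:16s} F1 = {scores.mean():.3f}")
```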
Prototype-based Neural Network Layers: Incorporating Vector Quantization
Title | Prototype-based Neural Network Layers: Incorporating Vector Quantization |
Authors | Sascha Saralajew, Lars Holdijk, Maike Rees, Thomas Villmann |
Abstract | Neural networks currently dominate the machine learning community and they do so for good reasons. Their accuracy on complex tasks such as image classification is unrivaled at the moment and with recent improvements they are reasonably easy to train. Nevertheless, neural networks are lacking robustness and interpretability. Prototype-based vector quantization methods on the other hand are known for being robust and interpretable. For this reason, we propose techniques and strategies to merge both approaches. This contribution will particularly highlight the similarities between them and outline how to construct a prototype-based classification layer for multilayer networks. Additionally, we provide an alternative, prototype-based, approach to the classical convolution operation. Numerical results are not part of this report, instead the focus lays on establishing a strong theoretical framework. By publishing our framework and the respective theoretical considerations and justifications before finalizing our numerical experiments we hope to jump-start the incorporation of prototype-based learning in neural networks and vice versa. |
Tasks | Image Classification, Quantization |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01214v2 (PDF: http://arxiv.org/pdf/1812.01214v2.pdf) |
PWC | https://paperswithcode.com/paper/prototype-based-neural-network-layers |
Repo | |
Framework | |
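The central construction, a classification layer that holds trainable prototypes and scores classes by negative distance, can be sketched as follows in PyTorch. One prototype per class and squared Euclidean distance are simplifying assumptions on our part.

```python
# Sketch of a prototype-based classification layer: logits are negative squared
# Euclidean distances to trainable class prototypes (one prototype per class assumed).
import torch
import torch.nn as nn

class PrototypeLayer(nn.Module):
    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Smaller distance to a prototype => larger logit for that class.
        dists = torch.cdist(x, self.prototypes, p=2) ** 2
        return -dists

# Usage: drop-in replacement for a final nn.Linear classification layer.
layer = PrototypeLayer(in_features=64, num_classes=10)
features = torch.randn(8, 64)
logits = layer(features)                      # shape (8, 10)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 10, (8,)))
loss.backward()                               # prototypes receive gradients like any weight
```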
Aggregated Learning: A Deep Learning Framework Based on Information-Bottleneck Vector Quantization
Title | Aggregated Learning: A Deep Learning Framework Based on Information-Bottleneck Vector Quantization |
Authors | Hongyu Guo, Yongyi Mao, Ali Al-Bashabsheh, Richong Zhang |
Abstract | Based on the notion of information bottleneck (IB), we formulate a quantization problem called “IB quantization”. We show that IB quantization is equivalent to learning based on the IB principle. Under this equivalence, the standard neural network models can be viewed as scalar (single sample) IB quantizers. It is known, from conventional rate-distortion theory, that scalar quantizers are inferior to vector (multi-sample) quantizers. Such a deficiency then inspires us to develop a novel learning framework, AgrLearn, that corresponds to vector IB quantizers for learning with neural networks. Unlike standard networks, AgrLearn simultaneously optimizes against multiple data samples. We experimentally verify that AgrLearn can result in significant improvements when applied to several current deep learning architectures for image recognition and text classification. We also empirically show that AgrLearn can reduce up to 80% of the training samples needed for ResNet training. |
Tasks | Image Classification, Quantization, Text Classification |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10251v3 (PDF: http://arxiv.org/pdf/1807.10251v3.pdf) |
PWC | https://paperswithcode.com/paper/aggregated-learning-a-deep-learning-framework |
Repo | |
Framework | |
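Our reading of "simultaneously optimizes against multiple data samples" is sketched below: n samples are aggregated into a single network input (here by channel concatenation) and the network predicts all n labels jointly. This is an illustration of the idea, not a verified reproduction of the AgrLearn architecture.

```python
# Rough sketch of vector (multi-sample) aggregation: n images are concatenated along
# the channel axis and the network emits n label predictions at once.
import torch
import torch.nn as nn

n = 4                                   # aggregation fold (assumed hyperparameter)
num_classes = 10

net = nn.Sequential(
    nn.Conv2d(3 * n, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, num_classes * n),     # one block of logits per aggregated sample
)

images = torch.randn(16 * n, 3, 32, 32)               # a mini-batch of single samples
labels = torch.randint(0, num_classes, (16 * n,))

x = images.view(16, 3 * n, 32, 32)                    # aggregate n samples into one input
logits = net(x).view(16 * n, num_classes)             # un-aggregate the predictions
loss = nn.CrossEntropyLoss()(logits, labels)          # joint loss over all n samples
loss.backward()
```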
Noise-adding Methods of Saliency Map as Series of Higher Order Partial Derivative
Title | Noise-adding Methods of Saliency Map as Series of Higher Order Partial Derivative |
Authors | Junghoon Seo, Jeongyeol Choe, Jamyoung Koo, Seunghyeon Jeon, Beomsu Kim, Taegyun Jeon |
Abstract | SmoothGrad and VarGrad are techniques that enhance the empirical quality of standard saliency maps by adding noise to input. However, there were few works that provide a rigorous theoretical interpretation of those methods. We analytically formalize the result of these noise-adding methods. As a result, we observe two interesting results from the existing noise-adding methods. First, SmoothGrad does not make the gradient of the score function smooth. Second, VarGrad is independent of the gradient of the score function. We believe that our findings provide a clue to reveal the relationship between local explanation methods of deep neural networks and higher-order partial derivatives of the score function. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03000v1 (PDF: http://arxiv.org/pdf/1806.03000v1.pdf) |
PWC | https://paperswithcode.com/paper/noise-adding-methods-of-saliency-map-as |
Repo | |
Framework | |
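Operationally, both methods are one line each: SmoothGrad averages input gradients over noisy copies of the input, while VarGrad takes their variance. A PyTorch sketch, with noise scale and sample count as illustrative choices:

```python
# SmoothGrad: average of input gradients over noisy copies of the input.
# VarGrad:    variance of those same gradients.
import torch

def noisy_grads(model, x, target_class, n_samples=50, sigma=0.15):
    grads = []
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy)[0, target_class]        # scalar score for the chosen class
        grad, = torch.autograd.grad(score, noisy)
        grads.append(grad)
    return torch.stack(grads)                        # (n_samples, *x.shape)

def smoothgrad(model, x, target_class, **kw):
    return noisy_grads(model, x, target_class, **kw).mean(dim=0)

def vargrad(model, x, target_class, **kw):
    return noisy_grads(model, x, target_class, **kw).var(dim=0)

# Usage with any image classifier taking a (1, C, H, W) tensor:
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.randn(1, 3, 32, 32)
print(smoothgrad(model, x, target_class=3).shape, vargrad(model, x, target_class=3).shape)
```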
Onto2Vec: joint vector-based representation of biological entities and their ontology-based annotations
Title | Onto2Vec: joint vector-based representation of biological entities and their ontology-based annotations |
Authors | Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf |
Abstract | We propose the Onto2Vec method, an approach to learn feature vectors for biological entities based on their annotations to biomedical ontologies. Our method can be applied to a wide range of bioinformatics research problems such as similarity-based prediction of interactions between proteins, classification of interaction types using supervised learning, or clustering. |
Tasks | |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1802.00864v1 (PDF: http://arxiv.org/pdf/1802.00864v1.pdf) |
PWC | https://paperswithcode.com/paper/onto2vec-joint-vector-based-representation-of |
Repo | |
Framework | |
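The abstract does not spell out the representation learning step; a common recipe for this kind of method, and our assumption here rather than a statement of the exact Onto2Vec pipeline, is to serialize ontology axioms and entity annotations as token sequences and train a skip-gram model over them. The identifiers below are purely illustrative.

```python
# Assumed sketch (gensim >= 4.0): treat each axiom / annotation statement as a "sentence"
# of tokens, then learn co-occurrence-based embeddings for proteins and ontology classes.
from gensim.models import Word2Vec

corpus = [
    # ontology axioms (illustrative GO identifiers)
    ["GO:0006915", "subClassOf", "GO:0008219"],
    ["GO:0012501", "subClassOf", "GO:0008219"],
    # entity annotations: protein hasFunction ontology class
    ["P53_HUMAN", "hasFunction", "GO:0006915"],
    ["MDM2_HUMAN", "hasFunction", "GO:0012501"],
]

model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=1, epochs=50)

# Entity vectors can now feed similarity-based interaction prediction or clustering.
print(model.wv.similarity("P53_HUMAN", "MDM2_HUMAN"))
```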
Frank-Wolfe Style Algorithms for Large Scale Optimization
Title | Frank-Wolfe Style Algorithms for Large Scale Optimization |
Authors | Lijun Ding, Madeleine Udell |
Abstract | We introduce a few variants on Frank-Wolfe style algorithms suitable for large scale optimization. We show how to modify the standard Frank-Wolfe algorithm using stochastic gradients, approximate subproblem solutions, and sketched decision variables in order to scale to enormous problems while preserving (up to constants) the optimal convergence rate $\mathcal{O}(\frac{1}{k})$. |
Tasks | |
Published | 2018-08-15 |
URL | http://arxiv.org/abs/1808.05274v1 (PDF: http://arxiv.org/pdf/1808.05274v1.pdf) |
PWC | https://paperswithcode.com/paper/frank-wolfe-style-algorithms-for-large-scale |
Repo | |
Framework | |
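For reference, the vanilla Frank-Wolfe loop that these variants build on, applied to least squares over an $\ell_1$-ball. The objective and constraint set are illustrative; the paper's stochastic gradients, approximate subproblem solutions, and sketched decision variables are not shown.

```python
# Vanilla Frank-Wolfe for min_x 0.5*||Ax - b||^2 subject to ||x||_1 <= tau.
# The linear minimization oracle over the l1-ball picks a signed, scaled basis vector.
import numpy as np

def frank_wolfe_l1(A, b, tau, iters=200):
    n = A.shape[1]
    x = np.zeros(n)
    for k in range(iters):
        grad = A.T @ (A @ x - b)
        # Linear minimization oracle: argmin_{||s||_1 <= tau} <grad, s>
        i = np.argmax(np.abs(grad))
        s = np.zeros(n)
        s[i] = -tau * np.sign(grad[i])
        gamma = 2.0 / (k + 2.0)                 # standard step size, gives the O(1/k) rate
        x = (1 - gamma) * x + gamma * s
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50))
x_true = np.zeros(50); x_true[:5] = 1.0
b = A @ x_true
print(np.round(frank_wolfe_l1(A, b, tau=5.0)[:8], 2))
```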
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
Title | InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset |
Authors | Wenbin Li, Sajad Saeedi, John McCormac, Ronald Clark, Dimos Tzoumanikas, Qing Ye, Yuzhong Huang, Rui Tang, Stefan Leutenegger |
Abstract | Datasets have gained an enormous amount of popularity in the computer vision community, from training and evaluation of Deep Learning-based methods to benchmarking Simultaneous Localization and Mapping (SLAM). Without a doubt, synthetic imagery bears a vast potential due to scalability in terms of amounts of data obtainable without tedious manual ground truth annotations or measurements. Here, we present a dataset with the aim of providing a higher degree of photo-realism, larger scale, more variability as well as serving a wider range of purposes compared to existing datasets. Our dataset leverages the availability of millions of professional interior designs and millions of production-level furniture and object assets – all coming with fine geometric details and high-resolution texture. We render high-resolution and high frame-rate video sequences following realistic trajectories while supporting various camera types as well as providing inertial measurements. Together with the release of the dataset, we will make executable program of our interactive simulator software as well as our renderer available at https://interiornetdataset.github.io. To showcase the usability and uniqueness of our dataset, we show benchmarking results of both sparse and dense SLAM algorithms. |
Tasks | Simultaneous Localization and Mapping |
Published | 2018-09-03 |
URL | http://arxiv.org/abs/1809.00716v1 (PDF: http://arxiv.org/pdf/1809.00716v1.pdf) |
PWC | https://paperswithcode.com/paper/interiornet-mega-scale-multi-sensor-photo |
Repo | |
Framework | |
PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence
Title | PARN: Pyramidal Affine Regression Networks for Dense Semantic Correspondence |
Authors | Sangryul Jeon, Seungryong Kim, Dongbo Min, Kwanghoon Sohn |
Abstract | This paper presents a deep architecture for dense semantic correspondence, called pyramidal affine regression networks (PARN), that estimates locally-varying affine transformation fields across images. To deal with intra-class appearance and shape variations that commonly exist among different instances within the same object category, we leverage a pyramidal model where affine transformation fields are progressively estimated in a coarse-to-fine manner so that the smoothness constraint is naturally imposed within deep networks. PARN estimates residual affine transformations at each level and composes them to estimate final affine transformations. Furthermore, to overcome the limitations of insufficient training data for semantic correspondence, we propose a novel weakly-supervised training scheme that generates progressive supervisions by leveraging a correspondence consistency across image pairs. Our method is fully learnable in an end-to-end manner and does not require quantizing infinite continuous affine transformation fields. To the best of our knowledge, it is the first work that attempts to estimate dense affine transformation fields in a coarse-to-fine manner within deep networks. Experimental results demonstrate that PARN outperforms the state-of-the-art methods for dense semantic correspondence on various benchmarks. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.02939v2 (PDF: http://arxiv.org/pdf/1807.02939v2.pdf) |
PWC | https://paperswithcode.com/paper/parn-pyramidal-affine-regression-networks-for |
Repo | |
Framework | |
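Per pixel, the coarse-to-fine scheme amounts to composing residual affine transforms across pyramid levels, i.e. multiplying their homogeneous 3x3 matrices. A small numeric illustration (the regression networks themselves, and the residual values, are placeholders):

```python
# Per-pixel view of the coarse-to-fine scheme: each level regresses a *residual*
# 2x3 affine transform, which is composed (in homogeneous form) with the estimate
# from the coarser level.
import numpy as np

def to_h(a):                       # 2x3 affine -> 3x3 homogeneous matrix
    return np.vstack([a, [0.0, 0.0, 1.0]])

identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])

# Illustrative residuals from three pyramid levels (coarse -> fine), as a network
# might regress them for one spatial location.
residuals = [
    np.array([[1.0, 0.0, 4.0], [0.0, 1.0, -2.0]]),      # coarse: mostly translation
    np.array([[1.05, 0.02, 0.5], [-0.02, 1.05, 0.3]]),  # mid: small scale/rotation tweak
    np.array([[1.0, 0.0, 0.1], [0.0, 1.0, -0.1]]),      # fine: sub-pixel correction
]

T = to_h(identity)
for r in residuals:
    T = to_h(r) @ T                # finer residual composed onto the current estimate
print(np.round(T[:2], 3))          # final 2x3 affine transform for this location
```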
Deep learning electromagnetic inversion with convolutional neural networks
Title | Deep learning electromagnetic inversion with convolutional neural networks |
Authors | Vladimir Puzyrev |
Abstract | Geophysical inversion attempts to estimate the distribution of physical properties in the Earth’s interior from observations collected at or above the surface. Inverse problems are commonly posed as least-squares optimization problems in high-dimensional parameter spaces. Existing approaches are largely based on deterministic gradient-based methods, which are limited by nonlinearity and nonuniqueness of the inverse problem. Probabilistic inversion methods, despite their great potential in uncertainty quantification, still remain a formidable computational task. In this paper, I explore the potential of deep learning methods for electromagnetic inversion. This approach does not require calculation of the gradient and provides results instantaneously. Deep neural networks based on fully convolutional architecture are trained on large synthetic datasets obtained by full 3-D simulations. The performance of the method is demonstrated on models of strong practical relevance representing an onshore controlled source electromagnetic CO2 monitoring scenario. The pre-trained networks can reliably estimate the position and lateral dimensions of the anomalies, as well as their resistivity properties. Several fully convolutional network architectures are compared in terms of their accuracy, generalization, and cost of training. Examples with different survey geometry and noise levels confirm the feasibility of the deep learning inversion, opening the possibility to estimate the subsurface resistivity distribution in real time. |
Tasks | |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10247v1 (PDF: http://arxiv.org/pdf/1812.10247v1.pdf) |
PWC | https://paperswithcode.com/paper/deep-learning-electromagnetic-inversion-with |
Repo | |
Framework | |
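The learned mapping is image-to-image regression from gridded EM measurements to a resistivity model; a minimal fully convolutional sketch with placeholder channel counts, grid sizes, and layers, not one of the architectures compared in the paper:

```python
# Sketch: a small fully convolutional network mapping gridded EM measurements
# (receivers x frequencies/times as channels) to a 2-D slice of resistivity.
import torch
import torch.nn as nn

class InversionFCN(nn.Module):
    def __init__(self, in_ch=8, out_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 1),           # per-cell resistivity (e.g. log-scaled)
        )

    def forward(self, x):
        return self.net(x)

model = InversionFCN()
measurements = torch.randn(4, 8, 64, 64)        # stand-in for simulated training inputs
resistivity = model(measurements)               # (4, 1, 64, 64) predicted model slices
loss = nn.functional.mse_loss(resistivity, torch.randn_like(resistivity))
loss.backward()
```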
A Variational Feature Encoding Method of 3D Object for Probabilistic Semantic SLAM
Title | A Variational Feature Encoding Method of 3D Object for Probabilistic Semantic SLAM |
Authors | H. W. Yu, B. H. Lee |
Abstract | This paper presents a feature encoding method of complex 3D objects for high-level semantic features. Recent approaches to object recognition methods become important for semantic simultaneous localization and mapping (SLAM). However, there is a lack of consideration of the probabilistic observation model for 3D objects, as the shape of a 3D object basically follows a complex probability distribution. Furthermore, since the mobile robot equipped with a range sensor observes only a single view, much information of the object shape is discarded. These limitations are the major obstacles to semantic SLAM and view-independent loop closure using 3D object shapes as features. In order to enable the numerical analysis for the Bayesian inference, we approximate the true observation model of 3D objects to tractable distributions. Since the observation likelihood can be obtained from the generative model, we formulate the true generative model for 3D object with the Bayesian networks. To capture these complex distributions, we apply a variational auto-encoder. To analyze the approximated distributions and encoded features, we perform classification with maximum likelihood estimation and shape retrieval. |
Tasks | Bayesian Inference, Object Recognition, Simultaneous Localization and Mapping |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10180v1 (PDF: http://arxiv.org/pdf/1808.10180v1.pdf) |
PWC | https://paperswithcode.com/paper/a-variational-feature-encoding-method-of-3d |
Repo | |
Framework | |
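The variational encoding follows the standard VAE recipe of predicting a mean and log-variance and sampling with the reparameterization trick; a compact sketch for a voxelized single-view observation, with grid size and layer widths as assumptions rather than the authors' network:

```python
# Sketch: variational encoder for a single-view 3-D observation, with the usual
# reparameterization trick so the latent feature can serve as a probabilistic
# object descriptor downstream (e.g. for semantic SLAM or shape retrieval).
import torch
import torch.nn as nn

class VoxelVAEEncoder(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 32 * 32, 512), nn.ReLU(),
        )
        self.mu = nn.Linear(512, latent_dim)
        self.logvar = nn.Linear(512, latent_dim)

    def forward(self, voxels):
        h = self.backbone(voxels)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization
        return z, mu, logvar

enc = VoxelVAEEncoder()
z, mu, logvar = enc(torch.rand(2, 1, 32, 32, 32))
# KL term of the evidence lower bound against a standard normal prior:
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
print(z.shape, kl.item())
```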
When is there a Representer Theorem? Reflexive Banach spaces
Title | When is there a Representer Theorem? Reflexive Banach spaces |
Authors | Kevin Schlegel |
Abstract | We consider a general regularised interpolation problem for learning a parameter vector from data. The well known representer theorem says that under certain conditions on the regulariser there exists a solution in the linear span of the data points. This is at the core of kernel methods in machine learning as it makes the problem computationally tractable. Most literature deals only with sufficient conditions for representer theorems in Hilbert spaces. We prove necessary and sufficient conditions for the existence of representer theorems in reflexive Banach spaces and illustrate why in a sense reflexivity is the minimal requirement on the function space. We further show that if the learning relies on the linear representer theorem, then the solution is independent of the regulariser and in fact determined by the function space alone. This in particular shows the value of generalising Hilbert space learning theory to Banach spaces. |
Tasks | |
Published | 2018-09-26 |
URL | https://arxiv.org/abs/1809.10284v2 (PDF: https://arxiv.org/pdf/1809.10284v2.pdf) |
PWC | https://paperswithcode.com/paper/when-is-there-a-representer-theorem-reflexive |
Repo | |
Framework | |
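For context, the classical Hilbert-space statement that the paper generalizes: for regularised interpolation in an RKHS $\mathcal{H}$ with kernel $k$, some minimiser lies in the span of the kernel functions at the data points. (This is the standard formulation, not the paper's Banach-space result.)

```latex
\min_{f \in \mathcal{H}} \|f\|_{\mathcal{H}}
\quad \text{s.t.} \quad f(x_i) = y_i, \; i = 1, \dots, m
\qquad \Longrightarrow \qquad
f^{*}(\cdot) = \sum_{i=1}^{m} c_i \, k(x_i, \cdot), \quad c_i \in \mathbb{R}.
```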
An Algorithmic Perspective on Imitation Learning
Title | An Algorithmic Perspective on Imitation Learning |
Authors | Takayuki Osa, Joni Pajarinen, Gerhard Neumann, J. Andrew Bagnell, Pieter Abbeel, Jan Peters |
Abstract | As robots and other intelligent agents move from simple environments and problems to more complex, unstructured settings, manually programming their behavior has become increasingly challenging and expensive. Often, it is easier for a teacher to demonstrate a desired behavior rather than attempt to manually engineer it. This process of learning from demonstrations, and the study of algorithms to do so, is called imitation learning. This work provides an introduction to imitation learning. It covers the underlying assumptions, approaches, and how they relate; the rich set of algorithms developed to tackle the problem; and advice on effective tools and implementation. We intend this paper to serve two audiences. First, we want to familiarize machine learning experts with the challenges of imitation learning, particularly those arising in robotics, and the interesting theoretical and practical distinctions between it and more familiar frameworks like statistical supervised learning theory and reinforcement learning. Second, we want to give roboticists and experts in applied artificial intelligence a broader appreciation for the frameworks and tools available for imitation learning. |
Tasks | Imitation Learning |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06711v1 (PDF: http://arxiv.org/pdf/1811.06711v1.pdf) |
PWC | https://paperswithcode.com/paper/an-algorithmic-perspective-on-imitation |
Repo | |
Framework | |
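As a concrete anchor for the problem setting, the simplest member of the family, behavioural cloning, treats demonstrations as a supervised dataset of state-action pairs. The toy sketch below is ours, not an algorithm taken from the survey.

```python
# Behavioural cloning: fit a policy directly to demonstrated (state, action) pairs.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
states = rng.normal(size=(2000, 6))                 # e.g. joint angles and velocities
expert_actions = states @ rng.normal(size=(6, 2))   # stand-in for a teacher's demonstrations

policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
policy.fit(states, expert_actions)

# At deployment the learned policy maps a new state to an action.
print(policy.predict(states[:1]))
```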
ReDecode Framework for Iterative Improvement in Paraphrase Generation
Title | ReDecode Framework for Iterative Improvement in Paraphrase Generation |
Authors | Milan Aggarwal, Nupur Kumari, Ayush Bansal, Balaji Krishnamurthy |
Abstract | Generating paraphrases, that is, different variations of a sentence conveying the same meaning, is an important yet challenging task in NLP. Automatically generating paraphrases has its utility in many NLP tasks like question answering, information retrieval, conversational systems to name a few. In this paper, we introduce iterative refinement of generated paraphrases within VAE based generation framework. Current sequence generation models lack the capability to (1) make improvements once the sentence is generated; (2) rectify errors made while decoding. We propose a technique to iteratively refine the output using multiple decoders, each one attending on the output sentence generated by the previous decoder. We improve current state of the art results significantly - with over 9% and 28% absolute increase in METEOR scores on Quora question pairs and MSCOCO datasets respectively. We also show qualitatively through examples that our re-decoding approach generates better paraphrases compared to a single decoder by rectifying errors and making improvements in paraphrase structure, inducing variations and introducing new but semantically coherent information. |
Tasks | Information Retrieval, Paraphrase Generation, Question Answering |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04454v1 (PDF: http://arxiv.org/pdf/1811.04454v1.pdf) |
PWC | https://paperswithcode.com/paper/redecode-framework-for-iterative-improvement |
Repo | |
Framework | |
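A rough sketch of the chained-decoder idea: each decoder attends over the output sentence produced by the previous one. Transformer modules are used here purely for brevity; the paper builds the refinement chain inside a VAE-based generation framework, which is not reproduced.

```python
# Sketch: iterative refinement with multiple decoders, where decoder k+1 uses the
# (greedy) output sentence of decoder k as its attention memory.
import torch
import torch.nn as nn

class ReDecodeSketch(nn.Module):
    def __init__(self, vocab_size=1000, d_model=256, num_decoders=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
        self.decoders = nn.ModuleList([
            nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
            for _ in range(num_decoders)])
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        memory = self.encoder(self.embed(src_ids))       # encode the source sentence once
        all_logits = []
        for dec in self.decoders:
            hidden = dec(self.embed(tgt_ids), memory)    # teacher-forced decoding pass
            logits = self.proj(hidden)
            all_logits.append(logits)
            # The next decoder attends over this decoder's greedy output sentence.
            memory = self.embed(logits.argmax(dim=-1))
        return all_logits                                # one progressively refined output per decoder

model = ReDecodeSketch()
outs = model(torch.randint(0, 1000, (2, 12)), torch.randint(0, 1000, (2, 12)))
print([o.shape for o in outs])
```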
A task in a suit and a tie: paraphrase generation with semantic augmentation
Title | A task in a suit and a tie: paraphrase generation with semantic augmentation |
Authors | Su Wang, Rahul Gupta, Nancy Chang, Jason Baldridge |
Abstract | Paraphrasing is rooted in semantics. We show the effectiveness of transformers (Vaswani et al. 2017) for paraphrase generation and further improvements by incorporating PropBank labels via a multi-encoder. Evaluating on MSCOCO and WikiAnswers, we find that transformers are fast and effective, and that semantic augmentation for both transformers and LSTMs leads to sizable 2-3 point gains in BLEU, METEOR and TER. More importantly, we find surprisingly large gains on human evaluations compared to previous models. Nevertheless, manual inspection of generated paraphrases reveals ample room for improvement: even our best model produces human-acceptable paraphrases for only 28% of captions from the CHIA dataset (Sharma et al. 2018), and it fails spectacularly on sentences from Wikipedia. Overall, these results point to the potential for incorporating semantics in the task while highlighting the need for stronger evaluation. |
Tasks | Paraphrase Generation |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00119v2 (PDF: http://arxiv.org/pdf/1811.00119v2.pdf) |
PWC | https://paperswithcode.com/paper/a-task-in-a-suit-and-a-tie-paraphrase |
Repo | |
Framework | |
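One plausible reading of "incorporating PropBank labels via a multi-encoder", and an assumption on our part rather than the authors' verified architecture, is a second encoder over semantic-role tags whose states are concatenated with the token encoder's states before decoding:

```python
# Assumed sketch of a multi-encoder paraphraser: one encoder for tokens, one for
# PropBank SRL tags; the decoder cross-attends over the concatenated memories.
import torch
import torch.nn as nn

class MultiEncoderParaphraser(nn.Module):
    def __init__(self, vocab_size=1000, srl_vocab_size=50, d_model=256):
        super().__init__()
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        self.srl_embed = nn.Embedding(srl_vocab_size, d_model)
        enc_layer = lambda: nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.tok_encoder = nn.TransformerEncoder(enc_layer(), num_layers=2)
        self.srl_encoder = nn.TransformerEncoder(enc_layer(), num_layers=2)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, srl_ids, tgt_ids):
        tok_mem = self.tok_encoder(self.tok_embed(src_ids))
        srl_mem = self.srl_encoder(self.srl_embed(srl_ids))
        memory = torch.cat([tok_mem, srl_mem], dim=1)   # concatenate along the sequence axis
        hidden = self.decoder(self.tok_embed(tgt_ids), memory)
        return self.proj(hidden)

model = MultiEncoderParaphraser()
logits = model(torch.randint(0, 1000, (2, 10)),
               torch.randint(0, 50, (2, 10)),
               torch.randint(0, 1000, (2, 12)))
print(logits.shape)                                     # (2, 12, vocab_size)
```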