January 26, 2020

Paper Group ANR 1437

Extreme Tensoring for Low-Memory Preconditioning. Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation. A cost-reducing partial labeling estimator in text classification problem. A Visual Programming Paradigm for Abstract Deep Learning Model Development. Analyzing Deep Neural Networks with Symbolic Propagation: Tow …

Extreme Tensoring for Low-Memory Preconditioning

Title Extreme Tensoring for Low-Memory Preconditioning
Authors Xinyi Chen, Naman Agarwal, Elad Hazan, Cyril Zhang, Yi Zhang
Abstract State-of-the-art models are now trained with billions of parameters, reaching hardware limits in terms of memory consumption. This has created a recent demand for memory-efficient optimizers. To this end, we investigate the limits and performance tradeoffs of memory-efficient adaptively preconditioned gradient methods. We propose extreme tensoring for high-dimensional stochastic optimization, showing that an optimizer needs very little memory to benefit from adaptive preconditioning. Our technique applies to arbitrary models (not necessarily with tensor-shaped parameters), and is accompanied by regret and convergence guarantees, which shed light on the tradeoffs between preconditioner quality and expressivity. On a large-scale NLP model, we reduce the optimizer memory overhead by three orders of magnitude, without degrading performance.
Tasks Stochastic Optimization
Published 2019-02-12
URL http://arxiv.org/abs/1902.04620v1
PDF http://arxiv.org/pdf/1902.04620v1.pdf
PWC https://paperswithcode.com/paper/extreme-tensoring-for-low-memory
Repo
Framework
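
The memory saving described in the abstract above can be illustrated with a minimal sketch. It assumes a matrix-shaped parameter (two axes, p = 2), one accumulator vector per axis, and a per-coordinate step scaled by the product of the two axis accumulators raised to the power -1/(2p). The function names, learning rate, and epsilon are hypothetical, and the exact update rule in the paper may differ; this is not the authors' implementation.

```python
import math


def full_diagonal_state_size(shape):
    """Accumulator entries for a per-coordinate (diagonal) preconditioner."""
    return math.prod(shape)


def tensored_state_size(shape):
    """Accumulator entries when one vector is kept per tensor axis."""
    return sum(shape)


def tensored_adagrad_step(params, grads, row_acc, col_acc, lr=0.1, eps=1e-8):
    """One AdaGrad-style step on a matrix using only per-row and per-column state.

    params, grads: lists of lists with the same shape.
    row_acc, col_acc: running sums of squared gradients per row and per column.
    Assumption: the step for entry (i, j) is scaled by
    ((eps + row_acc[i]) * (eps + col_acc[j])) ** (-1 / (2 * p)) with p = 2.
    """
    rows, cols = len(grads), len(grads[0])
    p = 2  # number of tensor axes in this sketch

    # Accumulate squared gradient mass along each axis.
    for i in range(rows):
        row_acc[i] += sum(g * g for g in grads[i])
    for j in range(cols):
        col_acc[j] += sum(grads[i][j] ** 2 for i in range(rows))

    # Apply the per-coordinate scaled update in place.
    for i in range(rows):
        for j in range(cols):
            scale = ((eps + row_acc[i]) * (eps + col_acc[j])) ** (-1.0 / (2 * p))
            params[i][j] -= lr * scale * grads[i][j]
    return params
```

For a 1000 x 1000 parameter, the per-axis state holds 2,000 numbers where a full diagonal accumulator holds 1,000,000; reshaping the same parameter into more axes shrinks the state further.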

Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation

Title Entity-aware ELMo: Learning Contextual Entity Representation for Entity Disambiguation
Authors Hamed Shahbazi, Xiaoli Z. Fern, Reza Ghaeini, Rasha Obeidat, Prasad Tadepalli
Abstract We present a new local entity disambiguation system. The key to our system is a novel approach for learning entity representations. In our approach we learn an entity-aware extension of Embeddings from Language Models (ELMo), which we call Entity-ELMo (E-ELMo). Given a paragraph containing one or more named entity mentions, each mention is first defined as a function of the entire paragraph (including other mentions), and these mention representations are then used to predict the referent entities. Utilizing E-ELMo for local entity disambiguation, we outperform all of the state-of-the-art local and global models on the popular benchmarks, improving micro-average accuracy by about 0.5% on AIDA test-b with the Yago candidate set. The training data and candidate set in the evaluation setup are the same as in our baselines, for fair comparison.
Tasks Entity Disambiguation, Language Modelling
Published 2019-08-14
URL https://arxiv.org/abs/1908.05762v2
PDF https://arxiv.org/pdf/1908.05762v2.pdf
PWC https://paperswithcode.com/paper/entity-aware-elmo-learning-contextual-entity
Repo
Framework

A cost-reducing partial labeling estimator in text classification problem

Title A cost-reducing partial labeling estimator in text classification problem
Authors Jiangning Chen, Zhibo Dai, Juntao Duan, Qianli Hu, Ruilin Li, Heinrich Matzinger, Ionel Popescu, Haoyan Zhai
Abstract We propose a new approach to text classification problems in which learning with partial labels is beneficial. Instead of offering each training sample a set of candidate labels, we assign negative-oriented labels to ambiguous training examples that are unlikely to fall into certain classes. We construct new maximum likelihood estimators with a self-correction property and prove that, under some conditions, our estimators converge faster. We also discuss the advantages of applying one of our estimators to a fully supervised learning problem. The proposed method has potential applicability in many areas, such as crowdsourcing, natural language processing, and medical image analysis.
Tasks Text Classification
Published 2019-06-10
URL https://arxiv.org/abs/1906.03768v1
PDF https://arxiv.org/pdf/1906.03768v1.pdf
PWC https://paperswithcode.com/paper/a-cost-reducing-partial-labeling-estimator-in
Repo
Framework

A Visual Programming Paradigm for Abstract Deep Learning Model Development

Title A Visual Programming Paradigm for Abstract Deep Learning Model Development
Authors Srikanth Tamilselvam, Naveen Panwar, Shreya Khare, Rahul Aralikatte, Anush Sankaran, Senthil Mani
Abstract Deep learning is one of the fastest growing technologies in computer science, with a plethora of applications. But this unprecedented growth has so far been limited to deep learning experts. The primary challenges are a steep learning curve for the programming libraries and the lack of intuitive systems that enable non-experts to use deep learning. To address this, we study the effectiveness of a no-code paradigm for designing deep learning models. In particular, a visual drag-and-drop interface is found to be more efficient than traditional programming and alternative visual programming paradigms. We conduct user studies with participants of different expertise levels to measure the entry-level barrier and the developer load across different programming paradigms. We obtain a System Usability Scale (SUS) score of 90 and a NASA Task Load Index (TLX) score of 21 for the proposed visual programming, compared to 68 and 52, respectively, for traditional programming methods.
Tasks
Published 2019-05-07
URL https://arxiv.org/abs/1905.02486v2
PDF https://arxiv.org/pdf/1905.02486v2.pdf
PWC https://paperswithcode.com/paper/a-visual-programming-paradigm-for-abstract
Repo
Framework

Analyzing Deep Neural Networks with Symbolic Propagation: Towards Higher Precision and Faster Verification

Title Analyzing Deep Neural Networks with Symbolic Propagation: Towards Higher Precision and Faster Verification
Authors Pengfei Yang, Jiangchao Liu, Jianlin Li, Liqian Chen, Xiaowei Huang
Abstract Deep neural networks (DNNs) have been shown to lack robustness: their classifications are vulnerable to small perturbations of the inputs. This has led to safety concerns about applying DNNs in safety-critical domains. Several verification approaches have been developed to automatically prove or disprove safety properties of DNNs. However, these approaches suffer from either the scalability problem, i.e., only small DNNs can be handled, or the precision problem, i.e., the obtained bounds are loose. This paper improves on a recent proposal for analyzing DNNs through the classic abstract interpretation technique by adding a novel symbolic propagation technique. More specifically, the values of neurons are represented symbolically and propagated forward from the input layer to the output layer, on top of abstract domains. We show that our approach can achieve significantly higher precision and thus can prove more properties than using only abstract domains. Moreover, we show that the bounds derived by our approach on the hidden neurons, when applied to a state-of-the-art SMT-based verification tool, can improve its performance. We implement our approach in a software tool and validate it on a few DNNs trained on benchmark datasets such as MNIST.
Tasks
Published 2019-02-26
URL http://arxiv.org/abs/1902.09866v1
PDF http://arxiv.org/pdf/1902.09866v1.pdf
PWC https://paperswithcode.com/paper/analyzing-deep-neural-networks-with-symbolic
Repo
Framework

Learning Latent Dynamics for Partially-Observed Chaotic Systems

Title Learning Latent Dynamics for Partially-Observed Chaotic Systems
Authors Said Ouala, Duong Nguyen, Lucas Drumetz, Bertrand Chapron, Ananda Pascual, Fabrice Collard, Lucile Gaultier, Ronan Fablet
Abstract This paper addresses the data-driven identification of latent dynamical representations of partially-observed systems, i.e., dynamical systems for which some components are never observed, with an emphasis on forecasting applications, including long-term asymptotic patterns. Whereas state-of-the-art data-driven approaches rely on delay embeddings and linear decompositions of the underlying operators, we introduce a framework based on the data-driven identification of an augmented state-space model using a neural-network-based representation. For a given training dataset, it amounts to jointly learning an ODE (Ordinary Differential Equation) representation in the latent space and reconstructing the latent states. Through numerical experiments, we demonstrate the relevance of the proposed framework w.r.t. state-of-the-art approaches in terms of short-term forecasting performance and long-term behaviour. We further discuss how the proposed framework relates to Koopman operator theory and Takens’ embedding theorem.
Tasks
Published 2019-07-04
URL https://arxiv.org/abs/1907.02452v1
PDF https://arxiv.org/pdf/1907.02452v1.pdf
PWC https://paperswithcode.com/paper/learning-latent-dynamics-for-partially
Repo
Framework

PatchNet: A Tool for Deep Patch Classification

Title PatchNet: A Tool for Deep Patch Classification
Authors Thong Hoang, Julia Lawall, Richard J. Oentaryo, Yuan Tian, David Lo
Abstract This work proposes PatchNet, an automated tool based on hierarchical deep learning for classifying patches by extracting features from commit messages and code changes. PatchNet contains a deep hierarchical structure that mirrors the hierarchical and sequential structure of a code change, differentiating it from the existing deep learning models on source code. PatchNet provides several options allowing users to select parameters for the training process. The tool has been validated in the context of automatic identification of stable-relevant patches in the Linux kernel and is potentially applicable to automate other software engineering tasks that can be formulated as patch classification problems. A video demonstrating PatchNet is available at https://goo.gl/CZjG6X. The PatchNet implementation is available at https://github.com/hvdthong/PatchNetTool.
Tasks
Published 2019-02-16
URL http://arxiv.org/abs/1903.02063v2
PDF http://arxiv.org/pdf/1903.02063v2.pdf
PWC https://paperswithcode.com/paper/patchnet-a-tool-for-deep-patch-classification
Repo
Framework

Learning Fully Dense Neural Networks for Image Semantic Segmentation

Title Learning Fully Dense Neural Networks for Image Semantic Segmentation
Authors Mingmin Zhen, Jinglu Wang, Lei Zhou, Tian Fang, Long Quan
Abstract Semantic segmentation is pixel-wise classification that retains critical spatial information. “Feature map reuse” has been commonly adopted in CNN-based approaches to take advantage of feature maps in the early layers for the later spatial reconstruction. Along this direction, we go a step further by proposing a fully dense neural network with an encoder-decoder structure that we abbreviate as FDNet. For each stage in the decoder module, feature maps of all the previous blocks are adaptively aggregated and fed forward as input. On the one hand, this reconstructs the spatial boundaries accurately. On the other hand, it learns more efficiently because gradients backpropagate more efficiently. In addition, we propose a boundary-aware loss function that focuses more attention on the pixels near the boundary, which improves the labeling of “hard examples”. We demonstrate the best performance of FDNet on two benchmark datasets, PASCAL VOC 2012 and NYUDv2, over previous works when not considering training on other datasets.
Tasks Semantic Segmentation
Published 2019-05-22
URL https://arxiv.org/abs/1905.08929v1
PDF https://arxiv.org/pdf/1905.08929v1.pdf
PWC https://paperswithcode.com/paper/learning-fully-dense-neural-networks-for
Repo
Framework

Say Anything: Automatic Semantic Infelicity Detection in L2 English Indefinite Pronouns

Title Say Anything: Automatic Semantic Infelicity Detection in L2 English Indefinite Pronouns
Authors Ella Rabinovich, Julia Watson, Barend Beekhuizen, Suzanne Stevenson
Abstract Computational research on error detection in second language speakers has mainly addressed clear grammatical anomalies typical to learners at the beginner-to-intermediate level. We focus instead on acquisition of subtle semantic nuances of English indefinite pronouns by non-native speakers at varying levels of proficiency. We first lay out theoretical, linguistically motivated hypotheses, and supporting empirical evidence on the nature of the challenges posed by indefinite pronouns to English learners. We then suggest and evaluate an automatic approach for detection of atypical usage patterns, demonstrating that deep learning architectures are promising for this task involving nuanced semantic anomalies.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07928v1
PDF https://arxiv.org/pdf/1909.07928v1.pdf
PWC https://paperswithcode.com/paper/say-anything-automatic-semantic-infelicity
Repo
Framework

Emotion recognition with 4kresolution database

Title Emotion recognition with 4kresolution database
Authors Qian Zheng
Abstract Classifying human emotion through facial expressions is a big topic in both the Computer Vision and Deep Learning fields. Human emotion can be classified as one of the basic emotion types, like being angry or happy, or as a dimensional emotion with valence and arousal values. There are a lot of related challenges in this topic; one of the most famous is the ‘Affect-in-the-wild Challenge’ (Aff-Wild Challenge). It is the first challenge on the estimation of valence and arousal in-the-wild. This project is an extension of the Aff-Wild Challenge. The Aff-Wild database was created using images with a mean resolution of 607*359; Dimitrios and I sought to find out the performance of a model trained on a database that contains 4K-resolution in-the-wild images. Since there is no existing database that satisfies this requirement, I built this database from scratch with help from Dimitrios and trained neural network models with different hyperparameters on it. I used network models like VGG16, AlexNet, and ResNet, and also some pre-trained models like ImageNet VGG. I compared the results of the different network models alongside the results from the Aff-Wild database to find the optimal model for my database.
Tasks Emotion Recognition
Published 2019-10-24
URL https://arxiv.org/abs/1910.11276v1
PDF https://arxiv.org/pdf/1910.11276v1.pdf
PWC https://paperswithcode.com/paper/emotion-recognition-with-4kresolution
Repo
Framework

Forward and Backward Knowledge Transfer for Sentiment Classification

Title Forward and Backward Knowledge Transfer for Sentiment Classification
Authors Hao Wang, Bing Liu, Shuai Wang, Nianzu Ma, Yan Yang
Abstract This paper studies the problem of learning a sequence of sentiment classification tasks. The learned knowledge from each task is retained and used to help future or subsequent task learning. This learning paradigm is called Lifelong Learning (LL). However, existing LL methods either only transfer knowledge forward to help future learning and do not go back to improve the model of a previous task or require the training data of the previous task to retrain its model to exploit backward/reverse knowledge transfer. This paper studies reverse knowledge transfer of LL in the context of naive Bayesian (NB) classification. It aims to improve the model of a previous task by leveraging future knowledge without retraining using its training data. This is done by exploiting a key characteristic of the generative model of NB. That is, it is possible to improve the NB classifier for a task by improving its model parameters directly by using the retained knowledge from other tasks. Experimental results show that the proposed method markedly outperforms existing LL baselines.
Tasks Sentiment Analysis, Transfer Learning
Published 2019-06-08
URL https://arxiv.org/abs/1906.03506v1
PDF https://arxiv.org/pdf/1906.03506v1.pdf
PWC https://paperswithcode.com/paper/forward-and-backward-knowledge-transfer-for
Repo
Framework
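
The "key characteristic" of naive Bayes mentioned in the abstract above is that its parameters are smoothed ratios of counts, so they can be changed by adjusting counts rather than by retraining on the original documents. A minimal sketch of that idea follows, assuming add-one style smoothing and a simple weighted merge of retained word counts; `word_prob`, `merge_counts`, and the `weight` parameter are hypothetical names, and how the paper decides which knowledge to transfer is not modelled here.

```python
from collections import Counter


def word_prob(counts, word, vocab_size, smoothing=1.0):
    """Smoothed estimate of P(word | class) from per-class word counts."""
    total = sum(counts.values())
    return (counts[word] + smoothing) / (total + smoothing * vocab_size)


def merge_counts(task_counts, retained_counts, weight=1.0):
    """Combine a task's counts with counts retained from other tasks.

    Assumption: retained knowledge is stored as word counts for the same
    class, and `weight` controls how strongly it influences the old task.
    """
    merged = Counter(task_counts)
    for word, count in retained_counts.items():
        merged[word] += weight * count
    return merged


# Counts for the positive class of an earlier task with little data.
old_task_pos = Counter({"good": 1, "bad": 1})
# Counts for the positive class retained from later tasks.
retained_pos = Counter({"good": 8})

before = word_prob(old_task_pos, "good", vocab_size=2)
after = word_prob(merge_counts(old_task_pos, retained_pos), "good", vocab_size=2)
```

Here the estimate for "good" in the positive class moves from 0.5 to 10/12 without touching the earlier task's training documents.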

Evolutionary Computation and AI Safety: Research Problems Impeding Routine and Safe Real-world Application of Evolution

Title Evolutionary Computation and AI Safety: Research Problems Impeding Routine and Safe Real-world Application of Evolution
Authors Joel Lehman
Abstract Recent developments in artificial intelligence and machine learning have spurred interest in the growing field of AI safety, which studies how to prevent human-harming accidents when deploying AI systems. This paper thus explores the intersection of AI safety with evolutionary computation, to show how safety issues arise in evolutionary computation and how understanding from evolutionary computation and biological evolution can inform the broader study of AI safety.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.10189v2
PDF https://arxiv.org/pdf/1906.10189v2.pdf
PWC https://paperswithcode.com/paper/evolutionary-computation-and-ai-safety
Repo
Framework

Stochastic Proximal AUC Maximization

Title Stochastic Proximal AUC Maximization
Authors Yunwen Lei, Yiming Ying
Abstract In this paper we consider the problem of maximizing the Area under the ROC curve (AUC), which is a widely used performance metric in imbalanced classification and anomaly detection. Due to the pairwise nonlinearity of the objective function, classical SGD algorithms do not apply to the task of AUC maximization. We propose a novel stochastic proximal algorithm for AUC maximization that scales to large streaming data. Our algorithm can accommodate general penalty terms and is easy to implement, with favorable $O(d)$ space and per-iteration time complexities. We establish a high-probability convergence rate of $O(1/\sqrt{T})$ for the general convex setting, and improve it to a fast convergence rate of $O(1/T)$ for the cases of strongly convex regularizers and of no regularization term (without strong convexity). Our proof does not need the uniform boundedness assumption on the loss function or the iterates, which is more faithful to practice. Finally, we perform extensive experiments over various benchmark data sets from real-world application domains, which show the superior performance of our algorithm over existing AUC maximization algorithms.
Tasks Anomaly Detection
Published 2019-06-14
URL https://arxiv.org/abs/1906.06053v1
PDF https://arxiv.org/pdf/1906.06053v1.pdf
PWC https://paperswithcode.com/paper/stochastic-proximal-auc-maximization
Repo
Framework
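
The "pairwise" structure that the abstract above refers to can be seen in the empirical AUC itself: it is the fraction of (positive, negative) pairs that the scores rank correctly, so it is a sum over pairs of examples and not over single examples. The sketch below computes that quantity directly; it only illustrates the metric, not the stochastic proximal algorithm proposed in the paper, and the function name and the convention of counting ties as one half are assumptions.

```python
def pairwise_auc(scores, labels):
    """Empirical AUC as the fraction of correctly ordered positive/negative pairs.

    scores: real-valued predictions, higher means "more positive".
    labels: 1 for positive examples, anything else for negative examples.
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                correct += 1.0
            elif sp == sn:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Two positives (0.9, 0.3) and two negatives (0.8, 0.1):
# three of the four positive/negative pairs are ordered correctly.
auc = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The double loop costs time proportional to the number of positive/negative pairs, which is the kind of cost that a method with $O(d)$ per-iteration complexity on streaming data avoids.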

Generative Adversarial Networks for Mitigating Biases in Machine Learning Systems

Title Generative Adversarial Networks for Mitigating Biases in Machine Learning Systems
Authors Adel Abusitta, Esma Aïmeur, Omar Abdel Wahab
Abstract In this paper, we propose a new framework for mitigating biases in machine learning systems. The problem of the existing mitigation approaches is that they are model-oriented in the sense that they focus on tuning the training algorithms to produce fair results, while overlooking the fact that the training data can itself be the main reason for biased outcomes. Technically speaking, two essential limitations can be found in such model-based approaches: 1) the mitigation cannot be achieved without degrading the accuracy of the machine learning models, and 2) when the data used for training are largely biased, the training time automatically increases so as to find suitable learning parameters that help produce fair results. To address these shortcomings, we propose in this work a new framework that can largely mitigate the biases and discriminations in machine learning systems while at the same time enhancing the prediction accuracy of these systems. The proposed framework is based on conditional Generative Adversarial Networks (cGANs), which are used to generate new synthetic fair data with selective properties from the original data. We also propose a framework for analyzing data biases, which is important for understanding the amount and type of data that need to be synthetically sampled and labeled for each population group. Experimental results show that the proposed solution can efficiently mitigate different types of biases, while at the same time enhancing the prediction accuracy of the underlying machine learning model.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09972v1
PDF https://arxiv.org/pdf/1905.09972v1.pdf
PWC https://paperswithcode.com/paper/generative-adversarial-networks-for-5
Repo
Framework

Detection of vertebral fractures in CT using 3D Convolutional Neural Networks

Title Detection of vertebral fractures in CT using 3D Convolutional Neural Networks
Authors Joeri Nicolaes, Steven Raeymaeckers, David Robben, Guido Wilms, Dirk Vandermeulen, Cesar Libanati, Marc Debois
Abstract Osteoporosis induced fractures occur worldwide about every 3 seconds. Vertebral compression fractures are early signs of the disease and considered risk predictors for secondary osteoporotic fractures. We present a detection method to opportunistically screen spine-containing CT images for the presence of these vertebral fractures. Inspired by radiology practice, existing methods are based on 2D and 2.5D features but we present, to the best of our knowledge, the first method for detecting vertebral fractures in CT using automatically learned 3D feature maps. The presented method explicitly localizes these fractures allowing radiologists to interpret its results. We train a voxel-classification 3D Convolutional Neural Network (CNN) with a training database of 90 cases that has been semi-automatically generated using radiologist readings that are readily available in clinical practice. Our 3D method produces an Area Under the Curve (AUC) of 95% for patient-level fracture detection and an AUC of 93% for vertebra-level fracture detection in a five-fold cross-validation experiment.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.01816v1
PDF https://arxiv.org/pdf/1911.01816v1.pdf
PWC https://paperswithcode.com/paper/detection-of-vertebral-fractures-in-ct-using
Repo
Framework