January 31, 2020


Paper Group ANR 128

Pythia: AI-assisted Code Completion System. Seq2Emo for Multi-label Emotion Classification Based on Latent Variable Chains Transformation. Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free. PDANet: Polarity-consistent Deep Attention Network for Fine-grained Visual Emotion Regression. Research on the pixel-based an …

Pythia: AI-assisted Code Completion System


Title	Pythia: AI-assisted Code Completion System
Authors	Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
Abstract	In this paper, we propose a novel end-to-end approach for AI-assisted code completion called Pythia. It generates ranked lists of method and API recommendations which can be used by software developers at edit time. The system is currently deployed as part of the IntelliCode extension in the Visual Studio Code IDE. Pythia exploits state-of-the-art large-scale deep learning models trained on code contexts extracted from abstract syntax trees. It is designed to work at high throughput, predicting the best matching code completions on the order of 100 ms. We describe the architecture of the system, perform comparisons to a frequency-based approach and an invocation-based Markov chain language model, and discuss the challenges of serving Pythia models on lightweight client devices. The offline evaluation results obtained on 2700 Python open source software GitHub repositories show a top-5 accuracy of 92%, surpassing the baseline models by 20% averaged over classes, for both intra- and cross-project settings.
Tasks	Language Modelling
Published	2019-11-29
URL	https://arxiv.org/abs/1912.00742v1
PDF	https://arxiv.org/pdf/1912.00742v1.pdf
PWC	https://paperswithcode.com/paper/pythia-ai-assisted-code-completion-system
Repo
Framework
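
The top-5 accuracy quoted in the abstract counts a completion as correct when the true method appears among the first five ranked recommendations. A minimal sketch of that metric follows; the function name `top_k_accuracy` and the sample method names are illustrative assumptions, not part of the Pythia system.

```python
def top_k_accuracy(ranked_predictions, targets, k=5):
    """Fraction of cases where the true item is among the first k recommendations.

    ranked_predictions: one best-first list of candidates per completion request.
    targets: the item the developer actually chose, one per request.
    """
    if not targets:
        raise ValueError("targets must not be empty")
    hits = sum(
        1
        for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical ranked completions for two call sites.
ranked = [["append", "extend", "pop"], ["join", "split", "strip"]]
chosen = ["extend", "lower"]
print(top_k_accuracy(ranked, chosen, k=2))
```

In this toy case the first request is a hit at rank 2 and the second is a miss, so half of the requests count as correct for k=2.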

Seq2Emo for Multi-label Emotion Classification Based on Latent Variable Chains Transformation


Title	Seq2Emo for Multi-label Emotion Classification Based on Latent Variable Chains Transformation
Authors	Chenyang Huang, Amine Trabelsi, Xuebin Qin, Nawshad Farruque, Osmar R. Zaïane
Abstract	Emotion detection in text is an important task in NLP and is essential in many applications. Most of the existing methods treat this task as a problem of single-label multi-class text classification. To predict multiple emotions for one instance, most of the existing works regard it as a general Multi-label Classification (MLC) problem, where they usually either apply a manually determined threshold on the last output layer of their neural network models or train multiple binary classifiers and make predictions in a one-vs-all fashion. However, compared to labels in general MLC datasets, the number of emotion categories is much smaller (fewer than 10). Additionally, emotions tend to be more strongly correlated with each other. For example, a person usually does not express “joy” and “anger” at the same time, but “joy” and “love” are very likely to be expressed together. Given this intuition, in this paper, we propose a Latent Variable Chain (LVC) transformation and a tailored model, Seq2Emo, that not only naturally predicts multiple emotion labels but also takes their correlations into consideration. We perform experiments on existing multi-label emotion datasets as well as on our newly collected datasets. The results show that our model compares favorably with existing state-of-the-art methods.
Tasks	Emotion Classification, Multi-Label Classification, Text Classification
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02147v2
PDF	https://arxiv.org/pdf/1911.02147v2.pdf
PWC	https://paperswithcode.com/paper/seq2emo-for-multi-label-emotion
Repo
Framework

Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free


Title	Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free
Authors	Mingrui Zhang, Lin Chen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi
Abstract	How can we efficiently mitigate the overhead of gradient communications in distributed optimization? This problem is at the heart of training scalable machine learning models and has been mainly studied in the unconstrained setting. In this paper, we propose Quantized-Frank-Wolfe (QFW), the first projection-free and communication-efficient algorithm for solving constrained optimization problems at scale. We consider both convex and non-convex objective functions, expressed as a finite-sum or more generally a stochastic optimization problem, and provide strong theoretical guarantees on the convergence rate of QFW. This is accomplished by proposing novel quantization schemes that efficiently compress gradients while controlling the noise variance introduced during this process. Finally, we empirically validate the efficiency of QFW in terms of communication and the quality of returned solution against natural baselines.
Tasks	Distributed Optimization, Quantization, Stochastic Optimization
Published	2019-02-17
URL	https://arxiv.org/abs/1902.06332v3
PDF	https://arxiv.org/pdf/1902.06332v3.pdf
PWC	https://paperswithcode.com/paper/quantized-frank-wolfe-communication-efficient
Repo
Framework
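
To make the idea of a projection-free update driven by compressed gradients concrete, here is a minimal sketch under stated assumptions. It minimizes 0.5 * ||x - b||^2 over the probability simplex, uses a generic unbiased sign-style quantizer, and averages the quantized gradients with a hypothetical schedule `rho = 2 / (t + 1)`. None of these choices are taken from the paper; its actual quantization schemes and step sizes may differ.

```python
import random


def quantize(grad, rng):
    """Unbiased stochastic quantizer (an assumed, generic scheme).

    Each coordinate becomes sign(g_i) * max|g| with probability |g_i| / max|g|
    and 0 otherwise, so the expectation equals the original gradient.
    """
    scale = max(abs(g) for g in grad)
    if scale == 0.0:
        return [0.0] * len(grad)
    return [
        (scale if g > 0 else -scale) if rng.random() < abs(g) / scale else 0.0
        for g in grad
    ]


def quantized_frank_wolfe(b, steps=200, seed=0):
    """Frank-Wolfe on the probability simplex for f(x) = 0.5 * ||x - b||^2."""
    rng = random.Random(seed)
    n = len(b)
    x = [1.0 / n] * n          # start at the centre of the simplex
    g_avg = [0.0] * n          # running average of quantized gradients
    for t in range(1, steps + 1):
        grad = [xi - bi for xi, bi in zip(x, b)]
        q = quantize(grad, rng)            # what a worker would transmit
        rho = 2.0 / (t + 1)                # hypothetical averaging weight
        g_avg = [(1 - rho) * a + rho * qi for a, qi in zip(g_avg, q)]
        # Linear minimization over the simplex: pick the best vertex.
        i = min(range(n), key=lambda j: g_avg[j])
        gamma = 2.0 / (t + 2)              # classic Frank-Wolfe step size
        x = [(1 - gamma) * xi for xi in x]
        x[i] += gamma                      # convex combination, no projection
    return x


x_final = quantized_frank_wolfe([0.7, 0.2, 0.1])
print(x_final, sum(x_final))
```

Each iterate is a convex combination of simplex vertices, so feasibility holds without any projection step, and only the quantized vector would need to be communicated.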

PDANet: Polarity-consistent Deep Attention Network for Fine-grained Visual Emotion Regression


Title	PDANet: Polarity-consistent Deep Attention Network for Fine-grained Visual Emotion Regression
Authors	Sicheng Zhao, Zizhou Jia, Hui Chen, Leida Li, Guiguang Ding, Kurt Keutzer
Abstract	Existing methods on visual emotion analysis mainly focus on coarse-grained emotion classification, i.e. assigning an image a dominant discrete emotion category. However, these methods cannot adequately reflect the complexity and subtlety of emotions. In this paper, we study the fine-grained regression problem of visual emotions based on convolutional neural networks (CNNs). Specifically, we develop a Polarity-consistent Deep Attention Network (PDANet), a novel network architecture that integrates attention into a CNN with an emotion polarity constraint. First, we propose to incorporate both spatial and channel-wise attention into a CNN for visual emotion regression, which jointly considers the local spatial connectivity patterns along each channel and the interdependency between different channels. Second, we design a novel regression loss, i.e. polarity-consistent regression (PCR) loss, based on the weakly supervised emotion polarity to guide the attention generation. By optimizing the PCR loss, PDANet can generate a polarity-preserved attention map and thus improve the emotion regression performance. Extensive experiments are conducted on the IAPS, NAPS, and EMOTIC datasets, and the results demonstrate that the proposed PDANet outperforms the state-of-the-art approaches by a large margin for fine-grained visual emotion regression. Our source code is released at: https://github.com/ZizhouJia/PDANet.
Tasks	Deep Attention, Emotion Classification, Emotion Recognition
Published	2019-09-11
URL	https://arxiv.org/abs/1909.05693v1
PDF	https://arxiv.org/pdf/1909.05693v1.pdf
PWC	https://paperswithcode.com/paper/pdanet-polarity-consistent-deep-attention
Repo
Framework

Research on the pixel-based and object-oriented methods of urban feature extraction with GF-2 remote-sensing images


Title	Research on the pixel-based and object-oriented methods of urban feature extraction with GF-2 remote-sensing images
Authors	Dong-dong Zhang, Lei Zhang, Vladimir Zaborovsky, Feng Xie, Yan-wen Wu, Ting-ting Lu
Abstract	During China's rapid urbanization, the acquisition of urban geographic information and timely data updating are important and fundamental tasks for the refined management of cities. With the development of domestic remote sensing technology, the application of Gaofen-2 (GF-2) high-resolution remote sensing images can greatly improve the accuracy of information extraction. This paper introduces an approach using object-oriented classification methods for urban feature extraction based on GF-2 satellite data. A combination of spectral and spatial attributes and membership functions was employed for mapping the urban features of Qinhuai District, Nanjing. The data preprocessing is carried out in the ENVI software, and the resulting data is exported into the eCognition software for object-oriented classification and extraction of urban feature information. Finally, the obtained raster image classification results are vectorized using the ArcGIS software, and the vector graphics are stored in the library, which can be used for further analysis and modeling. Accuracy assessment was performed using ground truth data acquired by visual interpretation and from other reliable secondary data sources. Compared with the result of pixel-based supervised (neural net) classification, the developed object-oriented method can significantly improve extraction accuracy, and after manual interpretation, an overall accuracy of 95.44% can be achieved, with a Kappa coefficient of 0.9405, which objectively confirms the superiority of the object-oriented method and the feasibility of using GF-2 satellite data.
Tasks	Image Classification
Published	2019-03-08
URL	http://arxiv.org/abs/1903.03412v1
PDF	http://arxiv.org/pdf/1903.03412v1.pdf
PWC	https://paperswithcode.com/paper/research-on-the-pixel-based-and-object
Repo
Framework
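
Overall accuracy and the Kappa coefficient reported in the abstract are both derived from a confusion matrix of classified versus reference labels. The sketch below shows the standard Cohen's kappa computation; the 2x2 matrix is made-up illustrative data, not the paper's results.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example (e.g. building vs. non-building).
accuracy, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(accuracy, kappa)
```

Kappa discounts the agreement expected by chance, which is why it is lower than the raw accuracy here (0.85 accuracy, 0.5 chance agreement).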

Emotion helps Sentiment: A Multi-task Model for Sentiment and Emotion Analysis


Title	Emotion helps Sentiment: A Multi-task Model for Sentiment and Emotion Analysis
Authors	Abhishek Kumar, Asif Ekbal, Daisuke Kawahra, Sadao Kurohashi
Abstract	In this paper, we propose a two-layered multi-task attention based neural network that performs sentiment analysis through emotion analysis. The proposed approach is based on Bidirectional Long Short-Term Memory and uses Distributional Thesaurus as a source of external knowledge to improve the sentiment and emotion prediction. The proposed system has two levels of attention to hierarchically build a meaningful representation. We evaluate our system on the benchmark dataset of SemEval 2016 Task 6 and also compare it with the state-of-the-art systems on Stance Sentiment Emotion Corpus. Experimental results show that the proposed system improves the performance of sentiment analysis by 3.2 F-score points on SemEval 2016 Task 6 dataset. Our network also boosts the performance of emotion analysis by 5 F-score points on Stance Sentiment Emotion Corpus.
Tasks	Emotion Recognition, Sentiment Analysis
Published	2019-11-28
URL	https://arxiv.org/abs/1911.12569v1
PDF	https://arxiv.org/pdf/1911.12569v1.pdf
PWC	https://paperswithcode.com/paper/emotion-helps-sentiment-a-multi-task-model
Repo
Framework

Low Rank Factorization for Compact Multi-Head Self-Attention


Title	Low Rank Factorization for Compact Multi-Head Self-Attention
Authors	Sneha Mehta, Huzefa Rangwala, Naren Ramakrishnan
Abstract	Effective representation learning from text has been an active area of research in the fields of NLP and text mining. Attention mechanisms have been at the forefront of learning contextual sentence representations. Current state-of-the-art approaches in representation learning use single-head and multi-head attention mechanisms to learn context-aware representations. However, these approaches can be highly parameter intensive, resulting in low-resource bottlenecks. In this work we present a novel multi-head attention mechanism that uses low-rank bilinear pooling to efficiently construct a structured sentence representation that attends to multiple aspects of a sentence. We show that the proposed model is more effective than single-head attention mechanisms and is also more parameter efficient and faster to compute than existing multi-head approaches. We evaluate the performance of the proposed model on multiple datasets from two text classification benchmarks: (i) sentiment analysis and (ii) news classification.
Tasks	Representation Learning, Sentiment Analysis, Text Classification
Published	2019-11-26
URL	https://arxiv.org/abs/1912.00835v1
PDF	https://arxiv.org/pdf/1912.00835v1.pdf
PWC	https://paperswithcode.com/paper/low-rank-factorization-for-compact-multi-head
Repo
Framework

Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps


Title	Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps
Authors	Abhishek Sehgal, Nasser Kehtarnavaz
Abstract	Deep learning solutions are being increasingly used in mobile applications. Although there are many open-source software tools for the development of deep learning solutions, there are no guidelines in one place in a unified manner for using these tools towards real-time deployment of these solutions on smartphones. From the variety of available deep learning tools, the most suited ones are used in this paper to enable real-time deployment of deep learning inference networks on smartphones. A uniform flow of implementation is devised for both Android and iOS smartphones. The advantage of using multi-threading to achieve or improve real-time throughputs is also showcased. A benchmarking framework consisting of accuracy, CPU/GPU consumption and real-time throughput is considered for validation purposes. The developed deployment approach allows deep learning models to be turned into real-time smartphone apps with ease based on publicly available deep learning and smartphone software tools. This approach is applied to six popular or representative convolutional neural network models and the validation results based on the benchmarking metrics are reported.
Tasks
Published	2019-01-08
URL	http://arxiv.org/abs/1901.02144v1
PDF	http://arxiv.org/pdf/1901.02144v1.pdf
PWC	https://paperswithcode.com/paper/guidelines-and-benchmarks-for-deployment-of
Repo
Framework

Multi-Representational Learning for Offline Signature Verification using Multi-Loss Snapshot Ensemble of CNNs


Title	Multi-Representational Learning for Offline Signature Verification using Multi-Loss Snapshot Ensemble of CNNs
Authors	Saeed Masoudnia, Omid Mersa, Babak N. Araabi, Abdol-Hossein Vahabie, Mohammad Amin Sadeghi, Majid Nili Ahmadabadi
Abstract	Offline Signature Verification (OSV) is a challenging pattern recognition task, especially in the presence of skilled forgeries that are not available during training. This study aims to tackle its challenges and meet the substantial need for generalization in OSV by examining different loss functions for Convolutional Neural Networks (CNNs). We approach OSV by asking two questions: (1) which classification loss provides more generalization for feature learning in OSV, and (2) how can the integration of different losses into a unified multi-loss function lead to an improved learning framework? These questions are studied based on an analysis of three loss functions: cross entropy, Cauchy-Schwarz divergence, and hinge loss. Given the complementary features of these losses, we combine them into a dynamic multi-loss function and propose a novel ensemble framework for using them simultaneously in a CNN. Our proposed Multi-Loss Snapshot Ensemble (MLSE) consists of several sequential trials. In each trial, a dominant loss function is selected from the multi-loss set, and the remaining losses act as a regularizer. Different trials learn diverse representations for each input based on the signature identification task. This multi-representation set is then employed for the verification task. An ensemble of SVMs is trained on these representations, and their decisions are finally combined according to the selection of the most generalizable SVM for each user. We conducted two sets of experiments based on two different protocols of OSV, i.e., writer-dependent and writer-independent, on three signature datasets: GPDS-Synthetic, MCYT, and UT-SIG. Based on the writer-dependent OSV protocol, we achieved substantial improvements over the best EERs in the literature. The results of the second set of experiments also confirmed the robustness to the arrival of new users enrolled in the OSV system.
Tasks
Published	2019-03-11
URL	http://arxiv.org/abs/1903.06536v1
PDF	http://arxiv.org/pdf/1903.06536v1.pdf
PWC	https://paperswithcode.com/paper/multi-representational-learning-for-offline
Repo
Framework

Controlling for Confounders in Multimodal Emotion Classification via Adversarial Learning


Title	Controlling for Confounders in Multimodal Emotion Classification via Adversarial Learning
Authors	Mimansa Jaiswal, Zakaria Aldeneh, Emily Mower Provost
Abstract	Various psychological factors affect how individuals express emotions. Yet, when we collect data intended for use in building emotion recognition systems, we often try to do so by creating paradigms that are designed just with a focus on eliciting emotional behavior. Algorithms trained with these types of data are unlikely to function outside of controlled environments because our emotions naturally change as a function of these other factors. In this work, we study how the multimodal expressions of emotion change when an individual is under varying levels of stress. We hypothesize that stress produces modulations that can hide the true underlying emotions of individuals and that we can make emotion recognition algorithms more generalizable by controlling for variations in stress. To this end, we use adversarial networks to decorrelate stress modulations from emotion representations. We study how stress alters acoustic and lexical emotional predictions, paying special attention to how modulations due to stress affect the transferability of learned emotion recognition models across domains. Our results show that stress is indeed encoded in trained emotion classifiers and that this encoding varies across levels of emotions and across the lexical and acoustic modalities. Our results also show that emotion recognition models that control for stress during training have better generalizability when applied to new domains, compared to models that do not control for stress during training. We conclude that it is necessary to consider the effect of extraneous psychological factors when building and testing emotion recognition models.
Tasks	Emotion Classification, Emotion Recognition
Published	2019-08-23
URL	https://arxiv.org/abs/1908.08979v1
PDF	https://arxiv.org/pdf/1908.08979v1.pdf
PWC	https://paperswithcode.com/paper/controlling-for-confounders-in-multimodal
Repo
Framework

Morpheus: A Deep Learning Framework For Pixel-Level Analysis of Astronomical Image Data


Title	Morpheus: A Deep Learning Framework For Pixel-Level Analysis of Astronomical Image Data
Authors	Ryan Hausen, Brant Robertson
Abstract	We present Morpheus, a new model for generating pixel-level morphological classifications of astronomical sources. Morpheus leverages advances in deep learning to perform source detection, source segmentation, and morphological classification pixel-by-pixel via a semantic segmentation algorithm adopted from the field of computer vision. By utilizing morphological information about the flux of real astronomical sources during object detection, Morpheus shows resiliency to false positive identifications of sources. We evaluate Morpheus by performing source detection, source segmentation, and morphological classification on the Hubble Space Telescope data in the GOODS South field, and demonstrate a high completeness in recovering known 3D-HST sources with H<26 AB. We release the code publicly, provide online demonstrations, and present an interactive visualization of the Morpheus results in GOODS South.
Tasks	Object Detection, Semantic Segmentation
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11248v1
PDF	https://arxiv.org/pdf/1906.11248v1.pdf
PWC	https://paperswithcode.com/paper/morpheus-a-deep-learning-framework-for-pixel
Repo
Framework

Implicit Regularization via Hadamard Product Over-Parametrization in High-Dimensional Linear Regression


Title	Implicit Regularization via Hadamard Product Over-Parametrization in High-Dimensional Linear Regression
Authors	Peng Zhao, Yun Yang, Qiao-Chu He
Abstract	We consider Hadamard product parametrization as a change-of-variable (over-parametrization) technique for solving least squares problems in the context of linear regression. Despite the non-convexity and exponentially many saddle points induced by the change-of-variable, we show that under certain conditions, this over-parametrization leads to implicit regularization: if we directly apply gradient descent to the residual sum of squares with sufficiently small initial values, then under a proper early stopping rule, the iterates converge to a nearly sparse rate-optimal solution with relatively better accuracy than explicitly regularized approaches. In particular, the resulting estimator does not suffer from extra bias due to explicit penalties, and can achieve the parametric root-$n$ rate (independent of the dimension) under proper conditions on the signal-to-noise ratio. We perform simulations to compare our methods with high-dimensional linear regression with explicit regularization. Our results illustrate the advantages of using implicit regularization via gradient descent after over-parametrization in sparse vector estimation.
Tasks
Published	2019-03-22
URL	http://arxiv.org/abs/1903.09367v1
PDF	http://arxiv.org/pdf/1903.09367v1.pdf
PWC	https://paperswithcode.com/paper/implicit-regularization-via-hadamard-product
Repo
Framework
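
The mechanism described in the abstract can be illustrated with a small sketch: write the coefficient vector as a Hadamard product beta = g ∘ l, run plain gradient descent on the residual sum of squares from a small initial value, and stop early. The initialization (g = alpha, l = 0), step size, step count, and the identity design in the example are all assumptions chosen for illustration, not the paper's settings.

```python
def hadamard_gd(X, y, alpha=1e-3, lr=0.1, steps=200):
    """Gradient descent on 0.5 * ||X (g * l) - y||^2 with beta = g * l elementwise.

    alpha is the small initial scale; a finite `steps` acts as early stopping.
    """
    n, p = len(X), len(X[0])
    g = [alpha] * p   # assumed initialization: g = alpha * 1
    l = [0.0] * p     # assumed initialization: l = 0
    for _ in range(steps):
        beta = [gi * li for gi, li in zip(g, l)]
        residual = [
            sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)
        ]
        # Gradient with respect to beta: X^T (X beta - y).
        grad_beta = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]
        # Chain rule: d/dg = grad_beta * l, d/dl = grad_beta * g (simultaneous update).
        g, l = (
            [gi - lr * gb * li for gi, li, gb in zip(g, l, grad_beta)],
            [li - lr * gb * gi for gi, li, gb in zip(g, l, grad_beta)],
        )
    return [gi * li for gi, li in zip(g, l)]


# Toy identity design: two strong signals, one tiny noise-level entry, one zero.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
beta_hat = hadamard_gd(identity, [3.0, 0.01, -2.0, 0.0])
print(beta_hat)
```

In this toy run the strong coordinates are fitted almost exactly, while the noise-level coordinate barely moves from zero: its growth rate is proportional to its size, so stopping early leaves it suppressed without an explicit penalty.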

Recommending Dream Jobs in a Biased Real World


Title	Recommending Dream Jobs in a Biased Real World
Authors	Nadia Fawaz
Abstract	Machine learning models learn what we teach them to learn. Machine learning is at the heart of recommender systems. If a machine learning model is trained on biased data, the resulting recommender system may reflect the biases in its recommendations. Biases arise at different stages in a recommender system, from existing societal biases in the data such as the professional gender gap, to biases introduced by the data collection or modeling processes. These biases impact the performance of various components of recommender systems, from offline training, to evaluation and online serving of recommendations in production systems. Specific techniques can help reduce bias at each stage of a recommender system. Reducing bias in our recommender systems is crucial to successfully recommending dream jobs to hundreds of millions of members worldwide, while being true to LinkedIn’s vision: “To create economic opportunity for every member of the global workforce”.
Tasks	Recommendation Systems
Published	2019-05-10
URL	https://arxiv.org/abs/1905.06134v1
PDF	https://arxiv.org/pdf/1905.06134v1.pdf
PWC	https://paperswithcode.com/paper/190506134
Repo
Framework

The quadratic Wasserstein metric for inverse data matching


Title	The quadratic Wasserstein metric for inverse data matching
Authors	Bjorn Engquist, Kui Ren, Yunan Yang
Abstract	This work characterizes, analytically and numerically, two major effects of the quadratic Wasserstein ($W_2$) distance as the measure of data discrepancy in computational solutions of inverse problems. First, we show, in the infinite-dimensional setup, that the $W_2$ distance has a smoothing effect on the inversion process, making it robust against high-frequency noise in the data but leading to a reduced resolution for the reconstructed objects at a given noise level. Second, we demonstrate that for some finite-dimensional problems, the $W_2$ distance leads to optimization problems that have better convexity than the classical $L^2$ and $H^{-1}$ distances, making it a preferable distance to use when solving such inverse matching problems.
Tasks
Published	2019-11-15
URL	https://arxiv.org/abs/1911.06911v3
PDF	https://arxiv.org/pdf/1911.06911v3.pdf
PWC	https://paperswithcode.com/paper/the-quadratic-wasserstein-metric-for-inverse
Repo
Framework
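
For intuition about the $W_2$ distance as a data discrepancy, the one-dimensional case has a closed form: for two equal-size samples with uniform weights, the optimal transport plan matches sorted values. The sketch below covers only this simplified empirical setting, which is an assumption made for illustration and not the paper's continuous formulation.

```python
import math


def w2_distance_1d(xs, ys):
    """Quadratic Wasserstein distance between two equal-size 1D samples.

    Both samples are treated as uniform empirical measures, so the optimal
    coupling pairs the i-th smallest value of xs with the i-th smallest of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    squared = sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys)))
    return math.sqrt(squared / len(xs))


# A sample and a shifted, shuffled copy of it: the distance equals the shift.
print(w2_distance_1d([0.0, 1.0, 2.0], [3.0, 5.0, 4.0]))
```

Because a pure shift by s gives a distance of |s|, the squared distance is quadratic in the shift, which gives some intuition for the convexity advantage over pointwise misfits mentioned in the abstract.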

Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference


Title	Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference
Authors	Nelson Nauata, Yasutaka Furukawa
Abstract	This paper tackles a 2D architecture vectorization problem, whose task is to infer an outdoor building architecture as a 2D planar graph from a single RGB image. We provide a new benchmark with ground-truth annotations for 2,001 complex buildings across the cities of Atlanta, Paris, and Las Vegas. We also propose a novel algorithm utilizing 1) convolutional neural networks (CNNs) that detect geometric primitives and infer their relationships and 2) an integer program (IP) that assembles the information into a 2D planar graph. While it is a trivial task for human vision, the inference of a graph structure with an arbitrary topology is still an open problem for computer vision. Qualitative and quantitative evaluations demonstrate that our algorithm makes significant improvements over the current state-of-the-art, towards an intelligent system at the level of human perception. We will share code and data.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05135v3
PDF	https://arxiv.org/pdf/1912.05135v3.pdf
PWC	https://paperswithcode.com/paper/vectorizing-world-buildings-planar-graph
Repo
Framework