July 30, 2019


Paper Group AWR 30


Cross-linguistic differences and similarities in image descriptions


Title	Cross-linguistic differences and similarities in image descriptions
Authors	Emiel van Miltenburg, Desmond Elliott, Piek Vossen
Abstract	Automatic image description systems are commonly trained and evaluated on large image description datasets. Recently, researchers have started to collect such datasets for languages other than English. An unexplored question is how different these datasets are from English and, if there are any differences, what causes them to differ. This paper provides a cross-linguistic comparison of Dutch, English, and German image descriptions. We find that these descriptions are similar in many respects, but the familiarity of crowd workers with the subjects of the images has a noticeable influence on description specificity.
Tasks
Published	2017-07-06
URL	http://arxiv.org/abs/1707.01736v2
PDF	http://arxiv.org/pdf/1707.01736v2.pdf
PWC	https://paperswithcode.com/paper/cross-linguistic-differences-and-similarities
Repo	https://github.com/cltl/DutchDescriptions
Framework	none

HotFlip: White-Box Adversarial Examples for Text Classification


Title	HotFlip: White-Box Adversarial Examples for Text Classification
Authors	Javid Ebrahimi, Anyi Rao, Daniel Lowd, Dejing Dou
Abstract	We propose an efficient method to generate white-box adversarial examples to trick a character-level neural classifier. We find that only a few manipulations are needed to greatly decrease the accuracy. Our method relies on an atomic flip operation, which swaps one token for another, based on the gradients of the one-hot input vectors. Due to the efficiency of our method, we can perform adversarial training, which makes the model more robust to attacks at test time. With the use of a few semantics-preserving constraints, we demonstrate that HotFlip can be adapted to attack a word-level classifier as well.
Tasks	Text Classification
Published	2017-12-19
URL	http://arxiv.org/abs/1712.06751v2
PDF	http://arxiv.org/pdf/1712.06751v2.pdf
PWC	https://paperswithcode.com/paper/hotflip-white-box-adversarial-examples-for
Repo	https://github.com/AnyiRao/WordAdver
Framework	none
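
The flip operation described in the abstract can be illustrated with a small sketch. It assumes that the gain from swapping one token for another is estimated to first order as the difference between two entries of the gradient of the loss with respect to the one-hot input. The function name `best_flip` and the gradient values are hypothetical, and the code is not taken from the linked repository.

```python
def best_flip(grad, tokens):
    """Return (position, new_token, estimated_loss_increase).

    grad[i][v] is the gradient of the loss with respect to the one-hot
    entry for vocabulary item v at position i (hypothetical values here);
    tokens[i] is the index of the item currently at position i.
    """
    best = None
    for i, row in enumerate(grad):
        current = tokens[i]
        for v, g in enumerate(row):
            if v == current:
                continue  # replacing a token with itself is not a flip
            gain = g - row[current]  # first-order estimate of the loss change
            if best is None or gain > best[2]:
                best = (i, v, gain)
    return best


# Made-up gradients for a 3-character input over a 4-symbol vocabulary.
grad = [
    [0.1, -0.2, 0.0, 0.3],
    [0.5, 0.1, -0.4, 0.2],
    [-0.1, 0.0, 0.9, 0.1],
]
tokens = [3, 2, 0]
position, new_token, gain = best_flip(grad, tokens)
```

A single backward pass supplies every entry of `grad`, so all candidate flips can be ranked without re-running the classifier for each one.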

SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties


Title	SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties
Authors	Garrett B. Goh, Nathan O. Hodas, Charles Siegel, Abhinav Vishnu
Abstract	Chemical databases store information in text representations, and the SMILES format is a universal standard used in many cheminformatics software packages. Encoded in each SMILES string is structural information that can be used to predict complex chemical properties. In this work, we develop SMILES2vec, a deep RNN that automatically learns features from SMILES to predict chemical properties, without the need for additional explicit feature engineering. Using Bayesian optimization methods to tune the network architecture, we show that an optimized SMILES2vec model can serve as a general-purpose neural network for predicting distinct chemical properties including toxicity, activity, solubility and solvation energy, while also outperforming contemporary MLP neural networks that use engineered features. Furthermore, we demonstrate proof-of-concept of interpretability by developing an explanation mask that localizes on the most important characters used in making a prediction. When tested on the solubility dataset, it identified specific parts of a chemical that are consistent with established first-principles knowledge with an accuracy of 88%. Our work demonstrates that neural networks can learn technically accurate chemical concepts and provide state-of-the-art accuracy, making interpretable deep neural networks a useful tool of relevance to the chemical industry.
Tasks	Feature Engineering
Published	2017-12-06
URL	http://arxiv.org/abs/1712.02034v2
PDF	http://arxiv.org/pdf/1712.02034v2.pdf
PWC	https://paperswithcode.com/paper/smiles2vec-an-interpretable-general-purpose
Repo	https://github.com/liambll/drug-activity-prediction
Framework	tf

Sharp Minima Can Generalize For Deep Nets


Title	Sharp Minima Can Generalize For Deep Nets
Authors	Laurent Dinh, Razvan Pascanu, Samy Bengio, Yoshua Bengio
Abstract	Despite their overwhelming capacity to overfit, deep learning architectures tend to generalize relatively well to unseen data, allowing them to be deployed in practice. However, explaining why this is the case is still an open area of research. One standing hypothesis that is gaining popularity, e.g. Hochreiter & Schmidhuber (1997); Keskar et al. (2017), is that the flatness of minima of the loss function found by stochastic gradient based methods results in good generalization. This paper argues that most notions of flatness are problematic for deep models and cannot be directly applied to explain generalization. Specifically, when focusing on deep networks with rectifier units, we can exploit the particular geometry of parameter space induced by the inherent symmetries that these architectures exhibit to build equivalent models corresponding to arbitrarily sharper minima. Furthermore, if we allow a function to be reparametrized, the geometry of its parameters can change drastically without affecting its generalization properties.
Tasks
Published	2017-03-15
URL	http://arxiv.org/abs/1703.04933v2
PDF	http://arxiv.org/pdf/1703.04933v2.pdf
PWC	https://paperswithcode.com/paper/sharp-minima-can-generalize-for-deep-nets
Repo	https://github.com/timbrgr/yellow-brick-road-to-MrLd-city
Framework	tf
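
The symmetry the abstract refers to can be checked on a toy network. The sketch below assumes a two-layer rectifier network without biases and relies on the fact that relu(alpha * x) = alpha * relu(x) for any alpha > 0, so scaling the first layer by alpha and the second by 1/alpha leaves the function unchanged while changing the parameters. The weights, input, and helper names are illustrative and do not come from the paper or the linked repository.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def net(w1, w2, x):
    """Two-layer rectifier network without biases: w2 @ relu(w1 @ x)."""
    return matvec(w2, relu(matvec(w1, x)))


def rescale(w1, w2, alpha):
    """Scale layer 1 by alpha and layer 2 by 1/alpha (alpha must be > 0)."""
    scaled_w1 = [[alpha * a for a in row] for row in w1]
    scaled_w2 = [[a / alpha for a in row] for row in w2]
    return scaled_w1, scaled_w2


# Illustrative weights and input.
w1 = [[1.0, -2.0], [0.5, 1.5]]
w2 = [[2.0, -1.0]]
x = [3.0, 1.0]

original = net(w1, w2, x)
w1_scaled, w2_scaled = rescale(w1, w2, 4.0)
rescaled = net(w1_scaled, w2_scaled, x)  # same output, different parameters
```

Because the two parameter settings compute the same function, any flatness measure that changes under this rescaling cannot by itself explain generalization, which is the point the abstract makes.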

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning


Title	Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
Authors	Junhyuk Oh, Satinder Singh, Honglak Lee, Pushmeet Kohli
Abstract	As a step towards developing zero-shot task generalization capabilities in reinforcement learning (RL), we introduce a new RL problem where the agent should learn to execute sequences of instructions after learning useful skills that solve subtasks. In this problem, we consider two types of generalizations: to previously unseen instructions and to longer sequences of instructions. For generalization over unseen instructions, we propose a new objective which encourages learning correspondences between similar subtasks by making analogies. For generalization over sequential instructions, we present a hierarchical architecture where a meta controller learns to use the acquired skills for executing the instructions. To deal with delayed reward, we propose a new neural architecture in the meta controller that learns when to update the subtask, which makes learning more efficient. Experimental results on a stochastic 3D domain show that the proposed ideas are crucial for generalization to longer instructions as well as unseen instructions.
Tasks
Published	2017-06-15
URL	http://arxiv.org/abs/1706.05064v2
PDF	http://arxiv.org/pdf/1706.05064v2.pdf
PWC	https://paperswithcode.com/paper/zero-shot-task-generalization-with-multi-task
Repo	https://github.com/seriousssam/zero-shot-task-generalization-implementation
Framework	none

Emotion Intensities in Tweets


Title	Emotion Intensities in Tweets
Authors	Saif M. Mohammad, Felipe Bravo-Marquez
Abstract	This paper examines the task of detecting intensity of emotion from text. We create the first datasets of tweets annotated for anger, fear, joy, and sadness intensities. We use a technique called best–worst scaling (BWS) that improves annotation consistency and obtains reliable fine-grained scores. We show that emotion-word hashtags often impact emotion intensity, usually conveying a more intense emotion. Finally, we create a benchmark regression system and conduct experiments to determine which features are useful for detecting emotion intensity and the extent to which two emotions are similar in terms of how they manifest in language.
Tasks
Published	2017-08-11
URL	http://arxiv.org/abs/1708.03696v1
PDF	http://arxiv.org/pdf/1708.03696v1.pdf
PWC	https://paperswithcode.com/paper/emotion-intensities-in-tweets
Repo	https://github.com/felipebravom/AffectiveTweets
Framework	none
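
Best–worst scaling annotations are commonly turned into real-valued scores by counting. The sketch below assumes the usual procedure: each annotation presents a small set of items and records which one was chosen as best (most intense) and which as worst (least intense), and an item's score is the fraction of times it was chosen best minus the fraction of times it was chosen worst. The tweet identifiers and the function name `bws_scores` are hypothetical, and any rescaling applied in the paper is not reproduced.

```python
from collections import Counter


def bws_scores(annotations):
    """Score items from (items, best, worst) annotation tuples.

    Returns a dict mapping each item to a value in [-1, 1]:
    (times chosen best - times chosen worst) / times shown.
    """
    seen, best_n, worst_n = Counter(), Counter(), Counter()
    for items, best, worst in annotations:
        seen.update(items)
        best_n[best] += 1
        worst_n[worst] += 1
    return {item: (best_n[item] - worst_n[item]) / seen[item] for item in seen}


# Three hypothetical annotations of the same 4-tweet set.
tweets = ("t1", "t2", "t3", "t4")
annotations = [
    (tweets, "t1", "t4"),
    (tweets, "t1", "t3"),
    (tweets, "t2", "t4"),
]
scores = bws_scores(annotations)
```

Annotators only make relative judgments, which is why the technique tends to give more consistent results than asking for an absolute intensity rating.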

EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION


Title	EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION
Authors	Steffen Eger, Erik-Lân Do Dinh, Ilia Kuznetsov, Masoud Kiaeeha, Iryna Gurevych
Abstract	This paper describes our approach to the SemEval 2017 Task 10: “Extracting Keyphrases and Relations from Scientific Publications”, specifically to Subtask (B): “Classification of identified keyphrases”. We explored three different deep learning approaches: a character-level convolutional neural network (CNN), a stacked learner with an MLP meta-classifier, and an attention based Bi-LSTM. From these approaches, we created an ensemble of differently hyper-parameterized systems, achieving a micro-F1-score of 0.63 on the test data. Our approach ranks 2nd (score of 1st placed system: 0.64) out of four according to this official score. However, we erroneously trained 2 out of 3 neural nets (the stacker and the CNN) on only roughly 15% of the full data, namely, the original development set. When trained on the full data (training+development), our ensemble has a micro-F1-score of 0.69. Our code is available from https://github.com/UKPLab/semeval2017-scienceie.
Tasks
Published	2017-04-07
URL	http://arxiv.org/abs/1704.02215v2
PDF	http://arxiv.org/pdf/1704.02215v2.pdf
PWC	https://paperswithcode.com/paper/eelection-at-semeval-2017-task-10-ensemble-of
Repo	https://github.com/UKPLab/semeval2017-scienceie
Framework	tf
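
The micro-F1 score reported in the abstract pools true positives, false positives, and false negatives over all classes before computing precision and recall. The sketch below shows that computation on made-up per-class counts; the class names and numbers are hypothetical and the official task scorer is not reproduced.

```python
def micro_f1(counts):
    """Micro-averaged F1 from a dict of class -> (tp, fp, fn) counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0  # no correct predictions, so F1 is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for three keyphrase classes.
counts = {
    "class_a": (8, 2, 4),
    "class_b": (3, 1, 2),
    "class_c": (9, 3, 0),
}
score = micro_f1(counts)
```

Because counts are pooled, frequent classes dominate the score, unlike macro-averaging, which weights every class equally.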

FEUP at SemEval-2017 Task 5: Predicting Sentiment Polarity and Intensity with Financial Word Embeddings


Title	FEUP at SemEval-2017 Task 5: Predicting Sentiment Polarity and Intensity with Financial Word Embeddings
Authors	Pedro Saleiro, Eduarda Mendes Rodrigues, Carlos Soares, Eugénio Oliveira
Abstract	This paper presents the approach developed at the Faculty of Engineering of the University of Porto to participate in SemEval 2017, Task 5: Fine-grained Sentiment Analysis on Financial Microblogs and News. The task consisted of predicting a real continuous variable from -1.0 to +1.0 representing the polarity and intensity of sentiment concerning companies/stocks mentioned in short texts. We modeled the task as a regression analysis problem and combined traditional techniques such as pre-processing short texts, bag-of-words representations and lexical-based features with enhanced finance-specific bag-of-embeddings. We used an external collection of tweets and news headlines mentioning companies/stocks from S&P 500 to create financial word embeddings which are able to capture domain-specific syntactic and semantic similarities. The resulting approach obtained a cosine similarity score of 0.69 in sub-task 5.1 - Microblogs and 0.68 in sub-task 5.2 - News Headlines.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-04-17
URL	http://arxiv.org/abs/1704.05091v1
PDF	http://arxiv.org/pdf/1704.05091v1.pdf
PWC	https://paperswithcode.com/paper/feup-at-semeval-2017-task-5-predicting
Repo	https://github.com/saleiro/Financial-Sentiment-Analysis
Framework	none
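
The results in the abstract are reported as cosine similarity between predicted and gold sentiment scores. The sketch below computes plain cosine similarity between two score vectors; the values are made up, and any extra weighting that the official task scorer may apply is not reproduced.

```python
import math


def cosine_similarity(gold, predicted):
    """Cosine of the angle between two equal-length score vectors."""
    dot = sum(g * p for g, p in zip(gold, predicted))
    norm = math.sqrt(sum(g * g for g in gold)) * math.sqrt(
        sum(p * p for p in predicted)
    )
    return dot / norm if norm else 0.0  # define similarity as 0 for a zero vector


# Hypothetical gold and predicted sentiment scores in [-1.0, +1.0].
gold = [0.6, -0.8, 0.0]
same = cosine_similarity(gold, [0.6, -0.8, 0.0])
opposite = cosine_similarity(gold, [-0.6, 0.8, 0.0])
unrelated = cosine_similarity(gold, [0.8, 0.6, 0.0])
```

Cosine similarity rewards predictions that agree with the gold scores in sign and relative magnitude, and it is insensitive to a uniform positive rescaling of the predictions.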

NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis


Title	NILC-USP at SemEval-2017 Task 4: A Multi-view Ensemble for Twitter Sentiment Analysis
Authors	Edilson A. Corrêa Jr., Vanessa Queiroz Marinho, Leandro Borges dos Santos
Abstract	This paper describes our multi-view ensemble approach to SemEval-2017 Task 4 on Sentiment Analysis in Twitter, specifically, the Message Polarity Classification subtask for English (subtask A). Our system is a voting ensemble, where each base classifier is trained in a different feature space. The first space is a bag-of-words model and has a Linear SVM as base classifier. The second and third spaces are two different strategies of combining word embeddings to represent sentences and use a Linear SVM and a Logistic Regressor as base classifiers. The proposed system was ranked 18th out of 38 systems considering F1 score and 20th considering recall.
Tasks	Sentiment Analysis, Twitter Sentiment Analysis, Word Embeddings
Published	2017-04-07
URL	http://arxiv.org/abs/1704.02263v1
PDF	http://arxiv.org/pdf/1704.02263v1.pdf
PWC	https://paperswithcode.com/paper/nilc-usp-at-semeval-2017-task-4-a-multi-view
Repo	https://github.com/edilsonacjr/semeval2017
Framework	none
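
A voting ensemble of the kind described in the abstract can be sketched as a majority vote over the labels produced by each base classifier. The abstract does not say how votes are combined or how ties are broken, so the sketch assumes plain hard voting; the variable names and predictions are hypothetical and stand in for the three feature spaces.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label lists (one per classifier) by per-message majority vote."""
    voted = []
    for labels in zip(*predictions_per_classifier):
        # most_common(1) returns the label with the highest count; on a tie it
        # returns the label that was encountered first among the tied ones.
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical predictions for three tweets from three base classifiers.
bow_svm = ["positive", "negative", "neutral"]
emb_svm = ["positive", "neutral", "neutral"]
emb_logreg = ["negative", "negative", "neutral"]
final = majority_vote([bow_svm, emb_svm, emb_logreg])
```

Training each base classifier on a different feature space is what makes the votes informative, since classifiers that see the same features tend to make the same mistakes.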

Entropy Non-increasing Games for the Improvement of Dataflow Programming


Title	Entropy Non-increasing Games for the Improvement of Dataflow Programming
Authors	Norbert Bátfai, Renátó Besenczi, Gergő Bogacsovics, Fanny Monori
Abstract	In this article, we introduce a new conception of a family of esport games called Samu Entropy to try to improve dataflow program graphs like the ones that are based on Google’s TensorFlow. Currently, the Samu Entropy project specifies only requirements for new esport games to be developed with particular attention to the investigation of the relationship between esport and artificial intelligence. It is quite obvious that there is a very close and natural relationship between esport games and artificial intelligence. Furthermore, the project Samu Entropy focuses not only on using artificial intelligence, but on creating AI in a new way. We present a reference game called Face Battle that implements the Samu Entropy requirements.
Tasks
Published	2017-02-14
URL	http://arxiv.org/abs/1702.04389v1
PDF	http://arxiv.org/pdf/1702.04389v1.pdf
PWC	https://paperswithcode.com/paper/entropy-non-increasing-games-for-the
Repo	https://github.com/nbatfai/SamuEntropy
Framework	none

Deep Incremental Boosting


Title	Deep Incremental Boosting
Authors	Alan Mosca, George D Magoulas
Abstract	This paper introduces Deep Incremental Boosting, a new technique derived from AdaBoost, specifically adapted to work with Deep Learning methods, that reduces the required training time and improves generalisation. We draw inspiration from Transfer of Learning approaches to reduce the start-up time to training each incremental Ensemble member. We show a set of experiments that outlines some preliminary results on some common Deep Learning datasets and discuss the potential improvements Deep Incremental Boosting brings to traditional Ensemble methods in Deep Learning.
Tasks
Published	2017-08-11
URL	http://arxiv.org/abs/1708.03704v1
PDF	http://arxiv.org/pdf/1708.03704v1.pdf
PWC	https://paperswithcode.com/paper/deep-incremental-boosting
Repo	https://github.com/nitbix/toupee
Framework	none

SPINE: SParse Interpretable Neural Embeddings


Title	SPINE: SParse Interpretable Neural Embeddings
Authors	Anant Subramanian, Danish Pruthi, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Eduard Hovy
Abstract	Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations. While these representations capture the underlying complexity and latent trends in the data, they are far from being interpretable. We propose a novel variant of denoising k-sparse autoencoders that generates highly efficient and interpretable distributed word representations (word embeddings), beginning with existing word representations from state-of-the-art methods like GloVe and word2vec. Through large scale human evaluation, we report that our resulting word embeddings are much more interpretable than the original GloVe and word2vec embeddings. Moreover, our embeddings outperform existing popular word embeddings on a diverse suite of benchmark downstream tasks.
Tasks	Denoising, Word Embeddings
Published	2017-11-23
URL	http://arxiv.org/abs/1711.08792v1
PDF	http://arxiv.org/pdf/1711.08792v1.pdf
PWC	https://paperswithcode.com/paper/spine-sparse-interpretable-neural-embeddings
Repo	https://github.com/harsh19/SPINE
Framework	pytorch
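
For background on the k-sparse autoencoders the abstract builds on, the sketch below shows the standard k-sparse step: keep the k largest hidden activations and zero out the rest. This is an assumption about the baseline technique only. The abstract describes SPINE as a variant, so its actual sparsity mechanism may differ, and the function name and values here are hypothetical.

```python
def k_sparse(activations, k):
    """Keep the k largest activations and set all others to zero."""
    if k <= 0:
        return [0.0] * len(activations)
    # Indices of the k largest values (ties resolved by position).
    order = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]


# Hypothetical hidden-layer activations for one word.
hidden = [0.2, 0.9, 0.1, 0.7]
sparse = k_sparse(hidden, 2)
```

Sparse codes of this kind are easier to inspect because each word activates only a few dimensions, which is the interpretability argument the abstract makes.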

Face Attention Network: An Effective Face Detector for the Occluded Faces


Title	Face Attention Network: An Effective Face Detector for the Occluded Faces
Authors	Jianfeng Wang, Ye Yuan, Gang Yu
Abstract	The performance of face detection has been largely improved with the development of convolutional neural networks. However, occlusion caused by masks and sunglasses is still a challenging problem. Improving the recall on these occluded cases usually brings the risk of high false positives. In this paper, we present a novel face detector called Face Attention Network (FAN), which can significantly improve the recall of the face detection problem in the occluded case without compromising the speed. More specifically, we propose a new anchor-level attention, which will highlight the features from the face region. Integrated with our anchor assign strategy and data augmentation techniques, we obtain state-of-the-art results on public face detection benchmarks like WiderFace and MAFA. The code will be released for reproduction.
Tasks	Data Augmentation, Face Detection, Occluded Face Detection
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07246v2
PDF	http://arxiv.org/pdf/1711.07246v2.pdf
PWC	https://paperswithcode.com/paper/face-attention-network-an-effective-face
Repo	https://github.com/rainofmine/Face_Attention_Network
Framework	pytorch

End-to-end Conversation Modeling Track in DSTC6


Title	End-to-end Conversation Modeling Track in DSTC6
Authors	Chiori Hori, Takaaki Hori
Abstract	End-to-end training of neural networks is a promising approach to automatic construction of dialog systems using a human-to-human dialog corpus. Recently, Vinyals et al. tested neural conversation models using OpenSubtitles. Lowe et al. released the Ubuntu Dialogue Corpus for researching unstructured multi-turn dialogue systems. Furthermore, the approach has been extended to accomplish task-oriented dialogs to provide information properly with natural conversation. For example, Ghazvininejad et al. proposed a knowledge-grounded neural conversation model [3], where the research is aiming at combining conversational dialogs with task-oriented knowledge using unstructured data such as Twitter data for conversation and Foursquare data for external knowledge. However, the task is still limited to a restaurant information service, and has not yet been tested with a wide variety of dialog tasks. In addition, it is still unclear how to create intelligent dialog systems that can respond like a human agent. In consideration of these problems, we proposed a challenge track to the 6th Dialog System Technology Challenges (DSTC6) using human-to-human dialog data to mimic human dialog behaviors. The focus of the challenge track is to train end-to-end conversation models from human-to-human conversation and accomplish end-to-end dialog tasks in various situations assuming a customer service, in which a system plays the role of a human agent and generates natural and informative sentences in response to the user’s questions or comments given the dialog context.
Tasks
Published	2017-06-22
URL	http://arxiv.org/abs/1706.07440v2
PDF	http://arxiv.org/pdf/1706.07440v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-conversation-modeling-track-in
Repo	https://github.com/dialogtekgeek/DSTC6-End-to-End-Conversation-Modeling
Framework	none

A Closer Look at Spatiotemporal Convolutions for Action Recognition


Title	A Closer Look at Spatiotemporal Convolutions for Action Recognition
Authors	Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri
Abstract	In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition. Our motivation stems from the observation that 2D CNNs applied to individual frames of the video have remained solid performers in action recognition. In this work we empirically demonstrate the accuracy advantages of 3D CNNs over 2D CNNs within the framework of residual learning. Furthermore, we show that factorizing the 3D convolutional filters into separate spatial and temporal components yields significant advantages in accuracy. Our empirical study leads to the design of a new spatiotemporal convolutional block “R(2+1)D” which gives rise to CNNs that achieve results comparable or superior to the state-of-the-art on Sports-1M, Kinetics, UCF101 and HMDB51.
Tasks	Action Recognition In Videos, Temporal Action Localization
Published	2017-11-30
URL	http://arxiv.org/abs/1711.11248v3
PDF	http://arxiv.org/pdf/1711.11248v3.pdf
PWC	https://paperswithcode.com/paper/a-closer-look-at-spatiotemporal-convolutions
Repo	https://github.com/facebookresearch/R2Plus1D
Framework	caffe2
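
The factorization mentioned in the abstract replaces one t x d x d 3D convolution with a 1 x d x d spatial convolution followed by a t x 1 x 1 temporal convolution. The sketch below compares weight counts for the two designs, ignoring biases. The rule for choosing the intermediate channel count so that both blocks have roughly the same number of parameters is an assumption recalled from the paper rather than stated in the abstract, and the function names are hypothetical.

```python
def conv3d_params(n_in, n_out, t, d):
    """Weights in a full t x d x d 3D convolution (biases ignored)."""
    return n_in * t * d * d * n_out


def mid_channels(n_in, n_out, t, d):
    """Intermediate channels that roughly match the 3D parameter count.

    Assumed rule: floor(t * d^2 * n_in * n_out / (d^2 * n_in + t * n_out)).
    """
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def conv2plus1d_params(n_in, n_out, t, d):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    m = mid_channels(n_in, n_out, t, d)
    spatial = n_in * d * d * m
    temporal = m * t * n_out
    return spatial + temporal


# Example: 64 -> 64 channels with a 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3)
factored = conv2plus1d_params(64, 64, 3, 3)
```

With matched parameter counts, the factored block still inserts an extra nonlinearity between the spatial and temporal convolutions, which is one plausible source of the accuracy gain the abstract reports.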