Paper Group AWR 199
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM. MegDet: A Large Mini-Batch Object Detector. Universal Dependencies to Logical Forms with Negation Scope. Clickbait Detection in Tweets Using Self-attentive Network. Visualizing and Understanding Atari Agents …
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Title | Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics |
Authors | Alex Kendall, Yarin Gal, Roberto Cipolla |
Abstract | Numerous deep learning applications benefit from multi-task learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task’s loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice. We propose a principled approach to multi-task deep learning which weighs multiple loss functions by considering the homoscedastic uncertainty of each task. This allows us to simultaneously learn various quantities with different units or scales in both classification and regression settings. We demonstrate our model learning per-pixel depth regression, semantic and instance segmentation from a monocular input image. Perhaps surprisingly, we show our model can learn multi-task weightings and outperform separate models trained individually on each task. |
Tasks | Instance Segmentation, Multi-Task Learning, Semantic Segmentation |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07115v3 |
PDF | http://arxiv.org/pdf/1705.07115v3.pdf |
PWC | https://paperswithcode.com/paper/multi-task-learning-using-uncertainty-to |
Repo | https://github.com/lorenmt/mtan |
Framework | pytorch |
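As a rough illustration of the uncertainty weighting described above, the following PyTorch-style sketch learns one log-variance per task and combines the task losses accordingly. It uses the widely adopted simplification exp(-s_i)·L_i + s_i with s_i = log σ_i²; the ½ factor the paper applies to regression terms is omitted, and the task losses themselves are placeholders.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i,
    where s_i = log(sigma_i^2) is a learned log-variance for task i."""

    def __init__(self, num_tasks):
        super().__init__()
        # One learnable log-variance per task, initialised to 0 (sigma = 1).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total

# Usage: optimise the log-variances jointly with the network weights, e.g.
# criterion = UncertaintyWeightedLoss(num_tasks=3)
# total_loss = criterion([semantic_loss, instance_loss, depth_loss])
```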
Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM
Title | Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM |
Authors | Mohamed Eldesouki, Younes Samih, Ahmed Abdelali, Mohammed Attia, Hamdy Mubarak, Kareem Darwish, Laura Kallmeyer |
Abstract | Arabic word segmentation is essential for a variety of NLP applications such as machine translation and information retrieval. Segmentation entails breaking words into their constituent stems, affixes and clitics. In this paper, we compare two approaches for segmenting four major Arabic dialects using only several thousand training examples for each dialect. The two approaches involve posing the problem as a ranking problem, where an SVM ranker picks the best segmentation, and as a sequence labeling problem, where a bi-LSTM RNN coupled with CRF determines where best to segment words. We are able to achieve solid segmentation results for all dialects using rather limited training data. We also show that employing Modern Standard Arabic data for domain adaptation and assuming context independence improve overall results. |
Tasks | Domain Adaptation, Information Retrieval, Machine Translation |
Published | 2017-08-19 |
URL | http://arxiv.org/abs/1708.05891v1 |
PDF | http://arxiv.org/pdf/1708.05891v1.pdf |
PWC | https://paperswithcode.com/paper/arabic-multi-dialect-segmentation-bi-lstm-crf |
Repo | https://github.com/qcri/dialectal_arabic_segmenter |
Framework | tf |
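For orientation, here is a minimal character-level BiLSTM tagger of the kind used in the sequence-labeling approach above; it emits a boundary tag per character. The CRF layer the paper places on top of these emissions is omitted, and the B/M/E/S tag set is an assumption of this sketch.

```python
import torch
import torch.nn as nn

class CharBiLSTMTagger(nn.Module):
    """Character-level BiLSTM segmenter: every character receives a boundary
    tag (e.g. B/M/E/S). The paper adds a CRF layer over these emissions."""

    def __init__(self, vocab_size, num_tags, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)

    def forward(self, char_ids):                 # (batch, seq_len)
        h, _ = self.lstm(self.embed(char_ids))   # (batch, seq_len, 2 * hidden)
        return self.proj(h)                      # per-character tag scores
```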
MegDet: A Large Mini-Batch Object Detector
Title | MegDet: A Large Mini-Batch Object Detector |
Authors | Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun |
Abstract | The improvements in recent CNN-based object detection works, from R-CNN [11], Fast/Faster R-CNN [10, 31] to recent Mask R-CNN [14] and RetinaNet [24], mainly come from new networks, new frameworks, or novel loss designs. But mini-batch size, a key factor in training, has not been well studied. In this paper, we propose a Large Mini-Batch Object Detector (MegDet) to enable training with a much larger mini-batch size than before (e.g. from 16 to 256), so that we can effectively utilize multiple GPUs (up to 128 in our experiments) to significantly shorten the training time. Technically, we suggest a learning rate policy and Cross-GPU Batch Normalization, which together allow us to successfully train a large mini-batch detector in much less time (e.g., from 33 hours to 4 hours), and achieve even better accuracy. MegDet is the backbone of our submission (mmAP 52.5%) to the COCO 2017 Challenge, where we won 1st place in the Detection task. |
Tasks | Object Detection |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07240v4 |
PDF | http://arxiv.org/pdf/1711.07240v4.pdf |
PWC | https://paperswithcode.com/paper/megdet-a-large-mini-batch-object-detector |
Repo | https://github.com/CSAILVision/semantic-segmentation-pytorch |
Framework | pytorch |
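A minimal sketch of the large-batch learning-rate policy the abstract refers to: scale the base learning rate with the mini-batch size and ramp it up linearly during a warmup phase. All constants here are illustrative, and Cross-GPU Batch Normalization itself is not shown.

```python
def warmup_lr(step, base_lr=0.02, batch_scale=16, warmup_steps=500):
    """Linear-scaling rule with warmup: the target learning rate grows in
    proportion to the mini-batch size (e.g. 256 / 16 = 16x), and the first
    `warmup_steps` iterations ramp up linearly to avoid early divergence."""
    target_lr = base_lr * batch_scale
    if step < warmup_steps:
        return target_lr * (step + 1) / warmup_steps
    return target_lr
```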
Universal Dependencies to Logical Forms with Negation Scope
Title | Universal Dependencies to Logical Forms with Negation Scope |
Authors | Federico Fancellu, Siva Reddy, Adam Lopez, Bonnie Webber |
Abstract | Many language technology applications would benefit from the ability to represent negation and its scope on top of widely-used linguistic resources. In this paper, we investigate the possibility of obtaining a first-order logic representation with negation scope marked using Universal Dependencies. To do so, we enhance UDepLambda, a framework that converts dependency graphs to logical forms. The resulting UDepLambda$\lnot$ is able to handle phenomena related to scope by means of a higher-order type theory, relevant not only to negation but also to universal quantification and other complex semantic phenomena. The initial conversion we did for English is promising, in that one can represent the scope of negation also in the presence of more complex phenomena such as universal quantifiers. |
Tasks | |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03305v1 |
PDF | http://arxiv.org/pdf/1702.03305v1.pdf |
PWC | https://paperswithcode.com/paper/universal-dependencies-to-logical-forms-with |
Repo | https://github.com/sivareddyg/udeplambda |
Framework | none |
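As a purely illustrative example of what a first-order representation with negation scope marked can look like (a generic neo-Davidsonian form, not UDepLambda$\lnot$'s actual output notation):

```latex
% "John did not leave": the existential event quantifier falls
% inside the scope of the negation operator.
\neg \exists e.\, \mathit{leave}(e) \land \mathit{arg}_1(e, \mathit{John})
```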
Clickbait Detection in Tweets Using Self-attentive Network
Title | Clickbait Detection in Tweets Using Self-attentive Network |
Authors | Yiwei Zhou |
Abstract | Clickbait detection in tweets remains an elusive challenge. In this paper, we describe the solution for the Zingel Clickbait Detector at the Clickbait Challenge 2017, which is capable of evaluating each tweet’s level of clickbaiting. We first reformat the regression problem as a multi-classification problem, based on the annotation scheme. To perform multi-classification, we apply a token-level, self-attentive mechanism on the hidden states of bi-directional Gated Recurrent Units (biGRU), which enables the model to generate tweets’ task-specific vector representations by attending to important tokens. The self-attentive neural network can be trained end-to-end, without involving any manual feature engineering. Our detector ranked first in the final evaluation of Clickbait Challenge 2017. |
Tasks | Clickbait Detection, Feature Engineering |
Published | 2017-10-15 |
URL | http://arxiv.org/abs/1710.05364v1 |
PDF | http://arxiv.org/pdf/1710.05364v1.pdf |
PWC | https://paperswithcode.com/paper/clickbait-detection-in-tweets-using-self |
Repo | https://github.com/zhouyiwei/cc |
Framework | tf |
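The following PyTorch-style sketch shows the general shape of a token-level self-attentive biGRU classifier as described above: attention scores over the biGRU hidden states produce a weighted tweet representation that is then classified. Dimensions and the number of output classes are placeholders.

```python
import torch
import torch.nn as nn

class SelfAttentiveBiGRU(nn.Module):
    """biGRU encoder with token-level self-attention pooling."""

    def __init__(self, vocab_size, num_classes, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.att_proj = nn.Linear(2 * hidden, 2 * hidden)
        self.att_score = nn.Linear(2 * hidden, 1, bias=False)
        self.classify = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):                              # (batch, seq_len)
        h, _ = self.gru(self.embed(token_ids))                 # (batch, seq_len, 2*hidden)
        scores = self.att_score(torch.tanh(self.att_proj(h)))  # (batch, seq_len, 1)
        alpha = torch.softmax(scores, dim=1)                   # attention over tokens
        tweet_vec = (alpha * h).sum(dim=1)                     # task-specific representation
        return self.classify(tweet_vec)
```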
Visualizing and Understanding Atari Agents
Title | Visualizing and Understanding Atari Agents |
Authors | Sam Greydanus, Anurag Koul, Jonathan Dodge, Alan Fern |
Abstract | While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating useful saliency maps and use it to show 1) what strong agents attend to, 2) whether agents are making decisions for the right or wrong reasons, and 3) how agents evolve during learning. We also test our method on non-expert human subjects and find that it improves their ability to reason about these agents. Overall, our results show that saliency information can provide significant insight into an RL agent’s decisions and learning behavior. |
Tasks | |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1711.00138v5 |
PDF | http://arxiv.org/pdf/1711.00138v5.pdf |
PWC | https://paperswithcode.com/paper/visualizing-and-understanding-atari-agents |
Repo | https://github.com/slowjazz/interactive-atari-RL |
Framework | pytorch |
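A simplified sketch of perturbation-based saliency in the spirit of this paper: blur a small region of the input frame, re-run the policy, and score the region by how much the policy output changes. The paper interpolates with a Gaussian mask rather than replacing hard patches, so treat the constants and the patch scheme as assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def perturbation_saliency(policy, frame, patch=5, sigma=3.0):
    """Score each region of `frame` by how much blurring it changes the policy
    output. `policy` is assumed to map an HxW float array to action logits."""
    base = policy(frame)
    blurred = gaussian_filter(frame, sigma)
    saliency = np.zeros_like(frame, dtype=float)
    for i in range(0, frame.shape[0], patch):
        for j in range(0, frame.shape[1], patch):
            perturbed = frame.copy()
            perturbed[i:i + patch, j:j + patch] = blurred[i:i + patch, j:j + patch]
            saliency[i:i + patch, j:j + patch] = 0.5 * np.sum((policy(perturbed) - base) ** 2)
    return saliency
```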
A Correlation Based Feature Representation for First-Person Activity Recognition
Title | A Correlation Based Feature Representation for First-Person Activity Recognition |
Authors | Reza Kahani, Alireza Talebpour, Ahmad Mahmoudi-Aznaveh |
Abstract | In this paper, a simple yet efficient activity recognition method for first-person video is introduced. The proposed method is appropriate for representation of high-dimensional features such as those extracted from convolutional neural networks (CNNs). The per-frame (per-segment) extracted features are considered as a set of time series, and inter- and intra-time-series relations are employed to represent the video descriptors. To find the inter-time-series relations, the series are grouped and the linear correlation between each pair of groups is calculated. The relations between them can represent the scene dynamics and local motions. The introduced grouping strategy helps to considerably reduce the computational cost. Furthermore, we split the series in the temporal direction in order to preserve long-term motions and better focus on each local time window. In order to extract the cyclic motion patterns, which can be considered as primary components of various activities, intra-time-series correlations are exploited. The representation method results in highly discriminative features which can be linearly classified. The experiments confirm that our method outperforms state-of-the-art methods at recognizing first-person activities on two challenging first-person datasets. |
Tasks | Activity Recognition, Egocentric Activity Recognition, Time Series |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05523v2 |
PDF | http://arxiv.org/pdf/1711.05523v2.pdf |
PWC | https://paperswithcode.com/paper/a-correlation-based-feature-representation |
Repo | https://github.com/rkahani/FirstPersonActivityRecognition |
Framework | none |
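A rough numpy sketch of the inter-group correlation descriptor described above: treat each dimension of the (T, D) per-frame feature matrix as a time series, partition the series into groups, and use the correlations between group averages as the video descriptor. Contiguous grouping and group averaging are simplifying assumptions of this sketch.

```python
import numpy as np

def correlation_descriptor(features, num_groups=8):
    """Inter-group correlation descriptor for a (T, D) per-frame feature matrix."""
    groups = np.array_split(features, num_groups, axis=1)      # split along feature dim
    group_series = np.stack([g.mean(axis=1) for g in groups])  # (num_groups, T)
    corr = np.corrcoef(group_series)                           # pairwise linear correlation
    iu = np.triu_indices(num_groups, k=1)
    return corr[iu]                                            # inter-group relations
```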
Grammatical facial expression recognition using customized deep neural network architecture
Title | Grammatical facial expression recognition using customized deep neural network architecture |
Authors | Devesh Walawalkar |
Abstract | This paper proposes to expand the visual understanding capacity of computers by helping them recognize human sign language more efficiently. This is carried out through recognition of the facial expressions that accompany the hand signs used in this language. The paper focuses on the popular Brazilian sign language (LIBRAS). While classifying different hand signs into their respective word meanings has already received considerable attention in the literature, the emotions or intentions with which the words are expressed have largely not been taken into consideration. As we know from everyday human experience, words expressed with different emotions or moods can carry completely different meanings. Giving computers the ability to classify these facial expressions can add another level of understanding of what a deaf person actually wants to communicate. The proposed idea is implemented through a deep neural network with a customized architecture, which learns the specific patterns of individual expressions better than a generic approach. With an overall accuracy of 98.04%, the implemented deep network performs well and is thus fit for practical use. |
Tasks | Facial Expression Recognition |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06303v1 |
PDF | http://arxiv.org/pdf/1711.06303v1.pdf |
PWC | https://paperswithcode.com/paper/grammatical-facial-expression-recognition |
Repo | https://github.com/rohithv/Grammatical-facial-expression-recognition |
Framework | none |
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
Title | A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes |
Authors | Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo |
Abstract | We propose a model to automatically describe changes introduced in the source code of a program using natural language. Our method receives as input a set of code commits, which contains both the modifications and the message introduced by a user. These two modalities are used to train an encoder-decoder architecture. We evaluated our approach on twelve real-world open source projects from four different programming languages. Quantitative and qualitative results showed that the proposed approach can generate feasible and semantically sound descriptions not only in standard in-project settings, but also in a cross-project setting. |
Tasks | |
Published | 2017-04-17 |
URL | http://arxiv.org/abs/1704.04856v1 |
PDF | http://arxiv.org/pdf/1704.04856v1.pdf |
PWC | https://paperswithcode.com/paper/a-neural-architecture-for-generating-natural |
Repo | https://github.com/epochx/commitgen |
Framework | torch |
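A bare-bones encoder-decoder sketch for the setting described above, mapping a tokenised code change to a natural-language message. Attention, beam search, and the paper's exact preprocessing are omitted; all dimensions are placeholders.

```python
import torch
import torch.nn as nn

class CommitMessageSeq2Seq(nn.Module):
    """GRU encoder over the code change, GRU decoder over the message."""

    def __init__(self, src_vocab, tgt_vocab, emb=128, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, emb, padding_idx=0)
        self.tgt_embed = nn.Embedding(tgt_vocab, emb, padding_idx=0)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, diff_ids, msg_ids):
        _, state = self.encoder(self.src_embed(diff_ids))       # summarise the code change
        dec, _ = self.decoder(self.tgt_embed(msg_ids), state)   # teacher-forced decoding
        return self.out(dec)                                    # next-token logits
```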
PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts
Title | PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts |
Authors | Franck Dernoncourt, Ji Young Lee |
Abstract | We present PubMed 200k RCT, a new dataset based on PubMed for sequential sentence classification. The dataset consists of approximately 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. Each sentence of each abstract is labeled with its role in the abstract using one of the following classes: background, objective, method, result, or conclusion. The purpose of releasing this dataset is twofold. First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task. Second, from an application perspective, researchers need better tools to efficiently skim through the literature. Automatically classifying each sentence in an abstract would help researchers read abstracts more efficiently, especially in fields where abstracts may be long, such as the medical field. |
Tasks | Sentence Classification, Text Classification |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.06071v1 |
PDF | http://arxiv.org/pdf/1710.06071v1.pdf |
PWC | https://paperswithcode.com/paper/pubmed-200k-rct-a-dataset-for-sequential |
Repo | https://github.com/DCYN/Ramdomized-Clinical-Trail-Classification |
Framework | tf |
Learning to Avoid Errors in GANs by Manipulating Input Spaces
Title | Learning to Avoid Errors in GANs by Manipulating Input Spaces |
Authors | Alexander B. Jung |
Abstract | Despite recent advances, large scale visual artifacts are still a common occurrence in images generated by GANs. Previous work has focused on improving the generator’s capability to accurately imitate the data distribution $p_{data}$. In this paper, we instead explore methods that enable GANs to actively avoid errors by manipulating the input space. The core idea is to apply small changes to each noise vector in order to shift them away from areas in the input space that tend to result in errors. We derive three different architectures from that idea. The main one of these consists of a simple residual module that leads to significantly less visual artifacts, while only slightly decreasing diversity. The module is trivial to add to existing GANs and costs almost zero computation and memory. |
Tasks | |
Published | 2017-07-03 |
URL | http://arxiv.org/abs/1707.00768v1 |
PDF | http://arxiv.org/pdf/1707.00768v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-avoid-errors-in-gans-by |
Repo | https://github.com/aleju/gan-error-avoidance |
Framework | pytorch |
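A hedged sketch of the core idea above: a small residual module learns a bounded shift for each noise vector before it enters the generator. The architecture and the scale of the shift are assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class NoiseRepairModule(nn.Module):
    """Shift each noise vector slightly away from error-prone input regions."""

    def __init__(self, z_dim=100, hidden=128, shift_scale=0.1):
        super().__init__()
        self.shift_scale = shift_scale
        self.mlp = nn.Sequential(
            nn.Linear(z_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, z_dim),
            nn.Tanh(),  # bounded residual shift
        )

    def forward(self, z):
        return z + self.shift_scale * self.mlp(z)  # small residual change to z

# generator_input = NoiseRepairModule()(z)  # then pass to the usual generator
```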
DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG
Title | DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG |
Authors | Akara Supratak, Hao Dong, Chao Wu, Yike Guo |
Abstract | The present study proposes a deep learning model, named DeepSleepNet, for automatic sleep stage scoring based on raw single-channel EEG. Most of the existing methods rely on hand-engineered features which require prior knowledge of sleep analysis. Only a few of them encode the temporal information such as transition rules, which is important for identifying the next sleep stages, into the extracted features. In the proposed model, we utilize Convolutional Neural Networks to extract time-invariant features, and bidirectional-Long Short-Term Memory to learn transition rules among sleep stages automatically from EEG epochs. We implement a two-step training algorithm to train our model efficiently. We evaluated our model using different single-channel EEGs (F4-EOG(Left), Fpz-Cz and Pz-Oz) from two public sleep datasets, that have different properties (e.g., sampling rate) and scoring standards (AASM and R&K). The results showed that our model achieved similar overall accuracy and macro F1-score (MASS: 86.2%-81.7, Sleep-EDF: 82.0%-76.9) compared to the state-of-the-art methods (MASS: 85.9%-80.5, Sleep-EDF: 78.9%-73.7) on both datasets. This demonstrated that, without changing the model architecture and the training algorithm, our model could automatically learn features for sleep stage scoring from different raw single-channel EEGs from different datasets without utilizing any hand-engineered features. |
Tasks | EEG, Sleep Stage Detection |
Published | 2017-03-12 |
URL | http://arxiv.org/abs/1703.04046v2 |
PDF | http://arxiv.org/pdf/1703.04046v2.pdf |
PWC | https://paperswithcode.com/paper/deepsleepnet-a-model-for-automatic-sleep |
Repo | https://github.com/famousgrouse/AccSleepNet |
Framework | tf |
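A simplified skeleton of the representation-plus-sequence idea in DeepSleepNet: a 1-D CNN extracts features from each raw EEG epoch and a bidirectional LSTM models stage transitions across epochs. The single CNN branch, the filter sizes, and the absence of the two-step training procedure are simplifications of this sketch.

```python
import torch
import torch.nn as nn

class SleepStageNet(nn.Module):
    """CNN per-epoch feature extractor followed by a BiLSTM over the epoch sequence."""

    def __init__(self, num_stages=5, feat_dim=128, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=50, stride=6), nn.ReLU(),
            nn.MaxPool1d(8),
            nn.Conv1d(64, feat_dim, kernel_size=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.classify = nn.Linear(2 * hidden, num_stages)

    def forward(self, epochs):                                     # (batch, num_epochs, samples)
        b, n, s = epochs.shape
        feats = self.cnn(epochs.reshape(b * n, 1, s)).squeeze(-1)  # per-epoch features
        seq, _ = self.lstm(feats.reshape(b, n, -1))                # model stage transitions
        return self.classify(seq)                                  # per-epoch stage scores
```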
A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features
Title | A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features |
Authors | Olga Papadopoulou, Markos Zampoglou, Symeon Papadopoulos, Ioannis Kompatsiaris |
Abstract | The emergence of social media as news sources has led to the rise of clickbait posts attempting to attract users to click on article links without informing them on the actual article content. This paper presents our efforts to create a clickbait detector inspired by fake news detection algorithms, and our submission to the Clickbait Challenge 2017. The detector is based almost exclusively on text-based features taken from previous work on clickbait detection, our own work on fake post detection, and features we designed specifically for the challenge. We use a two-level classification approach, combining the outputs of 65 first-level classifiers in a second-level feature vector. We present our exploratory results with individual features and their combinations, taken from the post text and the target article title, as well as feature selection. While our own blind tests with the dataset led to an F-score of 0.63, our final evaluation in the Challenge only achieved an F-score of 0.43. We explore the possible causes of this, and lay out potential future steps to achieve more successful results. |
Tasks | Clickbait Detection, Fake News Detection, Feature Selection |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08528v1 |
PDF | http://arxiv.org/pdf/1710.08528v1.pdf |
PWC | https://paperswithcode.com/paper/a-two-level-classification-approach-for |
Repo | https://github.com/clickbait-challenge/snapper |
Framework | none |
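A generic scikit-learn sketch of the two-level (stacking) setup described above: out-of-fold class probabilities from the first-level classifiers form the feature vector of a second-level classifier. The choice of models and the logistic-regression second level are illustrative; the paper combines 65 task-specific first-level classifiers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def two_level_predictions(first_level_models, X_train, y_train, X_test):
    """Stack first-level probabilities into a second-level feature vector."""
    train_meta, test_meta = [], []
    for model in first_level_models:
        # Out-of-fold probabilities avoid leaking training labels to level two.
        train_meta.append(cross_val_predict(model, X_train, y_train,
                                            method="predict_proba", cv=5)[:, 1])
        test_meta.append(model.fit(X_train, y_train).predict_proba(X_test)[:, 1])
    second_level = LogisticRegression().fit(np.column_stack(train_meta), y_train)
    return second_level.predict(np.column_stack(test_meta))
```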
Learning compressed representations of blood samples time series with missing data
Title | Learning compressed representations of blood samples time series with missing data |
Authors | Filippo Maria Bianchi, Karl Øyvind Mikalsen, Robert Jenssen |
Abstract | Clinical measurements collected over time are naturally represented as multivariate time series (MTS), which often contain missing data. An autoencoder can learn low-dimensional vectorial representations of MTS that preserve important data characteristics, but cannot deal explicitly with missing data. In this work, we propose a new framework that combines an autoencoder with the Time series Cluster Kernel (TCK), a kernel that accounts for missingness patterns in MTS. Via kernel alignment, we incorporate TCK in the autoencoder to improve the learned representations in the presence of missing data. We consider a classification problem of MTS with missing values, representing blood samples of patients with surgical site infection. With our approach, rather than with a standard autoencoder, we learn low-dimensional representations that can be classified better. |
Tasks | Time Series |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07547v1 |
PDF | http://arxiv.org/pdf/1710.07547v1.pdf |
PWC | https://paperswithcode.com/paper/learning-compressed-representations-of-blood |
Repo | https://github.com/FilippoMB/TCK_AE |
Framework | tf |
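A minimal sketch of the kernel-alignment term described above: encourage the pairwise similarities of the learned codes to agree with a precomputed TCK kernel matrix for the same mini-batch, and add this term to the usual reconstruction loss. The Frobenius-distance form used here is an assumption of the sketch, not the paper's exact objective.

```python
import torch

def alignment_loss(codes, tck_kernel):
    """Match the code similarity matrix to a precomputed TCK kernel matrix."""
    code_sim = codes @ codes.t()
    code_sim = code_sim / (torch.norm(code_sim) + 1e-8)   # normalise (Frobenius)
    tck = tck_kernel / (torch.norm(tck_kernel) + 1e-8)
    return torch.norm(code_sim - tck) ** 2

# total_loss = reconstruction_loss + lam * alignment_loss(codes, K_tck_batch)
```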
Exploring loss function topology with cyclical learning rates
Title | Exploring loss function topology with cyclical learning rates |
Authors | Leslie N. Smith, Nicholay Topin |
Abstract | We present observations and discussion of previously unreported phenomena discovered while training residual networks. The goal of this work is to better understand the nature of neural networks through the examination of these new empirical results. These behaviors were identified through the application of Cyclical Learning Rates (CLR) and linear network interpolation. Among these behaviors are counterintuitive increases and decreases in training loss and instances of rapid training. For example, we demonstrate how CLR can produce greater testing accuracy than traditional training despite using large learning rates. Files to replicate these results are available at https://github.com/lnsmith54/exploring-loss |
Tasks | |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04283v1 |
PDF | http://arxiv.org/pdf/1702.04283v1.pdf |
PWC | https://paperswithcode.com/paper/exploring-loss-function-topology-with |
Repo | https://github.com/lnsmith54/exploring-loss |
Framework | caffe2 |
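For reference, the triangular cyclical learning rate policy used in this line of work can be sketched as follows; the base/max learning rates and step size are illustrative defaults.

```python
def cyclical_lr(iteration, base_lr=0.001, max_lr=0.006, step_size=2000):
    """Triangular CLR: the learning rate ramps linearly from base_lr to max_lr
    and back down over a cycle of 2 * step_size iterations."""
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)      # position within the cycle, in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```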