July 29, 2019

2822 words 14 mins read

Paper Group AWR 199

Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM. MegDet: A Large Mini-Batch Object Detector. Universal Dependencies to Logical Forms with Negation Scope. Clickbait Detection in Tweets Using Self-attentive Network. Visualizing and Understanding Atari Agent …

Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics

Title Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Authors Alex Kendall, Yarin Gal, Roberto Cipolla
Abstract Numerous deep learning applications benefit from multi-task learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task’s loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice. We propose a principled approach to multi-task deep learning which weighs multiple loss functions by considering the homoscedastic uncertainty of each task. This allows us to simultaneously learn various quantities with different units or scales in both classification and regression settings. We demonstrate our model learning per-pixel depth regression, semantic and instance segmentation from a monocular input image. Perhaps surprisingly, we show our model can learn multi-task weightings and outperform separate models trained individually on each task.
Tasks Instance Segmentation, Multi-Task Learning, Semantic Segmentation
Published 2017-05-19
URL http://arxiv.org/abs/1705.07115v3
PDF http://arxiv.org/pdf/1705.07115v3.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-using-uncertainty-to
Repo https://github.com/lorenmt/mtan
Framework pytorch
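
A minimal PyTorch sketch of the uncertainty weighting described in the abstract: each task gets a learnable log-variance that scales its loss. The class name is ours, and the exact constants (e.g. the factor 1/2 for regression terms) from the paper are simplified away.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine per-task losses with learned homoscedastic uncertainty.

    Implements the weighting total = sum_i exp(-s_i) * L_i + s_i, where
    s_i = log(sigma_i^2) is a learnable log-variance per task. This is a
    simplified sketch; the paper's constants differ slightly between the
    classification and regression cases.
    """

    def __init__(self, num_tasks: int):
        super().__init__()
        # One learnable log-variance per task, initialised to zero (sigma = 1).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total

# Usage inside a training loop (illustrative):
# criterion = UncertaintyWeightedLoss(num_tasks=2)
# loss = criterion([seg_loss, depth_loss])
# loss.backward()   # gradients also flow into the log-variances
```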

Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM

Title Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM
Authors Mohamed Eldesouki, Younes Samih, Ahmed Abdelali, Mohammed Attia, Hamdy Mubarak, Kareem Darwish, Laura Kallmeyer
Abstract Arabic word segmentation is essential for a variety of NLP applications such as machine translation and information retrieval. Segmentation entails breaking words into their constituent stems, affixes and clitics. In this paper, we compare two approaches for segmenting four major Arabic dialects using only several thousand training examples for each dialect. The two approaches involve posing the problem as a ranking problem, where an SVM ranker picks the best segmentation, and as a sequence labeling problem, where a bi-LSTM RNN coupled with CRF determines where best to segment words. We are able to achieve solid segmentation results for all dialects using rather limited training data. We also show that employing Modern Standard Arabic data for domain adaptation and assuming context independence improve overall results.
Tasks Domain Adaptation, Information Retrieval, Machine Translation
Published 2017-08-19
URL http://arxiv.org/abs/1708.05891v1
PDF http://arxiv.org/pdf/1708.05891v1.pdf
PWC https://paperswithcode.com/paper/arabic-multi-dialect-segmentation-bi-lstm-crf
Repo https://github.com/qcri/dialectal_arabic_segmenter
Framework tf
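
As a rough sketch of the sequence-labeling variant, a character-level bi-LSTM tagger in PyTorch is shown below. The CRF layer used in the paper is replaced by a per-character softmax for brevity, and all sizes and the tag set are illustrative.

```python
import torch
import torch.nn as nn

class BiLSTMSegmenter(nn.Module):
    """Character-level bi-LSTM tagger for word segmentation (sketch).

    Each character of a word is mapped to a boundary tag (e.g. B/M/E/S);
    the CRF layer that models tag transitions in the paper is omitted here.
    """

    def __init__(self, num_chars, num_tags=4, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_chars, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, char_ids):                 # (batch, seq_len)
        h, _ = self.lstm(self.embed(char_ids))   # (batch, seq_len, 2*hidden)
        return self.proj(h)                      # per-character tag scores

# tagger = BiLSTMSegmenter(num_chars=200)
# scores = tagger(torch.randint(1, 200, (8, 20)))        # (8, 20, 4)
# loss = nn.CrossEntropyLoss()(scores.view(-1, 4), gold_tags.view(-1))
```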

MegDet: A Large Mini-Batch Object Detector

Title MegDet: A Large Mini-Batch Object Detector
Authors Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun
Abstract The improvements in recent CNN-based object detection works, from R-CNN [11], Fast/Faster R-CNN [10, 31] to recent Mask R-CNN [14] and RetinaNet [24], mainly come from new network, new framework, or novel loss design. But mini-batch size, a key factor in the training, has not been well studied. In this paper, we propose a Large MiniBatch Object Detector (MegDet) to enable the training with much larger mini-batch size than before (e.g. from 16 to 256), so that we can effectively utilize multiple GPUs (up to 128 in our experiments) to significantly shorten the training time. Technically, we suggest a learning rate policy and Cross-GPU Batch Normalization, which together allow us to successfully train a large mini-batch detector in much less time (e.g., from 33 hours to 4 hours), and achieve even better accuracy. The MegDet is the backbone of our submission (mmAP 52.5%) to COCO 2017 Challenge, where we won the 1st place of Detection task.
Tasks Object Detection
Published 2017-11-20
URL http://arxiv.org/abs/1711.07240v4
PDF http://arxiv.org/pdf/1711.07240v4.pdf
PWC https://paperswithcode.com/paper/megdet-a-large-mini-batch-object-detector
Repo https://github.com/CSAILVision/semantic-segmentation-pytorch
Framework pytorch
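
The large-mini-batch recipe hinges on scaling the learning rate with the batch size and ramping it up gradually; Cross-GPU Batch Normalization additionally synchronizes BN statistics across devices and is not reproduced here. Below is a hedged sketch of the scaling-plus-warmup policy; the constants are illustrative, not the paper's exact schedule.

```python
def large_batch_lr(step, base_lr=0.02, batch_size=256, base_batch=16,
                   warmup_steps=500):
    """Linear-scaling learning rate with warmup (sketch).

    The peak rate is base_lr * (batch_size / base_batch), reached linearly
    over the first warmup_steps iterations; afterwards the usual step decay
    (not shown) would apply.
    """
    peak = base_lr * batch_size / base_batch
    if step < warmup_steps:
        return peak * (step + 1) / warmup_steps
    return peak

# With these defaults the rate ramps from ~0 to 0.32 over 500 steps.
```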

Universal Dependencies to Logical Forms with Negation Scope

Title Universal Dependencies to Logical Forms with Negation Scope
Authors Federico Fancellu, Siva Reddy, Adam Lopez, Bonnie Webber
Abstract Many language technology applications would benefit from the ability to represent negation and its scope on top of widely-used linguistic resources. In this paper, we investigate the possibility of obtaining a first-order logic representation with negation scope marked using Universal Dependencies. To do so, we enhance UDepLambda, a framework that converts dependency graphs to logical forms. The resulting UDepLambda$\lnot$ is able to handle phenomena related to scope by means of a higher-order type theory, relevant not only to negation but also to universal quantification and other complex semantic phenomena. The initial conversion we did for English is promising, in that it can represent the scope of negation even in the presence of more complex phenomena such as universal quantifiers.
Tasks
Published 2017-02-10
URL http://arxiv.org/abs/1702.03305v1
PDF http://arxiv.org/pdf/1702.03305v1.pdf
PWC https://paperswithcode.com/paper/universal-dependencies-to-logical-forms-with
Repo https://github.com/sivareddyg/udeplambda
Framework none

Clickbait Detection in Tweets Using Self-attentive Network

Title Clickbait Detection in Tweets Using Self-attentive Network
Authors Yiwei Zhou
Abstract Clickbait detection in tweets remains an elusive challenge. In this paper, we describe the solution of the Zingel Clickbait Detector at the Clickbait Challenge 2017, which is capable of evaluating each tweet’s level of clickbaiting. We first recast the regression problem as a multi-class classification problem, based on the annotation scheme. To perform this classification, we apply a token-level self-attentive mechanism on the hidden states of bidirectional Gated Recurrent Units (biGRU), which enables the model to generate task-specific vector representations of tweets by attending to important tokens. The self-attentive neural network can be trained end-to-end, without any manual feature engineering. Our detector ranked first in the final evaluation of the Clickbait Challenge 2017.
Tasks Clickbait Detection, Feature Engineering
Published 2017-10-15
URL http://arxiv.org/abs/1710.05364v1
PDF http://arxiv.org/pdf/1710.05364v1.pdf
PWC https://paperswithcode.com/paper/clickbait-detection-in-tweets-using-self
Repo https://github.com/zhouyiwei/cc
Framework tf
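
A hedged PyTorch sketch of a biGRU encoder with token-level self-attentive pooling, the kind of architecture the abstract describes; the vocabulary size, hidden sizes and number of classes are placeholders.

```python
import torch
import torch.nn as nn

class SelfAttentiveEncoder(nn.Module):
    """biGRU with token-level self-attentive pooling (sketch).

    Attention scores over the biGRU hidden states produce a weighted sum
    that serves as the tweet representation, which is then classified into
    the annotation-scheme classes.
    """

    def __init__(self, vocab_size, num_classes=4, emb_dim=100, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True,
                          bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)        # token-level attention scores
        self.classify = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):                   # (batch, seq_len)
        h, _ = self.gru(self.embed(token_ids))      # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)    # (batch, seq_len, 1)
        tweet_vec = (weights * h).sum(dim=1)        # attention-weighted sum
        return self.classify(tweet_vec)
```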

Visualizing and Understanding Atari Agents

Title Visualizing and Understanding Atari Agents
Authors Sam Greydanus, Anurag Koul, Jonathan Dodge, Alan Fern
Abstract While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating useful saliency maps and use it to show 1) what strong agents attend to, 2) whether agents are making decisions for the right or wrong reasons, and 3) how agents evolve during learning. We also test our method on non-expert human subjects and find that it improves their ability to reason about these agents. Overall, our results show that saliency information can provide significant insight into an RL agent’s decisions and learning behavior.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1711.00138v5
PDF http://arxiv.org/pdf/1711.00138v5.pdf
PWC https://paperswithcode.com/paper/visualizing-and-understanding-atari-agents
Repo https://github.com/slowjazz/interactive-atari-RL
Framework pytorch
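
A simplified Python sketch of perturbation-based saliency: each grid location is blurred through a local Gaussian mask and the change in the policy output is recorded. The `policy` callable, the grid stride and the mask width are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def perturbation_saliency(policy, frame, sigma=5, stride=5):
    """Perturbation-based saliency for a policy network (sketch).

    `frame` is a 2-D grayscale observation and `policy` is any callable
    mapping a frame to a NumPy vector of action scores. Large values in the
    returned map mark regions the agent relies on.
    """
    blurred = gaussian_filter(frame, sigma=3)
    base = policy(frame)
    h, w = frame.shape
    rows, cols = range(0, h, stride), range(0, w, stride)
    saliency = np.zeros((len(rows), len(cols)))
    yy, xx = np.mgrid[0:h, 0:w]
    for a, i in enumerate(rows):
        for b, j in enumerate(cols):
            # Blend the frame with its blurred copy through a local mask.
            mask = np.exp(-((yy - i) ** 2 + (xx - j) ** 2) / (2.0 * sigma ** 2))
            perturbed = frame * (1 - mask) + blurred * mask
            saliency[a, b] = 0.5 * np.sum((policy(perturbed) - base) ** 2)
    return saliency
```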

A Correlation Based Feature Representation for First-Person Activity Recognition

Title A Correlation Based Feature Representation for First-Person Activity Recognition
Authors Reza Kahani, Alireza Talebpour, Ahmad Mahmoudi-Aznaveh
Abstract In this paper, a simple yet efficient activity recognition method for first-person video is introduced. The proposed method is appropriate for representing high-dimensional features such as those extracted from convolutional neural networks (CNNs). The per-frame (per-segment) extracted features are treated as a set of time series, and inter- and intra-time-series relations are employed to build the video descriptors. To find the inter-series relations, the series are grouped and the linear correlation between each pair of groups is calculated; these relations capture the scene dynamics and local motions, and the grouping strategy considerably reduces the computational cost. Furthermore, we split the series in the temporal direction in order to preserve long-term motions and better focus on each local time window. To extract cyclic motion patterns, which can be considered primary components of various activities, intra-time-series correlations are exploited. The representation results in highly discriminative features that can be linearly classified. The experiments confirm that our method outperforms state-of-the-art methods at recognizing first-person activities on two challenging first-person datasets.
Tasks Activity Recognition, Egocentric Activity Recognition, Time Series
Published 2017-11-15
URL http://arxiv.org/abs/1711.05523v2
PDF http://arxiv.org/pdf/1711.05523v2.pdf
PWC https://paperswithcode.com/paper/a-correlation-based-feature-representation
Repo https://github.com/rkahani/FirstPersonActivityRecognition
Framework none
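
A loose NumPy sketch of the idea: treat per-frame CNN features as time series, group them, and build a descriptor from inter-group correlations plus a simple intra-group term per temporal segment. The grouping, segmentation and the lag-1 autocorrelation stand-in are illustrative simplifications of the paper's construction.

```python
import numpy as np

def correlation_features(features, num_groups=8, num_segments=3):
    """Correlation-based video descriptor (simplified sketch).

    `features` is a (T, D) array of per-frame CNN features treated as D
    time series. The series are split into groups; the mean series of every
    group pair is correlated (inter-group relations), and each group is also
    correlated with a lag-1 shifted copy of itself within temporal segments
    (a stand-in for the intra-series cyclic-pattern term).
    """
    groups = np.array_split(features, num_groups, axis=1)
    means = np.stack([g.mean(axis=1) for g in groups], axis=1)   # (T, G)

    descriptor = []
    # Inter-group: correlations between every pair of group-mean series.
    corr = np.corrcoef(means.T)                                  # (G, G)
    descriptor.append(corr[np.triu_indices(num_groups, k=1)])

    # Intra-group, per temporal segment: lag-1 autocorrelation of each group.
    for segment in np.array_split(means, num_segments, axis=0):
        lagged = np.array([
            np.corrcoef(segment[:-1, g], segment[1:, g])[0, 1]
            for g in range(num_groups)])
        descriptor.append(lagged)
    return np.concatenate(descriptor)
```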

Grammatical facial expression recognition using customized deep neural network architecture

Title Grammatical facial expression recognition using customized deep neural network architecture
Authors Devesh Walawalkar
Abstract This paper proposes to expand the visual understanding capacity of computers by helping them recognize human sign language more efficiently. This is done through recognition of the facial expressions that accompany the hand signs used in sign language, with a particular focus on the popular Brazilian sign language (LIBRAS). While classifying hand signs into their respective word meanings has received considerable attention in the literature, the emotion or intention with which the words are expressed has largely been overlooked. As everyday experience shows, words expressed with different emotions or moods can carry completely different meanings. Giving computers the ability to classify these facial expressions adds another level of understanding of what a deaf person actually wants to communicate. The proposed idea is implemented with a deep neural network that has a customized architecture, which helps it learn the specific patterns of individual expressions better than a generic approach. With an overall accuracy of 98.04%, the implemented network performs well and is thus fit for practical use.
Tasks Facial Expression Recognition
Published 2017-11-16
URL http://arxiv.org/abs/1711.06303v1
PDF http://arxiv.org/pdf/1711.06303v1.pdf
PWC https://paperswithcode.com/paper/grammatical-facial-expression-recognition
Repo https://github.com/rohithv/Grammatical-facial-expression-recognition
Framework none

A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes

Title A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
Authors Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo
Abstract We propose a model to automatically describe changes introduced in the source code of a program using natural language. Our method receives as input a set of code commits, each containing both the modifications and the message written by a user. These two modalities are used to train an encoder-decoder architecture. We evaluated our approach on twelve real-world open source projects from four different programming languages. Quantitative and qualitative results showed that the proposed approach can generate feasible and semantically sound descriptions, not only in standard in-project settings but also in a cross-project setting.
Tasks
Published 2017-04-17
URL http://arxiv.org/abs/1704.04856v1
PDF http://arxiv.org/pdf/1704.04856v1.pdf
PWC https://paperswithcode.com/paper/a-neural-architecture-for-generating-natural
Repo https://github.com/epochx/commitgen
Framework torch
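
A compact PyTorch sketch of the encoder-decoder setup the abstract describes: a GRU encoder over diff tokens and a GRU decoder over message tokens. Attention over the diff, commonly used in such models, is omitted here, and all names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class CommitMessageGenerator(nn.Module):
    """Encoder-decoder sketch for describing code changes."""

    def __init__(self, diff_vocab, msg_vocab, emb=128, hidden=256):
        super().__init__()
        self.enc_embed = nn.Embedding(diff_vocab, emb, padding_idx=0)
        self.dec_embed = nn.Embedding(msg_vocab, emb, padding_idx=0)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, msg_vocab)

    def forward(self, diff_ids, msg_ids):
        # Encode the tokenised diff; condition the decoder on its final state.
        _, state = self.encoder(self.enc_embed(diff_ids))
        dec_h, _ = self.decoder(self.dec_embed(msg_ids), state)
        return self.out(dec_h)               # (batch, msg_len, msg_vocab)

# Teacher forcing: feed the gold message shifted right and score the logits
# against the gold message shifted left with cross-entropy.
```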

PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts

Title PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts
Authors Franck Dernoncourt, Ji Young Lee
Abstract We present PubMed 200k RCT, a new dataset based on PubMed for sequential sentence classification. The dataset consists of approximately 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. Each sentence of each abstract is labeled with its role in the abstract using one of the following classes: background, objective, method, result, or conclusion. The purpose of releasing this dataset is twofold. First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task. Second, from an application perspective, researchers need better tools to efficiently skim through the literature. Automatically classifying each sentence in an abstract would help researchers read abstracts more efficiently, especially in fields where abstracts may be long, such as the medical field.
Tasks Sentence Classification, Text Classification
Published 2017-10-17
URL http://arxiv.org/abs/1710.06071v1
PDF http://arxiv.org/pdf/1710.06071v1.pdf
PWC https://paperswithcode.com/paper/pubmed-200k-rct-a-dataset-for-sequential
Repo https://github.com/DCYN/Ramdomized-Clinical-Trail-Classification
Framework tf
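
A small helper for reading the released splits, assuming the label<TAB>sentence file format with `###<pmid>` separator lines; if the files differ, the parsing would need adjusting.

```python
def read_pubmed_rct(path):
    """Parse a PubMed RCT split file into (label, sentence) sequences.

    Assumes each abstract starts with a line like "###<pmid>", followed by
    one "LABEL<TAB>sentence" line per sentence, with blank lines between
    abstracts. Returns a list of abstracts, each a list of (label, sentence).
    """
    abstracts, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith("###") or not line:   # abstract id or separator
                if current:
                    abstracts.append(current)
                current = []
            else:
                label, sentence = line.split("\t", 1)
                current.append((label, sentence))
    if current:
        abstracts.append(current)
    return abstracts

# abstracts = read_pubmed_rct("train.txt")
```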

Learning to Avoid Errors in GANs by Manipulating Input Spaces

Title Learning to Avoid Errors in GANs by Manipulating Input Spaces
Authors Alexander B. Jung
Abstract Despite recent advances, large scale visual artifacts are still a common occurrence in images generated by GANs. Previous work has focused on improving the generator’s capability to accurately imitate the data distribution $p_{data}$. In this paper, we instead explore methods that enable GANs to actively avoid errors by manipulating the input space. The core idea is to apply small changes to each noise vector in order to shift them away from areas in the input space that tend to result in errors. We derive three different architectures from that idea. The main one of these consists of a simple residual module that leads to significantly less visual artifacts, while only slightly decreasing diversity. The module is trivial to add to existing GANs and costs almost zero computation and memory.
Tasks
Published 2017-07-03
URL http://arxiv.org/abs/1707.00768v1
PDF http://arxiv.org/pdf/1707.00768v1.pdf
PWC https://paperswithcode.com/paper/learning-to-avoid-errors-in-gans-by
Repo https://github.com/aleju/gan-error-avoidance
Framework pytorch
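
A minimal PyTorch sketch of the residual noise-shifting idea: a small MLP predicts an offset for each noise vector before it enters the generator. The module name and layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class NoiseShift(nn.Module):
    """Residual module that nudges noise vectors before the generator.

    The MLP predicts an offset added to each noise vector, shifting it away
    from regions of the input space that tend to produce artifacts.
    """

    def __init__(self, noise_dim=100, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, noise_dim))

    def forward(self, z):
        return z + self.mlp(z)          # residual shift of the noise vector

# z = torch.randn(64, 100)
# fake = generator(NoiseShift()(z))     # `generator` is any existing GAN generator
```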

DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG

Title DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG
Authors Akara Supratak, Hao Dong, Chao Wu, Yike Guo
Abstract The present study proposes a deep learning model, named DeepSleepNet, for automatic sleep stage scoring based on raw single-channel EEG. Most existing methods rely on hand-engineered features which require prior knowledge of sleep analysis. Only a few of them encode temporal information, such as transition rules, which is important for identifying the next sleep stage, into the extracted features. In the proposed model, we utilize Convolutional Neural Networks to extract time-invariant features, and bidirectional Long Short-Term Memory to learn transition rules among sleep stages automatically from EEG epochs. We implement a two-step training algorithm to train our model efficiently. We evaluated our model using different single-channel EEGs (F4-EOG(Left), Fpz-Cz and Pz-Oz) from two public sleep datasets, which have different properties (e.g., sampling rate) and scoring standards (AASM and R&K). The results showed that our model achieved overall accuracy and macro F1-scores (accuracy / MF1, MASS: 86.2 / 81.7, Sleep-EDF: 82.0 / 76.9) similar to the state-of-the-art methods (MASS: 85.9 / 80.5, Sleep-EDF: 78.9 / 73.7) on both datasets. This demonstrates that, without changing the model architecture or the training algorithm, our model can automatically learn features for sleep stage scoring from different raw single-channel EEGs from different datasets, without utilizing any hand-engineered features.
Tasks EEG, Sleep Stage Detection
Published 2017-03-12
URL http://arxiv.org/abs/1703.04046v2
PDF http://arxiv.org/pdf/1703.04046v2.pdf
PWC https://paperswithcode.com/paper/deepsleepnet-a-model-for-automatic-sleep
Repo https://github.com/famousgrouse/AccSleepNet
Framework tf
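
A simplified PyTorch sketch in the spirit of the model: a 1-D CNN encodes each raw 30-s EEG epoch and a bi-LSTM models stage transitions across consecutive epochs. The original uses two CNN branches (small and large filters) plus a residual shortcut, which are omitted here; filter sizes are illustrative.

```python
import torch
import torch.nn as nn

class SleepStager(nn.Module):
    """Simplified CNN + bi-LSTM sleep stager (sketch)."""

    def __init__(self, num_stages=5, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=50, stride=6), nn.ReLU(),
            nn.MaxPool1d(8),
            nn.Conv1d(64, 128, kernel_size=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.lstm = nn.LSTM(128, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_stages)

    def forward(self, eeg):                       # (batch, num_epochs, samples)
        b, n, t = eeg.shape
        # Per-epoch features from the CNN, then a sequence model over epochs.
        feats = self.cnn(eeg.reshape(b * n, 1, t)).squeeze(-1)   # (b*n, 128)
        seq, _ = self.lstm(feats.reshape(b, n, -1))
        return self.out(seq)                      # per-epoch stage logits
```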

A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features

Title A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features
Authors Olga Papadopoulou, Markos Zampoglou, Symeon Papadopoulos, Ioannis Kompatsiaris
Abstract The emergence of social media as news sources has led to the rise of clickbait posts attempting to attract users to click on article links without informing them on the actual article content. This paper presents our efforts to create a clickbait detector inspired by fake news detection algorithms, and our submission to the Clickbait Challenge 2017. The detector is based almost exclusively on text-based features taken from previous work on clickbait detection, our own work on fake post detection, and features we designed specifically for the challenge. We use a two-level classification approach, combining the outputs of 65 first-level classifiers in a second-level feature vector. We present our exploratory results with individual features and their combinations, taken from the post text and the target article title, as well as feature selection. While our own blind tests with the dataset led to an F-score of 0.63, our final evaluation in the Challenge only achieved an F-score of 0.43. We explore the possible causes of this, and lay out potential future steps to achieve more successful results.
Tasks Clickbait Detection, Fake News Detection, Feature Selection
Published 2017-10-23
URL http://arxiv.org/abs/1710.08528v1
PDF http://arxiv.org/pdf/1710.08528v1.pdf
PWC https://paperswithcode.com/paper/a-two-level-classification-approach-for
Repo https://github.com/clickbait-challenge/snapper
Framework none
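
A hedged scikit-learn sketch of the two-level scheme: out-of-fold probabilities from first-level classifiers become the feature vector of a second-level classifier. Only two first-level models are shown (the paper combines 65), and the model choices are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def two_level_predict(X_train, y_train, X_test, first_level=None):
    """Two-level (stacked) classification sketch."""
    first_level = first_level or [
        LogisticRegression(max_iter=1000),
        RandomForestClassifier(n_estimators=200),
    ]
    meta_train, meta_test = [], []
    for clf in first_level:
        # Out-of-fold probabilities avoid leaking training labels upward.
        meta_train.append(cross_val_predict(clf, X_train, y_train,
                                            cv=5, method="predict_proba"))
        clf.fit(X_train, y_train)
        meta_test.append(clf.predict_proba(X_test))
    second = LogisticRegression(max_iter=1000)
    second.fit(np.hstack(meta_train), y_train)
    return second.predict(np.hstack(meta_test))
```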

Learning compressed representations of blood samples time series with missing data

Title Learning compressed representations of blood samples time series with missing data
Authors Filippo Maria Bianchi, Karl Øyvind Mikalsen, Robert Jenssen
Abstract Clinical measurements collected over time are naturally represented as multivariate time series (MTS), which often contain missing data. An autoencoder can learn low-dimensional vectorial representations of MTS that preserve important data characteristics, but cannot deal explicitly with missing data. In this work, we propose a new framework that combines an autoencoder with the Time series Cluster Kernel (TCK), a kernel that accounts for missingness patterns in MTS. Via kernel alignment, we incorporate TCK into the autoencoder to improve the learned representations in the presence of missing data. We consider a classification problem of MTS with missing values, representing blood samples of patients with surgical site infection. With our approach, we learn low-dimensional representations that can be classified better than those of a standard autoencoder.
Tasks Time Series
Published 2017-10-20
URL http://arxiv.org/abs/1710.07547v1
PDF http://arxiv.org/pdf/1710.07547v1.pdf
PWC https://paperswithcode.com/paper/learning-compressed-representations-of-blood
Repo https://github.com/FilippoMB/TCK_AE
Framework tf
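
A rough PyTorch sketch of the kernel-alignment term: the similarity matrix of the learned codes for a mini-batch is pushed towards the precomputed TCK matrix of the same batch. The exact alignment formulation in the paper may differ; the weighting and names are illustrative.

```python
import torch

def kernel_alignment_loss(codes, K_tck):
    """Kernel-alignment regulariser for an autoencoder (sketch).

    `codes` are the latent representations of a mini-batch (batch, d) and
    `K_tck` is the precomputed Time series Cluster Kernel matrix for the
    same batch. Both similarity matrices are Frobenius-normalised and their
    distance is penalised, so the learned codes reproduce the TCK structure.
    """
    C = codes @ codes.t()                 # code similarity matrix
    C = C / torch.norm(C)
    K = K_tck / torch.norm(K_tck)
    return torch.norm(C - K) ** 2

# total_loss = reconstruction_loss + alpha * kernel_alignment_loss(codes, K_batch)
```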

Exploring loss function topology with cyclical learning rates

Title Exploring loss function topology with cyclical learning rates
Authors Leslie N. Smith, Nicholay Topin
Abstract We present observations and discussion of previously unreported phenomena discovered while training residual networks. The goal of this work is to better understand the nature of neural networks through the examination of these new empirical results. These behaviors were identified through the application of Cyclical Learning Rates (CLR) and linear network interpolation. Among these behaviors are counterintuitive increases and decreases in training loss and instances of rapid training. For example, we demonstrate how CLR can produce greater testing accuracy than traditional training despite using large learning rates. Files to replicate these results are available at https://github.com/lnsmith54/exploring-loss
Tasks
Published 2017-02-14
URL http://arxiv.org/abs/1702.04283v1
PDF http://arxiv.org/pdf/1702.04283v1.pdf
PWC https://paperswithcode.com/paper/exploring-loss-function-topology-with
Repo https://github.com/lnsmith54/exploring-loss
Framework caffe2
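
For reference, the triangular cyclical learning rate that these experiments build on can be written in a few lines of Python; the default values below are illustrative rather than the paper's settings.

```python
def triangular_clr(iteration, step_size=2000, base_lr=0.001, max_lr=0.006):
    """Triangular cyclical learning rate (CLR) schedule (sketch).

    The rate ramps linearly from base_lr to max_lr and back over one cycle
    of 2 * step_size iterations.
    """
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)      # position in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```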