Paper Group AWR 199
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM. MegDet: A Large Mini-Batch Object Detector. Universal Dependencies to Logical Forms with Negation Scope. Clickbait Detection in Tweets Using Self-attentive Network. Visualizing and Understanding Atari Agents …
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Title | Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics |
Authors | Alex Kendall, Yarin Gal, Roberto Cipolla |
Abstract | Numerous deep learning applications benefit from multi-task learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task’s loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice. We propose a principled approach to multi-task deep learning which weighs multiple loss functions by considering the homoscedastic uncertainty of each task. This allows us to simultaneously learn various quantities with different units or scales in both classification and regression settings. We demonstrate our model learning per-pixel depth regression, semantic and instance segmentation from a monocular input image. Perhaps surprisingly, we show our model can learn multi-task weightings and outperform separate models trained individually on each task. |
Tasks | Instance Segmentation, Multi-Task Learning, Semantic Segmentation |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07115v3 |
PDF | http://arxiv.org/pdf/1705.07115v3.pdf |
PWC | https://paperswithcode.com/paper/multi-task-learning-using-uncertainty-to |
Repo | https://github.com/lorenmt/mtan |
Framework | pytorch |
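As a rough illustration of the uncertainty weighting described above, the following PyTorch-style sketch learns one log-variance per task and combines the task losses accordingly. It uses the widely adopted simplification exp(-s_i)·L_i + s_i with s_i = log σ_i²; the ½ factor the paper applies to regression terms is omitted, and the task losses themselves are placeholders.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i,
    where s_i = log(sigma_i^2) is a learned log-variance for task i."""

    def __init__(self, num_tasks):
        super().__init__()
        # One learnable log-variance per task, initialised to 0 (sigma = 1).
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total

# Usage: optimise the log-variances jointly with the network weights, e.g.
# criterion = UncertaintyWeightedLoss(num_tasks=3)
# total_loss = criterion([semantic_loss, instance_loss, depth_loss])
```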
Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM
Title | Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM |
Authors | Mohamed Eldesouki, Younes Samih, Ahmed Abdelali, Mohammed Attia, Hamdy Mubarak, Kareem Darwish, Laura Kallmeyer |
Abstract | Arabic word segmentation is essential for a variety of NLP applications such as machine translation and information retrieval. Segmentation entails breaking words into their constituent stems, affixes and clitics. In this paper, we compare two approaches for segmenting four major Arabic dialects using only several thousand training examples for each dialect. The two approaches involve posing the problem as a ranking problem, where an SVM ranker picks the best segmentation, and as a sequence labeling problem, where a bi-LSTM RNN coupled with CRF determines where best to segment words. We are able to achieve solid segmentation results for all dialects using rather limited training data. We also show that employing Modern Standard Arabic data for domain adaptation and assuming context independence improve overall results. |
Tasks | Domain Adaptation, Information Retrieval, Machine Translation |
Published | 2017-08-19 |
URL | http://arxiv.org/abs/1708.05891v1 |
PDF | http://arxiv.org/pdf/1708.05891v1.pdf |
PWC | https://paperswithcode.com/paper/arabic-multi-dialect-segmentation-bi-lstm-crf |
Repo | https://github.com/qcri/dialectal_arabic_segmenter |
Framework | tf |
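For orientation, here is a minimal character-level BiLSTM tagger of the kind used in the sequence-labeling approach above; it emits a boundary tag per character. The CRF layer the paper places on top of these emissions is omitted, and the B/M/E/S tag set is an assumption of this sketch.

```python
import torch
import torch.nn as nn

class CharBiLSTMTagger(nn.Module):
    """Character-level BiLSTM segmenter: every character receives a boundary
    tag (e.g. B/M/E/S). The paper adds a CRF layer over these emissions."""

    def __init__(self, vocab_size, num_tags, emb_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)

    def forward(self, char_ids):                 # (batch, seq_len)
        h, _ = self.lstm(self.embed(char_ids))   # (batch, seq_len, 2 * hidden)
        return self.proj(h)                      # per-character tag scores
```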
MegDet: A Large Mini-Batch Object Detector
Title | MegDet: A Large Mini-Batch Object Detector |
Authors | Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun |
Abstract | The improvements in recent CNN-based object detection works, from R-CNN [11], Fast/Faster R-CNN [10, 31] to recent Mask R-CNN [14] and RetinaNet [24], mainly come from new networks, new frameworks, or novel loss designs. But mini-batch size, a key factor in training, has not been well studied. In this paper, we propose a Large Mini-Batch Object Detector (MegDet) to enable training with a much larger mini-batch size than before (e.g. from 16 to 256), so that we can effectively utilize multiple GPUs (up to 128 in our experiments) to significantly shorten the training time. Technically, we suggest a learning rate policy and Cross-GPU Batch Normalization, which together allow us to successfully train a large mini-batch detector in much less time (e.g., from 33 hours to 4 hours), and achieve even better accuracy. MegDet is the backbone of our submission (mmAP 52.5%) to the COCO 2017 Challenge, where we won 1st place in the Detection task. |
Tasks | Object Detection |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07240v4 |
PDF | http://arxiv.org/pdf/1711.07240v4.pdf |
PWC | https://paperswithcode.com/paper/megdet-a-large-mini-batch-object-detector |
Repo | https://github.com/CSAILVision/semantic-segmentation-pytorch |
Framework | pytorch |
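A minimal sketch of the large-batch learning-rate policy the abstract refers to: scale the base learning rate with the mini-batch size and ramp it up linearly during a warmup phase. All constants here are illustrative, and Cross-GPU Batch Normalization itself is not shown.

```python
def warmup_lr(step, base_lr=0.02, batch_scale=16, warmup_steps=500):
    """Linear-scaling rule with warmup: the target learning rate grows in
    proportion to the mini-batch size (e.g. 256 / 16 = 16x), and the first
    `warmup_steps` iterations ramp up linearly to avoid early divergence."""
    target_lr = base_lr * batch_scale
    if step < warmup_steps:
        return target_lr * (step + 1) / warmup_steps
    return target_lr
```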
Universal Dependencies to Logical Forms with Negation Scope
Title | Universal Dependencies to Logical Forms with Negation Scope |
Authors | Federico Fancellu, Siva Reddy, Adam Lopez, Bonnie Webber |
Abstract | Many language technology applications would benefit from the ability to represent negation and its scope on top of widely-used linguistic resources. In this paper, we investigate the possibility of obtaining a first-order logic representation with negation scope marked using Universal Dependencies. To do so, we enhance UDepLambda, a framework that converts dependency graphs to logical forms. The resulting UDepLambda$\lnot$ is able to handle phenomena related to scope by means of a higher-order type theory, relevant not only to negation but also to universal quantification and other complex semantic phenomena. The initial conversion we did for English is promising, in that one can represent the scope of negation also in the presence of more complex phenomena such as universal quantifiers. |
Tasks | |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03305v1 |
PDF | http://arxiv.org/pdf/1702.03305v1.pdf |
PWC | https://paperswithcode.com/paper/universal-dependencies-to-logical-forms-with |
Repo | https://github.com/sivareddyg/udeplambda |
Framework | none |
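As a purely illustrative example of what a first-order representation with negation scope marked can look like (a generic neo-Davidsonian form, not UDepLambda$\lnot$'s actual output notation):

```latex
% "John did not leave": the existential event quantifier falls
% inside the scope of the negation operator.
\neg \exists e.\, \mathit{leave}(e) \land \mathit{arg}_1(e, \mathit{John})
```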
Clickbait Detection in Tweets Using Self-attentive Network
Title | Clickbait Detection in Tweets Using Self-attentive Network |
Authors | Yiwei Zhou |
Abstract | Clickbait detection in tweets remains an elusive challenge. In this paper, we describe the solution for the Zingel Clickbait Detector at the Clickbait Challenge 2017, which is capable of evaluating each tweet’s level of clickbaiting. We first reformat the regression problem as a multi-classification problem, based on the annotation scheme. To perform multi-classification, we apply a token-level, self-attentive mechanism on the hidden states of bi-directional Gated Recurrent Units (biGRU), which enables the model to generate tweets’ task-specific vector representations by attending to important tokens. The self-attentive neural network can be trained end-to-end, without involving any manual feature engineering. Our detector ranked first in the final evaluation of Clickbait Challenge 2017. |
Tasks | Clickbait Detection, Feature Engineering |
Published | 2017-10-15 |
URL | http://arxiv.org/abs/1710.05364v1 |
PDF | http://arxiv.org/pdf/1710.05364v1.pdf |
PWC | https://paperswithcode.com/paper/clickbait-detection-in-tweets-using-self |
Repo | https://github.com/zhouyiwei/cc |
Framework | tf |
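The following PyTorch-style sketch shows the general shape of a token-level self-attentive biGRU classifier as described above: attention scores over the biGRU hidden states produce a weighted tweet representation that is then classified. Dimensions and the number of output classes are placeholders.

```python
import torch
import torch.nn as nn

class SelfAttentiveBiGRU(nn.Module):
    """biGRU encoder with token-level self-attention pooling."""

    def __init__(self, vocab_size, num_classes, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.att_proj = nn.Linear(2 * hidden, 2 * hidden)
        self.att_score = nn.Linear(2 * hidden, 1, bias=False)
        self.classify = nn.Linear(2 * hidden, num_classes)

    def forward(self, token_ids):                              # (batch, seq_len)
        h, _ = self.gru(self.embed(token_ids))                 # (batch, seq_len, 2*hidden)
        scores = self.att_score(torch.tanh(self.att_proj(h)))  # (batch, seq_len, 1)
        alpha = torch.softmax(scores, dim=1)                   # attention over tokens
        tweet_vec = (alpha * h).sum(dim=1)                     # task-specific representation
        return self.classify(tweet_vec)
```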
Visualizing and Understanding Atari Agents
Title | Visualizing and Understanding Atari Agents |
Authors | Sam Greydanus, Anurag Koul, Jonathan Dodge, Alan Fern |
Abstract | While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating useful saliency maps and use it to show 1) what strong agents attend to, 2) whether agents are making decisions for the right or wrong reasons, and 3) how agents evolve during learning. We also test our method on non-expert human subjects and find that it improves their ability to reason about these agents. Overall, our results show that saliency information can provide significant insight into an RL agent’s decisions and learning behavior. |
Tasks | |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1711.00138v5 |
PDF | http://arxiv.org/pdf/1711.00138v5.pdf |
PWC | https://paperswithcode.com/paper/visualizing-and-understanding-atari-agents |
Repo | https://github.com/slowjazz/interactive-atari-RL |
Framework | pytorch |
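A simplified sketch of perturbation-based saliency in the spirit of this paper: blur a small region of the input frame, re-run the policy, and score the region by how much the policy output changes. The paper interpolates with a Gaussian mask rather than replacing hard patches, so treat the constants and the patch scheme as assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def perturbation_saliency(policy, frame, patch=5, sigma=3.0):
    """Score each region of `frame` by how much blurring it changes the policy
    output. `policy` is assumed to map an HxW float array to action logits."""
    base = policy(frame)
    blurred = gaussian_filter(frame, sigma)
    saliency = np.zeros_like(frame, dtype=float)
    for i in range(0, frame.shape[0], patch):
        for j in range(0, frame.shape[1], patch):
            perturbed = frame.copy()
            perturbed[i:i + patch, j:j + patch] = blurred[i:i + patch, j:j + patch]
            saliency[i:i + patch, j:j + patch] = 0.5 * np.sum((policy(perturbed) - base) ** 2)
    return saliency
```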
A Correlation Based Feature Representation for First-Person Activity Recognition
Title | A Correlation Based Feature Representation for First-Person Activity Recognition |
Authors | Reza Kahani, Alireza Talebpour, Ahmad Mahmoudi-Aznaveh |
Abstract | In this paper, a simple yet efficient activity recognition method for first-person video is introduced. The proposed method is appropriate for representation of high-dimensional features such as those extracted from convolutional neural networks (CNNs). The per-frame (per-segment) extracted features are considered as a set of time series, and inter- and intra-time-series relations are employed to represent the video descriptors. To find the inter-time-series relations, the series are grouped and the linear correlation between each pair of groups is calculated. The relations between them can represent the scene dynamics and local motions. The introduced grouping strategy helps to considerably reduce the computational cost. Furthermore, we split the series in the temporal direction in order to preserve long-term motions and better focus on each local time window. In order to extract the cyclic motion patterns, which can be considered as primary components of various activities, intra-time-series correlations are exploited. The representation method results in highly discriminative features which can be linearly classified. The experiments confirm that our method outperforms state-of-the-art methods at recognizing first-person activities on two challenging first-person datasets. |
Tasks | Activity Recognition, Egocentric Activity Recognition, Time Series |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05523v2 |
PDF | http://arxiv.org/pdf/1711.05523v2.pdf |
PWC | https://paperswithcode.com/paper/a-correlation-based-feature-representation |
Repo | https://github.com/rkahani/FirstPersonActivityRecognition |
Framework | none |
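A rough numpy sketch of the inter-group correlation descriptor described above: treat each dimension of the (T, D) per-frame feature matrix as a time series, partition the series into groups, and use the correlations between group averages as the video descriptor. Contiguous grouping and group averaging are simplifying assumptions of this sketch.

```python
import numpy as np

def correlation_descriptor(features, num_groups=8):
    """Inter-group correlation descriptor for a (T, D) per-frame feature matrix."""
    groups = np.array_split(features, num_groups, axis=1)      # split along feature dim
    group_series = np.stack([g.mean(axis=1) for g in groups])  # (num_groups, T)
    corr = np.corrcoef(group_series)                           # pairwise linear correlation
    iu = np.triu_indices(num_groups, k=1)
    return corr[iu]                                            # inter-group relations
```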
Grammatical facial expression recognition using customized deep neural network architecture
Title | Grammatical facial expression recognition using customized deep neural network architecture |
Authors | Devesh Walawalkar |
Abstract | This paper proposes to expand the visual understanding capacity of computers by helping them recognize human sign language more efficiently. This is carried out through recognition of the facial expressions that accompany the hand signs used in this language. The paper focuses on the popular Brazilian sign language (LIBRAS). While classifying different hand signs into their respective word meanings has already received considerable attention in the literature, the emotions or intentions with which the words are expressed have largely not been taken into consideration. As we know from everyday human experience, words expressed with different emotions or moods can carry completely different meanings. Giving computers the ability to classify these facial expressions can add another level of understanding of what a deaf person actually wants to communicate. The proposed idea is implemented through a deep neural network with a customized architecture, which learns the specific patterns of individual expressions better than a generic approach. With an overall accuracy of 98.04%, the implemented deep network performs well and is thus fit for practical use. |
Tasks | Facial Expression Recognition |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06303v1 |
PDF | http://arxiv.org/pdf/1711.06303v1.pdf |
PWC | https://paperswithcode.com/paper/grammatical-facial-expression-recognition |
Repo | https://github.com/rohithv/Grammatical-facial-expression-recognition |
Framework | none |
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
Title | A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes |
Authors | Pablo Loyola, Edison Marrese-Taylor, Yutaka Matsuo |
Abstract | We propose a model to automatically describe changes introduced in the source code of a program using natural language. Our method receives as input a set of code commits, which contains both the modifications and the message introduced by a user. These two modalities are used to train an encoder-decoder architecture. We evaluated our approach on twelve real-world open source projects from four different programming languages. Quantitative and qualitative results showed that the proposed approach can generate feasible and semantically sound descriptions not only in standard in-project settings, but also in a cross-project setting. |
Tasks | |
Published | 2017-04-17 |
URL | http://arxiv.org/abs/1704.04856v1 |
PDF | http://arxiv.org/pdf/1704.04856v1.pdf |
PWC | https://paperswithcode.com/paper/a-neural-architecture-for-generating-natural |
Repo | https://github.com/epochx/commitgen |
Framework | torch |
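A bare-bones encoder-decoder sketch for the setting described above, mapping a tokenised code change to a natural-language message. Attention, beam search, and the paper's exact preprocessing are omitted; all dimensions are placeholders.

```python
import torch
import torch.nn as nn

class CommitMessageSeq2Seq(nn.Module):
    """GRU encoder over the code change, GRU decoder over the message."""

    def __init__(self, src_vocab, tgt_vocab, emb=128, hidden=256):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, emb, padding_idx=0)
        self.tgt_embed = nn.Embedding(tgt_vocab, emb, padding_idx=0)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, diff_ids, msg_ids):
        _, state = self.encoder(self.src_embed(diff_ids))       # summarise the code change
        dec, _ = self.decoder(self.tgt_embed(msg_ids), state)   # teacher-forced decoding
        return self.out(dec)                                    # next-token logits
```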
PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts
Title | PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts |
Authors | Franck Dernoncourt, Ji Young Lee |
Abstract | We present PubMed 200k RCT, a new dataset based on PubMed for sequential sentence classification. The dataset consists of approximately 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. Each sentence of each abstract is labeled with its role in the abstract using one of the following classes: background, objective, method, result, or conclusion. The purpose of releasing this dataset is twofold. First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task. Second, from an application perspective, researchers need better tools to efficiently skim through the literature. Automatically classifying each sentence in an abstract would help researchers read abstracts more efficiently, especially in fields where abstracts may be long, such as the medical field. |
Tasks | Sentence Classification, Text Classification |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.06071v1 |
PDF | http://arxiv.org/pdf/1710.06071v1.pdf |
PWC | https://paperswithcode.com/paper/pubmed-200k-rct-a-dataset-for-sequential |
Repo | https://github.com/DCYN/Ramdomized-Clinical-Trail-Classification |
Framework | tf |
Learning to Avoid Errors in GANs by Manipulating Input Spaces
Title | Learning to Avoid Errors in GANs by Manipulating Input Spaces |
Authors | Alexander B. Jung |
Abstract | Despite recent advances, large scale visual artifacts are still a common occurrence in images generated by GANs. Previous work has focused on improving the generator’s capability to accurately imitate the data distribution $p_{data}$. In this paper, we instead explore methods that enable GANs to actively avoid errors by manipulating the input space. The core idea is to apply small changes to each noise vector in order to shift them away from areas in the input space that tend to result in errors. We derive three different architectures from that idea. The main one of these consists of a simple residual module that leads to significantly less visual artifacts, while only slightly decreasing diversity. The module is trivial to add to existing GANs and costs almost zero computation and memory. |
Tasks | |
Published | 2017-07-03 |
URL | http://arxiv.org/abs/1707.00768v1 |
PDF | http://arxiv.org/pdf/1707.00768v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-avoid-errors-in-gans-by |
Repo | https://github.com/aleju/gan-error-avoidance |
Framework | pytorch |
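A hedged sketch of the core idea above: a small residual module learns a bounded shift for each noise vector before it enters the generator. The architecture and the scale of the shift are assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class NoiseRepairModule(nn.Module):
    """Shift each noise vector slightly away from error-prone input regions."""

    def __init__(self, z_dim=100, hidden=128, shift_scale=0.1):
        super().__init__()
        self.shift_scale = shift_scale
        self.mlp = nn.Sequential(
            nn.Linear(z_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, z_dim),
            nn.Tanh(),  # bounded residual shift
        )

    def forward(self, z):
        return z + self.shift_scale * self.mlp(z)  # small residual change to z

# generator_input = NoiseRepairModule()(z)  # then pass to the usual generator
```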
DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG
Title | DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG |
Authors | Akara Supratak, Hao Dong, Chao Wu, Yike Guo |
Abstract | The present study proposes a deep learning model, named DeepSleepNet, for automatic sleep stage scoring based on raw single-channel EEG. Most of the existing methods rely on hand-engineered features which require prior knowledge of sleep analysis. Only a few of them encode the temporal information such as transition rules, which is important for identifying the next sleep stages, into the extracted features. In the proposed model, we utilize Convolutional Neural Networks to extract time-invariant features, and bidirectional-Long Short-Term Memory to learn transition rules among sleep stages automatically from EEG epochs. We implement a two-step training algorithm to train our model efficiently. We evaluated our model using different single-channel EEGs (F4-EOG(Left), Fpz-Cz and Pz-Oz) from two public sleep datasets, that have different properties (e.g., sampling rate) and scoring standards (AASM and R&K). The results showed that our model achieved similar overall accuracy and macro F1-score (MASS: 86.2%-81.7, Sleep-EDF: 82.0%-76.9) compared to the state-of-the-art methods (MASS: 85.9%-80.5, Sleep-EDF: 78.9%-73.7) on both datasets. This demonstrated that, without changing the model architecture and the training algorithm, our model could automatically learn features for sleep stage scoring from different raw single-channel EEGs from different datasets without utilizing any hand-engineered features. |
Tasks | EEG, Sleep Stage Detection |
Published | 2017-03-12 |
URL | http://arxiv.org/abs/1703.04046v2 |
PDF | http://arxiv.org/pdf/1703.04046v2.pdf |
PWC | https://paperswithcode.com/paper/deepsleepnet-a-model-for-automatic-sleep |
Repo | https://github.com/famousgrouse/AccSleepNet |
Framework | tf |
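A simplified skeleton of the representation-plus-sequence idea in DeepSleepNet: a 1-D CNN extracts features from each raw EEG epoch and a bidirectional LSTM models stage transitions across epochs. The single CNN branch, the filter sizes, and the absence of the two-step training procedure are simplifications of this sketch.

```python
import torch
import torch.nn as nn

class SleepStageNet(nn.Module):
    """CNN per-epoch feature extractor followed by a BiLSTM over the epoch sequence."""

    def __init__(self, num_stages=5, feat_dim=128, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=50, stride=6), nn.ReLU(),
            nn.MaxPool1d(8),
            nn.Conv1d(64, feat_dim, kernel_size=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.classify = nn.Linear(2 * hidden, num_stages)

    def forward(self, epochs):                                     # (batch, num_epochs, samples)
        b, n, s = epochs.shape
        feats = self.cnn(epochs.reshape(b * n, 1, s)).squeeze(-1)  # per-epoch features
        seq, _ = self.lstm(feats.reshape(b, n, -1))                # model stage transitions
        return self.classify(seq)                                  # per-epoch stage scores
```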
A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features
Title | A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features |
Authors | Olga Papadopoulou, Markos Zampoglou, Symeon Papadopoulos, Ioannis Kompatsiaris |
Abstract | The emergence of social media as news sources has led to the rise of clickbait posts attempting to attract users to click on article links without informing them on the actual article content. This paper presents our efforts to create a clickbait detector inspired by fake news detection algorithms, and our submission to the Clickbait Challenge 2017. The detector is based almost exclusively on text-based features taken from previous work on clickbait detection, our own work on fake post detection, and features we designed specifically for the challenge. We use a two-level classification approach, combining the outputs of 65 first-level classifiers in a second-level feature vector. We present our exploratory results with individual features and their combinations, taken from the post text and the target article title, as well as feature selection. While our own blind tests with the dataset led to an F-score of 0.63, our final evaluation in the Challenge only achieved an F-score of 0.43. We explore the possible causes of this, and lay out potential future steps to achieve more successful results. |
Tasks | Clickbait Detection, Fake News Detection, Feature Selection |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08528v1 |
PDF | http://arxiv.org/pdf/1710.08528v1.pdf |
PWC | https://paperswithcode.com/paper/a-two-level-classification-approach-for |
Repo | https://github.com/clickbait-challenge/snapper |
Framework | none |
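A generic scikit-learn sketch of the two-level (stacking) setup described above: out-of-fold class probabilities from the first-level classifiers form the feature vector of a second-level classifier. The choice of models and the logistic-regression second level are illustrative; the paper combines 65 task-specific first-level classifiers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def two_level_predictions(first_level_models, X_train, y_train, X_test):
    """Stack first-level probabilities into a second-level feature vector."""
    train_meta, test_meta = [], []
    for model in first_level_models:
        # Out-of-fold probabilities avoid leaking training labels to level two.
        train_meta.append(cross_val_predict(model, X_train, y_train,
                                            method="predict_proba", cv=5)[:, 1])
        test_meta.append(model.fit(X_train, y_train).predict_proba(X_test)[:, 1])
    second_level = LogisticRegression().fit(np.column_stack(train_meta), y_train)
    return second_level.predict(np.column_stack(test_meta))
```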
Learning compressed representations of blood samples time series with missing data
Title | Learning compressed representations of blood samples time series with missing data |
Authors | Filippo Maria Bianchi, Karl Øyvind Mikalsen, Robert Jenssen |
Abstract | Clinical measurements collected over time are naturally represented as multivariate time series (MTS), which often contain missing data. An autoencoder can learn low-dimensional vectorial representations of MTS that preserve important data characteristics, but cannot deal explicitly with missing data. In this work, we propose a new framework that combines an autoencoder with the Time series Cluster Kernel (TCK), a kernel that accounts for missingness patterns in MTS. Via kernel alignment, we incorporate TCK in the autoencoder to improve the learned representations in the presence of missing data. We consider a classification problem of MTS with missing values, representing blood samples of patients with surgical site infection. With our approach, rather than with a standard autoencoder, we learn low-dimensional representations that can be classified better. |
Tasks | Time Series |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07547v1 |
PDF | http://arxiv.org/pdf/1710.07547v1.pdf |
PWC | https://paperswithcode.com/paper/learning-compressed-representations-of-blood |
Repo | https://github.com/FilippoMB/TCK_AE |
Framework | tf |
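A minimal sketch of the kernel-alignment term described above: encourage the pairwise similarities of the learned codes to agree with a precomputed TCK kernel matrix for the same mini-batch, and add this term to the usual reconstruction loss. The Frobenius-distance form used here is an assumption of the sketch, not the paper's exact objective.

```python
import torch

def alignment_loss(codes, tck_kernel):
    """Match the code similarity matrix to a precomputed TCK kernel matrix."""
    code_sim = codes @ codes.t()
    code_sim = code_sim / (torch.norm(code_sim) + 1e-8)   # normalise (Frobenius)
    tck = tck_kernel / (torch.norm(tck_kernel) + 1e-8)
    return torch.norm(code_sim - tck) ** 2

# total_loss = reconstruction_loss + lam * alignment_loss(codes, K_tck_batch)
```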
Exploring loss function topology with cyclical learning rates
Title | Exploring loss function topology with cyclical learning rates |
Authors | Leslie N. Smith, Nicholay Topin |
Abstract | We present observations and discussion of previously unreported phenomena discovered while training residual networks. The goal of this work is to better understand the nature of neural networks through the examination of these new empirical results. These behaviors were identified through the application of Cyclical Learning Rates (CLR) and linear network interpolation. Among these behaviors are counterintuitive increases and decreases in training loss and instances of rapid training. For example, we demonstrate how CLR can produce greater testing accuracy than traditional training despite using large learning rates. Files to replicate these results are available at https://github.com/lnsmith54/exploring-loss |
Tasks | |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04283v1 |
PDF | http://arxiv.org/pdf/1702.04283v1.pdf |
PWC | https://paperswithcode.com/paper/exploring-loss-function-topology-with |
Repo | https://github.com/lnsmith54/exploring-loss |
Framework | caffe2 |
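For reference, the triangular cyclical learning rate policy used in this line of work can be sketched as follows; the base/max learning rates and step size are illustrative defaults.

```python
def cyclical_lr(iteration, base_lr=0.001, max_lr=0.006, step_size=2000):
    """Triangular CLR: the learning rate ramps linearly from base_lr to max_lr
    and back down over a cycle of 2 * step_size iterations."""
    cycle = iteration // (2 * step_size)
    x = abs(iteration / step_size - 2 * cycle - 1)      # position within the cycle, in [0, 1]
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```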