February 2, 2020

3684 words 18 mins read

Paper Group AWR 35

Your Stance is Exposed! Analysing Possible Factors for Stance Detection on Social Media. Learning Depth-Guided Convolutions for Monocular 3D Object Detection. Knowing What, How and Why: A Near Complete Solution for Aspect-based Sentiment Analysis. A deep learning approach for automated detection of geographic atrophy from color fundus photographs. …

Your Stance is Exposed! Analysing Possible Factors for Stance Detection on Social Media

Title Your Stance is Exposed! Analysing Possible Factors for Stance Detection on Social Media
Authors Abeer Aldayel, Walid Magdy
Abstract To what extent can a user's stance towards a given topic be inferred? Most studies on stance detection have focused on analysing a user's posts on a given topic to predict the stance. However, stance on social media can be inferred from a mixture of signals that may reflect a user's beliefs, including posts and online interactions. This paper examines various online features of users to detect their stance towards different topics. We compare multiple sets of features, including on-topic content, network interactions, user preferences, and online network connections. Our objective is to understand the online signals that can reveal a user's stance. Experiments are conducted on the tweet dataset from the SemEval stance detection task, which covers five topics. Results show that the stance of a user can be detected from multiple signals of their online activity, including their posts on the topic, the network they interact with or follow, the websites they visit, and the content they like. The performance of stance modelling using different network features is comparable with the state-of-the-art reported model that used textual content only. In addition, combining network and content features leads to the highest performance reported to date on the SemEval dataset, with an F-measure of 72.49%. We further present an extensive analysis of how these different sets of features can reveal stance. Our findings have distinct privacy implications: they highlight that stance is so strongly embedded in a user's online social network that, in principle, individuals can be profiled from their interactions and connections even when they do not post about the topic.
Tasks Stance Detection
Published 2019-08-08
URL https://arxiv.org/abs/1908.03146v1
PDF https://arxiv.org/pdf/1908.03146v1.pdf
PWC https://paperswithcode.com/paper/your-stance-is-exposed-analysing-possible
Repo https://github.com/AbeerAldayel/Stance_detection
Framework none
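
The feature comparison described in the abstract (on-topic content versus network signals) can be illustrated with a small, hedged sketch: a linear classifier over a union of TF-IDF text features and a handful of toy interaction counts. The feature names, counts, and examples below are illustrative placeholders, not the authors' feature set.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import LinearSVC

def network_features(records):
    # Toy stand-ins for interaction/connection signals (followed accounts,
    # retweeted users, visited website domains, liked items, ...).
    return np.array([[r["n_followed"], r["n_retweeted"], r["n_domains"], r["n_likes"]]
                     for r in records], dtype=float)

# Each example is (tweet text, network-profile dict); labels are stances.
data = [("climate change is real and urgent",
         {"n_followed": 120, "n_retweeted": 3, "n_domains": 5, "n_likes": 40}),
        ("this legislation will never work",
         {"n_followed": 80, "n_retweeted": 9, "n_domains": 2, "n_likes": 11})]
labels = ["FAVOR", "AGAINST"]

content = Pipeline([("pick", FunctionTransformer(lambda X: [x[0] for x in X])),
                    ("tfidf", TfidfVectorizer())])
network = FunctionTransformer(lambda X: network_features([x[1] for x in X]))

model = Pipeline([("features", FeatureUnion([("content", content), ("network", network)])),
                  ("clf", LinearSVC())])
model.fit(data, labels)
print(model.predict(data))
```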

Learning Depth-Guided Convolutions for Monocular 3D Object Detection

Title Learning Depth-Guided Convolutions for Monocular 3D Object Detection
Authors Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo
Abstract 3D object detection from a single image without LiDAR is a challenging task due to the lack of accurate depth information. Conventional 2D convolutions are unsuitable for this task because they fail to capture the local object and its scale information, which are vital for 3D object detection. To better represent 3D structure, prior arts typically transform depth maps estimated from 2D images into a pseudo-LiDAR representation, and then apply existing 3D point-cloud based object detectors. However, their results depend heavily on the accuracy of the estimated depth maps, resulting in suboptimal performance. In this work, instead of using a pseudo-LiDAR representation, we improve on fundamental 2D convolutions by proposing a new local convolutional network (LCN), termed Depth-guided Dynamic-Depthwise-Dilated LCN (D$^4$LCN), where the filters and their receptive fields can be automatically learned from image-based depth maps, so that different pixels of different images have different filters. D$^4$LCN overcomes the limitation of conventional 2D convolutions and narrows the gap between image representation and 3D point cloud representation. Extensive experiments show that D$^4$LCN outperforms existing works by large margins. For example, the relative improvement of D$^4$LCN against the state-of-the-art on KITTI is 9.1% in the moderate setting. The code is available at https://github.com/dingmyu/D4LCN.
Tasks 3D Object Detection, Object Detection
Published 2019-12-10
URL https://arxiv.org/abs/1912.04799v2
PDF https://arxiv.org/pdf/1912.04799v2.pdf
PWC https://paperswithcode.com/paper/learning-depth-guided-convolutions-for
Repo https://github.com/dingmyu/D4LCN
Framework pytorch
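
A hedged PyTorch sketch of the central operation the abstract describes: per-pixel depthwise filters generated from depth features and applied to image features. This illustrates the depth-guided dynamic filtering idea only; the released D$^4$LCN code is linked above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthGuidedDepthwiseConv(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        self.channels, self.k = channels, k
        # Generates one k*k depthwise kernel per channel and per spatial
        # location from the depth feature map.
        self.filter_gen = nn.Conv2d(channels, channels * k * k, kernel_size=3, padding=1)

    def forward(self, img_feat, depth_feat):
        b, c, h, w = img_feat.shape
        k = self.k
        kernels = self.filter_gen(depth_feat).view(b, c, k * k, h, w)
        patches = F.unfold(img_feat, k, padding=k // 2).view(b, c, k * k, h, w)
        # Each pixel of each image gets its own depthwise filter.
        return (kernels * patches).sum(dim=2)

img_feat = torch.randn(2, 16, 32, 32)     # image-branch features
depth_feat = torch.randn(2, 16, 32, 32)   # features from an estimated depth map
out = DepthGuidedDepthwiseConv(16)(img_feat, depth_feat)
print(out.shape)  # torch.Size([2, 16, 32, 32])
```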

Knowing What, How and Why: A Near Complete Solution for Aspect-based Sentiment Analysis

Title Knowing What, How and Why: A Near Complete Solution for Aspect-based Sentiment Analysis
Authors Haiyun Peng, Lu Xu, Lidong Bing, Fei Huang, Wei Lu, Luo Si
Abstract Target-based sentiment analysis or aspect-based sentiment analysis (ABSA) refers to addressing various sentiment analysis tasks at a fine-grained level, which includes but is not limited to aspect extraction, aspect sentiment classification, and opinion extraction. There exist many solvers of the above individual subtasks or a combination of two subtasks, and they can work together to tell a complete story, i.e. the discussed aspect, the sentiment on it, and the cause of the sentiment. However, no previous ABSA research tried to provide a complete solution in one shot. In this paper, we introduce a new subtask under ABSA, named aspect sentiment triplet extraction (ASTE). Particularly, a solver of this task needs to extract triplets (What, How, Why) from the inputs, which show WHAT the targeted aspects are, HOW their sentiment polarities are and WHY they have such polarities (i.e. opinion reasons). For instance, one triplet from “Waiters are very friendly and the pasta is simply average” could be (‘Waiters’, positive, ‘friendly’). We propose a two-stage framework to address this task. The first stage predicts what, how and why in a unified model, and then the second stage pairs up the predicted what (how) and why from the first stage to output triplets. In the experiments, our framework has set a benchmark performance in this novel triplet extraction task. Meanwhile, it outperforms a few strong baselines adapted from state-of-the-art related methods.
Tasks Aspect-Based Sentiment Analysis, Aspect Extraction, Sentiment Analysis
Published 2019-11-05
URL https://arxiv.org/abs/1911.01616v4
PDF https://arxiv.org/pdf/1911.01616v4.pdf
PWC https://paperswithcode.com/paper/knowing-what-how-and-why-a-near-complete
Repo https://github.com/xuuuluuu/SemEval-Triplet-data
Framework none
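
As a hedged illustration of the second stage described in the abstract (pairing the predicted what/how with the predicted why), the toy function below matches each aspect span to the nearest opinion span. The spans, polarities, and distance heuristic are illustrative placeholders, not the authors' pairing model.

```python
def pair_triplets(aspects, opinions, max_dist=10):
    # aspects: list of (start, end, text, polarity); opinions: list of (start, end, text)
    triplets = []
    for a_start, a_end, a_text, polarity in aspects:
        best, best_dist = None, max_dist + 1
        for o_start, o_end, o_text in opinions:
            dist = min(abs(o_start - a_end), abs(a_start - o_end))
            if dist < best_dist:
                best, best_dist = o_text, dist
        if best is not None:
            triplets.append((a_text, polarity, best))
    return triplets

# "Waiters are very friendly and the pasta is simply average"
aspects  = [(0, 0, "Waiters", "positive"), (6, 7, "the pasta", "neutral")]
opinions = [(3, 3, "friendly"), (9, 9, "average")]
print(pair_triplets(aspects, opinions))
# [('Waiters', 'positive', 'friendly'), ('the pasta', 'neutral', 'average')]
```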

A deep learning approach for automated detection of geographic atrophy from color fundus photographs

Title A deep learning approach for automated detection of geographic atrophy from color fundus photographs
Authors Tiarnan D. Keenan, Shazia Dharssi, Yifan Peng, Qingyu Chen, Elvira Agrón, Wai T. Wong, Zhiyong Lu, Emily Y. Chew
Abstract Purpose: To assess the utility of deep learning in the detection of geographic atrophy (GA) from color fundus photographs; secondary aim to explore potential utility in detecting central GA (CGA). Design: A deep learning model was developed to detect the presence of GA in color fundus photographs, and two additional models to detect CGA in different scenarios. Participants: 59,812 color fundus photographs from longitudinal follow up of 4,582 participants in the AREDS dataset. Gold standard labels were from human expert reading center graders using a standardized protocol. Methods: A deep learning model was trained to use color fundus photographs to predict GA presence from a population of eyes with no AMD to advanced AMD. A second model was trained to predict CGA presence from the same population. A third model was trained to predict CGA presence from the subset of eyes with GA. For training and testing, 5-fold cross-validation was employed. For comparison with human clinician performance, model performance was compared with that of 88 retinal specialists. Results: The deep learning models (GA detection, CGA detection from all eyes, and centrality detection from GA eyes) had AUC of 0.933-0.976, 0.939-0.976, and 0.827-0.888, respectively. The GA detection model had accuracy, sensitivity, specificity, and precision of 0.965, 0.692, 0.978, and 0.584, respectively. The CGA detection model had equivalent values of 0.966, 0.763, 0.971, and 0.394. The centrality detection model had equivalent values of 0.762, 0.782, 0.729, and 0.799. Conclusions: A deep learning model demonstrated high accuracy for the automated detection of GA. The AUC was non-inferior to that of human retinal specialists. Deep learning approaches may also be applied to the identification of CGA. The code and pretrained models are publicly available at https://github.com/ncbi-nlp/DeepSeeNet.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03153v1
PDF https://arxiv.org/pdf/1906.03153v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-approach-for-automated
Repo https://github.com/ncbi-nlp/DeepSeeNet
Framework tf
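
The 5-fold cross-validation protocol mentioned in the abstract can be sketched as below, with a placeholder classifier and synthetic features standing in for the CNN and the fundus photographs; the real models live in the DeepSeeNet repository linked above.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X = np.random.rand(500, 32)          # stand-in for image-derived features
y = np.random.randint(0, 2, 500)     # stand-in for GA present / absent labels

aucs = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    aucs.append(roc_auc_score(y[test_idx], clf.predict_proba(X[test_idx])[:, 1]))
print("AUC per fold:", np.round(aucs, 3))
```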

DOER: Dual Cross-Shared RNN for Aspect Term-Polarity Co-Extraction

Title DOER: Dual Cross-Shared RNN for Aspect Term-Polarity Co-Extraction
Authors Huaishao Luo, Tianrui Li, Bing Liu, Junbo Zhang
Abstract This paper focuses on two related subtasks of aspect-based sentiment analysis, namely aspect term extraction and aspect sentiment classification, which we call aspect term-polarity co-extraction. The former task is to extract aspects of a product or service from an opinion document, and the latter is to identify the polarity expressed in the document about these extracted aspects. Most existing algorithms address them as two separate tasks and solve them one by one, or only perform one task, which can be complicated for real applications. In this paper, we treat these two tasks as two sequence labeling problems and propose a novel Dual crOss-sharEd RNN framework (DOER) to generate all aspect term-polarity pairs of the input sentence simultaneously. Specifically, DOER involves a dual recurrent neural network to extract the respective representation of each task, and a cross-shared unit to consider the relationship between them. Experimental results demonstrate that the proposed framework outperforms state-of-the-art baselines on three benchmark datasets.
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis
Published 2019-06-05
URL https://arxiv.org/abs/1906.01794v1
PDF https://arxiv.org/pdf/1906.01794v1.pdf
PWC https://paperswithcode.com/paper/doer-dual-cross-shared-rnn-for-aspect-term
Repo https://github.com/ArrowLuo/DOER
Framework tf
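
A hedged PyTorch sketch of the dual-RNN-plus-interaction idea summarized in the abstract: two BiLSTMs produce task-specific representations for aspect-term tags and polarity tags, and a dot-product attention lets each task read from the other. Layer sizes and tag counts are illustrative; the actual cross-shared unit is in the linked DOER repository.

```python
import torch
import torch.nn as nn

class DualCrossShared(nn.Module):
    def __init__(self, emb_dim=50, hidden=64, n_at_tags=5, n_as_tags=4):
        super().__init__()
        self.rnn_at = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.rnn_as = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        d = 2 * hidden
        self.out_at = nn.Linear(2 * d, n_at_tags)     # aspect-term tag scores
        self.out_as = nn.Linear(2 * d, n_as_tags)     # polarity tag scores

    def forward(self, emb):                           # emb: (batch, seq, emb_dim)
        h_at, _ = self.rnn_at(emb)
        h_as, _ = self.rnn_as(emb)
        # Attention of each task over the other task's hidden states.
        scores = torch.einsum("bik,bjk->bij", h_at, h_as)       # (batch, seq, seq)
        at_ctx = torch.softmax(scores, dim=-1) @ h_as            # info flowing AS -> AT
        as_ctx = torch.softmax(scores.transpose(1, 2), dim=-1) @ h_at
        return (self.out_at(torch.cat([h_at, at_ctx], -1)),
                self.out_as(torch.cat([h_as, as_ctx], -1)))

emb = torch.randn(2, 12, 50)
at_logits, as_logits = DualCrossShared()(emb)
print(at_logits.shape, as_logits.shape)   # (2, 12, 5) (2, 12, 4)
```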

Learning Human Objectives by Evaluating Hypothetical Behavior

Title Learning Human Objectives by Evaluating Hypothetical Behavior
Authors Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike
Abstract We seek to align agent behavior with a user’s objectives in a reinforcement learning setting with unknown dynamics, an unknown reward function, and unknown unsafe states. The user knows the rewards and unsafe states, but querying the user is expensive. To address this challenge, we propose an algorithm that safely and interactively learns a model of the user’s reward function. We start with a generative model of initial states and a forward dynamics model trained on off-policy data. Our method uses these models to synthesize hypothetical behaviors, asks the user to label the behaviors with rewards, and trains a neural network to predict the rewards. The key idea is to actively synthesize the hypothetical behaviors from scratch by maximizing tractable proxies for the value of information, without interacting with the environment. We call this method reward query synthesis via trajectory optimization (ReQueST). We evaluate ReQueST with simulated users on a state-based 2D navigation task and the image-based Car Racing video game. The results show that ReQueST significantly outperforms prior methods in learning reward models that transfer to new environments with different initial state distributions. Moreover, ReQueST safely trains the reward model to detect unsafe states, and corrects reward hacking before deploying the agent.
Tasks Car Racing
Published 2019-12-05
URL https://arxiv.org/abs/1912.05652v1
PDF https://arxiv.org/pdf/1912.05652v1.pdf
PWC https://paperswithcode.com/paper/learning-human-objectives-by-evaluating
Repo https://github.com/rddy/ReQueST
Framework tf
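
A minimal sketch of the core idea the abstract describes, under simplifying assumptions: instead of decoding full trajectories through learned generative and dynamics models, it optimizes raw state vectors by gradient ascent on an information-value proxy, here the disagreement of a small reward-model ensemble. Illustrative only; not the released implementation.

```python
import torch
import torch.nn as nn

def reward_net():
    return nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 1))

ensemble = [reward_net() for _ in range(5)]    # reward models trained on labels so far
query = torch.randn(8, 4, requires_grad=True)  # candidate states to show the user
opt = torch.optim.Adam([query], lr=0.05)

for step in range(200):
    preds = torch.stack([net(query) for net in ensemble], dim=0)  # (5, 8, 1)
    disagreement = preds.var(dim=0).mean()     # proxy for value of information
    opt.zero_grad()
    (-disagreement).backward()                 # maximize disagreement
    opt.step()

print("ensemble disagreement after optimization:", float(disagreement))
```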

Charting the Right Manifold: Manifold Mixup for Few-shot Learning

Title Charting the Right Manifold: Manifold Mixup for Few-shot Learning
Authors Puneet Mangla, Mayank Singh, Abhishek Sinha, Nupur Kumari, Vineeth N Balasubramanian, Balaji Krishnamurthy
Abstract Few-shot learning algorithms aim to learn model parameters capable of adapting to unseen classes with the help of only a few labeled examples. A recent regularization technique, Manifold Mixup, focuses on learning a general-purpose representation that is robust to small changes in the data distribution. Since the goal of few-shot learning is closely linked to robust representation learning, we study Manifold Mixup in this problem setting. Self-supervised learning is another technique that learns semantically meaningful features, using only the inherent structure of the data. This work investigates the role of learning a relevant feature manifold for few-shot tasks using self-supervision and regularization techniques. We observe that regularizing the feature manifold, enriched via self-supervised techniques, with Manifold Mixup significantly improves few-shot learning performance. We show that our proposed method S2M2 beats the current state-of-the-art accuracy on standard few-shot learning datasets like CIFAR-FS, CUB, mini-ImageNet and tiered-ImageNet by 3-8%. Through extensive experimentation, we show that the features learned using our approach generalize to complex few-shot evaluation tasks and cross-domain scenarios, and are robust against slight changes to the data distribution.
Tasks Few-Shot Image Classification, Few-Shot Learning, Representation Learning
Published 2019-07-28
URL https://arxiv.org/abs/1907.12087v4
PDF https://arxiv.org/pdf/1907.12087v4.pdf
PWC https://paperswithcode.com/paper/charting-the-right-manifold-manifold-mixup
Repo https://github.com/nupurkmr9/S2M2_fewshot
Framework pytorch
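
A hedged PyTorch sketch of Manifold Mixup applied at a hidden layer, the regularizer that S2M2 combines with self-supervised objectives such as rotation prediction. This is a generic illustration of the technique, not the authors' training code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Beta

class SmallNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(64, 128), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
        self.head = nn.Linear(128, n_classes)

    def forward(self, x, y=None, alpha=2.0):
        h = self.block1(x)
        if y is not None:                          # manifold-mixup training path
            lam = Beta(alpha, alpha).sample().item()
            idx = torch.randperm(x.size(0))
            h = lam * h + (1 - lam) * h[idx]       # mix hidden representations
            logits = self.head(self.block2(h))
            return lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[idx])
        return self.head(self.block2(h))

net = SmallNet()
x, y = torch.randn(16, 64), torch.randint(0, 10, (16,))
print(net(x, y))   # mixed-label training loss
```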

Graph U-Nets

Title Graph U-Nets
Authors Hongyang Gao, Shuiwang Ji
Abstract We consider the problem of representation learning for graph data. Convolutional neural networks can naturally operate on images, but have significant challenges in dealing with graph data. Since images are special cases of graphs whose nodes lie on 2D lattices, graph embedding tasks have a natural correspondence with pixel-wise image prediction tasks such as segmentation. While encoder-decoder architectures like U-Nets have been successfully applied to many image pixel-wise prediction tasks, similar methods are lacking for graph data. This is due to the fact that pooling and up-sampling operations are not natural on graph data. To address these challenges, we propose novel graph pooling (gPool) and unpooling (gUnpool) operations in this work. The gPool layer adaptively selects some nodes to form a smaller graph based on their scalar projection values on a trainable projection vector. We further propose the gUnpool layer as the inverse operation of the gPool layer. The gUnpool layer restores the graph to its original structure using the position information of nodes selected in the corresponding gPool layer. Based on our proposed gPool and gUnpool layers, we develop an encoder-decoder model on graphs, known as graph U-Nets. Our experimental results on node classification and graph classification tasks demonstrate that our methods achieve consistently better performance than previous models.
Tasks Graph Classification, Graph Embedding, Node Classification, Representation Learning
Published 2019-05-11
URL https://arxiv.org/abs/1905.05178v1
PDF https://arxiv.org/pdf/1905.05178v1.pdf
PWC https://paperswithcode.com/paper/graph-u-nets
Repo https://github.com/HongyangGao/Graph-U-Nets
Framework pytorch
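
A hedged sketch of the gPool operation described in the abstract: project node features onto a trainable vector, keep the top-k nodes, gate their features by the sigmoid of the scores, and take the induced subgraph. The official layers are in the Graph-U-Nets repository linked above.

```python
import torch
import torch.nn as nn

class GPool(nn.Module):
    def __init__(self, in_dim, ratio=0.5):
        super().__init__()
        self.p = nn.Parameter(torch.randn(in_dim))   # trainable projection vector
        self.ratio = ratio

    def forward(self, x, adj):                       # x: (N, d), adj: (N, N)
        scores = x @ self.p / self.p.norm()          # scalar projections
        k = max(1, int(self.ratio * x.size(0)))
        values, idx = scores.topk(k)
        x_pool = x[idx] * torch.sigmoid(values).unsqueeze(-1)   # gated features
        adj_pool = adj[idx][:, idx]                  # induced subgraph
        return x_pool, adj_pool, idx                 # idx can be reused by gUnpool

x, adj = torch.randn(6, 8), (torch.rand(6, 6) > 0.5).float()
x_p, adj_p, idx = GPool(8)(x, adj)
print(x_p.shape, adj_p.shape, idx.tolist())
```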

Adversarial Attacks and Defenses in Images, Graphs and Text: A Review

Title Adversarial Attacks and Defenses in Images, Graphs and Text: A Review
Authors Han Xu, Yao Ma, Haochen Liu, Debayan Deb, Hui Liu, Jiliang Tang, Anil K. Jain
Abstract Deep neural networks (DNN) have achieved unprecedented success in numerous machine learning tasks in various domains. However, the existence of adversarial examples has raised concerns about applying deep learning to safety-critical applications. As a result, we have witnessed increasing interest in studying attack and defense mechanisms for DNN models on different data types, such as images, graphs and text. Thus, it is necessary to provide a systematic and comprehensive overview of the main threats of attacks and the success of corresponding countermeasures. In this survey, we review state-of-the-art algorithms for generating adversarial examples and the countermeasures against adversarial examples, for the three popular data types, i.e., images, graphs and text.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.08072v2
PDF https://arxiv.org/pdf/1909.08072v2.pdf
PWC https://paperswithcode.com/paper/adversarial-attacks-and-defenses-in-images
Repo https://github.com/snaka0213/PyTorch-AdvAttacks
Framework pytorch
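
As a concrete example of the image-domain attacks the survey covers, here is a minimal FGSM sketch: perturb the input in the direction of the sign of the loss gradient. The model is a throwaway linear classifier used purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One signed-gradient step, clipped back to the valid pixel range.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x, y = torch.rand(4, 1, 28, 28), torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())   # perturbation bounded by eps
```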

E-LPIPS: Robust Perceptual Image Similarity via Random Transformation Ensembles

Title E-LPIPS: Robust Perceptual Image Similarity via Random Transformation Ensembles
Authors Markus Kettunen, Erik Härkönen, Jaakko Lehtinen
Abstract It has been recently shown that the hidden variables of convolutional neural networks make for an efficient perceptual similarity metric that accurately predicts human judgment on relative image similarity assessment. First, we show that such learned perceptual similarity metrics (LPIPS) are susceptible to adversarial attacks that dramatically contradict human visual similarity judgment. While this is not surprising in light of neural networks’ well-known weakness to adversarial perturbations, we proceed to show that self-ensembling with an infinite family of random transformations of the input — a technique known not to render classification networks robust — is enough to turn the metric robust against attack, while retaining predictive power on human judgments. Finally, we study the geometry imposed by our novel self-ensembled metric (E-LPIPS) on the space of natural images. We find evidence of “perceptual convexity” by showing that convex combinations of similar-looking images retain appearance, and that discrete geodesics yield meaningful frame interpolation and texture morphing, all without explicit correspondences.
Tasks Image Similarity Search
Published 2019-06-10
URL https://arxiv.org/abs/1906.03973v2
PDF https://arxiv.org/pdf/1906.03973v2.pdf
PWC https://paperswithcode.com/paper/e-lpips-robust-perceptual-image-similarity
Repo https://github.com/mkettune/elpips
Framework tf
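
A hedged sketch of the self-ensembling idea: average a base distance over random transformations applied identically to both inputs. The base distance here is plain pixel-wise L2 as a stand-in; E-LPIPS averages LPIPS network-feature distances over a much richer transformation family.

```python
import torch

def random_transform(x, gen):
    if torch.rand(1, generator=gen).item() < 0.5:        # random horizontal flip
        x = torch.flip(x, dims=[-1])
    shift = int(torch.randint(0, 4, (1,), generator=gen).item())
    return torch.roll(x, shifts=shift, dims=-1)          # small random translation

def ensembled_distance(x, y, n=16, seed=0):
    gen = torch.Generator().manual_seed(seed)
    total = 0.0
    for _ in range(n):
        state = gen.get_state()
        xt = random_transform(x, gen)
        gen.set_state(state)                             # same transform for both inputs
        yt = random_transform(y, gen)
        total += ((xt - yt) ** 2).mean()                 # L2 stand-in for LPIPS
    return total / n

a, b = torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32)
print(float(ensembled_distance(a, b)))
```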

EnlightenGAN: Deep Light Enhancement without Paired Supervision

Title EnlightenGAN: Deep Light Enhancement without Paired Supervision
Authors Yifan Jiang, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou, Zhangyang Wang
Abstract Deep learning-based methods have achieved remarkable success in image restoration and enhancement, but are they still competitive when there is a lack of paired training data? As one such example, this paper explores the low-light image enhancement problem, where in practice it is extremely challenging to simultaneously take a low-light and a normal-light photo of the same visual scene. We propose a highly effective unsupervised generative adversarial network, dubbed EnlightenGAN, that can be trained without low/normal-light image pairs, yet proves to generalize very well on various real-world test images. Instead of supervising the learning using ground truth data, we propose to regularize the unpaired training using the information extracted from the input itself, and benchmark a series of innovations for the low-light image enhancement problem, including a global-local discriminator structure, a self-regularized perceptual loss fusion, and an attention mechanism. Extensive experiments show that our proposed approach outperforms recent methods on a variety of metrics, covering visual quality and a subjective user study. Thanks to the great flexibility brought by unpaired training, EnlightenGAN is demonstrated to be easily adaptable to enhancing real-world images from various domains. The code is available at \url{https://github.com/yueruchen/EnlightenGAN}
Tasks Image Enhancement, Image Restoration, Low-Light Image Enhancement
Published 2019-06-17
URL https://arxiv.org/abs/1906.06972v1
PDF https://arxiv.org/pdf/1906.06972v1.pdf
PWC https://paperswithcode.com/paper/enlightengan-deep-light-enhancement-without
Repo https://github.com/ksheeraj/CS256-AI-ObjectDetection
Framework none
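
One ingredient mentioned in the abstract, the attention mechanism derived from the input itself, can be sketched as an illumination-based map (roughly one minus the max-RGB brightness) that weights enhancement towards dark regions. This is an illustration of the self-regularization idea, not the released EnlightenGAN model.

```python
import torch

def illumination_attention(rgb):                        # rgb: (B, 3, H, W) in [0, 1]
    brightness = rgb.max(dim=1, keepdim=True).values    # max over colour channels
    return 1.0 - brightness                             # darker pixels get higher weight

x = torch.rand(2, 3, 64, 64)
attn = illumination_attention(x)
enhanced_residual = torch.randn(2, 3, 64, 64)           # stand-in for a generator output
output = x + attn * enhanced_residual                   # attention-weighted enhancement
print(attn.shape, output.shape)
```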

Predicting the Type and Target of Offensive Posts in Social Media

Title Predicting the Type and Target of Offensive Posts in Social Media
Authors Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar
Abstract As offensive content has become pervasive in social media, there has been much research in identifying potentially offensive messages. However, previous work on this topic did not consider the problem as a whole, but rather focused on detecting very specific types of offensive content, e.g., hate speech, cyberbullying, or cyber-aggression. In contrast, here we target several different kinds of offensive content. In particular, we model the task hierarchically, identifying the type and the target of offensive messages in social media. For this purpose, we compiled the Offensive Language Identification Dataset (OLID), a new dataset with tweets annotated for offensive content using a fine-grained three-layer annotation scheme, which we make publicly available. We discuss the main similarities and differences between OLID and pre-existing datasets for hate speech identification, aggression detection, and similar tasks. We further experiment with and compare the performance of different machine learning models on OLID.
Tasks Language Identification
Published 2019-02-25
URL http://arxiv.org/abs/1902.09666v2
PDF http://arxiv.org/pdf/1902.09666v2.pdf
PWC https://paperswithcode.com/paper/predicting-the-type-and-target-of-offensive
Repo https://github.com/idontflow/OLID
Framework none
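
The three-layer annotation scheme described in the abstract maps naturally onto a cascade of classifiers (level A: offensive or not; level B: targeted or untargeted; level C: target type). The sketch below wires that cascade with placeholder rule-based classifiers; the labels follow OLID's IND/GRP/OTH convention, but the classifiers themselves are toys.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class OffensivePrediction:
    offensive: bool                  # level A
    targeted: Optional[bool] = None  # level B
    target: Optional[str] = None     # level C: "IND", "GRP" or "OTH"

def hierarchical_predict(text: str,
                         level_a: Callable[[str], bool],
                         level_b: Callable[[str], bool],
                         level_c: Callable[[str], str]) -> OffensivePrediction:
    if not level_a(text):                        # not offensive: stop at level A
        return OffensivePrediction(offensive=False)
    if not level_b(text):                        # offensive but untargeted
        return OffensivePrediction(offensive=True, targeted=False)
    return OffensivePrediction(offensive=True, targeted=True, target=level_c(text))

# Toy rule-based stand-ins for trained models.
pred = hierarchical_predict("you people are awful",
                            level_a=lambda t: "awful" in t,
                            level_b=lambda t: "you" in t,
                            level_c=lambda t: "GRP" if "people" in t else "IND")
print(pred)
```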

A Fully Differentiable Beam Search Decoder

Title A Fully Differentiable Beam Search Decoder
Authors Ronan Collobert, Awni Hannun, Gabriel Synnaeve
Abstract We introduce a new beam search decoder that is fully differentiable, making it possible to optimize at training time through the inference procedure. Our decoder allows us to combine models which operate at different granularities (e.g. acoustic and language models). It can be used when target sequences are not aligned to input sequences by considering all possible alignments between the two. We demonstrate our approach scales by applying it to speech recognition, jointly training acoustic and word-level language models. The system is end-to-end, with gradients flowing through the whole architecture from the word-level transcriptions. Recent research efforts have shown that deep neural networks with attention-based mechanisms are powerful enough to successfully train an acoustic model from the final transcription, while implicitly learning a language model. Instead, we show that it is possible to discriminatively train an acoustic model jointly with an explicit and possibly pre-trained language model.
Tasks Language Modelling, Speech Recognition
Published 2019-02-16
URL http://arxiv.org/abs/1902.06022v1
PDF http://arxiv.org/pdf/1902.06022v1.pdf
PWC https://paperswithcode.com/paper/a-fully-differentiable-beam-search-decoder
Repo https://github.com/johnhw/differentiable_sorting
Framework tf

Detector-in-Detector: Multi-Level Analysis for Human-Parts

Title Detector-in-Detector: Multi-Level Analysis for Human-Parts
Authors Xiaojie Li, Lu Yang, Qing Song, Fuqiang Zhou
Abstract Vision-based person, hand or face detection approaches have achieved incredible success in recent years with the development of deep convolutional neural networks (CNN). In this paper, we take the inherent correlation between the body and body parts into account and propose a new framework to boost the detection performance of multi-level objects. In particular, we adopt a region-based object detection structure with two carefully designed detectors that separately pay attention to the human body and body parts in a coarse-to-fine manner, which we call the Detector-in-Detector network (DID-Net). The first detector is designed to detect human body, hand, and face. The second detector, based on the body detection results of the first detector, mainly focuses on the detection of small hands and faces inside each body. The framework is trained in an end-to-end way by optimizing a multi-task loss. Due to the lack of a human body, face and hand detection dataset, we have collected and labeled a new large dataset named Human-Parts with 14,962 images and 106,879 annotations. Experiments show that our method can achieve excellent performance on Human-Parts.
Tasks Face Detection, Object Detection
Published 2019-02-19
URL http://arxiv.org/abs/1902.07017v1
PDF http://arxiv.org/pdf/1902.07017v1.pdf
PWC https://paperswithcode.com/paper/detector-in-detector-multi-level-analysis-for
Repo https://github.com/svjack/Detector-in-Detector
Framework tf
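
A hedged sketch of the coarse-to-fine flow described in the abstract: a first-stage detector proposes body boxes, a second detector runs inside each body crop, and part boxes are mapped back to image coordinates. Both detectors and the crop function are placeholders; the real DID-Net shares features between stages and is trained end to end with a multi-task loss.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]          # (x1, y1, x2, y2)

def detect_parts_in_bodies(image,
                           body_detector: Callable[[object], List[Box]],
                           part_detector: Callable[[object], List[Box]],
                           crop: Callable[[object, Box], object]) -> List[Box]:
    parts: List[Box] = []
    for bx1, by1, bx2, by2 in body_detector(image):
        for px1, py1, px2, py2 in part_detector(crop(image, (bx1, by1, bx2, by2))):
            # Part boxes are relative to the crop; shift back to image coordinates.
            parts.append((bx1 + px1, by1 + py1, bx1 + px2, by1 + py2))
    return parts

# Toy example with fake detectors on a dummy "image".
print(detect_parts_in_bodies(
    image=None,
    body_detector=lambda img: [(10, 20, 110, 220)],
    part_detector=lambda cropped: [(30, 5, 60, 35)],   # a face inside the body crop
    crop=lambda img, box: img))
# [(40, 25, 70, 55)]
```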

Asymmetric Generative Adversarial Networks for Image-to-Image Translation

Title Asymmetric Generative Adversarial Networks for Image-to-Image Translation
Authors Hao Tang, Dan Xu, Hong Liu, Nicu Sebe
Abstract State-of-the-art models for unpaired image-to-image translation with Generative Adversarial Networks (GANs) can learn the mapping from the source domain to the target domain using a cycle-consistency loss. The intuition behind these models is that if we translate from one domain to the other and back again we should arrive at where we started. However, existing methods always adopt a symmetric network architecture to learn both forward and backward cycles. Because of the task complexity and cycle input difference between the source and target image domains, the inequality in bidirectional forward-backward cycle translations is significant and the amount of information between the two domains differs. In this paper, we analyze the limitation of existing symmetric GAN models in asymmetric translation tasks, and propose an AsymmetricGAN model with translation and reconstruction generators of unequal sizes and different parameter-sharing strategies to adapt to the asymmetric needs of both unsupervised and supervised image-to-image translation tasks. Moreover, the training stage of existing methods has the common problem of model collapse that degrades the quality of the generated images; we therefore explore different optimization losses for better training of AsymmetricGAN, making image-to-image translation more consistent and more stable. Extensive experiments on both supervised and unsupervised generative tasks with several publicly available datasets demonstrate that the proposed AsymmetricGAN achieves superior model capacity and better generation performance compared with existing GAN models. To the best of our knowledge, we are the first to investigate the asymmetric GAN framework on both unsupervised and supervised image-to-image translation tasks. The source code, data and trained models are available at https://github.com/Ha0Tang/AsymmetricGAN.
Tasks Image-to-Image Translation
Published 2019-12-14
URL https://arxiv.org/abs/1912.06931v1
PDF https://arxiv.org/pdf/1912.06931v1.pdf
PWC https://paperswithcode.com/paper/asymmetric-generative-adversarial-networks
Repo https://github.com/Ha0Tang/AsymmetricGAN
Framework pytorch
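
The asymmetry the abstract argues for can be sketched by giving the translation generator more capacity than the reconstruction generator while tying them with a cycle-consistency loss, as below. Widths, depths, and the loss are illustrative stand-ins, not the released AsymmetricGAN architecture.

```python
import torch
import torch.nn as nn

def conv_generator(width, depth):
    layers = [nn.Conv2d(3, width, 3, padding=1), nn.ReLU()]
    for _ in range(depth):
        layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
    layers += [nn.Conv2d(width, 3, 3, padding=1), nn.Tanh()]
    return nn.Sequential(*layers)

G_trans = conv_generator(width=64, depth=6)   # heavier translation generator
G_rec = conv_generator(width=16, depth=2)     # lighter reconstruction generator

x = torch.rand(2, 3, 64, 64) * 2 - 1          # source-domain batch in [-1, 1]
fake_y = G_trans(x)
cycle_loss = nn.functional.l1_loss(G_rec(fake_y), x)   # ||F(G(x)) - x||_1
print(sum(p.numel() for p in G_trans.parameters()),
      sum(p.numel() for p in G_rec.parameters()))
print(float(cycle_loss))
```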