October 20, 2019

3119 words 15 mins read

Paper Group AWR 275



We Built a Fake News & Click-bait Filter: What Happened Next Will Blow Your Mind!

Title We Built a Fake News & Click-bait Filter: What Happened Next Will Blow Your Mind!
Authors Georgi Karadzhov, Pepa Gencheva, Preslav Nakov, Ivan Koychev
Abstract It is completely amazing! Fake news and click-baits have totally invaded the cyber space. Let us face it: everybody hates them for three simple reasons. Reason #2 will absolutely amaze you. What these can achieve at the time of election will completely blow your mind! Now, we all agree, this cannot go on, you know, somebody has to stop it. So, we did this research on fake news/click-bait detection and trust us, it is totally great research, it really is! Make no mistake. This is the best research ever! Seriously, come have a look, we have it all: neural networks, attention mechanism, sentiment lexicons, author profiling, you name it. Lexical features, semantic features, we absolutely have it all. And we have totally tested it, trust us! We have results, and numbers, really big numbers. The best numbers ever! Oh, and analysis, absolutely top notch analysis. Interested? Come read the shocking truth about fake news and click-bait in the Bulgarian cyber space. You won’t believe what we have found!
Tasks
Published 2018-03-10
URL http://arxiv.org/abs/1803.03786v1
PDF http://arxiv.org/pdf/1803.03786v1.pdf
PWC https://paperswithcode.com/paper/we-built-a-fake-news-click-bait-filter-what
Repo https://github.com/gkaradzhov/ClickbaitRANLP
Framework none
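
The model described in this entry combines lexical and semantic features, sentiment lexicons, attention and author profiling in a neural click-bait/fake-news classifier. As a rough illustration only (not the authors' architecture), the sketch below shows a minimal lexical baseline for click-bait headline detection with scikit-learn; the headlines and labels are made up for the example.

```python
# Minimal lexical baseline for click-bait detection (illustrative only; the
# paper's actual model also uses attention, sentiment lexicons and author profiling).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: headlines labelled 1 (click-bait) or 0 (regular news).
headlines = [
    "You won't believe what happened next!",
    "Parliament passes the 2018 budget bill",
    "10 shocking facts doctors don't want you to know",
    "Central bank leaves interest rates unchanged",
]
labels = [1, 0, 1, 0]

# Character n-grams are a common choice for morphologically rich languages such as Bulgarian.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(headlines, labels)
print(model.predict(["This one weird trick will change your life"]))
```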

COSMO: Contextualized Scene Modeling with Boltzmann Machines

Title COSMO: Contextualized Scene Modeling with Boltzmann Machines
Authors Ilker Bozcan, Sinan Kalkan
Abstract Scene modeling is crucial for robots that need to perceive, reason about and manipulate the objects in their environments. In this paper, we adapt and extend Boltzmann Machines (BMs) for contextualized scene modeling. Although there are many models on the subject, ours is the first to bring together objects, relations, and affordances in a highly-capable generative model. To this end, we introduce a hybrid version of BMs where relations and affordances are introduced with shared, tri-way connections into the model. Moreover, we contribute a dataset for relation estimation and modeling studies. We evaluate our method in comparison with several baselines on object estimation, out-of-context object detection, relation estimation, and affordance estimation tasks. Finally, to illustrate the generative capability of the model, we show several example scenes that the model is able to generate.
Tasks Object Detection
Published 2018-07-02
URL http://arxiv.org/abs/1807.00511v2
PDF http://arxiv.org/pdf/1807.00511v2.pdf
PWC https://paperswithcode.com/paper/cosmo-contextualized-scene-modeling-with
Repo https://github.com/bozcani/COSMO
Framework tf
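
COSMO extends Boltzmann Machines with shared tri-way connections for relations and affordances. The sketch below only illustrates the plain restricted-BM machinery such a model builds on (block Gibbs sampling over binary units); the layer sizes and the scene encoding are made-up assumptions, not the paper's model.

```python
# Plain restricted Boltzmann machine sketch in NumPy (the paper's hybrid BM adds
# shared tri-way connections; this only shows the basic energy-based sampling loop).
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 4          # e.g. visible units as object-presence indicators
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)
b_h = np.zeros(n_hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v):
    """One block-Gibbs sweep: sample hidden given visible, then visible given hidden."""
    p_h = sigmoid(v @ W + b_h)
    h = (rng.random(n_hidden) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_v)
    v_new = (rng.random(n_visible) < p_v).astype(float)
    return v_new, h

v = rng.integers(0, 2, size=n_visible).astype(float)   # a hypothetical scene encoding
for _ in range(100):
    v, h = gibbs_step(v)
print("sampled scene configuration:", v)
```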

Deep Association Learning for Unsupervised Video Person Re-identification

Title Deep Association Learning for Unsupervised Video Person Re-identification
Authors Yanbei Chen, Xiatian Zhu, Shaogang Gong
Abstract Deep learning methods have started to dominate the research progress of video-based person re-identification (re-id). However, existing methods mostly consider supervised learning, which requires exhaustive manual efforts for labelling cross-view pairwise data. Therefore, they severely lack scalability and practicality in real-world video surveillance applications. In this work, to address the video person re-id task, we formulate a novel Deep Association Learning (DAL) scheme, the first end-to-end deep learning method that uses no identity labels in model initialisation and training. DAL learns a deep re-id matching model by jointly optimising two margin-based association losses in an end-to-end manner, which effectively constrain the association of each frame to the best-matched intra-camera representation and cross-camera representation. Existing standard CNNs can be readily employed within our DAL scheme. Experimental results demonstrate that our proposed DAL significantly outperforms current state-of-the-art unsupervised video person re-id methods on three benchmarks: PRID 2011, iLIDS-VID and MARS.
Tasks Person Re-Identification, Video-Based Person Re-Identification
Published 2018-08-22
URL http://arxiv.org/abs/1808.07301v1
PDF http://arxiv.org/pdf/1808.07301v1.pdf
PWC https://paperswithcode.com/paper/deep-association-learning-for-unsupervised
Repo https://github.com/yanbeic/Deep-Association-Learning
Framework tf
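
DAL constrains each frame's association to its best-matched intra-camera and cross-camera representations via two margin-based losses. Below is a hedged NumPy sketch of a generic margin-based association loss in that spirit; it is not the authors' exact formulation, and the anchors and feature dimensions are placeholders.

```python
# Hedged sketch of a margin-based association loss in the spirit of DAL (not the
# authors' exact objective): keep each frame feature closer to its best-matched
# camera-level anchor than to the runner-up by at least a margin.
import numpy as np

def association_margin_loss(frame_feat, anchors, margin=0.5):
    """frame_feat: (d,) L2-normalised feature; anchors: (K, d) candidate anchors."""
    d = np.linalg.norm(anchors - frame_feat, axis=1)   # distances to all anchors
    order = np.argsort(d)
    best, second = d[order[0]], d[order[1]]
    # hinge: the best-matched anchor should beat the second-best by `margin`
    return max(0.0, best - second + margin)

rng = np.random.default_rng(1)
feat = rng.normal(size=8); feat /= np.linalg.norm(feat)
anchors = rng.normal(size=(5, 8)); anchors /= np.linalg.norm(anchors, axis=1, keepdims=True)
print(association_margin_loss(feat, anchors))
```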

Concurrent Learning of Semantic Relations

Title Concurrent Learning of Semantic Relations
Authors Georgios Balikas, Gaël Dias, Rumen Moraliyski, Massih-Reza Amini
Abstract Discovering whether words are semantically related and identifying the specific semantic relation that holds between them is of crucial importance for NLP as it is essential for tasks like query expansion in IR. Within this context, different methodologies have been proposed that either exclusively focus on a single lexical relation (e.g. hypernymy vs. random) or learn specific classifiers capable of identifying multiple semantic relations (e.g. hypernymy vs. synonymy vs. random). In this paper, we propose another way to look at the problem that relies on the multi-task learning paradigm. In particular, we want to study whether the learning process of a given semantic relation (e.g. hypernymy) can be improved by the concurrent learning of another semantic relation (e.g. co-hyponymy). Within this context, we particularly examine the benefits of semi-supervised learning where the training of a prediction function is performed over a few labeled examples jointly with many unlabeled ones. Preliminary results based on simple learning strategies and state-of-the-art distributional feature representations show that concurrent learning can lead to improvements in the vast majority of tested situations.
Tasks Multi-Task Learning
Published 2018-07-26
URL http://arxiv.org/abs/1807.10076v3
PDF http://arxiv.org/pdf/1807.10076v3.pdf
PWC https://paperswithcode.com/paper/concurrent-learning-of-semantic-relations
Repo https://github.com/Houssam93/MultiTask-Learning-NLP
Framework none
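
The core idea above is to learn one semantic relation concurrently with another through shared parameters. The sketch below is a minimal, assumed setup (not the paper's model): a shared projection of word-pair features feeding two binary heads, one per relation, updated jointly by SGD.

```python
# Minimal multi-task sketch: a shared projection feeds two heads, one per semantic
# relation (e.g. hypernymy vs. random, co-hyponymy vs. random), trained jointly.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_shared = 20, 8
W_shared = rng.normal(scale=0.1, size=(d_in, d_shared))
w_hyper = np.zeros(d_shared)       # head 1: hypernymy
w_cohypo = np.zeros(d_shared)      # head 2: co-hyponymy

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def joint_step(x, y_hyper, y_cohypo, lr=0.1):
    """One SGD step on the summed binary cross-entropy of both heads."""
    global W_shared, w_hyper, w_cohypo
    h = x @ W_shared
    p1, p2 = sigmoid(h @ w_hyper), sigmoid(h @ w_cohypo)
    g1, g2 = p1 - y_hyper, p2 - y_cohypo              # dL/dlogit for each head
    grad_shared = np.outer(x, g1 * w_hyper + g2 * w_cohypo)
    w_hyper -= lr * g1 * h
    w_cohypo -= lr * g2 * h
    W_shared -= lr * grad_shared

# toy usage with a hypothetical word-pair feature vector
x = rng.normal(size=d_in)
joint_step(x, y_hyper=1.0, y_cohypo=0.0)
```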

Learning Local RGB-to-CAD Correspondences for Object Pose Estimation

Title Learning Local RGB-to-CAD Correspondences for Object Pose Estimation
Authors Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jana Kosecka
Abstract We consider the problem of 3D object pose estimation. While much recent work has focused on the RGB domain, the reliance on accurately annotated images limits their generalizability and scalability. On the other hand, the easily available CAD models of objects are rich sources of data, providing a large number of synthetically rendered images. In this paper, we solve this key problem of existing methods requiring expensive 3D pose annotations by proposing a new method that matches RGB images to CAD models for object pose estimation. Our key innovations compared to existing work include removing the need for either real-world textures for CAD models or explicit 3D pose annotations for RGB images. We achieve this through a series of objectives that learn how to select keypoints and enforce viewpoint and modality invariance across RGB images and CAD model renderings. We conduct extensive experiments to demonstrate that the proposed method can reliably estimate object pose in RGB images, as well as generalize to object instances not seen during training.
Tasks Pose Estimation
Published 2018-11-18
URL https://arxiv.org/abs/1811.07249v4
PDF https://arxiv.org/pdf/1811.07249v4.pdf
PWC https://paperswithcode.com/paper/matching-rgb-images-to-cad-models-for-object
Repo https://github.com/YoungXIAO13/PoseFromShape
Framework pytorch
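
The method learns keypoint descriptors that are invariant across viewpoint and modality (real RGB crops vs. synthetic CAD renderings). As a simplified stand-in for those objectives, the sketch below uses a standard contrastive loss; the function name, margin and descriptors are illustrative assumptions, not the paper's exact losses.

```python
# Contrastive loss sketch: pull descriptors of the same keypoint together across
# modalities (RGB crop vs. CAD rendering), push different keypoints apart.
import numpy as np

def contrastive_modality_loss(rgb_desc, cad_desc, same_keypoint, margin=1.0):
    """rgb_desc, cad_desc: (d,) descriptors; same_keypoint: bool."""
    dist = np.linalg.norm(rgb_desc - cad_desc)
    if same_keypoint:
        return dist ** 2                               # pull matching pairs together
    return max(0.0, margin - dist) ** 2                # push non-matching pairs apart

rng = np.random.default_rng(2)
a, b = rng.normal(size=16), rng.normal(size=16)
print(contrastive_modality_loss(a, b, same_keypoint=True))
print(contrastive_modality_loss(a, b, same_keypoint=False))
```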

Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling

Title Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling
Authors Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman
Abstract Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling. We conduct the first large-scale systematic study of candidate pretraining tasks, comparing 19 different tasks both as alternatives and complements to language modeling. Our primary results support the use of language modeling, especially when combined with pretraining on additional labeled-data tasks. However, our results are mixed across pretraining tasks and show some concerning trends: In ELMo’s pretrain-then-freeze paradigm, random baselines are worryingly strong and results vary strikingly across target tasks. In addition, fine-tuning BERT on an intermediate task often negatively impacts downstream transfer. In a more positive trend, we see modest gains from multitask training, suggesting the development of more sophisticated multitask and transfer learning techniques as an avenue for further research.
Tasks Language Modelling, Transfer Learning
Published 2018-12-28
URL https://arxiv.org/abs/1812.10860v5
PDF https://arxiv.org/pdf/1812.10860v5.pdf
PWC https://paperswithcode.com/paper/looking-for-elmos-friends-sentence-level
Repo https://github.com/nyu-mll/jiant
Framework pytorch

Enhanced Optic Disk and Cup Segmentation with Glaucoma Screening from Fundus Images using Position encoded CNNs

Title Enhanced Optic Disk and Cup Segmentation with Glaucoma Screening from Fundus Images using Position encoded CNNs
Authors Vismay Agrawal, Avinash Kori, Varghese Alex, Ganapathy Krishnamurthi
Abstract In this manuscript, we present a robust method for glaucoma screening from fundus images using an ensemble of convolutional neural networks (CNNs). The pipeline comprises first segmenting the optic disk and optic cup from the fundus image, then extracting a patch centered around the optic disk, and subsequently feeding it to the classification network to differentiate the image as diseased or healthy. In the segmentation network, apart from the image, we make use of the spatial coordinate (X & Y) space so as to better learn the structure of interest. The classification network is composed of a DenseNet201 and a ResNet18 which were pre-trained on a large cohort of natural images. On the REFUGE validation data (n=400), the segmentation network achieved a dice score of 0.88 and 0.64 for the optic disc and optic cup respectively. For the task of differentiating images affected by glaucoma from healthy images, the area under the ROC curve was observed to be 0.85.
Tasks
Published 2018-09-14
URL http://arxiv.org/abs/1809.05216v1
PDF http://arxiv.org/pdf/1809.05216v1.pdf
PWC https://paperswithcode.com/paper/enhanced-optic-disk-and-cup-segmentation-with
Repo https://github.com/koriavinash1/Optic-Disk-Cup-Segmentation
Framework pytorch
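
The position encoding mentioned in the abstract amounts to feeding the network normalised X and Y coordinate maps alongside the image. A minimal sketch of that preprocessing step is shown below; the coordinate range and image size are assumptions, not the authors' exact configuration.

```python
# Append normalised X and Y coordinate channels so the segmentation network can
# exploit the roughly fixed location of the optic disc (illustrative preprocessing).
import numpy as np

def add_coordinate_channels(image):
    """image: (H, W, C) array -> (H, W, C + 2) with x/y channels in [0, 1]."""
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    return np.concatenate([image, xs[..., None], ys[..., None]], axis=-1)

fundus = np.random.rand(512, 512, 3)        # placeholder for a real fundus image
augmented = add_coordinate_channels(fundus)
print(augmented.shape)                       # (512, 512, 5)
```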

DeepTerramechanics: Terrain Classification and Slip Estimation for Ground Robots via Deep Learning

Title DeepTerramechanics: Terrain Classification and Slip Estimation for Ground Robots via Deep Learning
Authors Ramon Gonzalez, Karl Iagnemma
Abstract Terramechanics plays a critical role in the areas of ground vehicles and ground mobile robots since understanding and estimating the variables influencing the vehicle-terrain interaction may mean the success or the failure of an entire mission. This research applies state-of-the-art algorithms in deep learning to two key problems: estimating wheel slip and classifying the terrain being traversed by a ground robot. Three data sets collected by ground robotic platforms (MIT single-wheel testbed, MSL Curiosity rover, and tracked robot Fitorobot) are employed in order to compare the performance of traditional machine learning methods (i.e. Support Vector Machine (SVM) and Multi-layer Perceptron (MLP)) against Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs). This work also shows the impact that certain tuning parameters and the network architecture (MLP, DNN and CNN) have on the performance of those methods. The paper also contributes an in-depth discussion of the lessons learned in implementing DNNs and CNNs and of how these methods can be extended to solve other problems.
Tasks
Published 2018-06-12
URL http://arxiv.org/abs/1806.07379v1
PDF http://arxiv.org/pdf/1806.07379v1.pdf
PWC https://paperswithcode.com/paper/deepterramechanics-terrain-classification-and
Repo https://github.com/ntseng450/DeepTerra
Framework none
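
For readers unfamiliar with the slip-estimation setup, the sketch below shows a toy MLP regressor on synthetic proprioceptive features using scikit-learn. It is only an illustrative baseline of the kind the paper compares against, not the authors' pipeline or data.

```python
# Illustrative slip-regression baseline on synthetic features (not the paper's data).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                 # e.g. wheel speed, torque, IMU features
slip = 0.3 * X[:, 0] - 0.1 * X[:, 1] + 0.05 * rng.normal(size=200)   # synthetic target

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X, slip)
print("train R^2:", round(model.score(X, slip), 3))
```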

Training Complex Models with Multi-Task Weak Supervision

Title Training Complex Models with Multi-Task Weak Supervision
Authors Alexander Ratner, Braden Hancock, Jared Dunnmon, Frederic Sala, Shreyash Pandey, Christopher Ré
Abstract Snorkel MeTaL: A framework for training models with multi-task weak supervision
Tasks Matrix Completion
Published 2018-10-05
URL http://arxiv.org/abs/1810.02840v2
PDF http://arxiv.org/pdf/1810.02840v2.pdf
PWC https://paperswithcode.com/paper/training-complex-models-with-multi-task-weak
Repo https://github.com/HazyResearch/metal
Framework pytorch

Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search

Title Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
Authors Anji Liu, Jianshu Chen, Mingze Yu, Yu Zhai, Xuewen Zhou, Ji Liu
Abstract Monte Carlo Tree Search (MCTS) algorithms have achieved great success on many challenging benchmarks (e.g., Computer Go). However, they generally require a large number of rollouts, making their applications costly. Furthermore, it is also extremely challenging to parallelize MCTS due to its inherent sequential nature: each rollout heavily relies on the statistics (e.g., node visitation counts) estimated from previous simulations to achieve an effective exploration-exploitation tradeoff. In spite of these difficulties, we develop an algorithm, WU-UCT, to effectively parallelize MCTS, which achieves linear speedup and exhibits only limited performance loss with an increasing number of workers. The key idea in WU-UCT is a set of statistics that we introduce to track the number of on-going yet incomplete simulation queries (referred to as unobserved samples). These statistics are used to modify the UCT tree policy in the selection steps in a principled manner to retain an effective exploration-exploitation tradeoff when we parallelize the most time-consuming expansion and simulation steps. Experiments on a proprietary benchmark and the Atari Game benchmark demonstrate the linear speedup and the superior performance of WU-UCT compared to existing techniques.
Tasks
Published 2018-10-28
URL https://arxiv.org/abs/1810.11755v5
PDF https://arxiv.org/pdf/1810.11755v5.pdf
PWC https://paperswithcode.com/paper/p-mcgs-parallel-monte-carlo-acyclic-graph
Repo https://github.com/liuanji/P-UCT
Framework pytorch
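
The key mechanism in WU-UCT is easy to state: on-going but unfinished simulations are counted alongside completed visits when scoring children during selection, so parallel workers spread out instead of all descending the same branch. The sketch below shows a simplified version of that adjusted UCT score; the exploration constant and bookkeeping details are assumptions rather than the paper's exact implementation.

```python
# Simplified WU-UCT-style selection score: incomplete simulation queries
# ("unobserved samples") count towards visit statistics during selection.
import math

def wu_uct_score(q_value, n_parent, n_child, o_parent, o_child, c=1.414):
    """q_value: mean return of the child; n_*: completed visit counts;
    o_*: on-going (not yet finished) simulations passing through that node."""
    total_parent = n_parent + o_parent
    total_child = n_child + o_child
    if total_child == 0:
        return float("inf")                   # always try unexplored children first
    return q_value + c * math.sqrt(math.log(total_parent) / total_child)

# With 7 workers already simulating below child A, child B becomes more attractive:
print(wu_uct_score(0.6, n_parent=50, n_child=20, o_parent=8, o_child=7))
print(wu_uct_score(0.5, n_parent=50, n_child=15, o_parent=8, o_child=1))
```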

Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size

Title Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size
Authors Ian Goodfellow
Abstract A key problem in research on adversarial examples is that vulnerability to adversarial examples is usually measured by running attack algorithms. Because the attack algorithms are not optimal, the attack algorithms are prone to overestimating the size of perturbation needed to fool the target model. In other words, the attack-based methodology provides an upper-bound on the size of a perturbation that will fool the model, but security guarantees require a lower bound. CLEVER is a proposed scoring method to estimate a lower bound. Unfortunately, an estimate of a bound is not a bound. In this report, we show that gradient masking, a common problem that causes attack methodologies to provide only a very loose upper bound, causes CLEVER to overestimate the size of perturbation needed to fool the model. In other words, CLEVER does not resolve the key problem with the attack-based methodology, because it fails to provide a lower bound.
Tasks
Published 2018-04-21
URL http://arxiv.org/abs/1804.07870v1
PDF http://arxiv.org/pdf/1804.07870v1.pdf
PWC https://paperswithcode.com/paper/gradient-masking-causes-clever-to
Repo https://github.com/huanzhang12/CLEVER
Framework tf

Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration

Title Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
Authors Hyemin Ahn, Sungjoon Choi, Nuri Kim, Geonho Cha, Songhwai Oh
Abstract In this paper, we propose the Interactive Text2Pickup (IT2P) network for human-robot collaboration which enables an effective interaction with a human user despite ambiguity in the user’s commands. We focus on the task where a robot is expected to pick up an object instructed by a human, and to interact with the human when the given instruction is vague. The proposed network understands the command from the human user and first estimates the position of the desired object. To handle the inherent ambiguity in human language commands, a suitable question which can resolve the ambiguity is generated. The user’s answer to the question is combined with the initial command and given back to the network, resulting in a more accurate estimation. The experimental results show that given unambiguous commands, the proposed method can estimate the position of the requested object with an accuracy of 98.49% based on our test dataset. Given ambiguous language commands, we show that the accuracy of the pick-up task increases by a factor of 1.94 after incorporating the information obtained from the interaction.
Tasks
Published 2018-05-28
URL http://arxiv.org/abs/1805.10799v1
PDF http://arxiv.org/pdf/1805.10799v1.pdf
PWC https://paperswithcode.com/paper/interactive-text2pickup-network-for-natural
Repo https://github.com/hiddenmaze/InteractivePickup
Framework tf
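
The interaction loop described above (estimate the target, ask a clarifying question when the command is ambiguous, fold the answer back into the command) can be summarised as simple control flow. The sketch below does exactly that with stand-in functions; the names, the uncertainty threshold and the toy estimates are assumptions, since the actual components are learned networks.

```python
# Control-flow sketch of the IT2P interaction loop with hypothetical stand-ins.
def pick_up(command, estimate_position, generate_question, ask_user, threshold=0.5):
    """estimate_position(text) -> (xy, uncertainty); generate_question(text) -> str."""
    position, uncertainty = estimate_position(command)
    while uncertainty > threshold:
        answer = ask_user(generate_question(command))
        command = command + " " + answer       # fold the answer back into the command
        position, uncertainty = estimate_position(command)
    return position

# toy stand-ins for the learned components
estimates = iter([((0.2, 0.7), 0.9), ((0.25, 0.68), 0.2)])
pos = pick_up(
    "pick up the block",
    estimate_position=lambda text: next(estimates),
    generate_question=lambda text: "Which colour block do you mean?",
    ask_user=lambda q: "the red one",
)
print(pos)
```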

Semi-Supervised Deep Learning for Abnormality Classification in Retinal Images

Title Semi-Supervised Deep Learning for Abnormality Classification in Retinal Images
Authors Bruno Lecouat, Ken Chang, Chuan-Sheng Foo, Balagopal Unnikrishnan, James M. Brown, Houssam Zenati, Andrew Beers, Vijay Chandrasekhar, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy
Abstract Supervised deep learning algorithms have enabled significant performance gains in medical image classification tasks. But these methods rely on large labeled datasets that require resource-intensive expert annotation. Semi-supervised generative adversarial network (GAN) approaches offer a means to learn from limited labeled data alongside larger unlabeled datasets, but have not been applied to discern fine-scale, sparse or localized features that define medical abnormalities. To overcome these limitations, we propose a patch-based semi-supervised learning approach and evaluate performance on classification of diabetic retinopathy from funduscopic images. Our semi-supervised approach achieves high AUC with just 10-20 labeled training images, and outperforms the supervised baselines by up to 15% when less than 30% of the training dataset is labeled. Further, our method implicitly enables interpretation of the SSL predictions. As this approach enables good accuracy, resolution and interpretability with lower annotation burden, it sets the pathway for scalable applications of deep learning in clinical imaging.
Tasks Image Classification
Published 2018-12-19
URL http://arxiv.org/abs/1812.07832v1
PDF http://arxiv.org/pdf/1812.07832v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-deep-learning-for-abnormality
Repo https://github.com/theidentity/Improved-GAN-PyTorch
Framework pytorch
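
The patch-based aspect of the approach means each fundus image is tiled into patches before the semi-supervised GAN sees it, so fine-scale abnormalities are not washed out. The sketch below shows one plausible tiling step; the patch size and stride are assumptions rather than the paper's settings.

```python
# Tile a fundus image into patches (assumed preprocessing; sizes are illustrative).
import numpy as np

def extract_patches(image, patch=64, stride=64):
    """image: (H, W, C) -> (N, patch, patch, C) non-overlapping patches by default."""
    h, w = image.shape[:2]
    out = []
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            out.append(image[y:y + patch, x:x + patch])
    return np.stack(out)

fundus = np.random.rand(512, 512, 3)          # placeholder for a real image
patches = extract_patches(fundus)
print(patches.shape)                           # (64, 64, 64, 3)
```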

UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation

Title UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation
Authors Pablo Martinez-Gonzalez, Sergiu Oprea, Alberto Garcia-Garcia, Alvaro Jover-Alvarez, Sergio Orts-Escolano, Jose Garcia-Rodriguez
Abstract Data-driven algorithms have surpassed traditional techniques in almost every aspect in robotic vision problems. Such algorithms need vast amounts of quality data to be able to work properly after their training process. Gathering and annotating that sheer amount of data in the real world is a time-consuming and error-prone task. Those problems limit scale and quality. Synthetic data generation has become increasingly popular since it is faster to generate and automatic to annotate. However, most of the current datasets and environments lack realism, interactions, and details from the real world. UnrealROX is an environment built over Unreal Engine 4 which aims to reduce that reality gap by leveraging hyperrealistic indoor scenes that are explored by robot agents which also interact with objects in a visually realistic manner in that simulated world. Photorealistic scenes and robots are rendered by Unreal Engine into a virtual reality headset which captures gaze so that a human operator can move the robot and use controllers for the robotic hands; scene information is dumped on a per-frame basis so that it can be reproduced offline to generate raw data and ground truth annotations. This virtual reality environment enables robotic vision researchers to generate realistic and visually plausible data with full ground truth for a wide variety of problems such as class and instance semantic segmentation, object detection, depth estimation, visual grasping, and navigation.
Tasks Depth Estimation, Object Detection, Semantic Segmentation, Synthetic Data Generation
Published 2018-10-16
URL https://arxiv.org/abs/1810.06936v2
PDF https://arxiv.org/pdf/1810.06936v2.pdf
PWC https://paperswithcode.com/paper/unrealrox-an-extremely-photorealistic-virtual
Repo https://github.com/3dperceptionlab/unrealrox
Framework none

Multilingual Constituency Parsing with Self-Attention and Pre-Training

Title Multilingual Constituency Parsing with Self-Attention and Pre-Training
Authors Nikita Kitaev, Steven Cao, Dan Klein
Abstract We show that constituency parsing benefits from unsupervised pre-training across a variety of languages and a range of pre-training conditions. We first compare the benefits of no pre-training, fastText, ELMo, and BERT for English and find that BERT outperforms ELMo, in large part due to increased model capacity, whereas ELMo in turn outperforms the non-contextual fastText embeddings. We also find that pre-training is beneficial across all 11 languages tested; however, large model sizes (more than 100 million parameters) make it computationally expensive to train separate models for each language. To address this shortcoming, we show that joint multilingual pre-training and fine-tuning allows sharing all but a small number of parameters between ten languages in the final model. The 10x reduction in model size compared to fine-tuning one model per language causes only a 3.2% relative error increase in aggregate. We further explore the idea of joint fine-tuning and show that it gives low-resource languages a way to benefit from the larger datasets of other languages. Finally, we demonstrate new state-of-the-art results for 11 languages, including English (95.8 F1) and Chinese (91.8 F1).
Tasks Constituency Parsing
Published 2018-12-31
URL https://arxiv.org/abs/1812.11760v2
PDF https://arxiv.org/pdf/1812.11760v2.pdf
PWC https://paperswithcode.com/paper/multilingual-constituency-parsing-with-self
Repo https://github.com/dpfried/rnng-bert
Framework tf