Paper Group AWR 275
We Built a Fake News & Click-bait Filter: What Happened Next Will Blow Your Mind!
Title | We Built a Fake News & Click-bait Filter: What Happened Next Will Blow Your Mind! |
Authors | Georgi Karadzhov, Pepa Gencheva, Preslav Nakov, Ivan Koychev |
Abstract | It is completely amazing! Fake news and click-baits have totally invaded the cyber space. Let us face it: everybody hates them for three simple reasons. Reason #2 will absolutely amaze you. What these can achieve at the time of election will completely blow your mind! Now, we all agree, this cannot go on, you know, somebody has to stop it. So, we did this research on fake news/click-bait detection and trust us, it is totally great research, it really is! Make no mistake. This is the best research ever! Seriously, come have a look, we have it all: neural networks, attention mechanism, sentiment lexicons, author profiling, you name it. Lexical features, semantic features, we absolutely have it all. And we have totally tested it, trust us! We have results, and numbers, really big numbers. The best numbers ever! Oh, and analysis, absolutely top notch analysis. Interested? Come read the shocking truth about fake news and click-bait in the Bulgarian cyber space. You won’t believe what we have found! |
Tasks | |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03786v1 |
http://arxiv.org/pdf/1803.03786v1.pdf | |
PWC | https://paperswithcode.com/paper/we-built-a-fake-news-click-bait-filter-what |
Repo | https://github.com/gkaradzhov/ClickbaitRANLP |
Framework | none |
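The entry above combines lexical features, semantic features, and neural attention for click-bait/fake-news detection. As a hedged point of reference (not the authors' system), here is a minimal lexical-only baseline: TF-IDF n-grams with logistic regression on a few hypothetical English headlines (the paper works on Bulgarian news).

```python
# Minimal lexical-features baseline for click-bait headline detection.
# NOT the paper's full system (which adds semantic features, attention,
# and author profiling); this only illustrates the lexical component.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the paper uses Bulgarian news articles.
headlines = [
    "You won't believe what this politician said next",
    "Parliament passes the 2018 national budget",
    "Ten shocking facts doctors don't want you to know",
    "Central bank keeps interest rates unchanged",
]
labels = [1, 0, 1, 0]  # 1 = click-bait, 0 = legitimate

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(max_iter=1000),
)
clf.fit(headlines, labels)
print(clf.predict(["This one weird trick will change your life"]))
```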
COSMO: Contextualized Scene Modeling with Boltzmann Machines
Title | COSMO: Contextualized Scene Modeling with Boltzmann Machines |
Authors | Ilker Bozcan, Sinan Kalkan |
Abstract | Scene modeling is crucial for robots that need to perceive, reason about and manipulate the objects in their environments. In this paper, we adapt and extend Boltzmann Machines (BMs) for contextualized scene modeling. Although there are many models on the subject, ours is the first to bring together objects, relations, and affordances in a highly-capable generative model. To this end, we introduce a hybrid version of BMs where relations and affordances are introduced with shared, tri-way connections into the model. Moreover, we contribute a dataset for relation estimation and modeling studies. We evaluate our method in comparison with several baselines on object estimation, out-of-context object detection, relation estimation, and affordance estimation tasks. Finally, to illustrate the generative capability of the model, we show several example scenes that the model is able to generate. |
Tasks | Object Detection |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00511v2 |
http://arxiv.org/pdf/1807.00511v2.pdf | |
PWC | https://paperswithcode.com/paper/cosmo-contextualized-scene-modeling-with |
Repo | https://github.com/bozcani/COSMO |
Framework | tf |
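Since COSMO extends Boltzmann Machines, the sketch below recalls only the standard machinery it builds on: a single Gibbs sampling step of a plain binary RBM in numpy. The paper's hybrid model with shared tri-way connections for relations and affordances is not reproduced here.

```python
# One Gibbs step (v -> h -> v') for a plain binary RBM, using the
# conditional Bernoulli distributions; biases and weights are random
# placeholders rather than trained scene-model parameters.
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 12, 8          # e.g. 12 object-presence units
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v = np.zeros(n_visible)            # visible biases
b_h = np.zeros(n_hidden)             # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v):
    p_h = sigmoid(v @ W + b_h)
    h = (rng.random(n_hidden) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b_v)
    v_new = (rng.random(n_visible) < p_v).astype(float)
    return v_new, h

v0 = (rng.random(n_visible) < 0.5).astype(float)   # random scene encoding
v1, h1 = gibbs_step(v0)
print(v1)
```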
Deep Association Learning for Unsupervised Video Person Re-identification
Title | Deep Association Learning for Unsupervised Video Person Re-identification |
Authors | Yanbei Chen, Xiatian Zhu, Shaogang Gong |
Abstract | Deep learning methods have started to dominate the research progress of video-based person re-identification (re-id). However, existing methods mostly consider supervised learning, which requires exhaustive manual efforts for labelling cross-view pairwise data. Therefore, they severely lack scalability and practicality in real-world video surveillance applications. In this work, to address the video person re-id task, we formulate a novel Deep Association Learning (DAL) scheme, the first end-to-end deep learning method using none of the identity labels in model initialisation and training. DAL learns a deep re-id matching model by jointly optimising two margin-based association losses in an end-to-end manner, which effectively constrains the association of each frame to the best-matched intra-camera representation and cross-camera representation. Existing standard CNNs can be readily employed within our DAL scheme. Experiment results demonstrate that our proposed DAL significantly outperforms current state-of-the-art unsupervised video person re-id methods on three benchmarks: PRID 2011, iLIDS-VID and MARS. |
Tasks | Person Re-Identification, Video-Based Person Re-Identification |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07301v1 |
http://arxiv.org/pdf/1808.07301v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-association-learning-for-unsupervised |
Repo | https://github.com/yanbeic/Deep-Association-Learning |
Framework | tf |
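One plausible reading of the margin-based association idea, written as a hedged numpy sketch: pull a frame feature toward its best-matched anchor representation and keep a margin to the runner-up. The paper's actual intra-camera and cross-camera losses and its end-to-end TensorFlow training differ in detail.

```python
# Hedged sketch of a margin-based association loss in the spirit of DAL.
import numpy as np

def association_margin_loss(frame_feat, anchors, margin=0.5):
    """frame_feat: (d,), anchors: (n_anchors, d); all assumed L2-normalised."""
    dists = np.linalg.norm(anchors - frame_feat, axis=1)
    order = np.argsort(dists)
    best, second = dists[order[0]], dists[order[1]]
    # hinge: the best-matched anchor should beat the runner-up by a margin
    return max(0.0, margin + best - second)

rng = np.random.default_rng(1)
feat = rng.standard_normal(128); feat /= np.linalg.norm(feat)
anchors = rng.standard_normal((10, 128))
anchors /= np.linalg.norm(anchors, axis=1, keepdims=True)
print(association_margin_loss(feat, anchors))
```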
Concurrent Learning of Semantic Relations
Title | Concurrent Learning of Semantic Relations |
Authors | Georgios Balikas, Gaël Dias, Rumen Moraliyski, Massih-Reza Amini |
Abstract | Discovering whether words are semantically related and identifying the specific semantic relation that holds between them is of crucial importance for NLP as it is essential for tasks like query expansion in IR. Within this context, different methodologies have been proposed that either exclusively focus on a single lexical relation (e.g. hypernymy vs. random) or learn specific classifiers capable of identifying multiple semantic relations (e.g. hypernymy vs. synonymy vs. random). In this paper, we propose another way to look at the problem that relies on the multi-task learning paradigm. In particular, we want to study whether the learning process of a given semantic relation (e.g. hypernymy) can be improved by the concurrent learning of another semantic relation (e.g. co-hyponymy). Within this context, we particularly examine the benefits of semi-supervised learning where the training of a prediction function is performed over few labeled data jointly with many unlabeled ones. Preliminary results based on simple learning strategies and state-of-the-art distributional feature representations show that concurrent learning can lead to improvements in a vast majority of tested situations. |
Tasks | Multi-Task Learning |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10076v3 |
http://arxiv.org/pdf/1807.10076v3.pdf | |
PWC | https://paperswithcode.com/paper/concurrent-learning-of-semantic-relations |
Repo | https://github.com/Houssam93/MultiTask-Learning-NLP |
Framework | none |
Learning Local RGB-to-CAD Correspondences for Object Pose Estimation
Title | Learning Local RGB-to-CAD Correspondences for Object Pose Estimation |
Authors | Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jana Kosecka |
Abstract | We consider the problem of 3D object pose estimation. While much recent work has focused on the RGB domain, the reliance on accurately annotated images limits their generalizability and scalability. On the other hand, the easily available CAD models of objects are rich sources of data, providing a large number of synthetically rendered images. In this paper, we solve this key problem of existing methods requiring expensive 3D pose annotations by proposing a new method that matches RGB images to CAD models for object pose estimation. Our key innovations compared to existing work include removing the need for either real-world textures for CAD models or explicit 3D pose annotations for RGB images. We achieve this through a series of objectives that learn how to select keypoints and enforce viewpoint and modality invariance across RGB images and CAD model renderings. We conduct extensive experiments to demonstrate that the proposed method can reliably estimate object pose in RGB images, as well as generalize to object instances not seen during training. |
Tasks | Pose Estimation |
Published | 2018-11-18 |
URL | https://arxiv.org/abs/1811.07249v4 |
https://arxiv.org/pdf/1811.07249v4.pdf | |
PWC | https://paperswithcode.com/paper/matching-rgb-images-to-cad-models-for-object |
Repo | https://github.com/YoungXIAO13/PoseFromShape |
Framework | pytorch |
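To make the modality-invariance objective concrete, here is a minimal triplet-margin sketch that ties an RGB keypoint descriptor to the descriptor of the corresponding CAD rendering and pushes away a non-matching one. It illustrates the general idea only; the paper's full set of objectives, including keypoint selection, is not reproduced.

```python
# Cross-modality triplet margin loss: anchor = RGB descriptor,
# positive = matching CAD-rendering descriptor, negative = non-matching one.
import numpy as np

def triplet_loss(anchor_rgb, pos_cad, neg_cad, margin=0.2):
    d_pos = np.linalg.norm(anchor_rgb - pos_cad)
    d_neg = np.linalg.norm(anchor_rgb - neg_cad)
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(0)
anchor = rng.standard_normal(64)
positive = anchor + 0.05 * rng.standard_normal(64)   # matching CAD view
negative = rng.standard_normal(64)                    # non-matching CAD view
print(triplet_loss(anchor, positive, negative))
```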
Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling
Title | Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling |
Authors | Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman |
Abstract | Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling. We conduct the first large-scale systematic study of candidate pretraining tasks, comparing 19 different tasks both as alternatives and complements to language modeling. Our primary results support the use of language modeling, especially when combined with pretraining on additional labeled-data tasks. However, our results are mixed across pretraining tasks and show some concerning trends: In ELMo’s pretrain-then-freeze paradigm, random baselines are worryingly strong and results vary strikingly across target tasks. In addition, fine-tuning BERT on an intermediate task often negatively impacts downstream transfer. In a more positive trend, we see modest gains from multitask training, suggesting the development of more sophisticated multitask and transfer learning techniques as an avenue for further research. |
Tasks | Language Modelling, Transfer Learning |
Published | 2018-12-28 |
URL | https://arxiv.org/abs/1812.10860v5 |
https://arxiv.org/pdf/1812.10860v5.pdf | |
PWC | https://paperswithcode.com/paper/looking-for-elmos-friends-sentence-level |
Repo | https://github.com/nyu-mll/jiant |
Framework | pytorch |
Enhanced Optic Disk and Cup Segmentation with Glaucoma Screening from Fundus Images using Position encoded CNNs
Title | Enhanced Optic Disk and Cup Segmentation with Glaucoma Screening from Fundus Images using Position encoded CNNs |
Authors | Vismay Agrawal, Avinash Kori, Varghese Alex, Ganapathy Krishnamurthi |
Abstract | In this manuscript, we present a robust method for glaucoma screening from fundus images using an ensemble of convolutional neural networks (CNNs). The pipeline first segments the optic disk and optic cup from the fundus image, then extracts a patch centered around the optic disk, which is subsequently fed to the classification network to differentiate the image as diseased or healthy. In the segmentation network, apart from the image, we make use of the spatial coordinate (X & Y) space so as to better learn the structure of interest. The classification network is composed of a DenseNet201 and a ResNet18 which were pre-trained on a large cohort of natural images. On the REFUGE validation data (n=400), the segmentation network achieved a dice score of 0.88 and 0.64 for optic disc and optic cup respectively. For the task of differentiating images affected by glaucoma from healthy images, the area under the ROC curve was observed to be 0.85. |
Tasks | |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05216v1 |
http://arxiv.org/pdf/1809.05216v1.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-optic-disk-and-cup-segmentation-with |
Repo | https://github.com/koriavinash1/Optic-Disk-Cup-Segmentation |
Framework | pytorch |
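The abstract mentions feeding spatial X & Y coordinates alongside the image so the segmentation network can exploit position. Below is a minimal sketch of building such position channels (CoordConv-style), assuming a simple normalised-grid encoding that may differ from the authors' exact scheme.

```python
# Append normalised x/y coordinate maps as extra input channels.
import numpy as np

def add_position_channels(image):
    """image: (H, W, C) float array -> (H, W, C + 2) with normalised x/y maps."""
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    return np.concatenate([image, xs[..., None], ys[..., None]], axis=-1)

fundus = np.zeros((512, 512, 3), dtype=np.float32)   # placeholder fundus image
print(add_position_channels(fundus).shape)            # (512, 512, 5)
```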
DeepTerramechanics: Terrain Classification and Slip Estimation for Ground Robots via Deep Learning
Title | DeepTerramechanics: Terrain Classification and Slip Estimation for Ground Robots via Deep Learning |
Authors | Ramon Gonzalez, Karl Iagnemma |
Abstract | Terramechanics plays a critical role in the areas of ground vehicles and ground mobile robots since understanding and estimating the variables influencing the vehicle-terrain interaction may mean the success or the failure of an entire mission. This research applies state-of-the-art algorithms in deep learning to two key problems: estimating wheel slip and classifying the terrain being traversed by a ground robot. Three data sets collected by ground robotic platforms (MIT single-wheel testbed, MSL Curiosity rover, and tracked robot Fitorobot) are employed in order to compare the performance of traditional machine learning methods (i.e. Support Vector Machine (SVM) and Multi-layer Perceptron (MLP)) against Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs). This work also shows the impact that certain tuning parameters and the network architecture (MLP, DNN and CNN) have on the performance of those methods. The paper also contributes an in-depth discussion of the lessons learned in implementing DNNs and CNNs and of how these methods can be extended to solve other problems. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.07379v1 |
http://arxiv.org/pdf/1806.07379v1.pdf | |
PWC | https://paperswithcode.com/paper/deepterramechanics-terrain-classification-and |
Repo | https://github.com/ntseng450/DeepTerra |
Framework | none |
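As a toy version of the comparison described above, the sketch below trains a classical SVM regressor and a small MLP to predict wheel slip from synthetic proprioceptive features; the actual experiments use the MIT single-wheel testbed, MSL Curiosity, and Fitorobot data, not this generated toy set.

```python
# SVM vs MLP slip-estimation comparison on synthetic data (scikit-learn).
import numpy as np
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))    # e.g. wheel torque, sinkage, IMU features
slip = 0.3 * X[:, 0] - 0.1 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(500)

X_tr, X_te, y_tr, y_te = train_test_split(X, slip, random_state=0)
for name, model in [("SVR", SVR(kernel="rbf")),
                    ("MLP", MLPRegressor(hidden_layer_sizes=(64, 64),
                                         max_iter=2000, random_state=0))]:
    model.fit(X_tr, y_tr)
    print(name, mean_absolute_error(y_te, model.predict(X_te)))
```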
Training Complex Models with Multi-Task Weak Supervision
Title | Training Complex Models with Multi-Task Weak Supervision |
Authors | Alexander Ratner, Braden Hancock, Jared Dunnmon, Frederic Sala, Shreyash Pandey, Christopher Ré |
Abstract | Snorkel MeTaL: A framework for training models with multi-task weak supervision |
Tasks | Matrix Completion |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02840v2 |
http://arxiv.org/pdf/1810.02840v2.pdf | |
PWC | https://paperswithcode.com/paper/training-complex-models-with-multi-task-weak |
Repo | https://github.com/HazyResearch/metal |
Framework | pytorch |
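Snorkel MeTaL learns the accuracies of weak supervision sources with a matrix-completion-style label model. As a much simpler point of reference (not the paper's method), the sketch below implements only the majority-vote baseline over labeling-function outputs, with 0 denoting abstention.

```python
# Majority-vote baseline over weak labeling-function votes.
import numpy as np

def majority_vote(L, num_classes):
    """L: (n_examples, n_sources) int matrix of votes; returns predicted labels."""
    preds = []
    for row in L:
        votes = row[row != 0]                       # drop abstentions
        if votes.size == 0:
            preds.append(0)                         # no signal: abstain
            continue
        counts = np.bincount(votes, minlength=num_classes + 1)
        preds.append(int(np.argmax(counts[1:]) + 1))
    return np.array(preds)

L = np.array([[1, 1, 0],
              [2, 0, 2],
              [1, 2, 2],
              [0, 0, 0]])
print(majority_vote(L, num_classes=2))   # -> [1 2 2 0]
```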
Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search
Title | Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search |
Authors | Anji Liu, Jianshu Chen, Mingze Yu, Yu Zhai, Xuewen Zhou, Ji Liu |
Abstract | Monte Carlo Tree Search (MCTS) algorithms have achieved great success on many challenging benchmarks (e.g., Computer Go). However, they generally require a large number of rollouts, making their applications costly. Furthermore, it is also extremely challenging to parallelize MCTS due to its inherent sequential nature: each rollout heavily relies on the statistics (e.g., node visitation counts) estimated from previous simulations to achieve an effective exploration-exploitation tradeoff. In spite of these difficulties, we develop an algorithm, WU-UCT, to effectively parallelize MCTS, which achieves linear speedup and exhibits only limited performance loss with an increasing number of workers. The key idea in WU-UCT is a set of statistics that we introduce to track the number of on-going yet incomplete simulation queries (termed unobserved samples). These statistics are used to modify the UCT tree policy in the selection steps in a principled manner to retain an effective exploration-exploitation tradeoff when we parallelize the most time-consuming expansion and simulation steps. Experiments on a proprietary benchmark and the Atari Game benchmark demonstrate the linear speedup and the superior performance of WU-UCT compared to existing techniques. |
Tasks | |
Published | 2018-10-28 |
URL | https://arxiv.org/abs/1810.11755v5 |
https://arxiv.org/pdf/1810.11755v5.pdf | |
PWC | https://paperswithcode.com/paper/p-mcgs-parallel-monte-carlo-acyclic-graph |
Repo | https://github.com/liuanji/P-UCT |
Framework | pytorch |
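A hedged sketch of the core WU-UCT idea: fold the count of on-going, not-yet-completed simulations ("unobserved samples") into the UCT visit counts so that parallel workers spread out across the tree. The exact statistics and constants follow the paper and repo; the +1 smoothing below is an assumption made only to keep the toy example well-defined.

```python
# UCT selection score with on-going simulation counts added to the visits.
import math

def wu_uct_score(q_value, n_completed, n_ongoing,
                 n_parent_completed, n_parent_ongoing, c=1.414):
    n_parent = n_parent_completed + n_parent_ongoing
    n_child = n_completed + n_ongoing
    # Ongoing simulations inflate the counts, shrinking the exploration bonus
    # of nodes that other workers are already busy evaluating.
    return q_value + c * math.sqrt(math.log(n_parent + 1) / (n_child + 1))

# A node that 2 workers are already simulating receives a smaller bonus:
print(wu_uct_score(0.5, n_completed=10, n_ongoing=0,
                   n_parent_completed=100, n_parent_ongoing=0))
print(wu_uct_score(0.5, n_completed=10, n_ongoing=2,
                   n_parent_completed=100, n_parent_ongoing=2))
```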
Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size
Title | Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size |
Authors | Ian Goodfellow |
Abstract | A key problem in research on adversarial examples is that vulnerability to adversarial examples is usually measured by running attack algorithms. Because the attack algorithms are not optimal, the attack algorithms are prone to overestimating the size of perturbation needed to fool the target model. In other words, the attack-based methodology provides an upper-bound on the size of a perturbation that will fool the model, but security guarantees require a lower bound. CLEVER is a proposed scoring method to estimate a lower bound. Unfortunately, an estimate of a bound is not a bound. In this report, we show that gradient masking, a common problem that causes attack methodologies to provide only a very loose upper bound, causes CLEVER to overestimate the size of perturbation needed to fool the model. In other words, CLEVER does not resolve the key problem with the attack-based methodology, because it fails to provide a lower bound. |
Tasks | |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07870v1 |
http://arxiv.org/pdf/1804.07870v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-masking-causes-clever-to |
Repo | https://github.com/huanzhang12/CLEVER |
Framework | tf |
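A toy numpy illustration of the report's argument: CLEVER-like scores bound the minimum perturbation by margin / L̂, with L̂ estimated from gradient norms sampled around the input. If the surface is locally flat (gradient-masked), L̂ comes out too small and the estimated "safe" radius too large. This is a caricature for intuition, not the actual CLEVER estimator.

```python
# Sampling-based Lipschitz estimate fooled by a locally flat margin surface.
import numpy as np

rng = np.random.default_rng(0)

def margin(x):
    # Classifier margin that is flat (zero gradient) for ||x|| < 1, then
    # drops sharply -- a caricature of a gradient-masked model.
    r = np.linalg.norm(x)
    return 1.0 if r < 1.0 else max(0.0, 1.0 - 20.0 * (r - 1.0))

def sampled_max_grad_norm(x, radius=0.1, n=200, eps=1e-3):
    """Finite-difference estimate of the largest gradient norm near x."""
    norms = []
    for _ in range(n):
        p = x + radius * rng.standard_normal(x.shape)
        g = np.array([(margin(p + eps * e) - margin(p - eps * e)) / (2 * eps)
                      for e in np.eye(x.size)])
        norms.append(np.linalg.norm(g))
    return max(norms)

x = np.zeros(2)
L_hat = sampled_max_grad_norm(x)
# The true distance to the decision boundary is ~1.05, but whenever every
# sample lands on the flat region the estimated "safe" radius is far larger.
print("estimated safe radius:", margin(x) / max(L_hat, 1e-9))
```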
Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
Title | Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration |
Authors | Hyemin Ahn, Sungjoon Choi, Nuri Kim, Geonho Cha, Songhwai Oh |
Abstract | In this paper, we propose the Interactive Text2Pickup (IT2P) network for human-robot collaboration which enables an effective interaction with a human user despite the ambiguity in user’s commands. We focus on the task where a robot is expected to pick up an object instructed by a human, and to interact with the human when the given instruction is vague. The proposed network understands the command from the human user and estimates the position of the desired object first. To handle the inherent ambiguity in human language commands, a suitable question which can resolve the ambiguity is generated. The user’s answer to the question is combined with the initial command and given back to the network, resulting in more accurate estimation. The experiment results show that given unambiguous commands, the proposed method can estimate the position of the requested object with an accuracy of 98.49% based on our test dataset. Given ambiguous language commands, we show that the accuracy of the pick up task increases by 1.94 times after incorporating the information obtained from the interaction. |
Tasks | |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10799v1 |
http://arxiv.org/pdf/1805.10799v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-text2pickup-network-for-natural |
Repo | https://github.com/hiddenmaze/InteractivePickup |
Framework | tf |
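A schematic version of the interaction loop described above: estimate a position heatmap from the command and, when the estimate is too ambiguous (high entropy), ask a clarifying question and fuse the answer back in. The models and the entropy threshold here are placeholders, not the paper's IT2P networks.

```python
# Ambiguity-driven clarification loop over a pick-up position heatmap.
import numpy as np

def entropy(p):
    p = p / p.sum()
    return float(-(p * np.log(p + 1e-12)).sum())

def pickup_interaction(heatmap, ask_user, threshold=0.5):
    """heatmap: (H, W) non-negative scores over candidate pick-up positions."""
    while entropy(heatmap.ravel()) > threshold:
        answer_map = ask_user("Which one do you mean, the left or the right object?")
        heatmap = heatmap * answer_map            # fuse the answer with the estimate
    return np.unravel_index(np.argmax(heatmap), heatmap.shape)

# Toy example: two equally likely objects, disambiguated by one answer.
h = np.zeros((4, 4)); h[1, 0] = h[1, 3] = 1.0
answer = np.zeros((4, 4)); answer[:, :2] = 1.0     # "the left one"
print(pickup_interaction(h + 1e-6, lambda q: answer))
```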
Semi-Supervised Deep Learning for Abnormality Classification in Retinal Images
Title | Semi-Supervised Deep Learning for Abnormality Classification in Retinal Images |
Authors | Bruno Lecouat, Ken Chang, Chuan-Sheng Foo, Balagopal Unnikrishnan, James M. Brown, Houssam Zenati, Andrew Beers, Vijay Chandrasekhar, Jayashree Kalpathy-Cramer, Pavitra Krishnaswamy |
Abstract | Supervised deep learning algorithms have enabled significant performance gains in medical image classification tasks. But these methods rely on large labeled datasets that require resource-intensive expert annotation. Semi-supervised generative adversarial network (GAN) approaches offer a means to learn from limited labeled data alongside larger unlabeled datasets, but have not been applied to discern fine-scale, sparse or localized features that define medical abnormalities. To overcome these limitations, we propose a patch-based semi-supervised learning approach and evaluate performance on classification of diabetic retinopathy from funduscopic images. Our semi-supervised approach achieves high AUC with just 10-20 labeled training images, and outperforms the supervised baselines by up to 15% when less than 30% of the training dataset is labeled. Further, our method implicitly enables interpretation of the SSL predictions. As this approach enables good accuracy, resolution and interpretability with lower annotation burden, it sets the pathway for scalable applications of deep learning in clinical imaging. |
Tasks | Image Classification |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07832v1 |
http://arxiv.org/pdf/1812.07832v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-deep-learning-for-abnormality |
Repo | https://github.com/theidentity/Improved-GAN-PyTorch |
Framework | pytorch |
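The listed repo implements an Improved-GAN style semi-supervised objective, where the discriminator has K real-class logits and an implicit (K+1)-th "fake" class. Assuming that family of objectives, below is a minimal numpy sketch of the unsupervised loss term for a single example; the paper's patch-based pipeline and the supervised term are not shown.

```python
# Unsupervised (K+1)-class discriminator loss for one unlabeled and one
# generated patch, using log-sum-exp of the K real-class logits.
import numpy as np

def unsupervised_loss(logits_unlabeled, logits_generated):
    """logits_*: (K,) real-class logits for one unlabeled / one generated patch."""
    z_unl = np.logaddexp.reduce(logits_unlabeled)   # log sum_k exp(logit_k)
    z_gen = np.logaddexp.reduce(logits_generated)
    log_p_real_unl = z_unl - np.logaddexp(z_unl, 0.0)   # log Z / (Z + 1)
    log_p_fake_gen = -np.logaddexp(z_gen, 0.0)          # log 1 / (Z + 1)
    return -(log_p_real_unl + log_p_fake_gen)

print(unsupervised_loss(np.array([2.0, -1.0]), np.array([-3.0, -2.0])))
```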
UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation
Title | UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation |
Authors | Pablo Martinez-Gonzalez, Sergiu Oprea, Alberto Garcia-Garcia, Alvaro Jover-Alvarez, Sergio Orts-Escolano, Jose Garcia-Rodriguez |
Abstract | Data-driven algorithms have surpassed traditional techniques in almost every aspect in robotic vision problems. Such algorithms need vast amounts of quality data to be able to work properly after their training process. Gathering and annotating that sheer amount of data in the real world is a time-consuming and error-prone task. Those problems limit scale and quality. Synthetic data generation has become increasingly popular since it is faster to generate and automatic to annotate. However, most of the current datasets and environments lack realism, interactions, and details from the real world. UnrealROX is an environment built over Unreal Engine 4 which aims to reduce that reality gap by leveraging hyperrealistic indoor scenes that are explored by robot agents which also interact with objects in a visually realistic manner in that simulated world. Photorealistic scenes and robots are rendered by Unreal Engine into a virtual reality headset which captures gaze so that a human operator can move the robot and use controllers for the robotic hands; scene information is dumped on a per-frame basis so that it can be reproduced offline to generate raw data and ground truth annotations. This virtual reality environment enables robotic vision researchers to generate realistic and visually plausible data with full ground truth for a wide variety of problems such as class and instance semantic segmentation, object detection, depth estimation, visual grasping, and navigation. |
Tasks | Depth Estimation, Object Detection, Semantic Segmentation, Synthetic Data Generation |
Published | 2018-10-16 |
URL | https://arxiv.org/abs/1810.06936v2 |
https://arxiv.org/pdf/1810.06936v2.pdf | |
PWC | https://paperswithcode.com/paper/unrealrox-an-extremely-photorealistic-virtual |
Repo | https://github.com/3dperceptionlab/unrealrox |
Framework | none |
Multilingual Constituency Parsing with Self-Attention and Pre-Training
Title | Multilingual Constituency Parsing with Self-Attention and Pre-Training |
Authors | Nikita Kitaev, Steven Cao, Dan Klein |
Abstract | We show that constituency parsing benefits from unsupervised pre-training across a variety of languages and a range of pre-training conditions. We first compare the benefits of no pre-training, fastText, ELMo, and BERT for English and find that BERT outperforms ELMo, in large part due to increased model capacity, whereas ELMo in turn outperforms the non-contextual fastText embeddings. We also find that pre-training is beneficial across all 11 languages tested; however, large model sizes (more than 100 million parameters) make it computationally expensive to train separate models for each language. To address this shortcoming, we show that joint multilingual pre-training and fine-tuning allows sharing all but a small number of parameters between ten languages in the final model. The 10x reduction in model size compared to fine-tuning one model per language causes only a 3.2% relative error increase in aggregate. We further explore the idea of joint fine-tuning and show that it gives low-resource languages a way to benefit from the larger datasets of other languages. Finally, we demonstrate new state-of-the-art results for 11 languages, including English (95.8 F1) and Chinese (91.8 F1). |
Tasks | Constituency Parsing |
Published | 2018-12-31 |
URL | https://arxiv.org/abs/1812.11760v2 |
https://arxiv.org/pdf/1812.11760v2.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-constituency-parsing-with-self |
Repo | https://github.com/dpfried/rnng-bert |
Framework | tf |