Paper Group AWR 56
Leveraging Contact Forces for Learning to Grasp. Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. Multi-Stage Reinforcement Learning For Object Detection. Opinion Dynamics with Varying Susceptibility to Persuasion. Deep Inverse Op …
Leveraging Contact Forces for Learning to Grasp
Title | Leveraging Contact Forces for Learning to Grasp |
Authors | Hamza Merzic, Miroslav Bogdanovic, Daniel Kappler, Ludovic Righetti, Jeannette Bohg |
Abstract | Grasping objects under uncertainty remains an open problem in robotics research. This uncertainty is often due to noisy or partial observations of the object pose or shape. To enable a robot to react appropriately to unforeseen effects, it is crucial that it continuously takes sensor feedback into account. While visual feedback is important for inferring a grasp pose and reaching for an object, contact feedback offers valuable information during manipulation and grasp acquisition. In this paper, we use model-free deep reinforcement learning to synthesize control policies that exploit contact sensing to generate robust grasping under uncertainty. We demonstrate our approach on a multi-fingered hand that exhibits more complex finger coordination than the commonly used two-fingered grippers. We conduct extensive experiments in order to assess the performance of the learned policies, with and without contact sensing. While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape. |
Tasks | |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07004v1 |
PDF | http://arxiv.org/pdf/1809.07004v1.pdf
PWC | https://paperswithcode.com/paper/leveraging-contact-forces-for-learning-to |
Repo | https://github.com/machines-in-motion/grasping_sim |
Framework | tf |
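As a rough illustration of how contact feedback can enter a learned grasping policy, the sketch below simply appends per-fingertip contact forces to the proprioceptive observation of an actor network. The class name, dimensions, and layer sizes are illustrative assumptions (written here in PyTorch), not taken from the paper or its TensorFlow repository.

```python
import torch
import torch.nn as nn

class ContactAwarePolicy(nn.Module):
    """Toy actor whose observation is [joint positions, joint velocities,
    per-fingertip 3-D contact forces]; zeroing the forces reproduces a
    'no contact sensing' baseline with the same architecture."""
    def __init__(self, n_joints=16, n_fingertips=4, n_actions=16):
        super().__init__()
        obs_dim = 2 * n_joints + 3 * n_fingertips
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions), nn.Tanh(),
        )

    def forward(self, joint_pos, joint_vel, contact_forces):
        # Contact forces are treated as ordinary observations for the RL policy.
        obs = torch.cat([joint_pos, joint_vel, contact_forces.flatten(1)], dim=-1)
        return self.net(obs)

policy = ContactAwarePolicy()
action = policy(torch.zeros(1, 16), torch.zeros(1, 16), torch.zeros(1, 4, 3))
print(action.shape)   # torch.Size([1, 16])
```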
Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making
Title | Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making |
Authors | Luisa M Zintgraf, Diederik M Roijers, Sjoerd Linders, Catholijn M Jonker, Ann Nowé |
Abstract | In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user’s intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap. We build on previous work on Gaussian processes and pairwise comparisons for preference modelling, extend it to the multi-objective decision support scenario, and propose new ordered preference elicitation strategies based on ranking and clustering. Our main contribution is an in-depth evaluation of these strategies using computer and human-based experiments. We show that our proposed elicitation strategies outperform the currently used pairwise methods, and find that users prefer ranking most. Our experiments further show that utilising monotonicity information in GPs by using a linear prior mean at the start and virtual comparisons to the nadir and ideal points increases performance. We demonstrate our decision support framework in a real-world study on traffic regulation, conducted with the city of Amsterdam. |
Tasks | Decision Making, Gaussian Processes |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07606v1 |
PDF | http://arxiv.org/pdf/1802.07606v1.pdf
PWC | https://paperswithcode.com/paper/ordered-preference-elicitation-strategies-for |
Repo | https://github.com/lmzintgraf/gp_pref_elicit |
Framework | none |
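To make the elicitation step concrete, here is a minimal sketch of how a single elicited ranking can be unrolled into the pairwise comparisons a preference GP consumes, plus the "virtual" comparisons against the nadir and ideal points mentioned in the abstract. The function names and toy objective vectors are illustrative, not taken from the paper's code.

```python
def ranking_to_pairwise(ranked):
    """ranked: list of objective vectors, best first.
    Returns every (preferred, dominated) pair implied by the ranking."""
    return [(ranked[i], ranked[j])
            for i in range(len(ranked)) for j in range(i + 1, len(ranked))]

def add_virtual_comparisons(pairs, policies, nadir, ideal):
    """Monotonicity hints: every real policy beats the nadir point and
    loses to the ideal point."""
    for p in policies:
        pairs.append((p, nadir))
        pairs.append((ideal, p))
    return pairs

ranked = [(0.9, 0.7), (0.6, 0.8), (0.4, 0.3)]          # toy 2-objective values
pairs = add_virtual_comparisons(ranking_to_pairwise(ranked), ranked,
                                nadir=(0.0, 0.0), ideal=(1.0, 1.0))
print(len(pairs))   # 3 ranked pairs + 6 virtual comparisons = 9
```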
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Title | GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism |
Authors | Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen |
Abstract | Scaling up deep neural network capacity has been known as an effective approach to improving model quality for several different machine learning tasks. In many cases, increasing model capacity beyond the memory limit of a single accelerator has required developing special algorithms or infrastructure. These solutions are often architecture-specific and do not transfer to other tasks. To address the need for efficient and task-independent model parallelism, we introduce GPipe, a pipeline parallelism library that allows scaling any network that can be expressed as a sequence of layers. By pipelining different sub-sequences of layers on separate accelerators, GPipe provides the flexibility of scaling a variety of different networks to gigantic sizes efficiently. Moreover, GPipe utilizes a novel batch-splitting pipelining algorithm, resulting in almost linear speedup when a model is partitioned across multiple accelerators. We demonstrate the advantages of GPipe by training large-scale neural networks on two different tasks with distinct network architectures: (i) Image Classification: We train a 557-million-parameter AmoebaNet model and attain a top-1 accuracy of 84.4% on ImageNet-2012, (ii) Multilingual Neural Machine Translation: We train a single 6-billion-parameter, 128-layer Transformer model on a corpus spanning over 100 languages and achieve better quality than all bilingual models. |
Tasks | Fine-Grained Image Classification, Image Classification, Machine Translation |
Published | 2018-11-16 |
URL | https://arxiv.org/abs/1811.06965v5 |
PDF | https://arxiv.org/pdf/1811.06965v5.pdf
PWC | https://paperswithcode.com/paper/gpipe-efficient-training-of-giant-neural |
Repo | https://github.com/tensorflow/lingvo |
Framework | tf |
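The core scheduling idea (splitting a mini-batch into micro-batches that flow through a sequence of stages) can be illustrated in a few lines of Python. This toy runs sequentially in one process and only shows the forward schedule; it is not the Lingvo/GPipe implementation, and the backward pass and re-materialization are omitted.

```python
def pipeline_forward(stages, minibatch, n_micro):
    """Toy schedule: at pipeline tick t, stage k works on micro-batch t - k."""
    micro = [minibatch[i::n_micro] for i in range(n_micro)]   # split the mini-batch
    buf = [None] * n_micro                 # latest result per micro-batch
    K = len(stages)
    for t in range(n_micro + K - 1):       # the pipeline drains after n_micro + K - 1 ticks
        for k in range(K):
            m = t - k
            if 0 <= m < n_micro:
                x = micro[m] if k == 0 else buf[m]   # stage k consumes stage k-1's output
                buf[m] = stages[k](x)
    return buf                             # per-micro-batch outputs of the last stage

stages = [lambda xs: [v + 1 for v in xs], lambda xs: [v * 2 for v in xs]]
print(pipeline_forward(stages, list(range(8)), n_micro=4))
# [[2, 10], [4, 12], [6, 14], [8, 16]]
```

With the micro-batches in flight simultaneously on different accelerators, each stage stays busy instead of waiting for the whole batch, which is where the near-linear speedup comes from.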
Multi-Stage Reinforcement Learning For Object Detection
Title | Multi-Stage Reinforcement Learning For Object Detection |
Authors | Jonas Koenig, Simon Malberg, Martin Martens, Sebastian Niehaus, Artus Krohn-Grimberghe, Arunselvan Ramaswamy |
Abstract | We present a reinforcement learning approach for detecting objects within an image. Our approach performs a step-wise deformation of a bounding box with the goal of tightly framing the object. It uses a hierarchical tree-like representation of predefined region candidates, which the agent can zoom in on. This reduces the number of region candidates that must be evaluated so that the agent can afford to compute new feature maps before each step to enhance detection quality. We compare an approach that is based purely on zoom actions with one that is extended by a second refinement stage to fine-tune the bounding box after each zoom step. We also improve the fitting ability by allowing for different aspect ratios of the bounding box. Finally, we propose different reward functions to lead to a better guidance of the agent while following its search trajectories. Experiments indicate that each of these extensions leads to more correct detections. The best performing approach comprises a zoom stage and a refinement stage, uses aspect-ratio modifying actions and is trained using a combination of three different reward metrics. |
Tasks | Object Detection |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.10325v2 |
PDF | http://arxiv.org/pdf/1810.10325v2.pdf
PWC | https://paperswithcode.com/paper/multi-stage-reinforcement-learning-for-object |
Repo | https://github.com/qq456cvb/multi-stage-detection |
Framework | tf |
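To make the action and reward design concrete, below is a hedged sketch of a quadrant "zoom" action on a bounding box and an IoU-based step reward. The action set and reward shaping are illustrative only; the paper additionally uses aspect-ratio actions, a refinement stage, and a combination of three reward metrics.

```python
def iou(a, b):
    """Boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def zoom(box, quadrant):
    """Shrink the box onto one of its four quadrants (a 'zoom' action)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return {"tl": (x1, y1, cx, cy), "tr": (cx, y1, x2, cy),
            "bl": (x1, cy, cx, y2), "br": (cx, cy, x2, y2)}[quadrant]

def step_reward(old_box, new_box, gt_box):
    # +1 when the action increased overlap with the ground truth, else -1.
    return 1.0 if iou(new_box, gt_box) > iou(old_box, gt_box) else -1.0

box = (0, 0, 100, 100)
print(step_reward(box, zoom(box, "tr"), gt_box=(60, 5, 95, 45)))   # 1.0
```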
Opinion Dynamics with Varying Susceptibility to Persuasion
Title | Opinion Dynamics with Varying Susceptibility to Persuasion |
Authors | Rediet Abebe, Jon Kleinberg, David Parkes, Charalampos E. Tsourakakis |
Abstract | A long line of work in social psychology has studied variations in people’s susceptibility to persuasion – the extent to which they are willing to modify their opinions on a topic. This body of literature suggests an interesting perspective on theoretical models of opinion formation by interacting parties in a network: in addition to considering interventions that directly modify people’s intrinsic opinions, it is also natural to consider interventions that modify people’s susceptibility to persuasion. In this work, we adopt a popular model for social opinion dynamics, and we formalize the opinion maximization and minimization problems where interventions happen at the level of susceptibility. We show that modeling interventions at the level of susceptibility leads to an interesting family of new questions in network opinion dynamics. We find that the questions are quite different depending on whether or not there is an overall budget constraining the number of agents we can target. We give a polynomial-time algorithm for finding the optimal target-set to optimize the sum of opinions when there are no budget constraints on the size of the target-set. We show that this problem is NP-hard when there is a budget, and that the objective function is neither submodular nor supermodular. Finally, we propose a heuristic for the budgeted opinion optimization and show its efficacy at finding target-sets that optimize the sum of opinions on real-world networks, including a Twitter network with real opinion estimates. |
Tasks | |
Published | 2018-01-24 |
URL | http://arxiv.org/abs/1801.07863v1 |
PDF | http://arxiv.org/pdf/1801.07863v1.pdf
PWC | https://paperswithcode.com/paper/opinion-dynamics-with-varying-susceptibility |
Repo | https://github.com/tsourolampis/opdyn-social-influence |
Framework | none |
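The abstract names only "a popular model for social opinion dynamics"; the sketch below assumes a Friedkin-Johnsen-style update in which each agent's susceptibility alpha_i controls how much weight it places on its neighbours' expressed opinions versus its own intrinsic opinion. This is an assumption for illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def equilibrium_opinions(adj, intrinsic, alpha, n_iter=500):
    """adj: (n, n) adjacency matrix; intrinsic: innate opinions s_i in [0, 1];
    alpha: susceptibility to persuasion in [0, 1) per agent."""
    deg = adj.sum(axis=1, keepdims=True)
    W = adj / np.maximum(deg, 1)                 # row-stochastic neighbour weights
    z = intrinsic.copy()
    for _ in range(n_iter):                      # fixed-point iteration to equilibrium
        z = (1 - alpha) * intrinsic + alpha * (W @ z)
    return z

adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)
s = np.array([0.1, 0.5, 0.9])
z = equilibrium_opinions(adj, s, alpha=np.array([0.2, 0.9, 0.2]))
print(z.sum())   # the sum of equilibrium opinions is the quantity the
                 # susceptibility-targeting interventions try to maximize/minimize
```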
Deep Inverse Optimization
Title | Deep Inverse Optimization |
Authors | Yingcong Tan, Andrew Delong, Daria Terekhov |
Abstract | Given a set of observations generated by an optimization process, the goal of inverse optimization is to determine likely parameters of that process. We cast inverse optimization as a form of deep learning. Our method, called deep inverse optimization, is to unroll an iterative optimization process and then use backpropagation to learn parameters that generate the observations. We demonstrate that by backpropagating through the interior point algorithm we can learn the coefficients determining the cost vector and the constraints, independently or jointly, for both non-parametric and parametric linear programs, starting from one or multiple observations. With this approach, inverse optimization can leverage concepts and algorithms from deep learning. |
Tasks | |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00804v1 |
PDF | http://arxiv.org/pdf/1812.00804v1.pdf
PWC | https://paperswithcode.com/paper/deep-inverse-optimization |
Repo | https://github.com/tankconcordia/deep_inv_opt |
Framework | pytorch |
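A hedged sketch of the unroll-and-backpropagate idea: instead of the interior-point method used in the paper, this toy unrolls projected gradient descent on a box-constrained linear program and learns a cost vector that reproduces an observed solution. The solver choice, dimensions, and hyper-parameters are illustrative.

```python
import torch

def forward_solve(c, n_steps=50, lr=0.1):
    """Unrolled projected gradient descent for min_x c^T x with x in [0, 1]^n.
    Every step is a differentiable op, so gradients flow back into c."""
    x = torch.full_like(c, 0.5)
    for _ in range(n_steps):
        x = (x - lr * c).clamp(0.0, 1.0)
    return x

x_obs = torch.tensor([1.0, 0.0, 1.0])     # observed solution of the unknown LP
c = torch.zeros(3, requires_grad=True)    # cost vector to be learned
opt = torch.optim.Adam([c], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = ((forward_solve(c) - x_obs) ** 2).sum()
    loss.backward()
    opt.step()
print(c.detach())   # negative cost on coordinates observed at 1, positive at 0
```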
Dual Recurrent Attention Units for Visual Question Answering
Title | Dual Recurrent Attention Units for Visual Question Answering |
Authors | Ahmed Osman, Wojciech Samek |
Abstract | Visual Question Answering (VQA) requires AI models to comprehend data in two domains, vision and text. Current state-of-the-art models use learned attention mechanisms to extract relevant information from the input domains to answer a certain question. Thus, robust attention mechanisms are essential for powerful VQA models. In this paper, we propose a recurrent attention mechanism and show its benefits compared to the traditional convolutional approach. We perform two ablation studies to evaluate recurrent attention. First, we introduce a baseline VQA model with visual attention and test the performance difference between convolutional and recurrent attention on the VQA 2.0 dataset. Secondly, we design an architecture for VQA which utilizes dual (textual and visual) Recurrent Attention Units (RAUs). Using this model, we show the effect of all possible combinations of recurrent and convolutional dual attention. Our single model outperforms the first place winner on the VQA 2016 challenge and, to the best of our knowledge, it is the second-best performing single model on the VQA 1.0 dataset. Furthermore, our model noticeably improves upon the winner of the VQA 2017 challenge. Moreover, we experiment with replacing attention mechanisms in state-of-the-art models with our RAUs and show increased performance. |
Tasks | Question Answering, Visual Question Answering |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00209v3 |
PDF | http://arxiv.org/pdf/1802.00209v3.pdf
PWC | https://paperswithcode.com/paper/dual-recurrent-attention-units-for-visual |
Repo | https://github.com/ahmedmagdiosman/compress-vqa |
Framework | pytorch |
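A rough sketch of what a recurrent attention unit can look like: attention scores over image regions come from a GRU conditioned on the question embedding rather than from 1x1 convolutions. The layer sizes, wiring, and names are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class RecurrentAttentionUnit(nn.Module):
    def __init__(self, region_dim=2048, question_dim=512, hidden=512):
        super().__init__()
        self.gru = nn.GRU(region_dim + question_dim, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, regions, question):
        # regions: (B, R, region_dim), question: (B, question_dim)
        q = question.unsqueeze(1).expand(-1, regions.size(1), -1)
        h, _ = self.gru(torch.cat([regions, q], dim=-1))      # scan over regions
        attn = torch.softmax(self.score(h).squeeze(-1), dim=-1)   # (B, R) weights
        return (attn.unsqueeze(-1) * regions).sum(dim=1)      # attended visual feature

rau = RecurrentAttentionUnit()
out = rau(torch.randn(2, 36, 2048), torch.randn(2, 512))
print(out.shape)   # torch.Size([2, 2048])
```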
Diverse Image Synthesis from Semantic Layouts via Conditional IMLE
Title | Diverse Image Synthesis from Semantic Layouts via Conditional IMLE |
Authors | Ke Li, Tianhao Zhang, Jitendra Malik |
Abstract | Most existing methods for conditional image synthesis are only able to generate a single plausible image for any given input, or at best a fixed number of plausible images. In this paper, we focus on the problem of generating images from semantic segmentation maps and present a simple new method that can generate an arbitrary number of images with diverse appearance for the same semantic layout. Unlike most existing approaches which adopt the GAN framework, our method is based on the recently introduced Implicit Maximum Likelihood Estimation (IMLE) framework. Compared to the leading approach, our method is able to generate more diverse images while producing fewer artifacts despite using the same architecture. The learned latent space also has sensible structure despite the lack of supervision that encourages such behaviour. Videos and code are available at https://people.eecs.berkeley.edu/~ke.li/projects/imle/scene_layouts/. |
Tasks | Image Generation, Semantic Segmentation |
Published | 2018-11-29 |
URL | https://arxiv.org/abs/1811.12373v2 |
PDF | https://arxiv.org/pdf/1811.12373v2.pdf
PWC | https://paperswithcode.com/paper/diverse-image-synthesis-from-semantic-layouts |
Repo | https://github.com/zth667/Diverse-Image-Synthesis-from-Semantic-Layout |
Framework | tf |
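The IMLE training idea is compact enough to sketch: for each conditioning input, sample several latent codes, keep only the generated output nearest to the ground truth, and minimise that distance. The generator below is a placeholder MLP and the dimensions are arbitrary; the paper's architecture and distance metric differ.

```python
import torch
import torch.nn as nn

gen = nn.Sequential(nn.Linear(16 + 8, 64), nn.ReLU(), nn.Linear(64, 32))
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)

def imle_step(layout, target, n_samples=10):
    """layout: (B, 16) conditioning input; target: (B, 32) ground-truth output."""
    with torch.no_grad():                    # select nearest samples without gradients
        z = torch.randn(n_samples, layout.size(0), 8)
        outs = torch.stack([gen(torch.cat([layout, zi], dim=-1)) for zi in z])
        best = ((outs - target) ** 2).sum(dim=-1).argmin(dim=0)   # nearest code per example
    z_best = z[best, torch.arange(layout.size(0))]
    loss = ((gen(torch.cat([layout, z_best], dim=-1)) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

print(imle_step(torch.randn(4, 16), torch.randn(4, 32)))
```

Because each training example only needs its nearest sample to be pulled closer, different latent codes remain free to map to different plausible outputs, which is where the diversity comes from.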
BlockCNN: A Deep Network for Artifact Removal and Image Compression
Title | BlockCNN: A Deep Network for Artifact Removal and Image Compression |
Authors | Danial Maleki, Soheila Nadalian, Mohammad Mahdi Derakhshani, Mohammad Amin Sadeghi |
Abstract | We present a general technique that performs both artifact removal and image compression. For artifact removal, we input a JPEG image and try to remove its compression artifacts. For compression, we input an image and process its 8 by 8 blocks in a sequence. For each block, we first try to predict its intensities based on previous blocks; then, we store a residual with respect to the input image. Our technique reuses JPEG’s legacy compression and decompression routines. Both our artifact removal and our image compression techniques use the same deep network, but with different training weights. Our technique is simple and fast and it significantly improves the performance of artifact removal and image compression. |
Tasks | Image Compression |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11091v1 |
PDF | http://arxiv.org/pdf/1805.11091v1.pdf
PWC | https://paperswithcode.com/paper/blockcnn-a-deep-network-for-artifact-removal |
Repo | https://github.com/DaniMlk/BlockCNN |
Framework | pytorch |
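A toy sketch of the block-sequential residual idea: walk over the image in 8x8 blocks, predict each block from already-reconstructed context, and keep only the residual. The predictor here is a stand-in (mean of the left and top neighbour blocks), not the paper's CNN, and quantization and entropy coding are omitted.

```python
import numpy as np

def predict_block(decoded, by, bx, bs=8):
    """Stand-in predictor: average of the already-reconstructed left/top blocks."""
    left = decoded[by:by + bs, bx - bs:bx] if bx >= bs else None
    top = decoded[by - bs:by, bx:bx + bs] if by >= bs else None
    ctx = [b for b in (left, top) if b is not None]
    return np.mean(ctx, axis=0) if ctx else np.full((bs, bs), 128.0)

def encode(img, bs=8):
    decoded = np.zeros_like(img, dtype=float)
    residuals = np.zeros_like(img, dtype=float)
    for by in range(0, img.shape[0], bs):
        for bx in range(0, img.shape[1], bs):         # raster order over 8x8 blocks
            pred = predict_block(decoded, by, bx, bs)
            residuals[by:by + bs, bx:bx + bs] = img[by:by + bs, bx:bx + bs] - pred
            decoded[by:by + bs, bx:bx + bs] = pred + residuals[by:by + bs, bx:bx + bs]
    return residuals    # a real codec would quantize and entropy-code these

y, x = np.mgrid[0:32, 0:32]
res = encode(4.0 * x + 2.0 * y)          # a smooth ramp: context predicts it well
print(res.shape)                         # (32, 32); residuals are what get stored
```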
Fast Counting in Machine Learning Applications
Title | Fast Counting in Machine Learning Applications |
Authors | Subhadeep Karan, Matthew Eichhorn, Blake Hurlburt, Grant Iraci, Jaroslaw Zola |
Abstract | We propose scalable methods to execute counting queries in machine learning applications. To achieve memory and computational efficiency, we abstract counting queries and their context such that the counts can be aggregated as a stream. We demonstrate performance and scalability of the resulting approach on random queries, and through extensive experimentation using Bayesian networks learning and association rule mining. Our methods significantly outperform commonly used ADtrees and hash tables, and are practical alternatives for processing large-scale data. |
Tasks | |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04640v3 |
PDF | http://arxiv.org/pdf/1804.04640v3.pdf
PWC | https://paperswithcode.com/paper/fast-counting-in-machine-learning |
Repo | https://github.com/omerjerk/cuSABNAtk |
Framework | none |
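The "aggregate counts as a stream" idea can be sketched directly: each query is a projection onto a set of columns, and all queries are answered in a single pass over the rows. This is generic scaffolding for illustration, not the authors' optimized implementation.

```python
from collections import Counter

def stream_counts(rows, queries):
    """rows: iterable of tuples; queries: list of column-index tuples.
    Returns one Counter of projected value combinations per query."""
    counters = [Counter() for _ in queries]
    for row in rows:                            # single pass over the data stream
        for counter, cols in zip(counters, queries):
            counter[tuple(row[c] for c in cols)] += 1
    return counters

data = [(0, 1, 1), (0, 0, 1), (1, 1, 1), (0, 1, 0)]
queries = [(0,), (0, 2)]                        # marginal of X0 and joint of (X0, X2)
for cols, counter in zip(queries, stream_counts(data, queries)):
    print(cols, dict(counter))
# (0,) {(0,): 3, (1,): 1}
# (0, 2) {(0, 1): 2, (1, 1): 1, (0, 0): 1}
```

Counts like these are exactly what Bayesian-network structure learning and association rule mining query repeatedly, which is why replacing per-query scans or ADtrees with one streaming pass pays off.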
Addressing Function Approximation Error in Actor-Critic Methods
Title | Addressing Function Approximation Error in Actor-Critic Methods |
Authors | Scott Fujimoto, Herke van Hoof, David Meger |
Abstract | In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies. We show that this problem persists in an actor-critic setting and propose novel mechanisms to minimize its effects on both the actor and the critic. Our algorithm builds on Double Q-learning, by taking the minimum value between a pair of critics to limit overestimation. We draw the connection between target networks and overestimation bias, and suggest delaying policy updates to reduce per-update error and further improve performance. We evaluate our method on the suite of OpenAI gym tasks, outperforming the state of the art in every environment tested. |
Tasks | Q-Learning |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09477v3 |
PDF | http://arxiv.org/pdf/1802.09477v3.pdf
PWC | https://paperswithcode.com/paper/addressing-function-approximation-error-in |
Repo | https://github.com/nikhilbarhate99/TD3-PyTorch-BipedalWalker-v2 |
Framework | pytorch |
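The two mechanisms highlighted in the abstract, taking the minimum over a pair of critics and delaying policy updates, are easy to sketch. The networks and hyper-parameters below are placeholders rather than the released code, and clipping the action to the environment's bounds is omitted.

```python
import torch
import torch.nn as nn

class Critic(nn.Module):
    def __init__(self, s_dim=3, a_dim=1):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(s_dim + a_dim, 32), nn.ReLU(),
                               nn.Linear(32, 1))

    def forward(self, s, a):
        return self.f(torch.cat([s, a], dim=-1))

actor_targ = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 1), nn.Tanh())
q1_targ, q2_targ = Critic(), Critic()

def td3_target(r, s_next, done, gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Clipped double-Q target: smooth the target action with clipped noise,
    then take the minimum of the two target critics to curb overestimation."""
    with torch.no_grad():
        a_next = actor_targ(s_next)
        a_next = a_next + torch.clamp(noise_std * torch.randn_like(a_next),
                                      -noise_clip, noise_clip)
        q_min = torch.min(q1_targ(s_next, a_next), q2_targ(s_next, a_next))
        return r + gamma * (1.0 - done) * q_min

y = td3_target(torch.zeros(5, 1), torch.randn(5, 3), torch.zeros(5, 1))
print(y.shape)   # both critics regress to this target; the actor (and the target
                 # networks) are updated only every few critic steps
```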
DENSER: Deep Evolutionary Network Structured Representation
Title | DENSER: Deep Evolutionary Network Structured Representation |
Authors | Filipe Assunção, Nuno Lourenço, Penousal Machado, Bernardete Ribeiro |
Abstract | Deep Evolutionary Network Structured Representation (DENSER) is a novel approach to automatically design Artificial Neural Networks (ANNs) using Evolutionary Computation. The algorithm not only searches for the best network topology (e.g., number of layers, type of layers), but also tunes hyper-parameters, such as learning parameters or data augmentation parameters. The automatic design is achieved using a representation with two distinct levels, where the outer level encodes the general structure of the network, i.e., the sequence of layers, and the inner level encodes the parameters associated with each layer. The allowed layers and range of the hyper-parameter values are defined by means of a human-readable Context-Free Grammar. DENSER was used to evolve ANNs for CIFAR-10, obtaining an average test accuracy of 94.13%. The networks evolved for CIFAR-10 are tested on the MNIST, Fashion-MNIST, and CIFAR-100; the results are highly competitive, and on the CIFAR-100 we report a test accuracy of 78.75%. To the best of our knowledge, our CIFAR-100 results are the highest performing models generated by methods that aim at the automatic design of Convolutional Neural Networks (CNNs), and are amongst the best for manually designed and fine-tuned CNNs. |
Tasks | Data Augmentation |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01563v3 |
PDF | http://arxiv.org/pdf/1801.01563v3.pdf
PWC | https://paperswithcode.com/paper/denser-deep-evolutionary-network-structured |
Repo | https://github.com/harshit17chaudhary/SML_assignment_1 |
Framework | tf |
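A rough sketch of the two-level genotype the abstract describes: an outer list encodes the sequence of layers, and each layer carries an inner dictionary of evolvable hyper-parameters. The layer types, parameter ranges, and mutation operator are illustrative stand-ins for the paper's grammar-driven machinery.

```python
import random

OUTER_CHOICES = ["conv", "pool", "dense"]
INNER_RANGES = {
    "conv":  {"filters": (8, 256), "kernel": (2, 5)},
    "pool":  {"size": (2, 4)},
    "dense": {"units": (16, 512)},
}

def random_layer():
    kind = random.choice(OUTER_CHOICES)
    params = {k: random.randint(lo, hi) for k, (lo, hi) in INNER_RANGES[kind].items()}
    return {"type": kind, "params": params}

def mutate(genotype):
    g = [dict(layer, params=dict(layer["params"])) for layer in genotype]   # copy
    i = random.randrange(len(g))
    if random.random() < 0.5:
        g[i] = random_layer()                        # outer-level mutation: swap a layer
    else:
        k, (lo, hi) = random.choice(list(INNER_RANGES[g[i]["type"]].items()))
        g[i]["params"][k] = random.randint(lo, hi)   # inner-level mutation: retune a value
    return g

parent = [random_layer() for _ in range(4)]
print(mutate(parent))
```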
Iterative Reorganization with Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning
Title | Iterative Reorganization with Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning |
Authors | Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. Yuille |
Abstract | Learning visual features from unlabeled image data is an important yet challenging task, which is often achieved by training a model on some annotation-free information. We consider spatial contexts, for which we solve so-called jigsaw puzzles, i.e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration. Existing approaches formulated it as a classification task by defining a fixed mapping from a small subset of configurations to a class set, but these approaches ignore the underlying relationship between different configurations and also limit their application to more complex scenarios. This paper presents a novel approach which applies to jigsaw puzzles with an arbitrary grid size and dimensionality. We provide a fundamental and generalized principle: weaker cues are easier to learn in an unsupervised manner and also transfer better. In the context of puzzle recognition, we use an iterative approach which, instead of solving the puzzle all at once, adjusts the order of the patches in each step until convergence. In each step, we combine both unary and binary features on each patch into a cost function judging the correctness of the current configuration. Our approach, by taking similarity between puzzles into consideration, enjoys a more reasonable way of learning visual knowledge. We verify the effectiveness of our approach in two aspects. First, it is able to solve arbitrarily complex puzzles, including high-dimensional puzzles, which prior methods struggle to handle. Second, it serves as a reliable way of network initialization, which leads to better transfer performance in a few visual recognition tasks including image classification, object detection, and semantic segmentation. |
Tasks | Image Classification, Object Detection, Representation Learning, Semantic Segmentation, Unsupervised Representation Learning |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00329v1 |
PDF | http://arxiv.org/pdf/1812.00329v1.pdf
PWC | https://paperswithcode.com/paper/iterative-reorganization-with-weak-spatial |
Repo | https://github.com/weichen582/Unsupervised-Visual-Recognition-by-Solving-Arbitrary-Puzzles |
Framework | tf |
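A toy version of the iterative reorganization loop: rather than predicting the whole permutation at once, repeatedly apply the patch swap that reduces a cost built from unary terms (patch-at-slot) and binary terms (adjacent-patch compatibility). The random cost tables below stand in for learned network outputs, and the greedy swap rule is only one simple way to realize the iterative adjustment.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # a 2x2 puzzle: 4 patches, 4 slots
unary = rng.random((n, n))              # unary[p, s]: cost of placing patch p at slot s
binary = rng.random((n, n))             # binary[p, q]: cost of patch p adjacent to q
adjacent = [(0, 1), (2, 3), (0, 2), (1, 3)]   # slot pairs that touch in a 2x2 grid

def cost(perm):
    """perm[s] is the patch currently placed at slot s."""
    c = sum(unary[perm[s], s] for s in range(n))
    return c + sum(binary[perm[a], perm[b]] for a, b in adjacent)

def swapped(perm, i, j):
    p = list(perm)
    p[i], p[j] = p[j], p[i]
    return p

perm = list(range(n))                   # start from an arbitrary configuration
improved = True
while improved:                         # adjust the arrangement step by step
    improved = False
    for i, j in itertools.combinations(range(n), 2):
        if cost(swapped(perm, i, j)) < cost(perm):
            perm = swapped(perm, i, j)
            improved = True
print(perm, cost(perm))
```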
Did You Really Just Have a Heart Attack? Towards Robust Detection of Personal Health Mentions in Social Media
Title | Did You Really Just Have a Heart Attack? Towards Robust Detection of Personal Health Mentions in Social Media |
Authors | Payam Karisani, Eugene Agichtein |
Abstract | Millions of users share their experiences on social media sites, such as Twitter, which in turn generate valuable data for public health monitoring, digital epidemiology, and other analyses of population health at global scale. The first, critical task for these applications is classifying whether a personal health event was mentioned, which we call the personal health mention (PHM) problem. This task is challenging for many reasons, including the typically short length of social media posts, inventive spelling and lexicons, and figurative language, including hyperbole using diseases like “heart attack” or “cancer” for emphasis, and not as a health self-report. This problem is even more challenging for rarely reported, or frequent but ambiguously expressed conditions, such as “stroke”. To address this problem, we propose a general, robust method for detecting PHMs in social media, which we call WESPAD, that combines lexical, syntactic, word embedding-based, and context-based features. WESPAD is able to generalize from few examples by automatically distorting the word embedding space to most effectively detect the true health mentions. Unlike previously proposed state-of-the-art supervised and deep-learning techniques, WESPAD requires relatively little training data, which makes it possible to adapt, with minimal effort, to each new disease and condition. We evaluate WESPAD on both an established publicly available Flu detection benchmark, and on a new dataset that we have constructed with mentions of multiple health conditions. Our experiments show that WESPAD outperforms the baselines and state-of-the-art methods, especially in cases when the number and proportion of true health mentions in the training data are small. |
Tasks | Epidemiology |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09130v2 |
PDF | http://arxiv.org/pdf/1802.09130v2.pdf
PWC | https://paperswithcode.com/paper/did-you-really-just-have-a-heart-attack |
Repo | https://github.com/emory-irlab/PHM2017 |
Framework | none |
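As generic context only, the sketch below combines two simple feature families (word and character n-grams) in a linear classifier for health-mention text. It is not the WESPAD method; in particular, the paper's embedding-space distortion and its syntactic and context features are not shown here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

model = Pipeline([
    ("features", FeatureUnion([
        ("words", TfidfVectorizer(ngram_range=(1, 2))),
        ("chars", TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Two toy tweets: figurative vs. genuine use of a health phrase.
tweets = ["just had a heart attack when I saw the bill",
          "my dad had a heart attack last night, at the hospital now"]
labels = [0, 1]           # 1 = real personal health mention
model.fit(tweets, labels)
print(model.predict(["grandma suffered a stroke yesterday"]))
```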
Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder
Title | Detecting Incongruity Between News Headline and Body Text via a Deep Hierarchical Encoder |
Authors | Seunghyun Yoon, Kunwoo Park, Joongbo Shin, Hongjun Lim, Seungpil Won, Meeyoung Cha, Kyomin Jung |
Abstract | Some news headlines mislead readers with overrated or false information, and identifying them in advance will better assist readers in choosing proper news stories to consume. This research introduces a million-scale dataset of news headline and body text pairs with incongruity labels, which can uniquely be utilized for detecting news stories with misleading headlines. On this dataset, we develop two neural networks with hierarchical architectures that model a complex textual representation of news articles and measure the incongruity between the headline and the body text. We also present a data augmentation method that dramatically reduces the text input size a model handles by independently investigating each paragraph of news stories, which further boosts the performance. Our experiments and qualitative evaluations demonstrate that the proposed methods outperform existing approaches and efficiently detect news stories with misleading headlines in the real world. |
Tasks | Data Augmentation, Fake News Detection, incongruity detection, Stance Detection |
Published | 2018-11-17 |
URL | http://arxiv.org/abs/1811.07066v2 |
PDF | http://arxiv.org/pdf/1811.07066v2.pdf
PWC | https://paperswithcode.com/paper/detecting-incongruity-between-news-headline |
Repo | https://github.com/david-yoon/detecting-incongruity |
Framework | tf |
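The paragraph-level augmentation described in the abstract can be sketched as a simple data transformation: pair the headline with each body paragraph separately, which shrinks the input a model sees and multiplies the training examples. The function and field names are illustrative, not taken from the released code.

```python
def expand_by_paragraph(headline, body, label):
    """Turn one (headline, body, label) article into per-paragraph examples."""
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    return [(headline, p, label) for p in paragraphs]

article = ("Scientists discover cure for the common cold",
           "A new study looked at zinc supplements.\n\n"
           "The authors report only a modest reduction in symptom duration.\n\n"
           "They explicitly caution that this is not a cure.",
           1)   # 1 = incongruent headline
for example in expand_by_paragraph(*article):
    print(example)
# At inference time, per-paragraph scores can be aggregated (e.g. max or mean)
# into a single article-level incongruity score.
```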