Paper Group AWR 433
End-to-End Robotic Reinforcement Learning without Reward Engineering
Title | End-to-End Robotic Reinforcement Learning without Reward Engineering |
Authors | Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine |
Abstract | The combination of deep neural network models and reinforcement learning algorithms can make it possible to learn policies for robotic behaviors that directly read in raw sensory inputs, such as camera images, effectively subsuming both estimation and control into one model. However, real-world applications of reinforcement learning must specify the goal of the task by means of a manually programmed reward function, which in practice requires either designing the very same perception pipeline that end-to-end reinforcement learning promises to avoid, or else instrumenting the environment with additional sensors to determine if the task has been performed successfully. In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task. While requesting labels for every single state would amount to asking the user to manually provide the reward signal, our method requires labels for only a tiny fraction of the states seen during training, making it an efficient and practical approach for learning skills without manually engineered rewards. We evaluate our method on real-world robotic manipulation tasks where the observations consist of images viewed by the robot’s camera. In our experiments, our method effectively learns to arrange objects, place books, and drape cloth, directly from images and without any manually specified reward functions, and with only 1-4 hours of interaction with the real world. |
Tasks | |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07854v2 |
https://arxiv.org/pdf/1904.07854v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-robotic-reinforcement-learning |
Repo | https://github.com/avisingh599/reward-learning-rl |
Framework | none |
4D Generic Video Object Proposals
Title | 4D Generic Video Object Proposals |
Authors | Aljosa Osep, Paul Voigtlaender, Mark Weber, Jonathon Luiten, Bastian Leibe |
Abstract | Many high-level video understanding methods require input in the form of object proposals. Currently, such proposals are predominantly generated with the help of networks that were trained for detecting and segmenting a set of known object classes, which limits their applicability to cases where all objects of interest are represented in the training set. This is a restriction for automotive scenarios, where unknown objects can frequently occur. We propose an approach that can reliably extract spatio-temporal object proposals for both known and unknown object categories from stereo video. Our 4D Generic Video Tubes (4D-GVT) method leverages motion cues, stereo data, and object instance segmentation to compute a compact set of video-object proposals that precisely localizes object candidates and their contours in 3D space and time. We show that given only a small amount of labeled data, our 4D-GVT proposal generator generalizes well to real-world scenarios, in which unknown categories appear. It outperforms other approaches that try to detect as many objects as possible by increasing the number of classes in the training set to several thousand. |
Tasks | Instance Segmentation, Semantic Segmentation, Video Understanding |
Published | 2019-01-26 |
URL | http://arxiv.org/abs/1901.09260v2 |
http://arxiv.org/pdf/1901.09260v2.pdf | |
PWC | https://paperswithcode.com/paper/4d-generic-video-object-proposals |
Repo | https://github.com/aljosaosep/4DGVT |
Framework | none |
Propagate-Selector: Detecting Supporting Sentences for Question Answering via Graph Neural Networks
Title | Propagate-Selector: Detecting Supporting Sentences for Question Answering via Graph Neural Networks |
Authors | Seunghyun Yoon, Franck Dernoncourt, Doo Soon Kim, Trung Bui, Kyomin Jung |
Abstract | In this study, we propose a novel graph neural network called propagate-selector (PS), which propagates information over sentences to understand information that cannot be inferred when considering sentences in isolation. First, we design a graph structure in which each node represents an individual sentence, and some pairs of nodes are selectively connected based on the text structure. Then, we develop an iterative attentive aggregation and a skip-combine method in which a node interacts with its neighborhood nodes to accumulate the necessary information. To evaluate the performance of the proposed approaches, we conduct experiments with the standard HotpotQA dataset. The empirical results demonstrate the superiority of our proposed approach, which obtains the best performances, compared to the widely used answer-selection models that do not consider the intersentential relationship. |
Tasks | Answer Selection, Question Answering |
Published | 2019-08-24 |
URL | https://arxiv.org/abs/1908.09137v2 |
https://arxiv.org/pdf/1908.09137v2.pdf | |
PWC | https://paperswithcode.com/paper/propagate-selector-detecting-supporting |
Repo | https://github.com/david-yoon/propagate-selector |
Framework | tf |
Constructing Self-motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach
Title | Constructing Self-motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach |
Authors | Qing Lian, Fengmao Lv, Lixin Duan, Boqing Gong |
Abstract | We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains. Our approach draws on an insight connecting two existing works: curriculum domain adaptation and self-training. Inspired by the former, PyCDA constructs a pyramid curriculum which contains various properties about the target domain. Those properties are mainly about the desired label distributions over the target domain images, image regions, and pixels. By enforcing the segmentation neural network to observe those properties, we can improve the network’s generalization capability to the target domain. Motivated by the self-training, we infer this pyramid of properties by resorting to the semantic segmentation network itself. Unlike prior work, we do not need to maintain any additional models (e.g., logistic regression or discriminator networks) or to solve minmax problems which are often difficult to optimize. We report state-of-the-art results for the adaptation from both GTAV and SYNTHIA to Cityscapes, two popular settings in unsupervised domain adaptation for semantic segmentation. |
Tasks | Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09547v1 |
https://arxiv.org/pdf/1908.09547v1.pdf | |
PWC | https://paperswithcode.com/paper/constructing-self-motivated-pyramid |
Repo | https://github.com/lianqing11/pycda |
Framework | pytorch |
Fairwashing: the risk of rationalization
Title | Fairwashing: the risk of rationalization |
Authors | Ulrich Aïvodji, Hiromi Arai, Olivier Fortineau, Sébastien Gambs, Satoshi Hara, Alain Tapp |
Abstract | Black-box explanation is the problem of explaining how a machine learning model – whose internal logic is hidden to the auditor and generally complex – produces its outcomes. Current approaches for solving this problem include model explanation, outcome explanation as well as model inspection. While these techniques can be beneficial by providing interpretability, they can be used in a negative manner to perform fairwashing, which we define as promoting the false perception that a machine learning model respects some ethical values. In particular, we demonstrate that it is possible to systematically rationalize decisions taken by an unfair black-box model using the model explanation as well as the outcome explanation approaches with a given fairness metric. Our solution, LaundryML, is based on a regularized rule list enumeration algorithm whose objective is to search for fair rule lists approximating an unfair black-box model. We empirically evaluate our rationalization technique on black-box models trained on real-world datasets and show that one can obtain rule lists with high fidelity to the black-box model while being considerably less unfair at the same time. |
Tasks | |
Published | 2019-01-28 |
URL | https://arxiv.org/abs/1901.09749v3 |
https://arxiv.org/pdf/1901.09749v3.pdf | |
PWC | https://paperswithcode.com/paper/fairwashing-the-risk-of-rationalization |
Repo | https://github.com/aivodji/LaundryML |
Framework | none |
A Little Annotation does a Lot of Good: A Study in Bootstrapping Low-resource Named Entity Recognizers
Title | A Little Annotation does a Lot of Good: A Study in Bootstrapping Low-resource Named Entity Recognizers |
Authors | Aditi Chaudhary, Jiateng Xie, Zaid Sheikh, Graham Neubig, Jaime G. Carbonell |
Abstract | Most state-of-the-art models for named entity recognition (NER) rely on the availability of large amounts of labeled data, making them challenging to extend to new, lower-resourced languages. However, there are now several proposed approaches involving either cross-lingual transfer learning, which learns from other highly resourced languages, or active learning, which efficiently selects effective training data based on model predictions. This paper poses the question: given this recent progress, and limited human annotation, what is the most effective method for efficiently creating high-quality entity recognizers in under-resourced languages? Based on extensive experimentation using both simulated and real human annotation, we find a dual-strategy approach best, starting with a cross-lingual transferred model, then performing targeted annotation of only uncertain entity spans in the target language, minimizing annotator effort. Results demonstrate that cross-lingual transfer is a powerful tool when very little data can be annotated, but an entity-targeted annotation strategy can achieve competitive accuracy quickly, with just one-tenth of training data. |
Tasks | Active Learning, Cross-Lingual Transfer, Named Entity Recognition, Transfer Learning |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08983v1 |
https://arxiv.org/pdf/1908.08983v1.pdf | |
PWC | https://paperswithcode.com/paper/a-little-annotation-does-a-lot-of-good-a |
Repo | https://github.com/Aditi138/EntityTargetedActiveLearning |
Framework | none |
Protecting Neural Networks with Hierarchical Random Switching: Towards Better Robustness-Accuracy Trade-off for Stochastic Defenses
Title | Protecting Neural Networks with Hierarchical Random Switching: Towards Better Robustness-Accuracy Trade-off for Stochastic Defenses |
Authors | Xiao Wang, Siyue Wang, Pin-Yu Chen, Yanzhi Wang, Brian Kulis, Xue Lin, Peter Chin |
Abstract | Despite the remarkable success of deep neural networks in various domains, recent studies have uncovered their vulnerability to adversarial perturbations, raising concerns about model generalizability and new threats such as prediction-evasive misclassification or stealthy reprogramming. Among different defense proposals, stochastic network defenses such as random neuron activation pruning or random perturbation to layer inputs are shown to be promising for attack mitigation. However, one critical drawback of current defenses is that the robustness enhancement comes at the cost of noticeable performance degradation on legitimate data, e.g., a large drop in test accuracy. This paper is motivated by the pursuit of a better trade-off between adversarial robustness and test accuracy for stochastic network defenses. We propose Defense Efficiency Score (DES), a comprehensive metric that measures the gain in unsuccessful attack attempts at the cost of drop in test accuracy of any defense. To achieve a better DES, we propose hierarchical random switching (HRS), which protects neural networks through a novel randomization scheme. An HRS-protected model contains several blocks of randomly switching channels to prevent adversaries from exploiting fixed model structures and parameters for their malicious purposes. Extensive experiments show that HRS is superior in defending against state-of-the-art white-box and adaptive adversarial misclassification attacks. We also demonstrate the effectiveness of HRS in defending against adversarial reprogramming, which is the first defense against adversarial programs. Moreover, in most settings the average DES of HRS is at least 5X higher than current stochastic network defenses, validating its significantly improved robustness-accuracy trade-off. |
Tasks | |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07116v1 |
https://arxiv.org/pdf/1908.07116v1.pdf | |
PWC | https://paperswithcode.com/paper/protecting-neural-networks-with-hierarchical |
Repo | https://github.com/KieranXWang/HRS |
Framework | tf |
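The random-switching idea in the HRS abstract above can be illustrated with a minimal sketch. It assumes each block holds several interchangeable channels and that one channel per block is drawn uniformly at random on every forward pass; the names `SwitchingBlock` and `hrs_forward`, the uniform sampling, and the toy channels are hypothetical and are not taken from the authors' repository.

```python
import random
from typing import Callable, List, Sequence


class SwitchingBlock:
    """One block holding several interchangeable channels (hypothetical name)."""

    def __init__(self, channels: Sequence[Callable[[float], float]]):
        self.channels: List[Callable[[float], float]] = list(channels)

    def __call__(self, x: float, rng: random.Random) -> float:
        # Assumption: pick one channel uniformly at random per forward pass.
        channel = rng.choice(self.channels)
        return channel(x)


def hrs_forward(blocks: Sequence[SwitchingBlock], x: float, rng: random.Random) -> float:
    """Pass x through each block in turn, switching channels independently per block."""
    for block in blocks:
        x = block(x, rng)
    return x


# Toy channels stand in for trained sub-networks.
blocks = [
    SwitchingBlock([lambda x: x + 1, lambda x: x + 2]),
    SwitchingBlock([lambda x: x * 10, lambda x: x * 100]),
]
outputs = {hrs_forward(blocks, 0.0, random.Random(seed)) for seed in range(50)}
```

With two channels in each of two blocks, an input can follow up to four distinct paths, so an attacker cannot rely on a single fixed set of parameters.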
Alternating Synthetic and Real Gradients for Neural Language Modeling
Title | Alternating Synthetic and Real Gradients for Neural Language Modeling |
Authors | Fangxin Shang, Hao Zhang |
Abstract | Training recurrent neural networks (RNNs) with backpropagation through time (BPTT) has known drawbacks, such as difficulty capturing long-term dependencies in sequences. Successful alternatives to BPTT have not yet been discovered. Recently, BP with synthetic gradients by a decoupled neural interface module has been proposed to replace BPTT for training RNNs. On the other hand, it has been shown that the representations learned with synthetic and real gradients are different, though they are functionally identical. In this project, we explore ways of combining synthetic and real gradients with application to neural language modeling tasks. Empirically, we demonstrate the effectiveness of alternating training with synthetic and real gradients after periodic warm restarts on language modeling tasks. |
Tasks | Language Modelling |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10630v1 |
http://arxiv.org/pdf/1902.10630v1.pdf | |
PWC | https://paperswithcode.com/paper/alternating-synthetic-and-real-gradients-for |
Repo | https://github.com/parap1uie-s/alternate_sg |
Framework | pytorch |
DRiLLS: Deep Reinforcement Learning for Logic Synthesis
Title | DRiLLS: Deep Reinforcement Learning for Logic Synthesis |
Authors | Abdelrahman Hosny, Soheil Hashemi, Mohamed Shalan, Sherief Reda |
Abstract | Logic synthesis requires extensive tuning of the synthesis optimization flow where the quality of results (QoR) depends on the sequence of optimizations used. Efficient design space exploration is challenging due to the exponential number of possible optimization permutations. Therefore, automating the optimization process is necessary. In this work, we propose a novel reinforcement learning-based methodology that navigates the optimization space without human intervention. We demonstrate the training of an Advantage Actor Critic (A2C) agent that seeks to minimize area subject to a timing constraint. Using the proposed methodology, designs can be optimized autonomously with no human in the loop. Evaluation on the comprehensive EPFL benchmark suite shows that the agent outperforms existing exploration methodologies and improves QoRs by an average of 13%. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04021v2 |
https://arxiv.org/pdf/1911.04021v2.pdf | |
PWC | https://paperswithcode.com/paper/drills-deep-reinforcement-learning-for-logic |
Repo | https://github.com/scale-lab/DRiLLS |
Framework | tf |
SpiralNet++: A Fast and Highly Efficient Mesh Convolution Operator
Title | SpiralNet++: A Fast and Highly Efficient Mesh Convolution Operator |
Authors | Shunwang Gong, Lei Chen, Michael Bronstein, Stefanos Zafeiriou |
Abstract | Intrinsic graph convolution operators with differentiable kernel functions play a crucial role in analyzing 3D shape meshes. In this paper, we present a fast and efficient intrinsic mesh convolution operator that does not rely on the intricate design of kernel functions. We explicitly formulate the order of aggregating neighboring vertices, instead of learning weights between nodes, and then a fully connected layer follows to fuse local geometric structure information with vertex features. We provide extensive evidence showing that models based on this convolution operator are easier to train, and can efficiently learn invariant shape features. Specifically, we evaluate our method on three different types of tasks: dense shape correspondence, 3D facial expression classification, and 3D shape reconstruction. We show that it significantly outperforms state-of-the-art approaches while being significantly faster, without relying on shape descriptors. Our source code is available on GitHub. |
Tasks | |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05856v1 |
https://arxiv.org/pdf/1911.05856v1.pdf | |
PWC | https://paperswithcode.com/paper/spiralnet-a-fast-and-highly-efficient-mesh |
Repo | https://github.com/sw-gong/spiralnet_plus |
Framework | pytorch |
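The operator described in the SpiralNet++ abstract above (aggregate neighbors in an explicit, fixed order, then apply a fully connected layer) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `spiral_conv`, the list-based tensor layout, and the precomputed fixed-length `spirals` index lists are all hypothetical.

```python
from typing import List, Sequence


def spiral_conv(
    features: Sequence[Sequence[float]],   # V x C_in vertex features
    spirals: Sequence[Sequence[int]],      # V x L neighbor indices in a fixed order (assumed precomputed)
    weight: Sequence[Sequence[float]],     # (L * C_in) x C_out fully connected weights
    bias: Sequence[float],                 # C_out
) -> List[List[float]]:
    """Concatenate neighbor features in spiral order, then apply one linear layer."""
    out: List[List[float]] = []
    for spiral in spirals:
        # Gather step: the ordering is fixed, so no per-edge weights are learned.
        gathered = [value for idx in spiral for value in features[idx]]
        # Fully connected step: fuses the ordered neighborhood into C_out features.
        row = [
            sum(g * weight[k][j] for k, g in enumerate(gathered)) + bias[j]
            for j in range(len(bias))
        ]
        out.append(row)
    return out


# Toy mesh with 3 vertices, 1 input channel, spiral length 3, 1 output channel.
features = [[1.0], [2.0], [3.0]]
spirals = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
weight = [[1.0], [10.0], [100.0]]
bias = [0.5]
result = spiral_conv(features, spirals, weight, bias)
```

Because the weights are tied to positions in the spiral rather than to edges, the same three vertices produce different outputs depending on the order in which they are visited.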
Multi-Task Driven Feature Models for Thermal Infrared Tracking
Title | Multi-Task Driven Feature Models for Thermal Infrared Tracking |
Authors | Qiao Liu, Xin Li, Zhenyu He, Nana Fan, Di Yuan, Wei Liu, Yonsheng Liang |
Abstract | Existing deep Thermal InfraRed (TIR) trackers usually use the feature models of RGB trackers for representation. However, these feature models learned on RGB images are neither effective in representing TIR objects nor able to take fine-grained TIR information into consideration. To this end, we develop a multi-task framework to learn the TIR-specific discriminative features and fine-grained correlation features for TIR tracking. Specifically, we first use an auxiliary classification network to guide the generation of TIR-specific discriminative features for distinguishing the TIR objects belonging to different classes. Second, we design a fine-grained aware module to capture more subtle information for distinguishing the TIR objects belonging to the same class. These two kinds of features complement each other and recognize TIR objects at the inter-class and intra-class levels, respectively. These two feature models are learned using a multi-task matching framework and are jointly optimized on the TIR tracking task. In addition, we develop a large-scale TIR training dataset to train the network for adapting the model to the TIR domain. Extensive experimental results on three benchmarks show that the proposed algorithm achieves a relative gain of 10% over the baseline and performs favorably against the state-of-the-art methods. Code and the proposed TIR dataset are available at https://github.com/QiaoLiuHit/MMNet. |
Tasks | Thermal Infrared Object Tracking |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11384v1 |
https://arxiv.org/pdf/1911.11384v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-driven-feature-models-for-thermal |
Repo | https://github.com/QiaoLiuHit/MMNet |
Framework | none |
DeepSmartFuzzer: Reward Guided Test Generation For Deep Learning
Title | DeepSmartFuzzer: Reward Guided Test Generation For Deep Learning |
Authors | Samet Demir, Hasan Ferit Eniser, Alper Sen |
Abstract | Testing Deep Neural Network (DNN) models has become more important than ever with the increasing usage of DNN models in safety-critical domains such as autonomous cars. The traditional approach of testing DNNs is to create a test set, which is a random subset of the dataset about the problem of interest. This kind of approach is not enough for testing most of the real-world scenarios since these traditional test sets do not include corner cases, while a corner case input is generally considered to introduce erroneous behaviors. Recent works on adversarial input generation, data augmentation, and coverage-guided fuzzing (CGF) have provided new ways to extend traditional test sets. Among those, CGF aims to produce new test inputs by fuzzing existing ones to achieve high coverage on a test adequacy criterion (i.e. coverage criterion). Given that the subject test adequacy criterion is a well-established one, CGF can potentially find error inducing inputs for different underlying reasons. In this paper, we propose a novel CGF solution for structural testing of DNNs. The proposed fuzzer employs Monte Carlo Tree Search to drive the coverage-guided search in the pursuit of achieving high coverage. Our evaluation shows that the inputs generated by our method result in higher coverage than the inputs produced by the previously introduced coverage-guided fuzzing techniques. |
Tasks | Data Augmentation |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10621v1 |
https://arxiv.org/pdf/1911.10621v1.pdf | |
PWC | https://paperswithcode.com/paper/deepsmartfuzzer-reward-guided-test-generation |
Repo | https://github.com/hasanferit/DeepSmartFuzzer |
Framework | none |
A Comparative Study on Transformer vs RNN in Speech Applications
Title | A Comparative Study on Transformer vs RNN in Speech Applications |
Authors | Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang |
Abstract | Sequence-to-sequence models have been widely used in end-to-end speech processing, for example, automatic speech recognition (ASR), speech translation (ST), and text-to-speech (TTS). This paper focuses on an emergent sequence-to-sequence model called Transformer, which achieves state-of-the-art performance in neural machine translation and other natural language processing applications. We undertook intensive studies in which we experimentally compared and analyzed Transformer and conventional recurrent neural networks (RNN) in a total of 15 ASR, one multilingual ASR, one ST, and two TTS benchmarks. Our experiments revealed various training tips and significant performance benefits obtained with Transformer for each task, including the surprising superiority of Transformer in 13/15 ASR benchmarks in comparison with RNN. We are preparing to release Kaldi-style reproducible recipes using open source and publicly available datasets for all the ASR, ST, and TTS tasks so that the community can build on our exciting outcomes. |
Tasks | Machine Translation, Speech Recognition |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06317v2 |
https://arxiv.org/pdf/1909.06317v2.pdf | |
PWC | https://paperswithcode.com/paper/a-comparative-study-on-transformer-vs-rnn-in |
Repo | https://github.com/espnet/espnet |
Framework | pytorch |
Photo-Sketching: Inferring Contour Drawings from Images
Title | Photo-Sketching: Inferring Contour Drawings from Images |
Authors | Mengtian Li, Zhe Lin, Radomir Mech, Ersin Yumer, Deva Ramanan |
Abstract | Edges, boundaries and contours are important subjects of study in both computer graphics and computer vision. On the one hand, they are the 2D elements that convey 3D shapes; on the other hand, they are indicative of occlusion events and thus separation of objects or semantic concepts. In this paper, we aim to generate contour drawings, boundary-like drawings that capture the outline of the visual scene. Prior art often casts this problem as boundary detection. However, the set of visual cues presented in the boundary detection output is different from the one in contour drawings, and the artistic style is also ignored. We address these issues by collecting a new dataset of contour drawings and proposing a learning-based method that resolves diversity in the annotation and, unlike boundary detectors, can work with imperfect alignment of the annotation and the actual ground truth. Our method surpasses previous methods quantitatively and qualitatively. Surprisingly, when our model fine-tunes on BSDS500, we achieve state-of-the-art performance in salient boundary detection, suggesting contour drawing might be a scalable alternative to boundary annotation, which at the same time is easier and more interesting for annotators to draw. |
Tasks | Boundary Detection |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00542v1 |
http://arxiv.org/pdf/1901.00542v1.pdf | |
PWC | https://paperswithcode.com/paper/photo-sketching-inferring-contour-drawings |
Repo | https://github.com/jjeamin/PhotoSketch_Pytorch |
Framework | pytorch |
Conservative Agency
Title | Conservative Agency |
Authors | Alexander Matt Turner, Dylan Hadfield-Menell, Prasad Tadepalli |
Abstract | Reward functions are easy to misspecify; although designers can make corrections after observing mistakes, an agent pursuing a misspecified reward function can irreversibly change the state of its environment. If that change precludes optimization of the correctly specified reward function, then correction is futile. For example, a robotic factory assistant could break expensive equipment due to a reward misspecification; even if the designers immediately correct the reward function, the damage is done. To mitigate this risk, we introduce an approach that balances optimization of the primary reward function with preservation of the ability to optimize auxiliary reward functions. Surprisingly, even when the auxiliary reward functions are randomly generated and therefore uninformative about the correctly specified reward function, this approach induces conservative, effective behavior. |
Tasks | |
Published | 2019-02-26 |
URL | https://arxiv.org/abs/1902.09725v2 |
https://arxiv.org/pdf/1902.09725v2.pdf | |
PWC | https://paperswithcode.com/paper/conservative-agency-via-attainable-utility |
Repo | https://github.com/PartnershipOnAI/safelife |
Framework | none |
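The trade-off described in the Conservative Agency abstract above (optimize the primary reward while preserving the ability to optimize auxiliary rewards) can be sketched as a penalized reward. The specific form below is an assumption for illustration: it penalizes the mean absolute change in auxiliary action values relative to a no-op action, scaled by a coefficient `lam`. The function name `penalized_reward` and all argument names are hypothetical, and the abstract itself does not state this formula.

```python
from typing import Sequence


def penalized_reward(
    primary_reward: float,
    q_aux_action: Sequence[float],  # auxiliary Q-values after taking the candidate action
    q_aux_noop: Sequence[float],    # auxiliary Q-values after doing nothing (assumed baseline)
    lam: float,                     # penalty strength; larger means more conservative
) -> float:
    """Primary reward minus a penalty for shifting attainable auxiliary value."""
    if len(q_aux_action) != len(q_aux_noop) or not q_aux_action:
        raise ValueError("auxiliary value lists must be non-empty and of equal length")
    # Assumption: average the absolute shifts over the auxiliary reward functions.
    penalty = sum(abs(qa - qn) for qa, qn in zip(q_aux_action, q_aux_noop)) / len(q_aux_action)
    return primary_reward - lam * penalty


# An action that leaves auxiliary values untouched is not penalized.
safe = penalized_reward(1.0, [0.5, 1.0], [0.5, 1.0], lam=0.5)
# An action that destroys the ability to achieve the second auxiliary goal is penalized.
risky = penalized_reward(1.0, [0.5, 0.2], [0.5, 1.0], lam=0.5)
```

Under this sketch, an irreversible action lowers some auxiliary values far below the no-op baseline and is therefore discouraged even when the auxiliary reward functions are randomly generated.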