October 20, 2019

3004 words 15 mins read

Paper Group AWR 202



Learning to adapt: a meta-learning approach for speaker adaptation

Title Learning to adapt: a meta-learning approach for speaker adaptation
Authors Ondřej Klejch, Joachim Fainberg, Peter Bell
Abstract The performance of automatic speech recognition systems can be improved by adapting an acoustic model to compensate for the mismatch between training and testing conditions, for example by adapting to unseen speakers. The success of speaker adaptation methods relies on selecting weights that are suitable for adaptation and using good adaptation schedules to update these weights in order not to overfit to the adaptation data. In this paper we investigate a principled way of adapting all the weights of the acoustic model using meta-learning. We show that the meta-learner can learn to perform supervised and unsupervised speaker adaptation and that it outperforms a strong baseline adapting LHUC parameters when adapting a DNN AM with 1.5M parameters. We also report initial experiments on adapting TDNN AMs, where the meta-learner achieves comparable performance with LHUC.
Tasks Meta-Learning, Speech Recognition
Published 2018-08-30
URL http://arxiv.org/abs/1808.10239v1
PDF http://arxiv.org/pdf/1808.10239v1.pdf
PWC https://paperswithcode.com/paper/learning-to-adapt-a-meta-learning-approach
Repo https://github.com/choko/learning_to_adapt
Framework tf
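
Conceptually, the meta-learner replaces a hand-tuned adaptation schedule with a learned one. A minimal sketch of that idea, assuming a learned per-parameter step size (the paper's actual meta-learner is richer; all names below are illustrative):

```python
# Gradient-based speaker adaptation with learned per-parameter step sizes:
# element-wise learned rates decide which weights move and how far, instead
# of a single hand-tuned scalar. A sketch, not the paper's meta-learner.
import numpy as np

def adapt(params, grads, learned_lr):
    """One adaptation step on the adaptation data's gradients."""
    return {k: params[k] - learned_lr[k] * grads[k] for k in params}

# toy usage: two weight matrices with independent learned step sizes
rng = np.random.default_rng(0)
params = {"W1": rng.normal(size=(4, 4)), "W2": rng.normal(size=(4, 2))}
grads = {k: rng.normal(size=v.shape) for k, v in params.items()}
learned_lr = {k: np.abs(rng.normal(scale=0.01, size=v.shape)) for k, v in params.items()}
adapted = adapt(params, grads, learned_lr)
```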

Towards Fast Computation of Certified Robustness for ReLU Networks

Title Towards Fast Computation of Certified Robustness for ReLU Networks
Authors Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel
Abstract Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17]. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Currently available methods for computing such a bound are either time-consuming or deliver low-quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms, Fast-Lin and Fast-Lip, that are able to certify non-trivial lower bounds of minimum distortions, by bounding the ReLU units with appropriate linear functions (Fast-Lin) or by bounding the local Lipschitz constant (Fast-Lip). Experiments show that (1) our proposed methods deliver bounds close to the exact minimum distortion found by Reluplex (the gap is 2-3X) in small MNIST networks while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to the methods based on solving linear programming problems, but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that, in fact, there is no polynomial time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless $\mathsf{NP}$=$\mathsf{P}$, where $n$ is the number of neurons in the network.
Tasks
Published 2018-04-25
URL http://arxiv.org/abs/1804.09699v4
PDF http://arxiv.org/pdf/1804.09699v4.pdf
PWC https://paperswithcode.com/paper/towards-fast-computation-of-certified
Repo https://github.com/huanzhang12/CertifiedReLURobustness
Framework tf
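
For intuition, a much simpler (and looser) certification in the same spirit is interval bound propagation: push element-wise bounds through the network and check that no input in the $\ell_\infty$ ball can change the predicted label. This sketch is plain interval propagation, not Fast-Lin itself, which uses tighter linear bounds on each ReLU:

```python
# Certify a dense ReLU net under an l_inf perturbation of radius eps
# by propagating element-wise lower/upper bounds layer by layer.
import numpy as np

def ibp_bounds(x, eps, weights, biases):
    """Bounds on the output logits for all inputs within eps of x."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if i < len(weights) - 1:                 # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

def certified(x, eps, weights, biases, label):
    """Robust if the true logit's lower bound beats every other upper bound."""
    lo, hi = ibp_bounds(x, eps, weights, biases)
    return all(lo[label] > hi[j] for j in range(len(lo)) if j != label)
```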

Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering

Title Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering
Authors Wuwei Lan, Wei Xu
Abstract In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks. Although most of these models have claimed state-of-the-art performance, the original papers often reported on only one or two selected datasets. We provide a systematic study and show that (i) encoding contextual information by LSTM and inter-sentence interactions are critical, (ii) Tree-LSTM does not help as much as previously claimed but surprisingly improves performance on Twitter datasets, (iii) the Enhanced Sequential Inference Model is the best so far for larger datasets, while the Pairwise Word Interaction Model achieves the best performance when less data is available. We release our implementations as an open-source toolkit.
Tasks Natural Language Inference, Paraphrase Identification, Question Answering, Semantic Textual Similarity, Sentence Pair Modeling
Published 2018-06-12
URL http://arxiv.org/abs/1806.04330v2
PDF http://arxiv.org/pdf/1806.04330v2.pdf
PWC https://paperswithcode.com/paper/neural-network-models-for-paraphrase
Repo https://github.com/lanwuwei/SPM_toolkit
Framework pytorch
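
The compared models share a common skeleton: encode each sentence, combine the two representations, classify. A minimal PyTorch sketch of that skeleton (an InferSent-style combination, not one of the surveyed architectures verbatim):

```python
# Shared-encoder sentence pair model: one LSTM encodes both sentences,
# and the standard [u; v; |u-v|; u*v] features feed a linear classifier.
import torch
import torch.nn as nn

class SentencePairModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.enc = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.clf = nn.Linear(4 * hidden, n_classes)

    def encode(self, tokens):
        _, (h, _) = self.enc(self.emb(tokens))   # final hidden state
        return h[-1]

    def forward(self, sent_a, sent_b):
        u, v = self.encode(sent_a), self.encode(sent_b)
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=-1)
        return self.clf(feats)

# toy usage: a batch of 2 sentence pairs, 5 token ids each
model = SentencePairModel(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 5)), torch.randint(0, 1000, (2, 5)))
```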

Baidu Apollo Auto-Calibration System - An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm

Title Baidu Apollo Auto-Calibration System - An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm
Authors Fan Zhu, Lin Ma, Xin Xu, Dingfeng Guo, Xiao Cui, Qi Kong
Abstract For any autonomous driving vehicle, the control module determines its road performance and safety; its precision and stability must stay within a carefully designed range. Nonetheless, control algorithms require vehicle dynamics (such as longitudinal dynamics) as inputs, which, unfortunately, are difficult to calibrate in real time. As a result, to achieve reasonable performance, most, if not all, research-oriented autonomous vehicles are calibrated manually, one by one. Since manual calibration is not sustainable at the mass-production stage, we here introduce a machine-learning based auto-calibration system for autonomous driving vehicles. In this paper, we show how we build a data-driven longitudinal calibration procedure using machine learning techniques. We first generate offline calibration tables from human driving data. The offline table serves as an initial guess for later use and requires only twenty minutes of data collection and processing. We then use an online-learning algorithm to appropriately update the initial (offline) table based on real-time performance analysis. This longitudinal auto-calibration system has been deployed to more than one hundred Baidu Apollo self-driving vehicles (including hybrid family vehicles and electric delivery-only vehicles) since April 2018. By August 27, 2018, it had been tested for more than two thousand hours and ten thousand kilometers (6,213 miles) and proven to be effective.
Tasks Autonomous Driving, Autonomous Vehicles, Calibration
Published 2018-08-30
URL http://arxiv.org/abs/1808.10134v1
PDF http://arxiv.org/pdf/1808.10134v1.pdf
PWC https://paperswithcode.com/paper/baidu-apollo-auto-calibration-system-an
Repo https://github.com/purewater0901/carCalibration
Framework tf
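
The core data structure is easy to picture: a grid over (speed, control command) cells holding expected acceleration, seeded offline from logged driving data and nudged online toward observed behavior. This sketch is illustrative, not Apollo's implementation; the bin layout and update rule are assumptions:

```python
# A longitudinal calibration table with an offline initial guess and a
# simple online correction toward the acceleration actually measured.
import numpy as np

class CalibrationTable:
    def __init__(self, speed_bins, cmd_bins, offline_accel):
        self.speed_bins, self.cmd_bins = speed_bins, cmd_bins
        self.table = offline_accel.copy()        # offline initial guess

    def _cell(self, speed, cmd):
        i = np.clip(np.searchsorted(self.speed_bins, speed) - 1, 0, len(self.speed_bins) - 2)
        j = np.clip(np.searchsorted(self.cmd_bins, cmd) - 1, 0, len(self.cmd_bins) - 2)
        return i, j

    def lookup(self, speed, cmd):
        return self.table[self._cell(speed, cmd)]

    def update(self, speed, cmd, observed_accel, lr=0.05):
        """Online step: move the cell toward the observed acceleration."""
        i, j = self._cell(speed, cmd)
        self.table[i, j] += lr * (observed_accel - self.table[i, j])

# toy usage: 30 speed cells (m/s) x 20 throttle/brake command cells
table = CalibrationTable(np.linspace(0, 30, 31), np.linspace(-1, 1, 21), np.zeros((30, 20)))
table.update(speed=12.3, cmd=0.4, observed_accel=0.8)
```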

Bilinear Attention Networks

Title Bilinear Attention Networks
Authors Jin-Hwa Kim, Jaehyun Jun, Byoung-Tak Zhang
Abstract Attention networks in multimodal learning provide an efficient way to utilize given visual information selectively. However, the computational cost to learn attention distributions for every pair of multimodal input channels is prohibitively expensive. To solve this problem, co-attention builds two separate attention distributions, one per modality, neglecting the interaction between multimodal inputs. In this paper, we propose bilinear attention networks (BAN) that find bilinear attention distributions to utilize given vision-language information seamlessly. BAN considers bilinear interactions among two groups of input channels, while low-rank bilinear pooling extracts the joint representations for each pair of channels. Furthermore, we propose a variant of multimodal residual networks to exploit the eight attention maps of BAN efficiently. We quantitatively and qualitatively evaluate our model on the visual question answering (VQA 2.0) and Flickr30k Entities datasets, showing that BAN significantly outperforms previous methods and achieves a new state of the art on both datasets.
Tasks Visual Question Answering
Published 2018-05-21
URL http://arxiv.org/abs/1805.07932v2
PDF http://arxiv.org/pdf/1805.07932v2.pdf
PWC https://paperswithcode.com/paper/bilinear-attention-networks
Repo https://github.com/jnhwkim/ban-vqa
Framework pytorch
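
A stripped-down version of the bilinear attention step (the full BAN adds multiple glimpses and residual connections; the dimensions below are illustrative of VQA-style inputs):

```python
# Low-rank bilinear attention: projections score every (visual, word)
# pair, and the attention map weights a joint pooled feature.
import torch

def bilinear_attention(X, Y, U, V):
    """X: (n, dx) visual features, Y: (m, dy) word features,
    U: (dx, k), V: (dy, k) low-rank projection matrices."""
    logits = (X @ U) @ (Y @ V).T                     # (n, m) pair scores
    A = torch.softmax(logits.flatten(), 0).view_as(logits)
    # joint feature: attention-weighted sum of element-wise products
    f = torch.einsum("nm,nk,mk->k", A, X @ U, Y @ V)
    return A, f

# toy usage: 36 region features vs. 14 word embeddings
X, Y = torch.randn(36, 2048), torch.randn(14, 300)
U, V = torch.randn(2048, 512), torch.randn(300, 512)
A, f = bilinear_attention(X, Y, U, V)
```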

Energy-Based Hindsight Experience Prioritization

Title Energy-Based Hindsight Experience Prioritization
Authors Rui Zhao, Volker Tresp
Abstract In Hindsight Experience Replay (HER), a reinforcement learning agent is trained by treating whatever it has achieved as virtual goals. However, in previous work, the experience was replayed at random, without considering which episode might be the most valuable for learning. In this paper, we develop an energy-based framework for prioritizing hindsight experience in robotic manipulation tasks. Our approach is inspired by the work-energy principle in physics. We define a trajectory energy function as the sum of the transition energy of the target object over the trajectory. We hypothesize that replaying episodes that have high trajectory energy is more effective for reinforcement learning in robotics. To verify our hypothesis, we designed a framework for hindsight experience prioritization based on the trajectory energy of goal states. The trajectory energy function takes the potential, kinetic, and rotational energy into consideration. We evaluate our Energy-Based Prioritization (EBP) approach on four challenging robotic manipulation tasks in simulation. Our empirical results show that our proposed method surpasses state-of-the-art approaches in terms of both performance and sample-efficiency on all four tasks, without increasing computational time. A video showing experimental results is available at https://youtu.be/jtsF2tTeUGQ
Tasks
Published 2018-10-02
URL http://arxiv.org/abs/1810.01363v4
PDF http://arxiv.org/pdf/1810.01363v4.pdf
PWC https://paperswithcode.com/paper/energy-based-hindsight-experience
Repo https://github.com/ruizhaogit/EnergyBasedPrioritization
Framework none
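
A hedged sketch of the prioritization itself: score each episode by the object's summed transition energy and sample replays proportionally. Only potential and kinetic terms appear here (the paper also includes rotational energy), and the constants and clipping are assumptions:

```python
# Energy-Based Prioritization, simplified: high-energy trajectories of the
# target object are replayed more often than low-energy ones.
import numpy as np

def trajectory_energy(positions, velocities, mass=1.0, g=9.81):
    """positions, velocities: (T, 3) arrays for the target object."""
    potential = mass * g * np.diff(positions[:, 2])           # height changes
    kinetic = np.diff(0.5 * mass * (velocities ** 2).sum(1))  # speed changes
    return np.clip(potential + kinetic, 0, None).sum()        # count only gains

def sample_episode(episodes, rng):
    """episodes: list of (positions, velocities) pairs."""
    energies = np.array([trajectory_energy(p, v) for p, v in episodes])
    probs = energies / energies.sum() if energies.sum() > 0 else None
    return episodes[rng.choice(len(episodes), p=probs)]       # None -> uniform
```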

ITE: A Lightweight Implementation of Stratified Reasoning for Constructive Logical Operators

Title ITE: A Lightweight Implementation of Stratified Reasoning for Constructive Logical Operators
Authors Arnaud Gotlieb, Dusica Marijan, Helge Spieker
Abstract Constraint Programming (CP) is a powerful declarative programming paradigm where inference and search are interleaved to find feasible and optimal solutions to various types of constraint systems. However, handling logical connectors with constructive information in CP is notoriously difficult. This paper presents If Then Else (ITE), a lightweight implementation of stratified constructive reasoning for logical connectives. Stratification is introduced to cope with the risk of combinatorial explosion when constructing information from nested and combined logical operators. ITE is an open-source library built on top of SICStus Prolog clp(fd), which offers various operators, including constructive disjunction and negation, constructive implication, and conditionals. These operators can be used to express global constraints and to benefit from constructive reasoning for more domain pruning during constraint filtering. Even though ITE is not competitive with the specialized filtering algorithms available in some global constraint implementations, its expressiveness allows users to easily define well-tuned constraints with powerful deduction capabilities. Our extended experimental results show that ITE is more efficient than available generic approaches that handle logical constraint systems over finite domains.
Tasks
Published 2018-11-09
URL https://arxiv.org/abs/1811.03906v2
PDF https://arxiv.org/pdf/1811.03906v2.pdf
PWC https://paperswithcode.com/paper/stratified-constructive-disjunction-and
Repo https://github.com/ite4cp/ite
Framework none

Modular meta-learning in abstract graph networks for combinatorial generalization

Title Modular meta-learning in abstract graph networks for combinatorial generalization
Authors Ferran Alet, Maria Bauza, Alberto Rodriguez, Tomas Lozano-Perez, Leslie P. Kaelbling
Abstract Modular meta-learning is a new framework that generalizes to unseen datasets by combining a small set of neural modules in different ways. In this work we propose abstract graph networks: graphs used as abstractions of a system’s subparts, without the fixed assignment of nodes to subparts that would otherwise require supervision. We combine this idea with modular meta-learning to get a flexible framework with combinatorial generalization to new tasks built in. We then use it to model the pushing of arbitrarily shaped objects from little or no training data.
Tasks Meta-Learning
Published 2018-12-19
URL http://arxiv.org/abs/1812.07768v1
PDF http://arxiv.org/pdf/1812.07768v1.pdf
PWC https://paperswithcode.com/paper/modular-meta-learning-in-abstract-graph
Repo https://github.com/FerranAlet/modular-metalearning
Framework pytorch

Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects

Title Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects
Authors Michael A. Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, Anh Nguyen
Abstract Despite excellent performance on stationary test sets, deep neural networks (DNNs) can fail to generalize to out-of-distribution (OoD) inputs, including natural, non-adversarial ones, which are common in real-world settings. In this paper, we present a framework for discovering DNN failures that harnesses 3D renderers and 3D models. That is, we estimate the parameters of a 3D renderer that cause a target DNN to misbehave in response to the rendered image. Using our framework and a self-assembled dataset of 3D objects, we investigate the vulnerability of DNNs to OoD poses of well-known objects in ImageNet. For objects that are readily recognized by DNNs in their canonical poses, DNNs incorrectly classify 97% of their pose space. In addition, DNNs are highly sensitive to slight pose perturbations. Importantly, adversarial poses transfer across models and datasets. We find that 99.9% and 99.4% of the poses misclassified by Inception-v3 also transfer to the AlexNet and ResNet-50 image classifiers trained on the same ImageNet dataset, respectively, and 75.5% transfer to the YOLOv3 object detector trained on MS COCO.
Tasks
Published 2018-11-28
URL http://arxiv.org/abs/1811.11553v3
PDF http://arxiv.org/pdf/1811.11553v3.pdf
PWC https://paperswithcode.com/paper/strike-with-a-pose-neural-networks-are-easily
Repo https://github.com/airalcorn2/strike-with-a-pose
Framework pytorch
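
The search loop at the heart of the framework, reduced to random sampling over pose parameters (the paper also uses gradient-based estimation); `render` and `classify` are hypothetical stand-ins for a real 3D renderer binding and the target DNN:

```python
# Find out-of-distribution poses that fool a classifier by sampling
# renderer parameters and keeping poses whose renders are misclassified.
import numpy as np

def find_adversarial_poses(render, classify, true_label, n_trials=10_000, seed=0):
    rng = np.random.default_rng(seed)
    failures = []
    for _ in range(n_trials):
        pose = {
            "yaw": rng.uniform(-np.pi, np.pi),
            "pitch": rng.uniform(-np.pi / 2, np.pi / 2),
            "roll": rng.uniform(-np.pi, np.pi),
            "depth": rng.uniform(2.0, 10.0),   # distance from the camera
        }
        image = render(pose)                   # hypothetical renderer call
        if classify(image) != true_label:      # DNN misbehaves at this pose
            failures.append(pose)
    return failures
```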

BA-Net: Dense Bundle Adjustment Network

Title BA-Net: Dense Bundle Adjustment Network
Authors Chengzhou Tang, Ping Tan
Abstract This paper introduces a network architecture to solve the structure-from-motion (SfM) problem via feature-metric bundle adjustment (BA), which explicitly enforces multi-view geometry constraints in the form of feature-metric error. The whole pipeline is differentiable so that the network can learn suitable features that make the BA problem more tractable. Furthermore, this work introduces a novel depth parameterization to recover dense per-pixel depth. The network first generates several basis depth maps according to the input image and optimizes the final depth as a linear combination of these basis depth maps via feature-metric BA. The basis depth maps generator is also learned via end-to-end training. The whole system nicely combines domain knowledge (i.e. hard-coded multi-view geometry constraints) and deep learning (i.e. feature learning and basis depth maps learning) to address the challenging dense SfM problem. Experiments on large scale real data prove the success of the proposed method.
Tasks Depth And Camera Motion
Published 2018-06-13
URL https://arxiv.org/abs/1806.04807v3
PDF https://arxiv.org/pdf/1806.04807v3.pdf
PWC https://paperswithcode.com/paper/ba-net-dense-bundle-adjustment-network
Repo https://github.com/frobelbest/BANet
Framework tf
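
BA-Net's depth parameterization in miniature, sketched in PyTorch for brevity (the released code is TensorFlow): per-pixel depth is a linear combination of K basis maps, so only K weights are optimized. Plain gradient descent stands in for the paper's differentiable Levenberg-Marquardt solver, and `residual_fn` is a placeholder for the feature-metric reprojection error:

```python
# Optimize K combination weights over fixed basis depth maps instead of
# a full per-pixel depth map.
import torch

def optimize_depth(basis, residual_fn, steps=50, lr=0.1):
    """basis: (K, H, W) basis depth maps; residual_fn maps a depth
    map to a scalar error to be minimized."""
    w = torch.zeros(basis.shape[0], requires_grad=True)
    opt = torch.optim.SGD([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        depth = torch.einsum("k,khw->hw", w, basis)   # linear combination
        residual_fn(depth).backward()
        opt.step()
    return torch.einsum("k,khw->hw", w, basis).detach()

# toy usage with a dummy residual in place of the feature-metric error
basis = torch.rand(8, 4, 4)
depth = optimize_depth(basis, lambda d: ((d - 1.0) ** 2).mean())
```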

Pre-Defined Sparse Neural Networks with Hardware Acceleration

Title Pre-Defined Sparse Neural Networks with Hardware Acceleration
Authors Sourya Dey, Kuan-Wen Huang, Peter A. Beerel, Keith M. Chugg
Abstract Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing the time, energy, computational, and storage complexities associated with multilayer perceptrons. Pre-defined sparsity is proposed to reduce the complexity during both training and inference, regardless of the implementation platform. Our results show that storage and computational complexity can be reduced by factors greater than 5X without significant performance loss. The second contribution is an architecture for hardware acceleration that is compatible with pre-defined sparsity. This architecture supports both training and inference modes and is flexible in the sense that it is not tied to a specific number of neurons. For example, this flexibility implies that neural networks of various sizes can be supported on Field Programmable Gate Arrays (FPGAs) of various sizes.
Tasks
Published 2018-12-04
URL http://arxiv.org/abs/1812.01164v1
PDF http://arxiv.org/pdf/1812.01164v1.pdf
PWC https://paperswithcode.com/paper/pre-defined-sparse-neural-networks-with
Repo https://github.com/souryadey/predefinedsparse-nnets
Framework tf
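
Pre-defined sparsity in a few lines, sketched in PyTorch for brevity (the released code is TensorFlow): fix a binary connectivity mask before training and apply it on every forward pass, so the zeroed connections are never trained. The random mask here is a simplification; the paper uses structured connection patterns chosen for hardware friendliness:

```python
# A dense layer whose connectivity is fixed up front by a binary mask.
import torch
import torch.nn as nn

class PreDefinedSparseLinear(nn.Module):
    def __init__(self, in_features, out_features, density=0.2, seed=0):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        mask = torch.rand(out_features, in_features, generator=g) < density
        self.register_buffer("mask", mask.float())   # fixed, never learned
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # masked weights: pruned connections contribute nothing, always
        return x @ (self.weight * self.mask).T + self.bias
```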

MMA Training: Direct Input Space Margin Maximization through Adversarial Training

Title MMA Training: Direct Input Space Margin Maximization through Adversarial Training
Authors Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, Ruitong Huang
Abstract We study adversarial robustness of neural networks from a margin maximization perspective, where margins are defined as the distances from inputs to a classifier’s decision boundary. Our study shows that maximizing margins can be achieved by minimizing the adversarial loss on the decision boundary at the “shortest successful perturbation”, demonstrating a close connection between adversarial losses and the margins. We propose Max-Margin Adversarial (MMA) training to directly maximize the margins to achieve adversarial robustness. Instead of adversarial training with a fixed $\epsilon$, MMA offers an improvement by enabling adaptive selection of the “correct” $\epsilon$, namely the margin, individually for each datapoint. In addition, we rigorously analyze adversarial training from the perspective of margin maximization and provide an alternative interpretation: adversarial training maximizes either a lower or an upper bound on the margins. Our experiments empirically confirm our theory and demonstrate MMA training’s efficacy on the MNIST and CIFAR10 datasets w.r.t. $\ell_\infty$ and $\ell_2$ robustness. Code and models are available at https://github.com/BorealisAI/mma_training.
Tasks
Published 2018-12-06
URL https://arxiv.org/abs/1812.02637v4
PDF https://arxiv.org/pdf/1812.02637v4.pdf
PWC https://paperswithcode.com/paper/max-margin-adversarial-mma-training-direct
Repo https://github.com/BorealisAI/mma_training
Framework pytorch
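
The per-example margin selection, sketched: for a correctly classified input, bisect on $\epsilon$ to estimate the shortest successful perturbation and train at that scale. The bisection stands in for the paper's exact procedure, and `attack` is a placeholder for any attack that returns an adversarial example or None:

```python
# Estimate a per-datapoint margin by bisecting on the perturbation budget.
def estimate_margin(attack, model, x, y, eps_max=0.5, iters=10):
    """attack(model, x, y, eps) -> adversarial example, or None on failure."""
    lo, hi = 0.0, eps_max
    x_adv = None
    for _ in range(iters):
        mid = (lo + hi) / 2
        found = attack(model, x, y, mid)
        if found is None:
            lo = mid                  # attack failed: margin exceeds mid
        else:
            hi, x_adv = mid, found    # attack succeeded: shrink the budget
    return hi, x_adv                  # approximate margin and the example at it
```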

Deep Imbalanced Learning for Face Recognition and Attribute Prediction

Title Deep Imbalanced Learning for Face Recognition and Attribute Prediction
Authors Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang
Abstract Data for face analysis often exhibit a highly skewed class distribution, i.e., most data belong to a few majority classes, while the minority classes contain only a scarce number of instances. To mitigate this issue, contemporary deep learning methods typically follow classic strategies such as class re-sampling or cost-sensitive training. In this paper, we conduct extensive and systematic experiments to validate the effectiveness of these classic schemes for representation learning on class-imbalanced data. We further demonstrate that a more discriminative deep representation can be learned by enforcing a deep network to maintain inter-cluster margins both within and between classes. This tight constraint effectively reduces the class imbalance inherent in the local data neighborhood, thus carving much more balanced class boundaries locally. We show that it is easy to deploy angular margins between the cluster distributions on a hypersphere manifold. Such learned Cluster-based Large Margin Local Embedding (CLMLE), when combined with a simple k-nearest cluster algorithm, shows significant improvements in accuracy over existing methods on both face recognition and face attribute prediction tasks that exhibit imbalanced class distribution.
Tasks Face Recognition, Representation Learning
Published 2018-06-01
URL http://arxiv.org/abs/1806.00194v2
PDF http://arxiv.org/pdf/1806.00194v2.pdf
PWC https://paperswithcode.com/paper/deep-imbalanced-learning-for-face-recognition
Repo https://github.com/JoyLuo/face-attribute-recognition-paper-list
Framework none
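
The inference side described above, sketched: keep a few cluster centers per class in the embedding space and label a query by a majority vote over its k nearest clusters (the details here are illustrative):

```python
# k-nearest cluster classification over learned embedding-space centers.
import numpy as np

def knn_cluster_classify(query, centers, labels, k=3):
    """query: (d,), centers: (n_clusters, d), labels: (n_clusters,) ints."""
    dists = np.linalg.norm(centers - query, axis=1)
    nearest = labels[np.argsort(dists)[:k]]      # labels of k closest clusters
    return np.bincount(nearest).argmax()         # majority vote
```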

Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos

Title Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
Authors Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
Abstract Models and examples built with TensorFlow
Tasks Depth And Camera Motion, Depth Estimation, Motion Estimation, Robot Navigation
Published 2018-11-15
URL http://arxiv.org/abs/1811.06152v1
PDF http://arxiv.org/pdf/1811.06152v1.pdf
PWC https://paperswithcode.com/paper/depth-prediction-without-the-sensors
Repo https://github.com/tensorflow/models/tree/master/research/struct2depth
Framework tf

Hyperbolic Entailment Cones for Learning Hierarchical Embeddings

Title Hyperbolic Entailment Cones for Learning Hierarchical Embeddings
Authors Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann
Abstract Learning graph representations via low-dimensional embeddings that preserve relevant network properties is an important class of problems in machine learning. We here present a novel method to embed directed acyclic graphs. Following prior work, we first advocate for using hyperbolic spaces which provably model tree-like structures better than Euclidean geometry. Second, we view hierarchical relations as partial orders defined using a family of nested geodesically convex cones. We prove that these entailment cones admit an optimal shape with a closed form expression both in the Euclidean and hyperbolic spaces, and they canonically define the embedding learning process. Experiments show significant improvements of our method over strong recent baselines both in terms of representational capacity and generalization.
Tasks Graph Embedding, Hypernym Discovery, Link Prediction, Representation Learning
Published 2018-04-03
URL http://arxiv.org/abs/1804.01882v3
PDF http://arxiv.org/pdf/1804.01882v3.pdf
PWC https://paperswithcode.com/paper/hyperbolic-entailment-cones-for-learning
Repo https://github.com/dalab/hyperbolic_cones
Framework none
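
The containment test has a closed form; the sketch below transcribes the half-aperture and apex-angle expressions as given in the paper (from memory, so verify against the source before relying on them). A point v is entailed by u when the angle of v at apex u fits within u's half-aperture; the violation is the training energy:

```python
# Hyperbolic entailment cone check on the Poincare ball. The paper keeps
# embeddings in an annulus away from 0 and 1, which keeps these stable.
import numpy as np

K = 0.1  # aperture constant

def half_aperture(x):
    nx = np.linalg.norm(x)
    return np.arcsin(np.clip(K * (1 - nx ** 2) / nx, -1, 1))

def angle_at_apex(x, y):
    nx, ny, dot = np.linalg.norm(x), np.linalg.norm(y), x @ y
    num = dot * (1 + nx ** 2) - nx ** 2 * (1 + ny ** 2)
    den = nx * np.linalg.norm(x - y) * np.sqrt(1 + nx ** 2 * ny ** 2 - 2 * dot)
    return np.arccos(np.clip(num / den, -1, 1))

def entailment_energy(u, v):
    """Zero iff v lies inside the cone rooted at u."""
    return max(0.0, angle_at_apex(u, v) - half_aperture(u))
```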