October 20, 2019


Paper Group AWR 202


Learning to adapt: a meta-learning approach for speaker adaptation


Title	Learning to adapt: a meta-learning approach for speaker adaptation
Authors	Ondřej Klejch, Joachim Fainberg, Peter Bell
Abstract	The performance of automatic speech recognition systems can be improved by adapting an acoustic model to compensate for the mismatch between training and testing conditions, for example by adapting to unseen speakers. The success of speaker adaptation methods relies on selecting weights that are suitable for adaptation and using good adaptation schedules to update these weights so as not to overfit to the adaptation data. In this paper we investigate a principled way of adapting all the weights of the acoustic model using meta-learning. We show that the meta-learner can learn to perform supervised and unsupervised speaker adaptation and that it outperforms a strong baseline adapting LHUC parameters when adapting a DNN AM with 1.5M parameters. We also report initial experiments on adapting TDNN AMs, where the meta-learner achieves performance comparable to LHUC.
Tasks	Meta-Learning, Speech Recognition
Published	2018-08-30
URL	http://arxiv.org/abs/1808.10239v1
PDF	http://arxiv.org/pdf/1808.10239v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-adapt-a-meta-learning-approach
Repo	https://github.com/choko/learning_to_adapt
Framework	tf

Towards Fast Computation of Certified Robustness for ReLU Networks


Title	Towards Fast Computation of Certified Robustness for ReLU Networks
Authors	Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel
Abstract	Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17]. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Currently available methods of computing such a bound are either time-consuming or deliver low-quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms (Fast-Lin and Fast-Lip) that are able to certify non-trivial lower bounds of minimum distortions, by bounding the ReLU units with appropriate linear functions (Fast-Lin), or by bounding the local Lipschitz constant (Fast-Lip). Experiments show that (1) our proposed methods deliver bounds close to (the gap is 2-3X) the exact minimum distortion found by Reluplex in small MNIST networks while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to the methods based on solving linear programming problems but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that, in fact, there is no polynomial time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless $\mathsf{NP}=\mathsf{P}$, where $n$ is the number of neurons in the network.
Tasks
Published	2018-04-25
URL	http://arxiv.org/abs/1804.09699v4
PDF	http://arxiv.org/pdf/1804.09699v4.pdf
PWC	https://paperswithcode.com/paper/towards-fast-computation-of-certified
Repo	https://github.com/huanzhang12/CertifiedReLURobustness
Framework	tf
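
The abstract above describes bounding ReLU units with linear functions. A minimal sketch of that idea for a single neuron follows, assuming pre-activation bounds `lower` and `upper` are already known. It uses the standard chord-based relaxation (one shared slope for both bounding lines); the function names are hypothetical and this is not taken from the authors' repository, so the exact choice of bounding lines should be checked against the paper.

```python
def relu(z):
    """Plain ReLU on a scalar."""
    return max(0.0, z)


def relu_linear_bounds(lower, upper):
    """Return ((a_lo, b_lo), (a_up, b_up)) such that
    a_lo * z + b_lo <= relu(z) <= a_up * z + b_up for all z in [lower, upper].
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if upper <= 0:
        # Neuron is always inactive: relu(z) == 0 on the whole interval.
        return (0.0, 0.0), (0.0, 0.0)
    if lower >= 0:
        # Neuron is always active: relu(z) == z on the whole interval.
        return (1.0, 0.0), (1.0, 0.0)
    # Unstable neuron (lower < 0 < upper): the chord from (lower, 0) to
    # (upper, upper) lies above ReLU; the parallel line through the origin
    # lies below it.
    slope = upper / (upper - lower)
    return (slope, 0.0), (slope, -slope * lower)


if __name__ == "__main__":
    (a_lo, b_lo), (a_up, b_up) = relu_linear_bounds(-2.0, 3.0)
    for z in (-2.0, -1.0, 0.0, 1.5, 3.0):
        print(z, a_lo * z + b_lo, relu(z), a_up * z + b_up)
```

Replacing each unstable ReLU by such a pair of lines keeps the layer-to-layer relation linear, which is what allows output bounds to be propagated cheaply instead of solving an exact verification problem.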

Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering


Title	Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering
Authors	Wuwei Lan, Wei Xu
Abstract	In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks. Although most of these models have claimed state-of-the-art performance, the original papers often reported on only one or two selected datasets. We provide a systematic study and show that (i) encoding contextual information by LSTM and inter-sentence interactions are critical, (ii) Tree-LSTM does not help as much as previously claimed but surprisingly improves performance on Twitter datasets, (iii) the Enhanced Sequential Inference Model is the best so far for larger datasets, while the Pairwise Word Interaction Model achieves the best performance when less data is available. We release our implementations as an open-source toolkit.
Tasks	Natural Language Inference, Paraphrase Identification, Question Answering, Semantic Textual Similarity, Sentence Pair Modeling
Published	2018-06-12
URL	http://arxiv.org/abs/1806.04330v2
PDF	http://arxiv.org/pdf/1806.04330v2.pdf
PWC	https://paperswithcode.com/paper/neural-network-models-for-paraphrase
Repo	https://github.com/lanwuwei/SPM_toolkit
Framework	pytorch

Baidu Apollo Auto-Calibration System - An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm


Title	Baidu Apollo Auto-Calibration System - An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm
Authors	Fan Zhu, Lin Ma, Xin Xu, Dingfeng Guo, Xiao Cui, Qi Kong
Abstract	For any autonomous driving vehicle, the control module determines its road performance and safety, i.e. its precision and stability should stay within a carefully designed range. Nonetheless, control algorithms require vehicle dynamics (such as longitudinal dynamics) as inputs, which, unfortunately, are difficult to calibrate in real time. As a result, to achieve reasonable performance, most, if not all, research-oriented autonomous vehicles are calibrated manually, one by one. Since manual calibration is not sustainable once a vehicle enters the mass production stage for industrial purposes, we here introduce a machine-learning based auto-calibration system for autonomous driving vehicles. In this paper, we show how we build a data-driven longitudinal calibration procedure using machine learning techniques. We first generate offline calibration tables from human driving data. The offline table serves as an initial guess for later use and needs only twenty minutes of data collection and processing. We then use an online-learning algorithm to appropriately update the initial table (the offline table) based on real-time performance analysis. This longitudinal auto-calibration system has been deployed to more than one hundred Baidu Apollo self-driving vehicles (including hybrid family vehicles and electronic delivery-only vehicles) since April 2018. By August 27, 2018, it had been tested for more than two thousand hours and ten thousand kilometers (6,213 miles) and has proven to be effective.
Tasks	Autonomous Driving, Autonomous Vehicles, Calibration
Published	2018-08-30
URL	http://arxiv.org/abs/1808.10134v1
PDF	http://arxiv.org/pdf/1808.10134v1.pdf
PWC	https://paperswithcode.com/paper/baidu-apollo-auto-calibration-system-an
Repo	https://github.com/purewater0901/carCalibration
Framework	tf

Bilinear Attention Networks


Title	Bilinear Attention Networks
Authors	Jin-Hwa Kim, Jaehyun Jun, Byoung-Tak Zhang
Abstract	Attention networks in multimodal learning provide an efficient way to utilize given visual information selectively. However, the computational cost to learn attention distributions for every pair of multimodal input channels is prohibitively expensive. To solve this problem, co-attention builds two separate attention distributions for each modality neglecting the interaction between multimodal inputs. In this paper, we propose bilinear attention networks (BAN) that find bilinear attention distributions to utilize given vision-language information seamlessly. BAN considers bilinear interactions among two groups of input channels, while low-rank bilinear pooling extracts the joint representations for each pair of channels. Furthermore, we propose a variant of multimodal residual networks to exploit eight-attention maps of the BAN efficiently. We quantitatively and qualitatively evaluate our model on visual question answering (VQA 2.0) and Flickr30k Entities datasets, showing that BAN significantly outperforms previous methods and achieves a new state of the art on both datasets.
Tasks	Visual Question Answering
Published	2018-05-21
URL	http://arxiv.org/abs/1805.07932v2
PDF	http://arxiv.org/pdf/1805.07932v2.pdf
PWC	https://paperswithcode.com/paper/bilinear-attention-networks
Repo	https://github.com/jnhwkim/ban-vqa
Framework	pytorch

Energy-Based Hindsight Experience Prioritization


Title	Energy-Based Hindsight Experience Prioritization
Authors	Rui Zhao, Volker Tresp
Abstract	In Hindsight Experience Replay (HER), a reinforcement learning agent is trained by treating whatever it has achieved as virtual goals. However, in previous work, the experience was replayed at random, without considering which episode might be the most valuable for learning. In this paper, we develop an energy-based framework for prioritizing hindsight experience in robotic manipulation tasks. Our approach is inspired by the work-energy principle in physics. We define a trajectory energy function as the sum of the transition energy of the target object over the trajectory. We hypothesize that replaying episodes that have high trajectory energy is more effective for reinforcement learning in robotics. To verify our hypothesis, we designed a framework for hindsight experience prioritization based on the trajectory energy of goal states. The trajectory energy function takes the potential, kinetic, and rotational energy into consideration. We evaluate our Energy-Based Prioritization (EBP) approach on four challenging robotic manipulation tasks in simulation. Our empirical results show that our proposed method surpasses state-of-the-art approaches in terms of both performance and sample-efficiency on all four tasks, without increasing computational time. A video showing experimental results is available at https://youtu.be/jtsF2tTeUGQ
Tasks
Published	2018-10-02
URL	http://arxiv.org/abs/1810.01363v4
PDF	http://arxiv.org/pdf/1810.01363v4.pdf
PWC	https://paperswithcode.com/paper/energy-based-hindsight-experience
Repo	https://github.com/ruizhaogit/EnergyBasedPrioritization
Framework	none
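
The abstract above defines trajectory energy as the sum of the target object's transition energies over an episode. A minimal sketch of that bookkeeping follows, under several assumptions that are not confirmed by the source: rotational energy is left out for brevity, a transition's energy is taken to be the non-negative increase in (potential + kinetic) energy between consecutive steps, and episodes are sampled with probability proportional to their trajectory energy. All function names are hypothetical.

```python
import random

G = 9.81  # gravitational acceleration in m/s^2


def object_energy(mass, height, velocity):
    """Potential plus kinetic energy of the object at one time step.
    Rotational energy is omitted here (an assumption made for brevity)."""
    speed_sq = sum(v * v for v in velocity)
    return mass * G * height + 0.5 * mass * speed_sq


def trajectory_energy(mass, heights, velocities):
    """Sum of transition energies over a trajectory.
    Assumption: a transition's energy is the energy increase, clipped at 0."""
    energies = [object_energy(mass, h, v) for h, v in zip(heights, velocities)]
    return sum(max(0.0, e1 - e0) for e0, e1 in zip(energies, energies[1:]))


def sample_episode(trajectory_energies, rng=random):
    """Pick an episode index with probability proportional to its energy.
    Falls back to uniform sampling when every episode has zero energy."""
    if sum(trajectory_energies) <= 0:
        return rng.randrange(len(trajectory_energies))
    indices = range(len(trajectory_energies))
    return rng.choices(indices, weights=trajectory_energies, k=1)[0]


if __name__ == "__main__":
    still = [(0.0, 0.0, 0.0)] * 3
    lifted = trajectory_energy(1.0, [0.0, 0.5, 1.0], still)
    resting = trajectory_energy(1.0, [0.0, 0.0, 0.0], still)
    print(lifted, resting, sample_episode([resting, lifted]))
```

In this sketch an episode where the object never moves gets zero energy and is never replayed as long as some other episode has positive energy, which mirrors the hypothesis that high-energy episodes are the more informative ones.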

ITE: A Lightweight Implementation of Stratified Reasoning for Constructive Logical Operators


Title	ITE: A Lightweight Implementation of Stratified Reasoning for Constructive Logical Operators
Authors	Arnaud Gotlieb, Dusica Marijan, Helge Spieker
Abstract	Constraint Programming (CP) is a powerful declarative programming paradigm where inference and search are interleaved to find feasible and optimal solutions to various types of constraint systems. However, handling logical connectors with constructive information in CP is notoriously difficult. This paper presents If Then Else (ITE), a lightweight implementation of stratified constructive reasoning for logical connectives. Stratification is introduced to cope with the risk of combinatorial explosion of constructing information from nested and combined logical operators. ITE is an open-source library built on top of SICStus Prolog clp(fd), which proposes various operators, including constructive disjunction and negation, constructive implication and conditional. These operators can be used to express global constraints and to benefit from constructive reasoning for more domain pruning during constraint filtering. Even though ITE is not competitive with specialized filtering algorithms available in some global constraints implementations, its expressiveness allows users to easily define well-tuned constraints with powerful deduction capabilities. Our extended experimental results show that ITE is more efficient than available generic approaches that handle logical constraint systems over finite domains.
Tasks
Published	2018-11-09
URL	https://arxiv.org/abs/1811.03906v2
PDF	https://arxiv.org/pdf/1811.03906v2.pdf
PWC	https://paperswithcode.com/paper/stratified-constructive-disjunction-and
Repo	https://github.com/ite4cp/ite
Framework	none

Modular meta-learning in abstract graph networks for combinatorial generalization


Title	Modular meta-learning in abstract graph networks for combinatorial generalization
Authors	Ferran Alet, Maria Bauza, Alberto Rodriguez, Tomas Lozano-Perez, Leslie P. Kaelbling
Abstract	Modular meta-learning is a new framework that generalizes to unseen datasets by combining a small set of neural modules in different ways. In this work we propose abstract graph networks: using graphs as abstractions of a system’s subparts without a fixed assignment of nodes to system subparts, for which we would need supervision. We combine this idea with modular meta-learning to get a flexible framework with combinatorial generalization to new tasks built in. We then use it to model the pushing of arbitrarily shaped objects from little or no training data.
Tasks	Meta-Learning
Published	2018-12-19
URL	http://arxiv.org/abs/1812.07768v1
PDF	http://arxiv.org/pdf/1812.07768v1.pdf
PWC	https://paperswithcode.com/paper/modular-meta-learning-in-abstract-graph
Repo	https://github.com/FerranAlet/modular-metalearning
Framework	pytorch

Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects


Title	Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects
Authors	Michael A. Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, Anh Nguyen
Abstract	Despite excellent performance on stationary test sets, deep neural networks (DNNs) can fail to generalize to out-of-distribution (OoD) inputs, including natural, non-adversarial ones, which are common in real-world settings. In this paper, we present a framework for discovering DNN failures that harnesses 3D renderers and 3D models. That is, we estimate the parameters of a 3D renderer that cause a target DNN to misbehave in response to the rendered image. Using our framework and a self-assembled dataset of 3D objects, we investigate the vulnerability of DNNs to OoD poses of well-known objects in ImageNet. For objects that are readily recognized by DNNs in their canonical poses, DNNs incorrectly classify 97% of their pose space. In addition, DNNs are highly sensitive to slight pose perturbations. Importantly, adversarial poses transfer across models and datasets. We find that 99.9% and 99.4% of the poses misclassified by Inception-v3 also transfer to the AlexNet and ResNet-50 image classifiers trained on the same ImageNet dataset, respectively, and 75.5% transfer to the YOLOv3 object detector trained on MS COCO.
Tasks
Published	2018-11-28
URL	http://arxiv.org/abs/1811.11553v3
PDF	http://arxiv.org/pdf/1811.11553v3.pdf
PWC	https://paperswithcode.com/paper/strike-with-a-pose-neural-networks-are-easily
Repo	https://github.com/airalcorn2/strike-with-a-pose
Framework	pytorch

BA-Net: Dense Bundle Adjustment Network


Title	BA-Net: Dense Bundle Adjustment Network
Authors	Chengzhou Tang, Ping Tan
Abstract	This paper introduces a network architecture to solve the structure-from-motion (SfM) problem via feature-metric bundle adjustment (BA), which explicitly enforces multi-view geometry constraints in the form of feature-metric error. The whole pipeline is differentiable so that the network can learn suitable features that make the BA problem more tractable. Furthermore, this work introduces a novel depth parameterization to recover dense per-pixel depth. The network first generates several basis depth maps according to the input image and optimizes the final depth as a linear combination of these basis depth maps via feature-metric BA. The basis depth maps generator is also learned via end-to-end training. The whole system nicely combines domain knowledge (i.e. hard-coded multi-view geometry constraints) and deep learning (i.e. feature learning and basis depth maps learning) to address the challenging dense SfM problem. Experiments on large scale real data prove the success of the proposed method.
Tasks	Depth And Camera Motion
Published	2018-06-13
URL	https://arxiv.org/abs/1806.04807v3
PDF	https://arxiv.org/pdf/1806.04807v3.pdf
PWC	https://paperswithcode.com/paper/ba-net-dense-bundle-adjustment-network
Repo	https://github.com/frobelbest/BANet
Framework	tf
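
The depth parameterization described in the abstract above represents the final depth as a linear combination of basis depth maps. A minimal sketch of that combination step follows, using nested lists as images. The function name is hypothetical, the weights are given rather than optimized, and any non-linearity or normalization the actual network may apply is omitted.

```python
def combine_basis_depth_maps(basis_maps, weights):
    """Return the per-pixel weighted sum of equally sized basis depth maps.

    basis_maps: list of 2D lists (rows x cols), one per basis map.
    weights: one scalar per basis map.
    """
    if not basis_maps or len(basis_maps) != len(weights):
        raise ValueError("need one weight per basis depth map")
    rows, cols = len(basis_maps[0]), len(basis_maps[0][0])
    return [
        [sum(w * b[r][c] for w, b in zip(weights, basis_maps)) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    bases = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
    print(combine_basis_depth_maps(bases, [2.0, 0.5]))
```

The point of the parameterization is that a dense depth map with one unknown per pixel is reduced to one unknown per basis map, so the optimization only has to solve for a handful of weights.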

Pre-Defined Sparse Neural Networks with Hardware Acceleration


Title	Pre-Defined Sparse Neural Networks with Hardware Acceleration
Authors	Sourya Dey, Kuan-Wen Huang, Peter A. Beerel, Keith M. Chugg
Abstract	Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing the time, energy, computational, and storage complexities associated with multilayer perceptrons. Pre-defined sparsity is proposed to reduce the complexity during both training and inference, regardless of the implementation platform. Our results show that storage and computational complexity can be reduced by factors greater than 5X without significant performance loss. The second contribution is an architecture for hardware acceleration that is compatible with pre-defined sparsity. This architecture supports both training and inference modes and is flexible in the sense that it is not tied to a specific number of neurons. For example, this flexibility implies that neural networks of various sizes can be supported on Field Programmable Gate Arrays (FPGAs) of various sizes.
Tasks
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01164v1
PDF	http://arxiv.org/pdf/1812.01164v1.pdf
PWC	https://paperswithcode.com/paper/pre-defined-sparse-neural-networks-with
Repo	https://github.com/souryadey/predefinedsparse-nnets
Framework	tf
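
Pre-defined sparsity, as described in the abstract above, fixes which connections exist before training starts. A minimal sketch for one fully connected layer follows. It assumes each output neuron keeps a fixed number of randomly chosen inputs (`fan_in`); the random selection and the function names are illustrative assumptions, not the connection patterns or code from the authors' repository.

```python
import random


def make_predefined_mask(n_in, n_out, fan_in, seed=0):
    """Build a fixed 0/1 connectivity mask with exactly fan_in inputs per output.
    The mask is chosen once, before training, and never changes afterwards."""
    if not 0 < fan_in <= n_in:
        raise ValueError("fan_in must be between 1 and n_in")
    rng = random.Random(seed)
    mask = []
    for _ in range(n_out):
        kept = set(rng.sample(range(n_in), fan_in))
        mask.append([1 if i in kept else 0 for i in range(n_in)])
    return mask


def sparse_forward(x, weights, mask, bias):
    """Forward pass of a linear layer that only uses the connections in mask."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        for w_row, m_row, b in zip(weights, mask, bias)
    ]


if __name__ == "__main__":
    mask = make_predefined_mask(n_in=8, n_out=4, fan_in=2)
    weights = [[1.0] * 8 for _ in range(4)]
    print(sparse_forward([1.0] * 8, weights, mask, [0.0] * 4))
```

With `fan_in=2` out of 8 inputs, only a quarter of the weights ever need to be stored or multiplied. A real implementation would store just the kept weights instead of multiplying by a mask, and would apply the same mask to the gradients during training.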

MMA Training: Direct Input Space Margin Maximization through Adversarial Training


Title	MMA Training: Direct Input Space Margin Maximization through Adversarial Training
Authors	Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, Ruitong Huang
Abstract	We study adversarial robustness of neural networks from a margin maximization perspective, where margins are defined as the distances from inputs to a classifier’s decision boundary. Our study shows that maximizing margins can be achieved by minimizing the adversarial loss on the decision boundary at the “shortest successful perturbation”, demonstrating a close connection between adversarial losses and the margins. We propose Max-Margin Adversarial (MMA) training to directly maximize the margins to achieve adversarial robustness. Instead of adversarial training with a fixed $\epsilon$, MMA offers an improvement by enabling adaptive selection of the “correct” $\epsilon$ as the margin individually for each datapoint. In addition, we rigorously analyze adversarial training with the perspective of margin maximization, and provide an alternative interpretation for adversarial training, maximizing either a lower or an upper bound of the margins. Our experiments empirically confirm our theory and demonstrate MMA training’s efficacy on the MNIST and CIFAR10 datasets w.r.t. $\ell_\infty$ and $\ell_2$ robustness. Code and models are available at https://github.com/BorealisAI/mma_training.
Tasks
Published	2018-12-06
URL	https://arxiv.org/abs/1812.02637v4
PDF	https://arxiv.org/pdf/1812.02637v4.pdf
PWC	https://paperswithcode.com/paper/max-margin-adversarial-mma-training-direct
Repo	https://github.com/BorealisAI/mma_training
Framework	pytorch
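
The abstract above defines the margin as the distance from an input to the classifier's decision boundary. For a linear binary classifier this distance has a closed form, which the sketch below computes for the $\ell_2$ norm. This illustrates the margin definition only; it is not the MMA training procedure, and the function name is hypothetical.

```python
import math


def l2_margin_linear(w, b, x):
    """Distance from x to the hyperplane w . x + b = 0, measured in the l2 norm."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("w must be non-zero")
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / norm


if __name__ == "__main__":
    w, b = [3.0, 4.0], 0.0
    for x in ([3.0, 4.0], [0.3, 0.4], [4.0, -3.0]):
        print(x, l2_margin_linear(w, b, x))
```

Each input gets its own margin, so a single fixed $\epsilon$ is too large for points near the boundary and too small for points far from it. For deep networks no such closed form exists, which is why the abstract speaks of finding the "shortest successful perturbation" instead.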

Deep Imbalanced Learning for Face Recognition and Attribute Prediction


Title	Deep Imbalanced Learning for Face Recognition and Attribute Prediction
Authors	Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang
Abstract	Data for face analysis often exhibit highly-skewed class distribution, i.e., most data belong to a few majority classes, while the minority classes only contain a scarce amount of instances. To mitigate this issue, contemporary deep learning methods typically follow classic strategies such as class re-sampling or cost-sensitive training. In this paper, we conduct extensive and systematic experiments to validate the effectiveness of these classic schemes for representation learning on class-imbalanced data. We further demonstrate that more discriminative deep representation can be learned by enforcing a deep network to maintain inter-cluster margins both within and between classes. This tight constraint effectively reduces the class imbalance inherent in the local data neighborhood, thus carving much more balanced class boundaries locally. We show that it is easy to deploy angular margins between the cluster distributions on a hypersphere manifold. Such learned Cluster-based Large Margin Local Embedding (CLMLE), when combined with a simple k-nearest cluster algorithm, shows significant improvements in accuracy over existing methods on both face recognition and face attribute prediction tasks that exhibit imbalanced class distribution.
Tasks	Face Recognition, Representation Learning
Published	2018-06-01
URL	http://arxiv.org/abs/1806.00194v2
PDF	http://arxiv.org/pdf/1806.00194v2.pdf
PWC	https://paperswithcode.com/paper/deep-imbalanced-learning-for-face-recognition
Repo	https://github.com/JoyLuo/face-attribute-recognition-paper-list
Framework	none

Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos


Title	Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
Authors	Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
Abstract	Models and examples built with TensorFlow
Tasks	Depth And Camera Motion, Depth Estimation, Motion Estimation, Robot Navigation
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06152v1
PDF	http://arxiv.org/pdf/1811.06152v1.pdf
PWC	https://paperswithcode.com/paper/depth-prediction-without-the-sensors
Repo	https://github.com/tensorflow/models/tree/master/research/struct2depth
Framework	tf

Hyperbolic Entailment Cones for Learning Hierarchical Embeddings


Title	Hyperbolic Entailment Cones for Learning Hierarchical Embeddings
Authors	Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann
Abstract	Learning graph representations via low-dimensional embeddings that preserve relevant network properties is an important class of problems in machine learning. We here present a novel method to embed directed acyclic graphs. Following prior work, we first advocate for using hyperbolic spaces which provably model tree-like structures better than Euclidean geometry. Second, we view hierarchical relations as partial orders defined using a family of nested geodesically convex cones. We prove that these entailment cones admit an optimal shape with a closed form expression both in the Euclidean and hyperbolic spaces, and they canonically define the embedding learning process. Experiments show significant improvements of our method over strong recent baselines both in terms of representational capacity and generalization.
Tasks	Graph Embedding, Hypernym Discovery, Link Prediction, Representation Learning
Published	2018-04-03
URL	http://arxiv.org/abs/1804.01882v3
PDF	http://arxiv.org/pdf/1804.01882v3.pdf
PWC	https://paperswithcode.com/paper/hyperbolic-entailment-cones-for-learning
Repo	https://github.com/dalab/hyperbolic_cones
Framework	none