Paper Group AWR 202
Learning to adapt: a meta-learning approach for speaker adaptation
Title | Learning to adapt: a meta-learning approach for speaker adaptation |
Authors | Ondřej Klejch, Joachim Fainberg, Peter Bell |
Abstract | The performance of automatic speech recognition systems can be improved by adapting an acoustic model to compensate for the mismatch between training and testing conditions, for example by adapting to unseen speakers. The success of speaker adaptation methods relies on selecting weights that are suitable for adaptation and using good adaptation schedules to update these weights in order not to overfit to the adaptation data. In this paper we investigate a principled way of adapting all the weights of the acoustic model using meta-learning. We show that the meta-learner can learn to perform supervised and unsupervised speaker adaptation and that it outperforms a strong baseline adapting LHUC parameters when adapting a DNN AM with 1.5M parameters. We also report initial experiments on adapting TDNN AMs, where the meta-learner achieves comparable performance with LHUC. |
Tasks | Meta-Learning, Speech Recognition |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10239v1, http://arxiv.org/pdf/1808.10239v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-adapt-a-meta-learning-approach |
Repo | https://github.com/choko/learning_to_adapt |
Framework | tf |
Towards Fast Computation of Certified Robustness for ReLU Networks
Title | Towards Fast Computation of Certified Robustness for ReLU Networks |
Authors | Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Duane Boning, Inderjit S. Dhillon, Luca Daniel |
Abstract | Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17]. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Currently available methods of computing such a bound are either time-consuming or deliver low-quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient algorithms, Fast-Lin and Fast-Lip, that are able to certify non-trivial lower bounds of minimum distortions, by bounding the ReLU units with appropriate linear functions (Fast-Lin), or by bounding the local Lipschitz constant (Fast-Lip). Experiments show that (1) our proposed methods deliver bounds close to (the gap is 2-3X) the exact minimum distortion found by Reluplex in small MNIST networks while our algorithms are more than 10,000 times faster; (2) our methods deliver similar quality of bounds (the gap is within 35% and usually around 10%; sometimes our bounds are even better) for larger networks compared to the methods based on solving linear programming problems but our algorithms are 33-14,000 times faster; (3) our method is capable of solving large MNIST and CIFAR networks up to 7 layers with more than 10,000 neurons within tens of seconds on a single CPU core. In addition, we show that, in fact, there is no polynomial time algorithm that can approximately find the minimum $\ell_1$ adversarial distortion of a ReLU network with a $0.99\ln n$ approximation ratio unless $\mathsf{NP}=\mathsf{P}$, where $n$ is the number of neurons in the network. |
Tasks | |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09699v4, http://arxiv.org/pdf/1804.09699v4.pdf |
PWC | https://paperswithcode.com/paper/towards-fast-computation-of-certified |
Repo | https://github.com/huanzhang12/CertifiedReLURobustness |
Framework | tf |
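The idea of bounding ReLU units with linear functions, as mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes that a pre-activation value x is already known to lie in an interval [l, u], and it sandwiches max(x, 0) between two parallel lines of slope u / (u - l) when l < 0 < u. The function name `relu_linear_bounds` and the return layout are hypothetical, chosen for illustration; this is not taken from the authors' repository.

```python
def relu_linear_bounds(l, u):
    """Return ((a_lo, b_lo), (a_up, b_up)) such that
    a_lo * x + b_lo <= max(x, 0) <= a_up * x + b_up for every x in [l, u].

    Illustrative sketch only: the interval [l, u] is assumed to be given.
    """
    if l > u:
        raise ValueError("need l <= u")
    if u <= 0:
        # Unit is always inactive on [l, u]: ReLU is identically zero.
        return (0.0, 0.0), (0.0, 0.0)
    if l >= 0:
        # Unit is always active on [l, u]: ReLU is the identity.
        return (1.0, 0.0), (1.0, 0.0)
    # Unstable unit (l < 0 < u): two parallel lines with the same slope.
    slope = u / (u - l)
    lower = (slope, 0.0)         # passes through the origin
    upper = (slope, -slope * l)  # chord through (l, 0) and (u, u)
    return lower, upper


if __name__ == "__main__":
    print(relu_linear_bounds(-1.0, 3.0))
```

The upper line is the chord of the ReLU over [l, u], so it lies above the convex function on that interval; the lower line has slope at most 1 and passes through the origin, so it stays below.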
Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering
Title | Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering |
Authors | Wuwei Lan, Wei Xu |
Abstract | In this paper, we analyze several neural network designs (and their variations) for sentence pair modeling and compare their performance extensively across eight datasets, including paraphrase identification, semantic textual similarity, natural language inference, and question answering tasks. Although most of these models have claimed state-of-the-art performance, the original papers often reported on only one or two selected datasets. We provide a systematic study and show that (i) encoding contextual information by LSTM and inter-sentence interactions are critical, (ii) Tree-LSTM does not help as much as previously claimed but surprisingly improves performance on Twitter datasets, (iii) the Enhanced Sequential Inference Model is the best so far for larger datasets, while the Pairwise Word Interaction Model achieves the best performance when less data is available. We release our implementations as an open-source toolkit. |
Tasks | Natural Language Inference, Paraphrase Identification, Question Answering, Semantic Textual Similarity, Sentence Pair Modeling |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04330v2, http://arxiv.org/pdf/1806.04330v2.pdf |
PWC | https://paperswithcode.com/paper/neural-network-models-for-paraphrase |
Repo | https://github.com/lanwuwei/SPM_toolkit |
Framework | pytorch |
Baidu Apollo Auto-Calibration System - An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm
Title | Baidu Apollo Auto-Calibration System - An Industry-Level Data-Driven and Learning based Vehicle Longitude Dynamic Calibrating Algorithm |
Authors | Fan Zhu, Lin Ma, Xin Xu, Dingfeng Guo, Xiao Cui, Qi Kong |
Abstract | For any autonomous driving vehicle, the control module determines its road performance and safety, i.e., its precision and stability should stay within a carefully designed range. Nonetheless, control algorithms require vehicle dynamics (such as longitudinal dynamics) as inputs, which, unfortunately, are difficult to calibrate in real time. As a result, to achieve reasonable performance, most, if not all, research-oriented autonomous vehicles are calibrated manually, one by one. Since manual calibration is not sustainable once vehicles enter the mass production stage for industrial purposes, we here introduce a machine-learning-based auto-calibration system for autonomous driving vehicles. In this paper, we show how we build a data-driven longitudinal calibration procedure using machine learning techniques. We first generate offline calibration tables from human driving data. The offline table serves as an initial guess for later use and needs only twenty minutes of data collection and processing. We then use an online-learning algorithm to update the initial (offline) table based on real-time performance analysis. This longitudinal auto-calibration system has been deployed to more than one hundred Baidu Apollo self-driving vehicles (including hybrid family vehicles and electronic delivery-only vehicles) since April 2018. By August 27, 2018, it had been tested for more than two thousand hours and ten thousand kilometers (6,213 miles) and had proven to be effective. |
Tasks | Autonomous Driving, Autonomous Vehicles, Calibration |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10134v1, http://arxiv.org/pdf/1808.10134v1.pdf |
PWC | https://paperswithcode.com/paper/baidu-apollo-auto-calibration-system-an |
Repo | https://github.com/purewater0901/carCalibration |
Framework | tf |
Bilinear Attention Networks
Title | Bilinear Attention Networks |
Authors | Jin-Hwa Kim, Jaehyun Jun, Byoung-Tak Zhang |
Abstract | Attention networks in multimodal learning provide an efficient way to utilize given visual information selectively. However, the computational cost to learn attention distributions for every pair of multimodal input channels is prohibitively expensive. To solve this problem, co-attention builds two separate attention distributions for each modality, neglecting the interaction between multimodal inputs. In this paper, we propose bilinear attention networks (BAN) that find bilinear attention distributions to utilize given vision-language information seamlessly. BAN considers bilinear interactions among two groups of input channels, while low-rank bilinear pooling extracts the joint representations for each pair of channels. Furthermore, we propose a variant of multimodal residual networks to exploit eight attention maps of the BAN efficiently. We quantitatively and qualitatively evaluate our model on visual question answering (VQA 2.0) and Flickr30k Entities datasets, showing that BAN significantly outperforms previous methods and achieves new state-of-the-art results on both datasets. |
Tasks | Visual Question Answering |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07932v2, http://arxiv.org/pdf/1805.07932v2.pdf |
PWC | https://paperswithcode.com/paper/bilinear-attention-networks |
Repo | https://github.com/jnhwkim/ban-vqa |
Framework | pytorch |
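The low-rank bilinear pooling that the abstract above refers to can be sketched in a few lines. The sketch assumes the common formulation in which two input vectors are each projected into a shared low-rank space, multiplied element-wise, and projected again: f = P^T (U^T x ∘ V^T y). The names `matvec_t` and `low_rank_bilinear`, the plain-list matrix layout, and the omission of nonlinearities and attention weights are all simplifying assumptions, not the authors' implementation.

```python
def matvec_t(m, v):
    """Compute m^T v, where m is a matrix stored as a list of rows."""
    if len(m) != len(v):
        raise ValueError("row count of m must match length of v")
    n_cols = len(m[0])
    return [sum(m[i][j] * v[i] for i in range(len(v))) for j in range(n_cols)]


def low_rank_bilinear(x, y, U, V, P):
    """Joint representation of x and y via low-rank bilinear pooling.

    U: len(x) x d, V: len(y) x d, P: d x c (all hypothetical shapes).
    """
    hx = matvec_t(U, x)                    # project x into the rank-d space
    hy = matvec_t(V, y)                    # project y into the same space
    joint = [a * b for a, b in zip(hx, hy)]  # element-wise (Hadamard) product
    return matvec_t(P, joint)              # pool down to c output features


if __name__ == "__main__":
    U = [[1, 0], [0, 1]]
    V = [[1, 0], [0, 1], [1, 1]]
    P = [[1], [1]]
    print(low_rank_bilinear([1, 2], [3, 4, 5], U, V, P))
```

Compared with a full bilinear form, which needs one len(x) × len(y) weight matrix per output feature, this factorization stores only the three smaller matrices U, V, and P.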
Energy-Based Hindsight Experience Prioritization
Title | Energy-Based Hindsight Experience Prioritization |
Authors | Rui Zhao, Volker Tresp |
Abstract | In Hindsight Experience Replay (HER), a reinforcement learning agent is trained by treating whatever it has achieved as virtual goals. However, in previous work, the experience was replayed at random, without considering which episode might be the most valuable for learning. In this paper, we develop an energy-based framework for prioritizing hindsight experience in robotic manipulation tasks. Our approach is inspired by the work-energy principle in physics. We define a trajectory energy function as the sum of the transition energy of the target object over the trajectory. We hypothesize that replaying episodes that have high trajectory energy is more effective for reinforcement learning in robotics. To verify our hypothesis, we designed a framework for hindsight experience prioritization based on the trajectory energy of goal states. The trajectory energy function takes the potential, kinetic, and rotational energy into consideration. We evaluate our Energy-Based Prioritization (EBP) approach on four challenging robotic manipulation tasks in simulation. Our empirical results show that our proposed method surpasses state-of-the-art approaches in terms of both performance and sample-efficiency on all four tasks, without increasing computational time. A video showing experimental results is available at https://youtu.be/jtsF2tTeUGQ |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01363v4, http://arxiv.org/pdf/1810.01363v4.pdf |
PWC | https://paperswithcode.com/paper/energy-based-hindsight-experience |
Repo | https://github.com/ruizhaogit/EnergyBasedPrioritization |
Framework | none |
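The trajectory energy described in the abstract above, a sum of transition energies of the target object over an episode, can be sketched as follows. This is an illustrative approximation under several assumptions that are not stated in the abstract: the object is a point mass, kinetic energy is estimated from finite differences of positions, rotational energy is left out, a transition energy is the non-negative increase in total energy between consecutive steps, and episodes are replayed with probability proportional to their trajectory energy. All names and default parameter values are hypothetical.

```python
import math


def step_energies(positions, mass=1.0, g=9.81, dt=1.0):
    """Potential plus kinetic energy of the object at each time step.

    positions: list of (x, y, z) tuples; velocity comes from finite differences.
    """
    energies = []
    prev = positions[0]
    for p in positions:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, prev)))
        speed = dist / dt
        energies.append(mass * g * p[2] + 0.5 * mass * speed ** 2)
        prev = p
    return energies


def trajectory_energy(positions, **kwargs):
    """Sum of the non-negative energy increases between consecutive steps."""
    e = step_energies(positions, **kwargs)
    return sum(max(b - a, 0.0) for a, b in zip(e, e[1:]))


def replay_probabilities(trajectories, **kwargs):
    """Sampling probability per episode, proportional to trajectory energy."""
    scores = [trajectory_energy(t, **kwargs) for t in trajectories]
    total = sum(scores)
    if total == 0.0:
        # No episode moved the object: fall back to uniform replay.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    still = [(0.0, 0.0, 0.0)] * 3
    lifted = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)]
    print(replay_probabilities([still, lifted], g=10.0))
```

In this toy setting an episode in which the object never moves gets zero priority, while one in which it is lifted dominates the replay distribution.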
ITE: A Lightweight Implementation of Stratified Reasoning for Constructive Logical Operators
Title | ITE: A Lightweight Implementation of Stratified Reasoning for Constructive Logical Operators |
Authors | Arnaud Gotlieb, Dusica Marijan, Helge Spieker |
Abstract | Constraint Programming (CP) is a powerful declarative programming paradigm where inference and search are interleaved to find feasible and optimal solutions to various types of constraint systems. However, handling logical connectors with constructive information in CP is notoriously difficult. This paper presents If Then Else (ITE), a lightweight implementation of stratified constructive reasoning for logical connectives. Stratification is introduced to cope with the risk of combinatorial explosion when constructing information from nested and combined logical operators. ITE is an open-source library built on top of SICStus Prolog clp(fd), which provides various operators, including constructive disjunction and negation, constructive implication and conditional. These operators can be used to express global constraints and to benefit from constructive reasoning for more domain pruning during constraint filtering. Even though ITE is not competitive with specialized filtering algorithms available in some global constraints implementations, its expressiveness allows users to easily define well-tuned constraints with powerful deduction capabilities. Our extended experimental results show that ITE is more efficient than available generic approaches that handle logical constraint systems over finite domains. |
Tasks | |
Published | 2018-11-09 |
URL | https://arxiv.org/abs/1811.03906v2, https://arxiv.org/pdf/1811.03906v2.pdf |
PWC | https://paperswithcode.com/paper/stratified-constructive-disjunction-and |
Repo | https://github.com/ite4cp/ite |
Framework | none |
Modular meta-learning in abstract graph networks for combinatorial generalization
Title | Modular meta-learning in abstract graph networks for combinatorial generalization |
Authors | Ferran Alet, Maria Bauza, Alberto Rodriguez, Tomas Lozano-Perez, Leslie P. Kaelbling |
Abstract | Modular meta-learning is a new framework that generalizes to unseen datasets by combining a small set of neural modules in different ways. In this work we propose abstract graph networks: using graphs as abstractions of a system’s subparts without a fixed assignment of nodes to system subparts, for which we would need supervision. We combine this idea with modular meta-learning to get a flexible framework with combinatorial generalization to new tasks built in. We then use it to model the pushing of arbitrarily shaped objects from little or no training data. |
Tasks | Meta-Learning |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07768v1, http://arxiv.org/pdf/1812.07768v1.pdf |
PWC | https://paperswithcode.com/paper/modular-meta-learning-in-abstract-graph |
Repo | https://github.com/FerranAlet/modular-metalearning |
Framework | pytorch |
Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects
Title | Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects |
Authors | Michael A. Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, Anh Nguyen |
Abstract | Despite excellent performance on stationary test sets, deep neural networks (DNNs) can fail to generalize to out-of-distribution (OoD) inputs, including natural, non-adversarial ones, which are common in real-world settings. In this paper, we present a framework for discovering DNN failures that harnesses 3D renderers and 3D models. That is, we estimate the parameters of a 3D renderer that cause a target DNN to misbehave in response to the rendered image. Using our framework and a self-assembled dataset of 3D objects, we investigate the vulnerability of DNNs to OoD poses of well-known objects in ImageNet. For objects that are readily recognized by DNNs in their canonical poses, DNNs incorrectly classify 97% of their pose space. In addition, DNNs are highly sensitive to slight pose perturbations. Importantly, adversarial poses transfer across models and datasets. We find that 99.9% and 99.4% of the poses misclassified by Inception-v3 also transfer to the AlexNet and ResNet-50 image classifiers trained on the same ImageNet dataset, respectively, and 75.5% transfer to the YOLOv3 object detector trained on MS COCO. |
Tasks | |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11553v3, http://arxiv.org/pdf/1811.11553v3.pdf |
PWC | https://paperswithcode.com/paper/strike-with-a-pose-neural-networks-are-easily |
Repo | https://github.com/airalcorn2/strike-with-a-pose |
Framework | pytorch |
BA-Net: Dense Bundle Adjustment Network
Title | BA-Net: Dense Bundle Adjustment Network |
Authors | Chengzhou Tang, Ping Tan |
Abstract | This paper introduces a network architecture to solve the structure-from-motion (SfM) problem via feature-metric bundle adjustment (BA), which explicitly enforces multi-view geometry constraints in the form of feature-metric error. The whole pipeline is differentiable so that the network can learn suitable features that make the BA problem more tractable. Furthermore, this work introduces a novel depth parameterization to recover dense per-pixel depth. The network first generates several basis depth maps according to the input image and optimizes the final depth as a linear combination of these basis depth maps via feature-metric BA. The basis depth maps generator is also learned via end-to-end training. The whole system nicely combines domain knowledge (i.e. hard-coded multi-view geometry constraints) and deep learning (i.e. feature learning and basis depth maps learning) to address the challenging dense SfM problem. Experiments on large scale real data prove the success of the proposed method. |
Tasks | Depth And Camera Motion |
Published | 2018-06-13 |
URL | https://arxiv.org/abs/1806.04807v3, https://arxiv.org/pdf/1806.04807v3.pdf |
PWC | https://paperswithcode.com/paper/ba-net-dense-bundle-adjustment-network |
Repo | https://github.com/frobelbest/BANet |
Framework | tf |
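The depth parameterization described in the abstract above, in which the final depth is a linear combination of basis depth maps, can be shown with a small sketch. It covers only the combination step; in the paper the combination weights are optimized through feature-metric bundle adjustment, which is not reproduced here. The function name `combine_basis_depth` and the nested-list image layout are assumptions for illustration.

```python
def combine_basis_depth(basis_maps, weights):
    """Per-pixel depth as a weighted sum of basis depth maps.

    basis_maps: list of K maps, each a list of rows of equal size.
    weights:    list of K scalars (in the paper these are optimized;
                here they are simply given).
    """
    if len(basis_maps) != len(weights):
        raise ValueError("need one weight per basis map")
    height = len(basis_maps[0])
    width = len(basis_maps[0][0])
    return [
        [
            sum(w * bm[r][c] for w, bm in zip(weights, basis_maps))
            for c in range(width)
        ]
        for r in range(height)
    ]


if __name__ == "__main__":
    basis = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[10.0, 10.0], [10.0, 10.0]],
    ]
    print(combine_basis_depth(basis, [2.0, 0.5]))
```

The point of the parameterization is that a dense depth map with one value per pixel is controlled by only K weights, which keeps the optimization problem small.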
Pre-Defined Sparse Neural Networks with Hardware Acceleration
Title | Pre-Defined Sparse Neural Networks with Hardware Acceleration |
Authors | Sourya Dey, Kuan-Wen Huang, Peter A. Beerel, Keith M. Chugg |
Abstract | Neural networks have proven to be extremely powerful tools for modern artificial intelligence applications, but computational and storage complexity remain limiting factors. This paper presents two compatible contributions towards reducing the time, energy, computational, and storage complexities associated with multilayer perceptrons. Pre-defined sparsity is proposed to reduce the complexity during both training and inference, regardless of the implementation platform. Our results show that storage and computational complexity can be reduced by factors greater than 5X without significant performance loss. The second contribution is an architecture for hardware acceleration that is compatible with pre-defined sparsity. This architecture supports both training and inference modes and is flexible in the sense that it is not tied to a specific number of neurons. For example, this flexibility implies that neural networks of various sizes can be supported on Field Programmable Gate Arrays (FPGAs) of various sizes. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01164v1, http://arxiv.org/pdf/1812.01164v1.pdf |
PWC | https://paperswithcode.com/paper/pre-defined-sparse-neural-networks-with |
Repo | https://github.com/souryadey/predefinedsparse-nnets |
Framework | tf |
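Pre-defined sparsity, as described in the abstract above, fixes which connections exist before training starts, so only those weights are ever stored or used. The following is a minimal sketch of that idea for one layer. It assumes a fixed fan-in per output neuron and a seeded random connection pattern; the paper's actual connection patterns may be structured differently, and the names `make_connections` and `sparse_forward` are hypothetical.

```python
import random


def make_connections(n_in, n_out, fan_in, seed=0):
    """Choose, once and before training, which inputs feed each output neuron."""
    if fan_in > n_in:
        raise ValueError("fan_in cannot exceed n_in")
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_in), fan_in)) for _ in range(n_out)]


def sparse_forward(x, connections, weights, biases):
    """Forward pass that touches only the pre-defined connections.

    weights[j][k] belongs to the edge from input connections[j][k] to output j.
    """
    return [
        b + sum(w * x[i] for i, w in zip(idx, ws))
        for idx, ws, b in zip(connections, weights, biases)
    ]


if __name__ == "__main__":
    conn = make_connections(n_in=10, n_out=4, fan_in=2)
    stored = sum(len(idx) for idx in conn)   # weights actually kept
    dense = 10 * 4                           # weights of a fully connected layer
    print(conn, dense / stored)
```

With a fan-in of 2 out of 10 inputs, the layer stores one fifth of the weights of its dense counterpart, and the multiply-accumulate count in `sparse_forward` shrinks by the same factor.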
MMA Training: Direct Input Space Margin Maximization through Adversarial Training
Title | MMA Training: Direct Input Space Margin Maximization through Adversarial Training |
Authors | Gavin Weiguang Ding, Yash Sharma, Kry Yik Chau Lui, Ruitong Huang |
Abstract | We study adversarial robustness of neural networks from a margin maximization perspective, where margins are defined as the distances from inputs to a classifier’s decision boundary. Our study shows that maximizing margins can be achieved by minimizing the adversarial loss on the decision boundary at the “shortest successful perturbation”, demonstrating a close connection between adversarial losses and the margins. We propose Max-Margin Adversarial (MMA) training to directly maximize the margins to achieve adversarial robustness. Instead of adversarial training with a fixed $\epsilon$, MMA offers an improvement by enabling adaptive selection of the “correct” $\epsilon$ as the margin individually for each datapoint. In addition, we rigorously analyze adversarial training with the perspective of margin maximization, and provide an alternative interpretation for adversarial training, maximizing either a lower or an upper bound of the margins. Our experiments empirically confirm our theory and demonstrate MMA training’s efficacy on the MNIST and CIFAR10 datasets w.r.t. $\ell_\infty$ and $\ell_2$ robustness. Code and models are available at https://github.com/BorealisAI/mma_training. |
Tasks | |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02637v4, https://arxiv.org/pdf/1812.02637v4.pdf |
PWC | https://paperswithcode.com/paper/max-margin-adversarial-mma-training-direct |
Repo | https://github.com/BorealisAI/mma_training |
Framework | pytorch |
Deep Imbalanced Learning for Face Recognition and Attribute Prediction
Title | Deep Imbalanced Learning for Face Recognition and Attribute Prediction |
Authors | Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang |
Abstract | Data for face analysis often exhibit a highly skewed class distribution, i.e., most data belong to a few majority classes, while the minority classes contain only a few instances. To mitigate this issue, contemporary deep learning methods typically follow classic strategies such as class re-sampling or cost-sensitive training. In this paper, we conduct extensive and systematic experiments to validate the effectiveness of these classic schemes for representation learning on class-imbalanced data. We further demonstrate that a more discriminative deep representation can be learned by enforcing a deep network to maintain inter-cluster margins both within and between classes. This tight constraint effectively reduces the class imbalance inherent in the local data neighborhood, thus carving much more balanced class boundaries locally. We show that it is easy to deploy angular margins between the cluster distributions on a hypersphere manifold. Such learned Cluster-based Large Margin Local Embedding (CLMLE), when combined with a simple k-nearest cluster algorithm, shows significant improvements in accuracy over existing methods on both face recognition and face attribute prediction tasks that exhibit an imbalanced class distribution. |
Tasks | Face Recognition, Representation Learning |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00194v2, http://arxiv.org/pdf/1806.00194v2.pdf |
PWC | https://paperswithcode.com/paper/deep-imbalanced-learning-for-face-recognition |
Repo | https://github.com/JoyLuo/face-attribute-recognition-paper-list |
Framework | none |
Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
Title | Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos |
Authors | Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova |
Abstract | Models and examples built with TensorFlow |
Tasks | Depth And Camera Motion, Depth Estimation, Motion Estimation, Robot Navigation |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06152v1, http://arxiv.org/pdf/1811.06152v1.pdf |
PWC | https://paperswithcode.com/paper/depth-prediction-without-the-sensors |
Repo | https://github.com/tensorflow/models/tree/master/research/struct2depth |
Framework | tf |
Hyperbolic Entailment Cones for Learning Hierarchical Embeddings
Title | Hyperbolic Entailment Cones for Learning Hierarchical Embeddings |
Authors | Octavian-Eugen Ganea, Gary Bécigneul, Thomas Hofmann |
Abstract | Learning graph representations via low-dimensional embeddings that preserve relevant network properties is an important class of problems in machine learning. We here present a novel method to embed directed acyclic graphs. Following prior work, we first advocate for using hyperbolic spaces which provably model tree-like structures better than Euclidean geometry. Second, we view hierarchical relations as partial orders defined using a family of nested geodesically convex cones. We prove that these entailment cones admit an optimal shape with a closed form expression both in the Euclidean and hyperbolic spaces, and they canonically define the embedding learning process. Experiments show significant improvements of our method over strong recent baselines both in terms of representational capacity and generalization. |
Tasks | Graph Embedding, Hypernym Discovery, Link Prediction, Representation Learning |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.01882v3, http://arxiv.org/pdf/1804.01882v3.pdf |
PWC | https://paperswithcode.com/paper/hyperbolic-entailment-cones-for-learning |
Repo | https://github.com/dalab/hyperbolic_cones |
Framework | none |