January 29, 2020


Paper Group ANR 521


Self-attention based end-to-end Hindi-English Neural Machine Translation


Title	Self-attention based end-to-end Hindi-English Neural Machine Translation
Authors	Siddhant Srivastava, Ritu Tiwari
Abstract	Machine Translation (MT) is a field of study in Natural Language Processing that deals with the automatic translation of human language, from one language to another, by computer. With a rich research history spanning about three decades, machine translation is one of the most sought-after areas of research in the computational linguistics community. As part of this master’s thesis, the main focus is on the deep-learning based methods that have made significant progress in recent years and are becoming the de facto approach in MT. We point out the recent advances in neural translation models and the domains in which NMT has replaced conventional SMT models, and mention future avenues in the field. We then propose an end-to-end self-attention transformer network for Neural Machine Translation, trained on a Hindi-English parallel corpus, and compare the model’s efficiency with other state-of-the-art models, such as encoder-decoder and attention-based encoder-decoder neural models, on the basis of BLEU. We conclude the paper with a comparative analysis of the three models.
Tasks	Machine Translation
Published	2019-09-21
URL	https://arxiv.org/abs/1909.09779v1
PDF	https://arxiv.org/pdf/1909.09779v1.pdf
PWC	https://paperswithcode.com/paper/190909779
Repo
Framework
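
The self-attention operation named in the abstract above can be illustrated with a minimal sketch. This is not the authors' model: the learned query/key/value projections, multiple heads and positional encodings of a real transformer are left out, and the function names and toy token vectors are assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x is a list of token vectors. Real transformer layers apply learned
    query/key/value projections first; they are omitted here (assumption).
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical embeddings
outputs, weights = self_attention(tokens)
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the input token vectors.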

A Hybrid Evolutionary Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters


Title	A Hybrid Evolutionary Algorithm Framework for Optimising Power Take Off and Placements of Wave Energy Converters
Authors	Mehdi Neshat, Bradley Alexander, Nataliia Sergiienko, Markus Wagner
Abstract	Ocean wave energy is a source of renewable energy that has gained much attention for its potential to contribute significantly to meeting the global energy demand. In this research, we investigate the problem of maximising the energy delivered by farms of wave energy converters (WECs). We consider state-of-the-art fully submerged three-tether converters deployed in arrays. The goal of this work is to use heuristic search to optimise the power output of arrays in a size-constrained environment by configuring WEC locations and the power-take-off (PTO) settings for each WEC. Modelling the complex hydrodynamic interactions in wave farms is expensive, which constrains search to only a few thousand model evaluations. We explore a variety of heuristic approaches including cooperative and hybrid methods. The effectiveness of these approaches is assessed in two real wave scenarios (Sydney and Perth) with farms of two different scales. We find that symmetric local search and Nelder-Mead Simplex direct search, combined with a back-tracking optimization strategy, are able to outperform previously defined search techniques by up to 3%.
Tasks
Published	2019-04-15
URL	http://arxiv.org/abs/1904.07043v1
PDF	http://arxiv.org/pdf/1904.07043v1.pdf
PWC	https://paperswithcode.com/paper/a-hybrid-evolutionary-algorithm-framework-for
Repo
Framework
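
Nelder-Mead simplex search, one ingredient of the hybrid method described above, can be sketched in a few lines. The version below is simplified (reflection, expansion, inside contraction and shrink only), and `toy_cost` is a hypothetical stand-in for the expensive hydrodynamic model, which the abstract does not specify. It is not the authors' framework.

```python
def nelder_mead(f, start, step=0.5, iters=200):
    """Minimise f with a simplified set of Nelder-Mead simplex moves."""
    n = len(start)
    # Initial simplex: the start point plus one offset point per dimension.
    simplex = [list(start)]
    for i in range(n):
        p = list(start)
        p[i] += step
        simplex.append(p)
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Centroid of every vertex except the worst one.
        c = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [c[i] + t * (worst[i] - c[i]) for i in range(n)]

        refl = along(-1.0)                       # reflection
        if f(refl) < f(best):
            expd = along(-2.0)                   # expansion
            simplex[-1] = expd if f(expd) < f(refl) else refl
        elif f(refl) < f(simplex[-2]):
            simplex[-1] = refl
        else:
            con = along(0.5)                     # inside contraction
            if f(con) < f(worst):
                simplex[-1] = con
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [best[i] + 0.5 * (p[i] - best[i]) for i in range(n)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)

def toy_cost(p):
    # Hypothetical smooth cost with its minimum at (1, -2).
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

best_point = nelder_mead(toy_cost, [0.0, 0.0])
```

Because the method uses only function values, it suits settings like the one in the abstract, where each evaluation is costly and gradients are unavailable.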

Brain-inspired reverse adversarial examples


Title	Brain-inspired reverse adversarial examples
Authors	Shaokai Ye, Sia Huat Tan, Kaidi Xu, Yanzhi Wang, Chenglong Bao, Kaisheng Ma
Abstract	A human does not have to see all elephants to recognize an animal as an elephant. In contrast, current state-of-the-art deep learning approaches depend heavily on the variety of training samples and the capacity of the network. In practice, the size of the network is always limited and it is impossible to access all the data samples. Under this circumstance, deep learning models are extremely fragile to human-imperceivable adversarial examples, which pose threats to all safety-critical systems. Inspired by the association and attention mechanisms of the human brain, we propose a reverse adversarial examples method that can greatly improve models’ robustness on unseen data. Experiments show that our reverse adversarial method can improve accuracy by 19.02% on average on ResNet18, MobileNet, and VGG16 on unseen data transformation. Besides, the proposed method is also applicable to compressed models and shows potential to compensate for the robustness drop brought by model quantization, with an absolute 30.78% accuracy improvement.
Tasks	Quantization
Published	2019-05-28
URL	https://arxiv.org/abs/1905.12171v1
PDF	https://arxiv.org/pdf/1905.12171v1.pdf
PWC	https://paperswithcode.com/paper/brain-inspired-reverse-adversarial-examples
Repo
Framework

Investigating Multilingual NMT Representations at Scale


Title	Investigating Multilingual NMT Representations at Scale
Authors	Sneha Reddy Kudugunta, Ankur Bapna, Isaac Caswell, Naveen Arivazhagan, Orhan Firat
Abstract	Multilingual Neural Machine Translation (NMT) models have yielded large empirical success in transfer learning settings. However, these black-box representations are poorly understood, and their mode of transfer remains elusive. In this work, we attempt to understand massively multilingual NMT representations (with 103 languages) using Singular Value Canonical Correlation Analysis (SVCCA), a representation similarity framework that allows us to compare representations across different languages, layers and models. Our analysis validates several empirical results and long-standing intuitions, and unveils new observations regarding how representations evolve in a multilingual translation model. We draw three major conclusions from our analysis, with implications on cross-lingual transfer learning: (i) Encoder representations of different languages cluster based on linguistic similarity, (ii) Representations of a source language learned by the encoder are dependent on the target language, and vice-versa, and (iii) Representations of high resource and/or linguistically similar languages are more robust when fine-tuning on an arbitrary language pair, which is critical to determining how much cross-lingual transfer can be expected in a zero or few-shot setting. We further connect our findings with existing empirical observations in multilingual NMT and transfer learning.
Tasks	Cross-Lingual Transfer, Machine Translation, Transfer Learning
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02197v2
PDF	https://arxiv.org/pdf/1909.02197v2.pdf
PWC	https://paperswithcode.com/paper/investigating-multilingual-nmt
Repo
Framework

COLTRANE: ConvolutiOnaL TRAjectory NEtwork for Deep Map Inference


Title	COLTRANE: ConvolutiOnaL TRAjectory NEtwork for Deep Map Inference
Authors	Arian Prabowo, Piotr Koniusz, Wei Shao, Flora D. Salim
Abstract	The process of automatic generation of a road map from GPS trajectories, called map inference, remains a challenging task to perform on geospatial data from a variety of domains, as the majority of existing studies focus on road maps in cities. Inherently, existing algorithms are not guaranteed to work on unusual geospatial sites, such as an airport tarmac, pedestrianized paths and shortcuts, or animal migration routes. Moreover, deep learning has not been explored well enough for such tasks. This paper introduces COLTRANE, ConvolutiOnaL TRAjectory NEtwork, a novel deep map inference framework which operates on GPS trajectories collected in various environments. This framework includes an Iterated Trajectory Mean Shift (ITMS) module to localize road centerlines, which copes with noisy GPS data points. A Convolutional Neural Network trained on our novel trajectory descriptor is then introduced into our framework to detect and accurately classify junctions for refinement of the road maps. COLTRANE yields up to 37% improvement in F1 scores over existing methods on two distinct real-world datasets: city roads and airport tarmac.
Tasks
Published	2019-09-24
URL	https://arxiv.org/abs/1909.11048v1
PDF	https://arxiv.org/pdf/1909.11048v1.pdf
PWC	https://paperswithcode.com/paper/coltrane-convolutional-trajectory-network-for
Repo
Framework

Deep Transfer Across Domains for Face Anti-spoofing


Title	Deep Transfer Across Domains for Face Anti-spoofing
Authors	Xiaoguang Tu, Hengsheng Zhang, Mei Xie, Yao Luo, Yuefei Zhang, Zheng Ma
Abstract	A practical face recognition system demands not only high recognition performance, but also the capability of detecting spoofing attacks. While emerging approaches to face anti-spoofing have been proposed in recent years, most of them do not generalize well to new databases. The generalization ability of face anti-spoofing approaches needs to be significantly improved before they can be adopted by practical application systems. The main reason for the poor generalization of current approaches is the variety of materials among the spoofing devices. As the attacks are produced by putting a spoofing display (e.g., paper, electronic screen, forged mask) in front of a camera, the variety of spoofing materials can make the spoofing attacks quite different. Furthermore, the background/lighting condition of a new environment can make both the real accesses and spoofing attacks different. Another reason for the poor generalization is that limited labeled data is available for training in face anti-spoofing. In this paper, we focus on improving the generalization ability across different kinds of datasets. We propose a CNN framework using sparsely labeled data from the target domain to learn features that are invariant across domains for face anti-spoofing. Experiments on public-domain face spoofing databases show that the proposed method significantly improves the cross-dataset testing performance with only a small number of labeled samples from the target domain.
Tasks	Face Anti-Spoofing, Face Recognition
Published	2019-01-17
URL	https://arxiv.org/abs/1901.05633v2
PDF	https://arxiv.org/pdf/1901.05633v2.pdf
PWC	https://paperswithcode.com/paper/deep-transfer-across-domains-for-face-anti
Repo
Framework

Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent


Title	Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent
Authors	Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng
Abstract	With the increase in the amount of data and the expansion of model scale, distributed parallel training becomes an important and successful technique to address the optimization challenges. Nevertheless, although distributed stochastic gradient descent (SGD) algorithms can achieve a linear iteration speedup, they are limited significantly in practice by the communication cost, making it difficult to achieve a linear time speedup. In this paper, we propose a computation and communication decoupled stochastic gradient descent (CoCoD-SGD) algorithm to run computation and communication in parallel to reduce the communication cost. We prove that CoCoD-SGD has a linear iteration speedup with respect to the total computation capability of the hardware resources. In addition, it has a lower communication complexity and better time speedup compared with traditional distributed SGD algorithms. Experiments on deep neural network training demonstrate the significant improvements of CoCoD-SGD: when training ResNet18 and VGG16 with 16 GeForce GTX 1080Ti GPUs, CoCoD-SGD is up to 2-3× faster than traditional synchronous SGD.
Tasks
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12043v2
PDF	https://arxiv.org/pdf/1906.12043v2.pdf
PWC	https://paperswithcode.com/paper/faster-distributed-deep-net-training
Repo
Framework
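
The idea of overlapping computation with communication can be illustrated with a small sequential simulation. The update rule used here (average of a model snapshot plus each worker's locally accumulated change) is an assumption based on the abstract's description, not a statement of the paper's exact algorithm. The per-worker quadratic objectives, `cocod_round`, and all constants are hypothetical, and no real parallelism is performed.

```python
def cocod_round(models, targets, lr=0.1, local_steps=4):
    """One simulated round: local steps overlap with averaging of a snapshot."""
    snapshot = list(models)  # models that would be handed to the all-reduce
    deltas = []
    for x, a in zip(models, targets):
        start = x
        for _ in range(local_steps):
            x -= lr * (x - a)  # gradient step on 0.5 * (x - a) ** 2
        deltas.append(x - start)  # change accumulated while communication runs
    # Result of the all-reduce, assumed to be ready when local work ends.
    avg = sum(snapshot) / len(snapshot)
    return [avg + d for d in deltas]

targets = [1.0, 2.0, 6.0]   # each worker's local optimum (hypothetical data)
models = [0.0, 0.0, 0.0]
for _ in range(100):
    models = cocod_round(models, targets)
mean_model = sum(models) / len(models)
```

In this toy setting the average of the worker models approaches the minimiser of the summed objective (the mean of `targets`), even though the averaging always operates on slightly stale models.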

Feedback Linearization for Unknown Systems via Reinforcement Learning


Title	Feedback Linearization for Unknown Systems via Reinforcement Learning
Authors	Tyler Westenbroek, David Fridovich-Keil, Eric Mazumdar, Shreyas Arora, Valmik Prabhu, S. Shankar Sastry, Claire J. Tomlin
Abstract	We present a novel approach to control design for nonlinear systems, which leverages reinforcement learning techniques to learn a linearizing controller for a physical plant with unknown dynamics. Feedback linearization is a technique from nonlinear control which renders the input-output dynamics of a nonlinear plant linear under application of an appropriate feedback controller. Once a linearizing controller has been constructed, desired output trajectories for the nonlinear plant can be tracked using a variety of linear control techniques. A single learned policy then serves to track arbitrary desired reference signals provided by a higher-level planner. We present theoretical results which provide conditions under which the learning problem has a unique solution which exactly linearizes the plant. We demonstrate the performance of our approach on two simulated problems and a physical robotic platform. For the simulated environments, we observe that the learned feedback linearizing policies can achieve arbitrary tracking of reference trajectories for a fully actuated double pendulum and a 14-dimensional quadrotor. In hardware, we demonstrate that our approach significantly improves tracking performance on a 7-DOF Baxter robot after less than two hours of training.
Tasks
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13272v1
PDF	https://arxiv.org/pdf/1910.13272v1.pdf
PWC	https://paperswithcode.com/paper/191013272
Repo
Framework
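
Feedback linearization itself, as described in the abstract above, can be shown on a single pendulum whose dynamics are known. This sketch covers only the classical technique. The paper's contribution, learning the linearizing controller when the dynamics are unknown, is not reproduced. The pendulum parameters, PD gains and reference signal are assumptions chosen for illustration.

```python
import math

G, L, M = 9.81, 1.0, 1.0   # assumed pendulum parameters
KP, KD = 25.0, 10.0        # assumed PD gains for the linear outer loop

def linearizing_control(theta, omega, ref, ref_d, ref_dd):
    """Cancel the gravity term, then apply a linear tracking law."""
    v = ref_dd + KD * (ref_d - omega) + KP * (ref - theta)
    return M * L**2 * (v + (G / L) * math.sin(theta))

def simulate(t_end=10.0, dt=1e-3):
    """Euler simulation; returns the final absolute tracking error."""
    theta, omega, t = 0.3, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        ref = 0.5 * math.sin(t)
        ref_d = 0.5 * math.cos(t)
        ref_dd = -0.5 * math.sin(t)
        u = linearizing_control(theta, omega, ref, ref_d, ref_dd)
        # Pendulum dynamics: theta'' = -(g/l) sin(theta) + u / (m l^2)
        alpha = -(G / L) * math.sin(theta) + u / (M * L**2)
        theta, omega, t = theta + dt * omega, omega + dt * alpha, t + dt
    return abs(0.5 * math.sin(t) - theta)

final_error = simulate()
```

With the gravity term cancelled, the closed loop behaves like a linear double integrator under PD control, which is why a linear tracking law is enough for the sinusoidal reference.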

Cross-Lingual Transfer Learning for Question Answering


Title	Cross-Lingual Transfer Learning for Question Answering
Authors	Chia-Hsuan Lee, Hung-Yi Lee
Abstract	Deep learning based question answering (QA) on English documents has achieved success because there is a large number of English training examples. However, for most languages, training examples for high-quality QA models are not available. In this paper, we explore the problem of cross-lingual transfer learning for QA, where a source language task with plentiful annotations is utilized to improve the performance of a QA model on a target language task with limited available annotations. We examine two different approaches. A machine translation (MT) based approach translates the source language into the target language, or vice versa. Although the MT-based approach brings improvement, it assumes the availability of a sentence-level translation system. A GAN-based approach incorporates a language discriminator to learn language-universal feature representations, and consequently transfers knowledge from the source language. The GAN-based approach rivals the performance of the MT-based approach with fewer linguistic resources. Applying both approaches simultaneously yields the best results. We use two English benchmark datasets, SQuAD and NewsQA, as source language data, and show significant improvements over a number of established baselines on a Chinese QA task. We achieve the new state-of-the-art on the Chinese QA dataset.
Tasks	Cross-Lingual Transfer, Machine Translation, Question Answering, Transfer Learning
Published	2019-07-13
URL	https://arxiv.org/abs/1907.06042v1
PDF	https://arxiv.org/pdf/1907.06042v1.pdf
PWC	https://paperswithcode.com/paper/cross-lingual-transfer-learning-for-question
Repo
Framework

Audio Classification of Bit-Representation Waveform


Title	Audio Classification of Bit-Representation Waveform
Authors	Masaki Okawa, Takuya Saito, Naoki Sawada, Hiromitsu Nishizaki
Abstract	This study investigated the waveform representation for audio signal classification. Recently, many studies on audio waveform classification such as acoustic event detection and music genre classification have been published. Most studies on audio waveform classification have proposed the use of a deep learning (neural network) framework. Generally, a frequency analysis method such as Fourier transform is applied to extract the frequency or spectral information from the input audio waveform before inputting the raw audio waveform into the neural network. In contrast to these previous studies, in this paper, we propose a novel waveform representation method, in which audio waveforms are represented as a bit sequence, for audio classification. In our experiment, we compare the proposed bit representation waveform, which is directly given to a neural network, to other representations of audio waveforms such as a raw audio waveform and a power spectrum with two classification tasks: one is an acoustic event classification task and the other is a sound/music classification task. The experimental results showed that the bit representation waveform achieved the best classification performance for both the tasks.
Tasks	Audio Classification, Music Classification
Published	2019-04-08
URL	https://arxiv.org/abs/1904.04364v2
PDF	https://arxiv.org/pdf/1904.04364v2.pdf
PWC	https://paperswithcode.com/paper/audio-classification-of-bit-representation
Repo
Framework
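
One way to turn a waveform into a bit sequence is sketched below. The abstract does not state the bit width, bit order or signed encoding used in the paper, so the choices here (16-bit two's complement, most significant bit first) and the function names are assumptions.

```python
def sample_to_bits(sample, width=16):
    """Two's-complement bits of a signed integer sample, most significant first."""
    if not -(1 << (width - 1)) <= sample < (1 << (width - 1)):
        raise ValueError("sample out of range for the given width")
    unsigned = sample & ((1 << width) - 1)
    return [(unsigned >> shift) & 1 for shift in range(width - 1, -1, -1)]

def waveform_to_bits(samples, width=16):
    """Map a waveform of N samples to an N x width matrix of 0/1 values."""
    return [sample_to_bits(s, width) for s in samples]

bit_matrix = waveform_to_bits([0, 1, -1, 32767, -32768])
```

Each sample becomes a vector of `width` binary channels, which could then be fed to a network in place of the raw amplitude.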

Manifold Graph with Learned Prototypes for Semi-Supervised Image Classification


Title	Manifold Graph with Learned Prototypes for Semi-Supervised Image Classification
Authors	Chia-Wen Kuo, Chih-Yao Ma, Jia-Bin Huang, Zsolt Kira
Abstract	Recent advances in semi-supervised learning methods rely on estimating the categories of unlabeled data using a model trained on the labeled data (pseudo-labeling) and using the unlabeled data for various consistency-based regularization. In this work, we propose to explicitly leverage the structure of the data manifold based on a Manifold Graph constructed over the image instances within the feature space. Specifically, we propose an architecture based on graph networks that jointly optimizes feature extraction, graph connectivity, and feature propagation and aggregation to unlabeled data in an end-to-end manner. Further, we present a novel Prototype Generator for producing a diverse set of prototypes that compactly represent each category, which supports feature propagation. To evaluate our method, we first contribute a strong baseline that combines two consistency-based regularizers and already achieves state-of-the-art results, especially with fewer labels. We then show that when combined with these regularizers, the proposed method facilitates the propagation of information from generated prototypes to image data to further improve results. We provide extensive qualitative and quantitative experimental results on semi-supervised benchmarks demonstrating the improvements arising from our design and show that our method achieves state-of-the-art performance when compared with existing methods using a single model and is comparable with ensemble methods. Specifically, we achieve error rates of 3.35% on SVHN, 8.27% on CIFAR-10, and 33.83% on CIFAR-100. With far fewer labels, we surpass the state of the art by a significant margin of 41% relative error decrease on average.
Tasks	Image Classification, Semi-Supervised Image Classification
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05202v2
PDF	https://arxiv.org/pdf/1906.05202v2.pdf
PWC	https://paperswithcode.com/paper/manifold-graph-with-learned-prototypes-for
Repo
Framework

Metrology for AI: From Benchmarks to Instruments


Title	Metrology for AI: From Benchmarks to Instruments
Authors	Chris Welty, Praveen Paritosh, Lora Aroyo
Abstract	In this paper we present the first steps towards hardening the science of measuring AI systems, by adopting metrology, the science of measurement and its application, and applying it to human (crowd) powered evaluations. We begin with the intuitive observation that evaluating the performance of an AI system is a form of measurement. In all other science and engineering disciplines, the devices used to measure are called instruments, and all measurements are recorded with respect to the characteristics of the instruments used. One does not report mass, speed, or length, for example, of a studied object without disclosing the precision (measurement variance) and resolution (smallest detectable change) of the instrument used. It is extremely common in the AI literature to compare the performance of two systems by using a crowd-sourced dataset as an instrument, but failing to report if the performance difference lies within the capability of that instrument to measure. To illustrate the adoption of metrology to benchmark datasets we use the word similarity benchmark WS353 and several previously published experiments that use it for evaluation.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01875v1
PDF	https://arxiv.org/pdf/1911.01875v1.pdf
PWC	https://paperswithcode.com/paper/metrology-for-ai-from-benchmarks-to
Repo
Framework
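
The abstract's central point, checking whether a performance difference exceeds what the instrument can resolve, can be sketched as a simple comparison against measurement spread. The decision rule, the threshold `k`, and all scores below are hypothetical illustrations. The paper's own procedure is not described in the abstract.

```python
import math
import statistics

def within_instrument_precision(scores_a, scores_b, k=2.0):
    """Return True when the mean difference is not clearly above measurement spread.

    scores_a / scores_b are repeated measurements of two systems, e.g. from
    re-collected crowd annotations (hypothetical data). k is an assumed threshold.
    """
    diff = abs(statistics.mean(scores_a) - statistics.mean(scores_b))
    # Pooled spread of repeated measurements stands in for instrument precision.
    spread = math.sqrt(
        (statistics.variance(scores_a) + statistics.variance(scores_b)) / 2
    )
    return diff <= k * spread

system_a = [0.70, 0.72, 0.68, 0.71]   # made-up repeated scores
system_b = [0.71, 0.69, 0.73, 0.70]
system_c = [0.80, 0.81, 0.79, 0.80]

a_vs_b_unresolved = within_instrument_precision(system_a, system_b)
a_vs_c_unresolved = within_instrument_precision(system_a, system_c)
```

With these made-up numbers the gap between `system_a` and `system_b` falls inside the measurement spread, while the gap between `system_a` and `system_c` does not.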

Point Cloud Instance Segmentation using Probabilistic Embeddings


Title	Point Cloud Instance Segmentation using Probabilistic Embeddings
Authors	Biao Zhang, Peter Wonka
Abstract	In this paper we propose a new framework for point cloud instance segmentation. Our framework has two steps: an embedding step and a clustering step. In the embedding step, our main contribution is to propose a probabilistic embedding space for point cloud embedding. Specifically, each point is represented as a tri-variate normal distribution. In the clustering step, we propose a novel loss function, which benefits both the semantic segmentation and the clustering. Our experimental results show important improvements to the SOTA, i.e., 3.1% increased average per-category mAP on the PartNet dataset.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2019-11-30
URL	https://arxiv.org/abs/1912.00145v1
PDF	https://arxiv.org/pdf/1912.00145v1.pdf
PWC	https://paperswithcode.com/paper/point-cloud-instance-segmentation-using
Repo
Framework

Attentive Context Normalization for Robust Permutation-Equivariant Learning


Title	Attentive Context Normalization for Robust Permutation-Equivariant Learning
Authors	Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi
Abstract	Many problems in computer vision require dealing with sparse, unordered data in the form of point clouds. Permutation-equivariant networks have become a popular solution: they operate on individual data points with simple perceptrons and extract contextual information with global pooling. This can be achieved with a simple normalization of the feature maps, a global operation that is unaffected by the order. In this paper, we propose Attentive Context Normalization (ACN), a simple yet effective technique to build permutation-equivariant networks robust to outliers. Specifically, we show how to normalize the feature maps with weights that are estimated within the network, excluding outliers from this normalization. We use this mechanism to leverage two types of attention: local and global. By combining them, our method is able to find the essential data points in high-dimensional space in order to solve a given task. We demonstrate through extensive experiments that our approach, which we call Attentive Context Networks (ACNe), provides a significant leap in performance compared to the state-of-the-art on camera pose estimation, robust fitting, and point cloud classification under noise and outliers.
Tasks	Pose Estimation
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02545v2
PDF	https://arxiv.org/pdf/1907.02545v2.pdf
PWC	https://paperswithcode.com/paper/attentive-context-normalization-for-robust
Repo
Framework
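
Normalizing features with per-point weights, so that outliers do not affect the statistics, can be sketched as follows. In the paper the weights are estimated inside the network. Here they are set by hand, and the scalar features and function name are assumptions made to keep the example small.

```python
import math

def weighted_context_norm(features, weights, eps=1e-8):
    """Normalise scalar features with a weighted mean and variance.

    Points with weight 0 do not influence the statistics, so outliers can be
    excluded from the normalisation. The weights are supplied by hand here,
    whereas the paper estimates them within the network (assumption).
    """
    total = sum(weights)
    mean = sum(w * f for w, f in zip(weights, features)) / total
    var = sum(w * (f - mean) ** 2 for w, f in zip(weights, features)) / total
    return [(f - mean) / math.sqrt(var + eps) for f in features]

features = [1.0, 2.0, 3.0, 100.0]   # last value plays the role of an outlier
weights = [1.0, 1.0, 1.0, 0.0]      # hand-set weights, hypothetical
normalised = weighted_context_norm(features, weights)

# Reordering the points only reorders the outputs (permutation equivariance).
reversed_out = weighted_context_norm(features[::-1], weights[::-1])
```

The inliers end up with zero mean after normalization, while the zero-weight outlier is mapped far from them instead of distorting the statistics.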

Self-Attention Capsule Networks for Object Classification


Title	Self-Attention Capsule Networks for Object Classification
Authors	Assaf Hoogi, Brian Wilcox, Yachee Gupta, Daniel L. Rubin
Abstract	We propose a novel architecture for object classification, called Self-Attention Capsule Networks (SACN). SACN is the first model that incorporates the Self-Attention mechanism as an integral layer within the Capsule Network (CapsNet). While the Self-Attention mechanism supplies long-range dependencies, which results in selecting the more dominant image regions to focus on, the CapsNet analyzes the relevant features and their spatial correlations inside these regions only. The features are extracted in the convolutional layer. Then, the Self-Attention layer learns to suppress irrelevant regions based on feature analysis and highlights salient features useful for a specific task. The attention map is then fed into the CapsNet primary layer that is followed by a classification layer. The proposed SACN model was designed to solve two main limitations of the baseline CapsNet: analysis of complex data and significant computational load. In this work, we use a shallow CapsNet architecture and compensate for the absence of a deeper network by using the Self-Attention module to significantly improve the results. The proposed Self-Attention CapsNet architecture was extensively evaluated on six different datasets, mainly on three different medical sets, in addition to the natural MNIST, SVHN and CIFAR10. The model was able to classify images and their patches with diverse and complex backgrounds better than the baseline CapsNet. As a result, the proposed Self-Attention CapsNet significantly improved classification performance within and across different datasets and outperformed the baseline CapsNet, ResNet-18 and DenseNet-40 not only in classification accuracy but also in robustness.
Tasks	Image Classification, Object Classification
Published	2019-04-29
URL	https://arxiv.org/abs/1904.12483v2
PDF	https://arxiv.org/pdf/1904.12483v2.pdf
PWC	https://paperswithcode.com/paper/190412483
Repo
Framework