January 28, 2020


Paper Group ANR 824


Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)


Title	Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)
Authors	Heng Gong, Xiaocheng Feng, Bing Qin, Ting Liu
Abstract	Although Seq2Seq models for table-to-text generation have achieved remarkable progress, modeling table representation in one dimension is inadequate. This is because (1) the table consists of multiple rows and columns, which means that encoding a table should not depend only on a one-dimensional sequence or set of records and (2) most tables are time series data (e.g. NBA game data, stock market data), which means that the description of the current table may be affected by its historical data. To address the aforementioned problems, not only do we model each table cell considering other records in the same row, we also enrich the table’s representation by modeling each table cell in the context of other cells in the same column or with historical (time dimension) data respectively. In addition, we develop a table cell fusion gate to combine representations from the row, column and time dimensions into one dense vector according to the saliency of each dimension’s representation. We evaluated our methods on ROTOWIRE, a benchmark dataset of NBA basketball games. Both automatic and human evaluation results demonstrate the effectiveness of our model, with an improvement of 2.66 BLEU over the strong baseline; it also outperforms the state-of-the-art model.
Tasks	Table-to-Text Generation, Text Generation, Time Series
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02304v1
PDF	https://arxiv.org/pdf/1909.02304v1.pdf
PWC	https://paperswithcode.com/paper/table-to-text-generation-with-effective
Repo
Framework
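
The abstract describes a fusion gate that blends row, column and time representations of a table cell according to their saliency. The following is a minimal sketch of that idea, not the authors' implementation: it assumes saliency is a dot product with a learned vector (the hypothetical `score_weights`) followed by a softmax, and the function name `fuse_cell` is likewise invented for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_cell(row_vec, col_vec, time_vec, score_weights):
    """Blend three views of one table cell into a single dense vector.

    score_weights is a hypothetical learned vector used to score each view.
    """
    views = [row_vec, col_vec, time_vec]
    # Saliency score of each view: dot product with the scoring vector.
    scores = [sum(w * x for w, x in zip(score_weights, v)) for v in views]
    gates = softmax(scores)
    # Gate-weighted sum of the three views, one output per dimension.
    fused = [sum(g * v[i] for g, v in zip(gates, views))
             for i in range(len(row_vec))]
    return fused, gates

fused, gates = fuse_cell([1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, 0.0])
print(gates, fused)
```

Because the gates sum to one, the fused vector stays on the same scale as its inputs, and a view with a higher saliency score contributes more.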

Self-Organizing Maps with Variable Input Length for Motif Discovery and Word Segmentation


Title	Self-Organizing Maps with Variable Input Length for Motif Discovery and Word Segmentation
Authors	Raphael C. Brito, Hansenclever F. Bassani
Abstract	Time Series Motif Discovery (TSMD) is defined as searching for patterns that are previously unknown and appear with a given frequency in time series. Another problem strongly related to TSMD is Word Segmentation. This problem has received much attention from the community that studies early language acquisition in babies and toddlers. The development of biologically plausible models for word segmentation could greatly advance this field. Therefore, in this article, we propose the Variable Input Length Map (VILMAP) for Motif Discovery and Word Segmentation. The model is based on Self-Organizing Maps and can identify Motifs with different lengths in time series. In our experiments, we show that VILMAP presents good results in finding Motifs in a standard Motif discovery dataset and can avoid catastrophic forgetting when trained with datasets of increasing input size. We also show that VILMAP achieves results similar or superior to other methods in the literature developed for the task of word segmentation.
Tasks	Language Acquisition, Time Series
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02830v1
PDF	https://arxiv.org/pdf/1908.02830v1.pdf
PWC	https://paperswithcode.com/paper/self-organizing-maps-with-variable-input
Repo
Framework

Online Multiple Pedestrian Tracking using Deep Temporal Appearance Matching Association


Title	Online Multiple Pedestrian Tracking using Deep Temporal Appearance Matching Association
Authors	Young-Chul Yoon, Du Yong Kim, Kwangjin Yoon, Young-min Song, Moongu Jeon
Abstract	In online multiple pedestrian tracking, it is of great importance to model appearance and geometric similarity between existing tracks and targets that appear in a new frame. The appearance model contains discriminative information with higher dimension compared to the geometric model. Thanks to the recent success of deep learning based methods, handling high-dimensional appearance information has become possible. Among many deep networks, the Siamese network with triplet loss is popularly adopted as an appearance feature extractor. Since the Siamese network can extract features of each input independently, it is possible to update and maintain target-specific features. However, it is not suitable for multi-object settings that require comparison with other inputs. In this paper we propose a novel track appearance model based on a joint-inference network to address this issue. The proposed method enables comparison of two inputs to be used for adaptive appearance modeling. It contributes to disambiguating the process of target-observation matching and consolidating identity consistency. Diverse experimental results support the effectiveness of our method. Our work was recognized as the 3rd-highest tracker at MOTChallenge19, held at CVPR 2019.
Tasks
Published	2019-07-01
URL	https://arxiv.org/abs/1907.00831v3
PDF	https://arxiv.org/pdf/1907.00831v3.pdf
PWC	https://paperswithcode.com/paper/online-multiple-pedestrian-tracking-using
Repo
Framework
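
The abstract mentions the Siamese network with triplet loss as the commonly used appearance feature extractor that the paper moves beyond. As a reminder of what that baseline optimizes, here is a small sketch of the standard triplet loss on already-extracted feature vectors. It does not show the paper's joint-inference network, and the margin value is an arbitrary assumption.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the positive closer than the negative.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

# Same identity close by, different identity far away: no penalty.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0]))
# Different identity closer than the same identity: positive penalty.
print(triplet_loss([0.0, 0.0], [1.0, 0.0], [0.1, 0.0]))
```

Each embedding is computed from a single input, which is why such features can be stored and updated per target, as the abstract notes.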

Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs


Title	Reinforced Genetic Algorithm Learning for Optimizing Computation Graphs
Authors	Aditya Paliwal, Felix Gimeno, Vinod Nair, Yujia Li, Miles Lubin, Pushmeet Kohli, Oriol Vinyals
Abstract	We present a deep reinforcement learning approach to minimizing the execution cost of neural network computation graphs in an optimizing compiler. Unlike earlier learning-based works that require training the optimizer on the same graph to be optimized, we propose a learning approach that trains an optimizer offline and then generalizes to previously unseen graphs without further training. This allows our approach to produce high-quality execution decisions on real-world TensorFlow graphs in seconds instead of hours. We consider two optimization tasks for computation graphs: minimizing running time and peak memory usage. In comparison to an extensive set of baselines, our approach achieves significant improvements over classical and other learning-based methods on these two tasks.
Tasks	Transfer Learning
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02494v4
PDF	https://arxiv.org/pdf/1905.02494v4.pdf
PWC	https://paperswithcode.com/paper/regal-transfer-learning-for-fast-optimization
Repo
Framework

HAWKEYE: Adversarial Example Detector for Deep Neural Networks


Title	HAWKEYE: Adversarial Example Detector for Deep Neural Networks
Authors	Jinkyu Koo, Michael Roth, Saurabh Bagchi
Abstract	Adversarial examples (AEs) are images that can mislead deep neural network (DNN) classifiers by introducing slight perturbations into original images. Recent work has shown that detecting AEs can be more effective against AEs than preventing them from being generated. However, state-of-the-art AE detection still shows a high false positive rate, thereby rejecting a considerable number of normal images. To address this issue, we propose HAWKEYE, which is a separate neural network that analyzes the output layer of the DNN and detects AEs. HAWKEYE’s AE detector utilizes a quantized version of an input image as a reference, and is trained to distinguish the variation characteristics of the DNN output on an input image from the DNN output on its reference image. We also show that cascading our AE detectors that are trained for different quantization step sizes can drastically reduce the false positive rate while keeping the detection rate high.
Tasks	Quantization
Published	2019-09-22
URL	https://arxiv.org/abs/1909.09938v1
PDF	https://arxiv.org/pdf/1909.09938v1.pdf
PWC	https://paperswithcode.com/paper/190909938
Repo
Framework
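
The detector described in the abstract compares a classifier's output on an input image with its output on a quantized reference image. The sketch below illustrates only that input side, under stated assumptions: quantization is taken to be flooring each 0-255 pixel to a multiple of the step size, and `toy_classifier` is a hypothetical stand-in for a real DNN. The trained detector network and the cascade are not shown.

```python
def quantize(pixels, step):
    """Snap each 0-255 pixel value down to a multiple of `step`."""
    return [(p // step) * step for p in pixels]

def output_variation(classifier, pixels, step):
    """Difference between outputs on an image and on its quantized reference.

    A detector could be trained on this vector; that part is omitted here.
    """
    original = classifier(pixels)
    reference = classifier(quantize(pixels, step))
    return [o - r for o, r in zip(original, reference)]

def toy_classifier(pixels):
    # Hypothetical stand-in for a DNN: two scores from the mean intensity.
    mean = sum(pixels) / len(pixels)
    return [mean / 255.0, 1.0 - mean / 255.0]

image = [3, 18, 47, 200]
print(quantize(image, 16))
print(output_variation(toy_classifier, image, 16))
```

Running the same comparison with several step sizes yields one variation vector per step, which is the kind of input a cascade of detectors could consume.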

Model-aided Deep Neural Network for Source Number Detection


Title	Model-aided Deep Neural Network for Source Number Detection
Authors	Yuwen Yang, Feifei Gao, Cheng Qian, Guisheng Liao
Abstract	Source number detection is a critical problem in array signal processing. Conventional model-driven methods, e.g., Akaike's information criterion (AIC) and minimum description length (MDL), suffer from severe performance degradation when the number of snapshots is small or the signal-to-noise ratio (SNR) is low. In this paper, we exploit a model-aided deep neural network (DNN) to estimate the source number. Specifically, we first propose the eigenvalue based regression network (ERNet) and classification network (ECNet) to estimate the number of non-coherent sources, where the eigenvalues of the received signal covariance matrix and the source number are used as the input and the supervision label of the networks, respectively. Then, we extend the ERNet and ECNet to estimate the number of coherent sources, where the forward-backward spatial smoothing (FBSS) scheme is adopted to improve the performance of ERNet and ECNet. Numerical results demonstrate the outstanding performance of ERNet and ECNet over the conventional AIC and MDL methods as well as their excellent generalization capability, which also shows their great potential for practical applications.
Tasks
Published	2019-09-29
URL	https://arxiv.org/abs/1909.13273v2
PDF	https://arxiv.org/pdf/1909.13273v2.pdf
PWC	https://paperswithcode.com/paper/model-aided-deep-neural-network-for-source
Repo
Framework
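
For context on the AIC and MDL baselines named in the abstract, the sketch below applies the classical eigenvalue-based forms of both criteria to a list of covariance eigenvalues. It illustrates the conventional model-driven approach only, not ERNet or ECNet, and the eigenvalues in the example are made up.

```python
import math

def _log_ratio(noise_eigs):
    """log(geometric mean / arithmetic mean) of candidate noise eigenvalues."""
    n = len(noise_eigs)
    log_geometric = sum(math.log(e) for e in noise_eigs) / n
    log_arithmetic = math.log(sum(noise_eigs) / n)
    return log_geometric - log_arithmetic

def estimate_sources(eigenvalues, num_snapshots):
    """Return (AIC estimate, MDL estimate) of the number of sources."""
    eigs = sorted(eigenvalues, reverse=True)
    m = len(eigs)
    aic, mdl = [], []
    for k in range(m):  # hypothesis: k sources, m - k noise eigenvalues
        fit = -num_snapshots * (m - k) * _log_ratio(eigs[k:])
        free_params = k * (2 * m - k)
        aic.append(2 * fit + 2 * free_params)
        mdl.append(fit + 0.5 * free_params * math.log(num_snapshots))
    return aic.index(min(aic)), mdl.index(min(mdl))

# Made-up spectrum: two dominant eigenvalues above a flat noise floor.
print(estimate_sources([10.0, 5.0, 1.0, 1.0, 1.0, 1.0], num_snapshots=100))
```

Both criteria trade off how flat the assumed noise eigenvalues are against a penalty on the number of free parameters. With few snapshots or low SNR the noise eigenvalues spread out, which is the regime where the abstract reports these methods degrading.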

Sample-Efficient Neural Architecture Search by Learning Action Space


Title	Sample-Efficient Neural Architecture Search by Learning Action Space
Authors	Linnan Wang, Saining Xie, Teng Li, Rodrigo Fonseca, Yuandong Tian
Abstract	Neural Architecture Search (NAS) has emerged as a promising technique for automatic neural network design. However, existing NAS approaches often utilize a manually designed action space, which is not directly related to the performance metric to be optimized (e.g., accuracy). As a result, using a manually designed action space to perform NAS often leads to sample-inefficient exploration of architectures and thus can be sub-optimal. In order to improve sample efficiency, this paper proposes Latent Action Neural Architecture Search (LaNAS), which learns the action space to recursively partition the architecture search space into regions, each with concentrated performance metrics (i.e., low variance). During the search phase, as different architecture search action sequences lead to regions of different performance, the search efficiency can be significantly improved by biasing towards the regions with good performance. On the largest NAS dataset, NasBench-101, our experimental results demonstrate that LaNAS is 22x, 14.6x and 12.4x more sample-efficient than random search, regularized evolution, and Monte Carlo Tree Search (MCTS) respectively. When applied to the open domain, LaNAS finds an architecture that achieves SoTA 98.0% accuracy on CIFAR-10 and 75.0% top-1 accuracy on ImageNet (mobile setting), after exploring only 6,000 architectures.
Tasks	Neural Architecture Search
Published	2019-06-17
URL	https://arxiv.org/abs/1906.06832v1
PDF	https://arxiv.org/pdf/1906.06832v1.pdf
PWC	https://paperswithcode.com/paper/sample-efficient-neural-architecture-search
Repo
Framework

Progressive Compressed Records: Taking a Byte out of Deep Learning Data


Title	Progressive Compressed Records: Taking a Byte out of Deep Learning Data
Authors	Michael Kuchnik, George Amvrosiadis, Virginia Smith
Abstract	Deep learning training accesses vast amounts of data at high velocity, posing challenges for datasets retrieved over commodity networks and storage devices. We introduce a way to dynamically reduce the overhead of fetching and transporting training data with a method we term Progressive Compressed Records (PCRs). PCRs deviate from previous formats by using progressive compression to convert a single dataset into multiple datasets of increasing fidelity—all without adding to the total dataset size. Empirically, we implement PCRs and evaluate them on a wide range of datasets: ImageNet, HAM10000, Stanford Cars, and CelebA-HQ. Our results show that different tasks can tolerate different levels of compression. PCRs use an on-disk layout that enables applications to efficiently and dynamically access appropriate levels of compression at runtime. In turn, we demonstrate that PCRs can seamlessly enable a 2x speedup in training time on average over baseline formats.
Tasks
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00472v1
PDF	https://arxiv.org/pdf/1911.00472v1.pdf
PWC	https://paperswithcode.com/paper/progressive-compressed-records-taking-a-byte-1
Repo
Framework

Instance-based Transfer Learning for Multilingual Deep Retrieval


Title	Instance-based Transfer Learning for Multilingual Deep Retrieval
Authors	Andrew O. Arnold, William W. Cohen
Abstract	Perhaps the simplest type of multilingual transfer learning is instance-based transfer learning, in which data from the target language and the auxiliary languages are pooled, and a single model is learned from the pooled data. It is not immediately obvious when instance-based transfer learning will improve performance in this multilingual setting: for instance, a plausible conjecture is this kind of transfer learning would help only if the auxiliary languages were very similar to the target. Here we show that at large scale, this method is surprisingly effective, leading to positive transfer on all of 35 target languages we tested. We analyze this improvement and argue that the most natural explanation, namely direct vocabulary overlap between languages, only partially explains the performance gains: in fact, we demonstrate target-language improvement can occur after adding data from an auxiliary language with no vocabulary in common with the target. This surprising result is due to the effect of transitive vocabulary overlaps between pairs of auxiliary and target languages.
Tasks	Transfer Learning
Published	2019-11-08
URL	https://arxiv.org/abs/1911.06111v1
PDF	https://arxiv.org/pdf/1911.06111v1.pdf
PWC	https://paperswithcode.com/paper/instance-based-transfer-learning-for
Repo
Framework
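
The abstract attributes part of the gain to transitive vocabulary overlap: an auxiliary language with no tokens in common with the target can still be connected to it through other languages. The toy sketch below makes that notion concrete with a breadth-first search over languages that share at least one token. The vocabularies are invented placeholders, and the function names are illustrative rather than taken from the paper.

```python
from collections import deque

def shares_vocab(vocab_a, vocab_b):
    """True if the two vocabularies have at least one token in common."""
    return bool(vocab_a & vocab_b)

def transitively_linked(vocabs, source, target):
    """True if a chain of pairwise vocabulary overlaps joins source to target."""
    seen, queue = {source}, deque([source])
    while queue:
        lang = queue.popleft()
        if lang == target:
            return True
        for other, vocab in vocabs.items():
            if other not in seen and shares_vocab(vocabs[lang], vocab):
                seen.add(other)
                queue.append(other)
    return False

# Invented toy vocabularies, one set of tokens per language.
vocabs = {
    "target": {"a", "b"},
    "bridge": {"b", "c"},
    "aux": {"c", "d"},
    "isolated": {"z"},
}
print(shares_vocab(vocabs["aux"], vocabs["target"]))
print(transitively_linked(vocabs, "aux", "target"))
```

Here `aux` and `target` share nothing directly, yet they are linked through `bridge`, while `isolated` has no path to the target at all.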

Conditional out-of-sample generation for unpaired data using trVAE


Title	Conditional out-of-sample generation for unpaired data using trVAE
Authors	Mohammad Lotfollahi, Mohsen Naghipourfar, Fabian J. Theis, F. Alexander Wolf
Abstract	While generative models have shown great success in generating high-dimensional samples conditional on low-dimensional descriptors (learning e.g. stroke thickness in MNIST, hair color in CelebA, or speaker identity in Wavenet), their generation out-of-sample poses fundamental problems. The conditional variational autoencoder (CVAE) as a simple conditional generative model does not explicitly relate conditions during training and, hence, has no incentive of learning a compact joint distribution across conditions. We overcome this limitation by matching their distributions using maximum mean discrepancy (MMD) in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. We refer to the architecture as transformer VAE (trVAE). Benchmarking trVAE on high-dimensional image and tabular data, we demonstrate higher robustness and higher accuracy than existing approaches. In particular, we show qualitatively improved predictions for cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data, by tackling previously problematic minority classes and multiple conditions. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively.
Tasks
Published	2019-10-04
URL	https://arxiv.org/abs/1910.01791v2
PDF	https://arxiv.org/pdf/1910.01791v2.pdf
PWC	https://paperswithcode.com/paper/conditional-out-of-sample-generation-for
Repo
Framework
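
The abstract relies on maximum mean discrepancy (MMD) to match distributions across conditions. As a rough illustration of the quantity itself, the sketch below computes the standard biased estimate of squared MMD between two samples. It assumes a Gaussian (RBF) kernel with bandwidth parameter `gamma`, a choice the abstract does not specify, and it is not the trVAE training code.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    return (mean_kernel(xs, xs, gamma)
            + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma))

condition_a = [[0.0], [1.0]]
condition_b = [[5.0], [6.0]]
print(mmd_squared(condition_a, condition_a))  # identical samples
print(mmd_squared(condition_a, condition_b))  # well-separated samples
```

Used as a penalty, a term like this is small when the activations for two conditions are distributed alike and large when they drift apart.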

Language and Dialect Identification of Cuneiform Texts


Title	Language and Dialect Identification of Cuneiform Texts
Authors	Tommi Jauhiainen, Heidi Jauhiainen, Tero Alstola, Krister Lindén
Abstract	This article introduces a corpus of cuneiform texts from which the dataset for the use of the Cuneiform Language Identification (CLI) 2019 shared task was derived as well as some preliminary language identification experiments conducted using that corpus. We also describe the CLI dataset and how it was derived from the corpus. In addition, we provide some baseline language identification results using the CLI dataset. To the best of our knowledge, the experiments detailed here are the first time automatic language identification methods have been used on cuneiform data.
Tasks	Language Identification
Published	2019-03-05
URL	http://arxiv.org/abs/1903.01891v3
PDF	http://arxiv.org/pdf/1903.01891v3.pdf
PWC	https://paperswithcode.com/paper/language-and-dialect-identification-of
Repo
Framework

Adversarial Robustness of Similarity-Based Link Prediction


Title	Adversarial Robustness of Similarity-Based Link Prediction
Authors	Kai Zhou, Tomasz P. Michalak, Yevgeniy Vorobeychik
Abstract	Link prediction is one of the fundamental problems in social network analysis. A common set of techniques for link prediction rely on similarity metrics which use the topology of the observed subnetwork to quantify the likelihood of unobserved links. Recently, similarity metrics for link prediction have been shown to be vulnerable to attacks whereby observations about the network are adversarially modified to hide target links. We propose a novel approach for increasing robustness of similarity-based link prediction by endowing the analyst with a restricted set of reliable queries which accurately measure the existence of queried links. The analyst aims to robustly predict a collection of possible links by optimally allocating the reliable queries. We formalize the analyst problem as a Bayesian Stackelberg game in which they first choose the reliable queries, followed by an adversary who deletes a subset of links among the remaining (unreliable) queries by the analyst. The analyst in our model is uncertain about the particular target link the adversary attempts to hide, whereas the adversary has full information about the analyst and the network. Focusing on similarity metrics using only local information, we show that the problem is NP-Hard for both players, and devise two principled and efficient approaches for solving it approximately. Extensive experiments with real and synthetic networks demonstrate the effectiveness of our approach.
Tasks	Link Prediction
Published	2019-09-03
URL	https://arxiv.org/abs/1909.01432v1
PDF	https://arxiv.org/pdf/1909.01432v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-robustness-of-similarity-based
Repo
Framework
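
The abstract focuses on similarity metrics that use only local information and on an adversary who deletes links to hide a target. The sketch below uses the common-neighbors score, one familiar local metric chosen here purely as an example, to show how a single deletion lowers a target pair's score. The graph is a made-up toy, and the Stackelberg game and reliable queries are not modeled.

```python
def common_neighbors(adjacency, u, v):
    """Local similarity score: number of neighbors shared by u and v."""
    return len(adjacency[u] & adjacency[v])

def remove_edge(adjacency, a, b):
    """Return a copy of the graph with the undirected edge (a, b) deleted."""
    pruned = {node: set(neighbors) for node, neighbors in adjacency.items()}
    pruned[a].discard(b)
    pruned[b].discard(a)
    return pruned

# Toy graph: u and v are not linked but share the neighbors x and y.
graph = {
    "u": {"x", "y"},
    "v": {"x", "y"},
    "x": {"u", "v"},
    "y": {"u", "v"},
}
print(common_neighbors(graph, "u", "v"))
attacked = remove_edge(graph, "u", "x")
print(common_neighbors(attacked, "u", "v"))
```

If the analyst could reliably query the edge between `u` and `x`, this particular deletion would be detected, which is the intuition behind allocating reliable queries.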

Polarimetric Thermal to Visible Face Verification via Self-Attention Guided Synthesis


Title	Polarimetric Thermal to Visible Face Verification via Self-Attention Guided Synthesis
Authors	Xing Di, Benjamin S. Riggan, Shuowen Hu, Nathaniel J. Short, Vishal M. Patel
Abstract	Polarimetric thermal to visible face verification entails matching two images that contain significant domain differences. Several recent approaches have attempted to synthesize visible faces from thermal images for cross-modal matching. In this paper, we take a different approach in which rather than focusing only on synthesizing visible faces from thermal faces, we also propose to synthesize thermal faces from visible faces. Our intuition is based on the fact that thermal images also contain some discriminative information about the person for verification. Deep features from a pre-trained Convolutional Neural Network (CNN) are extracted from the original as well as the synthesized images. These features are then fused to generate a template which is then used for verification. The proposed synthesis network is based on the self-attention generative adversarial network (SAGAN) which essentially allows efficient attention-guided image synthesis. Extensive experiments on the ARL polarimetric thermal face dataset demonstrate that the proposed method achieves state-of-the-art performance.
Tasks	Face Verification, Image Generation
Published	2019-04-15
URL	http://arxiv.org/abs/1904.07344v1
PDF	http://arxiv.org/pdf/1904.07344v1.pdf
PWC	https://paperswithcode.com/paper/polarimetric-thermal-to-visible-face-1
Repo
Framework

Occlusion-guided compact template learning for ensemble deep network-based pose-invariant face recognition


Title	Occlusion-guided compact template learning for ensemble deep network-based pose-invariant face recognition
Authors	Yuhang Wu, Ioannis A. Kakadiaris
Abstract	Concatenation of the deep network representations extracted from different facial patches helps to improve face recognition performance. However, the concatenated facial template increases in size and contains redundant information. Previous solutions aim to reduce the dimensionality of the facial template without considering the occlusion pattern of the facial patches. In this paper, we propose an occlusion-guided compact template learning (OGCTL) approach that only uses the information from visible patches to construct the compact template. The compact face representation is not sensitive to the number of patches that are used to construct the facial template and is more suitable for incorporating the information from different view angles for image-set based face recognition. Instead of using occlusion masks in face matching (e.g., DPRFS [38]), the proposed method uses occlusion masks in template construction and achieves significantly better image-set based face verification performance on a challenging database with a template size that is an order-of-magnitude smaller than DPRFS.
Tasks	Face Recognition, Face Verification, Robust Face Recognition
Published	2019-03-12
URL	http://arxiv.org/abs/1903.04752v2
PDF	http://arxiv.org/pdf/1903.04752v2.pdf
PWC	https://paperswithcode.com/paper/occlusion-guided-compact-template-learning
Repo
Framework

Human Visual Attention Prediction Boosts Learning & Performance of Autonomous Driving Agents


Title	Human Visual Attention Prediction Boosts Learning & Performance of Autonomous Driving Agents
Authors	Alexander Makrigiorgos, Ali Shafti, Alex Harston, Julien Gerard, A. Aldo Faisal
Abstract	Autonomous driving is a multi-task problem requiring a deep understanding of the visual environment. End-to-end autonomous systems have attracted increasing interest as a method of learning to drive without exhaustively programming behaviours for different driving scenarios. When humans drive, they rely on a finely tuned sensory system which enables them to quickly acquire the information they need while filtering unnecessary details. This ability to identify task-specific high-interest regions within an image could be beneficial to autonomous driving agents and machine learning systems in general. To create a system capable of imitating human gaze patterns and visual attention, we collect eye movement data from human drivers in a virtual reality environment. We use this data to train deep neural networks predicting where humans are most likely to look when driving. We then use the outputs of this trained network to selectively mask driving images using a variety of masking techniques. Finally, autonomous driving agents are trained using these masked images as input. Upon comparison, we found that a dual-branch architecture which processes both raw and attention-masked images substantially outperforms all other models, reducing error in control signal predictions by 25.5% compared to a standard end-to-end model trained only on raw images.
Tasks	Autonomous Driving
Published	2019-09-11
URL	https://arxiv.org/abs/1909.05003v1
PDF	https://arxiv.org/pdf/1909.05003v1.pdf
PWC	https://paperswithcode.com/paper/human-visual-attention-prediction-boosts
Repo
Framework
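
The abstract says predicted attention maps are used to mask driving images with a variety of masking techniques, without naming them. The sketch below shows two plausible variants as assumptions only: a soft mask that scales each pixel by its attention weight and a hard mask that zeroes pixels below a threshold. Images are plain nested lists of grayscale values to keep the example dependency-free.

```python
def soft_mask(image, attention):
    """Scale each pixel by its attention weight in [0, 1]."""
    return [[pixel * weight for pixel, weight in zip(img_row, att_row)]
            for img_row, att_row in zip(image, attention)]

def hard_mask(image, attention, threshold=0.5):
    """Keep pixels whose attention reaches the threshold, zero the rest."""
    return [[pixel if weight >= threshold else 0
             for pixel, weight in zip(img_row, att_row)]
            for img_row, att_row in zip(image, attention)]

image = [[10, 20], [30, 40]]
attention = [[1.0, 0.5], [0.0, 0.25]]  # hypothetical predicted gaze map
print(soft_mask(image, attention))
print(hard_mask(image, attention))
```

A dual-branch model of the kind the abstract reports as best would receive both the raw image and one of these masked versions as inputs.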