January 26, 2020

3155 words 15 mins read

Paper Group ANR 1541

Understanding Knowledge Distillation in Non-autoregressive Machine Translation. Doubly Robust Crowdsourcing. Free Gap Information from the Differentially Private Sparse Vector and Noisy Max Mechanisms. Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning. Translating SAR to Optical Images for Assisted Interpr …

Understanding Knowledge Distillation in Non-autoregressive Machine Translation

Title Understanding Knowledge Distillation in Non-autoregressive Machine Translation
Authors Chunting Zhou, Graham Neubig, Jiatao Gu
Abstract Non-autoregressive machine translation (NAT) systems predict a sequence of output tokens in parallel, achieving substantial improvements in generation speed compared to autoregressive models. Existing NAT models usually rely on the technique of knowledge distillation, which creates the training data from a pretrained autoregressive model for better performance. Knowledge distillation is empirically useful, leading to large gains in accuracy for NAT models, but the reason for this success has so far been unclear. In this paper, we first design systematic experiments to investigate why knowledge distillation is crucial to NAT training. We find that knowledge distillation can reduce the complexity of data sets and help NAT to model the variations in the output data. Furthermore, a strong correlation is observed between the capacity of an NAT model and the optimal complexity of the distilled data for the best translation quality. Based on these findings, we further propose several approaches that can alter the complexity of data sets to improve the performance of NAT models. We achieve state-of-the-art performance for NAT-based models, and close the gap with the autoregressive baseline on the WMT14 En-De benchmark.
Tasks Machine Translation
Published 2019-11-07
URL https://arxiv.org/abs/1911.02727v2
PDF https://arxiv.org/pdf/1911.02727v2.pdf
PWC https://paperswithcode.com/paper/understanding-knowledge-distillation-in-non
Repo
Framework
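
For context, here is a minimal sketch of the sequence-level distillation step the abstract refers to: the gold targets are replaced by a pretrained autoregressive teacher's own translations, and the NAT student then trains on those (source, teacher output) pairs. The `teacher_translate` function is a hypothetical stand-in for beam-search decoding with a real autoregressive model.

```python
# Minimal sketch of building a distilled corpus for NAT training.
# `teacher_translate` is a hypothetical stand-in for beam-search decoding
# with a pretrained autoregressive teacher model.
from typing import Callable, List, Tuple

def build_distilled_corpus(
    sources: List[str],
    teacher_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    # The student never sees the original references; it trains on the
    # teacher's outputs, which are less multi-modal than human translations.
    return [(src, teacher_translate(src)) for src in sources]

if __name__ == "__main__":
    toy_teacher = lambda s: s[::-1]  # placeholder for a real NMT model
    print(build_distilled_corpus(["ein kleiner Test"], toy_teacher))
```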

Doubly Robust Crowdsourcing

Title Doubly Robust Crowdsourcing
Authors Chong Liu, Yu-Xiang Wang
Abstract Large-scale labeled datasets are the indispensable fuel that ignites the AI revolution we see today. Most such datasets are constructed using crowdsourcing services such as Amazon Mechanical Turk, which provide noisy labels from non-experts at a fair price. The sheer size of such datasets means it is only feasible to collect a few labels per data point. We formulate the problem of test-time label aggregation as a statistical estimation problem of inferring the expected voting score in an ideal world where all workers label all items. By imitating workers with supervised learners and using them in a doubly robust estimation framework, we prove that the variance of estimation can be substantially reduced, even if the learner is a poor approximation. Synthetic and real-world experiments show that by combining the doubly robust approach with adaptive worker/item selection, we often need as few as 0.1 labels per data point to achieve nearly the same accuracy as in the ideal world where all workers label all data points.
Tasks
Published 2019-06-08
URL https://arxiv.org/abs/1906.08591v1
PDF https://arxiv.org/pdf/1906.08591v1.pdf
PWC https://paperswithcode.com/paper/doubly-robust-crowdsourcing
Repo
Framework
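
As a rough illustration of the doubly robust idea (a generic estimator of this family, not necessarily the paper's exact construction), the sketch below combines a supervised learner's imputations of every label with an inverse-propensity correction on the labels that were actually collected; the estimate remains accurate if either the imputation model or the propensities are good.

```python
# Generic doubly robust estimate of the average voting score over all
# (worker, item) pairs. Illustrative only; the paper's estimator may differ.
import numpy as np

def doubly_robust_mean(y, observed, propensity, prediction):
    """y          : label values (entries where observed == 0 are ignored)
    observed   : 0/1 mask of which labels were actually collected
    propensity : probability each label was collected (must be > 0)
    prediction : a supervised learner's imputation of every label
    """
    correction = observed * (y - prediction) / propensity
    return float(np.mean(prediction + correction))

# Toy usage: 3 workers x 4 items, labels collected with probability 0.4.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=(3, 4)).astype(float)
obs = (rng.random((3, 4)) < 0.4).astype(float)
pred = np.full((3, 4), labels.mean())  # crude imputation model
print(doubly_robust_mean(labels, obs, np.full((3, 4), 0.4), pred))
```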

Free Gap Information from the Differentially Private Sparse Vector and Noisy Max Mechanisms

Title Free Gap Information from the Differentially Private Sparse Vector and Noisy Max Mechanisms
Authors Zeyu Ding, Yuxin Wang, Danfeng Zhang, Daniel Kifer
Abstract Noisy Max and Sparse Vector are selection algorithms for differential privacy and serve as building blocks for more complex algorithms. In this paper we show that both algorithms can release additional information for free (i.e., at no additional privacy cost). Noisy Max is used to return the approximate maximizer among a set of queries. We show that it can also release for free the noisy gap between the approximate maximizer and the runner-up. This free information can improve the accuracy of certain subsequent counting queries by up to 50%. Sparse Vector is used to return a set of queries that are approximately larger than a fixed threshold. We show that it can adaptively control its privacy budget (using less budget for queries that are likely to be much larger than the threshold) in order to increase the number of queries it can process. These results follow from a careful privacy analysis.
Tasks
Published 2019-04-29
URL https://arxiv.org/abs/1904.12773v3
PDF https://arxiv.org/pdf/1904.12773v3.pdf
PWC https://paperswithcode.com/paper/free-gap-information-from-the-differentially
Repo
Framework
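
A sketch of the Noisy Max variant described above: report the index of the largest noised query together with the gap between the two largest noised values. The Laplace noise scale used here is an illustrative choice for sensitivity-1 queries; consult the paper for the exact privacy calibration.

```python
# Sketch of "Noisy Max with free gap": return the winning index plus the gap
# between the top two noised values (claimed to cost no extra privacy budget).
import numpy as np

def noisy_max_with_gap(query_answers, epsilon, rng=None):
    rng = rng or np.random.default_rng()
    noisy = np.asarray(query_answers, dtype=float) + rng.laplace(
        scale=2.0 / epsilon, size=len(query_answers)  # illustrative scale
    )
    order = np.argsort(noisy)[::-1]
    winner, runner_up = order[0], order[1]
    gap = noisy[winner] - noisy[runner_up]
    return int(winner), float(gap)

print(noisy_max_with_gap([10, 7, 3, 9], epsilon=0.5))
```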

Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning

Title Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning
Authors Macheng Shen, Jonathan P How
Abstract We pose an active perception problem where an autonomous agent actively interacts with a second agent with potentially adversarial behaviors. Given the uncertainty in the intent of the other agent, the objective is to collect further evidence to help discriminate potential threats. The main technical challenges are the partial observability of the agent intent, the adversary modeling, and the corresponding uncertainty modeling. Note that an adversarial agent may act to mislead the autonomous agent by using a deceptive strategy learned from past experiences. We propose an approach that combines belief space planning, generative adversary modeling, and maximum entropy reinforcement learning to obtain a stochastic belief space policy. By accounting for various adversarial behaviors in the simulation framework and minimizing the predictability of the autonomous agent’s actions, the resulting policy is more robust to unmodeled adversarial strategies. This improved robustness is empirically shown against an adversary that adapts to and exploits the autonomous agent’s policy, compared with a standard Chance-Constrained Partially Observable Markov Decision Process robust approach.
Tasks
Published 2019-02-14
URL https://arxiv.org/abs/1902.05644v2
PDF https://arxiv.org/pdf/1902.05644v2.pdf
PWC https://paperswithcode.com/paper/active-perception-in-adversarial-scenarios
Repo
Framework
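
For reference, the standard maximum entropy RL objective that such stochastic policies optimize adds an entropy bonus to the expected return (a generic formulation, not necessarily the paper's exact objective):

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
\Big[\, r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big]
```

Here \alpha is a temperature that trades expected reward against policy entropy; higher entropy makes the agent's actions harder for an adversary to predict, which is the robustness argument made in the abstract.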

Translating SAR to Optical Images for Assisted Interpretation

Title Translating SAR to Optical Images for Assisted Interpretation
Authors Shilei Fu, Feng Xu, Ya-Qiu Jin
Abstract Despite the advantages of all-weather, all-day, high-resolution imaging, SAR remote sensing images are much less viewed and used by the general public because human vision is not adapted to microwave scattering phenomena. However, expert interpreters can be trained by comparing SAR and optical images side by side to learn the translation rules from SAR to optical. This paper attempts to develop machine intelligence that is trainable with large volumes of co-registered SAR and optical images to translate SAR images into optical versions for assisted SAR interpretation. A novel reciprocal GAN scheme is proposed for this translation task. It is trained and tested on both spaceborne GF-3 and airborne UAVSAR images. Comparisons and analyses are presented for datasets of different resolutions and polarizations. Results show that the proposed translation network works well under many scenarios, and it could potentially be used for assisted SAR interpretation.
Tasks
Published 2019-01-08
URL http://arxiv.org/abs/1901.03749v1
PDF http://arxiv.org/pdf/1901.03749v1.pdf
PWC https://paperswithcode.com/paper/translating-sar-to-optical-images-for
Repo
Framework

The Performance Envelope of Inverted Indexing on Modern Hardware

Title The Performance Envelope of Inverted Indexing on Modern Hardware
Authors Jimmy Lin, Lori Paniak, Gordon Boerke
Abstract This paper explores the performance envelope of “traditional” inverted indexing on modern hardware using the implementation in the open-source Lucene search library. We benchmark indexing throughput on a single high-end multi-core commodity server in a number of configurations varying the media of the source collection and target index, examining a network-attached store, a direct-attached disk array, and an SSD. Experiments show that the largest determinants of performance are the physical characteristics of the source and target media, and that physically isolating the two yields the highest indexing throughput. Results suggest that current indexing techniques have reached physical device limits, and that further algorithmic improvements in performance are unlikely without rethinking the inverted indexing pipeline in light of observed bottlenecks.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.11028v2
PDF https://arxiv.org/pdf/1910.11028v2.pdf
PWC https://paperswithcode.com/paper/the-performance-envelope-of-inverted-indexing
Repo
Framework
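
For readers unfamiliar with the data structure being benchmarked, a toy inverted index simply maps each term to the list of document ids that contain it (a textbook sketch, unrelated to Lucene's actual implementation):

```python
# Toy inverted index: term -> sorted postings list of document ids.
from collections import defaultdict
from typing import Dict, List

def build_inverted_index(docs: List[str]) -> Dict[str, List[int]]:
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

print(build_inverted_index(["the quick fox", "the lazy dog"]))
# {'the': [0, 1], 'quick': [0], 'fox': [0], 'lazy': [1], 'dog': [1]}
```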

A Comparative Analysis of Virtual Reality Head-Mounted Display Systems

Title A Comparative Analysis of Virtual Reality Head-Mounted Display Systems
Authors Arian Mehrfard, Javad Fotouhi, Giacomo Taylor, Tess Forster, Nassir Navab, Bernhard Fuerst
Abstract With recent advances in Virtual Reality (VR) technology, its deployment will dramatically increase in non-entertainment environments, such as professional education and training, manufacturing, service, or low-frequency/high-risk scenarios. Clinical education is an area that especially stands to benefit from VR technology due to its complexity, high cost, and difficult logistics. The effectiveness of deploying VR systems is subject to factors that are not necessarily considered for devices targeting the entertainment market. In this work, we systematically compare a wide range of VR Head-Mounted Display (HMD) technologies and designs by defining a new set of metrics that are 1) relevant to most generic VR solutions and 2) of paramount importance for VR-based education and training. We evaluated ten HMDs based on various criteria, including neck strain, heat development, and color accuracy. Other metrics, such as text readability, comfort, and contrast perception, were evaluated in a multi-user study on three selected HMDs, namely the Oculus Rift S, HTC Vive Pro, and Samsung Odyssey+. Results indicate that the HTC Vive Pro performs best with regard to comfort, display quality, and compatibility with glasses.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02913v1
PDF https://arxiv.org/pdf/1912.02913v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-analysis-of-virtual-reality
Repo
Framework

A Typedriven Vector Semantics for Ellipsis with Anaphora using Lambek Calculus with Limited Contraction

Title A Typedriven Vector Semantics for Ellipsis with Anaphora using Lambek Calculus with Limited Contraction
Authors Gijs Wijnholds, Mehrnoosh Sadrzadeh
Abstract We develop a vector space semantics for verb phrase ellipsis with anaphora using type-driven compositional distributional semantics based on the Lambek calculus with limited contraction (LCC) of Jäger (2006). Distributional semantics has a lot to say about the statistical, collocation-based meanings of content words, but provides little guidance on how to treat function words. Formal semantics, on the other hand, has powerful mechanisms for dealing with relative pronouns, coordinators, and the like. Type-driven compositional distributional semantics brings these two models together. We review previous compositional distributional models of relative pronouns, coordination, and a restricted account of ellipsis in the DisCoCat framework of Coecke et al. (2010, 2013). We show how DisCoCat cannot deal with general forms of ellipsis, which rely on copying of information, and develop a novel way of connecting typelogical grammar to distributional semantics by assigning vector-interpretable lambda terms to derivations of LCC in the style of Muskens & Sadrzadeh (2016). What follows is an account of (verb phrase) ellipsis in which word meanings can be copied: the meaning of a sentence is now a program with non-linear access to individual word embeddings. We present the theoretical setting, work out examples, and demonstrate our results on a toy distributional model motivated by data.
Tasks Word Embeddings
Published 2019-05-05
URL https://arxiv.org/abs/1905.01647v1
PDF https://arxiv.org/pdf/1905.01647v1.pdf
PWC https://paperswithcode.com/paper/a-typedriven-vector-semantics-for-ellipsis
Repo
Framework

Multi-Precision Quantized Neural Networks via Encoding Decomposition of -1 and +1

Title Multi-Precision Quantized Neural Networks via Encoding Decomposition of -1 and +1
Authors Qigong Sun, Fanhua Shang, Kang Yang, Xiufang Li, Yan Ren, Licheng Jiao
Abstract The training of deep neural networks (DNNs) requires intensive resources for both computation and storage. Thus, DNNs cannot be efficiently applied on mobile phones and embedded devices, which seriously limits their applicability in industrial applications. To address this issue, we propose a novel encoding scheme that uses {-1,+1} to decompose quantized neural networks (QNNs) into multi-branch binary networks, which can be efficiently implemented with bitwise operations (xnor and bitcount) to achieve model compression, computational acceleration, and resource saving. Based on our method, users can easily achieve arbitrary encoding precisions according to their requirements and hardware resources. The proposed mechanism is well suited to FPGAs and ASICs in terms of data storage and computation, which provides a feasible approach for smart chips. We validate the effectiveness of our method on both large-scale image classification tasks (e.g., ImageNet) and object detection tasks. In particular, our method with low-bit encoding can still achieve almost the same performance as its full-precision counterparts.
Tasks Image Classification, Model Compression, Object Detection
Published 2019-05-31
URL https://arxiv.org/abs/1905.13389v1
PDF https://arxiv.org/pdf/1905.13389v1.pdf
PWC https://paperswithcode.com/paper/multi-precision-quantized-neural-networks-via
Repo
Framework
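
The multi-branch binary decomposition rests on the standard identity for {-1,+1} vectors: once packed into bit words, their dot product equals 2·popcount(xnor(a, b)) − n. A small sketch of that primitive (not the paper's full encoding scheme):

```python
# xnor/bitcount dot product for {-1,+1} vectors packed into integers.
def binarize(values):
    """Pack a {-1,+1} vector into an int (1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def xnor_dot(a_word, b_word, n):
    xnor = ~(a_word ^ b_word) & ((1 << n) - 1)  # keep only the n valid bits
    return 2 * bin(xnor).count("1") - n

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(xnor_dot(binarize(a), binarize(b), 4))  # == sum(x * y) == 0
```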

Cursive Overlapped Character Segmentation: An Enhanced Approach

Title Cursive Overlapped Character Segmentation: An Enhanced Approach
Authors Amjad Rehman
Abstract Segmentation of highly slanted and horizontally overlapped characters is a challenging and still largely unexplored research area. Several techniques are reported in the state of the art, but they produce low accuracy when segmenting highly slanted characters and thus lower overall handwriting recognition precision. Accordingly, this paper presents a simple yet effective approach for character segmentation of such difficult slanted cursive words without using any slant correction technique. Rather, a new concept of a core-zone is introduced for segmenting such difficult slanted handwritten words. However, due to the inherent nature of cursive words, a few characters are over-segmented, and therefore a threshold is selected heuristically to overcome this problem. For fair comparison, difficult words are extracted from the IAM benchmark database. Experiments exhibit promising results and high speed.
Tasks
Published 2019-03-23
URL http://arxiv.org/abs/1904.00792v1
PDF http://arxiv.org/pdf/1904.00792v1.pdf
PWC https://paperswithcode.com/paper/cursive-overlapped-character-segmentation-an
Repo
Framework

Neural Graph Embedding Methods for Natural Language Processing

Title Neural Graph Embedding Methods for Natural Language Processing
Authors Shikhar Vashishth
Abstract Knowledge graphs (KGs) are structured representations of facts in a graph, where nodes represent entities and edges represent relationships between them. Recent research has resulted in the development of several large KGs. However, all of them tend to be sparse, with very few facts per entity. In the first part of the thesis, we propose three solutions to alleviate this problem: (1) KG canonicalization, i.e., identifying and merging duplicate entities in a KG, (2) relation extraction, which involves automating the process of extracting semantic relationships between entities from unstructured text, and (3) link prediction, which includes inferring missing facts based on the known facts in a KG. Traditional neural networks like CNNs and RNNs are constrained to handle Euclidean data. However, graphs are prominent in Natural Language Processing (NLP). Recently, Graph Convolutional Networks (GCNs) have been proposed to address this shortcoming and have been successfully applied to several problems. In the second part of the thesis, we utilize GCNs for the document timestamping problem and for learning word embeddings using the dependency context of a word instead of its sequential context. In the third part of the thesis, we address two limitations of existing GCN models: (1) the standard neighborhood aggregation scheme puts no constraints on the number of nodes that can influence the representation of a target node, leading to noisy representations of hub nodes that cover almost the entire graph within a few hops; (2) most existing GCN models are limited to undirected graphs, whereas a more general and pervasive class of graphs is relational graphs, where each edge has a label and direction associated with it. Existing approaches to handling such graphs suffer from over-parameterization and are restricted to learning representations of nodes only.
Tasks Graph Embedding, Knowledge Graphs, Learning Word Embeddings, Link Prediction, Relation Extraction, Word Embeddings
Published 2019-11-08
URL https://arxiv.org/abs/1911.03042v2
PDF https://arxiv.org/pdf/1911.03042v2.pdf
PWC https://paperswithcode.com/paper/neural-graph-embedding-methods-for-natural
Repo
Framework
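
To make the "standard neighborhood aggregation scheme" concrete, here is a single Kipf–Welling-style GCN layer: node features are averaged over each node's neighborhood (plus a self-loop) with symmetric normalization and then linearly transformed (a generic sketch, not the thesis's specific models):

```python
# One GCN layer: H' = relu(D^{-1/2} (A + I) D^{-1/2} H W)
import numpy as np

def gcn_layer(adj, features, weights):
    a_hat = adj + np.eye(adj.shape[0])                   # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    agg = d_inv_sqrt @ a_hat @ d_inv_sqrt @ features      # neighborhood average
    return np.maximum(agg @ weights, 0.0)                 # linear map + ReLU

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
feats = np.eye(3)
w = np.random.default_rng(0).normal(size=(3, 2))
print(gcn_layer(adj, feats, w).shape)  # (3, 2)
```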

Expert-Level Atari Imitation Learning from Demonstrations Only

Title Expert-Level Atari Imitation Learning from Demonstrations Only
Authors Xin-Qiang Cai, Yao-Xiang Ding, Yuan Jiang, Zhi-Hua Zhou
Abstract One of the key issues in imitation learning lies in making a policy learned from limited samples generalize well over the whole state-action space. This problem is much more severe in high-dimensional state environments, such as game playing with raw pixel inputs. In this situation, even state-of-the-art adversary-based imitation learning algorithms fail. Through theoretical and empirical studies, we find that the main cause lies in the failure to train a powerful discriminator that generates meaningful rewards in high-dimensional environments. Theoretical results are provided to suggest the necessity of dimensionality reduction. However, since preserving important discriminative information via feature transformation is a non-trivial task, a straightforward application of off-the-shelf methods cannot achieve desirable performance. To address the above issues, we propose HashReward, a novel imitation learning algorithm that utilizes the idea of supervised hashing to realize effective training of the discriminator. As far as we are aware, HashReward is the first pure imitation learning approach to achieve expert-comparable performance in Atari game environments with raw pixel inputs.
Tasks Dimensionality Reduction, Imitation Learning
Published 2019-09-09
URL https://arxiv.org/abs/1909.03773v1
PDF https://arxiv.org/pdf/1909.03773v1.pdf
PWC https://paperswithcode.com/paper/expert-level-atari-imitation-learning-from
Repo
Framework

SSSDET: Simple Short and Shallow Network for Resource Efficient Vehicle Detection in Aerial Scenes

Title SSSDET: Simple Short and Shallow Network for Resource Efficient Vehicle Detection in Aerial Scenes
Authors Murari Mandal, Manal Shah, Prashant Meena, Santosh Kumar Vipparthi
Abstract Detection of small-sized targets is of paramount importance in many aerial vision-based applications. The low-cost unmanned aerial vehicles (UAVs) commonly deployed for aerial scene analysis are highly resource-constrained in nature. In this paper, we propose a simple, short, and shallow network (SSSDet) to robustly detect and classify small-sized vehicles in aerial scenes. The proposed SSSDet is up to 4x faster, requires 4.4x fewer FLOPs, has 30x fewer parameters, requires 31x less memory, and provides better accuracy compared to existing state-of-the-art detectors. Thus, it is more suitable for hardware implementation in real-time applications. We also created a new airborne image dataset (ABD) by annotating 1396 new objects in 79 aerial images for our experiments. The effectiveness of the proposed method is validated on the existing VEDAI, DLR-3K, DOTA, and Combined datasets. SSSDet outperforms state-of-the-art detectors in terms of accuracy, speed, and compute and memory efficiency.
Tasks
Published 2019-08-31
URL https://arxiv.org/abs/1909.00292v1
PDF https://arxiv.org/pdf/1909.00292v1.pdf
PWC https://paperswithcode.com/paper/sssdet-simple-short-and-shallow-network-for
Repo
Framework

Second-order Non-local Attention Networks for Person Re-identification

Title Second-order Non-local Attention Networks for Person Re-identification
Authors Bryan Xia, Yuan Gong, Yizhe Zhang, Christian Poellabauer
Abstract Recent efforts have shown promising results for person re-identification by designing part-based architectures that allow a neural network to learn discriminative representations from semantically coherent parts. Some efforts use soft attention to reallocate distant outliers to their most similar parts, while others adjust part granularity to incorporate more distant positions for learning the relationships. Others seek to generalize part-based methods by introducing a dropout mechanism on consecutive regions of the feature map to enhance distant region relationships. However, only a few prior efforts model the distant or non-local positions of the feature map directly for the person re-ID task. In this paper, we propose a novel attention mechanism to directly model long-range relationships via second-order feature statistics. When combined with a generalized DropBlock module, our method performs on par with or better than state-of-the-art results on mainstream person re-identification datasets, including Market1501, CUHK03, and DukeMTMC-reID.
Tasks Person Re-Identification
Published 2019-08-31
URL https://arxiv.org/abs/1909.00295v1
PDF https://arxiv.org/pdf/1909.00295v1.pdf
PWC https://paperswithcode.com/paper/second-order-non-local-attention-networks-for
Repo
Framework

Collaborative Homomorphic Computation on Data Encrypted under Multiple Keys

Title Collaborative Homomorphic Computation on Data Encrypted under Multiple Keys
Authors Asma Aloufi, Peizhao Hu
Abstract Homomorphic encryption (HE) is a promising cryptographic technique for enabling secure collaborative machine learning in the cloud. However, support for homomorphic computation on ciphertexts under multiple keys is inefficient. Current solutions often require key setup before any computation or incur large ciphertext sizes (at best, growing linearly with the number of involved keys). In this paper, we propose a new approach that leverages threshold and multi-key HE to support computations on ciphertexts under different keys. Our approach removes the need for key setup between each client and the set of model owners. At the same time, it reduces the number of encrypted models to be offloaded to the cloud evaluator, and the ciphertext size, with a dimension reduction from (N+1)x2 to 2x2. We present the details of each step and discuss the complexity and security of our approach.
Tasks Dimensionality Reduction
Published 2019-11-11
URL https://arxiv.org/abs/1911.04101v1
PDF https://arxiv.org/pdf/1911.04101v1.pdf
PWC https://paperswithcode.com/paper/collaborative-homomorphic-computation-on-data
Repo
Framework