January 29, 2020


Paper Group ANR 602


Enabling Intuitive Human-Robot Teaming Using Augmented Reality and Gesture Control


Title	Enabling Intuitive Human-Robot Teaming Using Augmented Reality and Gesture Control
Authors	Jason M. Gregory, Christopher Reardon, Kevin Lee, Geoffrey White, Ki Ng, Caitlyn Sims
Abstract	Human-robot teaming offers great potential because of the opportunities to combine strengths of heterogeneous agents. However, one of the critical challenges in realizing an effective human-robot team is efficient information exchange - both from the human to the robot as well as from the robot to the human. In this work, we present and analyze an augmented reality-enabled, gesture-based system that supports intuitive human-robot teaming through improved information exchange. Our proposed system requires no external instrumentation aside from human-wearable devices and shows promise of real-world applicability for service-oriented missions. Additionally, we present preliminary results from a pilot study with human participants, and highlight lessons learned and open research questions that may help direct future development, fielding, and experimentation of autonomous HRI systems.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06415v1
PDF	https://arxiv.org/pdf/1909.06415v1.pdf
PWC	https://paperswithcode.com/paper/enabling-intuitive-human-robot-teaming-using
Repo
Framework

Domain-adversarial Network Alignment


Title	Domain-adversarial Network Alignment
Authors	Huiting Hong, Xin Li, Yuangang Pan, Ivor Tsang
Abstract	Network alignment is a critical task in a wide variety of fields. Many existing works leverage representation learning to accomplish this task without eliminating the domain representation bias induced by domain-dependent features, which yields inferior alignment performance. This paper proposes a unified deep architecture (DANA) to obtain a domain-invariant representation for network alignment via an adversarial domain classifier. Specifically, we employ graph convolutional networks to perform network embedding under the domain adversarial principle, given a small set of observed anchors. Then, the semi-supervised learning framework is optimized by simultaneously maximizing a posterior probability distribution of observed anchors and the loss of a domain classifier. We also develop a few variants of our model, such as direction-aware network alignment, weight-sharing for directed networks, and simplification of the parameter space. Experiments on three real-world social network datasets demonstrate that our proposed approaches achieve state-of-the-art alignment results.
Tasks	Network Embedding, Representation Learning
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05429v1
PDF	https://arxiv.org/pdf/1908.05429v1.pdf
PWC	https://paperswithcode.com/paper/domain-adversarial-network-alignment
Repo
Framework

Nefnir: A high accuracy lemmatizer for Icelandic


Title	Nefnir: A high accuracy lemmatizer for Icelandic
Authors	Svanhvít Lilja Ingólfsdóttir, Hrafn Loftsson, Jón Friðrik Daðason, Kristín Bjarnadóttir
Abstract	Lemmatization, finding the basic morphological form of a word in a corpus, is an important step in many natural language processing tasks when working with morphologically rich languages. We describe and evaluate Nefnir, a new open source lemmatizer for Icelandic. Nefnir uses suffix substitution rules, derived from a large morphological database, to lemmatize tagged text. Evaluation shows that for correctly tagged text, Nefnir obtains an accuracy of 99.55%, and for text tagged with a PoS tagger, the accuracy obtained is 96.88%.
Tasks	Lemmatization
Published	2019-07-27
URL	https://arxiv.org/abs/1907.11907v1
PDF	https://arxiv.org/pdf/1907.11907v1.pdf
PWC	https://paperswithcode.com/paper/nefnir-a-high-accuracy-lemmatizer-for
Repo
Framework
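
The suffix-substitution idea in the Nefnir abstract above can be illustrated with a small sketch. This is not Nefnir's code or rule set: the `RULES` table, the coarse tags `noun` and `verb`, and the function name `lemmatize` are all assumptions made for illustration, and the three rules are hand-written rather than derived from a morphological database.

```python
# Toy suffix-substitution lemmatizer for tagged text.
# RULES is a hand-written, hypothetical table; Nefnir derives its rules
# from a large morphological database and uses a richer tagset.
RULES = {
    # (PoS tag, word suffix) -> lemma suffix
    ("noun", "ar"): "ur",    # hestar   -> hestur
    ("noun", "inum"): "ur",  # hestinum -> hestur
    ("verb", "aði"): "a",    # talaði   -> tala
}


def lemmatize(word, tag):
    """Replace the longest suffix that has a rule for this tag."""
    for length in range(len(word), 0, -1):
        suffix = word[-length:]
        replacement = RULES.get((tag, suffix))
        if replacement is not None:
            return word[:-length] + replacement
    # No rule matched: fall back to the word form itself.
    return word


if __name__ == "__main__":
    for form, tag in [("hestar", "noun"), ("hestinum", "noun"), ("talaði", "verb")]:
        print(form, tag, lemmatize(form, tag))
```

Keying the rules on the tag as well as the suffix is what makes the lemmatizer depend on tagging quality, which is consistent with the gap the abstract reports between correctly tagged text and text tagged by a PoS tagger.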

Identifying Editor Roles in Argumentative Writing from Student Revision Histories


Title	Identifying Editor Roles in Argumentative Writing from Student Revision Histories
Authors	Tazin Afrin, Diane Litman
Abstract	We present a method for identifying editor roles from students’ revision behaviors during argumentative writing. We first develop a method for applying a topic modeling algorithm to identify a set of editor roles from a vocabulary capturing three aspects of student revision behaviors: operation, purpose, and position. We validate the identified roles by showing that modeling the editor roles that students take when revising a paper not only accounts for the variance in revision purposes in our data, but also relates to writing improvement.
Tasks
Published	2019-09-03
URL	https://arxiv.org/abs/1909.05308v1
PDF	https://arxiv.org/pdf/1909.05308v1.pdf
PWC	https://paperswithcode.com/paper/identifying-editor-roles-in-argumentative
Repo
Framework

Conditional Generative Adversarial Networks for Data Augmentation and Adaptation in Remotely Sensed Imagery


Title	Conditional Generative Adversarial Networks for Data Augmentation and Adaptation in Remotely Sensed Imagery
Authors	Jonathan Howe, Kyle Pula, Aaron A. Reite
Abstract	The difficulty in obtaining labeled data relevant to a given task is among the most common and well-known practical obstacles to applying deep learning techniques to new or even slightly modified domains. The data volumes required by the current generation of supervised learning algorithms typically far exceed what a human needs to learn and complete a given task. We investigate ways to expand a given labeled corpus of remote sensed imagery into a larger corpus using Generative Adversarial Networks (GANs). We then measure how these additional synthetic data affect supervised machine learning performance on an object detection task. Our data driven strategy is to train GANs to (1) generate synthetic segmentation masks and (2) generate plausible synthetic remote sensing imagery corresponding to these segmentation masks. Run sequentially, these GANs allow the generation of synthetic remote sensing imagery complete with segmentation labels. We apply this strategy to the data set from ISPRS’ 2D Semantic Labeling Contest - Potsdam, with a follow on vehicle detection task. We find that in scenarios with limited training data, augmenting the available data with such synthetically generated data can improve detector performance.
Tasks	Data Augmentation, Object Detection
Published	2019-08-10
URL	https://arxiv.org/abs/1908.03809v1
PDF	https://arxiv.org/pdf/1908.03809v1.pdf
PWC	https://paperswithcode.com/paper/conditional-generative-adversarial-networks-2
Repo
Framework

A Hierarchical Network for Diverse Trajectory Proposals


Title	A Hierarchical Network for Diverse Trajectory Proposals
Authors	Sriram N. N., Gourav Kumar, Abhay Singh, M. Siva Karthik, Saket Saurav, Brojeshwar Bhowmick, K. Madhava Krishna
Abstract	Autonomous explorative robots frequently encounter scenarios where multiple future trajectories can be pursued. Often these are cases with multiple paths around an obstacle or trajectory options towards various frontiers. Humans in such situations can inherently perceive and reason about the surrounding environment to identify several possibilities of either manoeuvring around the obstacles or moving towards various frontiers. In this work, we propose a two-stage Convolutional Neural Network architecture which mimics this ability by mapping the perceived surroundings to multiple trajectories that a robot can choose to traverse. The first stage is a Trajectory Proposal Network, which suggests diverse regions in the environment that can be occupied in the future. The second stage is a Trajectory Sampling Network, which provides a fine-grained trajectory over the regions proposed by the Trajectory Proposal Network. We evaluate our framework in diverse and complicated real-life settings. For the outdoor case, we use the KITTI dataset and our own outdoor driving dataset. In the indoor setting, we use an autonomous drone to navigate various scenarios and also a ground robot which can explore the environment using the trajectories proposed by our framework. Our experiments suggest that the framework is able to develop a semantic understanding of obstacles and open regions and to identify diverse trajectories that a robot can traverse. Our comparisons show the performance gain of the proposed architecture over a diverse set of methods against which it is compared.
Tasks
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03584v1
PDF	https://arxiv.org/pdf/1906.03584v1.pdf
PWC	https://paperswithcode.com/paper/a-hierarchical-network-for-diverse-trajectory
Repo
Framework

Machine translation considering context information using Encoder-Decoder model


Title	Machine translation considering context information using Encoder-Decoder model
Authors	Tetsuto Takano, Satoshi Yamane
Abstract	In machine translation, context information is an important factor, but a model that takes context information into account has not been proposed. This paper proposes a new model that integrates context information when translating. The model is based on the Encoder-Decoder model. When translating the current sentence, it integrates the output of the preceding encoder with the current encoder. The model can therefore consider context information, and its resulting score is higher than that of the existing model.
Tasks	Machine Translation
Published	2019-03-30
URL	http://arxiv.org/abs/1904.00160v1
PDF	http://arxiv.org/pdf/1904.00160v1.pdf
PWC	https://paperswithcode.com/paper/machine-translation-considering-context
Repo
Framework

A context-aware knowledge acquisition for planning applications using ontologies


Title	A context-aware knowledge acquisition for planning applications using ontologies
Authors	Mohannad Babli, Eva Onaindia
Abstract	Automated planning technology has developed significantly. However, designing a planning model that allows an automated agent to react intelligently to unexpected events in a real execution environment remains a challenge. This article describes a domain-independent approach that allows the agent to be context-aware of its execution environment and the task it performs, to acquire new information that is guaranteed to be related and, more importantly, manageable, and to integrate such information into its model through the use of ontologies and semantic operations in order to autonomously formulate new objectives, resulting in a more human-like behaviour for handling unexpected events in the context of opportunities.
Tasks
Published	2019-04-19
URL	http://arxiv.org/abs/1904.09845v1
PDF	http://arxiv.org/pdf/1904.09845v1.pdf
PWC	https://paperswithcode.com/paper/190409845
Repo
Framework

Ultrafast Photorealistic Style Transfer via Neural Architecture Search


Title	Ultrafast Photorealistic Style Transfer via Neural Architecture Search
Authors	Jie An, Haoyi Xiong, Jun Huan, Jiebo Luo
Abstract	The key challenge in photorealistic style transfer is that an algorithm should faithfully transfer the style of a reference photo to a content photo while the generated image should look like one captured by a camera. Although several photorealistic style transfer algorithms have been proposed, they need to rely on post- and/or pre-processing to make the generated images look photorealistic. If we disable the additional processing, these algorithms would fail to produce plausible photorealistic stylization in terms of detail preservation and photorealism. In this work, we propose an effective solution to these issues. Our method consists of a construction step (C-step) to build a photorealistic stylization network and a pruning step (P-step) for acceleration. In the C-step, we propose a dense auto-encoder named PhotoNet based on a carefully designed pre-analysis. PhotoNet integrates a feature aggregation module (BFA) and instance normalized skip links (INSL). To generate faithful stylization, we introduce multiple style transfer modules in the decoder and INSLs. PhotoNet significantly outperforms existing algorithms in terms of both efficiency and effectiveness. In the P-step, we adopt a neural architecture search method to accelerate PhotoNet. We propose an automatic network pruning framework in the manner of teacher-student learning for photorealistic stylization. The network architecture named PhotoNAS resulted from the search achieves significant acceleration over PhotoNet while keeping the stylization effects almost intact. We conduct extensive experiments on both image and video transfer. The results show that our method can produce favorable results while achieving 20-30 times acceleration in comparison with the existing state-of-the-art approaches. It is worth noting that the proposed algorithm accomplishes better performance without any pre- or post-processing.
Tasks	Network Pruning, Neural Architecture Search, Style Transfer
Published	2019-12-05
URL	https://arxiv.org/abs/1912.02398v1
PDF	https://arxiv.org/pdf/1912.02398v1.pdf
PWC	https://paperswithcode.com/paper/ultrafast-photorealistic-style-transfer-via
Repo
Framework

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data


Title	Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data
Authors	Kunal Dhawan, Ganji Sreeram, Kumar Priyadarshi, Rohit Sinha
Abstract	End-to-end (E2E) systems are fast replacing conventional systems in the domain of automatic speech recognition. As the target labels are learned directly from speech data, E2E systems need a bigger corpus for effective training. In the context of the code-switching task, E2E systems face two challenges: (i) the expansion of the target set due to the multiple languages involved, and (ii) the lack of a sufficiently large domain-specific corpus. To address these challenges, we propose an approach for reducing the number of target labels for reliable training of E2E systems on limited data. The efficacy of the proposed approach has been demonstrated on two prominent architectures, namely CTC-based and attention-based E2E networks. The experimental validations are performed on a recently created Hindi-English code-switching corpus. For contrast, the results for the full target set based E2E system and a hybrid DNN-HMM system are also reported.
Tasks	End-To-End Speech Recognition, Speech Recognition
Published	2019-07-15
URL	https://arxiv.org/abs/1907.08293v1
PDF	https://arxiv.org/pdf/1907.08293v1.pdf
PWC	https://paperswithcode.com/paper/investigating-target-set-reduction-for-end-to
Repo
Framework

Traffic Queue Length and Pressure Estimation for Road Networks with Geometric Deep Learning Algorithms


Title	Traffic Queue Length and Pressure Estimation for Road Networks with Geometric Deep Learning Algorithms
Authors	Simon F. G. Ehlers
Abstract	Due to urbanization and the increase of individual mobility, congestion and inefficient traffic management occur in most metropolitan areas around the world. Intelligent traffic control systems, which are highly necessary and able to reduce congestion, rely on measurements of traffic situations in urban road networks and freeways. Unfortunately, the instrumentation for accurate traffic measurement is expensive and not widely implemented. This thesis addresses this problem: relatively inexpensive and easy-to-install loop detectors are used by a geometric deep learning algorithm, which uses loop-detector data in the spatial context of a road network, to estimate queue length in front of signalized intersections, which can then be used for subsequent traffic control tasks. In the first part of this work, a conventional estimation method for queue length (which does not use machine learning techniques) based on second-by-second loop-detector data is implemented; it uses detected shockwaves in queues to estimate the length and point in time of the maximum queue. The method is later used as a reference but also as additional input information for the geometric deep learning approach. In the second part, the geometric deep learning algorithm is developed; it uses spatial correlations in the road network as well as temporal correlations in detector data time sequences, through new attention mechanisms, to overcome the limitations of conventional methods such as excess traffic demand, lane changing and stop-and-go traffic. For this purpose, it is necessary to abstract the topology of the road network as a graph. Both approaches are compared regarding their performance, reliability and limitations, and are validated using the traffic simulation software SUMO (Simulation of Urban MObility). Finally, the results are discussed in the conclusions and further investigations are suggested.
Tasks
Published	2019-05-09
URL	https://arxiv.org/abs/1905.03889v1
PDF	https://arxiv.org/pdf/1905.03889v1.pdf
PWC	https://paperswithcode.com/paper/traffic-queue-length-and-pressure-estimation
Repo
Framework

Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction


Title	Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction
Authors	Kosuke Nishida, Kyosuke Nishida, Masaaki Nagata, Atsushi Otsuka, Itsumi Saito, Hisako Asano, Junji Tomita
Abstract	Question answering (QA) using textual sources for purposes such as reading comprehension (RC) has attracted much attention. This study focuses on the task of explainable multi-hop QA, which requires the system to return the answer with evidence sentences by reasoning and gathering disjoint pieces of the reference texts. It proposes the Query Focused Extractor (QFE) model for evidence extraction and uses multi-task learning with the QA model. QFE is inspired by extractive summarization models; compared with the existing method, which extracts each evidence sentence independently, it sequentially extracts evidence sentences by using an RNN with an attention mechanism on the question sentence. It enables QFE to consider the dependency among the evidence sentences and cover important information in the question sentence. Experimental results show that QFE with a simple RC baseline model achieves a state-of-the-art evidence extraction score on HotpotQA. Although designed for RC, it also achieves a state-of-the-art evidence extraction score on FEVER, which is a recognizing textual entailment task on a large textual database.
Tasks	Answer Selection, Multi-Task Learning, Natural Language Inference, Question Answering, Reading Comprehension
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08511v2
PDF	https://arxiv.org/pdf/1905.08511v2.pdf
PWC	https://paperswithcode.com/paper/answering-while-summarizing-multi-task
Repo
Framework

SCRAM: Spatially Coherent Randomized Attention Maps


Title	SCRAM: Spatially Coherent Randomized Attention Maps
Authors	Dan A. Calian, Peter Roelants, Jacques Cali, Ben Carr, Krishna Dubba, John E. Reid, Dell Zhang
Abstract	Attention mechanisms and non-local mean operations in general are key ingredients in many state-of-the-art deep learning techniques. In particular, the Transformer model based on multi-head self-attention has recently achieved great success in natural language processing and computer vision. However, the vanilla algorithm computing the Transformer of an image with n pixels has O(n^2) complexity, which is often painfully slow and sometimes prohibitively expensive for large-scale image data. In this paper, we propose a fast randomized algorithm — SCRAM — that only requires O(n log(n)) time to produce an image attention map. Such a dramatic acceleration is attributed to our insight that attention maps on real-world images usually exhibit (1) spatial coherence and (2) sparse structure. The central idea of SCRAM is to employ PatchMatch, a randomized correspondence algorithm, to quickly pinpoint the most compatible key (argmax) for each query first, and then exploit that knowledge to design a sparse approximation to non-local mean operations. Using the argmax (mode) to dynamically construct the sparse approximation distinguishes our algorithm from all of the existing sparse approximate methods and makes it very efficient. Moreover, SCRAM is a broadly applicable approximation to any non-local mean layer in contrast to some other sparse approximations that can only approximate self-attention. Our preliminary experimental results suggest that SCRAM is indeed promising for speeding up or scaling up the computation of attention maps in the Transformer.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10308v1
PDF	https://arxiv.org/pdf/1905.10308v1.pdf
PWC	https://paperswithcode.com/paper/scram-spatially-coherent-randomized-attention
Repo
Framework
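
The contrast drawn in the SCRAM abstract above, between dense attention over all keys and a sparse approximation built around each query's best-matching key, can be sketched as follows. This is not the SCRAM algorithm: a brute-force argmax stands in for PatchMatch (so the sketch is still quadratic), the data is one-dimensional rather than an image, and the fixed window of `radius` positions around the argmax is an assumed sparsity pattern. All function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax_average(query, keys, values, indices):
    """Softmax-weighted mean of values[i] over the given key indices."""
    scores = [dot(query, keys[i]) for i in indices]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return sum(w * values[i] for w, i in zip(weights, indices)) / total


def dense_attention(queries, keys, values):
    """Every query attends to every key: O(n^2) score evaluations."""
    all_indices = range(len(keys))
    return [softmax_average(q, keys, values, all_indices) for q in queries]


def windowed_attention(queries, keys, values, radius=1):
    """Each query attends only to keys near its most compatible key."""
    out = []
    for q in queries:
        # Brute-force argmax: an assumption of this sketch. SCRAM uses
        # PatchMatch here to avoid scanning all keys.
        best = max(range(len(keys)), key=lambda i: dot(q, keys[i]))
        lo = max(0, best - radius)
        hi = min(len(keys), best + radius + 1)
        out.append(softmax_average(q, keys, values, range(lo, hi)))
    return out


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    values = [1.0, 2.0, 3.0]
    print(dense_attention(keys, keys, values))
    print(windowed_attention(keys, keys, values, radius=0))
```

With `radius=0` each output is simply the value at the argmax key, and with a radius that covers the whole sequence the windowed result coincides with dense attention, so the radius trades accuracy against the number of keys visited per query.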

A Framework for Depth Estimation and Relative Localization of Ground Robots using Computer Vision


Title	A Framework for Depth Estimation and Relative Localization of Ground Robots using Computer Vision
Authors	Romulo T. Rodrigues, Pedro Miraldo, Dimos V. Dimarogonas, A. Pedro Aguiar
Abstract	The 3D depth estimation and relative pose estimation problem within a decentralized architecture is a challenging problem that arises in missions that require coordination among multiple vision-controlled robots. The depth estimation problem aims at recovering the 3D information of the environment. The relative localization problem consists of estimating the relative pose between two robots, by sensing each other’s pose or sharing information about the perceived environment. Most solutions for these problems use a set of discrete data without taking into account the chronological order of the events. This paper builds on recent results on continuous estimation to propose a framework that estimates the depth and relative pose between two non-holonomic vehicles. The basic idea consists of estimating the depth of the points by explicitly considering the dynamics of the camera mounted on a ground robot, and feeding the estimates of 3D points observed by both cameras into a filter that computes the relative pose between the robots. We evaluate the convergence for a set of simulated scenarios and show experimental results validating the proposed framework.
Tasks	3D Depth Estimation, Depth Estimation, Pose Estimation
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00309v1
PDF	https://arxiv.org/pdf/1908.00309v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-depth-estimation-and-relative
Repo
Framework

A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks


Title	A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks
Authors	Owen Marschall, Kyunghyun Cho, Cristina Savin
Abstract	We present a framework for compactly summarizing many recent results in efficient and/or biologically plausible online training of recurrent neural networks (RNN). The framework organizes algorithms according to several criteria: (a) past vs. future facing, (b) tensor structure, (c) stochastic vs. deterministic, and (d) closed form vs. numerical. These axes reveal latent conceptual connections among several recent advances in online learning. Furthermore, we provide novel mathematical intuitions for their degree of success. Testing various algorithms on two synthetic tasks shows that performances cluster according to our criteria. Although a similar clustering is also observed for gradient alignment, alignment with exact methods does not alone explain ultimate performance, especially for stochastic algorithms. This suggests the need for better comparison metrics.
Tasks
Published	2019-07-05
URL	https://arxiv.org/abs/1907.02649v1
PDF	https://arxiv.org/pdf/1907.02649v1.pdf
PWC	https://paperswithcode.com/paper/a-unified-framework-of-online-learning
Repo
Framework