Paper Group ANR 602
Enabling Intuitive Human-Robot Teaming Using Augmented Reality and Gesture Control
Title | Enabling Intuitive Human-Robot Teaming Using Augmented Reality and Gesture Control |
Authors | Jason M. Gregory, Christopher Reardon, Kevin Lee, Geoffrey White, Ki Ng, Caitlyn Sims |
Abstract | Human-robot teaming offers great potential because of the opportunities to combine strengths of heterogeneous agents. However, one of the critical challenges in realizing an effective human-robot team is efficient information exchange - both from the human to the robot as well as from the robot to the human. In this work, we present and analyze an augmented reality-enabled, gesture-based system that supports intuitive human-robot teaming through improved information exchange. Our proposed system requires no external instrumentation aside from human-wearable devices and shows promise of real-world applicability for service-oriented missions. Additionally, we present preliminary results from a pilot study with human participants, and highlight lessons learned and open research questions that may help direct future development, fielding, and experimentation of autonomous HRI systems. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06415v1 |
https://arxiv.org/pdf/1909.06415v1.pdf | |
PWC | https://paperswithcode.com/paper/enabling-intuitive-human-robot-teaming-using |
Repo | |
Framework | |
Domain-adversarial Network Alignment
Title | Domain-adversarial Network Alignment |
Authors | Huiting Hong, Xin Li, Yuangang Pan, Ivor Tsang |
Abstract | Network alignment is a critical task in a wide variety of fields. Many existing works leverage representation learning to accomplish this task without eliminating the domain representation bias induced by domain-dependent features, which yields inferior alignment performance. This paper proposes a unified deep architecture (DANA) to obtain a domain-invariant representation for network alignment via an adversarial domain classifier. Specifically, we employ graph convolutional networks to perform network embedding under the domain adversarial principle, given a small set of observed anchors. Then, the semi-supervised learning framework is optimized by maximizing a posterior probability distribution of observed anchors and the loss of a domain classifier simultaneously. We also develop a few variants of our model, such as direction-aware network alignment, weight-sharing for directed networks, and simplification of the parameter space. Experiments on three real-world social network datasets demonstrate that our proposed approaches achieve state-of-the-art alignment results. |
Tasks | Network Embedding, Representation Learning |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05429v1 |
https://arxiv.org/pdf/1908.05429v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-adversarial-network-alignment |
Repo | |
Framework | |
Nefnir: A high accuracy lemmatizer for Icelandic
Title | Nefnir: A high accuracy lemmatizer for Icelandic |
Authors | Svanhvít Lilja Ingólfsdóttir, Hrafn Loftsson, Jón Friðrik Daðason, Kristín Bjarnadóttir |
Abstract | Lemmatization, finding the basic morphological form of a word in a corpus, is an important step in many natural language processing tasks when working with morphologically rich languages. We describe and evaluate Nefnir, a new open source lemmatizer for Icelandic. Nefnir uses suffix substitution rules, derived from a large morphological database, to lemmatize tagged text. Evaluation shows that for correctly tagged text, Nefnir obtains an accuracy of 99.55%, and for text tagged with a PoS tagger, the accuracy obtained is 96.88%. |
Tasks | Lemmatization |
Published | 2019-07-27 |
URL | https://arxiv.org/abs/1907.11907v1 |
https://arxiv.org/pdf/1907.11907v1.pdf | |
PWC | https://paperswithcode.com/paper/nefnir-a-high-accuracy-lemmatizer-for |
Repo | |
Framework | |
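The suffix-substitution idea described in the Nefnir abstract can be illustrated with a minimal sketch. The rule table, the tag names, and the `lemmatize` function below are hypothetical and hand-written for illustration; they are not taken from Nefnir or from its morphological database, and the longest-suffix-first matching is an assumption about how such rules might be applied.

```python
# Toy suffix-substitution lemmatizer.
# ASSUMPTION: the rules and tag names are hand-written illustrations,
# not rules derived by Nefnir from its morphological database.
RULES = {
    # tag -> {inflected suffix: lemma suffix}
    "noun": {"um": "ur", "ar": "ur", "i": "ur"},
}


def lemmatize(word, tag, rules=RULES):
    """Replace the longest matching suffix for the given PoS tag."""
    tag_rules = rules.get(tag, {})
    # Walk from the longest candidate suffix (the whole word) to the shortest.
    for start in range(len(word)):
        suffix = word[start:]
        if suffix in tag_rules:
            return word[:start] + tag_rules[suffix]
    # No rule applies: fall back to the surface form.
    return word


print(lemmatize("hestum", "noun"))  # dative plural of "hestur" (horse)
print(lemmatize("hestum", "verb"))  # wrong tag: no rule fires
```

Because the rules are looked up by tag, a wrong tag leaves the word unchanged in this sketch, which is one way to picture why the abstract reports lower accuracy on automatically tagged text than on correctly tagged text.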
Identifying Editor Roles in Argumentative Writing from Student Revision Histories
Title | Identifying Editor Roles in Argumentative Writing from Student Revision Histories |
Authors | Tazin Afrin, Diane Litman |
Abstract | We present a method for identifying editor roles from students’ revision behaviors during argumentative writing. We first develop a method for applying a topic modeling algorithm to identify a set of editor roles from a vocabulary capturing three aspects of student revision behaviors: operation, purpose, and position. We validate the identified roles by showing that modeling the editor roles that students take when revising a paper not only accounts for the variance in revision purposes in our data, but also relates to writing improvement. |
Tasks | |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.05308v1 |
https://arxiv.org/pdf/1909.05308v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-editor-roles-in-argumentative |
Repo | |
Framework | |
Conditional Generative Adversarial Networks for Data Augmentation and Adaptation in Remotely Sensed Imagery
Title | Conditional Generative Adversarial Networks for Data Augmentation and Adaptation in Remotely Sensed Imagery |
Authors | Jonathan Howe, Kyle Pula, Aaron A. Reite |
Abstract | The difficulty in obtaining labeled data relevant to a given task is among the most common and well-known practical obstacles to applying deep learning techniques to new or even slightly modified domains. The data volumes required by the current generation of supervised learning algorithms typically far exceed what a human needs to learn and complete a given task. We investigate ways to expand a given labeled corpus of remotely sensed imagery into a larger corpus using Generative Adversarial Networks (GANs). We then measure how these additional synthetic data affect supervised machine learning performance on an object detection task. Our data-driven strategy is to train GANs to (1) generate synthetic segmentation masks and (2) generate plausible synthetic remote sensing imagery corresponding to these segmentation masks. Run sequentially, these GANs allow the generation of synthetic remote sensing imagery complete with segmentation labels. We apply this strategy to the data set from ISPRS’ 2D Semantic Labeling Contest - Potsdam, with a follow-on vehicle detection task. We find that in scenarios with limited training data, augmenting the available data with such synthetically generated data can improve detector performance. |
Tasks | Data Augmentation, Object Detection |
Published | 2019-08-10 |
URL | https://arxiv.org/abs/1908.03809v1 |
https://arxiv.org/pdf/1908.03809v1.pdf | |
PWC | https://paperswithcode.com/paper/conditional-generative-adversarial-networks-2 |
Repo | |
Framework | |
A Hierarchical Network for Diverse Trajectory Proposals
Title | A Hierarchical Network for Diverse Trajectory Proposals |
Authors | Sriram N. N., Gourav Kumar, Abhay Singh, M. Siva Karthik, Saket Saurav, Brojeshwar Bhowmick, K. Madhava Krishna |
Abstract | Autonomous explorative robots frequently encounter scenarios where multiple future trajectories can be pursued. Often these are cases with multiple paths around an obstacle or trajectory options towards various frontiers. Humans in such situations can inherently perceive and reason about the surrounding environment to identify several possibilities of either manoeuvring around the obstacles or moving towards various frontiers. In this work, we propose a two-stage Convolutional Neural Network architecture which mimics such an ability to map the perceived surroundings to multiple trajectories that a robot can choose to traverse. The first stage is a Trajectory Proposal Network which suggests diverse regions in the environment which can be occupied in the future. The second stage is a Trajectory Sampling network which provides a fine-grained trajectory over the regions proposed by the Trajectory Proposal Network. We evaluate our framework in diverse and complicated real-life settings. For the outdoor case, we use the KITTI dataset and our own outdoor driving dataset. In the indoor setting, we use an autonomous drone to navigate various scenarios and also a ground robot which can explore the environment using the trajectories proposed by our framework. Our experiments suggest that the framework is able to develop a semantic understanding of the obstacles and open regions and to identify diverse trajectories that a robot can traverse. Our comparisons show the performance gain of the proposed architecture over a diverse set of methods against which it is compared. |
Tasks | |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03584v1 |
https://arxiv.org/pdf/1906.03584v1.pdf | |
PWC | https://paperswithcode.com/paper/a-hierarchical-network-for-diverse-trajectory |
Repo | |
Framework | |
Machine translation considering context information using Encoder-Decoder model
Title | Machine translation considering context information using Encoder-Decoder model |
Authors | Tetsuto Takano, Satoshi Yamane |
Abstract | In machine translation, context information is an important factor, but a model that takes context information into account has not been proposed. This paper proposes a new model that integrates context information into translation. The model is based on the Encoder-Decoder model. When translating the current sentence, it integrates the output of the preceding encoder with that of the current encoder. The model can thus consider context information, and its resulting score is higher than that of the existing model. |
Tasks | Machine Translation |
Published | 2019-03-30 |
URL | http://arxiv.org/abs/1904.00160v1 |
http://arxiv.org/pdf/1904.00160v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-translation-considering-context |
Repo | |
Framework | |
A context-aware knowledge acquisition for planning applications using ontologies
Title | A context-aware knowledge acquisition for planning applications using ontologies |
Authors | Mohannad Babli, Eva Onaindia |
Abstract | Automated planning technology has developed significantly. However, designing a planning model that allows an automated agent to react intelligently to unexpected events in a real execution environment remains a challenge. This article describes a domain-independent approach that allows the agent to be context-aware of its execution environment and the task it performs, to acquire new information that is guaranteed to be related and, more importantly, manageable, and to integrate such information into its model through the use of ontologies and semantic operations in order to autonomously formulate new objectives, resulting in more human-like behaviour for handling unexpected events in the context of opportunities. |
Tasks | |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09845v1 |
http://arxiv.org/pdf/1904.09845v1.pdf | |
PWC | https://paperswithcode.com/paper/190409845 |
Repo | |
Framework | |
Ultrafast Photorealistic Style Transfer via Neural Architecture Search
Title | Ultrafast Photorealistic Style Transfer via Neural Architecture Search |
Authors | Jie An, Haoyi Xiong, Jun Huan, Jiebo Luo |
Abstract | The key challenge in photorealistic style transfer is that an algorithm should faithfully transfer the style of a reference photo to a content photo while the generated image should look like one captured by a camera. Although several photorealistic style transfer algorithms have been proposed, they need to rely on post- and/or pre-processing to make the generated images look photorealistic. If we disable the additional processing, these algorithms would fail to produce plausible photorealistic stylization in terms of detail preservation and photorealism. In this work, we propose an effective solution to these issues. Our method consists of a construction step (C-step) to build a photorealistic stylization network and a pruning step (P-step) for acceleration. In the C-step, we propose a dense auto-encoder named PhotoNet based on a carefully designed pre-analysis. PhotoNet integrates a feature aggregation module (BFA) and instance normalized skip links (INSL). To generate faithful stylization, we introduce multiple style transfer modules in the decoder and INSLs. PhotoNet significantly outperforms existing algorithms in terms of both efficiency and effectiveness. In the P-step, we adopt a neural architecture search method to accelerate PhotoNet. We propose an automatic network pruning framework in the manner of teacher-student learning for photorealistic stylization. The network architecture named PhotoNAS resulted from the search achieves significant acceleration over PhotoNet while keeping the stylization effects almost intact. We conduct extensive experiments on both image and video transfer. The results show that our method can produce favorable results while achieving 20-30 times acceleration in comparison with the existing state-of-the-art approaches. It is worth noting that the proposed algorithm accomplishes better performance without any pre- or post-processing. |
Tasks | Network Pruning, Neural Architecture Search, Style Transfer |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02398v1 |
https://arxiv.org/pdf/1912.02398v1.pdf | |
PWC | https://paperswithcode.com/paper/ultrafast-photorealistic-style-transfer-via |
Repo | |
Framework | |
Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data
Title | Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data |
Authors | Kunal Dhawan, Ganji Sreeram, Kumar Priyadarshi, Rohit Sinha |
Abstract | End-to-end (E2E) systems are fast replacing the conventional systems in the domain of automatic speech recognition. As the target labels are learned directly from speech data, the E2E systems need a bigger corpus for effective training. In the context of the code-switching task, the E2E systems face two challenges: (i) the expansion of the target set due to the multiple languages involved, and (ii) the lack of a sufficiently large domain-specific corpus. To address these challenges, we propose an approach for reducing the number of target labels for reliable training of the E2E systems on limited data. The efficacy of the proposed approach has been demonstrated on two prominent architectures, namely CTC-based and attention-based E2E networks. The experimental validations are performed on a recently created Hindi-English code-switching corpus. For contrast, the results for the full-target-set-based E2E system and a hybrid DNN-HMM system are also reported. |
Tasks | End-To-End Speech Recognition, Speech Recognition |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.08293v1 |
https://arxiv.org/pdf/1907.08293v1.pdf | |
PWC | https://paperswithcode.com/paper/investigating-target-set-reduction-for-end-to |
Repo | |
Framework | |
Traffic Queue Length and Pressure Estimation for Road Networks with Geometric Deep Learning Algorithms
Title | Traffic Queue Length and Pressure Estimation for Road Networks with Geometric Deep Learning Algorithms |
Authors | Simon F. G. Ehlers |
Abstract | Due to urbanization and the increase in individual mobility, congestion and inefficient traffic management occur in most metropolitan areas around the world. Intelligent traffic control systems, which are able to reduce congestion, are highly necessary and rely on measurements of traffic situations in urban road networks and freeways. Unfortunately, the instrumentation for accurate traffic measurement is expensive and not widely implemented. This thesis addresses this problem: a geometric deep learning algorithm uses data from relatively inexpensive and easy-to-install loop detectors, in the spatial context of a road network, to estimate the queue length in front of signalized intersections, which can then be used for subsequent traffic control tasks. In the first part of this work, a conventional estimation method for queue length (which does not use machine learning techniques) based on second-by-second loop-detector data is implemented; it uses detected shockwaves in queues to estimate the length and point in time of the maximum queue. The method is later used as a reference but also as additional input information for the geometric deep learning approach. In the second part, the geometric deep learning algorithm is developed; it uses spatial correlations in the road network as well as temporal correlations in detector data time sequences, by means of new attention mechanisms, to overcome the limitations of conventional methods such as excess traffic demand, lane changing and stop-and-go traffic. This requires abstracting the topology of the road network as a graph. Both approaches are compared regarding their performance, reliability and limitations, and validated using the traffic simulation software SUMO (Simulation of Urban MObility). Finally, the results are discussed in the conclusions and further investigations are suggested. |
Tasks | |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03889v1 |
https://arxiv.org/pdf/1905.03889v1.pdf | |
PWC | https://paperswithcode.com/paper/traffic-queue-length-and-pressure-estimation |
Repo | |
Framework | |
Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction
Title | Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction |
Authors | Kosuke Nishida, Kyosuke Nishida, Masaaki Nagata, Atsushi Otsuka, Itsumi Saito, Hisako Asano, Junji Tomita |
Abstract | Question answering (QA) using textual sources for purposes such as reading comprehension (RC) has attracted much attention. This study focuses on the task of explainable multi-hop QA, which requires the system to return the answer with evidence sentences by reasoning and gathering disjoint pieces of the reference texts. It proposes the Query Focused Extractor (QFE) model for evidence extraction and uses multi-task learning with the QA model. QFE is inspired by extractive summarization models; compared with the existing method, which extracts each evidence sentence independently, it sequentially extracts evidence sentences by using an RNN with an attention mechanism on the question sentence. It enables QFE to consider the dependency among the evidence sentences and cover important information in the question sentence. Experimental results show that QFE with a simple RC baseline model achieves a state-of-the-art evidence extraction score on HotpotQA. Although designed for RC, it also achieves a state-of-the-art evidence extraction score on FEVER, which is a recognizing textual entailment task on a large textual database. |
Tasks | Answer Selection, Multi-Task Learning, Natural Language Inference, Question Answering, Reading Comprehension |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08511v2 |
https://arxiv.org/pdf/1905.08511v2.pdf | |
PWC | https://paperswithcode.com/paper/answering-while-summarizing-multi-task |
Repo | |
Framework | |
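The contrast the QFE abstract draws between independent and sequential evidence extraction can be pictured with a toy sketch. This is not the QFE model: the RNN and attention mechanism are replaced here by a plain word-overlap score, and the function names, question, and sentences are all hypothetical, chosen only to show why conditioning each pick on the previous ones can help cover the question.

```python
# Toy contrast between independent and sequential evidence extraction.
# ASSUMPTION: word overlap stands in for the learned RNN/attention scoring
# of QFE; this illustrates the idea only, not the model from the paper.

def tokens(text):
    return set(text.lower().split())


def extract_independent(question, sentences, k):
    """Score every sentence against the full question, keep the top k."""
    q = tokens(question)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: -len(q & tokens(sentences[i])))
    return sorted(ranked[:k])


def extract_sequential(question, sentences, k):
    """Pick one sentence at a time, scoring against the part of the
    question that the already chosen sentences have not covered."""
    remaining = tokens(question)
    chosen = []
    for _ in range(k):
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        best = max(candidates,
                   key=lambda i: len(remaining & tokens(sentences[i])))
        chosen.append(best)
        remaining -= tokens(sentences[best])
    return chosen


QUESTION = "where was the director of inception born"
SENTENCES = [
    "inception was made by the director christopher nolan",
    "the director of inception was nolan",
    "christopher nolan was born in london",
]

print(extract_independent(QUESTION, SENTENCES, 2))
print(extract_sequential(QUESTION, SENTENCES, 2))
```

In this toy case the independent scorer selects the two sentences about the director, which overlap heavily with each other, while the sequential scorer follows its first pick with the sentence that covers the still-unanswered part of the question.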
SCRAM: Spatially Coherent Randomized Attention Maps
Title | SCRAM: Spatially Coherent Randomized Attention Maps |
Authors | Dan A. Calian, Peter Roelants, Jacques Cali, Ben Carr, Krishna Dubba, John E. Reid, Dell Zhang |
Abstract | Attention mechanisms and non-local mean operations in general are key ingredients in many state-of-the-art deep learning techniques. In particular, the Transformer model based on multi-head self-attention has recently achieved great success in natural language processing and computer vision. However, the vanilla algorithm computing the Transformer of an image with n pixels has O(n^2) complexity, which is often painfully slow and sometimes prohibitively expensive for large-scale image data. In this paper, we propose a fast randomized algorithm — SCRAM — that only requires O(n log(n)) time to produce an image attention map. Such a dramatic acceleration is attributed to our insight that attention maps on real-world images usually exhibit (1) spatial coherence and (2) sparse structure. The central idea of SCRAM is to employ PatchMatch, a randomized correspondence algorithm, to quickly pinpoint the most compatible key (argmax) for each query first, and then exploit that knowledge to design a sparse approximation to non-local mean operations. Using the argmax (mode) to dynamically construct the sparse approximation distinguishes our algorithm from all of the existing sparse approximate methods and makes it very efficient. Moreover, SCRAM is a broadly applicable approximation to any non-local mean layer in contrast to some other sparse approximations that can only approximate self-attention. Our preliminary experimental results suggest that SCRAM is indeed promising for speeding up or scaling up the computation of attention maps in the Transformer. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10308v1 |
https://arxiv.org/pdf/1905.10308v1.pdf | |
PWC | https://paperswithcode.com/paper/scram-spatially-coherent-randomized-attention |
Repo | |
Framework | |
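To give a feel for the gap between the O(n^2) and O(n log(n)) costs quoted in the SCRAM abstract, the following sketch counts idealized operations for a hypothetical 256 x 256 image. The image size, the base-2 logarithm, and the omission of constant factors are assumptions made for illustration; asymptotic notation says nothing about actual runtimes.

```python
import math


def attention_cost(height, width):
    """Idealized operation counts for an image with n = height * width pixels.

    ASSUMPTION: constant factors are ignored and log base 2 is used;
    these counts are illustrative, not measured runtimes.
    """
    n = height * width
    dense = n * n             # every query attends to every key
    fast = n * math.log2(n)   # the n log(n) budget quoted in the abstract
    return n, dense, fast


n, dense, fast = attention_cost(256, 256)
print(n)             # 65536 pixels
print(dense)         # 4294967296 query-key pairs
print(int(fast))     # 1048576
print(dense / fast)  # 4096.0
```

Under these assumptions the dense computation touches about four thousand times more query-key pairs than an n log(n) budget allows at this image size, and the ratio grows as n / log(n) for larger images.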
A Framework for Depth Estimation and Relative Localization of Ground Robots using Computer Vision
Title | A Framework for Depth Estimation and Relative Localization of Ground Robots using Computer Vision |
Authors | Romulo T. Rodrigues, Pedro Miraldo, Dimos V. Dimarogonas, A. Pedro Aguiar |
Abstract | The problem of 3D depth estimation and relative pose estimation within a decentralized architecture is challenging and arises in missions that require coordination among multiple vision-controlled robots. The depth estimation problem aims at recovering the 3D information of the environment. The relative localization problem consists of estimating the relative pose between two robots, by sensing each other’s pose or sharing information about the perceived environment. Most solutions for these problems use a set of discrete data without taking into account the chronological order of the events. This paper builds on recent results on continuous estimation to propose a framework that estimates the depth and relative pose between two non-holonomic vehicles. The basic idea consists of estimating the depth of the points by explicitly considering the dynamics of the camera mounted on a ground robot, and feeding the estimates of 3D points observed by both cameras into a filter that computes the relative pose between the robots. We evaluate the convergence for a set of simulated scenarios and show experimental results validating the proposed framework. |
Tasks | 3D Depth Estimation, Depth Estimation, Pose Estimation |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00309v1 |
https://arxiv.org/pdf/1908.00309v1.pdf | |
PWC | https://paperswithcode.com/paper/a-framework-for-depth-estimation-and-relative |
Repo | |
Framework | |
A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks
Title | A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks |
Authors | Owen Marschall, Kyunghyun Cho, Cristina Savin |
Abstract | We present a framework for compactly summarizing many recent results in efficient and/or biologically plausible online training of recurrent neural networks (RNN). The framework organizes algorithms according to several criteria: (a) past vs. future facing, (b) tensor structure, (c) stochastic vs. deterministic, and (d) closed form vs. numerical. These axes reveal latent conceptual connections among several recent advances in online learning. Furthermore, we provide novel mathematical intuitions for their degree of success. Testing various algorithms on two synthetic tasks shows that performances cluster according to our criteria. Although a similar clustering is also observed for gradient alignment, alignment with exact methods does not alone explain ultimate performance, especially for stochastic algorithms. This suggests the need for better comparison metrics. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02649v1 |
https://arxiv.org/pdf/1907.02649v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-framework-of-online-learning |
Repo | |
Framework | |