October 17, 2019

Paper Group ANR 830

Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks


Title	Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks
Authors	Dan Xu, Elisa Ricci, Wanli Ouyang, Xiaogang Wang, Nicu Sebe
Abstract	Depth cues have proven very useful in various computer vision and robotic tasks. This paper addresses the problem of monocular depth estimation from a single still image. Inspired by the effectiveness of recent works on multi-scale convolutional neural networks (CNN), we propose a deep model which fuses complementary information derived from multiple CNN side outputs. Unlike previous methods that use concatenation or weighted average schemes, the integration is obtained by means of continuous Conditional Random Fields (CRFs). In particular, we propose two different variations, one based on a cascade of multiple CRFs, the other on a unified graphical model. By designing a novel CNN implementation of mean-field updates for continuous CRFs, we show that both proposed models can be regarded as sequential deep networks and that training can be performed end-to-end. Through an extensive experimental evaluation, we demonstrate the effectiveness of the proposed approach and establish new state of the art results for the monocular depth estimation task on three publicly available datasets, i.e. NYUD-V2, Make3D and KITTI.
Tasks	Depth Estimation, Monocular Depth Estimation
Published	2018-03-01
URL	http://arxiv.org/abs/1803.00891v1
PDF	http://arxiv.org/pdf/1803.00891v1.pdf
PWC	https://paperswithcode.com/paper/monocular-depth-estimation-using-multi-scale
Repo
Framework

Learning and Matching Multi-View Descriptors for Registration of Point Clouds


Title	Learning and Matching Multi-View Descriptors for Registration of Point Clouds
Authors	Lei Zhou, Siyu Zhu, Zixin Luo, Tianwei Shen, Runze Zhang, Mingmin Zhen, Tian Fang, Long Quan
Abstract	Critical to the registration of point clouds is the establishment of a set of accurate correspondences between points in 3D space. The correspondence problem is generally addressed by the design of discriminative 3D local descriptors on the one hand, and the development of robust matching strategies on the other hand. In this work, we first propose a multi-view local descriptor, which is learned from the images of multiple views, for the description of 3D keypoints. Then, we develop a robust matching approach that rejects outlier matches through efficient inference via belief propagation on the defined graphical model. We demonstrate that our approaches improve registration on public scanning and multi-view stereo datasets. Their superior performance is verified by extensive comparisons against a variety of descriptors and matching methods.
Tasks
Published	2018-07-16
URL	http://arxiv.org/abs/1807.05653v1
PDF	http://arxiv.org/pdf/1807.05653v1.pdf
PWC	https://paperswithcode.com/paper/learning-and-matching-multi-view-descriptors
Repo
Framework

On Improving Deep Reinforcement Learning for POMDPs


Title	On Improving Deep Reinforcement Learning for POMDPs
Authors	Pengfei Zhu, Xin Li, Pascal Poupart, Guanghui Miao
Abstract	Deep Reinforcement Learning (RL) recently emerged as one of the most competitive approaches for learning in sequential decision making problems with fully observable environments, e.g., computer Go. However, very little work has been done in deep RL to handle partially observable environments. We propose a new architecture called Action-specific Deep Recurrent Q-Network (ADRQN) to enhance learning performance in partially observable domains. Actions are encoded by a fully connected layer and coupled with a convolutional observation to form an action-observation pair. The time series of action-observation pairs are then integrated by an LSTM layer that learns latent states based on which a fully connected layer computes Q-values as in conventional Deep Q-Networks (DQNs). We demonstrate the effectiveness of our new architecture in several partially observable domains, including flickering Atari games.
Tasks	Atari Games, Decision Making, Time Series
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06309v2
PDF	http://arxiv.org/pdf/1804.06309v2.pdf
PWC	https://paperswithcode.com/paper/on-improving-deep-reinforcement-learning-for-1
Repo
Framework

R$^3$SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems


Title	R$^3$SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
Authors	Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Simon Walker, Philip H. S. Torr
Abstract	Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM’s techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power.
Tasks	Depth Estimation, Stereo Depth Estimation
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12988v1
PDF	http://arxiv.org/pdf/1810.12988v1.pdf
PWC	https://paperswithcode.com/paper/r3sgm-real-time-raster-respecting-semi-global
Repo
Framework
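
The abstract builds on Semi-Global Matching, so a small illustration of the standard SGM cost-aggregation recurrence along a single path may help. This is a minimal sketch of textbook SGM only, not of R$^3$SGM or MGM; the function names, the toy cost table and the penalty values `p1` and `p2` are assumptions made for illustration.

```python
def aggregate_scanline(costs, p1, p2):
    """Aggregate matching costs left to right along one scanline.

    costs[x][d] is the matching cost of pixel x at disparity d.
    p1 penalizes a disparity change of one level between neighbours,
    p2 penalizes any larger jump.
    """
    n_disp = len(costs[0])
    prev = list(costs[0])  # first pixel has no predecessor on this path
    out = [prev]
    for x in range(1, len(costs)):
        prev_min = min(prev)
        cur = []
        for d in range(n_disp):
            best = prev[d]                       # same disparity, no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)
            if d + 1 < n_disp:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)      # arbitrary jump
            # Subtracting prev_min keeps the values from growing along the path.
            cur.append(costs[x][d] + best - prev_min)
        out.append(cur)
        prev = cur
    return out


def winner_takes_all(aggregated):
    """Pick the lowest-cost disparity for every pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in aggregated]


# Toy cost table: 3 pixels, 3 candidate disparities (hypothetical values).
toy_costs = [[0, 5, 5], [5, 0, 5], [3, 4, 9]]
aggregated = aggregate_scanline(toy_costs, p1=2, p2=5)
disparities = winner_takes_all(aggregated)
```

Full SGM sums such path costs over several directions before the winner-takes-all step; the sketch keeps a single direction so the recurrence stays visible.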

Designing Artificial Cognitive Architectures: Brain Inspired or Biologically Inspired?


Title	Designing Artificial Cognitive Architectures: Brain Inspired or Biologically Inspired?
Authors	Emanuel Diamant
Abstract	Artificial Neural Networks (ANNs) were devised as a tool for Artificial Intelligence design implementations. However, it soon became obvious that they are unable to fulfill their duties. The fully autonomous way of ANNs working, precluded from any human intervention or supervision, deprived of any theoretical underpinning, leads to a strange state of affairs, when ANN designers cannot explain why and how they achieve their amazing and remarkable results. Therefore, contemporary Artificial Intelligence R&D looks more like a Modern Alchemy enterprise rather than a respected scientific or technological undertaking. On the other hand, modern biological science posits that intelligence can be distinguished not only in human brains. Intelligence today is considered as a fundamental property of each and every living being. Therefore, lower simplified forms of natural intelligence are more suitable for investigation and further replication in artificial cognitive architectures.
Tasks
Published	2018-12-12
URL	http://arxiv.org/abs/1812.04769v1
PDF	http://arxiv.org/pdf/1812.04769v1.pdf
PWC	https://paperswithcode.com/paper/designing-artificial-cognitive-architectures
Repo
Framework

Contextual Encoding for Translation Quality Estimation


Title	Contextual Encoding for Translation Quality Estimation
Authors	Junjie Hu, Wei-Cheng Chang, Yuexin Wu, Graham Neubig
Abstract	The task of word-level quality estimation (QE) consists of taking a source sentence and machine-generated translation, and predicting which words in the output are correct and which are wrong. In this paper, we propose a method to effectively encode the local and global contextual information for each target word using a three-part neural network approach. The first part uses an embedding layer to represent words and their part-of-speech tags in both languages. The second part leverages a one-dimensional convolution layer to integrate local context information for each target word. The third part applies a stack of feed-forward and recurrent neural networks to further encode the global context in the sentence before making the predictions. This model was submitted as the CMU entry to the WMT2018 shared task on QE, and achieves strong results, ranking first in three of the six tracks.
Tasks
Published	2018-09-01
URL	http://arxiv.org/abs/1809.00129v1
PDF	http://arxiv.org/pdf/1809.00129v1.pdf
PWC	https://paperswithcode.com/paper/contextual-encoding-for-translation-quality
Repo
Framework

Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari


Title	Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari
Authors	Patryk Chrabaszcz, Ilya Loshchilov, Frank Hutter
Abstract	Evolution Strategies (ES) have recently been demonstrated to be a viable alternative to reinforcement learning (RL) algorithms on a set of challenging deep RL problems, including Atari games and MuJoCo humanoid locomotion benchmarks. While the ES algorithms in that work belonged to the specialized class of natural evolution strategies (which resemble approximate gradient RL algorithms, such as REINFORCE), we demonstrate that even a very basic canonical ES algorithm can achieve the same or even better performance. This success of a basic ES algorithm suggests that the state-of-the-art can be advanced further by integrating the many advances made in the field of ES in the last decades. We also demonstrate qualitatively that ES algorithms have very different performance characteristics than traditional RL algorithms: on some games, they learn to exploit the environment and perform much better while on others they can get stuck in suboptimal local minima. Combining their strengths with those of traditional RL algorithms is therefore likely to lead to new advances in the state of the art.
Tasks	Atari Games
Published	2018-02-24
URL	http://arxiv.org/abs/1802.08842v1
PDF	http://arxiv.org/pdf/1802.08842v1.pdf
PWC	https://paperswithcode.com/paper/back-to-basics-benchmarking-canonical
Repo
Framework
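
To make the idea of a basic evolution strategy concrete, here is a minimal sketch of a (mu, lambda)-ES with fixed step size and log-rank recombination weights, run on a toy objective. It is not the authors' implementation: the objective, the hyperparameter values and all names below are assumptions chosen for illustration.

```python
import math
import random


def canonical_es(objective, theta, sigma=0.1, lam=20, mu=5,
                 iterations=200, seed=0):
    """Maximize `objective` with a simple (mu, lambda) evolution strategy."""
    rng = random.Random(seed)
    # Log-rank recombination weights, normalized to sum to 1,
    # so the best offspring contributes the most to the update.
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    weights = [w / total for w in raw]

    for _ in range(iterations):
        # Sample lambda Gaussian perturbations and score each offspring.
        scored = []
        for _ in range(lam):
            eps = [rng.gauss(0.0, 1.0) for _ in theta]
            candidate = [t + sigma * e for t, e in zip(theta, eps)]
            scored.append((objective(candidate), eps))
        # Keep the mu best offspring (higher score is better).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Move the parent along the weighted sum of the selected perturbations.
        for w, (_, eps) in zip(weights, scored[:mu]):
            theta = [t + sigma * w * e for t, e in zip(theta, eps)]
    return theta


def negative_sphere(x):
    """Toy objective with its maximum (zero) at the origin."""
    return -sum(v * v for v in x)


start = [1.0, -1.0, 0.5]
result = canonical_es(negative_sphere, start)
```

In a reinforcement learning setting, `objective` would be the episode return of a policy parameterized by `theta`; no gradient of the objective is needed, only its value.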

Recovery of Point Clouds on Surfaces: Application to Image Reconstruction


Title	Recovery of Point Clouds on Surfaces: Application to Image Reconstruction
Authors	Sunrita Poddar, Mathews Jacob
Abstract	We introduce a framework for the recovery of points on a smooth surface in high-dimensional space, with application to dynamic imaging. We assume the surface to be the zero-level set of a bandlimited function. We show that the exponential maps of the points on the surface satisfy annihilation relations, implying that they lie in a finite dimensional subspace. We rely on nuclear norm minimization of the maps to recover the points from noisy and undersampled measurements. Since this direct approach suffers from the curse of dimensionality, we introduce an iterative reweighted algorithm that uses the “kernel trick”. The resulting algorithm has similarities to iterative algorithms used in graph signal processing (GSP); this framework can be seen as a continuous domain alternative to discrete GSP theory. The use of the algorithm in recovering free breathing and ungated cardiac data shows the potential of this framework in practical applications.
Tasks	Image Reconstruction
Published	2018-01-03
URL	http://arxiv.org/abs/1801.00886v1
PDF	http://arxiv.org/pdf/1801.00886v1.pdf
PWC	https://paperswithcode.com/paper/recovery-of-point-clouds-on-surfaces
Repo
Framework

Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings


Title	Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings
Authors	Joewie J. Koh, Barton Rhodes
Abstract	Domain generation algorithms (DGAs) are frequently employed by malware to generate domains used for connecting to command-and-control (C2) servers. Recent work in DGA detection leveraged deep learning architectures like convolutional neural networks (CNNs) and character-level long short-term memory networks (LSTMs) to classify domains. However, these classifiers perform poorly with wordlist-based DGA families, which generate domains by pseudorandomly concatenating dictionary words. We propose a novel approach that combines context-sensitive word embeddings with a simple fully-connected classifier to perform classification of domains based on word-level information. The word embeddings were pre-trained on a large unrelated corpus and left frozen during the training on domain data. The resulting small number of trainable parameters enabled extremely short training durations, while the transfer of language knowledge stored in the representations allowed for high-performing models with small training datasets. We show that this architecture reliably outperformed existing techniques on wordlist-based DGA families with just 30 DGA training examples and achieved state-of-the-art performance with around 100 DGA training examples, all while requiring an order of magnitude less time to train compared to current techniques. Of special note is the technique’s performance on the matsnu DGA: the classifier attained an 89.5% detection rate with a 1:1,000 false positive rate (FPR) after training on only 30 examples of the DGA domains, and a 91.2% detection rate with a 1:10,000 FPR after 90 examples. Considering that some of these DGAs have wordlists of several hundred words, our results demonstrate that this technique does not rely on the classifier learning the DGA wordlists. Instead, the classifier is able to learn the semantic signatures of the wordlist-based DGA families.
Tasks	Word Embeddings
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08705v1
PDF	http://arxiv.org/pdf/1811.08705v1.pdf
PWC	https://paperswithcode.com/paper/inline-detection-of-domain-generation
Repo
Framework

Various Approaches to Aspect-based Sentiment Analysis


Title	Various Approaches to Aspect-based Sentiment Analysis
Authors	Amlaan Bhoi, Sandeep Joshi
Abstract	The problem of aspect-based sentiment analysis deals with classifying sentiments (negative, neutral, positive) for a given aspect in a sentence. A traditional sentiment classification task involves treating the entire sentence as a text document and classifying sentiments based on all the words. Suppose we have a sentence such as “the acceleration of this car is fast, but the reliability is horrible”. This can be a difficult sentence because it has two aspects with conflicting sentiments about the same entity. Considering machine learning techniques (or deep learning), how do we encode the information that we are interested in one aspect and its sentiment but not the other? Let us explore various pre-processing steps, features, and methods used to facilitate solving this task.
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2018-05-05
URL	http://arxiv.org/abs/1805.01984v1
PDF	http://arxiv.org/pdf/1805.01984v1.pdf
PWC	https://paperswithcode.com/paper/various-approaches-to-aspect-based-sentiment
Repo
Framework

Trust-Aware Decision Making for Human-Robot Collaboration: Model Learning and Planning


Title	Trust-Aware Decision Making for Human-Robot Collaboration: Model Learning and Planning
Authors	Min Chen, Stefanos Nikolaidis, Harold Soh, David Hsu, Siddhartha Srinivasa
Abstract	Trust in autonomy is essential for effective human-robot collaboration and user adoption of autonomous systems such as robot assistants. This paper introduces a computational model which integrates trust into robot decision-making. Specifically, we learn from data a partially observable Markov decision process (POMDP) with human trust as a latent variable. The trust-POMDP model provides a principled approach for the robot to (i) infer the trust of a human teammate through interaction, (ii) reason about the effect of its own actions on human trust, and (iii) choose actions that maximize team performance over the long term. We validated the model through human subject experiments on a table-clearing task in simulation (201 participants) and with a real robot (20 participants). In our studies, the robot builds human trust by manipulating low-risk objects first. Interestingly, the robot sometimes fails intentionally in order to modulate human trust and achieve the best team performance. These results show that the trust-POMDP calibrates trust to improve human-robot team performance over the long term. Further, they highlight that maximizing trust alone does not always lead to the best performance.
Tasks	Decision Making
Published	2018-01-12
URL	http://arxiv.org/abs/1801.04099v3
PDF	http://arxiv.org/pdf/1801.04099v3.pdf
PWC	https://paperswithcode.com/paper/trust-aware-decision-making-for-human-robot
Repo
Framework

NavigationNet: A Large-scale Interactive Indoor Navigation Dataset


Title	NavigationNet: A Large-scale Interactive Indoor Navigation Dataset
Authors	He Huang, Yujing Shen, Jiankai Sun, Cewu Lu
Abstract	Indoor navigation aims at performing navigation within buildings. In scenes such as homes and factories, most intelligent mobile devices require a routing functionality to guide themselves precisely through indoor scenes and complete various tasks in the service of humans. In most scenarios, we expect an intelligent device to be capable of navigating in unseen environments. Although several solutions have been proposed to deal with this issue, they usually require pre-installed beacons or a map pre-built with SLAM, which means that they are not capable of working in novel environments. To address this, we propose NavigationNet, a computer vision dataset and benchmark to allow the utilization of deep reinforcement learning on scene-understanding-based indoor navigation. We also propose and formalize several typical indoor routing problems that are suitable for deep reinforcement learning.
Tasks	Scene Understanding
Published	2018-08-25
URL	http://arxiv.org/abs/1808.08374v1
PDF	http://arxiv.org/pdf/1808.08374v1.pdf
PWC	https://paperswithcode.com/paper/navigationnet-a-large-scale-interactive
Repo
Framework

GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent


Title	GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent
Authors	Jeff Daily, Abhinav Vishnu, Charles Siegel, Thomas Warfel, Vinay Amatya
Abstract	In this paper, we present GossipGraD - a gossip communication protocol based Stochastic Gradient Descent (SGD) algorithm for scaling Deep Learning (DL) algorithms on large-scale systems. The salient features of GossipGraD are: 1) reduction in overall communication complexity from $\Theta(\log(p))$ for p compute nodes in well-studied SGD to O(1), 2) model diffusion such that compute nodes exchange their updates (gradients) indirectly after every log(p) steps, 3) rotation of communication partners for facilitating direct diffusion of gradients, 4) asynchronous distributed shuffle of samples during the feedforward phase in SGD to prevent over-fitting, 5) asynchronous communication of gradients for further reducing the communication cost of SGD and GossipGraD. We implement GossipGraD for GPU and CPU clusters and use NVIDIA GPUs (Pascal P100) connected with InfiniBand, and Intel Knights Landing (KNL) connected with Aries network. We evaluate GossipGraD using the well-studied ImageNet-1K dataset (~250GB), and widely studied neural network topologies such as GoogLeNet and ResNet50 (current winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC)). Our performance evaluation using both KNL and Pascal GPUs indicates that GossipGraD can achieve perfect efficiency for these datasets and their associated neural network topologies. Specifically, for ResNet50, GossipGraD is able to achieve ~100% compute efficiency using 128 NVIDIA Pascal P100 GPUs - while matching the top-1 classification accuracy published in literature.
Tasks
Published	2018-03-15
URL	http://arxiv.org/abs/1803.05880v1
PDF	http://arxiv.org/pdf/1803.05880v1.pdf
PWC	https://paperswithcode.com/paper/gossipgrad-scalable-deep-learning-using
Repo
Framework
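
The claim that each node talks to a single partner per step while updates still reach every node after log(p) steps can be illustrated with a small simulation. The sketch below assumes a hypercube-style pairing (node i pairs with i XOR 2^k at step k) and a power-of-two node count; the abstract does not specify the partner schedule, so this pairing and the function name are illustrative assumptions rather than the GossipGraD protocol itself.

```python
def gossip_step(values, step):
    """One gossip round: every node averages with exactly one partner.

    values[i] stands in for the model (or gradient) held by node i.
    The partner rotates with the step index, so over log2(p) rounds
    each node's contribution diffuses to all other nodes.
    """
    p = len(values)
    if p < 2 or p & (p - 1):
        raise ValueError("this sketch assumes a power-of-two node count >= 2")
    bits = p.bit_length() - 1          # log2(p)
    mask = 1 << (step % bits)          # which partner to use this round
    return [(values[i] + values[i ^ mask]) / 2.0 for i in range(p)]


# Eight nodes, each starting with a different scalar "model".
state = [float(i) for i in range(8)]
for step in range(3):                  # log2(8) = 3 rounds
    state = gossip_step(state, step)
```

Each round costs one pairwise exchange per node regardless of p, which is the O(1) per-step communication the abstract refers to; an all-reduce would instead involve on the order of log(p) exchanges per step.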

Aspect Term Extraction with History Attention and Selective Transformation


Title	Aspect Term Extraction with History Attention and Selective Transformation
Authors	Xin Li, Lidong Bing, Piji Li, Wai Lam, Zhimou Yang
Abstract	Aspect Term Extraction (ATE), a key sub-task in Aspect-Based Sentiment Analysis, aims to extract explicit aspect expressions from online user reviews. We present a new framework for tackling ATE. It can exploit two useful clues, namely opinion summary and aspect detection history. Opinion summary is distilled from the whole input sentence, conditioned on each current token for aspect prediction, and thus the tailor-made summary can help aspect prediction on this token. Another clue is the information of aspect detection history, and it is distilled from the previous aspect predictions so as to leverage the coordinate structure and tagging schema constraints to upgrade the aspect prediction. Experimental results over four benchmark datasets clearly demonstrate that our framework can outperform all state-of-the-art methods.
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2018-05-02
URL	http://arxiv.org/abs/1805.00760v1
PDF	http://arxiv.org/pdf/1805.00760v1.pdf
PWC	https://paperswithcode.com/paper/aspect-term-extraction-with-history-attention
Repo
Framework

Feudal Reinforcement Learning for Dialogue Management in Large Domains


Title	Feudal Reinforcement Learning for Dialogue Management in Large Domains
Authors	Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Stefan Ultes, Lina Rojas-Barahona, Bo-Hsiang Tseng, Milica Gašić
Abstract	Reinforcement learning (RL) is a promising approach to solve dialogue policy optimisation. Traditional RL algorithms, however, fail to scale to large domains due to the curse of dimensionality. We propose a novel Dialogue Management architecture, based on Feudal RL, which decomposes the decision into two steps; a first step where a master policy selects a subset of primitive actions, and a second step where a primitive action is chosen from the selected subset. The structural information included in the domain ontology is used to abstract the dialogue state space, taking the decisions at each step using different parts of the abstracted state. This, combined with an information sharing mechanism between slots, increases the scalability to large domains. We show that an implementation of this approach, based on Deep-Q Networks, significantly outperforms previous state of the art in several dialogue domains and environments, without the need of any additional reward signal.
Tasks	Dialogue Management
Published	2018-03-08
URL	http://arxiv.org/abs/1803.03232v1
PDF	http://arxiv.org/pdf/1803.03232v1.pdf
PWC	https://paperswithcode.com/paper/feudal-reinforcement-learning-for-dialogue
Repo
Framework
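
The two-step decision described in the abstract can be sketched in a few lines. This is only an illustration of the control flow, assuming greedy selection over already-computed Q-values; the subset names, action names and numbers are hypothetical, and the state abstraction and learning procedure from the paper are omitted.

```python
def feudal_act(master_q, sub_q):
    """Choose an action in two steps.

    master_q maps each action subset to the master policy's Q-value.
    sub_q maps each subset to the Q-values of its primitive actions.
    """
    # Step 1: the master policy selects a subset of primitive actions.
    subset = max(master_q, key=master_q.get)
    # Step 2: a primitive action is chosen from the selected subset only.
    actions = sub_q[subset]
    action = max(actions, key=actions.get)
    return subset, action


# Hypothetical Q-values for a single dialogue turn.
master_q = {"subset_a": 0.2, "subset_b": 0.7}
sub_q = {
    "subset_a": {"greet": 0.9, "goodbye": 0.1},
    "subset_b": {"request_area": 0.4, "confirm_area": 0.6},
}
choice = feudal_act(master_q, sub_q)
```

Note that `greet` has the highest primitive Q-value overall but is never considered, because the master policy has already restricted the choice to `subset_b`; this restriction is what shrinks the action space at each step.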