Paper Group ANR 624
WHY: Natural Explanations from a Robot Navigator. Learning from partial correction. Fast Stochastic Hierarchical Bayesian MAP for Tomographic Imaging. Attention-Set based Metric Learning for Video Face Recognition. Tracking the Best Expert in Non-stationary Stochastic Environments. Neural Network Memory Architectures for Autonomous Robot Navigation …
WHY: Natural Explanations from a Robot Navigator
Title | WHY: Natural Explanations from a Robot Navigator |
Authors | Raj Korpan, Susan L. Epstein, Anoop Aroor, Gil Dekel |
Abstract | Effective collaboration between a robot and a person requires natural communication. When a robot travels with a human companion, the robot should be able to explain its navigation behavior in natural language. This paper explains how a cognitively-based, autonomous robot navigation system produces informative, intuitive explanations for its decisions. Language generation here is based upon the robot’s commonsense, its qualitative reasoning, and its learned spatial model. This approach produces natural explanations in real time for a robot as it navigates in a large, complex indoor environment. |
Tasks | Robot Navigation, Text Generation |
Published | 2017-09-27 |
URL | http://arxiv.org/abs/1709.09741v1 |
http://arxiv.org/pdf/1709.09741v1.pdf | |
PWC | https://paperswithcode.com/paper/why-natural-explanations-from-a-robot |
Repo | |
Framework | |
Learning from partial correction
Title | Learning from partial correction |
Authors | Sanjoy Dasgupta, Michael Luby |
Abstract | We introduce a new model of interactive learning in which an expert examines the predictions of a learner and partially fixes them if they are wrong. Although this kind of feedback is not i.i.d., we show statistical generalization bounds on the quality of the learned model. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08076v4 |
http://arxiv.org/pdf/1705.08076v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-partial-correction |
Repo | |
Framework | |
Fast Stochastic Hierarchical Bayesian MAP for Tomographic Imaging
Title | Fast Stochastic Hierarchical Bayesian MAP for Tomographic Imaging |
Authors | John McKay, Raghu G. Raj, Vishal Monga |
Abstract | Any image recovery algorithm attempts to achieve the highest quality reconstruction in a timely manner. The former can be achieved in several ways, one of which is to incorporate Bayesian priors that exploit natural image tendencies to cue in on relevant phenomena. The Hierarchical Bayesian MAP (HB-MAP) is one such approach, known to produce compelling results albeit at a substantial computational cost. We provide further analysis and insight into what makes HB-MAP work. While retaining the proficient nature of HB-MAP’s Type-I estimation, we propose a stochastic approximation-based approach to Type-II estimation. The resulting algorithm, fast stochastic HB-MAP (fsHBMAP), requires dramatically fewer operations while retaining high reconstruction quality. We apply our fsHBMAP scheme to the problem of tomographic imaging and demonstrate that it furnishes promising results when compared to many competing methods. |
Tasks | |
Published | 2017-07-07 |
URL | http://arxiv.org/abs/1707.02336v1 |
http://arxiv.org/pdf/1707.02336v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-stochastic-hierarchical-bayesian-map-for |
Repo | |
Framework | |
Attention-Set based Metric Learning for Video Face Recognition
Title | Attention-Set based Metric Learning for Video Face Recognition |
Authors | Yibo Hu, Xiang Wu, Ran He |
Abstract | Face recognition has made great progress with the development of deep learning. However, video face recognition (VFR) remains a challenging task due to varying illumination, low resolution, pose variations and motion blur. Most existing CNN-based VFR methods obtain a feature vector from a single image and simply aggregate the features within a video, giving little consideration to the correlations among the face images of one video. In this paper, we propose a novel Attention-Set based Metric Learning (ASML) method to measure the statistical characteristics of image sets. It is a promising and generalized extension of Maximum Mean Discrepancy with memory attention weighting. First, we define an effective distance metric on image sets, which explicitly minimizes the intra-set distance and maximizes the inter-set distance simultaneously. Second, inspired by the Neural Turing Machine, a Memory Attention Weighting is proposed to adapt set-aware global contents. ASML is then naturally integrated into CNNs, resulting in an end-to-end learning scheme. Our method achieves state-of-the-art performance for video face recognition on three widely used benchmarks: YouTubeFace, YouTube Celebrities and Celebrity-1000. |
Tasks | Face Recognition, Metric Learning |
Published | 2017-04-12 |
URL | http://arxiv.org/abs/1704.03805v3 |
http://arxiv.org/pdf/1704.03805v3.pdf | |
PWC | https://paperswithcode.com/paper/attention-set-based-metric-learning-for-video |
Repo | |
Framework | |
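The set-to-set distance at the core of ASML is described as a generalized Maximum Mean Discrepancy with attention weighting. Below is a minimal sketch of a weighted, linear-kernel MMD between two sets of face features; the uniform default weights, feature dimension, and set sizes are illustrative assumptions, not the paper's implementation, where the weights come from a learned memory-attention module inside an end-to-end CNN.

```python
import numpy as np

def weighted_mmd(X, Y, wx=None, wy=None):
    """Linear-kernel MMD between two feature sets X (m, d) and Y (n, d).

    wx, wy are optional non-negative attention weights over set members;
    uniform weights recover the standard (biased) linear-kernel MMD.
    """
    m, n = X.shape[0], Y.shape[0]
    wx = np.full(m, 1.0 / m) if wx is None else wx / wx.sum()
    wy = np.full(n, 1.0 / n) if wy is None else wy / wy.sum()
    mu_x = wx @ X          # weighted mean embedding of set X
    mu_y = wy @ Y          # weighted mean embedding of set Y
    diff = mu_x - mu_y
    return float(diff @ diff)

# Toy usage: two "videos" represented as sets of 128-d face features.
rng = np.random.default_rng(0)
video_a = rng.normal(0.0, 1.0, size=(20, 128))
video_b = rng.normal(0.5, 1.0, size=(35, 128))
print(weighted_mmd(video_a, video_b))
```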
Tracking the Best Expert in Non-stationary Stochastic Environments
Title | Tracking the Best Expert in Non-stationary Stochastic Environments |
Authors | Chen-Yu Wei, Yi-Te Hong, Chi-Jen Lu |
Abstract | We study the dynamic regret of the multi-armed bandit and experts problems in non-stationary stochastic environments. We introduce a new parameter $\Lambda$, which measures the total statistical variance of the loss distributions over $T$ rounds of the process, and study how this quantity affects the regret. We investigate the interaction between $\Lambda$ and $\Gamma$, which counts the number of times the distributions change, as well as between $\Lambda$ and $V$, which measures how far the distributions deviate over time. One striking result is that even when $\Gamma$, $V$, and $\Lambda$ are all restricted to constants, the regret lower bound in the bandit setting still grows with $T$. The other highlight is that in the full-information setting, constant regret becomes achievable with constant $\Gamma$ and $\Lambda$, as the regret can be made independent of $T$, while with constant $V$ and $\Lambda$ the regret still has a $T^{1/3}$ dependency. We not only propose algorithms with upper bound guarantees, but prove matching lower bounds as well. |
Tasks | |
Published | 2017-12-02 |
URL | https://arxiv.org/abs/1712.00578v2 |
https://arxiv.org/pdf/1712.00578v2.pdf | |
PWC | https://paperswithcode.com/paper/tracking-the-best-expert-in-non-stationary |
Repo | |
Framework | |
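For context, the exponential-weights (Hedge) forecaster is the standard full-information baseline whose static regret the dynamic-regret analysis above refines. The sketch below implements only this classical baseline on a toy non-stationary loss stream, not the paper's algorithms; the learning rate and loss sequence are illustrative.

```python
import numpy as np

def hedge(losses, eta=0.5):
    """Exponential-weights forecaster over K experts.

    losses: (T, K) array of per-round expert losses in [0, 1].
    Returns the learner's cumulative expected loss.
    """
    T, K = losses.shape
    w = np.ones(K) / K
    total = 0.0
    for t in range(T):
        p = w / w.sum()                 # play the normalized weights
        total += p @ losses[t]          # expected loss this round
        w *= np.exp(-eta * losses[t])   # multiplicatively penalize experts
    return total

# Toy usage: the best expert switches halfway through (a non-stationary stream).
rng = np.random.default_rng(1)
L = rng.uniform(size=(200, 3))
L[:100, 0] *= 0.2   # expert 0 is best early on
L[100:, 2] *= 0.2   # expert 2 is best later
print(hedge(L))
```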
Neural Network Memory Architectures for Autonomous Robot Navigation
Title | Neural Network Memory Architectures for Autonomous Robot Navigation |
Authors | Steven W Chen, Nikolay Atanasov, Arbaaz Khan, Konstantinos Karydis, Daniel D. Lee, Vijay Kumar |
Abstract | This paper highlights the significance of including memory structures in neural networks when the latter are used to learn perception-action loops for autonomous robot navigation. Traditional navigation approaches rely on global maps of the environment to overcome cul-de-sacs and plan feasible motions. Yet, maintaining an accurate global map may be challenging in real-world settings. A possible way to mitigate this limitation is to use learning techniques that forgo hand-engineered map representations and infer appropriate control responses directly from sensed information. An important but unexplored aspect of such approaches is the effect of memory on their performance. This work is a first thorough study of memory structures for deep-neural-network-based robot navigation, and offers novel tools to train such networks from supervision and quantify their ability to generalize to unseen scenarios. We analyze the separation and generalization abilities of feedforward, long short-term memory, and differentiable neural computer networks. We introduce a new method to evaluate the generalization ability by estimating the VC-dimension of networks with a final linear readout layer. We validate that the VC estimates are good predictors of actual test performance. The reported method can be applied to deep learning problems beyond robotics. |
Tasks | Robot Navigation |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08049v1 |
http://arxiv.org/pdf/1705.08049v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-memory-architectures-for |
Repo | |
Framework | |
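A minimal sketch of the kind of memory-based perception-action loop studied here: an LSTM that maps a sequence of range-sensor readings to discrete motion commands and is trained from supervised demonstrations, ending in the final linear readout layer whose VC-dimension the paper estimates. The sensor dimension, network sizes, action set, and random stand-in data are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMNavigator(nn.Module):
    """Maps a sequence of range-sensor readings to motion commands."""
    def __init__(self, n_sensors=16, hidden=64, n_actions=3):
        super().__init__()
        self.lstm = nn.LSTM(n_sensors, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_actions)  # final linear readout layer

    def forward(self, obs):              # obs: (batch, time, n_sensors)
        h, _ = self.lstm(obs)
        return self.readout(h)           # per-step action logits

# Toy supervised training step on random data (stand-in for expert demonstrations).
model = LSTMNavigator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
obs = torch.randn(8, 50, 16)                       # 8 trajectories, 50 steps each
actions = torch.randint(0, 3, (8, 50))             # expert action labels
logits = model(obs)
loss = F.cross_entropy(logits.reshape(-1, 3), actions.reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
```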
A Robust Indoor Scene Recognition Method based on Sparse Representation
Title | A Robust Indoor Scene Recognition Method based on Sparse Representation |
Authors | Guilherme Nascimento, Camila Laranjeira, Vinicius Braz, Anisio Lacerda, Erickson R. Nascimento |
Abstract | In this paper, we present a robust method for scene recognition that leverages Convolutional Neural Network (CNN) features and a Sparse Coding setting to create a new representation of indoor scenes. Although CNNs have greatly benefited the fields of computer vision and pattern recognition, convolutional layers adjust their weights with a global approach, which can lose important local details such as objects and small structures. Our proposed scene representation relies on both global features, which mostly capture the environment’s structure, and local features, which are sparsely combined to capture characteristics of common objects of a given scene. This new representation is based on fragments of the scene and leverages features extracted by CNNs. The experimental evaluation shows that the resulting representation outperforms previous scene recognition methods on the Scene15 and MIT67 datasets and performs competitively on SUN397, while being highly robust to perturbations in the input image such as noise and occlusion. |
Tasks | Scene Recognition |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07555v1 |
http://arxiv.org/pdf/1708.07555v1.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-indoor-scene-recognition-method |
Repo | |
Framework | |
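The local half of such a representation, sparsely combining fragment features against a learned dictionary, can be sketched with scikit-learn's DictionaryLearning; the dictionary size, sparsity level, pooling choice, and random stand-in features below are assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in for CNN features extracted from scene fragments (rows = fragments).
rng = np.random.default_rng(0)
fragment_feats = rng.normal(size=(200, 64))

# Learn a dictionary and sparsely code each fragment (OMP with few nonzeros).
coder = DictionaryLearning(n_components=32, transform_algorithm="omp",
                           transform_n_nonzero_coefs=5, max_iter=100,
                           random_state=0)
codes = coder.fit(fragment_feats).transform(fragment_feats)

# A simple scene descriptor: pool the sparse codes over one scene's fragments
# and concatenate them with a global CNN feature of the whole image.
scene_local = np.abs(codes[:20]).max(axis=0)     # fragments of one scene
scene_global = rng.normal(size=(128,))           # stand-in global CNN feature
scene_descriptor = np.concatenate([scene_global, scene_local])
print(scene_descriptor.shape)
```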
Hierarchical Metric Learning for Optical Remote Sensing Scene Categorization
Title | Hierarchical Metric Learning for Optical Remote Sensing Scene Categorization |
Authors | Akashdeep Goel, Biplab Banerjee, Aleksandra Pizurica |
Abstract | We address the problem of scene classification from optical remote sensing (RS) images based on the paradigm of hierarchical metric learning. Ideally, supervised metric learning strategies learn a projection from the training data to the class label space that minimizes intra-class variance while maximizing inter-class separability. However, standard metric learning techniques do not incorporate class interaction information when learning the transformation matrix, which is often a bottleneck for fine-grained visual categories. As a remedy, we propose to organize the classes in a hierarchical fashion by exploring their visual similarities and subsequently learn separate distance metric transformations for the classes present at the non-leaf nodes of the tree. We employ an iterative max-margin clustering strategy to obtain the hierarchical organization of the classes. Experimental results on the large-scale NWPU-RESISC45 and the popular UC-Merced datasets demonstrate the efficacy of the proposed hierarchical metric learning based RS scene recognition strategy in comparison to standard approaches. |
Tasks | Metric Learning, Scene Classification, Scene Recognition |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01494v3 |
http://arxiv.org/pdf/1708.01494v3.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-metric-learning-for-optical |
Repo | |
Framework | |
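The first step, organizing classes into a hierarchy by visual similarity, can be illustrated by clustering per-class mean features. Note that the paper uses an iterative max-margin clustering, whereas the sketch below substitutes ordinary agglomerative (Ward) clustering on random stand-in features; a separate distance metric would then be learned at each non-leaf node of the resulting tree.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_classes, dim = 45, 256
class_means = rng.normal(size=(n_classes, dim))   # stand-in per-class mean features

# Agglomerative (Ward) clustering over class means gives a class tree.
tree = linkage(class_means, method="ward")

# Cut the tree into a small number of super-classes (non-leaf groupings);
# a separate metric transformation would be learned inside each group.
super_class = fcluster(tree, t=4, criterion="maxclust")
for g in range(1, 5):
    print("super-class", g, "->", np.where(super_class == g)[0])
```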
On the Selective and Invariant Representation of DCNN for High-Resolution Remote Sensing Image Recognition
Title | On the Selective and Invariant Representation of DCNN for High-Resolution Remote Sensing Image Recognition |
Authors | Jie Chen, Chao Yuan, Min Deng, Chao Tao, Jian Peng, Haifeng Li |
Abstract | Human vision possesses strong invariance in image recognition. The cognitive capability of a deep convolutional neural network (DCNN) approaches the human visual level because of its hierarchical coding directly from raw images. Owing to its superiority in feature representation, the DCNN has exhibited remarkable performance in scene recognition of high-resolution remote sensing (HRRS) images and in classification of hyper-spectral remote sensing images. In-depth investigation is still essential to understand why a DCNN can accurately identify diverse ground objects via its effective feature representation. Thus, we train the deep neural network AlexNet on our large-scale remote sensing image recognition benchmark. At the neuron level in each convolution layer, we analyze the general properties of the DCNN in HRRS image recognition using a framework of visual stimulation-characteristic response combined with feature coding-classification decoding. Specifically, we use histogram statistics, representational dissimilarity matrices, and class activation mapping to observe the selective and invariant representations of the DCNN in HRRS image recognition. We argue that selective and invariant representations play important roles in remote sensing image tasks such as classification, detection, and segmentation, and that they are significant for designing new DCNN-like models for analyzing and understanding remote sensing images. |
Tasks | Scene Recognition |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01420v1 |
http://arxiv.org/pdf/1708.01420v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-selective-and-invariant-representation |
Repo | |
Framework | |
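Of the analysis tools listed, class activation mapping is the most self-contained to sketch: for a network whose last convolutional layer is followed by global average pooling and a linear classifier, the map for class c is the class-c classifier weights applied across the channel dimension of the feature maps. The array shapes below are illustrative stand-ins, not the paper's actual AlexNet activations.

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """CAM for a GAP + linear-classifier network.

    feature_maps: (C, H, W) activations of the last conv layer.
    fc_weights:   (num_classes, C) weights of the final linear layer.
    Returns an (H, W) map highlighting the regions driving class_idx.
    """
    w = fc_weights[class_idx]                          # (C,)
    cam = np.tensordot(w, feature_maps, axes=(0, 0))   # weighted sum over channels
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)                    # normalize to [0, 1]

# Toy usage with random activations standing in for real conv-layer outputs.
rng = np.random.default_rng(0)
cam = class_activation_map(rng.normal(size=(256, 13, 13)),
                           rng.normal(size=(45, 256)), class_idx=7)
print(cam.shape)
```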
Generative Statistical Models with Self-Emergent Grammar of Chord Sequences
Title | Generative Statistical Models with Self-Emergent Grammar of Chord Sequences |
Authors | Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii |
Abstract | Generative statistical models of chord sequences play crucial roles in music processing. To capture syntactic similarities among certain chords (e.g. in the C major key, between G and G7 and between F and Dm), we study hidden Markov models and probabilistic context-free grammar models with latent variables that describe syntactic categories of chord symbols, together with unsupervised learning techniques for inducing the latent grammar from data. Surprisingly, we find that these models often outperform conventional Markov models in predictive power, and the self-emergent categories often correspond to traditional harmonic functions. This implies the need for chord categories in harmony models from the informatics perspective. |
Tasks | |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.02255v3 |
http://arxiv.org/pdf/1708.02255v3.pdf | |
PWC | https://paperswithcode.com/paper/generative-statistical-models-with-self |
Repo | |
Framework | |
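The hidden-Markov half of the study can be sketched by treating chord symbols as categorical emissions from latent syntactic categories and fitting the model unsupervised. The toy chord vocabulary, number of latent states, and use of hmmlearn's CategoricalHMM (available in hmmlearn 0.2.8+) are assumptions for illustration.

```python
import numpy as np
from hmmlearn.hmm import CategoricalHMM   # assumes hmmlearn >= 0.2.8

# Toy chord vocabulary and a few short chord sequences (integer-encoded).
vocab = ["C", "Dm", "Em", "F", "G", "G7", "Am"]
seqs = [[0, 3, 4, 0], [0, 6, 1, 4, 0], [3, 4, 5, 0]]

X = np.concatenate(seqs).reshape(-1, 1)          # stacked observations
lengths = [len(s) for s in seqs]                 # sequence boundaries

# Unsupervised learning of latent "syntactic categories" of chord symbols.
model = CategoricalHMM(n_components=3, n_iter=100, random_state=0)
model.fit(X, lengths)

# Most likely latent category for each chord in the first sequence.
states = model.predict(np.array(seqs[0]).reshape(-1, 1))
print([vocab[i] for i in seqs[0]], "->", states)
```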
The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal
Title | The Nearest Neighbor Information Estimator is Adaptively Near Minimax Rate-Optimal |
Authors | Jiantao Jiao, Weihao Gao, Yanjun Han |
Abstract | We analyze the Kozachenko–Leonenko (KL) nearest neighbor estimator for the differential entropy. We obtain the first uniform upper bound on its performance over Hölder balls on a torus without assuming any conditions on how close the density can be to zero. Accompanying a new minimax lower bound over the Hölder ball, we show that the KL estimator achieves the minimax rates up to logarithmic factors without cognizance of the smoothness parameter $s$ of the Hölder ball for $s\in (0,2]$ and arbitrary dimension $d$, rendering it the first estimator that provably satisfies this property. |
Tasks | |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08824v3 |
http://arxiv.org/pdf/1711.08824v3.pdf | |
PWC | https://paperswithcode.com/paper/the-nearest-neighbor-information-estimator-is |
Repo | |
Framework | |
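A minimal sketch of the Kozachenko–Leonenko estimator with $k = 1$: $\hat H = \psi(n) - \psi(1) + \log V_d + \frac{d}{n}\sum_i \log \varepsilon_i$, where $\varepsilon_i$ is the distance from sample $i$ to its nearest neighbor and $V_d$ is the volume of the $d$-dimensional Euclidean unit ball. The KD-tree and the Gaussian sanity check are implementation choices, not part of the paper.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(samples):
    """Kozachenko-Leonenko differential entropy estimate (k = 1, in nats)."""
    n, d = samples.shape
    # Distance from each point to its nearest neighbor (query returns self first).
    dist, _ = cKDTree(samples).query(samples, k=2)
    eps = dist[:, 1]
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)   # unit-ball volume
    return digamma(n) - digamma(1) + log_vd + d * np.mean(np.log(eps))

# Sanity check: the standard normal in d dimensions has entropy (d/2) log(2*pi*e).
rng = np.random.default_rng(0)
d = 3
x = rng.normal(size=(5000, d))
print(kl_entropy(x), 0.5 * d * np.log(2 * np.pi * np.e))
```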
Q-WordNet PPV: Simple, Robust and (almost) Unsupervised Generation of Polarity Lexicons for Multiple Languages
Title | Q-WordNet PPV: Simple, Robust and (almost) Unsupervised Generation of Polarity Lexicons for Multiple Languages |
Authors | Iñaki San Vicente, Rodrigo Agerri, German Rigau |
Abstract | This paper presents a simple, robust and (almost) unsupervised dictionary-based method, qwn-ppv (Q-WordNet as Personalized PageRanking Vector), to automatically generate polarity lexicons. We show that qwn-ppv outperforms other automatically generated lexicons in the four extrinsic evaluations presented here. It also shows very competitive and robust results with respect to manually annotated lexicons. The results suggest that no single lexicon is best for every task and dataset, and that the intrinsic evaluation of polarity lexicons is not a good performance indicator for a Sentiment Analysis task. The qwn-ppv method makes it easy to create quality polarity lexicons whenever no domain-based annotated corpora are available for a given language. |
Tasks | Sentiment Analysis |
Published | 2017-02-06 |
URL | http://arxiv.org/abs/1702.01711v1 |
http://arxiv.org/pdf/1702.01711v1.pdf | |
PWC | https://paperswithcode.com/paper/q-wordnet-ppv-simple-robust-and-almost |
Repo | |
Framework | |
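The propagation step behind qwn-ppv is a Personalized PageRank seeded from synsets of known polarity. The sketch below runs the positive-seed walk on a toy graph with networkx; the graph, the seed, and the damping factor are illustrative stand-ins for the real WordNet relations and seed lists, and the real method contrasts positive and negative walks to assign polarities.

```python
import networkx as nx

# Toy stand-in for a WordNet-like synset graph.
G = nx.Graph()
G.add_edges_from([("good", "nice"), ("nice", "pleasant"), ("good", "bad"),
                  ("bad", "awful"), ("awful", "terrible"), ("pleasant", "calm")])

# Seed the random walk from known positive synsets and rank every node.
personalization = {node: 0.0 for node in G}
personalization["good"] = 1.0
scores = nx.pagerank(G, alpha=0.85, personalization=personalization)

# Nodes most reachable from the positive seed get the highest scores.
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {score:.3f}")
```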
Network-Scale Traffic Modeling and Forecasting with Graphical Lasso and Neural Networks
Title | Network-Scale Traffic Modeling and Forecasting with Graphical Lasso and Neural Networks |
Authors | Shiliang Sun, Rongqing Huang, Ya Gao |
Abstract | Traffic flow forecasting, especially the short-term case, is an important topic in intelligent transportation systems (ITS). This paper studies network-scale modeling and forecasting of short-term traffic flows. Firstly, we propose the concepts of single-link and multi-link models of traffic flow forecasting. Secondly, we construct four prediction models by combining the two models with single-task learning and multi-task learning. The combination of the multi-link model and multi-task learning improves not only the experimental efficiency but also the prediction accuracy. Moreover, a new multi-link single-task approach that combines graphical lasso (GL) with a neural network (NN) is proposed. GL provides a general methodology for solving problems involving a large number of variables; using L1 regularization, it builds a sparse graphical model based on the sparse inverse covariance matrix. In addition, Gaussian process regression (GPR) is a classic regression algorithm in Bayesian machine learning. Although GPR has been widely studied, it has rarely been applied to traffic flow forecasting; in this paper, we apply GPR to traffic flow forecasting and show its potential. Through extensive experiments, we compare all of the proposed approaches and provide an overall assessment. |
Tasks | Multi-Task Learning |
Published | 2017-12-25 |
URL | http://arxiv.org/abs/1801.00711v1 |
http://arxiv.org/pdf/1801.00711v1.pdf | |
PWC | https://paperswithcode.com/paper/network-scale-traffic-modeling-and |
Repo | |
Framework | |
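A minimal sketch of the multi-link GL+NN idea: estimate a sparse inverse covariance over the links with the graphical lasso, use its nonzero pattern to choose which links feed a given link's predictor, and fit a small neural network on those inputs. The synthetic autoregressive flows, sparsity level, and network size below are illustrative assumptions, not the paper's data or configuration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_steps, n_links = 500, 6
A = 0.9 * np.eye(n_links)
A[0, 1] = A[1, 2] = 0.3               # some links influence their neighbors
flows = np.zeros((n_steps, n_links))
for t in range(1, n_steps):           # stand-in autoregressive traffic flows
    flows[t] = flows[t - 1] @ A.T + rng.normal(scale=0.5, size=n_links)

# Graphical lasso: the sparse inverse covariance tells us which links are related.
gl = GraphicalLasso(alpha=0.05).fit(flows)
target = 0
related = np.flatnonzero(np.abs(gl.precision_[target]) > 1e-3)

# Multi-link, single-task forecaster: predict the target link's next flow
# from the current flows of its related links with a small neural network.
X, y = flows[:-1][:, related], flows[1:, target]
nn = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, y)
print("related links:", related, " in-sample R^2:", round(nn.score(X, y), 3))
```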
A Tale of Two DRAGGNs: A Hybrid Approach for Interpreting Action-Oriented and Goal-Oriented Instructions
Title | A Tale of Two DRAGGNs: A Hybrid Approach for Interpreting Action-Oriented and Goal-Oriented Instructions |
Authors | Siddharth Karamcheti, Edward C. Williams, Dilip Arumugam, Mina Rhee, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex |
Abstract | Robots operating alongside humans in diverse, stochastic environments must be able to accurately interpret natural language commands. These instructions often fall into one of two categories: those that specify a goal condition or target state, and those that specify explicit actions, or how to perform a given task. Recent approaches have used reward functions as a semantic representation of goal-based commands, which allows for the use of a state-of-the-art planner to find a policy for the given task. However, these reward functions cannot be directly used to represent action-oriented commands. We introduce a new hybrid approach, the Deep Recurrent Action-Goal Grounding Network (DRAGGN), for task grounding and execution that handles natural language from either category as input, and generalizes to unseen environments. Our robot-simulation results demonstrate that a system successfully interpreting both goal-oriented and action-oriented task specifications brings us closer to robust natural language understanding for human-robot interaction. |
Tasks | |
Published | 2017-07-26 |
URL | http://arxiv.org/abs/1707.08668v1 |
http://arxiv.org/pdf/1707.08668v1.pdf | |
PWC | https://paperswithcode.com/paper/a-tale-of-two-draggns-a-hybrid-approach-for |
Repo | |
Framework | |
Sparse canonical correlation analysis
Title | Sparse canonical correlation analysis |
Authors | Xiaotong Suo, Victor Minden, Bradley Nelson, Robert Tibshirani, Michael Saunders |
Abstract | Canonical correlation analysis was proposed by Hotelling [6] and measures the linear relationship between two multidimensional variables. In the high-dimensional setting, classical canonical correlation analysis breaks down. We propose a sparse canonical correlation analysis that adds l1 constraints on the canonical vectors, and show how to solve it efficiently using the linearized alternating direction method of multipliers (ADMM) and using TFOCS as a black box. We illustrate this idea on simulated data. |
Tasks | |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10865v2 |
http://arxiv.org/pdf/1705.10865v2.pdf | |
PWC | https://paperswithcode.com/paper/sparse-canonical-correlation-analysis |
Repo | |
Framework | |
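A compact way to see the effect of the l1 constraints is an alternating soft-thresholding heuristic for one sparse canonical pair. This is a simplified stand-in for the paper's linearized ADMM and TFOCS solvers (and it treats the within-view covariances as identity); the simulated data and penalty levels are chosen only for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Alternating soft-thresholding heuristic for a sparse canonical pair.

    Treats the within-view covariances as identity (a common simplification)
    and alternately updates the canonical vectors u and v.
    """
    C = X.T @ Y / X.shape[0]            # cross-covariance between the two views
    u = np.ones(X.shape[1]) / np.sqrt(X.shape[1])
    v = np.ones(Y.shape[1]) / np.sqrt(Y.shape[1])
    for _ in range(n_iter):
        u = soft_threshold(C @ v, lam_u)
        u /= np.linalg.norm(u) + 1e-12
        v = soft_threshold(C.T @ u, lam_v)
        v /= np.linalg.norm(v) + 1e-12
    return u, v

# Simulated data: only the first 3 variables of each view are truly correlated.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.3 * rng.normal(size=(500, 3)), rng.normal(size=(500, 17))])
Y = np.hstack([z + 0.3 * rng.normal(size=(500, 3)), rng.normal(size=(500, 17))])
u, v = sparse_cca(X, Y, lam_u=0.05, lam_v=0.05)
print("nonzeros in u:", np.flatnonzero(np.abs(u) > 1e-6))
```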