Paper Group ANR 28
Creative Robot Dance with Variational Encoder
Title | Creative Robot Dance with Variational Encoder |
Authors | Agnese Augello, Emanuele Cipolla, Ignazio Infantino, Adriano Manfre, Giovanni Pilato, Filippo Vella |
Abstract | What we appreciate in dance is the ability of people to spontaneously improvise new movements and choreographies, surrendering to the music rhythm, being inspired by the current perceptions and sensations and by previous experiences, deeply stored in their memory. Like other human abilities, this, of course, is challenging to reproduce in an artificial entity such as a robot. Recent generations of anthropomorphic robots, the so-called humanoids, however, exhibit more and more sophisticated skills and raised the interest in robotic communities to design and experiment systems devoted to automatic dance generation. In this work, we highlight the importance to model a computational creativity behavior in dancing robots to avoid a mere execution of preprogrammed dances. In particular, we exploit a deep learning approach that allows a robot to generate in real time new dancing movements according to the listened music. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01489v1 |
http://arxiv.org/pdf/1707.01489v1.pdf | |
PWC | https://paperswithcode.com/paper/creative-robot-dance-with-variational-encoder |
Repo | |
Framework | |
MovieGraphs: Towards Understanding Human-Centric Situations from Videos
Title | MovieGraphs: Towards Understanding Human-Centric Situations from Videos |
Authors | Paul Vicol, Makarand Tapaswi, Lluis Castrejon, Sanja Fidler |
Abstract | There is growing interest in artificial intelligence to build socially intelligent robots. This requires machines to have the ability to “read” people’s emotions, motivations, and other factors that affect behavior. Towards this goal, we introduce a novel dataset called MovieGraphs which provides detailed, graph-based annotations of social situations depicted in movie clips. Each graph consists of several types of nodes, to capture who is present in the clip, their emotional and physical attributes, their relationships (i.e., parent/child), and the interactions between them. Most interactions are associated with topics that provide additional details, and reasons that give motivations for actions. In addition, most interactions and many attributes are grounded in the video with time stamps. We provide a thorough analysis of our dataset, showing interesting common-sense correlations between different social aspects of scenes, as well as across scenes over time. We propose a method for querying videos and text with graphs, and show that: 1) our graphs contain rich and sufficient information to summarize and localize each scene; and 2) subgraphs allow us to describe situations at an abstract level and retrieve multiple semantically relevant situations. We also propose methods for interaction understanding via ordering, and reason understanding. MovieGraphs is the first benchmark to focus on inferred properties of human-centric situations, and opens up an exciting avenue towards socially-intelligent AI agents. |
Tasks | Common Sense Reasoning |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.06761v2 |
http://arxiv.org/pdf/1712.06761v2.pdf | |
PWC | https://paperswithcode.com/paper/moviegraphs-towards-understanding-human |
Repo | |
Framework | |
Use of Generative Adversarial Network for Cross-Domain Change Detection
Title | Use of Generative Adversarial Network for Cross-Domain Change Detection |
Authors | Yamaguchi Kousuke, Tanaka Kanji, Sugimoto Takuma |
Abstract | This paper addresses the problem of cross-domain change detection from a novel perspective of image-to-image translation. In general, change detection aims to identify interesting changes between a given query image and a reference image of the same scene taken at a different time. This problem becomes a challenging one when query and reference images involve different domains (e.g., time of the day, weather, and season) due to variations in object appearance and a limited amount of training examples. In this study, we address the above issue by leveraging a generative adversarial network (GAN). Our key concept is to use a limited amount of training data to train a GAN-based image translator that maps a reference image to a virtual image that cannot be discriminated from query domain images. This enables us to treat the cross-domain change detection task as an in-domain image comparison. This allows us to leverage the large body of literature on in-domain generic change detectors. In addition, we also consider the use of visual place recognition as a method for mining more appropriate reference images over the space of virtual images. Experiments validate efficacy of the proposed approach. |
Tasks | Image-to-Image Translation, Visual Place Recognition |
Published | 2017-12-24 |
URL | http://arxiv.org/abs/1712.08868v1 |
http://arxiv.org/pdf/1712.08868v1.pdf | |
PWC | https://paperswithcode.com/paper/use-of-generative-adversarial-network-for |
Repo | |
Framework | |
Place recognition: An Overview of Vision Perspective
Title | Place recognition: An Overview of Vision Perspective |
Authors | Zhiqiang Zeng, Jian Zhang, Xiaodong Wang, Yuming Chen, Chaoyang Zhu |
Abstract | Place recognition is one of the most fundamental topics in computer vision and robotics communities, where the task is to accurately and efficiently recognize the location of a given query image. Despite years of wisdom accumulated in this field, place recognition still remains an open problem due to the various ways in which the appearance of real-world places may differ. This paper presents an overview of the place recognition literature. Since condition invariant and viewpoint invariant features are essential factors to a long-term robust visual place recognition system, we start with traditional image description methodology developed in the past, which exploits techniques from the image retrieval field. Recently, the rapid advances of related fields such as object detection and image classification have inspired a new technique to improve visual place recognition systems, i.e., convolutional neural networks (CNNs). Thus we then introduce recent progress of visual place recognition systems based on CNNs to automatically learn better image representations for places. Eventually, we close with discussions and future work of place recognition. |
Tasks | Image Classification, Image Retrieval, Object Detection, Visual Place Recognition |
Published | 2017-06-17 |
URL | http://arxiv.org/abs/1707.03470v2 |
http://arxiv.org/pdf/1707.03470v2.pdf | |
PWC | https://paperswithcode.com/paper/place-recognition-an-overview-of-vision |
Repo | |
Framework | |
Reinforcement Learning based Embodied Agents Modelling Human Users Through Interaction and Multi-Sensory Perception
Title | Reinforcement Learning based Embodied Agents Modelling Human Users Through Interaction and Multi-Sensory Perception |
Authors | Kory W. Mathewson, Patrick M. Pilarski |
Abstract | This paper extends recent work in interactive machine learning (IML) focused on effectively incorporating human feedback. We show how control and feedback signals complement each other in systems which model human reward. We demonstrate that simultaneously incorporating human control and feedback signals can improve interactive robotic systems’ performance on a self-mirrored movement control task where an RL-agent controlled right arm attempts to match the preprogrammed movement pattern of the left arm. We illustrate the impact of varying human feedback parameters on task performance by investigating the probability of giving feedback on each time step and the likelihood of given feedback being correct. We further illustrate that varying the temporal decay with which the agent incorporates human feedback has a significant impact on task performance. We found that smearing human feedback over time steps improves performance and we show varying the probability of feedback at each time step, and an increased likelihood of those feedbacks being ‘correct’ can impact agent performance. We conclude that understanding latent variables in human feedback is crucial for learning algorithms acting in human-machine interaction domains. |
Tasks | |
Published | 2017-01-09 |
URL | http://arxiv.org/abs/1701.02369v3 |
http://arxiv.org/pdf/1701.02369v3.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-based-embodied-agents |
Repo | |
Framework | |
Learning Deep NBNN Representations for Robust Place Categorization
Title | Learning Deep NBNN Representations for Robust Place Categorization |
Authors | Massimiliano Mancini, Samuel Rota Bulò, Elisa Ricci, Barbara Caputo |
Abstract | This paper presents an approach for semantic place categorization using data obtained from RGB cameras. Previous studies on visual place recognition and classification have shown that, by considering features derived from pre-trained Convolutional Neural Networks (CNNs) in combination with part-based classification models, high recognition accuracy can be achieved, even in presence of occlusions and severe viewpoint changes. Inspired by these works, we propose to exploit local deep representations, representing images as set of regions applying a Naïve Bayes Nearest Neighbor (NBNN) model for image classification. As opposed to previous methods where CNNs are merely used as feature extractors, our approach seamlessly integrates the NBNN model into a fully-convolutional neural network. Experimental results show that the proposed algorithm outperforms previous methods based on pre-trained CNN models and that, when employed in challenging robot place recognition tasks, it is robust to occlusions, environmental and sensor changes. |
Tasks | Image Classification, Visual Place Recognition |
Published | 2017-02-25 |
URL | http://arxiv.org/abs/1702.07898v2 |
http://arxiv.org/pdf/1702.07898v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-nbnn-representations-for-robust |
Repo | |
Framework | |
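The abstract above builds on the Naïve Bayes Nearest Neighbor (NBNN) decision rule. As a rough illustration only, the sketch below shows the classic, non-deep NBNN rule: each local descriptor of a query image is matched to its nearest descriptor in every class, and the class with the smallest summed distance wins. It is not the fully-convolutional model proposed in the paper; the function names, class labels, and 2-D descriptors are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nbnn_classify(query_descriptors, class_descriptors):
    """Return the label whose descriptor pool best explains the query.

    For every class, sum over the query descriptors the squared distance
    to the nearest descriptor in that class, then pick the smallest total.
    """
    totals = {}
    for label, pool in class_descriptors.items():
        totals[label] = sum(
            min(squared_distance(d, p) for p in pool)
            for d in query_descriptors
        )
    return min(totals, key=totals.get)


# Hypothetical 2-D local descriptors per place category (illustration only).
pools = {
    "kitchen": [(0.0, 0.0), (0.1, 0.2)],
    "corridor": [(1.0, 1.0), (0.9, 1.1)],
}
label = nbnn_classify([(0.05, 0.1), (0.2, 0.1)], pools)
```

In practice the descriptors would be high-dimensional region features, and the exhaustive nearest-neighbor search would likely be replaced by an approximate index.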
Thresholding Bandit for Dose-ranging: The Impact of Monotonicity
Title | Thresholding Bandit for Dose-ranging: The Impact of Monotonicity |
Authors | Aurélien Garivier, Pierre Ménard, Laurent Rossi |
Abstract | We analyze the sample complexity of the thresholding bandit problem, with and without the assumption that the mean values of the arms are increasing. In each case, we provide a lower bound valid for any risk $\delta$ and any $\delta$-correct algorithm; in addition, we propose an algorithm whose sample complexity is of the same order of magnitude for small risks. This work is motivated by phase 1 clinical trials, a practically important setting where the arm means are increasing by nature, and where no satisfactory solution is available so far. |
Tasks | |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.04454v2 |
http://arxiv.org/pdf/1711.04454v2.pdf | |
PWC | https://paperswithcode.com/paper/thresholding-bandit-for-dose-ranging-the |
Repo | |
Framework | |
A unified decision making framework for supply and demand management in microgrid networks
Title | A unified decision making framework for supply and demand management in microgrid networks |
Authors | Diddigi Raghuram Bharadwaj, Sai Koti Reddy Danda, Krishnasuri Narayanam, Shalabh Bhatnagar |
Abstract | This paper considers two important problems – on the supply-side and demand-side respectively and studies both in a unified framework. On the supply side, we study the problem of energy sharing among microgrids with the goal of maximizing profit obtained from selling power while at the same time not deviating much from the customer demand. On the other hand, under shortage of power, this problem becomes one of deciding the amount of power to be bought with dynamically varying prices. On the demand side, we consider the problem of optimally scheduling the time-adjustable demand - i.e., of loads with flexible time windows in which they can be scheduled. While previous works have treated these two problems in isolation, we combine these problems together and provide a unified Markov decision process (MDP) framework for these problems. We then apply the Q-learning algorithm, a popular model-free reinforcement learning technique, to obtain the optimal policy. Through simulations, we show that the policy obtained by solving our MDP model provides more profit to the microgrids. |
Tasks | Decision Making, Q-Learning |
Published | 2017-11-14 |
URL | https://arxiv.org/abs/1711.05078v2 |
https://arxiv.org/pdf/1711.05078v2.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-decision-making-framework-for |
Repo | |
Framework | |
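The microgrid abstract above applies Q-learning to an MDP. A minimal sketch of the standard tabular Q-learning update is given below, assuming a toy setting; the state names, actions, reward, and the values of `alpha` and `gamma` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical toy setting: a microgrid either sells or stores its surplus.
ACTIONS = ("sell", "store")
q_table = defaultdict(float)  # unseen state-action pairs start at 0.0
new_value = q_update(q_table, "surplus", "sell", reward=1.0,
                     next_state="balanced", actions=ACTIONS)
```

A full agent would wrap this update in an exploration loop (for example epsilon-greedy action selection) over many simulated time steps.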
A method of limiting performance loss of CNNs in noisy environments
Title | A method of limiting performance loss of CNNs in noisy environments |
Authors | James R. Geraci, Parichay Kapoor |
Abstract | Convolutional Neural Network (CNN) recognition rates drop in the presence of noise. We demonstrate a novel method of counteracting this drop in recognition rate by adjusting the biases of the neurons in the convolutional layers according to the noise conditions encountered at runtime. We compare our technique to training one network for all possible noise levels, dehazing via preprocessing a signal with a denoising autoencoder, and training a network specifically for each noise level. Our system compares favorably in terms of robustness, computational complexity and recognition rate. |
Tasks | Denoising |
Published | 2017-02-03 |
URL | http://arxiv.org/abs/1702.00932v1 |
http://arxiv.org/pdf/1702.00932v1.pdf | |
PWC | https://paperswithcode.com/paper/a-method-of-limiting-performance-loss-of-cnns |
Repo | |
Framework | |
Matching Media Contents with User Profiles by means of the Dempster-Shafer Theory
Title | Matching Media Contents with User Profiles by means of the Dempster-Shafer Theory |
Authors | Luigi Troiano, Irene Díaz, Ciro Gaglione |
Abstract | The media industry is increasingly personalizing its content offering in an attempt to better target the audience. This requires analyzing the relationships established between users and the content they enjoy, looking on one side at the content characteristics and on the other at the user profile, in order to find the best match between the two. In this paper we suggest building that relationship using the Dempster-Shafer Theory of Evidence, proposing a reference model and illustrating its properties by means of a toy example. Finally we suggest possible applications of the model for tasks that are common in the modern media industry. |
Tasks | |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.03048v1 |
http://arxiv.org/pdf/1704.03048v1.pdf | |
PWC | https://paperswithcode.com/paper/matching-media-contents-with-user-profiles-by |
Repo | |
Framework | |
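The abstract above relies on the Dempster-Shafer Theory of Evidence. As a hedged illustration, the sketch below implements Dempster's rule of combination for two mass functions over a small frame of discernment. The genres, the mass values, and the split into "content" and "profile" evidence are hypothetical and do not reproduce the paper's reference model.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            # Mass assigned to disjoint hypotheses counts as conflict.
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so that the remaining masses sum to one.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical genres and masses, for illustration only.
DRAMA, COMEDY = frozenset({"drama"}), frozenset({"comedy"})
EITHER = DRAMA | COMEDY
content_evidence = {DRAMA: 0.6, EITHER: 0.4}
profile_evidence = {COMEDY: 0.5, EITHER: 0.5}
fused = combine(content_evidence, profile_evidence)
```

Here the conflicting mass is 0.6 × 0.5 = 0.3, so the surviving products are divided by 0.7.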
A novel agent-based simulation framework for sensing in complex adaptive environments
Title | A novel agent-based simulation framework for sensing in complex adaptive environments |
Authors | Muaz A. Niazi, Amir Hussain |
Abstract | In this paper we present a novel Formal Agent-Based Simulation framework (FABS). FABS uses formal specification as a means of clear description of wireless sensor networks (WSN) sensing a Complex Adaptive Environment. This specification model is then used to develop an agent-based model of both the wireless sensor network as well as the environment. As proof of concept, we demonstrate the application of FABS to a boids model of self-organized flocking of animals monitored by a random deployment of proximity sensors. |
Tasks | |
Published | 2017-08-19 |
URL | http://arxiv.org/abs/1708.05875v1 |
http://arxiv.org/pdf/1708.05875v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-agent-based-simulation-framework-for |
Repo | |
Framework | |
Fair Inference On Outcomes
Title | Fair Inference On Outcomes |
Authors | Razieh Nabi, Ilya Shpitser |
Abstract | In this paper, we consider the problem of fair statistical inference involving outcome variables. Examples include classification and regression problems, and estimating treatment effects in randomized trials or observational data. The issue of fairness arises in such problems where some covariates or treatments are “sensitive,” in the sense of having potential of creating discrimination. In this paper, we argue that the presence of discrimination can be formalized in a sensible way as the presence of an effect of a sensitive covariate on the outcome along certain causal pathways, a view which generalizes (Pearl, 2009). A fair outcome model can then be learned by solving a constrained optimization problem. We discuss a number of complications that arise in classical statistical inference due to this view and provide workarounds based on recent work in causal and semi-parametric inference. |
Tasks | |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10378v4 |
http://arxiv.org/pdf/1705.10378v4.pdf | |
PWC | https://paperswithcode.com/paper/fair-inference-on-outcomes |
Repo | |
Framework | |
Curve-Structure Segmentation from Depth Maps: A CNN-based Approach and Its Application to Exploring Cultural Heritage Objects
Title | Curve-Structure Segmentation from Depth Maps: A CNN-based Approach and Its Application to Exploring Cultural Heritage Objects |
Authors | Yuhang Lu, Jun Zhou, Jing Wang, Jun Chen, Karen Smith, Colin Wilder, Song Wang |
Abstract | Motivated by the important archaeological application of exploring cultural heritage objects, in this paper we study the challenging problem of automatically segmenting curve structures that are very weakly stamped or carved on an object surface in the form of a highly noisy depth map. Different from most classical low-level image segmentation methods that are known to be very sensitive to the noise and occlusions, we propose a new supervised learning algorithm based on Convolutional Neural Network (CNN) to implicitly learn and utilize more curve geometry and pattern information for addressing this challenging problem. More specifically, we first propose a Fully Convolutional Network (FCN) to estimate the skeleton of curve structures and at each skeleton pixel, a scale value is estimated to reflect the local curve width. Then we propose a dense prediction network to refine the estimated curve skeletons. Based on the estimated scale values, we finally develop an adaptive thresholding algorithm to achieve the final segmentation of curve structures. In the experiment, we validate the performance of the proposed method on a dataset of depth images scanned from unearthed pottery sherds dating to the Woodland period of Southeastern North America. |
Tasks | Semantic Segmentation |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02718v2 |
http://arxiv.org/pdf/1711.02718v2.pdf | |
PWC | https://paperswithcode.com/paper/curve-structure-segmentation-from-depth-maps |
Repo | |
Framework | |
Characterizing Driving Context from Driver Behavior
Title | Characterizing Driving Context from Driver Behavior |
Authors | Sobhan Moosavi, Behrooz Omidvar-Tehrani, R. Bruce Craig, Arnab Nandi, Rajiv Ramnath |
Abstract | Because of the increasing availability of spatiotemporal data, a variety of data-analytic applications have become possible. Characterizing driving context, where context may be thought of as a combination of location and time, is a new challenging application. An example of such a characterization is finding the correlation between driving behavior and traffic conditions. This contextual information enables analysts to validate observation-based hypotheses about the driving of an individual. In this paper, we present DriveContext, a novel framework to find the characteristics of a context, by extracting significant driving patterns (e.g., a slow-down), and then identifying the set of potential causes behind patterns (e.g., traffic congestion). Our experimental results confirm the feasibility of the framework in identifying meaningful driving patterns, with improvements in comparison with the state-of-the-art. We also demonstrate how the framework derives interesting characteristics for different contexts, through real-world examples. |
Tasks | |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05733v2 |
http://arxiv.org/pdf/1710.05733v2.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-driving-context-from-driver |
Repo | |
Framework | |
Ensemble of Neural Classifiers for Scoring Knowledge Base Triples
Title | Ensemble of Neural Classifiers for Scoring Knowledge Base Triples |
Authors | Ikuya Yamada, Motoki Sato, Hiroyuki Shindo |
Abstract | This paper describes our approach for the triple scoring task at the WSDM Cup 2017. The task required participants to assign a relevance score for each pair of entities and their types in a knowledge base in order to enhance the ranking results in entity retrieval tasks. We propose an approach wherein the outputs of multiple neural network classifiers are combined using a supervised machine learning model. The experimental results showed that our proposed method achieved the best performance in one out of three measures (i.e., Kendall’s tau), and performed competitively in the other two measures (i.e., accuracy and average score difference). |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.04914v2 |
http://arxiv.org/pdf/1703.04914v2.pdf | |
PWC | https://paperswithcode.com/paper/ensemble-of-neural-classifiers-for-scoring |
Repo | |
Framework | |
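One of the measures named in the abstract above is Kendall's tau. The sketch below computes the simple tau-a variant, which counts concordant and discordant pairs and treats ties as neither; which variant the WSDM Cup evaluation used is not stated here, so this choice is an assumption, and the function name is hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in xs or ys; tau-a ignores it in the numerator.
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs
```

For example, `kendall_tau(predicted_scores, gold_scores)` would compare a system's relevance scores against reference scores, where both argument names are placeholders.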