Paper Group ANR 446
Recognizing Objects In-the-wild: Where Do We Stand?
Title | Recognizing Objects In-the-wild: Where Do We Stand? |
Authors | Mohammad Reza Loghmani, Barbara Caputo, Markus Vincze |
Abstract | The ability to recognize objects is an essential skill for a robotic system acting in human-populated environments. Despite decades of effort from the robotic and vision research communities, robots are still missing good visual perceptual systems, preventing the use of autonomous agents for real-world applications. The progress is slowed down by the lack of a testbed able to accurately represent the world perceived by the robot in-the-wild. In order to fill this gap, we introduce a large-scale, multi-view object dataset collected with an RGB-D camera mounted on a mobile robot. The dataset embeds the challenges faced by a robot in a real-life application and provides a useful tool for validating object recognition algorithms. Besides describing the characteristics of the dataset, the paper evaluates the performance of a collection of well-established deep convolutional networks on the new dataset and analyzes the transferability of deep representations from Web images to robotic data. Despite the promising results obtained with such representations, the experiments demonstrate that object classification with real-life robotic data is far from being solved. Finally, we provide a comparative study to analyze and highlight the open challenges in robot vision, explaining the discrepancies in the performance. |
Tasks | Object Classification, Object Recognition |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.05862v2 |
http://arxiv.org/pdf/1709.05862v2.pdf | |
PWC | https://paperswithcode.com/paper/recognizing-objects-in-the-wild-where-do-we |
Repo | |
Framework | |
A Study of Cross-domain Generative Models applied to Cartoon Series
Title | A Study of Cross-domain Generative Models applied to Cartoon Series |
Authors | Eman T. Hassan, David J. Crandall |
Abstract | We investigate Generative Adversarial Networks (GANs) to model one particular kind of image: frames from TV cartoons. Cartoons are particularly interesting because their visual appearance emphasizes the important semantic information about a scene while abstracting out the less important details, but each cartoon series has a distinctive artistic style that performs this abstraction in different ways. We consider a dataset consisting of images from two popular television cartoon series, Family Guy and The Simpsons. We examine the ability of GANs to generate images from each of these two domains, when trained independently as well as on both domains jointly. We find that generative models may be capable of finding semantic-level correspondences between these two image domains despite the unsupervised setting, even when the training data does not give labeled alignments between them. |
Tasks | |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1710.00755v1 |
http://arxiv.org/pdf/1710.00755v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-of-cross-domain-generative-models |
Repo | |
Framework | |
Controlling Linguistic Style Aspects in Neural Language Generation
Title | Controlling Linguistic Style Aspects in Neural Language Generation |
Authors | Jessica Ficler, Yoav Goldberg |
Abstract | Most work on neural natural language generation (NNLG) focuses on controlling the content of the generated text. We experiment with controlling several stylistic aspects of the generated text, in addition to its content. The method is based on a conditioned RNN language model, where the desired content as well as the stylistic parameters serve as conditioning contexts. We demonstrate the approach on the movie reviews domain and show that it is successful in generating coherent sentences corresponding to the required linguistic style and content. |
Tasks | Language Modelling, Text Generation |
Published | 2017-07-09 |
URL | http://arxiv.org/abs/1707.02633v1 |
http://arxiv.org/pdf/1707.02633v1.pdf | |
PWC | https://paperswithcode.com/paper/controlling-linguistic-style-aspects-in |
Repo | |
Framework | |
Detecting early signs of depressive and manic episodes in patients with bipolar disorder using the signature-based model
Title | Detecting early signs of depressive and manic episodes in patients with bipolar disorder using the signature-based model |
Authors | Andrey Kormilitzin, Kate E. A. Saunders, Paul J. Harrison, John R. Geddes, Terry Lyons |
Abstract | Recurrent major mood episodes and subsyndromal mood instability cause substantial disability in patients with bipolar disorder. Early identification of mood episodes enabling timely mood stabilisation is an important clinical goal. Recent technological advances allow the prospective reporting of mood in real time, enabling more accurate, efficient data capture. The complex nature of these data streams, in combination with the challenge of deriving meaning from missing data, poses a significant analytic challenge. The signature method is derived from stochastic analysis and has the ability to capture important properties of complex ordered time series data. The aim is to explore whether the onset of episodes of mania and depression can be identified using self-reported mood data. |
Tasks | Time Series |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.01206v1 |
http://arxiv.org/pdf/1708.01206v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-early-signs-of-depressive-and-manic |
Repo | |
Framework | |
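The abstract above names the signature method but does not say how signature features are computed or fed to a model. As a minimal sketch only, the first two signature levels of a piecewise-linear path can be accumulated segment by segment with Chen's identity. The function name `signature_level2` and the toy path are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def signature_level2(
    path: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Return signature levels 1 and 2 of a piecewise-linear path.

    Level 1 is the total increment per coordinate. Level 2 holds the
    iterated integrals S[i][j], built up one linear segment at a time.
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, cur in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        # Chen's identity: the cross term uses the level-1 signature of the
        # path so far; a single linear segment contributes delta_i*delta_j/2.
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1, s2


# Toy two-dimensional path (illustrative values, not study data).
level1, level2 = signature_level2([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)])
print(level1)  # [1.0, 1.0]
print(level2)  # [[0.5, 1.0], [0.0, 0.5]]
```

The antisymmetric part of level 2, here (1.0 - 0.0) / 2 = 0.5, is the signed area swept out by the path, which is one reason the order of observations matters to these features.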
Binary Voting with Delegable Proxy: An Analysis of Liquid Democracy
Title | Binary Voting with Delegable Proxy: An Analysis of Liquid Democracy |
Authors | Zoé Christoff, Davide Grossi |
Abstract | The paper provides an analysis of the voting method known as delegable proxy voting, or liquid democracy. The analysis first positions liquid democracy within the theory of binary aggregation. It then focuses on two issues of the system: the occurrence of delegation cycles; and the effect of delegations on individual rationality when voting on logically interdependent propositions. It finally points to proposals on how the system may be modified in order to address the above issues. |
Tasks | |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08741v1 |
http://arxiv.org/pdf/1707.08741v1.pdf | |
PWC | https://paperswithcode.com/paper/binary-voting-with-delegable-proxy-an |
Repo | |
Framework | |
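The delegation-cycle issue raised in the abstract above can be illustrated with a small sketch. It assumes a single binary proposal, at most one delegate per voter, and treats any voter whose delegation chain loops or dead-ends as abstaining. The names `resolve_votes` and `majority` are hypothetical, and the paper's own formal model in binary aggregation is not reproduced here.

```python
from typing import Dict, Optional


def resolve_votes(
    direct: Dict[str, Optional[bool]], delegate: Dict[str, str]
) -> Dict[str, Optional[bool]]:
    """Follow each voter's delegation chain until a direct vote is found.

    `direct` maps every voter to True, False, or None (no direct vote).
    `delegate` maps a voter to the proxy they chose. A voter whose chain
    revisits someone (a cycle) or ends without a vote resolves to None.
    """
    resolved: Dict[str, Optional[bool]] = {}
    for voter in direct:
        seen = set()
        current = voter
        while (
            direct.get(current) is None
            and current in delegate
            and current not in seen
        ):
            seen.add(current)
            current = delegate[current]
        resolved[voter] = direct.get(current)
    return resolved


def majority(resolved: Dict[str, Optional[bool]]) -> Optional[bool]:
    """Simple majority over resolved votes; None signals a tie."""
    yes = sum(1 for v in resolved.values() if v is True)
    no = sum(1 for v in resolved.values() if v is False)
    return None if yes == no else yes > no


direct = {"ann": True, "bob": None, "cy": None, "dee": None, "eve": False}
delegate = {"bob": "ann", "cy": "dee", "dee": "cy"}  # cy and dee form a cycle
votes = resolve_votes(direct, delegate)
print(votes["bob"], votes["cy"], majority(votes))  # True None True
```

Here the two voters caught in the cycle lose their influence entirely, which is the kind of outcome the abstract flags as a problem for the system.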
Coordinated Multi-Agent Imitation Learning
Title | Coordinated Multi-Agent Imitation Learning |
Authors | Hoang M. Le, Yisong Yue, Peter Carr, Patrick Lucey |
Abstract | We study the problem of imitation learning from demonstrations of multiple coordinating agents. One key challenge in this setting is that learning a good model of coordination can be difficult, since coordination is often implicit in the demonstrations and must be inferred as a latent variable. We propose a joint approach that simultaneously learns a latent coordination model along with the individual policies. In particular, our method integrates unsupervised structure learning with conventional imitation learning. We illustrate the power of our approach on a difficult problem of learning multiple policies for fine-grained behavior modeling in team sports, where different players occupy different roles in the coordinated team strategy. We show that having a coordination model to infer the roles of players yields substantially improved imitation loss compared to conventional baselines. |
Tasks | Imitation Learning |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03121v2 |
http://arxiv.org/pdf/1703.03121v2.pdf | |
PWC | https://paperswithcode.com/paper/coordinated-multi-agent-imitation-learning |
Repo | |
Framework | |
ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
Title | ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models |
Authors | Minsuk Kahng, Pierre Y. Andrews, Aditya Kalro, Duen Horng Chau |
Abstract | While deep learning models have achieved state-of-the-art accuracies for many prediction tasks, understanding these models remains a challenge. Despite the recent interest in developing visual tools to help users interpret deep learning models, the complexity and wide variety of models deployed in industry, and the large-scale datasets that they use, pose unique design challenges that are inadequately addressed by existing work. Through participatory design sessions with over 15 researchers and engineers at Facebook, we have developed, deployed, and iteratively improved ActiVis, an interactive visualization system for interpreting large-scale deep learning models and results. By tightly integrating multiple coordinated views, such as a computation graph overview of the model architecture, and a neuron activation view for pattern discovery and comparison, users can explore complex deep neural network models at both the instance- and subset-level. ActiVis has been deployed on Facebook’s machine learning platform. We present case studies with Facebook researchers and engineers, and usage scenarios of how ActiVis may work with different models. |
Tasks | |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01942v2 |
http://arxiv.org/pdf/1704.01942v2.pdf | |
PWC | https://paperswithcode.com/paper/activis-visual-exploration-of-industry-scale |
Repo | |
Framework | |
Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies
Title | Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies |
Authors | Pieter Libin, Timothy Verstraeten, Diederik M. Roijers, Jelena Grujic, Kristof Theys, Philippe Lemey, Ann Nowé |
Abstract | Pandemic influenza has the epidemic potential to kill millions of people. While various preventive measures exist (i.a., vaccination and school closures), deciding on strategies that lead to their most effective and efficient use remains challenging. To this end, individual-based epidemiological models are essential to assist decision makers in determining the best strategy to curb epidemic spread. However, individual-based models are computationally intensive, and it is therefore pivotal to identify the optimal strategy using a minimal number of model evaluations. Additionally, as epidemiological modeling experiments need to be planned, a computational budget needs to be specified a priori. Consequently, we present a new sampling technique to optimize the evaluation of preventive strategies using fixed-budget best-arm identification algorithms. We use epidemiological modeling theory to derive knowledge about the reward distribution, which we exploit using Bayesian best-arm identification algorithms (i.e., Top-two Thompson sampling and BayesGap). We evaluate these algorithms in a realistic experimental setting and demonstrate that it is possible to identify the optimal strategy using only a limited number of model evaluations, i.e., 2-to-3 times faster compared to the uniform sampling method, the predominant technique used for epidemiological decision making in the literature. Finally, we contribute and evaluate a statistic for Top-two Thompson sampling to inform the decision makers about the confidence of an arm recommendation. |
Tasks | Decision Making |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06299v2 |
http://arxiv.org/pdf/1711.06299v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-best-arm-identification-for |
Repo | |
Framework | |
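The abstract above names Top-two Thompson sampling as one of the fixed-budget algorithms it uses. The sketch below shows the general idea under simplifying assumptions that are not from the paper: rewards are Bernoulli with Beta(1, 1) priors (the paper derives its reward distribution from epidemiological theory), and `pull` stands in for one expensive model evaluation. The function name, the `beta` and `max_resample` parameters, and the toy arm probabilities are all hypothetical.

```python
import random
from typing import Callable, List, Optional


def top_two_thompson(
    pull: Callable[[int], int],
    n_arms: int,
    budget: int,
    beta: float = 0.5,
    rng: Optional[random.Random] = None,
    max_resample: int = 100,
) -> int:
    """Spend `budget` pulls, then recommend the arm with the best posterior mean."""
    rng = rng or random.Random()
    wins: List[float] = [1.0] * n_arms    # Beta posterior alpha per arm
    losses: List[float] = [1.0] * n_arms  # Beta posterior beta per arm

    def sampled_best() -> int:
        # Draw one value per arm from its posterior and return the argmax.
        draws = [rng.betavariate(wins[a], losses[a]) for a in range(n_arms)]
        return max(range(n_arms), key=draws.__getitem__)

    for _ in range(budget):
        arm = sampled_best()
        if rng.random() > beta:
            # With probability 1 - beta, play a challenger instead: resample
            # until a different arm comes out on top. The cap is a practical
            # guard; if it is hit, the leader is played.
            for _ in range(max_resample):
                challenger = sampled_best()
                if challenger != arm:
                    arm = challenger
                    break
        reward = pull(arm)
        wins[arm] += reward
        losses[arm] += 1 - reward

    means = [wins[a] / (wins[a] + losses[a]) for a in range(n_arms)]
    return max(range(n_arms), key=means.__getitem__)


# Toy stand-in for the simulator: three strategies with made-up success rates.
env = random.Random(0)
rates = [0.1, 0.9, 0.2]
best = top_two_thompson(
    lambda a: 1 if env.random() < rates[a] else 0,
    n_arms=3,
    budget=2000,
    rng=random.Random(1),
)
print(best)
```

Forcing a share of pulls onto the runner-up is what distinguishes this from plain Thompson sampling, which would concentrate almost all evaluations on the apparent leader and learn little about how confident the recommendation is.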
2D-3D Pose Consistency-based Conditional Random Fields for 3D Human Pose Estimation
Title | 2D-3D Pose Consistency-based Conditional Random Fields for 3D Human Pose Estimation |
Authors | Ju Yong Chang, Kyoung Mu Lee |
Abstract | This study considers the 3D human pose estimation problem in a single RGB image by proposing a conditional random field (CRF) model over 2D poses, in which the 3D pose is obtained as a byproduct of the inference process. The unary term of the proposed CRF model is defined based on a powerful heat-map regression network, which has been proposed for 2D human pose estimation. This study also presents a regression network for lifting the 2D pose to a 3D pose and proposes the prior term based on the consistency between the estimated 3D pose and the 2D pose. To obtain the approximate solution of the proposed CRF model, the N-best strategy is adopted. The proposed inference algorithm can be viewed as sequential processes of bottom-up generation of 2D and 3D pose proposals from the input 2D image based on deep networks and top-down verification of such proposals by checking their consistencies. To evaluate the proposed method, we use two large-scale datasets: Human3.6M and HumanEva. Experimental results show that the proposed method achieves state-of-the-art 3D human pose estimation performance. |
Tasks | 3D Human Pose Estimation, Pose Estimation |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.03986v2 |
http://arxiv.org/pdf/1704.03986v2.pdf | |
PWC | https://paperswithcode.com/paper/2d-3d-pose-consistency-based-conditional |
Repo | |
Framework | |
Fine-Grained Entity Typing with High-Multiplicity Assignments
Title | Fine-Grained Entity Typing with High-Multiplicity Assignments |
Authors | Maxim Rabinovich, Dan Klein |
Abstract | As entity type systems become richer and more fine-grained, we expect the number of types assigned to a given entity to increase. However, most fine-grained typing work has focused on datasets that exhibit a low degree of type multiplicity. In this paper, we consider the high-multiplicity regime inherent in data sources such as Wikipedia that have semi-open type systems. We introduce a set-prediction approach to this problem and show that our model outperforms unstructured baselines on a new Wikipedia-based fine-grained typing corpus. |
Tasks | Entity Typing |
Published | 2017-04-25 |
URL | http://arxiv.org/abs/1704.07751v1 |
http://arxiv.org/pdf/1704.07751v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-entity-typing-with-high |
Repo | |
Framework | |
Evaluating race and sex diversity in the world’s largest companies using deep neural networks
Title | Evaluating race and sex diversity in the world’s largest companies using deep neural networks |
Authors | Konstantin Chekanov, Polina Mamoshina, Roman V. Yampolskiy, Radu Timofte, Morten Scheibye-Knudsen, Alex Zhavoronkov |
Abstract | Diversity is one of the fundamental properties for the survival of species, populations, and organizations. Recent advances in deep learning allow for the rapid and automatic assessment of organizational diversity and possible discrimination by race, sex, age and other parameters. Automating the process of assessing organizational diversity using deep neural networks and eliminating the human factor may provide a set of real-time unbiased reports to all stakeholders. In this pilot study, we applied the deep-learned predictors of race and sex to the executive management and board member profiles of the 500 largest companies from the 2016 Forbes Global 2000 list, compared the predicted ratios to the ratios within each company’s country of origin, and ranked the companies by the sex, age, and race diversity index (DI). While the study has many limitations and no claims are being made concerning the individual companies, it demonstrates a method for the rapid and impartial assessment of organizational diversity using deep neural networks. |
Tasks | |
Published | 2017-07-09 |
URL | http://arxiv.org/abs/1707.02353v1 |
http://arxiv.org/pdf/1707.02353v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-race-and-sex-diversity-in-the |
Repo | |
Framework | |
Count-Based Exploration in Feature Space for Reinforcement Learning
Title | Count-Based Exploration in Feature Space for Reinforcement Learning |
Authors | Jarryd Martin, Suraj Narayanan Sasikumar, Tom Everitt, Marcus Hutter |
Abstract | We introduce a new count-based optimistic exploration algorithm for Reinforcement Learning (RL) that is feasible in environments with high-dimensional state-action spaces. The success of RL algorithms in these domains depends crucially on generalisation from limited training experience. Function approximation techniques enable RL agents to generalise in order to estimate the value of unvisited states, but at present few methods enable generalisation regarding uncertainty. This has prevented the combination of scalable RL algorithms with efficient exploration strategies that drive the agent to reduce its uncertainty. We present a new method for computing a generalised state visit-count, which allows the agent to estimate the uncertainty associated with any state. Our $\phi$-pseudocount achieves generalisation by exploiting the same feature representation of the state space that is used for value function approximation. States that have less frequently observed features are deemed more uncertain. The $\phi$-Exploration-Bonus algorithm rewards the agent for exploring in feature space rather than in the untransformed state space. The method is simpler and less computationally expensive than some previous proposals, and achieves near state-of-the-art results on high-dimensional RL benchmarks. |
Tasks | Atari Games, Efficient Exploration |
Published | 2017-06-25 |
URL | http://arxiv.org/abs/1706.08090v1 |
http://arxiv.org/pdf/1706.08090v1.pdf | |
PWC | https://paperswithcode.com/paper/count-based-exploration-in-feature-space-for |
Repo | |
Framework | |
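The abstract above says that states with less frequently observed features are deemed more uncertain, but it does not give the density model or the bonus formula. The sketch below is one way to realise that idea and should be read as an assumption throughout: it uses binary features, a factorised density with a Krichevsky-Trofimov style estimate per feature, a pseudo-count derived from the density before and after the update, and a bonus proportional to the inverse square root of the pseudo-count. The class name `PhiPseudoCount` and the `beta` parameter are hypothetical.

```python
import math
from typing import List, Sequence


class PhiPseudoCount:
    """Pseudo-count over binary feature vectors (illustrative sketch only)."""

    def __init__(self, n_features: int, beta: float = 0.05) -> None:
        if n_features < 1:
            raise ValueError("need at least one feature")
        self.ones: List[int] = [0] * n_features  # times each feature was 1
        self.t = 0                                # feature vectors seen so far
        self.beta = beta                          # bonus scale (assumed)

    def _density(self, phi: Sequence[int]) -> float:
        # Product of independent per-feature probabilities.
        p = 1.0
        for count, bit in zip(self.ones, phi):
            p_one = (count + 0.5) / (self.t + 1.0)
            p *= p_one if bit else 1.0 - p_one
        return p

    def bonus(self, phi: Sequence[int]) -> float:
        """Update the model with `phi` and return its exploration bonus."""
        rho = self._density(phi)
        for i, bit in enumerate(phi):
            self.ones[i] += 1 if bit else 0
        self.t += 1
        rho_next = self._density(phi)
        # Pseudo-count implied by how much the density rose after one visit.
        pseudo = rho * (1.0 - rho_next) / (rho_next - rho)
        return self.beta / math.sqrt(pseudo)


counter = PhiPseudoCount(n_features=2)
familiar = [counter.bonus((1, 0)) for _ in range(10)]
novel = counter.bonus((0, 1))
print(familiar[0] > familiar[-1], novel > familiar[-1])  # True True
```

In an agent, the returned bonus would be added to the environment reward before the value update. With many features the product of probabilities can underflow, so a practical version would likely work in log space.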
Focusing Attention: Towards Accurate Text Recognition in Natural Images
Title | Focusing Attention: Towards Accurate Text Recognition in Natural Images |
Authors | Zhanzhan Cheng, Fan Bai, Yunlu Xu, Gang Zheng, Shiliang Pu, Shuigeng Zhou |
Abstract | Scene text recognition has been a hot research topic in computer vision due to its various applications. The state of the art is the attention-based encoder-decoder framework that learns the mapping between input images and output sequences in a purely data-driven way. However, we observe that existing attention-based methods perform poorly on complicated and/or low-quality images. One major reason is that existing methods cannot get accurate alignments between feature areas and targets for such images. We call this phenomenon “attention drift”. To tackle this problem, in this paper we propose the FAN (the abbreviation of Focusing Attention Network) method that employs a focusing attention mechanism to automatically draw back the drifted attention. FAN consists of two major components: an attention network (AN) that is responsible for recognizing character targets as in the existing methods, and a focusing network (FN) that is responsible for adjusting attention by evaluating whether AN pays attention properly on the target areas in the images. Furthermore, different from the existing methods, we adopt a ResNet-based network to enrich deep representations of scene text images. Extensive experiments on various benchmarks, including the IIIT5k, SVT and ICDAR datasets, show that the FAN method substantially outperforms the existing methods. |
Tasks | Scene Text Recognition |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02054v3 |
http://arxiv.org/pdf/1709.02054v3.pdf | |
PWC | https://paperswithcode.com/paper/focusing-attention-towards-accurate-text |
Repo | |
Framework | |
Observable dictionary learning for high-dimensional statistical inference
Title | Observable dictionary learning for high-dimensional statistical inference |
Authors | Lionel Mathelin, Kévin Kasper, Hisham Abou-Kandil |
Abstract | This paper introduces a method for efficiently inferring a high-dimensional distributed quantity from a few observations. The quantity of interest (QoI) is approximated in a basis (dictionary) learned from a training set. The coefficients associated with the approximation of the QoI in the basis are determined by minimizing the misfit with the observations. To obtain a probabilistic estimate of the quantity of interest, a Bayesian approach is employed. The QoI is treated as a random field endowed with a hierarchical prior distribution so that closed-form expressions can be obtained for the posterior distribution. The main contribution of the present work lies in the derivation of a representation basis consistent with the observation chain used to infer the associated coefficients. The resulting dictionary is then tailored to be both observable by the sensors and accurate in approximating the posterior mean. An algorithm for deriving such an observable dictionary is presented. The method is illustrated with the estimation of the velocity field of an open cavity flow from a handful of wall-mounted point sensors. Comparison with standard estimation approaches relying on Principal Component Analysis and K-SVD dictionaries is provided and illustrates the superior performance of the present approach. |
Tasks | Dictionary Learning |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05289v2 |
http://arxiv.org/pdf/1702.05289v2.pdf | |
PWC | https://paperswithcode.com/paper/observable-dictionary-learning-for-high |
Repo | |
Framework | |
Differentially Private Testing of Identity and Closeness of Discrete Distributions
Title | Differentially Private Testing of Identity and Closeness of Discrete Distributions |
Authors | Jayadev Acharya, Ziteng Sun, Huanyu Zhang |
Abstract | We study the fundamental problems of identity testing (goodness of fit) and closeness testing (two sample test) of distributions over $k$ elements, under differential privacy. While the problems have a long history in statistics, finite sample bounds for these problems have only been established recently. In this work, we derive upper and lower bounds on the sample complexity of both problems under $(\varepsilon, \delta)$-differential privacy. We provide optimal sample complexity algorithms for the identity testing problem for all parameter ranges, and the first results for closeness testing. Our closeness testing bounds are optimal in the sparse regime where the number of samples is at most $k$. Our upper bounds are obtained by privatizing non-private estimators for these problems. The non-private estimators are chosen to have small sensitivity. We propose a general framework to establish lower bounds on the sample complexity of statistical tasks under differential privacy. We show a bound on differentially private algorithms in terms of a coupling between the two hypothesis classes we aim to test. By constructing carefully chosen priors over the hypothesis classes, and using Le Cam’s two-point theorem, we provide a general mechanism for proving lower bounds. We believe that the framework can be used to obtain strong lower bounds for other statistical tasks under privacy. |
Tasks | |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05128v3 |
http://arxiv.org/pdf/1707.05128v3.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-testing-of-identity |
Repo | |
Framework | |
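The abstract above describes its upper bounds as privatized versions of low-sensitivity non-private estimators, without specifying the estimators. The sketch below illustrates only that general pattern, using an L1 distance between observed and expected counts plus Laplace noise. This statistic, the fixed `threshold`, and the function names are assumptions for illustration. It is not the paper's estimator and makes no claim to optimal sample complexity.

```python
import random
from typing import List, Sequence


def l1_statistic(samples: Sequence[int], q: Sequence[float]) -> float:
    """Sum over symbols of |observed count - expected count under q|."""
    n = len(samples)
    counts: List[int] = [0] * len(q)
    for x in samples:
        counts[x] += 1
    return sum(abs(c - n * p) for c, p in zip(counts, q))


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_identity_test(
    samples: Sequence[int],
    q: Sequence[float],
    epsilon: float,
    threshold: float,
    rng: random.Random,
) -> bool:
    """Return True to accept 'samples were drawn from q'.

    Replacing one sample moves one count down by 1 and another up by 1, so
    the statistic changes by at most 2. Laplace noise with scale 2/epsilon
    therefore gives epsilon-differential privacy (the delta = 0 case).
    """
    sensitivity = 2.0
    noisy = l1_statistic(samples, q) + laplace_noise(sensitivity / epsilon, rng)
    return noisy <= threshold


uniform = [0.25, 0.25, 0.25, 0.25]
matching = [0, 1, 2, 3] * 25  # counts agree exactly with the uniform q
skewed = [0] * 100            # every sample lands on one symbol
rng = random.Random(0)
print(private_identity_test(matching, uniform, 1.0, 50.0, rng))
print(private_identity_test(skewed, uniform, 1.0, 50.0, rng))
```

The threshold of 50 is arbitrary. In a real test it would be calibrated to the sample size, the distance parameter, and the noise scale, which is where the sample-complexity analysis in the paper comes in.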