May 5, 2019

1206 words 6 mins read

Paper Group ANR 572


Quantum Neural Machine Learning - Backpropagation and Dynamics. Analyzing the Behavior of Visual Question Answering Models. Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models. EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras (Extended Abstract). Lip Reading Sentences in the Wild. …

Quantum Neural Machine Learning - Backpropagation and Dynamics

Title Quantum Neural Machine Learning - Backpropagation and Dynamics
Authors Carlos Pedro Gonçalves
Abstract The current work addresses quantum machine learning in the context of Quantum Artificial Neural Networks such that the networks’ processing is divided into two stages: the learning stage, where the network converges to a specific quantum circuit, and the backpropagation stage, where the network effectively works as a self-programming quantum computing system that selects the quantum circuits to solve computing problems. The results are extended to general architectures, including recurrent networks that interact with an environment, coupling with it in the neural links’ activation order, and self-organizing in a dynamical regime that intermixes patterns of dynamical stochasticity and persistent quasiperiodic dynamics, giving rise to a form of noise-resilient dynamical record.
Tasks Quantum Machine Learning
Published 2016-09-22
URL http://arxiv.org/abs/1609.06935v1
PDF http://arxiv.org/pdf/1609.06935v1.pdf
PWC https://paperswithcode.com/paper/quantum-neural-machine-learning
Repo
Framework

Analyzing the Behavior of Visual Question Answering Models

Title Analyzing the Behavior of Visual Question Answering Models
Authors Aishwarya Agrawal, Dhruv Batra, Devi Parikh
Abstract Recently, a number of deep-learning based models have been proposed for the task of Visual Question Answering (VQA). The performance of most models is clustered around 60-70%. In this paper, we propose systematic methods to analyze the behavior of these models as a first step towards recognizing their strengths and weaknesses, and identifying the most fruitful directions for progress. We analyze two models, one each from the two major classes of VQA models, with-attention and without-attention, and show the similarities and differences in the behavior of these models. We also analyze the winning entry of the VQA Challenge 2016. Our behavior analysis reveals that despite recent progress, today’s VQA models are “myopic” (tend to fail on sufficiently novel instances), often “jump to conclusions” (converge on a predicted answer after ‘listening’ to just half the question), and are “stubborn” (do not change their answers across images).
Tasks Question Answering, Visual Question Answering
Published 2016-06-23
URL http://arxiv.org/abs/1606.07356v2
PDF http://arxiv.org/pdf/1606.07356v2.pdf
PWC https://paperswithcode.com/paper/analyzing-the-behavior-of-visual-question
Repo
Framework
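
The “jump to conclusions” finding suggests a simple behavioral probe: compare a model's answer on the full question against its answer on only the first half of the question tokens. A minimal sketch of that probe is below; `predict` is a hypothetical stand-in for a VQA model, not the authors' actual system.

```python
# Hypothetical probe in the spirit of the "jump to conclusions" analysis:
# does the model's answer change when it sees only half the question?
def first_half(question: str) -> str:
    """Keep only the first half of the question's tokens."""
    words = question.split()
    return " ".join(words[: max(1, len(words) // 2)])

def converges_early(predict, image, question) -> bool:
    """True if the model gives the same answer after 'listening'
    to just half of the question."""
    return predict(image, question) == predict(image, first_half(question))

# Toy stand-in model that keys only on the question's first word
toy = lambda img, q: "yes" if q.split()[0] == "is" else "2"
print(converges_early(toy, None, "is the dog sitting on the couch"))  # True
```

Run over a dataset, the fraction of examples where `converges_early` is true gives a rough measure of how often a model commits to an answer before reading the whole question.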

Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models

Title Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models
Authors Pan Xu, Lu Tian, Quanquan Gu
Abstract We propose communication-efficient distributed estimation and inference methods for the transelliptical graphical model, a semiparametric extension of the elliptical distribution in the high dimensional regime. In detail, the proposed method distributes the $d$-dimensional data of size $N$ generated from a transelliptical graphical model into $m$ worker machines, and estimates the latent precision matrix on each worker machine based on the data of size $n=N/m$. It then debiases the local estimators on the worker machines and sends them back to the master machine. Finally, on the master machine, it aggregates the debiased local estimators by averaging and hard thresholding. We show that the aggregated estimator attains the same statistical rate as the centralized estimator based on all the data, provided that the number of machines satisfies $m \lesssim \min\{N\log d/d,\sqrt{N/(s^2\log d)}\}$, where $s$ is the maximum number of nonzero entries in each column of the latent precision matrix. It is worth noting that our algorithm and theory can be directly applied to Gaussian graphical models, Gaussian copula graphical models and elliptical graphical models, since they are all special cases of transelliptical graphical models. Thorough experiments on synthetic data back up our theory.
Tasks
Published 2016-12-29
URL http://arxiv.org/abs/1612.09297v1
PDF http://arxiv.org/pdf/1612.09297v1.pdf
PWC https://paperswithcode.com/paper/communication-efficient-distributed-2
Repo
Framework
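
The aggregation step on the master machine (averaging the debiased local estimators, then hard thresholding to restore sparsity) can be sketched as follows. This shows only that final step on stand-in local estimates; the paper's actual local estimation and debiasing procedure is not reproduced here, and the matrices and threshold `tau` are illustrative.

```python
import numpy as np

def aggregate_debiased(local_estimates, tau):
    """Average debiased local precision-matrix estimates from m workers,
    then hard-threshold entries below tau back to exact zero."""
    avg = np.mean(local_estimates, axis=0)          # averaging step
    return np.where(np.abs(avg) >= tau, avg, 0.0)   # hard thresholding

# Toy example: 4 workers, each holding a noisy copy of a sparse 3x3 matrix
rng = np.random.default_rng(0)
truth = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
local_estimates = [truth + 0.05 * rng.standard_normal((3, 3))
                   for _ in range(4)]
est = aggregate_debiased(local_estimates, tau=0.1)
```

Averaging reduces the variance of the noise by roughly a factor of $m$, after which thresholding removes the small spurious entries that averaging alone cannot zero out.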

EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras (Extended Abstract)

Title EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras (Extended Abstract)
Authors Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, Christian Theobalt
Abstract Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. They often cause discomfort, for instance when marker suits are needed, and their recording volume is severely restricted and often constrained to indoor scenes with controlled backgrounds. We therefore propose a new method for real-time, marker-less and egocentric motion capture which estimates the full-body skeleton pose from a lightweight stereo pair of fisheye cameras attached to a helmet or virtual-reality headset. It combines the strength of a new generative pose estimation framework for fisheye views with a ConvNet-based body-part detector trained on a new automatically annotated and augmented dataset. Our inside-in method captures full-body motion in general indoor and outdoor scenes, and also crowded scenes.
Tasks Motion Capture, Pose Estimation
Published 2016-12-31
URL http://arxiv.org/abs/1701.00142v1
PDF http://arxiv.org/pdf/1701.00142v1.pdf
PWC https://paperswithcode.com/paper/egocap-egocentric-marker-less-motion-capture
Repo
Framework

Lip Reading Sentences in the Wild

Title Lip Reading Sentences in the Wild
Authors Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman
Abstract The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem: unconstrained natural-language sentences and in-the-wild videos. Our key contributions are: (1) a ‘Watch, Listen, Attend and Spell’ (WLAS) network that learns to transcribe videos of mouth motion to characters; (2) a curriculum learning strategy to accelerate training and to reduce overfitting; (3) a ‘Lip Reading Sentences’ (LRS) dataset for visual speech recognition, consisting of over 100,000 natural sentences from British television. The WLAS model trained on the LRS dataset surpasses the performance of all previous work on standard lip reading benchmark datasets, often by a significant margin. This lip reading performance beats a professional lip reader on videos from BBC television, and we also demonstrate that visual information helps to improve speech recognition performance even when the audio is available.
Tasks Speech Recognition, Visual Speech Recognition
Published 2016-11-16
URL http://arxiv.org/abs/1611.05358v2
PDF http://arxiv.org/pdf/1611.05358v2.pdf
PWC https://paperswithcode.com/paper/lip-reading-sentences-in-the-wild
Repo
Framework

Complex-valued Gaussian Process Regression for Time Series Analysis

Title Complex-valued Gaussian Process Regression for Time Series Analysis
Authors Luca Ambrogioni, Eric Maris
Abstract The construction of synthetic complex-valued signals from real-valued observations is an important step in many time series analysis techniques. The most widely used approach is based on the Hilbert transform, which maps the real-valued signal into its quadrature component. In this paper, we define a probabilistic generalization of this approach. We model the observable real-valued signal as the real part of a latent complex-valued Gaussian process. In order to obtain the appropriate statistical relationship between its real and imaginary parts, we define two new classes of complex-valued covariance functions. Through an analysis of simulated chirplets and stochastic oscillations, we show that the resulting Gaussian process complex-valued signal provides a better estimate of the instantaneous amplitude and frequency than the established approaches. Furthermore, the complex-valued Gaussian process regression allows us to incorporate prior information about the structure of the signal and noise, and thereby to tailor the analysis to the features of the signal. As an example, we analyze the non-stationary dynamics of brain oscillations in the alpha band, as measured using magneto-encephalography.
Tasks Time Series, Time Series Analysis
Published 2016-11-30
URL http://arxiv.org/abs/1611.10073v2
PDF http://arxiv.org/pdf/1611.10073v2.pdf
PWC https://paperswithcode.com/paper/complex-valued-gaussian-process-regression
Repo
Framework
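
The classical baseline this paper generalizes, the Hilbert transform, can be sketched with an FFT: zero out the negative frequencies and double the positive ones to obtain the analytic (complex-valued) signal, from which instantaneous amplitude and frequency are read off. The signal below is an illustrative pure sinusoid, not the paper's MEG data.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal (the idea behind scipy.signal.hilbert):
    keep the DC term, double positive frequencies, drop negative ones."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[1 : n // 2] = 2.0
        h[n // 2] = 1.0   # Nyquist bin kept once for even-length signals
    else:
        h[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

fs = 500.0                                  # sampling rate (Hz)
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * 10.0 * t)            # a 10 Hz oscillation

z = analytic_signal(x)                      # complex-valued signal
amplitude = np.abs(z)                       # instantaneous amplitude
phase = np.unwrap(np.angle(z))
inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency (Hz)
```

The paper's contribution replaces this deterministic transform with Gaussian process regression under complex-valued covariance functions, so that prior knowledge about signal and noise structure enters the estimate of `amplitude` and `inst_freq`.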