Paper Group ANR 572
Quantum Neural Machine Learning - Backpropagation and Dynamics. Analyzing the Behavior of Visual Question Answering Models. Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models. EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras (Extended Abstract). Lip Reading Sentences in the Wild. …
Quantum Neural Machine Learning - Backpropagation and Dynamics
Title | Quantum Neural Machine Learning - Backpropagation and Dynamics |
Authors | Carlos Pedro Gonçalves |
Abstract | The current work addresses quantum machine learning in the context of Quantum Artificial Neural Networks, such that the networks’ processing is divided into two stages: the learning stage, where the network converges to a specific quantum circuit, and the backpropagation stage, where the network effectively works as a self-programming quantum computing system that selects the quantum circuits to solve computing problems. The results are extended to general architectures, including recurrent networks that interact with an environment, coupling with it in the neural links’ activation order, and self-organizing in a dynamical regime that intermixes patterns of dynamical stochasticity and persistent quasiperiodic dynamics, giving rise to a form of noise-resilient dynamical record. |
Tasks | Quantum Machine Learning |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.06935v1 |
http://arxiv.org/pdf/1609.06935v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-neural-machine-learning |
Repo | |
Framework | |
Analyzing the Behavior of Visual Question Answering Models
Title | Analyzing the Behavior of Visual Question Answering Models |
Authors | Aishwarya Agrawal, Dhruv Batra, Devi Parikh |
Abstract | Recently, a number of deep-learning based models have been proposed for the task of Visual Question Answering (VQA). The performance of most models is clustered around 60-70%. In this paper, we propose systematic methods to analyze the behavior of these models as a first step towards recognizing their strengths and weaknesses, and identifying the most fruitful directions for progress. We analyze two models, one each from the two major classes of VQA models (with-attention and without-attention), and show the similarities and differences in their behavior. We also analyze the winning entry of the VQA Challenge 2016. Our behavior analysis reveals that despite recent progress, today’s VQA models are “myopic” (tend to fail on sufficiently novel instances), often “jump to conclusions” (converge on a predicted answer after ‘listening’ to just half the question), and are “stubborn” (do not change their answers across images). |
Tasks | Question Answering, Visual Question Answering |
Published | 2016-06-23 |
URL | http://arxiv.org/abs/1606.07356v2 |
http://arxiv.org/pdf/1606.07356v2.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-the-behavior-of-visual-question |
Repo | |
Framework | |
Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models
Title | Communication-efficient Distributed Estimation and Inference for Transelliptical Graphical Models |
Authors | Pan Xu, Lu Tian, Quanquan Gu |
Abstract | We propose communication-efficient distributed estimation and inference methods for the transelliptical graphical model, a semiparametric extension of the elliptical distribution in the high-dimensional regime. In detail, the proposed method distributes the $d$-dimensional data of size $N$ generated from a transelliptical graphical model across $m$ worker machines, and estimates the latent precision matrix on each worker machine based on the data of size $n=N/m$. It then debiases the local estimators on the worker machines and sends them back to the master machine. Finally, on the master machine, it aggregates the debiased local estimators by averaging and hard thresholding. We show that the aggregated estimator attains the same statistical rate as the centralized estimator based on all the data, provided that the number of machines satisfies $m \lesssim \min\{N\log d/d, \sqrt{N/(s^2\log d)}\}$, where $s$ is the maximum number of nonzero entries in each column of the latent precision matrix. It is worth noting that our algorithm and theory can be directly applied to Gaussian graphical models, Gaussian copula graphical models and elliptical graphical models, since they are all special cases of transelliptical graphical models. Thorough experiments on synthetic data back up our theory. |
Tasks | |
Published | 2016-12-29 |
URL | http://arxiv.org/abs/1612.09297v1 |
http://arxiv.org/pdf/1612.09297v1.pdf | |
PWC | https://paperswithcode.com/paper/communication-efficient-distributed-2 |
Repo | |
Framework | |
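The aggregation step described in the abstract above (average the debiased local precision-matrix estimators on the master machine, then hard-threshold small entries) can be sketched in a few lines of NumPy. The function name, the toy 2x2 matrices, and the threshold value are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def aggregate(debiased_estimates, threshold):
    """Average m debiased local estimators, then zero out small entries."""
    avg = np.mean(debiased_estimates, axis=0)            # entrywise average
    return np.where(np.abs(avg) >= threshold, avg, 0.0)  # hard thresholding

# Toy example: three "worker" estimates of a 2x2 latent precision matrix
# whose true off-diagonal entries are zero (values are made up).
local_estimates = np.array([
    [[1.0,  0.05], [ 0.05, 1.0]],
    [[0.9, -0.02], [-0.02, 1.1]],
    [[1.1,  0.03], [ 0.03, 0.9]],
])
sparse_est = aggregate(local_estimates, threshold=0.1)
# The off-diagonal noise averages to 0.02 and is thresholded to exactly zero,
# recovering the sparsity pattern; the diagonal survives.
```

Averaging reduces the variance of the local estimators, while the thresholding step restores sparsity that averaging alone would destroy.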
EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras (Extended Abstract)
Title | EgoCap: Egocentric Marker-less Motion Capture with Two Fisheye Cameras (Extended Abstract) |
Authors | Helge Rhodin, Christian Richardt, Dan Casas, Eldar Insafutdinov, Mohammad Shafiei, Hans-Peter Seidel, Bernt Schiele, Christian Theobalt |
Abstract | Marker-based and marker-less optical skeletal motion-capture methods use an outside-in arrangement of cameras placed around a scene, with viewpoints converging on the center. Marker-based methods often cause discomfort through the marker suits they require, and the recording volume of both approaches is severely restricted, often to indoor scenes with controlled backgrounds. We therefore propose a new method for real-time, marker-less and egocentric motion capture which estimates the full-body skeleton pose from a lightweight stereo pair of fisheye cameras attached to a helmet or virtual-reality headset. It combines the strength of a new generative pose estimation framework for fisheye views with a ConvNet-based body-part detector trained on a new automatically annotated and augmented dataset. Our inside-in method captures full-body motion in general indoor and outdoor scenes, as well as in crowded scenes. |
Tasks | Motion Capture, Pose Estimation |
Published | 2016-12-31 |
URL | http://arxiv.org/abs/1701.00142v1 |
http://arxiv.org/pdf/1701.00142v1.pdf | |
PWC | https://paperswithcode.com/paper/egocap-egocentric-marker-less-motion-capture |
Repo | |
Framework | |
Lip Reading Sentences in the Wild
Title | Lip Reading Sentences in the Wild |
Authors | Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman |
Abstract | The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio. Unlike previous works that have focussed on recognising a limited number of words or phrases, we tackle lip reading as an open-world problem: unconstrained natural language sentences and in-the-wild videos. Our key contributions are: (1) a ‘Watch, Listen, Attend and Spell’ (WLAS) network that learns to transcribe videos of mouth motion to characters; (2) a curriculum learning strategy to accelerate training and to reduce overfitting; (3) a ‘Lip Reading Sentences’ (LRS) dataset for visual speech recognition, consisting of over 100,000 natural sentences from British television. The WLAS model trained on the LRS dataset surpasses the performance of all previous work on standard lip reading benchmark datasets, often by a significant margin. This lip reading performance beats a professional lip reader on videos from BBC television, and we also demonstrate that visual information helps to improve speech recognition performance even when the audio is available. |
Tasks | Speech Recognition, Visual Speech Recognition |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05358v2 |
http://arxiv.org/pdf/1611.05358v2.pdf | |
PWC | https://paperswithcode.com/paper/lip-reading-sentences-in-the-wild |
Repo | |
Framework | |
Complex-valued Gaussian Process Regression for Time Series Analysis
Title | Complex-valued Gaussian Process Regression for Time Series Analysis |
Authors | Luca Ambrogioni, Eric Maris |
Abstract | The construction of synthetic complex-valued signals from real-valued observations is an important step in many time series analysis techniques. The most widely used approach is based on the Hilbert transform, which maps the real-valued signal into its quadrature component. In this paper, we define a probabilistic generalization of this approach. We model the observable real-valued signal as the real part of a latent complex-valued Gaussian process. In order to obtain the appropriate statistical relationship between its real and imaginary parts, we define two new classes of complex-valued covariance functions. Through an analysis of simulated chirplets and stochastic oscillations, we show that the resulting Gaussian process estimate of the complex-valued signal provides a better estimate of the instantaneous amplitude and frequency than the established approaches. Furthermore, complex-valued Gaussian process regression makes it possible to incorporate prior information about the structure of signal and noise, and thereby to tailor the analysis to the features of the signal. As an example, we analyze the non-stationary dynamics of brain oscillations in the alpha band, as measured using magneto-encephalography. |
Tasks | Time Series, Time Series Analysis |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10073v2 |
http://arxiv.org/pdf/1611.10073v2.pdf | |
PWC | https://paperswithcode.com/paper/complex-valued-gaussian-process-regression |
Repo | |
Framework | |
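The Hilbert-transform baseline mentioned in the abstract above can be sketched with a standard FFT construction: zero the negative frequencies, double the positive ones, and read instantaneous amplitude and frequency off the resulting analytic (complex-valued) signal. The signal parameters below are illustrative, and the paper’s Gaussian-process model is not reproduced here.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal (the classical Hilbert-transform route):
    zero the negative frequencies and double the positive ones."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0      # Nyquist bin kept once for even n
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

fs = 1000.0                              # sampling rate in Hz (illustrative)
t = np.arange(0.0, 1.0, 1.0 / fs)
x = np.cos(2 * np.pi * 10.0 * t)         # a 10 Hz real-valued oscillation

z = analytic_signal(x)                   # x plus i times its quadrature
amplitude = np.abs(z)                    # instantaneous amplitude (envelope)
phase = np.unwrap(np.angle(z))
inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency in Hz
```

For this noiseless, exactly periodic input the recovered envelope is constant at 1 and the instantaneous frequency is constant at 10 Hz; the paper’s contribution is a probabilistic replacement for this deterministic pipeline that degrades more gracefully under noise and non-stationarity.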