July 29, 2019

3307 words 16 mins read

Paper Group ANR 10

Semi-Latent GAN: Learning to generate and modify facial images from attributes

Title Semi-Latent GAN: Learning to generate and modify facial images from attributes
Authors Weidong Yin, Yanwei Fu, Leonid Sigal, Xiangyang Xue
Abstract Generating and manipulating human facial images using high-level attribute controls are important and interesting problems. The models proposed in previous work can solve one of these two problems (generation or manipulation), but not both coherently. This paper proposes a novel model that learns how to both generate and modify facial images from high-level semantic attributes. Our key idea is to formulate a Semi-Latent Facial Attribute Space (SL-FAS) to systematically learn the relationship between user-defined and latent attributes, as well as between those attributes and RGB imagery. As part of this newly formulated space, we propose a new model, SL-GAN, which is a specific form of Generative Adversarial Network. Finally, we present an iterative training algorithm for SL-GAN. Experiments on the recent CelebA and CASIA-WebFace datasets validate the effectiveness of our proposed framework. We will also make data, pre-trained models and code available.
Tasks
Published 2017-04-07
URL http://arxiv.org/abs/1704.02166v1
PDF http://arxiv.org/pdf/1704.02166v1.pdf
PWC https://paperswithcode.com/paper/semi-latent-gan-learning-to-generate-and
Repo
Framework
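
As a concrete, deliberately generic illustration of conditioning generation on a split attribute space, the PyTorch sketch below concatenates user-defined attributes y with a free latent code z at the generator input; editing y while holding z fixed then modifies the generated image. This is only a minimal stand-in, not the SL-GAN architecture, and the layer sizes and toggled attribute index are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttributeConditionedGenerator(nn.Module):
    """Generic attribute-conditioned generator (a sketch, not SL-GAN itself)."""

    def __init__(self, n_attrs=40, z_dim=100, img_shape=(3, 64, 64)):
        super().__init__()
        self.img_shape = img_shape
        out_dim = img_shape[0] * img_shape[1] * img_shape[2]
        self.net = nn.Sequential(
            nn.Linear(n_attrs + z_dim, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, out_dim), nn.Tanh(),  # images normalized to [-1, 1]
        )

    def forward(self, y, z):
        # User-defined attributes and the latent code jointly drive generation.
        return self.net(torch.cat([y, z], dim=1)).view(-1, *self.img_shape)

g = AttributeConditionedGenerator()
y, z = torch.zeros(1, 40), torch.randn(1, 100)
img_before = g(y, z)
y[0, 15] = 1.0            # flip one (hypothetical) attribute, keep z fixed
img_after = g(y, z)       # same latent code, modified attribute
```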

How linguistic descriptions of data can help to the teaching-learning process in higher education, case of study: artificial intelligence

Title How linguistic descriptions of data can help to the teaching-learning process in higher education, case of study: artificial intelligence
Authors Clemente Rubio-Manzano, Tomas Lermanda Senoceain
Abstract Artificial Intelligence is a central topic in the computer science curriculum. Since 2011, a project-based learning methodology based on computer games has been designed and implemented in the artificial intelligence course at the University of the Bio-Bio. The project aims to develop software-controlled agents (bots) programmed using the heuristic algorithms covered during the course. This methodology yields good learning results; however, several challenges have arisen during its implementation. In this paper we show how linguistic descriptions of data can help provide students and teachers with technical and personalized feedback about the algorithms they have learned. An algorithm behavior profile and a new Turing test for computer-game bots, both based on linguistic modelling of complex phenomena, are also proposed to address these challenges. To show and explore the possibilities of this new technology, a web platform has been designed and implemented by one of the authors, and its incorporation into the assessment process allows us to improve the teaching-learning process.
Tasks
Published 2017-11-27
URL http://arxiv.org/abs/1711.09744v3
PDF http://arxiv.org/pdf/1711.09744v3.pdf
PWC https://paperswithcode.com/paper/how-linguistic-descriptions-of-data-can-help
Repo
Framework

Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation

Title Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation
Authors Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang
Abstract In this paper, we propose an alternative method to estimate room layouts of cluttered indoor scenes. This method enjoys the benefits of two novel techniques. The first one is semantic transfer (ST), which is: (1) a formulation to integrate the relationship between scene clutter and room layout into convolutional neural networks; (2) an architecture that can be trained end-to-end; (3) a practical strategy to initialize weights for very deep networks under an unbalanced training data distribution. ST allows us to extract highly robust features under various circumstances, and to address the computational redundancy hidden in these features we develop a principled and efficient inference scheme named physics inspired optimization (PIO). PIO’s basic idea is to cast phenomena observed in the ST features in terms of mechanics concepts. Evaluations on the public datasets LSUN and Hedau show that the proposed method is more accurate than state-of-the-art methods.
Tasks Room Layout Estimation
Published 2017-07-03
URL http://arxiv.org/abs/1707.00383v1
PDF http://arxiv.org/pdf/1707.00383v1.pdf
PWC https://paperswithcode.com/paper/physics-inspired-optimization-on-semantic
Repo
Framework

Proxy Templates for Inverse Compositional Photometric Bundle Adjustment

Title Proxy Templates for Inverse Compositional Photometric Bundle Adjustment
Authors Christopher Ham, Simon Lucey, Surya Singh
Abstract Recent advances in 3D vision have demonstrated the strengths of photometric bundle adjustment. By directly minimizing reprojected pixel errors, instead of geometric reprojection errors, such methods can achieve sub-pixel alignment accuracy in both high and low textured regions. Typically, these problems are solved using a forwards compositional Lucas-Kanade formulation parameterized by 6-DoF rigid camera poses and a depth per point in the structure. For large problems the most CPU-intensive component of the pipeline is the creation and factorization of the Hessian matrix at each iteration. For many warps, the inverse compositional formulation can offer significant speed-ups since the Hessian need only be inverted once. In this paper, we show that an ordinary inverse compositional formulation does not work for warps of this type of parameterization due to ill-conditioning of its partial derivatives. However, we show that it is possible to overcome this limitation by introducing the concept of a proxy template image. We show an order of magnitude improvement in speed, with little effect on quality, going from forwards to inverse compositional in our own photometric bundle adjustment method designed for object-centric structure from motion. This means less processing time for large systems or denser reconstructions under the same real-time constraints. We additionally show that this theory can be readily applied to existing methods by integrating it with the recently released Direct Sparse Odometry SLAM algorithm.
Tasks
Published 2017-04-23
URL http://arxiv.org/abs/1704.06967v1
PDF http://arxiv.org/pdf/1704.06967v1.pdf
PWC https://paperswithcode.com/paper/proxy-templates-for-inverse-compositional
Repo
Framework
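
The speed-up described above comes from the inverse compositional trick of linearizing around the template, so the Hessian is built and inverted once instead of at every iteration. The NumPy/SciPy sketch below shows the idea for the simplest possible case, a pure-translation warp on a single template; the paper's setting (6-DoF poses, per-point depths, proxy templates) is far richer, so treat this only as an illustration of the reused Hessian.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def ic_lk_translation(template, image, p=(0.0, 0.0), iters=100, tol=1e-6):
    """Inverse compositional Lucas-Kanade for a translation-only warp.

    The steepest-descent images and the Hessian come from the template
    gradients, so the Hessian is formed and inverted exactly once.
    """
    t = template.astype(float)
    gy, gx = np.gradient(t)                          # d/drow, d/dcol
    sd = np.stack([gx.ravel(), gy.ravel()], axis=1)  # warp Jacobian = identity
    H_inv = np.linalg.inv(sd.T @ sd)                 # inverted once, reused

    rows, cols = np.mgrid[0:t.shape[0], 0:t.shape[1]]
    p = np.asarray(p, dtype=float)                   # p = (dx, dy)
    for _ in range(iters):
        warped = map_coordinates(image.astype(float),
                                 [rows + p[1], cols + p[0]], order=1)
        error = (warped - t).ravel()                 # I(W(x; p)) - T(x)
        dp = H_inv @ (sd.T @ error)
        p -= dp                                      # compose with inverted increment
        if np.linalg.norm(dp) < tol:
            break
    return p

# Recover a known (dx, dy) = (2, 3) offset on a smooth synthetic image.
rr, cc = np.mgrid[0:80, 0:80]
full = np.sin(cc / 7.0) + np.cos(rr / 9.0)
template, image = full[20:60, 20:60], full[17:77, 18:78]
print(ic_lk_translation(template, image))            # roughly [2.0, 3.0]
```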

Effective Tensor Sketching via Sparsification

Title Effective Tensor Sketching via Sparsification
Authors Dong Xia, Ming Yuan
Abstract In this paper, we investigate effective sketching schemes via sparsification for high dimensional multilinear arrays or tensors. More specifically, we propose a novel tensor sparsification algorithm that retains a subset of the entries of a tensor in a judicious way, and prove that it can attain a given level of approximation accuracy in terms of tensor spectral norm with a much smaller sample complexity when compared with existing approaches. In particular, we show that for a $k$th order $d\times\cdots\times d$ cubic tensor of {\it stable rank} $r_s$, the sample size requirement for achieving a relative error $\varepsilon$ is, up to a logarithmic factor, of the order $r_s^{1/2} d^{k/2} /\varepsilon$ when $\varepsilon$ is relatively large, and $r_s d /\varepsilon^2$ and essentially optimal when $\varepsilon$ is sufficiently small. It is especially noteworthy that the sample size requirement for achieving a high accuracy is of an order independent of $k$. To further demonstrate the utility of our techniques, we also study how higher order singular value decomposition (HOSVD) of large tensors can be efficiently approximated via sparsification.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1710.11298v3
PDF http://arxiv.org/pdf/1710.11298v3.pdf
PWC https://paperswithcode.com/paper/effective-tensor-sketching-via-sparsification
Repo
Framework
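
As a minimal illustration of sketching a tensor by retaining a subset of its entries, the NumPy function below keeps each entry with probability proportional to its squared magnitude and rescales it for unbiasedness. This is a generic magnitude-based scheme, not the paper's exact sampling distribution or its spectral-norm guarantee.

```python
import numpy as np

def sparsify(tensor, sample_fraction, rng=None):
    """Unbiased magnitude-based tensor sparsification (a generic sketch).

    Entry i is kept independently with probability p_i proportional to its
    squared magnitude (capped at 1) and rescaled by 1/p_i, so the sparse
    tensor has the original tensor as its expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(tensor, dtype=float)
    budget = sample_fraction * t.size                      # expected entries kept
    probs = np.minimum(1.0, budget * t**2 / np.sum(t**2))
    keep = rng.random(t.shape) < probs
    sparse = np.zeros_like(t)
    sparse[keep] = t[keep] / probs[keep]
    return sparse

# Keep ~10% of a 3rd-order cubic tensor, report relative Frobenius error.
T = np.random.default_rng(1).standard_normal((30, 30, 30))
S = sparsify(T, 0.1)
print((S != 0).mean(), np.linalg.norm(S - T) / np.linalg.norm(T))
```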

Appearance invariance in convolutional networks with neighborhood similarity

Title Appearance invariance in convolutional networks with neighborhood similarity
Authors Tolga Tasdizen, Mehdi Sajjadi, Mehran Javanmardi, Nisha Ramesh
Abstract We present a neighborhood similarity layer (NSL) which induces appearance invariance in a network when used in conjunction with convolutional layers. We are motivated by the observation that, even though convolutional networks have low generalization error, their generalization capability does not extend to samples which are not represented by the training data. For instance, while novel appearances of learned concepts pose no problem for the human visual system, feedforward convolutional networks are generally not successful in such situations. Motivated by the Gestalt principle of grouping with respect to similarity, the proposed NSL transforms its input feature map using the feature vectors at each pixel as a frame of reference, i.e. center of attention, for its surrounding neighborhood. This transformation is spatially varying, hence not a convolution. It is differentiable; therefore, networks including the proposed layer can be trained in an end-to-end manner. We analyze the invariance of NSL to significant changes in appearance that are not represented in the training data. We also demonstrate its advantages for digit recognition, semantic labeling and cell detection problems.
Tasks
Published 2017-07-03
URL http://arxiv.org/abs/1707.00755v1
PDF http://arxiv.org/pdf/1707.00755v1.pdf
PWC https://paperswithcode.com/paper/appearance-invariance-in-convolutional
Repo
Framework
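
The abstract does not spell out the exact form of the NSL, so the NumPy function below is only a plausible sketch of a neighborhood-similarity transform: each pixel's feature vector serves as the frame of reference, and the output channels are its Gaussian similarities to the feature vectors in a small spatial neighborhood, which discards absolute appearance while keeping local structure. The Gaussian kernel and the neighborhood radius are assumptions.

```python
import numpy as np

def neighborhood_similarity(features, radius=1):
    """Sketch of a neighborhood-similarity transform (not the exact NSL).

    features: array of shape (channels, height, width).  For every pixel,
    the output channels are similarities between that pixel's feature vector
    and the feature vectors of its spatial neighbors.
    """
    c, h, w = features.shape
    pad = np.pad(features, ((0, 0), (radius, radius), (radius, radius)),
                 mode="edge")
    outs = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = pad[:, radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            diff = shifted - features
            outs.append(np.exp(-np.sum(diff**2, axis=0)))   # Gaussian similarity
    return np.stack(outs, axis=0)   # (2*radius + 1)**2 output channels

x = np.random.rand(8, 32, 32)
print(neighborhood_similarity(x).shape)   # (9, 32, 32)
```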

A Framework for Inferring Causality from Multi-Relational Observational Data using Conditional Independence

Title A Framework for Inferring Causality from Multi-Relational Observational Data using Conditional Independence
Authors Sudeepa Roy, Babak Salimi
Abstract The study of causality or causal inference - how much a given treatment causally affects a given outcome in a population - goes well beyond correlation or association analysis of variables, and is critical in making sound data-driven decisions and policies in a multitude of applications. The gold standard in causal inference is performing “controlled experiments”, which often is not possible due to logistical or ethical reasons. As an alternative, inferring causality on “observational data” based on the “Neyman-Rubin potential outcome model” has been extensively used in statistics, economics, and social sciences over several decades. In this paper, we present a formal framework for sound causal analysis on observational datasets that are given as multiple relations and where the population under study is obtained by joining these base relations. We study a crucial condition for inferring causality from observational data, called the “strong ignorability assumption” (the treatment and outcome variables should be independent in the joined relation given the observed covariates), using known conditional independences that hold in the base relations. We also discuss how the structure of the conditional independences in base relations given as graphical models helps infer new conditional independences in the joined relation. The proposed framework combines concepts from databases, statistics, and graphical models, and aims to initiate new research directions spanning these fields to facilitate powerful data-driven decisions in today’s big data world.
Tasks Causal Inference
Published 2017-08-08
URL http://arxiv.org/abs/1708.02536v1
PDF http://arxiv.org/pdf/1708.02536v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-inferring-causality-from
Repo
Framework
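
For reference, the strong ignorability assumption quoted above is the standard Neyman-Rubin condition that, given the observed covariates $X$, treatment assignment is independent of the potential outcomes (together with the overlap requirement $0 < P(T{=}1 \mid X) < 1$):

$$\bigl(Y(0),\, Y(1)\bigr) \;\perp\!\!\!\perp\; T \,\mid\, X .$$

The paper's contribution is checking whether this independence can be certified in the joined relation from conditional independences already known to hold in the base relations.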

An evaluation of large-scale methods for image instance and class discovery

Title An evaluation of large-scale methods for image instance and class discovery
Authors Matthijs Douze, Hervé Jégou, Jeff Johnson
Abstract This paper aims at discovering meaningful subsets of related images from large image collections without annotations. We search for groups of images related at different semantic levels, i.e., either instances or visual classes. While k-means is usually considered as the gold standard for this task, we evaluate and show the interest of diffusion methods that have been neglected by the state of the art, such as the Markov Clustering algorithm. We report results on the ImageNet and the Paris500k instance datasets, both enlarged with images from YFCC100M. We evaluate our methods with a labelling cost that reflects how much effort a human would require to correct the generated clusters. Our analysis highlights several properties. First, when powered with an efficient GPU implementation, the cost of the discovery process is small compared to computing the image descriptors, even for collections as large as 100 million images. Second, we show that descriptors selected for instance search improve the discovery of object classes. Third, the Markov Clustering technique consistently outperforms other methods; to our knowledge it has never been considered in this large-scale scenario.
Tasks Instance Search
Published 2017-08-09
URL http://arxiv.org/abs/1708.02898v1
PDF http://arxiv.org/pdf/1708.02898v1.pdf
PWC https://paperswithcode.com/paper/an-evaluation-of-large-scale-methods-for
Repo
Framework
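
Since the Markov Clustering algorithm is the method the authors highlight, a compact dense NumPy version of plain MCL is given below; the paper's large-scale setting relies on efficient GPU implementations and sparse graphs, so this sketch only illustrates the expansion/inflation loop and how clusters are read off.

```python
import numpy as np

def markov_clustering(adjacency, expansion=2, inflation=2.0, iters=100, tol=1e-6):
    """Plain (dense) Markov Clustering on a similarity/adjacency matrix."""
    M = adjacency + np.eye(len(adjacency))        # add self-loops
    M = M / M.sum(axis=0, keepdims=True)          # make column-stochastic
    for _ in range(iters):
        prev = M
        M = np.linalg.matrix_power(M, expansion)  # expansion: spread flow
        M = M ** inflation                        # inflation: favor strong flow
        M = M / M.sum(axis=0, keepdims=True)
        if np.abs(M - prev).max() < tol:
            break
    # Rows with remaining mass are attractors; their support gives the clusters.
    clusters = {frozenset(np.flatnonzero(row > 1e-8)) for row in M if row.max() > 1e-8}
    return [sorted(int(i) for i in c) for c in clusters]

# Two 3-node cliques joined by a single weak bridge edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
print(markov_clustering(A))   # typically: [[0, 1, 2], [3, 4, 5]]
```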

Predicting Citywide Crowd Flows Using Deep Spatio-Temporal Residual Networks

Title Predicting Citywide Crowd Flows Using Deep Spatio-Temporal Residual Networks
Authors Junbo Zhang, Yu Zheng, Dekang Qi, Ruiyuan Li, Xiuwen Yi, Tianrui Li
Abstract Forecasting the flow of crowds is of great importance to traffic management and public safety, and very challenging as it is affected by many complex factors, including spatial dependencies (nearby and distant), temporal dependencies (closeness, period, trend), and external conditions (e.g., weather and events). We propose a deep-learning-based approach, called ST-ResNet, to collectively forecast two types of crowd flows (i.e., inflow and outflow) in each and every region of a city. We design an end-to-end structure of ST-ResNet based on unique properties of spatio-temporal data. More specifically, we employ the residual neural network framework to model the temporal closeness, period, and trend properties of crowd traffic. For each property, we design a branch of residual convolutional units, each of which models the spatial properties of crowd traffic. ST-ResNet learns to dynamically aggregate the output of the three residual neural networks based on data, assigning different weights to different branches and regions. The aggregation is further combined with external factors, such as weather and day of the week, to predict the final traffic of crowds in each and every region. We have developed a real-time system based on Microsoft Azure Cloud, called UrbanFlow, providing crowd flow monitoring and forecasting for Guiyang City, China. In addition, we present an extensive experimental evaluation using two types of crowd flows in Beijing and New York City (NYC), where ST-ResNet outperforms nine well-known baselines.
Tasks
Published 2017-01-10
URL http://arxiv.org/abs/1701.02543v1
PDF http://arxiv.org/pdf/1701.02543v1.pdf
PWC https://paperswithcode.com/paper/predicting-citywide-crowd-flows-using-deep
Repo
Framework
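
A sketch of the aggregation step described above, under the assumption that each branch output is fused with its own learnable weight map (Hadamard product) before adding the external-factor term and applying a tanh; the array shapes and the pre-computed external term are illustrative assumptions, not the exact ST-ResNet implementation.

```python
import numpy as np

def fuse_branches(x_close, x_period, x_trend, x_ext, w_c, w_p, w_t):
    """Sketch of region-wise fusion of the three temporal branches.

    Each branch output (2 x H x W flow maps, inflow/outflow) gets its own
    learnable weight map (element-wise product), the external-factor term is
    added, and tanh squashes the result to the normalized flow range [-1, 1].
    """
    fused = w_c * x_close + w_p * x_period + w_t * x_trend
    return np.tanh(fused + x_ext)

H, W = 32, 32
shape = (2, H, W)                        # inflow / outflow channels
rng = np.random.default_rng(0)
x_c, x_p, x_t = (rng.standard_normal(shape) for _ in range(3))
x_ext = rng.standard_normal(shape)       # assumed output of an external-factor subnet
w_c, w_p, w_t = (rng.standard_normal(shape) for _ in range(3))
print(fuse_branches(x_c, x_p, x_t, x_ext, w_c, w_p, w_t).shape)   # (2, 32, 32)
```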

Stochastic Optimization with Bandit Sampling

Title Stochastic Optimization with Bandit Sampling
Authors Farnood Salehi, L. Elisa Celis, Patrick Thiran
Abstract Many stochastic optimization algorithms work by estimating the gradient of the cost function on the fly by sampling datapoints uniformly at random from a training set. However, the estimator might have a large variance, which inadvertently slows down the convergence rate of the algorithms. One way to reduce this variance is to sample the datapoints from a carefully selected non-uniform distribution. In this work, we propose a novel non-uniform sampling approach that uses the multi-armed bandit framework. Theoretically, we show that our algorithm asymptotically approximates the optimal variance within a factor of 3. Empirically, we show that using this datapoint-selection technique results in a significant reduction in the convergence time and variance of several stochastic optimization algorithms such as SGD, SVRG and SAGA. This approach for sampling datapoints is general, and can be used in conjunction with any algorithm that uses an unbiased gradient estimation – we expect it to have broad applicability beyond the specific examples explored in this work.
Tasks Stochastic Optimization
Published 2017-08-08
URL http://arxiv.org/abs/1708.02544v2
PDF http://arxiv.org/pdf/1708.02544v2.pdf
PWC https://paperswithcode.com/paper/stochastic-optimization-with-bandit-sampling
Repo
Framework
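
A minimal NumPy sketch of the idea: sampling probabilities are adapted multiplicatively from observed gradient norms, and sampled gradients are importance-weighted so each step remains an unbiased estimate of the full gradient. This is a generic EXP3-style stand-in, not the authors' algorithm or its variance guarantee; the learning rates, mixing parameter and clipping are illustrative assumptions.

```python
import numpy as np

def sgd_bandit_sampling(grad_fn, w0, n, steps, lr=0.01, eta=0.1, gamma=0.2, rng=None):
    """SGD with adaptive non-uniform datapoint sampling (generic sketch).

    grad_fn(w, i) returns the gradient of datapoint i's loss at w.  Sampled
    gradients are reweighted by 1/(n*p_i) to keep the step unbiased; gamma
    mixes in uniform sampling so no probability collapses to zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    w, weights = np.array(w0, dtype=float), np.ones(n)
    for _ in range(steps):
        p = (1 - gamma) * weights / weights.sum() + gamma / n
        i = rng.choice(n, p=p)
        g = grad_fn(w, i)
        w -= lr * g / (n * p[i])                       # unbiased SGD step
        # Datapoints that keep producing large gradients get sampled more often.
        weights[i] *= np.exp(np.clip(eta * np.linalg.norm(g), 0.0, 10.0))
    return w

# Noiseless least squares: gradient of 0.5 * (x_i . w - y_i)^2 per datapoint.
rng = np.random.default_rng(0)
X, w_true = rng.standard_normal((200, 5)), rng.standard_normal(5)
y = X @ w_true
grad = lambda w, i: (X[i] @ w - y[i]) * X[i]
print(np.round(sgd_bandit_sampling(grad, np.zeros(5), n=200, steps=5000), 3))
# Should land close to w_true.
```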

Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition

Title Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition
Authors Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, Hank Liao, Ananda Theertha Suresh, Felix Yu
Abstract Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally more expensive than N-gram LMs for decoding, and thus, challenging to integrate into speech recognizers. Recent research has proposed the use of lattice-rescoring algorithms using RNNLMs and LSTMLMs as an efficient strategy to integrate these models into a speech recognition system. In this paper, we evaluate existing lattice rescoring algorithms along with new variants on a YouTube speech recognition task. Lattice rescoring using LSTMLMs reduces the word error rate (WER) for this task by 8% relative to the WER obtained using an N-gram LM.
Tasks Speech Recognition
Published 2017-11-15
URL http://arxiv.org/abs/1711.05448v1
PDF http://arxiv.org/pdf/1711.05448v1.pdf
PWC https://paperswithcode.com/paper/lattice-rescoring-strategies-for-long-short
Repo
Framework
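
Proper lattice rescoring expands lattice states with LM histories; the simplest related baseline, and the easiest to write down, is n-best rescoring, sketched below, where the N-gram LM score of each hypothesis is interpolated with the LSTM LM score and the list is re-ranked. The lstm_logprob callable and the interpolation weight are assumptions for illustration.

```python
def rescore_nbest(nbest, lstm_logprob, lam=0.5):
    """Rescore an n-best list with an LSTM LM via log-linear interpolation.

    nbest: list of (words, total_first_pass_score, ngram_lm_score), all in
    the log domain.  lstm_logprob(words) -> float is an assumed callable
    wrapping the LSTM language model.
    """
    rescored = []
    for words, total_score, ngram_lm in nbest:
        mixed_lm = (1.0 - lam) * ngram_lm + lam * lstm_logprob(words)
        # Swap the old LM contribution for the interpolated one.
        rescored.append((total_score - ngram_lm + mixed_lm, words))
    return [words for _, words in sorted(rescored, key=lambda t: t[0], reverse=True)]
```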

Non-Associative Learning Representation in the Nervous System of the Nematode Caenorhabditis elegans

Title Non-Associative Learning Representation in the Nervous System of the Nematode Caenorhabditis elegans
Authors Ramin M. Hasani, Magdalena Fuchs, Victoria Beneder, Radu Grosu
Abstract Caenorhabditis elegans (C. elegans) exhibits remarkable behavioral plasticity, including complex non-associative and associative learning. Understanding the principles of such mechanisms can provide constructive inspiration for the design of efficient learning algorithms. In the present study, we propose a novel approach to modeling single neurons and synapses in order to study the mechanisms underlying learning in the C. elegans nervous system. In this regard, we construct a precise mathematical model of sensory neurons that includes multi-scale details from genes, ion channels and ion pumps, together with a dynamic model of synapses comprising neurotransmitter and receptor kinetics. We recapitulate the mechanosensory habituation mechanism, a non-associative learning process in which elements of the neural network tune their parameters as a result of repeated input stimuli. Accordingly, we quantitatively demonstrate the roots of such plasticity in the neuronal- and synaptic-level representations. Our findings can potentially give rise to the development of new bio-inspired learning algorithms.
Tasks
Published 2017-03-18
URL http://arxiv.org/abs/1703.06264v3
PDF http://arxiv.org/pdf/1703.06264v3.pdf
PWC https://paperswithcode.com/paper/non-associative-learning-representation-in
Repo
Framework

WAYLA - Generating Images from Eye Movements

Title WAYLA - Generating Images from Eye Movements
Authors Bingqing Yu, James J. Clark
Abstract We present a method for reconstructing images viewed by observers based only on their eye movements. By exploring the relationships between gaze patterns and image stimuli, the “What Are You Looking At?” (WAYLA) system learns to synthesize photo-realistic images that are similar to the original pictures being viewed. The WAYLA approach is based on the Conditional Generative Adversarial Network (Conditional GAN) image-to-image translation technique of Isola et al. We consider two specific applications - the first, of reconstructing newspaper images from gaze heat maps, and the second, of detailed reconstruction of images containing only text. The newspaper image reconstruction process is divided into two image-to-image translation operations, the first mapping gaze heat maps into image segmentations, and the second mapping the generated segmentation into a newspaper image. We validate the performance of our approach using various evaluation metrics, along with human visual inspection. All results confirm the ability of our network to perform image generation tasks using eye tracking data.
Tasks Eye Tracking, Image Generation, Image Reconstruction, Image-to-Image Translation
Published 2017-11-21
URL http://arxiv.org/abs/1711.07974v1
PDF http://arxiv.org/pdf/1711.07974v1.pdf
PWC https://paperswithcode.com/paper/wayla-generating-images-from-eye-movements
Repo
Framework

AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms

Title AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms
Authors Marco F. Cusumano-Towner, Vikash K. Mansinghka
Abstract Approximate probabilistic inference algorithms are central to many fields. Examples include sequential Monte Carlo inference in robotics, variational inference in machine learning, and Markov chain Monte Carlo inference in statistics. A key problem faced by practitioners is measuring the accuracy of an approximate inference algorithm on a specific data set. This paper introduces the auxiliary inference divergence estimator (AIDE), an algorithm for measuring the accuracy of approximate inference algorithms. AIDE is based on the observation that inference algorithms can be treated as probabilistic models and the random variables used within the inference algorithm can be viewed as auxiliary variables. This view leads to a new estimator for the symmetric KL divergence between the approximating distributions of two inference algorithms. The paper illustrates application of AIDE to algorithms for inference in regression, hidden Markov, and Dirichlet process mixture models. The experiments show that AIDE captures the qualitative behavior of a broad class of inference algorithms and can detect failure modes of inference algorithms that are missed by standard heuristics.
Tasks
Published 2017-05-19
URL http://arxiv.org/abs/1705.07224v2
PDF http://arxiv.org/pdf/1705.07224v2.pdf
PWC https://paperswithcode.com/paper/aide-an-algorithm-for-measuring-the-accuracy
Repo
Framework
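
For context, the quantity AIDE estimates is the symmetric KL divergence between the approximating distributions $p$ and $q$ of two inference algorithms; writing out the standard definition:

$$D_{\mathrm{sym}}(p, q) \;=\; \mathrm{KL}(p \,\|\, q) + \mathrm{KL}(q \,\|\, p) \;=\; \mathbb{E}_{x \sim p}\!\left[\log \tfrac{p(x)}{q(x)}\right] + \mathbb{E}_{x \sim q}\!\left[\log \tfrac{q(x)}{p(x)}\right].$$

AIDE estimates this quantity without evaluating $p$ and $q$ pointwise, by treating the inference algorithms' internal random choices as auxiliary variables.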

A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management

Title A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management
Authors Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić
Abstract Dialogue assistants are rapidly becoming an indispensable daily aid. To avoid the significant effort needed to hand-craft the required dialogue flow, the Dialogue Management (DM) module can be cast as a continuous Markov Decision Process (MDP) and trained through Reinforcement Learning (RL). Several RL models have been investigated over recent years. However, the lack of a common benchmarking framework makes it difficult to perform a fair comparison between different models and their capability to generalise to different environments. Therefore, this paper proposes a set of challenging simulated environments for dialogue model development and evaluation. To provide some baselines, we investigate a number of representative parametric algorithms, namely the deep reinforcement learning algorithms DQN, A2C and Natural Actor-Critic, and compare them to a non-parametric model, GP-SARSA. Both the environments and policy models are implemented using the publicly available PyDial toolkit and released online, in order to establish a testbed framework for further experiments and to facilitate experimental reproducibility.
Tasks Dialogue Management
Published 2017-11-29
URL http://arxiv.org/abs/1711.11023v2
PDF http://arxiv.org/pdf/1711.11023v2.pdf
PWC https://paperswithcode.com/paper/a-benchmarking-environment-for-reinforcement
Repo
Framework