July 29, 2019

3307 words 16 mins read

Paper Group ANR 10

Semi-Latent GAN: Learning to generate and modify facial images from attributes

Title Semi-Latent GAN: Learning to generate and modify facial images from attributes
Authors Weidong Yin, Yanwei Fu, Leonid Sigal, Xiangyang Xue
Abstract Generating and manipulating human facial images using high-level attribute controls are important and interesting problems. The models proposed in previous work can solve one of these two problems (generation or manipulation), but not both coherently. This paper proposes a novel model that learns how to both generate and modify facial images from high-level semantic attributes. Our key idea is to formulate a Semi-Latent Facial Attribute Space (SL-FAS) to systematically learn the relationship between user-defined and latent attributes, as well as between those attributes and RGB imagery. As part of this newly formulated space, we propose a new model, SL-GAN, which is a specific form of Generative Adversarial Network. Finally, we present an iterative training algorithm for SL-GAN. Experiments on the recent CelebA and CASIA-WebFace datasets validate the effectiveness of our proposed framework. We will also make data, pre-trained models and code available.
Tasks
Published 2017-04-07
URL http://arxiv.org/abs/1704.02166v1
PDF http://arxiv.org/pdf/1704.02166v1.pdf
PWC https://paperswithcode.com/paper/semi-latent-gan-learning-to-generate-and
Repo
Framework
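
As a concrete, deliberately generic illustration of conditioning generation on a split attribute space, the PyTorch sketch below concatenates user-defined attributes y with a free latent code z at the generator input; editing y while holding z fixed then modifies the generated image. This is only a minimal stand-in, not the SL-GAN architecture, and the layer sizes and toggled attribute index are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttributeConditionedGenerator(nn.Module):
    """Generic attribute-conditioned generator (a sketch, not SL-GAN itself)."""

    def __init__(self, n_attrs=40, z_dim=100, img_shape=(3, 64, 64)):
        super().__init__()
        self.img_shape = img_shape
        out_dim = img_shape[0] * img_shape[1] * img_shape[2]
        self.net = nn.Sequential(
            nn.Linear(n_attrs + z_dim, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, out_dim), nn.Tanh(),  # images normalized to [-1, 1]
        )

    def forward(self, y, z):
        # User-defined attributes and the latent code jointly drive generation.
        return self.net(torch.cat([y, z], dim=1)).view(-1, *self.img_shape)

g = AttributeConditionedGenerator()
y, z = torch.zeros(1, 40), torch.randn(1, 100)
img_before = g(y, z)
y[0, 15] = 1.0            # flip one (hypothetical) attribute, keep z fixed
img_after = g(y, z)       # same latent code, modified attribute
```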

How linguistic descriptions of data can help to the teaching-learning process in higher education, case of study: artificial intelligence

Title How linguistic descriptions of data can help to the teaching-learning process in higher education, case of study: artificial intelligence
Authors Clemente Rubio-Manzano, Tomas Lermanda Senoceain
Abstract Artificial Intelligence is a central topic in the computer science curriculum. Since 2011, a project-based learning methodology based on computer games has been designed and implemented in the artificial intelligence course at the University of the Bio-Bio. The project aims to develop software-controlled agents (bots) programmed using the heuristic algorithms covered during the course. This methodology yields good learning results; however, several challenges have arisen during its implementation. In this paper we show how linguistic descriptions of data can help provide students and teachers with technical and personalized feedback about the algorithms they have learned. An algorithm behavior profile and a new Turing test for computer-game bots, both based on linguistic modelling of complex phenomena, are also proposed to address these challenges. To show and explore the possibilities of this new technology, a web platform has been designed and implemented by one of the authors, and its incorporation into the assessment process allows us to improve the teaching-learning process.
Tasks
Published 2017-11-27
URL http://arxiv.org/abs/1711.09744v3
PDF http://arxiv.org/pdf/1711.09744v3.pdf
PWC https://paperswithcode.com/paper/how-linguistic-descriptions-of-data-can-help
Repo
Framework

Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation

Title Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation
Authors Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang
Abstract In this paper, we propose an alternative method to estimate room layouts of cluttered indoor scenes. This method enjoys the benefits of two novel techniques. The first one is semantic transfer (ST), which is: (1) a formulation to integrate the relationship between scene clutter and room layout into convolutional neural networks; (2) an architecture that can be trained end-to-end; (3) a practical strategy to initialize weights for very deep networks under an unbalanced training data distribution. ST allows us to extract highly robust features under various circumstances, and to address the computational redundancy hidden in these features we develop a principled and efficient inference scheme named physics inspired optimization (PIO). PIO’s basic idea is to cast phenomena observed in the ST features in terms of mechanics concepts. Evaluations on the public datasets LSUN and Hedau show that the proposed method is more accurate than state-of-the-art methods.
Tasks Room Layout Estimation
Published 2017-07-03
URL http://arxiv.org/abs/1707.00383v1
PDF http://arxiv.org/pdf/1707.00383v1.pdf
PWC https://paperswithcode.com/paper/physics-inspired-optimization-on-semantic
Repo
Framework

Proxy Templates for Inverse Compositional Photometric Bundle Adjustment

Title Proxy Templates for Inverse Compositional Photometric Bundle Adjustment
Authors Christopher Ham, Simon Lucey, Surya Singh
Abstract Recent advances in 3D vision have demonstrated the strengths of photometric bundle adjustment. By directly minimizing reprojected pixel errors, instead of geometric reprojection errors, such methods can achieve sub-pixel alignment accuracy in both high and low textured regions. Typically, these problems are solved using a forwards compositional Lucas-Kanade formulation parameterized by 6-DoF rigid camera poses and a depth per point in the structure. For large problems the most CPU-intensive component of the pipeline is the creation and factorization of the Hessian matrix at each iteration. For many warps, the inverse compositional formulation can offer significant speed-ups since the Hessian need only be inverted once. In this paper, we show that an ordinary inverse compositional formulation does not work for warps of this type of parameterization due to ill-conditioning of its partial derivatives. However, we show that it is possible to overcome this limitation by introducing the concept of a proxy template image. We show an order of magnitude improvement in speed, with little effect on quality, going from forwards to inverse compositional in our own photometric bundle adjustment method designed for object-centric structure from motion. This means less processing time for large systems or denser reconstructions under the same real-time constraints. We additionally show that this theory can be readily applied to existing methods by integrating it with the recently released Direct Sparse Odometry SLAM algorithm.
Tasks
Published 2017-04-23
URL http://arxiv.org/abs/1704.06967v1
PDF http://arxiv.org/pdf/1704.06967v1.pdf
PWC https://paperswithcode.com/paper/proxy-templates-for-inverse-compositional
Repo
Framework
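
The speed-up described above comes from the inverse compositional trick of linearizing around the template, so the Hessian is built and inverted once instead of at every iteration. The NumPy/SciPy sketch below shows the idea for the simplest possible case, a pure-translation warp on a single template; the paper's setting (6-DoF poses, per-point depths, proxy templates) is far richer, so treat this only as an illustration of the reused Hessian.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def ic_lk_translation(template, image, p=(0.0, 0.0), iters=100, tol=1e-6):
    """Inverse compositional Lucas-Kanade for a translation-only warp.

    The steepest-descent images and the Hessian come from the template
    gradients, so the Hessian is formed and inverted exactly once.
    """
    t = template.astype(float)
    gy, gx = np.gradient(t)                          # d/drow, d/dcol
    sd = np.stack([gx.ravel(), gy.ravel()], axis=1)  # warp Jacobian = identity
    H_inv = np.linalg.inv(sd.T @ sd)                 # inverted once, reused

    rows, cols = np.mgrid[0:t.shape[0], 0:t.shape[1]]
    p = np.asarray(p, dtype=float)                   # p = (dx, dy)
    for _ in range(iters):
        warped = map_coordinates(image.astype(float),
                                 [rows + p[1], cols + p[0]], order=1)
        error = (warped - t).ravel()                 # I(W(x; p)) - T(x)
        dp = H_inv @ (sd.T @ error)
        p -= dp                                      # compose with inverted increment
        if np.linalg.norm(dp) < tol:
            break
    return p

# Recover a known (dx, dy) = (2, 3) offset on a smooth synthetic image.
rr, cc = np.mgrid[0:80, 0:80]
full = np.sin(cc / 7.0) + np.cos(rr / 9.0)
template, image = full[20:60, 20:60], full[17:77, 18:78]
print(ic_lk_translation(template, image))            # roughly [2.0, 3.0]
```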

Effective Tensor Sketching via Sparsification

Title Effective Tensor Sketching via Sparsification
Authors Dong Xia, Ming Yuan
Abstract In this paper, we investigate effective sketching schemes via sparsification for high dimensional multilinear arrays or tensors. More specifically, we propose a novel tensor sparsification algorithm that retains a subset of the entries of a tensor in a judicious way, and prove that it can attain a given level of approximation accuracy in terms of tensor spectral norm with a much smaller sample complexity when compared with existing approaches. In particular, we show that for a $k$th order $d\times\cdots\times d$ cubic tensor of {\it stable rank} $r_s$, the sample size requirement for achieving a relative error $\varepsilon$ is, up to a logarithmic factor, of the order $r_s^{1/2} d^{k/2} /\varepsilon$ when $\varepsilon$ is relatively large, and $r_s d /\varepsilon^2$ and essentially optimal when $\varepsilon$ is sufficiently small. It is especially noteworthy that the sample size requirement for achieving a high accuracy is of an order independent of $k$. To further demonstrate the utility of our techniques, we also study how higher order singular value decomposition (HOSVD) of large tensors can be efficiently approximated via sparsification.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1710.11298v3
PDF http://arxiv.org/pdf/1710.11298v3.pdf
PWC https://paperswithcode.com/paper/effective-tensor-sketching-via-sparsification
Repo
Framework
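
As a minimal illustration of sketching a tensor by retaining a subset of its entries, the NumPy function below keeps each entry with probability proportional to its squared magnitude and rescales it for unbiasedness. This is a generic magnitude-based scheme, not the paper's exact sampling distribution or its spectral-norm guarantee.

```python
import numpy as np

def sparsify(tensor, sample_fraction, rng=None):
    """Unbiased magnitude-based tensor sparsification (a generic sketch).

    Entry i is kept independently with probability p_i proportional to its
    squared magnitude (capped at 1) and rescaled by 1/p_i, so the sparse
    tensor has the original tensor as its expectation.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(tensor, dtype=float)
    budget = sample_fraction * t.size                      # expected entries kept
    probs = np.minimum(1.0, budget * t**2 / np.sum(t**2))
    keep = rng.random(t.shape) < probs
    sparse = np.zeros_like(t)
    sparse[keep] = t[keep] / probs[keep]
    return sparse

# Keep ~10% of a 3rd-order cubic tensor, report relative Frobenius error.
T = np.random.default_rng(1).standard_normal((30, 30, 30))
S = sparsify(T, 0.1)
print((S != 0).mean(), np.linalg.norm(S - T) / np.linalg.norm(T))
```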

Appearance invariance in convolutional networks with neighborhood similarity

Title Appearance invariance in convolutional networks with neighborhood similarity
Authors Tolga Tasdizen, Mehdi Sajjadi, Mehran Javanmardi, Nisha Ramesh
Abstract We present a neighborhood similarity layer (NSL) which induces appearance invariance in a network when used in conjunction with convolutional layers. We are motivated by the observation that, even though convolutional networks have low generalization error, their generalization capability does not extend to samples which are not represented by the training data. For instance, while novel appearances of learned concepts pose no problem for the human visual system, feedforward convolutional networks are generally not successful in such situations. Motivated by the Gestalt principle of grouping with respect to similarity, the proposed NSL transforms its input feature map using the feature vectors at each pixel as a frame of reference, i.e. center of attention, for its surrounding neighborhood. This transformation is spatially varying, hence not a convolution. It is differentiable; therefore, networks including the proposed layer can be trained in an end-to-end manner. We analyze the invariance of NSL to significant changes in appearance that are not represented in the training data. We also demonstrate its advantages for digit recognition, semantic labeling and cell detection problems.
Tasks
Published 2017-07-03
URL http://arxiv.org/abs/1707.00755v1
PDF http://arxiv.org/pdf/1707.00755v1.pdf
PWC https://paperswithcode.com/paper/appearance-invariance-in-convolutional
Repo
Framework
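
The abstract does not spell out the exact form of the NSL, so the NumPy function below is only a plausible sketch of a neighborhood-similarity transform: each pixel's feature vector serves as the frame of reference, and the output channels are its Gaussian similarities to the feature vectors in a small spatial neighborhood, which discards absolute appearance while keeping local structure. The Gaussian kernel and the neighborhood radius are assumptions.

```python
import numpy as np

def neighborhood_similarity(features, radius=1):
    """Sketch of a neighborhood-similarity transform (not the exact NSL).

    features: array of shape (channels, height, width).  For every pixel,
    the output channels are similarities between that pixel's feature vector
    and the feature vectors of its spatial neighbors.
    """
    c, h, w = features.shape
    pad = np.pad(features, ((0, 0), (radius, radius), (radius, radius)),
                 mode="edge")
    outs = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = pad[:, radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            diff = shifted - features
            outs.append(np.exp(-np.sum(diff**2, axis=0)))   # Gaussian similarity
    return np.stack(outs, axis=0)   # (2*radius + 1)**2 output channels

x = np.random.rand(8, 32, 32)
print(neighborhood_similarity(x).shape)   # (9, 32, 32)
```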

A Framework for Inferring Causality from Multi-Relational Observational Data using Conditional Independence

Title A Framework for Inferring Causality from Multi-Relational Observational Data using Conditional Independence
Authors Sudeepa Roy, Babak Salimi
Abstract The study of causality or causal inference - how much a given treatment causally affects a given outcome in a population - goes well beyond correlation or association analysis of variables, and is critical in making sound data-driven decisions and policies in a multitude of applications. The gold standard in causal inference is performing “controlled experiments”, which often is not possible due to logistical or ethical reasons. As an alternative, inferring causality on “observational data” based on the “Neyman-Rubin potential outcome model” has been extensively used in statistics, economics, and social sciences over several decades. In this paper, we present a formal framework for sound causal analysis on observational datasets that are given as multiple relations and where the population under study is obtained by joining these base relations. We study a crucial condition for inferring causality from observational data, called the “strong ignorability assumption” (the treatment and outcome variables should be independent in the joined relation given the observed covariates), using known conditional independences that hold in the base relations. We also discuss how the structure of the conditional independences in base relations given as graphical models helps infer new conditional independences in the joined relation. The proposed framework combines concepts from databases, statistics, and graphical models, and aims to initiate new research directions spanning these fields to facilitate powerful data-driven decisions in today’s big data world.
Tasks Causal Inference
Published 2017-08-08
URL http://arxiv.org/abs/1708.02536v1
PDF http://arxiv.org/pdf/1708.02536v1.pdf
PWC https://paperswithcode.com/paper/a-framework-for-inferring-causality-from
Repo
Framework
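
For reference, the strong ignorability assumption quoted above is the standard Neyman-Rubin condition that, given the observed covariates $X$, treatment assignment is independent of the potential outcomes (together with the overlap requirement $0 < P(T{=}1 \mid X) < 1$):

$$\bigl(Y(0),\, Y(1)\bigr) \;\perp\!\!\!\perp\; T \,\mid\, X .$$

The paper's contribution is checking whether this independence can be certified in the joined relation from conditional independences already known to hold in the base relations.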

An evaluation of large-scale methods for image instance and class discovery

Title An evaluation of large-scale methods for image instance and class discovery
Authors Matthijs Douze, Hervé Jégou, Jeff Johnson
Abstract This paper aims at discovering meaningful subsets of related images from large image collections without annotations. We search for groups of images related at different semantic levels, i.e., either instances or visual classes. While k-means is usually considered as the gold standard for this task, we evaluate and show the interest of diffusion methods that have been neglected by the state of the art, such as the Markov Clustering algorithm. We report results on the ImageNet and the Paris500k instance datasets, both enlarged with images from YFCC100M. We evaluate our methods with a labelling cost that reflects how much effort a human would require to correct the generated clusters. Our analysis highlights several properties. First, when powered with an efficient GPU implementation, the cost of the discovery process is small compared to computing the image descriptors, even for collections as large as 100 million images. Second, we show that descriptors selected for instance search improve the discovery of object classes. Third, the Markov Clustering technique consistently outperforms other methods; to our knowledge it has never been considered in this large-scale scenario.
Tasks Instance Search
Published 2017-08-09
URL http://arxiv.org/abs/1708.02898v1
PDF http://arxiv.org/pdf/1708.02898v1.pdf
PWC https://paperswithcode.com/paper/an-evaluation-of-large-scale-methods-for
Repo
Framework
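
Since the Markov Clustering algorithm is the method the authors highlight, a compact dense NumPy version of plain MCL is given below; the paper's large-scale setting relies on efficient GPU implementations and sparse graphs, so this sketch only illustrates the expansion/inflation loop and how clusters are read off.

```python
import numpy as np

def markov_clustering(adjacency, expansion=2, inflation=2.0, iters=100, tol=1e-6):
    """Plain (dense) Markov Clustering on a similarity/adjacency matrix."""
    M = adjacency + np.eye(len(adjacency))        # add self-loops
    M = M / M.sum(axis=0, keepdims=True)          # make column-stochastic
    for _ in range(iters):
        prev = M
        M = np.linalg.matrix_power(M, expansion)  # expansion: spread flow
        M = M ** inflation                        # inflation: favor strong flow
        M = M / M.sum(axis=0, keepdims=True)
        if np.abs(M - prev).max() < tol:
            break
    # Rows with remaining mass are attractors; their support gives the clusters.
    clusters = {frozenset(np.flatnonzero(row > 1e-8)) for row in M if row.max() > 1e-8}
    return [sorted(int(i) for i in c) for c in clusters]

# Two 3-node cliques joined by a single weak bridge edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
print(markov_clustering(A))   # typically: [[0, 1, 2], [3, 4, 5]]
```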

Predicting Citywide Crowd Flows Using Deep Spatio-Temporal Residual Networks

Title Predicting Citywide Crowd Flows Using Deep Spatio-Temporal Residual Networks
Authors Junbo Zhang, Yu Zheng, Dekang Qi, Ruiyuan Li, Xiuwen Yi, Tianrui Li
Abstract Forecasting the flow of crowds is of great importance to traffic management and public safety, and very challenging as it is affected by many complex factors, including spatial dependencies (nearby and distant), temporal dependencies (closeness, period, trend), and external conditions (e.g., weather and events). We propose a deep-learning-based approach, called ST-ResNet, to collectively forecast two types of crowd flows (i.e., inflow and outflow) in each and every region of a city. We design an end-to-end structure of ST-ResNet based on unique properties of spatio-temporal data. More specifically, we employ the residual neural network framework to model the temporal closeness, period, and trend properties of crowd traffic. For each property, we design a branch of residual convolutional units, each of which models the spatial properties of crowd traffic. ST-ResNet learns to dynamically aggregate the output of the three residual neural networks based on data, assigning different weights to different branches and regions. The aggregation is further combined with external factors, such as weather and day of the week, to predict the final traffic of crowds in each and every region. We have developed a real-time system based on Microsoft Azure Cloud, called UrbanFlow, providing crowd flow monitoring and forecasting for Guiyang City, China. In addition, we present an extensive experimental evaluation using two types of crowd flows in Beijing and New York City (NYC), where ST-ResNet outperforms nine well-known baselines.
Tasks
Published 2017-01-10
URL http://arxiv.org/abs/1701.02543v1
PDF http://arxiv.org/pdf/1701.02543v1.pdf
PWC https://paperswithcode.com/paper/predicting-citywide-crowd-flows-using-deep
Repo
Framework
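
A sketch of the aggregation step described above, under the assumption that each branch output is fused with its own learnable weight map (Hadamard product) before adding the external-factor term and applying a tanh; the array shapes and the pre-computed external term are illustrative assumptions, not the exact ST-ResNet implementation.

```python
import numpy as np

def fuse_branches(x_close, x_period, x_trend, x_ext, w_c, w_p, w_t):
    """Sketch of region-wise fusion of the three temporal branches.

    Each branch output (2 x H x W flow maps, inflow/outflow) gets its own
    learnable weight map (element-wise product), the external-factor term is
    added, and tanh squashes the result to the normalized flow range [-1, 1].
    """
    fused = w_c * x_close + w_p * x_period + w_t * x_trend
    return np.tanh(fused + x_ext)

H, W = 32, 32
shape = (2, H, W)                        # inflow / outflow channels
rng = np.random.default_rng(0)
x_c, x_p, x_t = (rng.standard_normal(shape) for _ in range(3))
x_ext = rng.standard_normal(shape)       # assumed output of an external-factor subnet
w_c, w_p, w_t = (rng.standard_normal(shape) for _ in range(3))
print(fuse_branches(x_c, x_p, x_t, x_ext, w_c, w_p, w_t).shape)   # (2, 32, 32)
```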

Stochastic Optimization with Bandit Sampling

Title Stochastic Optimization with Bandit Sampling
Authors Farnood Salehi, L. Elisa Celis, Patrick Thiran
Abstract Many stochastic optimization algorithms work by estimating the gradient of the cost function on the fly by sampling datapoints uniformly at random from a training set. However, the estimator might have a large variance, which inadvertently slows down the convergence rate of the algorithms. One way to reduce this variance is to sample the datapoints from a carefully selected non-uniform distribution. In this work, we propose a novel non-uniform sampling approach that uses the multi-armed bandit framework. Theoretically, we show that our algorithm asymptotically approximates the optimal variance within a factor of 3. Empirically, we show that using this datapoint-selection technique results in a significant reduction in the convergence time and variance of several stochastic optimization algorithms such as SGD, SVRG and SAGA. This approach for sampling datapoints is general, and can be used in conjunction with any algorithm that uses an unbiased gradient estimation – we expect it to have broad applicability beyond the specific examples explored in this work.
Tasks Stochastic Optimization
Published 2017-08-08
URL http://arxiv.org/abs/1708.02544v2
PDF http://arxiv.org/pdf/1708.02544v2.pdf
PWC https://paperswithcode.com/paper/stochastic-optimization-with-bandit-sampling
Repo
Framework
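
A minimal NumPy sketch of the idea: sampling probabilities are adapted multiplicatively from observed gradient norms, and sampled gradients are importance-weighted so each step remains an unbiased estimate of the full gradient. This is a generic EXP3-style stand-in, not the authors' algorithm or its variance guarantee; the learning rates, mixing parameter and clipping are illustrative assumptions.

```python
import numpy as np

def sgd_bandit_sampling(grad_fn, w0, n, steps, lr=0.01, eta=0.1, gamma=0.2, rng=None):
    """SGD with adaptive non-uniform datapoint sampling (generic sketch).

    grad_fn(w, i) returns the gradient of datapoint i's loss at w.  Sampled
    gradients are reweighted by 1/(n*p_i) to keep the step unbiased; gamma
    mixes in uniform sampling so no probability collapses to zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    w, weights = np.array(w0, dtype=float), np.ones(n)
    for _ in range(steps):
        p = (1 - gamma) * weights / weights.sum() + gamma / n
        i = rng.choice(n, p=p)
        g = grad_fn(w, i)
        w -= lr * g / (n * p[i])                       # unbiased SGD step
        # Datapoints that keep producing large gradients get sampled more often.
        weights[i] *= np.exp(np.clip(eta * np.linalg.norm(g), 0.0, 10.0))
    return w

# Noiseless least squares: gradient of 0.5 * (x_i . w - y_i)^2 per datapoint.
rng = np.random.default_rng(0)
X, w_true = rng.standard_normal((200, 5)), rng.standard_normal(5)
y = X @ w_true
grad = lambda w, i: (X[i] @ w - y[i]) * X[i]
print(np.round(sgd_bandit_sampling(grad, np.zeros(5), n=200, steps=5000), 3))
# Should land close to w_true.
```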

Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition

Title Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition
Authors Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, Hank Liao, Ananda Theertha Suresh, Felix Yu
Abstract Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally more expensive than N-gram LMs for decoding, and thus, challenging to integrate into speech recognizers. Recent research has proposed the use of lattice-rescoring algorithms using RNNLMs and LSTMLMs as an efficient strategy to integrate these models into a speech recognition system. In this paper, we evaluate existing lattice rescoring algorithms along with new variants on a YouTube speech recognition task. Lattice rescoring using LSTMLMs reduces the word error rate (WER) for this task by 8% relative to the WER obtained using an N-gram LM.
Tasks Speech Recognition
Published 2017-11-15
URL http://arxiv.org/abs/1711.05448v1
PDF http://arxiv.org/pdf/1711.05448v1.pdf
PWC https://paperswithcode.com/paper/lattice-rescoring-strategies-for-long-short
Repo
Framework
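
Proper lattice rescoring expands lattice states with LM histories; the simplest related baseline, and the easiest to write down, is n-best rescoring, sketched below, where the N-gram LM score of each hypothesis is interpolated with the LSTM LM score and the list is re-ranked. The lstm_logprob callable and the interpolation weight are assumptions for illustration.

```python
def rescore_nbest(nbest, lstm_logprob, lam=0.5):
    """Rescore an n-best list with an LSTM LM via log-linear interpolation.

    nbest: list of (words, total_first_pass_score, ngram_lm_score), all in
    the log domain.  lstm_logprob(words) -> float is an assumed callable
    wrapping the LSTM language model.
    """
    rescored = []
    for words, total_score, ngram_lm in nbest:
        mixed_lm = (1.0 - lam) * ngram_lm + lam * lstm_logprob(words)
        # Swap the old LM contribution for the interpolated one.
        rescored.append((total_score - ngram_lm + mixed_lm, words))
    return [words for _, words in sorted(rescored, key=lambda t: t[0], reverse=True)]
```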

Non-Associative Learning Representation in the Nervous System of the Nematode Caenorhabditis elegans

Title Non-Associative Learning Representation in the Nervous System of the Nematode Caenorhabditis elegans
Authors Ramin M. Hasani, Magdalena Fuchs, Victoria Beneder, Radu Grosu
Abstract Caenorhabditis elegans (C. elegans) exhibits remarkable behavioral plasticity, including complex non-associative and associative learning. Understanding the principles of such mechanisms can provide constructive inspiration for the design of efficient learning algorithms. In the present study, we propose a novel approach to modeling single neurons and synapses in order to study the mechanisms underlying learning in the C. elegans nervous system. In this regard, we construct a precise mathematical model of sensory neurons that includes multi-scale details from genes, ion channels and ion pumps, together with a dynamic model of synapses comprising neurotransmitter and receptor kinetics. We recapitulate the mechanosensory habituation mechanism, a non-associative learning process in which elements of the neural network tune their parameters as a result of repeated input stimuli. Accordingly, we quantitatively demonstrate the roots of such plasticity in the neuronal- and synaptic-level representations. Our findings can potentially give rise to the development of new bio-inspired learning algorithms.
Tasks
Published 2017-03-18
URL http://arxiv.org/abs/1703.06264v3
PDF http://arxiv.org/pdf/1703.06264v3.pdf
PWC https://paperswithcode.com/paper/non-associative-learning-representation-in
Repo
Framework

WAYLA - Generating Images from Eye Movements

Title WAYLA - Generating Images from Eye Movements
Authors Bingqing Yu, James J. Clark
Abstract We present a method for reconstructing images viewed by observers based only on their eye movements. By exploring the relationships between gaze patterns and image stimuli, the “What Are You Looking At?” (WAYLA) system learns to synthesize photo-realistic images that are similar to the original pictures being viewed. The WAYLA approach is based on the Conditional Generative Adversarial Network (Conditional GAN) image-to-image translation technique of Isola et al. We consider two specific applications - the first, of reconstructing newspaper images from gaze heat maps, and the second, of detailed reconstruction of images containing only text. The newspaper image reconstruction process is divided into two image-to-image translation operations, the first mapping gaze heat maps into image segmentations, and the second mapping the generated segmentation into a newspaper image. We validate the performance of our approach using various evaluation metrics, along with human visual inspection. All results confirm the ability of our network to perform image generation tasks using eye tracking data.
Tasks Eye Tracking, Image Generation, Image Reconstruction, Image-to-Image Translation
Published 2017-11-21
URL http://arxiv.org/abs/1711.07974v1
PDF http://arxiv.org/pdf/1711.07974v1.pdf
PWC https://paperswithcode.com/paper/wayla-generating-images-from-eye-movements
Repo
Framework

AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms

Title AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms
Authors Marco F. Cusumano-Towner, Vikash K. Mansinghka
Abstract Approximate probabilistic inference algorithms are central to many fields. Examples include sequential Monte Carlo inference in robotics, variational inference in machine learning, and Markov chain Monte Carlo inference in statistics. A key problem faced by practitioners is measuring the accuracy of an approximate inference algorithm on a specific data set. This paper introduces the auxiliary inference divergence estimator (AIDE), an algorithm for measuring the accuracy of approximate inference algorithms. AIDE is based on the observation that inference algorithms can be treated as probabilistic models and the random variables used within the inference algorithm can be viewed as auxiliary variables. This view leads to a new estimator for the symmetric KL divergence between the approximating distributions of two inference algorithms. The paper illustrates application of AIDE to algorithms for inference in regression, hidden Markov, and Dirichlet process mixture models. The experiments show that AIDE captures the qualitative behavior of a broad class of inference algorithms and can detect failure modes of inference algorithms that are missed by standard heuristics.
Tasks
Published 2017-05-19
URL http://arxiv.org/abs/1705.07224v2
PDF http://arxiv.org/pdf/1705.07224v2.pdf
PWC https://paperswithcode.com/paper/aide-an-algorithm-for-measuring-the-accuracy
Repo
Framework
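
For context, the quantity AIDE estimates is the symmetric KL divergence between the approximating distributions $p$ and $q$ of two inference algorithms; writing out the standard definition:

$$D_{\mathrm{sym}}(p, q) \;=\; \mathrm{KL}(p \,\|\, q) + \mathrm{KL}(q \,\|\, p) \;=\; \mathbb{E}_{x \sim p}\!\left[\log \tfrac{p(x)}{q(x)}\right] + \mathbb{E}_{x \sim q}\!\left[\log \tfrac{q(x)}{p(x)}\right].$$

AIDE estimates this quantity without evaluating $p$ and $q$ pointwise, by treating the inference algorithms' internal random choices as auxiliary variables.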

A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management

Title A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management
Authors Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić
Abstract Dialogue assistants are rapidly becoming an indispensable daily aid. To avoid the significant effort needed to hand-craft the required dialogue flow, the Dialogue Management (DM) module can be cast as a continuous Markov Decision Process (MDP) and trained through Reinforcement Learning (RL). Several RL models have been investigated over recent years. However, the lack of a common benchmarking framework makes it difficult to perform a fair comparison between different models and their capability to generalise to different environments. Therefore, this paper proposes a set of challenging simulated environments for dialogue model development and evaluation. To provide some baselines, we investigate a number of representative parametric algorithms, namely the deep reinforcement learning algorithms DQN, A2C and Natural Actor-Critic, and compare them to a non-parametric model, GP-SARSA. Both the environments and policy models are implemented using the publicly available PyDial toolkit and released online, in order to establish a testbed framework for further experiments and to facilitate experimental reproducibility.
Tasks Dialogue Management
Published 2017-11-29
URL http://arxiv.org/abs/1711.11023v2
PDF http://arxiv.org/pdf/1711.11023v2.pdf
PWC https://paperswithcode.com/paper/a-benchmarking-environment-for-reinforcement
Repo
Framework