Paper Group ANR 457
Generative Adversarial Network-based Synthesis of Visible Faces from Polarimetric Thermal Faces
Title | Generative Adversarial Network-based Synthesis of Visible Faces from Polarimetric Thermal Faces |
Authors | He Zhang, Vishal M. Patel, Benjamin S. Riggan, Shuowen Hu |
Abstract | The large domain discrepancy between faces captured in the polarimetric (or conventional) thermal domain and the visible domain makes cross-domain face recognition quite a challenging problem for both human examiners and computer vision algorithms. Previous approaches utilize a two-step procedure (visible feature estimation and visible image reconstruction) to synthesize the visible image given the corresponding polarimetric thermal image. However, these are regarded as two disjoint steps and hence may hinder the performance of visible face reconstruction. We argue that joint optimization would be a better way to reconstruct more photo-realistic images for both computer vision algorithms and human examiners to examine. To this end, this paper proposes a Generative Adversarial Network-based Visible Face Synthesis (GAN-VFS) method to synthesize more photo-realistic visible face images from their corresponding polarimetric images. To ensure that the encoded visible features contain more semantically meaningful information for reconstructing the visible face image, a guidance sub-network is incorporated into the training procedure. To achieve photo-realism while preserving discriminative characteristics of the reconstructed outputs, an identity loss combined with a perceptual loss is optimized in the framework. Multiple experiments conducted under different experimental protocols demonstrate that the proposed method achieves state-of-the-art performance. |
Tasks | Face Generation, Face Recognition, Face Reconstruction, Image Reconstruction |
Published | 2017-08-08 |
URL | http://arxiv.org/abs/1708.02681v1 |
http://arxiv.org/pdf/1708.02681v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-network-based-1 |
Repo | |
Framework | |
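The abstract above describes combining an adversarial loss with perceptual and identity losses into one jointly optimized objective. A minimal sketch of that combination is below; the weighting factors, the squared-error stand-ins for the perceptual and identity terms, and all function names are illustrative assumptions, not the paper's actual formulation or values.

```python
# Toy sketch of a combined GAN objective: adversarial + perceptual + identity
# terms, jointly weighted. In practice the perceptual features come from a
# fixed pretrained network and the identity embeddings from a face recognizer;
# here they are just equal-length vectors.

def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def combined_loss(adv_loss, feat_fake, feat_real, id_fake, id_real,
                  lambda_p=1.0, lambda_i=0.1):
    """Combine adversarial, perceptual (feature-space), and identity losses."""
    perceptual = mse(feat_fake, feat_real)   # match features of real image
    identity = mse(id_fake, id_real)         # match identity embeddings
    return adv_loss + lambda_p * perceptual + lambda_i * identity
```

Joint optimization of this single scalar (rather than a two-step estimate-then-reconstruct pipeline) is the design choice the abstract argues for.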
Show and Recall: Learning What Makes Videos Memorable
Title | Show and Recall: Learning What Makes Videos Memorable |
Authors | Sumit Shekhar, Dhruv Singal, Harvineet Singh, Manav Kedia, Akhil Shetty |
Abstract | With the explosion of video content on the Internet, there is a need for research on methods for video analysis which take human cognition into account. One such cognitive measure is memorability, or the ability to recall visual content after watching it. Prior research has looked into image memorability and shown that it is intrinsic to visual content, but the problem of modeling video memorability has not been addressed sufficiently. In this work, we develop a prediction model for video memorability that accounts for the complexities of video content. Detailed feature analysis reveals that the proposed method correlates well with existing findings on memorability. We also describe a novel experiment of predicting video sub-shot memorability and show that our approach improves over current memorability methods in this task. Experiments on standard datasets demonstrate that the proposed metric can achieve results on par with or better than the state-of-the-art methods for video summarization. |
Tasks | Video Summarization |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05357v3 |
http://arxiv.org/pdf/1707.05357v3.pdf | |
PWC | https://paperswithcode.com/paper/show-and-recall-learning-what-makes-videos |
Repo | |
Framework | |
Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection
Title | Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection |
Authors | Haw-Shiuan Chang, ZiYun Wang, Luke Vilnis, Andrew McCallum |
Abstract | Modeling hypernymy, such as poodle is-a dog, is an important generalization aid to many NLP tasks, such as entailment, coreference, relation extraction, and question answering. Supervised learning from labeled hypernym sources, such as WordNet, limits the coverage of these models, which can be addressed by learning hypernyms from unlabeled text. Existing unsupervised methods either do not scale to large vocabularies or yield unacceptably poor accuracy. This paper introduces distributional inclusion vector embedding (DIVE), a simple-to-implement unsupervised method of hypernym discovery via per-word non-negative vector embeddings which preserve the inclusion property of word contexts in a low-dimensional and interpretable space. In experimental evaluations more comprehensive than any previous literature of which we are aware (evaluating on 11 datasets using multiple existing as well as newly proposed scoring functions), we find that our method provides up to double the precision of previous unsupervised embeddings and the highest average performance, using a much more compact word representation and yielding many new state-of-the-art results. |
Tasks | Hypernym Discovery, Question Answering, Relation Extraction |
Published | 2017-10-02 |
URL | http://arxiv.org/abs/1710.00880v3 |
http://arxiv.org/pdf/1710.00880v3.pdf | |
PWC | https://paperswithcode.com/paper/distributional-inclusion-vector-embedding-for |
Repo | |
Framework | |
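The abstract's key idea is that non-negative embeddings can preserve the distributional inclusion property: a hyponym's contexts should be (mostly) covered by its hypernym's contexts. A minimal inclusion score in that spirit is sketched below; this is a Weeds-precision-style ratio, an illustrative assumption rather than DIVE's exact scoring function.

```python
# Hypernymy scoring via inclusion of non-negative vectors: the fraction of the
# hyponym's mass that the candidate hypernym covers. Close to 1 suggests
# "hypo is-a hyper" (e.g., poodle -> dog); low values suggest no inclusion.

def inclusion_score(hypo, hyper):
    """Fraction of the hyponym's (non-negative) mass covered by the hypernym."""
    covered = sum(min(h, g) for h, g in zip(hypo, hyper))
    total = sum(hypo)
    return covered / total if total else 0.0
```

Because the embeddings are non-negative and dimension-wise interpretable, the score can be read as "how much of the hyponym's contexts the hypernym also occurs in."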
CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images
Title | CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images |
Authors | Yudong Guo, Juyong Zhang, Jianfei Cai, Boyi Jiang, Jianmin Zheng |
Abstract | With the power of convolutional neural networks (CNNs), CNN-based face reconstruction has recently shown promising performance in reconstructing detailed face shape from 2D face images. The success of CNN-based methods relies on a large amount of labeled data. The state-of-the-art synthesizes such data using a coarse morphable face model, which, however, has difficulty generating detailed photo-realistic images of faces (with wrinkles). This paper presents a novel face data generation method. Specifically, we render a large number of photo-realistic face images with different attributes based on inverse rendering. Furthermore, we construct a fine-detailed face image dataset by transferring different scales of details from one image to another. We also construct a large number of video-type adjacent frame pairs by simulating the distribution of real video data. With these carefully constructed datasets, we propose a coarse-to-fine learning framework consisting of three convolutional networks. The networks are trained for real-time detailed 3D face reconstruction from monocular video as well as from a single image. Extensive experimental results demonstrate that our framework can produce high-quality reconstructions with much less computation time than the state-of-the-art. Moreover, our method is robust to pose, expression and lighting due to the diversity of our data. |
Tasks | 3D Face Reconstruction, Face Reconstruction |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.00980v3 |
http://arxiv.org/pdf/1708.00980v3.pdf | |
PWC | https://paperswithcode.com/paper/cnn-based-real-time-dense-face-reconstruction |
Repo | |
Framework | |
Setting Players’ Behaviors in World of Warcraft through Semi-Supervised Learning
Title | Setting Players’ Behaviors in World of Warcraft through Semi-Supervised Learning |
Authors | Marcelo Souza Nery, Roque Anderson Teixeira, Victor do Nascimento Silva, Adriano Alonso Veloso |
Abstract | Digital games are one of the major and most important fields of the entertainment domain, which also includes cinema and music. Numerous attempts have been made to improve the quality of games, spanning both more realistic artistic production and advances in computer science. Assessing a player’s behavior, a task known as player modeling, is a pressing need, as it can lead to possible improvements in terms of: (i) a better game interaction experience, (ii) better exploitation of the relationships between players, and (iii) increasing/maintaining the number of players interested in the game. In this paper we model players using the four basic behaviors proposed in \cite{BartleArtigo}, namely: achiever, explorer, socializer and killer. Our analysis is carried out using data obtained from the game “World of Warcraft” over 3 years (2006 $-$ 2009). We employ a semi-supervised learning technique in order to find characteristics that possibly impact a player’s behavior. |
Tasks | |
Published | 2017-06-08 |
URL | http://arxiv.org/abs/1706.02780v1 |
http://arxiv.org/pdf/1706.02780v1.pdf | |
PWC | https://paperswithcode.com/paper/setting-players-behaviors-in-world-of |
Repo | |
Framework | |
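The abstract mentions a semi-supervised learning technique for assigning behaviors to players. A minimal self-training sketch in that spirit is below: fit per-behavior centroids on the few labeled players, then pseudo-label the unlabeled ones. The feature values, behavior names, and nearest-centroid classifier are illustrative assumptions, not the paper's actual method or data.

```python
# Self-training sketch: (1) compute one centroid per labeled behavior class,
# (2) assign each unlabeled player to the nearest centroid. Features could be
# things like quests completed, zones visited, chat volume, PvP kills.

def centroids(labeled):
    """labeled: list of (feature_vector, behavior) pairs -> behavior centroids."""
    sums, counts = {}, {}
    for x, y in labeled:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def pseudo_label(unlabeled, cents):
    """Assign each unlabeled feature vector to its nearest behavior centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(cents, key=lambda y: dist2(x, cents[y])) for x in unlabeled]
```

In a full self-training loop, the confident pseudo-labels would be added to the labeled set and the centroids refit.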
Concentration of tempered posteriors and of their variational approximations
Title | Concentration of tempered posteriors and of their variational approximations |
Authors | Pierre Alquier, James Ridgway |
Abstract | While Bayesian methods are extremely popular in statistics and machine learning, their application to massive datasets is often challenging, when possible at all. Indeed, classical MCMC algorithms are prohibitively slow when both the model dimension and the sample size are large. Variational Bayesian (VB) methods aim at approximating the posterior by a distribution in a tractable family; thus, MCMC is replaced by an optimization algorithm that is orders of magnitude faster. VB methods have been applied to computationally demanding applications including collaborative filtering, image and video processing, and text processing in NLP. However, despite very good results in practice, the theoretical properties of these approximations are usually not known. In this paper, we propose a general approach to prove the concentration of variational approximations of fractional posteriors. We apply our theory to two examples: matrix completion, and Gaussian VB. |
Tasks | Matrix Completion |
Published | 2017-06-28 |
URL | http://arxiv.org/abs/1706.09293v3 |
http://arxiv.org/pdf/1706.09293v3.pdf | |
PWC | https://paperswithcode.com/paper/concentration-of-tempered-posteriors-and-of |
Repo | |
Framework | |
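A fractional (tempered) posterior raises the likelihood to a power alpha in (0, 1] before combining it with the prior. On a discrete parameter grid this is a one-liner, sketched below; the grid values and densities in the example are illustrative assumptions.

```python
# Tempered posterior on a discrete grid of parameter values:
#   pi_alpha(theta | data) proportional to likelihood(theta)^alpha * prior(theta).
# alpha = 1 recovers the ordinary posterior; alpha < 1 flattens the likelihood.

def tempered_posterior(prior, likelihood, alpha):
    """prior, likelihood: per-grid-point values; returns normalized weights."""
    unnorm = [p * (l ** alpha) for p, l in zip(prior, likelihood)]
    z = sum(unnorm)
    return [w / z for w in unnorm]
```

The variational approximations analyzed in the paper target exactly this kind of fractional posterior, replacing the normalization/integration step with an optimization over a tractable family.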
Empirical Bayes Estimators for High-Dimensional Sparse Vectors
Title | Empirical Bayes Estimators for High-Dimensional Sparse Vectors |
Authors | Pavan Srinath, Ramji Venkataramanan |
Abstract | The problem of estimating a high-dimensional sparse vector $\boldsymbol{\theta} \in \mathbb{R}^n$ from an observation in i.i.d. Gaussian noise is considered. The performance is measured using squared-error loss. An empirical Bayes shrinkage estimator, derived using a Bernoulli-Gaussian prior, is analyzed and compared with the well-known soft-thresholding estimator. We obtain concentration inequalities for Stein’s unbiased risk estimate and the loss function of both estimators. The results show that for large $n$, both the risk estimate and the loss function concentrate on deterministic values close to the true risk. Depending on the underlying $\boldsymbol{\theta}$, either the proposed empirical Bayes (eBayes) estimator or soft-thresholding may have smaller loss. We consider a hybrid estimator that attempts to pick the better of the soft-thresholding estimator and the eBayes estimator by comparing their risk estimates. It is shown that: i) the loss of the hybrid estimator concentrates on the minimum of the losses of the two competing estimators, and ii) the risk of the hybrid estimator is within order $\frac{1}{\sqrt{n}}$ of the minimum of the two risks. Simulation results are provided to support the theoretical results. Finally, we use the eBayes and hybrid estimators as denoisers in the approximate message passing (AMP) algorithm for compressed sensing, and show that their performance is superior to the soft-thresholding denoiser in a wide range of settings. |
Tasks | |
Published | 2017-07-28 |
URL | http://arxiv.org/abs/1707.09161v3 |
http://arxiv.org/pdf/1707.09161v3.pdf | |
PWC | https://paperswithcode.com/paper/empirical-bayes-estimators-for-high |
Repo | |
Framework | |
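The soft-thresholding baseline and its Stein unbiased risk estimate (SURE) are concrete enough to sketch, along with the hybrid idea of picking whichever candidate reports the smaller risk estimate. The sketch below assumes unit noise variance and omits the eBayes estimator itself; any second (estimate, SURE) pair can stand in for it.

```python
# Soft-thresholding of y ~ N(theta, I), its SURE (Donoho's formula for
# sigma = 1), and a hybrid selector that keeps the candidate with the
# smaller risk estimate.

def soft_threshold(y, lam):
    """Componentwise soft-thresholding of the observation vector y."""
    return [max(abs(v) - lam, 0.0) * (1 if v >= 0 else -1) for v in y]

def sure_soft(y, lam):
    """SURE for soft-thresholding with noise variance 1:
    n - 2 * #{i : |y_i| <= lam} + sum_i min(|y_i|, lam)^2."""
    n = len(y)
    return (n
            - 2 * sum(1 for v in y if abs(v) <= lam)
            + sum(min(abs(v), lam) ** 2 for v in y))

def hybrid(candidates):
    """candidates: list of (estimate, risk_estimate); keep the smaller SURE."""
    return min(candidates, key=lambda c: c[1])[0]
```

The paper's concentration results are what justify this selection step: for large n, each SURE value is close to the corresponding true loss, so comparing risk estimates nearly picks the better estimator.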
A Further Analysis of The Role of Heterogeneity in Coevolutionary Spatial Games
Title | A Further Analysis of The Role of Heterogeneity in Coevolutionary Spatial Games |
Authors | Marcos Cardinot, Josephine Griffith, Colm O’Riordan |
Abstract | Heterogeneity has been studied as one of the most common explanations of the puzzle of cooperation in social dilemmas. A large number of papers have been published discussing the effects of increasing heterogeneity in structured populations of agents, where it has been established that heterogeneity may favour cooperative behaviour if it supports agents to locally coordinate their strategies. In this paper, assuming an existing model of a heterogeneous weighted network, we aim to further this analysis by exploring the relationship (if any) between heterogeneity and cooperation. We adopt a weighted network which is fully populated by agents playing either the Prisoner’s Dilemma or the Optional Prisoner’s Dilemma game with coevolutionary rules, i.e., not only the strategies but also the link weights evolve over time. Surprisingly, results show that the heterogeneity of link weights (states) on its own does not always promote cooperation; rather, cooperation is actually favoured by the increase in the number of overlapping states and not by the heterogeneity itself. We believe that these results can guide further research towards a more accurate analysis of the role of heterogeneity in social dilemmas. |
Tasks | |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03417v1 |
http://arxiv.org/pdf/1711.03417v1.pdf | |
PWC | https://paperswithcode.com/paper/a-further-analysis-of-the-role-of |
Repo | |
Framework | |
Insensitive Stochastic Gradient Twin Support Vector Machine for Large Scale Problems
Title | Insensitive Stochastic Gradient Twin Support Vector Machine for Large Scale Problems |
Authors | Zhen Wang, Yuan-Hai Shao, Lan Bai, Li-Ming Liu, Nai-Yang Deng |
Abstract | The stochastic gradient descent algorithm has been successfully applied to support vector machines (as PEGASOS) for many classification problems. In this paper, we investigate the stochastic gradient descent algorithm for twin support vector machines for classification. Compared with PEGASOS, the proposed stochastic gradient twin support vector machine (SGTSVM) is insensitive to the stochastic sampling used by the stochastic gradient descent algorithm. In theory, we prove the convergence of SGTSVM, in contrast to the almost sure convergence of PEGASOS. For uniform sampling, the approximation between SGTSVM and twin support vector machines is also given, while PEGASOS only has a chance of obtaining an approximation of support vector machines. In addition, the nonlinear SGTSVM is derived directly from its linear case. Experimental results on both artificial datasets and large scale problems show the stable performance of SGTSVM with a fast learning speed. |
Tasks | |
Published | 2017-04-19 |
URL | http://arxiv.org/abs/1704.05596v2 |
http://arxiv.org/pdf/1704.05596v2.pdf | |
PWC | https://paperswithcode.com/paper/insensitive-stochastic-gradient-twin-support |
Repo | |
Framework | |
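For context on what SGTSVM is compared against, the PEGASOS baseline performs stochastic subgradient steps on the L2-regularized hinge loss. A single step is sketched below; this is the standard PEGASOS update (the baseline), not SGTSVM's twin-hyperplane update, and the step-size schedule shown is the usual 1/(lambda*t) choice.

```python
# One PEGASOS-style stochastic subgradient step for a linear SVM:
# minimize (lambda/2)*||w||^2 + hinge loss, sampling one example per step.

def pegasos_step(w, x, y, lam, t):
    """One subgradient step on example (x, y) with label y in {-1, +1}."""
    eta = 1.0 / (lam * t)                          # decaying step size
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    w = [(1 - eta * lam) * wi for wi in w]         # shrink (regularizer term)
    if margin < 1:                                 # hinge-loss subgradient
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w
```

SGTSVM instead maintains two nonparallel hyperplanes and, per the abstract, its convergence behaviour is less sensitive to which examples the sampler happens to draw.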
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Title | Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping |
Authors | Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke |
Abstract | Instrumenting and collecting annotated visual grasping datasets to train modern machine learning algorithms can be extremely time-consuming and expensive. An appealing alternative is to use off-the-shelf simulators to render synthetic data for which ground-truth annotations are generated automatically. Unfortunately, models trained purely on simulated data often fail to generalize to the real world. We study how randomized simulated environments and domain adaptation methods can be extended to train a grasping system to grasp novel objects from raw monocular RGB images. We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN. We show that, by using synthetic data and domain adaptation, we are able to reduce the number of real-world samples needed to achieve a given level of performance by up to 50 times, using only randomly generated simulated objects. We also show that by using only unlabeled real-world data and our GraspGAN methodology, we obtain real-world grasping performance without any real-world labels that is similar to that achieved with 939,777 labeled real-world samples. |
Tasks | Domain Adaptation, Robotic Grasping |
Published | 2017-09-22 |
URL | http://arxiv.org/abs/1709.07857v2 |
http://arxiv.org/pdf/1709.07857v2.pdf | |
PWC | https://paperswithcode.com/paper/using-simulation-and-domain-adaptation-to |
Repo | |
Framework | |
Image Compression Based on Compressive Sensing: End-to-End Comparison with JPEG
Title | Image Compression Based on Compressive Sensing: End-to-End Comparison with JPEG |
Authors | Xin Yuan, Raziel Haimi-Cohen |
Abstract | We present an end-to-end image compression system based on compressive sensing. The presented system integrates the conventional scheme of compressive sampling and reconstruction with quantization and entropy coding. The compression performance, in terms of decoded image quality versus data rate, is shown to be comparable with JPEG and significantly better in the low-rate range. We study the parameters that influence the system performance, including (i) the choice of sensing matrix, (ii) the trade-off between quantization and compression ratio, and (iii) the reconstruction algorithms. We propose an effective method to jointly control the quantization step and compression ratio in order to achieve near-optimal quality at any given bit rate. Furthermore, our proposed image compression system can be directly used in a compressive sensing camera, e.g., the single-pixel camera, to construct a hardware compressive sampling system. |
Tasks | Compressive Sensing, Image Compression, Quantization |
Published | 2017-06-03 |
URL | https://arxiv.org/abs/1706.01000v3 |
https://arxiv.org/pdf/1706.01000v3.pdf | |
PWC | https://paperswithcode.com/paper/image-compression-based-on-compressive |
Repo | |
Framework | |
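The front end of the system described above is compressive sampling followed by scalar quantization. A minimal sketch is below; the sensing matrix values and the quantizer step are illustrative assumptions, and entropy coding and the reconstruction algorithms are omitted.

```python
# Compressive-sampling front end: linear measurements y = Phi @ x, then a
# uniform scalar quantizer. The trade-off the paper studies is between the
# number of measurements (compression ratio) and the quantizer step.

def measure(phi, x):
    """Compute y = Phi @ x for a list-of-rows sensing matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in phi]

def quantize(y, step):
    """Uniform scalar quantization: snap each measurement to the grid."""
    return [step * round(v / step) for v in y]
```

At a fixed bit budget, a coarser step frees bits for more measurements and vice versa, which is the joint rate-control knob the abstract refers to.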
Structured Production System (extended abstract)
Title | Structured Production System (extended abstract) |
Authors | Yi Zhou |
Abstract | In this extended abstract, we propose Structured Production Systems (SPS), which extend traditional production systems with well-formed syntactic structures. Due to the richness of these structures, structured production systems significantly enhance the expressive power as well as the flexibility of production systems, for instance, to handle uncertainty. We show that different rule application strategies can be reduced to the basic one by utilizing structures. Also, many fundamental approaches in computer science, including automata, grammars and logic, can be captured by structured production systems. |
Tasks | |
Published | 2017-04-26 |
URL | http://arxiv.org/abs/1704.07950v1 |
http://arxiv.org/pdf/1704.07950v1.pdf | |
PWC | https://paperswithcode.com/paper/structured-production-system-extended |
Repo | |
Framework | |
Deep Local Video Feature for Action Recognition
Title | Deep Local Video Feature for Action Recognition |
Authors | Zhenzhong Lan, Yi Zhu, Alexander G. Hauptmann |
Abstract | We investigate the problem of representing an entire video using CNN features for human action recognition. Currently, limited by GPU memory, we have not been able to feed a whole video into CNNs/RNNs for end-to-end learning. A common practice is to use sampled frames as inputs and video labels as supervision. One major problem of this popular approach is that the local samples may not contain the information indicated by the global labels. To deal with this problem, we propose to treat the deep networks trained on local inputs as local feature extractors. After extracting local features, we aggregate them into global features and train another mapping function on the same training data to map the global features into global labels. We study a set of problems regarding this new type of local features, such as how to aggregate them into global features. Experimental results on the HMDB51 and UCF101 datasets show that, for these new local features, a simple maximum pooling on the sparsely sampled features leads to a significant performance improvement. |
Tasks | Temporal Action Localization |
Published | 2017-01-25 |
URL | http://arxiv.org/abs/1701.07368v2 |
http://arxiv.org/pdf/1701.07368v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-local-video-feature-for-action |
Repo | |
Framework | |
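The aggregation step the abstract highlights, maximum pooling over sparsely sampled local features, can be sketched as an elementwise max over per-snippet feature vectors:

```python
# Aggregate local (per-frame or per-snippet) CNN feature vectors into one
# global video-level feature by elementwise maximum pooling.

def max_pool(local_features):
    """Elementwise max over a list of equal-length local feature vectors."""
    return [max(col) for col in zip(*local_features)]
```

The global feature is then fed to a second classifier trained against the video-level labels, which is how the method bridges the gap between local samples and global supervision.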
A Locally Adapting Technique for Boundary Detection using Image Segmentation
Title | A Locally Adapting Technique for Boundary Detection using Image Segmentation |
Authors | Marylesa Howard, Margaret C. Hock, B. T. Meehan, Leora Dresselhaus-Cooper |
Abstract | Rapid growth in the field of quantitative digital image analysis is paving the way for researchers to make precise measurements about objects in an image. To compute quantities from the image such as the density of compressed materials or the velocity of a shockwave, we must determine object boundaries. Images containing regions that each have a spatial trend in intensity are of particular interest. We present a supervised image segmentation method that incorporates spatial information to locate boundaries between regions with overlapping intensity histograms. The segmentation of a pixel is determined by comparing its intensity to distributions from local, nearby pixel intensities. Because of the statistical nature of the algorithm, we use maximum likelihood estimation theory to quantify uncertainty about each boundary. We demonstrate the success of this algorithm on a radiograph of a multicomponent cylinder and on an optical image of a laser-induced shockwave, and we provide final boundary locations with associated bands of uncertainty. |
Tasks | Boundary Detection, Semantic Segmentation |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.09030v1 |
http://arxiv.org/pdf/1707.09030v1.pdf | |
PWC | https://paperswithcode.com/paper/a-locally-adapting-technique-for-boundary |
Repo | |
Framework | |
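The per-pixel decision described above, comparing a pixel's intensity to distributions of nearby labeled intensities, can be sketched as a maximum-likelihood choice between local Gaussian fits. The choice of Gaussian model, the variance floor, and the region names are illustrative assumptions; the paper's uncertainty quantification step is omitted.

```python
import math

def gaussian_loglik(v, samples, eps=1e-6):
    """Log-likelihood of intensity v under a Gaussian fit to local samples."""
    n = len(samples)
    mu = sum(samples) / n
    var = max(sum((s - mu) ** 2 for s in samples) / n, eps)  # floored variance
    return -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)

def classify_pixel(v, region_samples):
    """region_samples: dict region -> local intensities; return the ML region."""
    return max(region_samples,
               key=lambda r: gaussian_loglik(v, region_samples[r]))
```

Because the fits are local, a spatial trend in intensity within a region does not confuse the boundary decision the way a single global histogram would.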
On the nonparametric maximum likelihood estimator for Gaussian location mixture densities with application to Gaussian denoising
Title | On the nonparametric maximum likelihood estimator for Gaussian location mixture densities with application to Gaussian denoising |
Authors | Sujayam Saha, Adityanand Guntuboyina |
Abstract | We study the Nonparametric Maximum Likelihood Estimator (NPMLE) for estimating Gaussian location mixture densities in $d$-dimensions from independent observations. Unlike usual likelihood-based methods for fitting mixtures, NPMLEs are based on convex optimization. We prove finite sample results on the Hellinger accuracy of every NPMLE. Our results imply, in particular, that every NPMLE achieves near parametric risk (up to logarithmic multiplicative factors) when the true density is a discrete Gaussian mixture without any prior information on the number of mixture components. NPMLEs can naturally be used to yield empirical Bayes estimates of the Oracle Bayes estimator in the Gaussian denoising problem. We prove bounds for the accuracy of the empirical Bayes estimate as an approximation to the Oracle Bayes estimator. Here our results imply that the empirical Bayes estimator performs at nearly the optimal level (up to logarithmic multiplicative factors) for denoising in clustering situations without any prior knowledge of the number of clusters. |
Tasks | Denoising |
Published | 2017-12-06 |
URL | https://arxiv.org/abs/1712.02009v2 |
https://arxiv.org/pdf/1712.02009v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-nonparametric-maximum-likelihood |
Repo | |
Framework | |
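Once an NPMLE returns a discrete mixing distribution (atoms with weights), the empirical Bayes denoising step in the abstract is the posterior mean under that prior. A minimal sketch for unit-variance Gaussian noise is below; the atoms and weights in the example are illustrative, and fitting the NPMLE itself (a convex program) is omitted.

```python
import math

def posterior_mean(y, atoms, weights):
    """E[theta | y] for y ~ N(theta, 1) under a discrete prior
    sum_k weights[k] * delta(atoms[k])."""
    dens = [w * math.exp(-0.5 * (y - a) ** 2) for a, w in zip(atoms, weights)]
    z = sum(dens)
    return sum(d * a for d, a in zip(dens, atoms)) / z
```

In a clustering situation the fitted atoms play the role of cluster centers, and the posterior mean shrinks each observation toward the centers its likelihood supports, without the number of clusters being specified in advance.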