July 27, 2019

Paper Group ANR 645

Micro Fourier Transform Profilometry ($μ$FTP): 3D shape measurement at 10,000 frames per second. Structure-measure: A New Way to Evaluate Foreground Maps. Locality Preserving Projections for Grassmann manifold. 3D Reconstruction of Simple Objects from A Single View Silhouette Image. Localized LRR on Grassmann Manifolds: An Extrinsic View. Fast Regr …

Micro Fourier Transform Profilometry ($μ$FTP): 3D shape measurement at 10,000 frames per second


Title	Micro Fourier Transform Profilometry ($μ$FTP): 3D shape measurement at 10,000 frames per second
Authors	Chao Zuo, Tianyang Tao, Shijie Feng, Lei Huang, Anand Asundi, Qian Chen
Abstract	Recent advances in imaging sensors and digital light projection technology have facilitated rapid progress in 3D optical sensing, enabling 3D surfaces of complex-shaped objects to be captured with improved resolution and accuracy. However, due to the large number of projection patterns required for phase recovery and disambiguation, the maximum frame rates of current 3D shape measurement techniques are still limited to the range of hundreds of frames per second (fps). Here, we demonstrate a new 3D dynamic imaging technique, Micro Fourier Transform Profilometry ($\mu$FTP), which can capture 3D surfaces of transient events at up to 10,000 fps based on our newly developed high-speed fringe projection system. Compared with existing techniques, $\mu$FTP has the prominent advantage of recovering an accurate, unambiguous, and dense 3D point cloud with only two projected patterns. Furthermore, the phase information is encoded within a single high-frequency fringe image, thereby allowing motion-artifact-free reconstruction of transient events with a temporal resolution of 50 microseconds. To show $\mu$FTP’s broad utility, we use it to reconstruct 3D videos of 4 transient scenes: vibrating cantilevers, rotating fan blades, a bullet fired from a toy gun, and a balloon’s explosion triggered by a flying dart, which were previously difficult or even impossible to capture with conventional approaches.
Tasks
Published	2017-05-31
URL	http://arxiv.org/abs/1705.10930v1
PDF	http://arxiv.org/pdf/1705.10930v1.pdf
PWC	https://paperswithcode.com/paper/micro-fourier-transform-profilometry-ftp-3d
Repo
Framework

Structure-measure: A New Way to Evaluate Foreground Maps


Title	Structure-measure: A New Way to Evaluate Foreground Maps
Authors	Deng-Ping Fan, Ming-Ming Cheng, Yun Liu, Tao Li, Ali Borji
Abstract	Foreground map evaluation is crucial for gauging the progress of object segmentation algorithms, in particular in the field of salient object detection, where the purpose is to accurately detect and segment the most salient object in a scene. Several widely used measures such as Area Under the Curve (AUC), Average Precision (AP) and the recently proposed Fbw have been utilized to evaluate the similarity between a non-binary saliency map (SM) and a ground-truth (GT) map. These measures are based on pixel-wise errors and often ignore the structural similarities. Behavioral vision studies, however, have shown that the human visual system is highly sensitive to structures in scenes. Here, we propose a novel, efficient, and easy-to-calculate measure known as the structural similarity measure (Structure-measure) to evaluate non-binary foreground maps. Our new measure simultaneously evaluates region-aware and object-aware structural similarity between an SM and a GT map. We demonstrate the superiority of our measure over existing ones using 5 meta-measures on 5 benchmark datasets.
Tasks	Object Detection, Salient Object Detection, Semantic Segmentation
Published	2017-08-02
URL	http://arxiv.org/abs/1708.00786v1
PDF	http://arxiv.org/pdf/1708.00786v1.pdf
PWC	https://paperswithcode.com/paper/structure-measure-a-new-way-to-evaluate
Repo
Framework

Locality Preserving Projections for Grassmann manifold


Title	Locality Preserving Projections for Grassmann manifold
Authors	Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Haoran Chen, Baocai Yin
Abstract	Learning on Grassmann manifolds has become popular in many computer vision tasks, owing to its strong capability to extract discriminative information from image sets and videos. However, such learning algorithms, particularly on high-dimensional Grassmann manifolds, always involve a significantly high computational cost, which seriously limits the applicability of learning on Grassmann manifolds in wider areas. In this research, we propose an unsupervised dimensionality reduction algorithm on Grassmann manifolds based on the Locality Preserving Projections (LPP) criterion. LPP is a commonly used dimensionality reduction algorithm for vector-valued data, aiming to preserve the local structure of data in the dimension-reduced space. The strategy is to construct a mapping from a higher-dimensional Grassmann manifold into a relatively low-dimensional one with more discriminative capability. The proposed method can be optimized as a basic eigenvalue problem. The performance of our proposed method is assessed on several classification and clustering tasks, and the experimental results show its clear advantages over other Grassmann-based algorithms.
Tasks	Dimensionality Reduction
Published	2017-04-27
URL	http://arxiv.org/abs/1704.08458v1
PDF	http://arxiv.org/pdf/1704.08458v1.pdf
PWC	https://paperswithcode.com/paper/locality-preserving-projections-for-grassmann
Repo
Framework
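
The abstract above describes LPP as an eigenvalue problem that preserves local neighborhood structure. As a rough illustration only, here is a minimal sketch of classical LPP on ordinary vector data, not the Grassmann variant the paper proposes. The function name `lpp`, the k-nearest-neighbor graph, the heat-kernel weights, and the parameters `k` and `t` are all assumptions made for this example.

```python
import numpy as np

def lpp(X, n_components=1, k=3, t=1.0):
    """Classical LPP on vector data X (rows are samples); hypothetical helper.

    Solves X^T L X a = lambda X^T D X a and keeps the eigenvectors with the
    smallest eigenvalues, which map neighboring samples close together.
    """
    n, d = X.shape
    # Pairwise squared Euclidean distances.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # k-nearest-neighbor graph with heat-kernel weights (assumed choice).
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(sq[i])[1:k + 1]  # skip the sample itself
        W[i, nbrs] = np.exp(-sq[i, nbrs] / t)
    W = np.maximum(W, W.T)                 # symmetrize
    D = np.diag(W.sum(axis=1))
    L = D - W                              # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-9 * np.eye(d)     # small ridge for numerical stability
    # Reduce the generalized problem to a standard symmetric one via Cholesky.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2)  # eigenvalues in ascending order
    return C_inv.T @ vecs[:, :n_components], vals[:n_components]

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
P, vals = lpp(X, n_components=2)
Y = X @ P  # 30 samples projected to 2 dimensions
```

The Cholesky step is only one way to handle the generalized eigenproblem with NumPy alone; a dedicated generalized symmetric eigensolver would do the same job.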

3D Reconstruction of Simple Objects from A Single View Silhouette Image


Title	3D Reconstruction of Simple Objects from A Single View Silhouette Image
Authors	Xinhan Di, Pengqian Yu
Abstract	While recent deep neural networks have achieved promising results for 3D reconstruction from a single-view image, these rely on the availability of RGB textures in images and extra information as supervision. In this work, we propose novel stacked hierarchical networks and an end-to-end training strategy to tackle a more challenging task for the first time: 3D reconstruction from a single-view 2D silhouette image. We demonstrate that our model is able to conduct 3D reconstruction from a single-view silhouette image both qualitatively and quantitatively. Evaluation is performed using ShapeNet for the single-view reconstruction, and results are presented in comparison with a single network to highlight the improvements obtained with the proposed stacked networks and the end-to-end training strategy. Furthermore, 3D reconstruction in terms of IoU is compared with the state of the art in 3D reconstruction from a single-view RGB image, and the proposed model achieves a higher IoU than that state of the art.
Tasks	3D Reconstruction
Published	2017-01-17
URL	http://arxiv.org/abs/1701.04752v1
PDF	http://arxiv.org/pdf/1701.04752v1.pdf
PWC	https://paperswithcode.com/paper/3d-reconstruction-of-simple-objects-from-a
Repo
Framework
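
The abstract above reports reconstruction quality as IoU (intersection over union). A minimal sketch of that metric on voxel occupancy follows, assuming each shape is given as a set of occupied `(x, y, z)` voxel indices. The function name `voxel_iou` and the toy shapes are hypothetical and not taken from the paper.

```python
def voxel_iou(pred, target):
    """Intersection over union of two sets of occupied voxel coordinates."""
    pred_set = set(pred)
    target_set = set(target)
    union = pred_set | target_set
    if not union:
        # Two empty shapes are treated as a perfect match (a convention chosen here).
        return 1.0
    return len(pred_set & target_set) / len(union)

# Toy example: two of the four distinct voxels are shared.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
target = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}
iou = voxel_iou(pred, target)  # 2 shared / 4 in the union = 0.5
```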

Localized LRR on Grassmann Manifolds: An Extrinsic View


Title	Localized LRR on Grassmann Manifolds: An Extrinsic View
Authors	Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin
Abstract	Subspace data representation has recently become a common practice in many computer vision tasks. It demands generalizing classical machine learning algorithms for subspace data. Low-Rank Representation (LRR) is one of the most successful models for clustering vectorial data according to their subspace structures. This paper explores the possibility of extending LRR for subspace data on Grassmann manifolds. Rather than directly embedding the Grassmann manifolds into the symmetric matrix space, an extrinsic view is taken to build the LRR self-representation in the local area of the tangent space at each Grassmannian point, resulting in a localized LRR method on Grassmann manifolds. A novel algorithm for solving the proposed model is investigated and implemented. The performance of the new clustering algorithm is assessed through experiments on several real-world datasets including MNIST handwritten digits, ballet video clips, SKIG action clips, the DynTex++ dataset and highway traffic video clips. The experimental results show the new method outperforms a number of state-of-the-art clustering methods.
Tasks
Published	2017-05-17
URL	http://arxiv.org/abs/1705.06599v1
PDF	http://arxiv.org/pdf/1705.06599v1.pdf
PWC	https://paperswithcode.com/paper/localized-lrr-on-grassmann-manifolds-an
Repo
Framework

Fast Regression with an $\ell_\infty$ Guarantee


Title	Fast Regression with an $\ell_\infty$ Guarantee
Authors	Eric Price, Zhao Song, David P. Woodruff
Abstract	Sketching has emerged as a powerful technique for speeding up problems in numerical linear algebra, such as regression. In the overconstrained regression problem, one is given an $n \times d$ matrix $A$, with $n \gg d$, as well as an $n \times 1$ vector $b$, and one wants to find a vector $\hat{x}$ so as to minimize the residual error $\|Ax-b\|_2$. Using the sketch-and-solve paradigm, one first computes $S \cdot A$ and $S \cdot b$ for a randomly chosen matrix $S$, then outputs $x' = (SA)^{\dagger} Sb$ so as to minimize $\|SAx' - Sb\|_2$. The sketch-and-solve paradigm gives a bound on $\|x'-x^*\|_2$ when $A$ is well-conditioned. Our main result is that, when $S$ is the subsampled randomized Fourier/Hadamard transform, the error $x' - x^*$ behaves as if it lies in a “random” direction within this bound: for any fixed direction $a\in \mathbb{R}^d$, we have with $1 - d^{-c}$ probability that \[ \langle a, x'-x^*\rangle \lesssim \frac{\|a\|_2\|x'-x^*\|_2}{d^{\frac{1}{2}-\gamma}}, \quad (1) \] where $c, \gamma > 0$ are arbitrary constants. This implies $\|x'-x^*\|_{\infty}$ is a factor $d^{\frac{1}{2}-\gamma}$ smaller than $\|x'-x^*\|_2$. It also gives a better bound on the generalization of $x'$ to new examples: if rows of $A$ correspond to examples and columns to features, then our result gives a better bound for the error introduced by sketch-and-solve when classifying fresh examples. We show that not all oblivious subspace embeddings $S$ satisfy these properties. In particular, we give counterexamples showing that matrices based on Count-Sketch or leverage score sampling do not satisfy these properties. We also provide lower bounds, both on how small $\|x'-x^*\|_2$ can be, and for our new guarantee (1), showing that the subsampled randomized Fourier/Hadamard transform is nearly optimal.
Tasks
Published	2017-05-30
URL	http://arxiv.org/abs/1705.10723v1
PDF	http://arxiv.org/pdf/1705.10723v1.pdf
PWC	https://paperswithcode.com/paper/fast-regression-with-an-ell_infty-guarantee
Repo
Framework
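
To make the sketch-and-solve paradigm in the abstract above concrete, here is a small NumPy sketch using a subsampled randomized Hadamard transform. It is an illustration under assumptions, not the authors' code: the helper names, the problem sizes, the noise level, and the choice to sample rows without replacement are all made up for this example, and the dense Hadamard matrix is used only for readability (a fast transform would be used in practice).

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def srht_sketch(A, b, m, rng):
    """Apply S = sqrt(n/m) * P * H * D to A and b (hypothetical helper).

    D flips signs at random, H is the orthonormal Hadamard matrix,
    and P keeps m rows chosen uniformly without replacement.
    """
    n = A.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    H = hadamard(n) / np.sqrt(n)
    rows = rng.choice(n, size=m, replace=False)
    scale = np.sqrt(n / m)
    SA = scale * (H @ (signs[:, None] * A))[rows]
    Sb = scale * (H @ (signs * b))[rows]
    return SA, Sb

rng = np.random.default_rng(0)
n, d, m = 1024, 5, 200
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Exact least-squares solution x* and the sketched solution x'.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
SA, Sb = srht_sketch(A, b, m, rng)
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)

err_l2 = np.linalg.norm(x_sketch - x_star)      # ||x' - x*||_2
err_inf = np.max(np.abs(x_sketch - x_star))     # ||x' - x*||_inf
```

Comparing `err_inf` with `err_l2` over several random seeds gives a feel for the kind of coordinate-wise guarantee the abstract discusses, although a single run proves nothing.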

Shallow reading with Deep Learning: Predicting popularity of online content using only its title


Title	Shallow reading with Deep Learning: Predicting popularity of online content using only its title
Authors	Wojciech Stokowiec, Tomasz Trzcinski, Krzysztof Wolk, Krzysztof Marasek, Przemyslaw Rokita
Abstract	With the ever-decreasing attention span of contemporary Internet users, the title of online content (such as a news article or video) can be a major factor in determining its popularity. To take advantage of this phenomenon, we propose a new method based on a bidirectional Long Short-Term Memory (LSTM) neural network designed to predict the popularity of online content using only its title. We evaluate the proposed architecture on two distinct datasets of news articles and news videos distributed in social media that contain over 40,000 samples in total. On those datasets, our approach improves the performance over traditional shallow approaches by a margin of 15%. Additionally, we show that using pre-trained word vectors in the embedding layer improves the results of LSTM models, especially when the training set is small. To our knowledge, this is the first attempt at popularity prediction using only textual information from the title.
Tasks
Published	2017-07-21
URL	http://arxiv.org/abs/1707.06806v1
PDF	http://arxiv.org/pdf/1707.06806v1.pdf
PWC	https://paperswithcode.com/paper/shallow-reading-with-deep-learning-predicting
Repo
Framework

Pulsar Candidate Identification with Artificial Intelligence Techniques


Title	Pulsar Candidate Identification with Artificial Intelligence Techniques
Authors	Ping Guo, Fuqing Duan, Pei Wang, Yao Yao, Qian Yin, Xin Xin
Abstract	Discovering pulsars is a significant and meaningful research topic in the field of radio astronomy. With the advent of astronomical instruments such as the Five-hundred-meter Aperture Spherical Telescope (FAST) in China, data volumes and data rates are growing exponentially. This necessitates a focus on artificial intelligence (AI) technologies that can perform automatic pulsar candidate identification to mine large astronomical data sets. Automatic pulsar candidate identification can be considered as the task of determining potential candidates for further investigation and eliminating noise from radio frequency interference or other non-pulsar signals. It is very hard to raise the performance of DCNN-based pulsar identification, because the limited number of training samples prevents the network from being designed deep enough to learn good features, and because of the crucial class imbalance problem caused by the very limited number of real pulsar samples. To address these problems, we propose a framework which combines a deep convolutional generative adversarial network (DCGAN) with a support vector machine (SVM) to deal with the class imbalance problem and to improve pulsar identification accuracy. DCGAN is used as the sample generation and feature learning model, and SVM is adopted as the classifier for predicting candidates’ labels in the inference stage. The proposed framework is a novel technique which not only can solve the class imbalance problem but also can learn discriminative feature representations of pulsar candidates instead of computing hand-crafted features in preprocessing steps, which makes it more accurate for automatic pulsar candidate selection. Experiments on two pulsar datasets verify the effectiveness and efficiency of our proposed method.
Tasks
Published	2017-11-27
URL	https://arxiv.org/abs/1711.10339v2
PDF	https://arxiv.org/pdf/1711.10339v2.pdf
PWC	https://paperswithcode.com/paper/pulsar-candidate-identification-with
Repo
Framework

Reasoning with shapes: profiting cognitive susceptibilities to infer linear mapping transformations between shapes


Title	Reasoning with shapes: profiting cognitive susceptibilities to infer linear mapping transformations between shapes
Authors	Vahid Jalili
Abstract	Visual information plays an indispensable role in our daily interactions with the environment. Such information is manipulated for a wide range of purposes, spanning from basic object and material perception to complex gesture interpretation. There have been novel studies in cognitive science aimed at an in-depth understanding of visual information manipulation, which help answer questions such as: how do we infer 2D/3D motion from a sequence of 2D images? How do we understand motion from a single image frame? How do we see the forest without the trees? Leveraging congruence, determining the linear mapping transformation between a set of shapes facilitates motion perception. The present study methodizes recent discoveries about the human cognitive ability for scene understanding. The proposed method processes images hierarchically, that is, as an iterative analysis of scene abstractions using a rapidly converging heuristic iterative method. The method hierarchically abstracts images; the abstractions are represented in a polar coordinate system, and any two consecutive abstractions have an incremental level of detail. The method then creates a graph of approximated linear mapping transformations based on circular shift permutations of the hierarchical abstractions. The graph is then traversed in best-first fashion to find the best linear mapping transformation. The accuracy of the proposed method is assessed using normal, noisy, and deformed images. Additionally, the present study deduces (i) the possibility of determining optimal linear mapping transformations in a number of iterations logarithmic in the precision of the results, and (ii) that the computational cost is independent of the resolution of the input shapes.
Tasks	Scene Understanding
Published	2017-09-01
URL	http://arxiv.org/abs/1709.00158v1
PDF	http://arxiv.org/pdf/1709.00158v1.pdf
PWC	https://paperswithcode.com/paper/reasoning-with-shapes-profiting-cognitive
Repo
Framework

Geometrical Insights for Implicit Generative Modeling


Title	Geometrical Insights for Implicit Generative Modeling
Authors	Leon Bottou, Martin Arjovsky, David Lopez-Paz, Maxime Oquab
Abstract	Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion. A careful look at the geometries induced by these distances on the space of probability measures reveals interesting differences. In particular, we can establish surprising approximate global convergence guarantees for the $1$-Wasserstein distance, even when the parametric generator has a nonconvex parametrization.
Tasks
Published	2017-12-21
URL	https://arxiv.org/abs/1712.07822v3
PDF	https://arxiv.org/pdf/1712.07822v3.pdf
PWC	https://paperswithcode.com/paper/geometrical-insights-for-implicit-generative
Repo
Framework
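
Two of the criteria named in the abstract above can be computed by hand for small one-dimensional samples. The sketch below is only an illustration under assumptions: it treats both samples as equal-size empirical distributions on the real line, uses the plug-in (V-statistic) form of the squared energy distance, and all function names and toy data are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical distributions.

    In one dimension the optimal coupling matches sorted samples in order.
    """
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def mean_abs_diff(xs, ys):
    """Average of |x - y| over all pairs, an estimate of E|X - Y|."""
    return sum(abs(a - b) for a in xs for b in ys) / (len(xs) * len(ys))

def energy_distance_sq(xs, ys):
    """Squared energy distance, plug-in form: 2 E|X-Y| - E|X-X'| - E|Y-Y'|."""
    return 2 * mean_abs_diff(xs, ys) - mean_abs_diff(xs, xs) - mean_abs_diff(ys, ys)

data = [0.0, 1.0, 2.0, 3.0]
model = [x + 0.5 for x in data]   # the "model" is the data shifted by 0.5

w1 = wasserstein_1d(data, model)       # every matched pair differs by 0.5
e2 = energy_distance_sq(data, model)   # positive because the samples differ
```

Both quantities are zero when the two samples coincide, but they respond differently to a shift, which is a small instance of the differing geometries the abstract mentions.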

Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions


Title	Toward Real-Time Decentralized Reinforcement Learning using Finite Support Basis Functions
Authors	Kenzo Lobos-Tsunekawa, David L. Leottau, Javier Ruiz-del-Solar
Abstract	This paper addresses the design and implementation of complex Reinforcement Learning (RL) behaviors where multi-dimensional action spaces are involved, as well as the need to execute the behaviors in real time using robotic platforms with limited computational resources and training times. For this purpose, we propose the use of decentralized RL, in combination with finite support basis functions as alternatives to Gaussian RBF, in order to alleviate the effects of the curse of dimensionality on the action and state spaces respectively, and to reduce the computation time. As a testbed, an RL-based controller for the in-walk kick in NAO robots, a challenging and critical problem for soccer robotics, is used. The reported experiments show empirically that our solution saves up to 99.94% of execution time and 98.82% of memory consumption during execution, without diminishing performance compared to classical approaches.
Tasks
Published	2017-06-20
URL	http://arxiv.org/abs/1706.06695v1
PDF	http://arxiv.org/pdf/1706.06695v1.pdf
PWC	https://paperswithcode.com/paper/toward-real-time-decentralized-reinforcement
Repo
Framework
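
The abstract above contrasts finite support basis functions with Gaussian RBFs but does not say which finite-support family is used. As a hedged illustration, the sketch below assumes a triangular (hat) basis on a one-dimensional grid. The function names, the grid of centers, and the width are all hypothetical.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian RBF: strictly positive everywhere, so every center contributes."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def triangular_basis(x, center, width):
    """Hat function: nonzero only on (center - width, center + width)."""
    return max(0.0, 1.0 - abs(x - center) / width)

centers = [i * 0.1 for i in range(11)]  # 11 evenly spaced centers on [0, 1]
x = 0.42                                # query state (arbitrary)

# Centers whose basis function is nonzero at x, and therefore must be evaluated.
gauss_active = [c for c in centers if gaussian_rbf(x, c, 0.1) > 0.0]
tri_active = [c for c in centers if triangular_basis(x, c, 0.1) > 0.0]

# With spacing equal to the width, the hat functions sum to 1 at any x in [0, 1].
tri_total = sum(triangular_basis(x, c, 0.1) for c in centers)
```

Here all 11 Gaussian features are nonzero at `x`, while only the two neighboring hat functions are, which is the kind of saving in evaluated features that a finite-support basis allows.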

Alpha-Divergences in Variational Dropout


Title	Alpha-Divergences in Variational Dropout
Authors	Bogdan Mazoure, Riashat Islam
Abstract	We investigate the use of alternative divergences to Kullback-Leibler (KL) in variational inference (VI), based on Variational Dropout \cite{kingma2015}. Stochastic gradient variational Bayes (SGVB) \cite{aevb} is a general framework for estimating the evidence lower bound (ELBO) in Variational Bayes. In this work, we extend the SGVB estimator using Alpha-Divergences, which are alternatives to VI’s KL objective. Gaussian dropout can be seen as a local reparametrization trick of the SGVB objective. We extend Variational Dropout to use alpha divergences for variational inference. Our results compare $\alpha$-divergence variational dropout with standard variational dropout with correlated and uncorrelated weight noise. We show that the $\alpha$-divergence with $\alpha \rightarrow 1$ (or KL divergence) is still a good measure for use in variational inference, in spite of the efficient use of Alpha-divergences for Dropout VI \cite{Li17}. $\alpha \rightarrow 1$ can yield the lowest training error, and optimizes a good lower bound for the evidence lower bound (ELBO) among all values of the parameter $\alpha \in [0,\infty)$.
Tasks
Published	2017-11-12
URL	http://arxiv.org/abs/1711.04345v1
PDF	http://arxiv.org/pdf/1711.04345v1.pdf
PWC	https://paperswithcode.com/paper/alpha-divergences-in-variational-dropout
Repo
Framework
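
The abstract above relies on the fact that an $\alpha$-divergence recovers the KL divergence as $\alpha \rightarrow 1$. The abstract does not state which parameterization the authors use, so the sketch below assumes Rényi's definition for discrete distributions, $D_\alpha(p \| q) = \frac{1}{\alpha - 1} \log \sum_i p_i^\alpha q_i^{1-\alpha}$. The function names and the example distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def renyi_divergence(p, q, alpha):
    """Renyi alpha-divergence (assumed parameterization); equals KL at alpha = 1."""
    if abs(alpha - 1.0) < 1e-12:
        return kl_divergence(p, q)
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1.0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl = kl_divergence(p, q)
near_one = renyi_divergence(p, q, 1.0 + 1e-6)  # numerically close to kl
half = renyi_divergence(p, q, 0.5)
two = renyi_divergence(p, q, 2.0)
```

Under this definition the divergence is nondecreasing in $\alpha$, so `half <= kl <= two` for the example above.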

Enhanced Biologically Inspired Model for Image Recognition Based on a Novel Patch Selection Method with Moment


Title	Enhanced Biologically Inspired Model for Image Recognition Based on a Novel Patch Selection Method with Moment
Authors	Yan-Feng Lu, Li-Hao Jia, Hong Qiao, Yi Li
Abstract	The biologically inspired model (BIM) for image recognition is a robust computational architecture which has attracted widespread attention. BIM can be described as a four-layer structure based on the mechanisms of the visual cortex. Although the performance of BIM for image recognition is robust, it selects patches randomly, which is blind and results in a heavy computing burden. To address this issue, we propose a novel patch selection method with oriented Gaussian-Hermite moment (PSGHM), and we enhance the BIM with the proposed PSGHM; the resulting model is named PBIM. In contrast to the conventional BIM, which randomly selects patches within the feature representation layers processed by multi-scale Gabor filter banks, the proposed PBIM uses PSGHM to extract a small number of representation features while offering promising distinctiveness. To show the effectiveness of the proposed PBIM, experimental studies on object categorization are conducted on the CalTech05, TU Darmstadt (TUD), and GRAZ01 databases. Experimental results demonstrate that PBIM significantly outperforms the conventional BIM.
Tasks
Published	2017-10-27
URL	http://arxiv.org/abs/1710.10188v1
PDF	http://arxiv.org/pdf/1710.10188v1.pdf
PWC	https://paperswithcode.com/paper/enhanced-biologically-inspired-model-for
Repo
Framework

Simplified End-to-End MMI Training and Voting for ASR


Title	Simplified End-to-End MMI Training and Voting for ASR
Authors	Lior Fritz, David Burshtein
Abstract	A simplified speech recognition system that uses the maximum mutual information (MMI) criterion is considered. End-to-end training using gradient descent is suggested, similar to the training of connectionist temporal classification (CTC). We use an MMI criterion with a simple language model in the training stage, and a standard HMM decoder. Our method compares favorably to CTC in terms of performance, robustness, decoding time, disk footprint and quality of alignments. The good alignments enable the use of a straightforward ensemble method, obtained by simply averaging the predictions of several neural network models that were trained separately end-to-end. The ensemble method yields a considerable reduction in the word error rate.
Tasks	Language Modelling, Speech Recognition
Published	2017-03-30
URL	http://arxiv.org/abs/1703.10356v2
PDF	http://arxiv.org/pdf/1703.10356v2.pdf
PWC	https://paperswithcode.com/paper/simplified-end-to-end-mmi-training-and-voting
Repo
Framework

Reinforcement Learning for Transition-Based Mention Detection


Title	Reinforcement Learning for Transition-Based Mention Detection
Authors	Georgiana Dinu, Wael Hamza, Radu Florian
Abstract	This paper describes an application of reinforcement learning to the mention detection task. We define a novel action-based formulation for the mention detection task, in which a model can flexibly revise past labeling decisions by grouping together tokens and assigning partial mention labels. We devise a method to create mention-level episodes and we train a model by rewarding correctly labeled complete mentions, irrespective of the inner structure created. The model yields results which are on par with a competitive supervised counterpart while being more flexible in terms of achieving targeted behavior through reward modeling and generating internal mention structure, especially on longer mentions.
Tasks
Published	2017-03-13
URL	http://arxiv.org/abs/1703.04489v1
PDF	http://arxiv.org/pdf/1703.04489v1.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-for-transition-based
Repo
Framework