May 5, 2019


Paper Group ANR 556

Re-identification of Humans in Crowds using Personal, Social and Environmental Constraints. Position paper: Towards an observer-oriented theory of shape comparison. A statistical model of tristimulus measurements within and between OLED displays. Geometry of Polysemy. Kannada Spell Checker with Sandhi Splitter. Density-based Denoising of Point Clou …


Title	Re-identification of Humans in Crowds using Personal, Social and Environmental Constraints
Authors	Shayan Modiri Assari, Haroon Idrees, Mubarak Shah
Abstract	This paper addresses the problem of human re-identification across non-overlapping cameras in crowds. Re-identification in crowded scenes is a challenging problem due to the large number of people and frequent occlusions, coupled with changes in their appearance due to different properties and exposure of cameras. To solve this problem, we model multiple Personal, Social and Environmental (PSE) constraints on human motion across cameras. The personal constraints include appearance and preferred speed of each individual, assumed to be similar across the non-overlapping cameras. The social influences (constraints) are quadratic in nature, i.e. occur between pairs of individuals, and are modeled through grouping and collision avoidance. Finally, the environmental constraints capture the transition probabilities between gates (entrances / exits) in different cameras, defined as multi-modal distributions of transition time and destination between all pairs of gates. We incorporate these constraints into an energy minimization framework for solving human re-identification. Assigning $1-1$ correspondence while modeling PSE constraints is NP-hard. We present a stochastic local search algorithm to restrict the search space of hypotheses, and obtain a $1-1$ solution in the presence of linear and quadratic PSE constraints. Moreover, we present an alternate optimization using the Frank-Wolfe algorithm that solves the convex approximation of the objective function with linear relaxation on binary variables, and yields an order of magnitude speed-up over stochastic local search with a minor drop in performance. We evaluate our approach using Cumulative Matching Curves as well as $1-1$ assignment on several thousand frames of the Grand Central, PRID and DukeMTMC datasets, and obtain significantly better results compared to existing re-identification methods.
Tasks	Person Re-Identification
Published	2016-12-07
URL	http://arxiv.org/abs/1612.02155v1
PDF	http://arxiv.org/pdf/1612.02155v1.pdf
PWC	https://paperswithcode.com/paper/re-identification-of-humans-in-crowds-using
Repo
Framework

Position paper: Towards an observer-oriented theory of shape comparison


Title	Position paper: Towards an observer-oriented theory of shape comparison
Authors	Patrizio Frosini
Abstract	In this position paper we suggest a possible metric approach to shape comparison that is based on a mathematical formalization of the concept of observer, seen as a collection of suitable operators acting on a metric space of functions. These functions represent the set of data that are accessible to the observer, while the operators describe the way the observer elaborates the data and enclose the invariance that he/she associates with them. We expose this model and illustrate some theoretical reasons that justify its possible use for shape comparison.
Tasks
Published	2016-03-07
URL	http://arxiv.org/abs/1603.02008v1
PDF	http://arxiv.org/pdf/1603.02008v1.pdf
PWC	https://paperswithcode.com/paper/position-paper-towards-an-observer-oriented
Repo
Framework

A statistical model of tristimulus measurements within and between OLED displays


Title	A statistical model of tristimulus measurements within and between OLED displays
Authors	Matti Raitoharju, Samu Kallio, Matti Pellikka
Abstract	We present an empirical model for noise in color measurements from OLED displays. According to the measured data, the noise is not isotropic in the XYZ space; instead, most of the noise lies along an axis that is parallel to the vector from the origin to the measured XYZ vector. The presented empirical model is simple and depends only on the measured XYZ values. Our tests show that the variations between multiple panels of the same type have a similar distribution to the temporal noise in measurements from a single panel, but a larger magnitude.
Tasks
Published	2016-08-28
URL	http://arxiv.org/abs/1608.08596v2
PDF	http://arxiv.org/pdf/1608.08596v2.pdf
PWC	https://paperswithcode.com/paper/a-statistical-model-of-tristimulus
Repo
Framework
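
The anisotropy described in the abstract above can be illustrated by splitting a measurement deviation into a component along the origin-to-XYZ direction and a component perpendicular to it. This is only a geometric sketch under stated assumptions, not the authors' statistical model: the function name `split_noise` and the example values are hypothetical.

```python
import math


def split_noise(xyz_mean, xyz_sample):
    """Split (sample - mean) into parts parallel and perpendicular to the mean XYZ vector.

    Hypothetical helper for illustration; assumes xyz_mean is not the zero vector.
    """
    norm = math.sqrt(sum(c * c for c in xyz_mean))
    unit = [c / norm for c in xyz_mean]  # direction from the origin to the mean XYZ
    delta = [s - m for s, m in zip(xyz_sample, xyz_mean)]
    along = sum(d * u for d, u in zip(delta, unit))  # signed length along that axis
    parallel = [along * u for u in unit]
    perpendicular = [d - p for d, p in zip(delta, parallel)]
    return parallel, perpendicular


# Made-up values: most of the deviation lies along the (3, 4, 0) direction.
parallel, perpendicular = split_noise([3.0, 4.0, 0.0], [3.3, 4.4, 0.1])
```

If the noise behaves as the abstract reports, the parallel component would typically dominate the perpendicular one across repeated measurements.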

Geometry of Polysemy


Title	Geometry of Polysemy
Authors	Jiaqi Mu, Suma Bhat, Pramod Viswanath
Abstract	Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec. However, a single vector does not suffice to model the polysemous nature of many (frequent) words, i.e., words with multiple meanings. In this paper, we propose a three-fold approach for unsupervised polysemy modeling: (a) context representations, (b) sense induction and disambiguation and (c) lexeme (as a word and sense pair) representations. A key feature of our work is the finding that a sentence containing a target word is well represented by a low rank subspace, instead of a point in a vector space. We then show that the subspaces associated with a particular sense of the target word tend to intersect over a line (one-dimensional subspace), which we use to disambiguate senses using a clustering algorithm that harnesses the Grassmannian geometry of the representations. The disambiguation algorithm, which we call $K$-Grassmeans, leads to a procedure to label the different senses of the target word in the corpus – yielding lexeme vector representations, all in an unsupervised manner starting from a large (Wikipedia) corpus in English. Apart from several prototypical target (word,sense) examples and a host of empirical studies to intuit and justify the various geometric representations, we validate our algorithms on standard sense induction and disambiguation datasets and present new state-of-the-art results.
Tasks
Published	2016-10-24
URL	http://arxiv.org/abs/1610.07569v1
PDF	http://arxiv.org/pdf/1610.07569v1.pdf
PWC	https://paperswithcode.com/paper/geometry-of-polysemy
Repo
Framework

Kannada Spell Checker with Sandhi Splitter


Title	Kannada Spell Checker with Sandhi Splitter
Authors	A N Akshatha, Chandana G Upadhyaya, Rajashekara S Murthy
Abstract	Spelling errors are introduced in text either during typing, or when the user does not know the correct phoneme or grapheme. If a language contains complex words like sandhi, where two or more morphemes join based on some rules, spell checking becomes very tedious. In such situations, a spell checker with a sandhi splitter, which alerts the user by flagging the errors and providing suggestions, is very useful. A novel algorithm for sandhi splitting is proposed in this paper. The sandhi splitter can split about 7000 of the most common sandhi words in the Kannada language, which were used as test samples. The sandhi splitter was integrated with a Kannada spell checker and a mechanism for generating suggestions was added. A comprehensive, platform-independent, standalone spell checker with sandhi splitter application was thus developed and tested extensively for its efficiency and correctness. A comparative analysis of this spell checker with sandhi splitter was made, and the results showed that the Kannada spell checker with sandhi splitter has improved performance. It is twice as fast, 200 times more space efficient, and 90% accurate for complex nouns and 50% accurate for complex verbs. Such a spell checker with sandhi splitter will be of foremost significance in machine translation systems, voice processing, etc. This is the first sandhi splitter in Kannada, and the advantage of the novel algorithm is that it can be extended to all Indian languages.
Tasks	Machine Translation
Published	2016-11-25
URL	http://arxiv.org/abs/1611.08358v1
PDF	http://arxiv.org/pdf/1611.08358v1.pdf
PWC	https://paperswithcode.com/paper/kannada-spell-checker-with-sandhi-splitter
Repo
Framework

Density-based Denoising of Point Cloud


Title	Density-based Denoising of Point Cloud
Authors	Faisal Zaman, Ya Ping Wong, Boon Yian Ng
Abstract	Point cloud source data for surface reconstruction is usually contaminated with noise and outliers. To overcome this deficiency, a density-based point cloud denoising method is presented to remove outliers and noisy points. First, a particle swarm optimization technique is employed to automatically approximate the optimal bandwidth of multivariate kernel density estimation, to ensure the robust performance of density estimation. Then, a mean-shift based clustering technique is used to remove outliers through a thresholding scheme. After removing outliers from the point cloud, bilateral mesh filtering is applied to smooth the remaining points. The experimental results show that this approach, comparably, is robust and efficient.
Tasks	Denoising, Density Estimation
Published	2016-02-17
URL	http://arxiv.org/abs/1602.05312v1
PDF	http://arxiv.org/pdf/1602.05312v1.pdf
PWC	https://paperswithcode.com/paper/density-based-denoising-of-point-cloud
Repo
Framework
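
The outlier-removal idea in the abstract above can be sketched as a kernel density estimate followed by a density threshold. This is a simplified stand-in, not the authors' pipeline: it assumes a fixed, hand-picked bandwidth instead of particle swarm optimization, thresholds the density directly instead of using mean-shift clustering, and omits the bilateral smoothing step. The names `kde_density`, `remove_outliers`, and `min_density` are hypothetical.

```python
import math


def kde_density(points, bandwidth):
    """Gaussian kernel density at each point, up to a constant normalization factor."""
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        densities.append(total / len(points))
    return densities


def remove_outliers(points, bandwidth, min_density):
    """Keep only the points whose estimated density reaches min_density."""
    densities = kde_density(points, bandwidth)
    return [p for p, d in zip(points, densities) if d >= min_density]


# Four clustered points and one isolated point (made-up data).
cloud = [
    (0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (0.0, 0.1, 0.0),
    (0.1, 0.1, 0.0),
    (5.0, 5.0, 5.0),
]
cleaned = remove_outliers(cloud, bandwidth=0.2, min_density=0.5)
```

The double loop is quadratic in the number of points, so a real point cloud would need a spatial index or a library KDE; the sketch only shows the thresholding logic.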

3-D Hand Pose Estimation from Kinect’s Point Cloud Using Appearance Matching


Title	3-D Hand Pose Estimation from Kinect’s Point Cloud Using Appearance Matching
Authors	Pasquale Coscia, Francesco A. N. Palmieri, Francesco Castaldo, Alberto Cavallo
Abstract	We present a novel appearance-based approach for pose estimation of a human hand using the point clouds provided by the low-cost Microsoft Kinect sensor. Both the free-hand case, in which the hand is isolated from the surrounding environment, and the hand-object case, in which the different types of interactions are classified, have been considered. The hand-object case is clearly the most challenging task having to deal with multiple tracks. The approach proposed here belongs to the class of partial pose estimation where the estimated pose in a frame is used for the initialization of the next one. The pose estimation is obtained by applying a modified version of the Iterative Closest Point (ICP) algorithm to synthetic models to obtain the rigid transformation that aligns each model with respect to the input data. The proposed framework uses a “pure” point cloud as provided by the Kinect sensor without any other information such as RGB values or normal vector components. For this reason, the proposed method can also be applied to data obtained from other types of depth sensor, or RGB-D camera.
Tasks	Hand Pose Estimation, Pose Estimation
Published	2016-04-07
URL	http://arxiv.org/abs/1604.02032v1
PDF	http://arxiv.org/pdf/1604.02032v1.pdf
PWC	https://paperswithcode.com/paper/3-d-hand-pose-estimation-from-kinects-point
Repo
Framework

Algorithms for Generalized Cluster-wise Linear Regression


Title	Algorithms for Generalized Cluster-wise Linear Regression
Authors	Young Woong Park, Yan Jiang, Diego Klabjan, Loren Williams
Abstract	Cluster-wise linear regression (CLR), a clustering problem intertwined with regression, is to find clusters of entities such that the overall sum of squared errors from regressions performed over these clusters is minimized, where each cluster may have different variances. We generalize the CLR problem by allowing each entity to have more than one observation, and refer to it as generalized CLR. We propose an exact mathematical programming based approach relying on column generation, a column generation based heuristic algorithm that clusters predefined groups of entities, a metaheuristic genetic algorithm with adapted Lloyd’s algorithm for K-means clustering, a two-stage approach, and a modified algorithm of Späth \cite{Spath1979} for solving generalized CLR. We examine the performance of our algorithms on a stock keeping unit (SKU) clustering problem employed in forecasting halo and cannibalization effects in promotions using real-world retail data from a large supermarket chain. In the SKU clustering problem, the retailer needs to cluster SKUs based on their seasonal effects in response to promotions. The seasonal effects are the results of regressions with predictors being promotion mechanisms and seasonal dummies performed over clusters generated. We compare the performance of all proposed algorithms for the SKU problem with real-world and synthetic data.
Tasks
Published	2016-07-05
URL	http://arxiv.org/abs/1607.01417v2
PDF	http://arxiv.org/pdf/1607.01417v2.pdf
PWC	https://paperswithcode.com/paper/algorithms-for-generalized-cluster-wise
Repo
Framework
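
To make the generalized CLR objective above concrete, here is a minimal sketch of a Lloyd-style alternating heuristic: fit one regression per cluster, then move each entity (with all of its observations) to the cluster whose line gives it the lowest sum of squared errors. It is an illustrative assumption, not any of the algorithms proposed in the paper, and it uses a single predictor for brevity. All function names are hypothetical.

```python
def fit_line(obs):
    """Ordinary least squares fit y = slope * x + intercept for (x, y) pairs."""
    n = len(obs)
    mean_x = sum(x for x, _ in obs) / n
    mean_y = sum(y for _, y in obs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in obs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in obs)
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def sse(obs, line):
    """Sum of squared errors of the observations under the given line."""
    slope, intercept = line
    return sum((y - (slope * x + intercept)) ** 2 for x, y in obs)


def cluster_wise_regression(entities, labels, max_iter=20):
    """entities: one list of (x, y) observations per entity; labels: initial clusters."""
    k = max(labels) + 1
    lines = []
    for _ in range(max_iter):
        # Refit one regression per cluster on the pooled observations.
        lines = []
        for c in range(k):
            pooled = [o for e, l in zip(entities, labels) if l == c for o in e]
            lines.append(fit_line(pooled) if pooled else (0.0, 0.0))
        # Reassign each entity as a whole to its best-fitting cluster.
        new_labels = [min(range(k), key=lambda c: sse(e, lines[c])) for e in entities]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, lines


# Made-up data: two entities follow y = 2x, two follow y = 10 - x.
entities = [
    [(0, 0), (1, 2), (2, 4)],
    [(0, 0), (1, 2), (3, 6)],
    [(0, 10), (1, 9), (2, 8)],
    [(1, 9), (3, 7), (4, 6)],
]
labels, lines = cluster_wise_regression(entities, labels=[0, 1, 1, 1])
```

Like K-means, this kind of alternation only reaches a local optimum and depends on the initial labels, which is one reason exact and metaheuristic approaches are of interest.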

Learning to Navigate the Energy Landscape


Title	Learning to Navigate the Energy Landscape
Authors	Julien Valentin, Angela Dai, Matthias Nießner, Pushmeet Kohli, Philip Torr, Shahram Izadi, Cem Keskin
Abstract	In this paper, we present a novel and efficient architecture for addressing computer vision problems that use ‘Analysis by Synthesis’. Analysis by synthesis involves the minimization of the reconstruction error which is typically a non-convex function of the latent target variables. State-of-the-art methods adopt a hybrid scheme where discriminatively trained predictors like Random Forests or Convolutional Neural Networks are used to initialize local search algorithms. While these methods have been shown to produce promising results, they often get stuck in local optima. Our method goes beyond the conventional hybrid architecture by not only proposing multiple accurate initial solutions but by also defining a navigational structure over the solution space that can be used for extremely efficient gradient-free local search. We demonstrate the efficacy of our approach on the challenging problem of RGB Camera Relocalization. To make the RGB camera relocalization problem particularly challenging, we introduce a new dataset of 3D environments which are significantly larger than those found in other publicly-available datasets. Our experiments reveal that the proposed method is able to achieve state-of-the-art camera relocalization results. We also demonstrate the generalizability of our approach on Hand Pose Estimation and Image Retrieval tasks.
Tasks	Camera Relocalization, Hand Pose Estimation, Image Retrieval, Pose Estimation
Published	2016-03-18
URL	http://arxiv.org/abs/1603.05772v1
PDF	http://arxiv.org/pdf/1603.05772v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-navigate-the-energy-landscape
Repo
Framework

Streaming Recommender Systems


Title	Streaming Recommender Systems
Authors	Shiyu Chang, Yang Zhang, Jiliang Tang, Dawei Yin, Yi Chang, Mark A. Hasegawa-Johnson, Thomas S. Huang
Abstract	The increasing popularity of real-world recommender systems produces data continuously and rapidly, and it becomes more realistic to study recommender systems under streaming scenarios. Data streams present distinct properties, such as being temporally ordered, continuous and high-velocity, which pose tremendous challenges to traditional recommender systems. In this paper, we investigate the problem of recommendation with stream inputs. In particular, we provide a principled framework termed sRec, which provides explicit continuous-time random process models of the creation of users and topics, and of the evolution of their interests. A variational Bayesian approach called recursive mean-field approximation is proposed, which permits computationally efficient instantaneous on-line inference. Experimental results on several real-world datasets demonstrate the advantages of our sRec over other state-of-the-art methods.
Tasks	Recommendation Systems
Published	2016-07-21
URL	http://arxiv.org/abs/1607.06182v1
PDF	http://arxiv.org/pdf/1607.06182v1.pdf
PWC	https://paperswithcode.com/paper/streaming-recommender-systems
Repo
Framework

Fully DNN-based Multi-label regression for audio tagging


Title	Fully DNN-based Multi-label regression for audio tagging
Authors	Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley
Abstract	Acoustic event detection for content analysis in most cases relies on lots of labeled data. However, manually annotating data is a time-consuming task, which thus makes few annotated resources available so far. Unlike audio event detection, automatic audio tagging, a multi-label acoustic event classification task, only relies on weakly labeled data. This is highly desirable to some practical applications using audio analysis. In this paper we propose to use a fully deep neural network (DNN) framework to handle the multi-label classification task in a regression way. Considering that only chunk-level rather than frame-level labels are available, the whole or almost whole frames of the chunk were fed into the DNN to perform a multi-label regression for the expected tags. The fully DNN, which is regarded as an encoding function, can well map the audio features sequence to a multi-tag vector. A deep pyramid structure was also designed to extract more robust high-level features related to the target tags. Further improved methods were adopted, such as the Dropout and background noise aware training, to enhance its generalization capability for new audio recordings in mismatched environments. Compared with the conventional Gaussian Mixture Model (GMM) and support vector machine (SVM) methods, the proposed fully DNN-based method could well utilize the long-term temporal information with the whole chunk as the input. The results show that our approach obtained a 15% relative improvement compared with the official GMM-based method of DCASE 2016 challenge.
Tasks	Audio Tagging, Multi-Label Classification
Published	2016-06-24
URL	http://arxiv.org/abs/1606.07695v2
PDF	http://arxiv.org/pdf/1606.07695v2.pdf
PWC	https://paperswithcode.com/paper/fully-dnn-based-multi-label-regression-for
Repo
Framework

Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods


Title	Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods
Authors	Xiaojie Jin, Xiaotong Yuan, Jiashi Feng, Shuicheng Yan
Abstract	Deep neural networks have achieved remarkable success in a wide range of practical problems. However, due to the inherent large parameter space, deep models are notoriously prone to overfitting and difficult to deploy on portable devices with limited memory. In this paper, we propose an iterative hard thresholding (IHT) approach to train Skinny Deep Neural Networks (SDNNs). An SDNN has far fewer parameters yet can achieve competitive or even better performance than its full CNN counterpart. More concretely, the IHT approach trains an SDNN through the following two alternating phases: (I) perform hard thresholding to drop connections with small activations and fine-tune the other significant filters; (II) re-activate the frozen connections and train the entire network to improve its overall discriminative capability. We verify the superiority of SDNNs in terms of efficiency and classification performance on four benchmark object recognition datasets, including CIFAR-10, CIFAR-100, MNIST and ImageNet. Experimental results clearly demonstrate that IHT can be applied for training SDNNs based on various CNN architectures such as NIN and AlexNet.
Tasks	Object Recognition
Published	2016-07-19
URL	http://arxiv.org/abs/1607.05423v1
PDF	http://arxiv.org/pdf/1607.05423v1.pdf
PWC	https://paperswithcode.com/paper/training-skinny-deep-neural-networks-with
Repo
Framework
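
The hard-thresholding step named in the abstract above can be illustrated with a generic operator that keeps the largest-magnitude entries of a vector and zeroes the rest. This is a sketch under assumptions: it ranks by weight magnitude, whereas the abstract speaks of dropping connections with small activations, and the training loop that alternates phases (I) and (II) is not shown. The name `hard_threshold` and the `keep` parameter are hypothetical.

```python
def hard_threshold(weights, keep):
    """Keep the `keep` largest-magnitude entries of `weights`; set the rest to zero."""
    if keep <= 0:
        return [0.0] * len(weights)
    # Indices sorted by descending magnitude; ties keep their original order.
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


# Made-up weights: only the two strongest connections survive.
weights = [0.9, -0.05, 0.3, -1.2, 0.01]
sparse = hard_threshold(weights, keep=2)
```

In an alternating scheme, the zeroed positions would be frozen during one phase and released again in the next, so a dropped connection can recover if later training makes it useful.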

Robust Regression For Image Binarization Under Heavy Noises and Nonuniform Background


Title	Robust Regression For Image Binarization Under Heavy Noises and Nonuniform Background
Authors	Garret Vo, Chiwoo Park
Abstract	This paper presents a robust regression approach for image binarization under significant background variations and observation noises. The work is motivated by the need to identify foreground regions in noisy microscopic images or degraded document images, where significant background variation and severe noise make image binarization challenging. The proposed method first estimates the background of an input image, subtracts the estimated background from the input image, and applies a global threshold to the subtracted outcome to obtain a binary image of the foreground. A robust regression approach was proposed to estimate the background intensity surface with minimal effects of foreground intensities and noises, and a global threshold selector was proposed on the basis of a model selection criterion in a sparse regression. The proposed approach was validated using 26 test images and the corresponding ground truths, and the outcomes of the proposed work were compared with those from nine existing image binarization methods. The approach was also combined with three state-of-the-art morphological segmentation methods to show how the proposed approach can improve their image segmentation outcomes.
Tasks	Model Selection, Semantic Segmentation
Published	2016-09-26
URL	http://arxiv.org/abs/1609.08078v4
PDF	http://arxiv.org/pdf/1609.08078v4.pdf
PWC	https://paperswithcode.com/paper/robust-regression-for-image-binarization
Repo
Framework
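
The three steps in the abstract above (estimate the background, subtract it, apply a global threshold) can be sketched on a single scan line of intensities. This is a simplified illustration, not the authors' method: it swaps in a Theil-Sen line fit as the robust regression, works in one dimension instead of fitting an intensity surface, and takes the threshold as a hand-picked parameter instead of selecting it with a model selection criterion. The names `theil_sen` and `binarize_line` are hypothetical.

```python
from statistics import median


def theil_sen(xs, ys):
    """Robust line fit: median of pairwise slopes, then median intercept."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(len(xs))
        for j in range(i + 1, len(xs))
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def binarize_line(intensities, threshold):
    """Subtract a robustly fitted linear background, then apply a global threshold."""
    xs = list(range(len(intensities)))
    slope, intercept = theil_sen(xs, intensities)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, intensities)]
    return [1 if r > threshold else 0 for r in residuals]


# Made-up scan line: a background ramp 10 + x with a bright object at x = 3 and 4.
line = [10, 11, 12, 33, 34, 15, 16, 17, 18, 19]
mask = binarize_line(line, threshold=5)
```

Because the fit uses medians, the two bright pixels barely influence the estimated background, which is the property a robust regression is meant to provide here; an ordinary least squares fit would be pulled toward the foreground.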

Chess Player by Co-Evolutionary Algorithm


Title	Chess Player by Co-Evolutionary Algorithm
Authors	Nuno Ramos, Sergio Salgado, Agostinho C Rosa
Abstract	A co-evolutionary algorithm (CA) based chess player is presented. Implementation details of the algorithms, namely coding, population, and variation operators, are described. The alpha-beta or mini-max like behaviour of the player is achieved through two competitive or cooperative populations. Special attention is given to the fitness function evaluation (the heart of the solution). Test results on algorithms vs. algorithms or human players are provided.
Tasks
Published	2016-05-21
URL	http://arxiv.org/abs/1605.06710v1
PDF	http://arxiv.org/pdf/1605.06710v1.pdf
PWC	https://paperswithcode.com/paper/chess-player-by-co-evolutionary-algorithm
Repo
Framework

Automatic labeling of molecular biomarkers of whole slide immunohistochemistry images using fully convolutional networks


Title	Automatic labeling of molecular biomarkers of whole slide immunohistochemistry images using fully convolutional networks
Authors	Fahime Sheikhzadeh, Martial Guillaud, Rabab K. Ward
Abstract	This paper addresses the problem of quantifying biomarkers in multi-stained tissues, based on color and spatial information. A deep learning based method that can automatically localize and quantify the cells expressing biomarker(s) in a whole slide image is proposed. The deep learning network is a fully convolutional network (FCN) whose input is the true RGB color image of a tissue and whose output is a map of the different biomarkers. The FCN relies on a convolutional neural network (CNN) that classifies each cell separately according to the biomarker it expresses. In this study, images of immunohistochemistry (IHC) stained slides were collected and used. More than 4,500 RGB images of cells were manually labeled based on the expressing biomarkers. The labeled cell images were used to train the CNN (obtaining an accuracy of 92% in a test set). The trained CNN is then extended to an FCN that generates a map of all biomarkers in the whole slide image acquired by the scanner (instead of classifying every cell image). To evaluate our method, we manually labeled all nuclei expressing different biomarkers in two whole slide images and used these as the ground truth. Our proposed method for immunohistochemical analysis compares well with the manual labeling by humans (average F-score of 0.96).
Tasks
Published	2016-12-30
URL	http://arxiv.org/abs/1612.09420v1
PDF	http://arxiv.org/pdf/1612.09420v1.pdf
PWC	https://paperswithcode.com/paper/automatic-labeling-of-molecular-biomarkers-of
Repo
Framework