Paper Group ANR 556
Re-identification of Humans in Crowds using Personal, Social and Environmental Constraints
Title | Re-identification of Humans in Crowds using Personal, Social and Environmental Constraints |
Authors | Shayan Modiri Assari, Haroon Idrees, Mubarak Shah |
Abstract | This paper addresses the problem of human re-identification across non-overlapping cameras in crowds. Re-identification in crowded scenes is a challenging problem due to the large number of people and frequent occlusions, coupled with changes in their appearance due to different properties and exposure of cameras. To solve this problem, we model multiple Personal, Social and Environmental (PSE) constraints on human motion across cameras. The personal constraints include appearance and preferred speed of each individual, assumed to be similar across the non-overlapping cameras. The social influences (constraints) are quadratic in nature, i.e. occur between pairs of individuals, and are modeled through grouping and collision avoidance. Finally, the environmental constraints capture the transition probabilities between gates (entrances / exits) in different cameras, defined as multi-modal distributions of transition time and destination between all pairs of gates. We incorporate these constraints into an energy minimization framework for solving human re-identification. Assigning $1-1$ correspondence while modeling PSE constraints is NP-hard. We present a stochastic local search algorithm to restrict the search space of hypotheses, and obtain a $1-1$ solution in the presence of linear and quadratic PSE constraints. Moreover, we present an alternate optimization using the Frank-Wolfe algorithm that solves the convex approximation of the objective function with linear relaxation on binary variables, and yields an order of magnitude speed up over stochastic local search with a minor drop in performance. We evaluate our approach using Cumulative Matching Curves as well as $1-1$ assignment on several thousand frames of the Grand Central, PRID and DukeMTMC datasets, and obtain significantly better results compared to existing re-identification methods. |
Tasks | Person Re-Identification |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02155v1 |
http://arxiv.org/pdf/1612.02155v1.pdf | |
PWC | https://paperswithcode.com/paper/re-identification-of-humans-in-crowds-using |
Repo | |
Framework | |
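The abstract above mentions Frank-Wolfe optimization over a linear relaxation of binary variables. As a rough illustration of the general technique only (not the paper's PSE objective), the sketch below runs Frank-Wolfe over the probability simplex on a toy convex quadratic. The function names, the `target` vector, and the step-size rule `2 / (k + 2)` are assumptions chosen for this example.

```python
def frank_wolfe_simplex(grad, x0, n_iters=2000):
    """Minimize a convex function over the probability simplex.

    grad: callable returning the gradient (list of floats) at x.
    x0:   feasible starting point (non-negative entries summing to 1).
    """
    x = list(x0)
    for k in range(n_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex with the smallest gradient coordinate.
        i_min = min(range(len(g)), key=g.__getitem__)
        gamma = 2.0 / (k + 2.0)  # standard diminishing step size
        # A convex combination of x and the vertex stays inside the simplex.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i_min] += gamma
    return x


# Toy convex objective (hypothetical): f(x) = 0.5 * ||x - target||^2
target = [0.2, 0.5, 0.3]


def objective(x):
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def gradient(x):
    return [xi - ti for xi, ti in zip(x, target)]


solution = frank_wolfe_simplex(gradient, [1.0, 0.0, 0.0])
```

Each iteration needs only a gradient and a linear minimization over the feasible set, which is the usual reason to prefer Frank-Wolfe when projecting onto the constraint set would be expensive.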
Position paper: Towards an observer-oriented theory of shape comparison
Title | Position paper: Towards an observer-oriented theory of shape comparison |
Authors | Patrizio Frosini |
Abstract | In this position paper we suggest a possible metric approach to shape comparison that is based on a mathematical formalization of the concept of observer, seen as a collection of suitable operators acting on a metric space of functions. These functions represent the set of data that are accessible to the observer, while the operators describe the way the observer elaborates the data and enclose the invariance that he/she associates with them. We expose this model and illustrate some theoretical reasons that justify its possible use for shape comparison. |
Tasks | |
Published | 2016-03-07 |
URL | http://arxiv.org/abs/1603.02008v1 |
http://arxiv.org/pdf/1603.02008v1.pdf | |
PWC | https://paperswithcode.com/paper/position-paper-towards-an-observer-oriented |
Repo | |
Framework | |
A statistical model of tristimulus measurements within and between OLED displays
Title | A statistical model of tristimulus measurements within and between OLED displays |
Authors | Matti Raitoharju, Samu Kallio, Matti Pellikka |
Abstract | We present an empirical model for noises in color measurements from OLED displays. According to measured data the noise is not isotropic in the XYZ space, instead most of the noise is along an axis that is parallel to a vector from origin to measured XYZ vector. The presented empirical model is simple and depends only on the measured XYZ values. Our tests show that the variations between multiple panels of the same type have similar distribution as the temporal noise in measurements from a single panel, but a larger magnitude. |
Tasks | |
Published | 2016-08-28 |
URL | http://arxiv.org/abs/1608.08596v2 |
http://arxiv.org/pdf/1608.08596v2.pdf | |
PWC | https://paperswithcode.com/paper/a-statistical-model-of-tristimulus |
Repo | |
Framework | |
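The abstract above states that most of the measurement noise lies along the axis from the origin to the measured XYZ vector. A minimal sketch of how one might check this on data is to split each deviation from the mean into a component along that axis and a component orthogonal to it. The function name and the sample values are hypothetical and are not taken from the paper.

```python
import math


def split_noise(mean_xyz, sample_xyz):
    """Split a deviation from the mean into radial and orthogonal parts."""
    norm = math.sqrt(sum(m * m for m in mean_xyz))
    unit = [m / norm for m in mean_xyz]  # direction from origin to mean XYZ
    dev = [s - m for s, m in zip(sample_xyz, mean_xyz)]
    # Signed length of the deviation along the origin-to-mean axis.
    radial = sum(d * u for d, u in zip(dev, unit))
    # Whatever remains after removing the radial part.
    ortho_vec = [d - radial * u for d, u in zip(dev, unit)]
    ortho = math.sqrt(sum(o * o for o in ortho_vec))
    return radial, ortho


# Hypothetical mean measurement and one noisy sample.
mean_xyz = (3.0, 4.0, 0.0)
radial, ortho = split_noise(mean_xyz, (3.3, 4.4, 0.1))
```

If the radial magnitudes are consistently much larger than the orthogonal ones across many samples, the noise is anisotropic in the way the abstract describes.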
Geometry of Polysemy
Title | Geometry of Polysemy |
Authors | Jiaqi Mu, Suma Bhat, Pramod Viswanath |
Abstract | Vector representations of words have heralded a transformational approach to classical problems in NLP; the most popular example is word2vec. However, a single vector does not suffice to model the polysemous nature of many (frequent) words, i.e., words with multiple meanings. In this paper, we propose a three-fold approach for unsupervised polysemy modeling: (a) context representations, (b) sense induction and disambiguation and (c) lexeme (as a word and sense pair) representations. A key feature of our work is the finding that a sentence containing a target word is well represented by a low rank subspace, instead of a point in a vector space. We then show that the subspaces associated with a particular sense of the target word tend to intersect over a line (one-dimensional subspace), which we use to disambiguate senses using a clustering algorithm that harnesses the Grassmannian geometry of the representations. The disambiguation algorithm, which we call $K$-Grassmeans, leads to a procedure to label the different senses of the target word in the corpus – yielding lexeme vector representations, all in an unsupervised manner starting from a large (Wikipedia) corpus in English. Apart from several prototypical target (word,sense) examples and a host of empirical studies to intuit and justify the various geometric representations, we validate our algorithms on standard sense induction and disambiguation datasets and present new state-of-the-art results. |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07569v1 |
http://arxiv.org/pdf/1610.07569v1.pdf | |
PWC | https://paperswithcode.com/paper/geometry-of-polysemy |
Repo | |
Framework | |
Kannada Spell Checker with Sandhi Splitter
Title | Kannada Spell Checker with Sandhi Splitter |
Authors | A N Akshatha, Chandana G Upadhyaya, Rajashekara S Murthy |
Abstract | Spelling errors are introduced in text either during typing, or when the user does not know the correct phoneme or grapheme. If a language contains complex words like sandhi where two or more morphemes join based on some rules, spell checking becomes very tedious. In such situations, having a spell checker with sandhi splitter which alerts the user by flagging the errors and providing suggestions is very useful. A novel algorithm of sandhi splitting is proposed in this paper. The sandhi splitter can split about 7000 most common sandhi words in Kannada language used as test samples. The sandhi splitter was integrated with a Kannada spell checker and a mechanism for generating suggestions was added. A comprehensive, platform independent, standalone spell checker with sandhi splitter application software was thus developed and tested extensively for its efficiency and correctness. A comparative analysis of this spell checker with sandhi splitter was made and results concluded that the Kannada spell checker with sandhi splitter has an improved performance. It is twice as fast, 200 times more space efficient, and it is 90% accurate in case of complex nouns and 50% accurate for complex verbs. Such a spell checker with sandhi splitter will be of foremost significance in machine translation systems, voice processing, etc. This is the first sandhi splitter in Kannada and the advantage of the novel algorithm is that, it can be extended to all Indian languages. |
Tasks | Machine Translation |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08358v1 |
http://arxiv.org/pdf/1611.08358v1.pdf | |
PWC | https://paperswithcode.com/paper/kannada-spell-checker-with-sandhi-splitter |
Repo | |
Framework | |
Density-based Denoising of Point Cloud
Title | Density-based Denoising of Point Cloud |
Authors | Faisal Zaman, Ya Ping Wong, Boon Yian Ng |
Abstract | Point cloud source data for surface reconstruction is usually contaminated with noise and outliers. To overcome this deficiency, a density-based point cloud denoising method is presented to remove outliers and noisy points. First, a particle swarm optimization technique is employed for automatically approximating the optimal bandwidth of multivariate kernel density estimation to ensure the robust performance of density estimation. Then, a mean-shift based clustering technique is used to remove outliers through a thresholding scheme. After removing outliers from the point cloud, bilateral mesh filtering is applied to smooth the remaining points. The experimental results show that this approach, comparably, is robust and efficient. |
Tasks | Denoising, Density Estimation |
Published | 2016-02-17 |
URL | http://arxiv.org/abs/1602.05312v1 |
http://arxiv.org/pdf/1602.05312v1.pdf | |
PWC | https://paperswithcode.com/paper/density-based-denoising-of-point-cloud |
Repo | |
Framework | |
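The core idea in the abstract above, removing points that sit in low-density regions, can be sketched in a few lines. This is a simplified stand-in, not the paper's pipeline: it assumes a fixed, hand-picked bandwidth instead of particle swarm optimization, and a median-relative density threshold instead of mean-shift clustering. All names and the `ratio` parameter are hypothetical.

```python
import math


def kernel_density(points, bandwidth):
    """Gaussian kernel density estimate (up to a constant factor) at each point."""
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        densities.append(total / len(points))
    return densities


def remove_outliers(points, bandwidth, ratio=0.5):
    """Keep points whose density is at least `ratio` times the median density."""
    dens = kernel_density(points, bandwidth)
    median = sorted(dens)[len(dens) // 2]
    return [p for p, d in zip(points, dens) if d >= ratio * median]


# Five clustered points plus one far-away outlier.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
         (0.1, 0.1, 0.0), (0.05, 0.05, 0.0), (5.0, 5.0, 5.0)]
cleaned = remove_outliers(cloud, bandwidth=0.2)
```

The result depends strongly on the bandwidth, which is presumably why the paper tunes it automatically rather than fixing it by hand as done here.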
3-D Hand Pose Estimation from Kinect’s Point Cloud Using Appearance Matching
Title | 3-D Hand Pose Estimation from Kinect’s Point Cloud Using Appearance Matching |
Authors | Pasquale Coscia, Francesco A. N. Palmieri, Francesco Castaldo, Alberto Cavallo |
Abstract | We present a novel appearance-based approach for pose estimation of a human hand using the point clouds provided by the low-cost Microsoft Kinect sensor. Both the free-hand case, in which the hand is isolated from the surrounding environment, and the hand-object case, in which the different types of interactions are classified, have been considered. The hand-object case is clearly the most challenging task having to deal with multiple tracks. The approach proposed here belongs to the class of partial pose estimation where the estimated pose in a frame is used for the initialization of the next one. The pose estimation is obtained by applying a modified version of the Iterative Closest Point (ICP) algorithm to synthetic models to obtain the rigid transformation that aligns each model with respect to the input data. The proposed framework uses a “pure” point cloud as provided by the Kinect sensor without any other information such as RGB values or normal vector components. For this reason, the proposed method can also be applied to data obtained from other types of depth sensor, or RGB-D camera. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2016-04-07 |
URL | http://arxiv.org/abs/1604.02032v1 |
http://arxiv.org/pdf/1604.02032v1.pdf | |
PWC | https://paperswithcode.com/paper/3-d-hand-pose-estimation-from-kinects-point |
Repo | |
Framework | |
Algorithms for Generalized Cluster-wise Linear Regression
Title | Algorithms for Generalized Cluster-wise Linear Regression |
Authors | Young Woong Park, Yan Jiang, Diego Klabjan, Loren Williams |
Abstract | Cluster-wise linear regression (CLR), a clustering problem intertwined with regression, is to find clusters of entities such that the overall sum of squared errors from regressions performed over these clusters is minimized, where each cluster may have different variances. We generalize the CLR problem by allowing each entity to have more than one observation, and refer to it as generalized CLR. We propose an exact mathematical programming based approach relying on column generation, a column generation based heuristic algorithm that clusters predefined groups of entities, a metaheuristic genetic algorithm with adapted Lloyd’s algorithm for K-means clustering, a two-stage approach, and a modified algorithm of Späth (1979) for solving generalized CLR. We examine the performance of our algorithms on a stock keeping unit (SKU) clustering problem employed in forecasting halo and cannibalization effects in promotions using real-world retail data from a large supermarket chain. In the SKU clustering problem, the retailer needs to cluster SKUs based on their seasonal effects in response to promotions. The seasonal effects are the results of regressions with predictors being promotion mechanisms and seasonal dummies performed over clusters generated. We compare the performance of all proposed algorithms for the SKU problem with real-world and synthetic data. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01417v2 |
http://arxiv.org/pdf/1607.01417v2.pdf | |
PWC | https://paperswithcode.com/paper/algorithms-for-generalized-cluster-wise |
Repo | |
Framework | |
Learning to Navigate the Energy Landscape
Title | Learning to Navigate the Energy Landscape |
Authors | Julien Valentin, Angela Dai, Matthias Nießner, Pushmeet Kohli, Philip Torr, Shahram Izadi, Cem Keskin |
Abstract | In this paper, we present a novel and efficient architecture for addressing computer vision problems that use ‘Analysis by Synthesis’. Analysis by synthesis involves the minimization of the reconstruction error which is typically a non-convex function of the latent target variables. State-of-the-art methods adopt a hybrid scheme where discriminatively trained predictors like Random Forests or Convolutional Neural Networks are used to initialize local search algorithms. While these methods have been shown to produce promising results, they often get stuck in local optima. Our method goes beyond the conventional hybrid architecture by not only proposing multiple accurate initial solutions but by also defining a navigational structure over the solution space that can be used for extremely efficient gradient-free local search. We demonstrate the efficacy of our approach on the challenging problem of RGB Camera Relocalization. To make the RGB camera relocalization problem particularly challenging, we introduce a new dataset of 3D environments which are significantly larger than those found in other publicly-available datasets. Our experiments reveal that the proposed method is able to achieve state-of-the-art camera relocalization results. We also demonstrate the generalizability of our approach on Hand Pose Estimation and Image Retrieval tasks. |
Tasks | Camera Relocalization, Hand Pose Estimation, Image Retrieval, Pose Estimation |
Published | 2016-03-18 |
URL | http://arxiv.org/abs/1603.05772v1 |
http://arxiv.org/pdf/1603.05772v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-navigate-the-energy-landscape |
Repo | |
Framework | |
Streaming Recommender Systems
Title | Streaming Recommender Systems |
Authors | Shiyu Chang, Yang Zhang, Jiliang Tang, Dawei Yin, Yi Chang, Mark A. Hasegawa-Johnson, Thomas S. Huang |
Abstract | The increasing popularity of real-world recommender systems produces data continuously and rapidly, and it becomes more realistic to study recommender systems under streaming scenarios. Data streams are temporally ordered, continuous and high-velocity, which poses tremendous challenges to traditional recommender systems. In this paper, we investigate the problem of recommendation with stream inputs. In particular, we provide a principled framework termed sRec, which provides explicit continuous-time random process models of the creation of users and topics, and of the evolution of their interests. A variational Bayesian approach called recursive meanfield approximation is proposed, which permits computationally efficient instantaneous on-line inference. Experimental results on several real-world datasets demonstrate the advantages of our sRec over other state-of-the-art methods. |
Tasks | Recommendation Systems |
Published | 2016-07-21 |
URL | http://arxiv.org/abs/1607.06182v1 |
http://arxiv.org/pdf/1607.06182v1.pdf | |
PWC | https://paperswithcode.com/paper/streaming-recommender-systems |
Repo | |
Framework | |
Fully DNN-based Multi-label regression for audio tagging
Title | Fully DNN-based Multi-label regression for audio tagging |
Authors | Yong Xu, Qiang Huang, Wenwu Wang, Philip J. B. Jackson, Mark D. Plumbley |
Abstract | Acoustic event detection for content analysis in most cases relies on lots of labeled data. However, manually annotating data is a time-consuming task, which thus makes few annotated resources available so far. Unlike audio event detection, automatic audio tagging, a multi-label acoustic event classification task, only relies on weakly labeled data. This is highly desirable to some practical applications using audio analysis. In this paper we propose to use a fully deep neural network (DNN) framework to handle the multi-label classification task in a regression way. Considering that only chunk-level rather than frame-level labels are available, the whole or almost whole frames of the chunk were fed into the DNN to perform a multi-label regression for the expected tags. The fully DNN, which is regarded as an encoding function, can well map the audio features sequence to a multi-tag vector. A deep pyramid structure was also designed to extract more robust high-level features related to the target tags. Further improved methods were adopted, such as the Dropout and background noise aware training, to enhance its generalization capability for new audio recordings in mismatched environments. Compared with the conventional Gaussian Mixture Model (GMM) and support vector machine (SVM) methods, the proposed fully DNN-based method could well utilize the long-term temporal information with the whole chunk as the input. The results show that our approach obtained a 15% relative improvement compared with the official GMM-based method of DCASE 2016 challenge. |
Tasks | Audio Tagging, Multi-Label Classification |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07695v2 |
http://arxiv.org/pdf/1606.07695v2.pdf | |
PWC | https://paperswithcode.com/paper/fully-dnn-based-multi-label-regression-for |
Repo | |
Framework | |
Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods
Title | Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods |
Authors | Xiaojie Jin, Xiaotong Yuan, Jiashi Feng, Shuicheng Yan |
Abstract | Deep neural networks have achieved remarkable success in a wide range of practical problems. However, due to the inherent large parameter space, deep models are notoriously prone to overfitting and difficult to deploy on portable devices with limited memory. In this paper, we propose an iterative hard thresholding (IHT) approach to train Skinny Deep Neural Networks (SDNNs). An SDNN has far fewer parameters yet can achieve competitive or even better performance than its full CNN counterpart. More concretely, the IHT approach trains an SDNN through the following two alternating phases: (I) perform hard thresholding to drop connections with small activations and fine-tune the other significant filters; (II) re-activate the frozen connections and train the entire network to improve its overall discriminative capability. We verify the superiority of SDNNs in terms of efficiency and classification performance on four benchmark object recognition datasets, including CIFAR-10, CIFAR-100, MNIST and ImageNet. Experimental results clearly demonstrate that IHT can be applied for training SDNN based on various CNN architectures such as NIN and AlexNet. |
Tasks | Object Recognition |
Published | 2016-07-19 |
URL | http://arxiv.org/abs/1607.05423v1 |
http://arxiv.org/pdf/1607.05423v1.pdf | |
PWC | https://paperswithcode.com/paper/training-skinny-deep-neural-networks-with |
Repo | |
Framework | |
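The hard-thresholding step named in the abstract above can be illustrated in isolation. The sketch below keeps the `keep` largest-magnitude entries of a vector and zeroes the rest. It assumes thresholding on weight magnitude, which is one common reading of hard thresholding; the abstract itself speaks of connections with small activations, and the paper's exact criterion is not given here. The function name and values are hypothetical.

```python
def hard_threshold(weights, keep):
    """Zero all but the `keep` largest-magnitude entries of `weights`."""
    if keep <= 0:
        return [0.0] * len(weights)
    # Indices ordered by decreasing absolute value.
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


pruned = hard_threshold([0.1, -2.0, 0.5, 3.0, -0.05], keep=2)
# pruned == [0.0, -2.0, 0.0, 3.0, 0.0]
```

In the two-phase scheme the abstract describes, a step like this would correspond to phase (I); phase (II) would then lift the mask and train all entries again.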
Robust Regression For Image Binarization Under Heavy Noises and Nonuniform Background
Title | Robust Regression For Image Binarization Under Heavy Noises and Nonuniform Background |
Authors | Garret Vo, Chiwoo Park |
Abstract | This paper presents a robust regression approach for image binarization under significant background variations and observation noises. The work is motivated by the need to identify foreground regions in noisy microscopic images or degraded document images, where significant background variation and severe noise make image binarization challenging. The proposed method first estimates the background of an input image, subtracts the estimated background from the input image, and applies a global threshold to the subtracted outcome to obtain a binary image of foregrounds. A robust regression approach was proposed to estimate the background intensity surface with minimal effects of foreground intensities and noises, and a global threshold selector was proposed on the basis of a model selection criterion in a sparse regression. The proposed approach was validated using 26 test images and the corresponding ground truths, and the outcomes of the proposed work were compared with those from nine existing image binarization methods. The approach was also combined with three state-of-the-art morphological segmentation methods to show how the proposed approach can improve their image segmentation outcomes. |
Tasks | Model Selection, Semantic Segmentation |
Published | 2016-09-26 |
URL | http://arxiv.org/abs/1609.08078v4 |
http://arxiv.org/pdf/1609.08078v4.pdf | |
PWC | https://paperswithcode.com/paper/robust-regression-for-image-binarization |
Repo | |
Framework | |
Chess Player by Co-Evolutionary Algorithm
Title | Chess Player by Co-Evolutionary Algorithm |
Authors | Nuno Ramos, Sergio Salgado, Agostinho C Rosa |
Abstract | A co-evolutionary algorithm (CA) based chess player is presented. Implementation details of the algorithms, namely coding, population, and variation operators, are described. The alpha-beta or mini-max like behaviour of the player is achieved through two competitive or cooperative populations. Special attention is given to the fitness function evaluation (the heart of the solution). Test results on algorithms vs. algorithms or human players are provided. |
Tasks | |
Published | 2016-05-21 |
URL | http://arxiv.org/abs/1605.06710v1 |
http://arxiv.org/pdf/1605.06710v1.pdf | |
PWC | https://paperswithcode.com/paper/chess-player-by-co-evolutionary-algorithm |
Repo | |
Framework | |
Automatic labeling of molecular biomarkers of whole slide immunohistochemistry images using fully convolutional networks
Title | Automatic labeling of molecular biomarkers of whole slide immunohistochemistry images using fully convolutional networks |
Authors | Fahime Sheikhzadeh, Martial Guillaud, Rabab K. Ward |
Abstract | This paper addresses the problem of quantifying biomarkers in multi-stained tissues, based on color and spatial information. A deep learning based method that can automatically localize and quantify the cells expressing biomarker(s) in a whole slide image is proposed. The deep learning network is a fully convolutional network (FCN) whose input is the true RGB color image of a tissue and whose output is a map of the different biomarkers. The FCN relies on a convolutional neural network (CNN) that classifies each cell separately according to the biomarker it expresses. In this study, images of immunohistochemistry (IHC) stained slides were collected and used. More than 4,500 RGB images of cells were manually labeled based on the expressing biomarkers. The labeled cell images were used to train the CNN (obtaining an accuracy of 92% in a test set). The trained CNN is then extended to an FCN that generates a map of all biomarkers in the whole slide image acquired by the scanner (instead of classifying every cell image). To evaluate our method, we manually labeled all nuclei expressing different biomarkers in two whole slide images and used these as the ground truth. Our proposed method for immunohistochemical analysis compares well with the manual labeling by humans (average F-score of 0.96). |
Tasks | |
Published | 2016-12-30 |
URL | http://arxiv.org/abs/1612.09420v1 |
http://arxiv.org/pdf/1612.09420v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-labeling-of-molecular-biomarkers-of |
Repo | |
Framework | |