Paper Group ANR 26
Tracking System to Automate Data Collection of Microscopic Pedestrian Traffic Flow. A propagation matting method based on the Local Sampling and KNN Classification with adaptive feature space. Joint Sensing Matrix and Sparsifying Dictionary Optimization for Tensor Compressive Sensing. Machine Learning in Falls Prediction; A cognition-based predicto …
Tracking System to Automate Data Collection of Microscopic Pedestrian Traffic Flow
Title | Tracking System to Automate Data Collection of Microscopic Pedestrian Traffic Flow |
Authors | Kardi Teknomo, Yasushi Takeyama, Hajime Inamura |
Abstract | To handle large amounts of pedestrian data, automatic data collection is needed. This paper describes how to automate the collection of microscopic pedestrian flow data from video files. The study is restricted to pedestrians only, without considering vehicle-pedestrian interaction. The pedestrian tracking system consists of three sub-systems, which handle image processing, object tracking, and the calculation of traffic flow variables. The system receives stacks of images and parameters as input. The first sub-system performs image processing analysis, while the second carries out the tracking of pedestrians by matching features and tracing pedestrian numbers frame by frame. The last sub-system uses an NTXY database to calculate pedestrian traffic-flow characteristics such as flow rate, speed and area module. Comparison with a manual data collection method confirmed that the procedures described have significant potential to automate the data collection of both microscopic and macroscopic pedestrian flow variables. |
Tasks | Object Tracking |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.01810v1 |
http://arxiv.org/pdf/1609.01810v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-system-to-automate-data-collection |
Repo | |
Framework | |
A propagation matting method based on the Local Sampling and KNN Classification with adaptive feature space
Title | A propagation matting method based on the Local Sampling and KNN Classification with adaptive feature space |
Authors | Xiao Chen, Fazhi He |
Abstract | Closed Form is a propagation-based matting algorithm that functions well on images with good propagation. Its deficiency is that in complex areas with poor propagation, such as hole areas or long, narrow structures, the right results are usually hard to obtain. In these areas, providing certain flags can improve the matting results. In this paper, we design a matting algorithm that combines local sampling with a KNN-classifier propagation-based matting algorithm. First, we build the corresponding feature space according to the different components of the image colors, to reduce the influence of overlap between the foreground and background and to improve the classification accuracy of the KNN classifier. Second, we adaptively apply either local sampling or the local KNN classifier, depending on the quality of the samples available in the unknown image areas. Finally, based on the different treatments of the unknown areas, we use different weights for the augmenting constraints to make the treatment more effective. Combining qualitative observation and quantitative analysis, we evaluate the experimental results on an online standard evaluation test set. The results show that on images with good propagation this method is as effective as Closed Form, while on images with complex regions it performs even better than Closed Form. |
Tasks | |
Published | 2016-05-03 |
URL | http://arxiv.org/abs/1605.00732v1 |
http://arxiv.org/pdf/1605.00732v1.pdf | |
PWC | https://paperswithcode.com/paper/a-propagation-matting-method-based-on-the |
Repo | |
Framework | |
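As an illustrative sketch of the KNN-classification step described above (not the authors' implementation; the adaptive feature-space construction and local sampling are omitted, and the feature vectors and `k` below are hypothetical), unknown trimap pixels can be flagged by majority vote over their nearest labeled samples:

```python
import numpy as np

def knn_classify_unknown(fg_feats, bg_feats, unknown_feats, k=5):
    """Label each unknown pixel foreground (1.0) or background (0.0) by
    majority vote among its k nearest labeled samples in feature space."""
    samples = np.vstack([fg_feats, bg_feats])
    labels = np.concatenate([np.ones(len(fg_feats)), np.zeros(len(bg_feats))])
    flags = []
    for f in np.atleast_2d(unknown_feats):
        d = np.linalg.norm(samples - f, axis=1)   # distances to all samples
        nearest = np.argsort(d)[:k]               # indices of k nearest
        flags.append(1.0 if labels[nearest].mean() >= 0.5 else 0.0)
    return np.array(flags)
```

In the paper these flags serve as additional constraints for the propagation step; the choice of feature space is the adaptive part this sketch leaves out.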
Joint Sensing Matrix and Sparsifying Dictionary Optimization for Tensor Compressive Sensing
Title | Joint Sensing Matrix and Sparsifying Dictionary Optimization for Tensor Compressive Sensing |
Authors | Xin Ding, Wei Chen, Ian J. Wassell |
Abstract | Tensor Compressive Sensing (TCS) is a multidimensional framework of Compressive Sensing (CS), and it is advantageous in terms of reducing the amount of storage, easing hardware implementations and preserving multidimensional structures of signals in comparison to a conventional CS system. In a TCS system, instead of using a random sensing matrix and a predefined dictionary, the average-case performance can be further improved by employing an optimized multidimensional sensing matrix and a learned multilinear sparsifying dictionary. In this paper, we propose a joint optimization approach of the sensing matrix and dictionary for a TCS system. For the sensing matrix design in TCS, an extended separable approach with a closed form solution and a novel iterative non-separable method are proposed when the multilinear dictionary is fixed. In addition, a multidimensional dictionary learning method that takes advantages of the multidimensional structure is derived, and the influence of sensing matrices is taken into account in the learning process. A joint optimization is achieved via alternately iterating the optimization of the sensing matrix and dictionary. Numerical experiments using both synthetic data and real images demonstrate the superiority of the proposed approaches. |
Tasks | Compressive Sensing, Dictionary Learning |
Published | 2016-01-28 |
URL | http://arxiv.org/abs/1601.07804v1 |
http://arxiv.org/pdf/1601.07804v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-sensing-matrix-and-sparsifying |
Repo | |
Framework | |
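The storage advantage TCS draws on comes from the separable (Kronecker) structure of the sensing operator. A minimal numerical check of that structure, with hypothetical dimensions and random matrices standing in for the paper's optimized sensing matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Separable sensing: measure a 2-D signal X with small per-mode matrices
# A1 (m1 x n1) and A2 (m2 x n2) instead of one dense (m1*m2) x (n1*n2) matrix.
n1, n2, m1, m2 = 16, 16, 8, 8
A1 = rng.standard_normal((m1, n1))
A2 = rng.standard_normal((m2, n2))
X = rng.standard_normal((n1, n2))

Y = A1 @ X @ A2.T                          # mode-wise measurement, (m1, m2)
y_kron = np.kron(A1, A2) @ X.reshape(-1)   # equivalent vectorized (row-major) form

assert np.allclose(Y.reshape(-1), y_kron)
```

Here the two 8x16 factors hold 256 entries in total versus 16384 for the equivalent Kronecker matrix, while producing identical measurements; the paper's contribution is to optimize A1, A2 and the dictionary jointly rather than drawing them at random as above.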
Machine Learning in Falls Prediction; A cognition-based predictor of falls for the acute neurological in-patient population
Title | Machine Learning in Falls Prediction; A cognition-based predictor of falls for the acute neurological in-patient population |
Authors | Bilal A. Mateen, Matthias Bussas, Catherine Doogan, Denise Waller, Alessia Saverino, Franz J Király, E Diane Playford |
Abstract | Background Information: Falls are associated with high direct and indirect costs, and significant morbidity and mortality for patients. Pathological falls are usually a result of a compromised motor system and/or cognition. Very little research has been conducted on predicting falls based on this premise. Aims: To demonstrate that cognitive and motor tests can be used to create a robust predictive tool for falls. Methods: Three tests of attention and executive function (Stroop, Trail Making, and Semantic Fluency), a measure of physical function (Walk-12), a series of questions (concerning recent falls, surgery and physical function) and demographic information were collected from a cohort of 323 patients at a tertiary neurological center. The principal outcome was a fall during the in-patient stay (n = 54). Data-driven, predictive modelling was employed to identify the statistical modelling strategies that are most accurate in predicting falls, and that yield the most parsimonious models of clinical relevance. Results: The Trail test was identified as the best predictor of falls. Moreover, the addition of any other variables to the results of the Trail test did not improve the prediction (Wilcoxon signed-rank p < .001). The best statistical strategy for predicting falls was the random forest (Wilcoxon signed-rank p < .001), based solely on results of the Trail test. Tuning the model yields the following optimized values: 68% (±7.7) sensitivity and 90% (±2.3) specificity, with a positive predictive value of 60%, when the relevant data are available. Conclusion: Predictive modelling has identified a simple yet powerful machine learning prediction strategy based on a single clinical test, the Trail test. Predictive evaluation shows this strategy to be robust, suggesting predictive modelling and machine learning as the standard for future predictive tools. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.07751v1 |
http://arxiv.org/pdf/1607.07751v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-in-falls-prediction-a |
Repo | |
Framework | |
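A minimal sketch of the kind of single-feature random-forest predictor the paper arrives at, using synthetic Trail-test times in place of the clinical data (all numbers below are hypothetical illustrations, not the study's values):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical data: Trail-test completion times (seconds). Fallers are
# simulated as slower on average; none of these numbers come from the study.
n = 300
times = np.concatenate([rng.normal(60, 15, n),    # non-fallers
                        rng.normal(110, 20, n)])  # fallers
fell = np.concatenate([np.zeros(n), np.ones(n)])

# Random forest on the single Trail-test feature, as in the paper's
# final model (hyperparameters here are sklearn defaults, not tuned).
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(times.reshape(-1, 1), fell)

# Fall-risk probability for a new patient's test result.
risk = clf.predict_proba(np.array([[120.0]]))[0, 1]
```

The study's actual tuning targeted the sensitivity/specificity trade-off reported above; this sketch only shows the model family and the single-feature input.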
Learning Action Concept Trees and Semantic Alignment Networks from Image-Description Data
Title | Learning Action Concept Trees and Semantic Alignment Networks from Image-Description Data |
Authors | Jiyang Gao, Ram Nevatia |
Abstract | Action classification in still images has been a popular research topic in computer vision. Labelling large-scale datasets for action classification requires tremendous manual work, which is hard to scale up. Moreover, the action categories in such datasets are pre-defined and their vocabularies are fixed. However, humans may describe the same action with different phrases, which makes vocabulary expansion difficult for traditional fully-supervised methods. We observe that large amounts of images with sentence descriptions are readily available on the Internet. The sentence descriptions can be regarded as weak labels for the images; they contain rich information and can be used to learn flexible expressions of action categories. We propose a method to learn an Action Concept Tree (ACT) and an Action Semantic Alignment (ASA) model for classification from image-description data via a two-stage learning process. A new dataset for the task of learning actions from descriptions is built. Experimental results show that our method outperforms several baseline methods significantly. |
Tasks | Action Classification |
Published | 2016-09-08 |
URL | http://arxiv.org/abs/1609.02284v1 |
http://arxiv.org/pdf/1609.02284v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-action-concept-trees-and-semantic |
Repo | |
Framework | |
Fast Nonsmooth Regularized Risk Minimization with Continuation
Title | Fast Nonsmooth Regularized Risk Minimization with Continuation |
Authors | Shuai Zheng, Ruiliang Zhang, James T. Kwok |
Abstract | In regularized risk minimization, the associated optimization problem becomes particularly difficult when both the loss and the regularizer are nonsmooth. Existing approaches either have slow or unclear convergence properties, are restricted to limited problem subclasses, or require careful setting of a smoothing parameter. In this paper, we propose a continuation algorithm that is applicable to a large class of nonsmooth regularized risk minimization problems, can be flexibly used with a number of existing solvers for the underlying smoothed subproblem, and comes with convergence results for the whole algorithm rather than just one of its subproblems. In particular, when accelerated solvers are used, the proposed algorithm achieves the fastest known rates of $O(1/T^2)$ on strongly convex problems and $O(1/T)$ on general convex problems. Experiments on nonsmooth classification and regression tasks demonstrate that the proposed algorithm outperforms the state-of-the-art. |
Tasks | |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.07844v1 |
http://arxiv.org/pdf/1602.07844v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-nonsmooth-regularized-risk-minimization |
Repo | |
Framework | |
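The continuation idea can be sketched as follows: repeatedly solve a Huber-smoothed version of the nonsmooth problem, shrinking the smoothing parameter between stages and warm-starting each stage at the previous solution. This toy version uses plain gradient descent where the paper plugs in accelerated solvers, and the doubly nonsmooth L1-loss-plus-L1-penalty objective is just one instance of the problem class:

```python
import numpy as np

def huber_grad(r, mu):
    # Gradient of the smoothed absolute value (Huber with parameter mu).
    return np.where(np.abs(r) <= mu, r / mu, np.sign(r))

def continuation_solve(A, b, lam, mu0=1.0, rho=0.5, stages=8, inner=300):
    """Minimize ||Ax - b||_1 + lam * ||x||_1 (both terms nonsmooth) by
    solving a sequence of Huber-smoothed problems with decreasing mu,
    warm-starting each stage at the previous solution."""
    L = np.linalg.norm(A, 2) ** 2 + lam    # curvature scale of the smoothed objective
    x = np.zeros(A.shape[1])
    mu = mu0
    for _ in range(stages):
        lr = mu / L                        # step size shrinks as smoothing tightens
        for _ in range(inner):
            g = A.T @ huber_grad(A @ x - b, mu) + lam * huber_grad(x, mu)
            x -= lr * g
        mu *= rho                          # continuation: tighten the smoothing
    return x
```

Swapping the inner loop for an accelerated solver is what yields the rates quoted in the abstract; the outer continuation schedule is the part this sketch illustrates.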
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
Title | Convolutional Neural Networks Analyzed via Convolutional Sparse Coding |
Authors | Vardan Papyan, Yaniv Romano, Michael Elad |
Abstract | Convolutional neural networks (CNN) have led to many state-of-the-art results spanning various fields. However, a clear and profound theoretical understanding of the forward pass, the core algorithm of CNN, is still lacking. In parallel, within the wide field of sparse approximation, Convolutional Sparse Coding (CSC) has gained increasing attention in recent years. A theoretical study of this model was recently conducted, establishing it as a reliable and stable alternative to the commonly practiced patch-based processing. Herein, we propose a novel multi-layer model, ML-CSC, in which signals are assumed to emerge from a cascade of CSC layers. This is shown to be tightly connected to CNN, so much so that the forward pass of the CNN is in fact the thresholding pursuit serving the ML-CSC model. This connection brings a fresh view to CNN, as we are able to attribute to this architecture theoretical claims such as uniqueness of the representations throughout the network, and their stable estimation, all guaranteed under simple local sparsity conditions. Lastly, identifying the weaknesses in the above pursuit scheme, we propose an alternative to the forward pass, which is connected to deconvolutional, recurrent and residual networks, and has better theoretical guarantees. |
Tasks | |
Published | 2016-07-27 |
URL | http://arxiv.org/abs/1607.08194v4 |
http://arxiv.org/pdf/1607.08194v4.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-analyzed-via-1 |
Repo | |
Framework | |
Deep Structured Scene Parsing by Learning with Image Descriptions
Title | Deep Structured Scene Parsing by Learning with Image Descriptions |
Authors | Liang Lin, Guangrun Wang, Rui Zhang, Ruimao Zhang, Xiaodan Liang, Wangmeng Zuo |
Abstract | This paper addresses a fundamental problem of scene understanding: how to parse a scene image into a structured configuration (i.e., a semantic object hierarchy with object interaction relations) that finely accords with human perception. We propose a deep architecture consisting of two networks: i) a convolutional neural network (CNN) extracting the image representation for pixelwise object labeling and ii) a recursive neural network (RNN) discovering the hierarchical object structure and the inter-object relations. Rather than relying on elaborate user annotations (e.g., manually labeled semantic maps and relations), we train our deep model in a weakly-supervised manner by leveraging the descriptive sentences of the training images. Specifically, we decompose each sentence into a semantic tree consisting of nouns and verb phrases, and use these trees to guide the discovery of the configurations of the training images. Once these scene configurations are determined, the parameters of both the CNN and RNN are updated accordingly by backpropagation. The entire model training is accomplished through an Expectation-Maximization method. Extensive experiments suggest that our model is capable of producing meaningful and structured scene configurations, and achieves more favorable scene labeling performance on PASCAL VOC 2012 than other state-of-the-art weakly-supervised methods. |
Tasks | Scene Labeling, Scene Parsing, Scene Understanding |
Published | 2016-04-08 |
URL | http://arxiv.org/abs/1604.02271v3 |
http://arxiv.org/pdf/1604.02271v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-structured-scene-parsing-by-learning |
Repo | |
Framework | |
Geometric Scene Parsing with Hierarchical LSTM
Title | Geometric Scene Parsing with Hierarchical LSTM |
Authors | Zhanglin Peng, Ruimao Zhang, Xiaodan Liang, Xiaobai Liu, Liang Lin |
Abstract | This paper addresses the problem of geometric scene parsing, i.e. simultaneously labeling geometric surfaces (e.g. sky, ground and vertical plane) and determining the interaction relations (e.g. layering, supporting, siding and affinity) between main regions. This problem is more challenging than traditional semantic scene labeling, as recovering geometric structures necessarily requires rich and diverse contextual information. To achieve these goals, we propose a novel recurrent neural network model, named Hierarchical Long Short-Term Memory (H-LSTM). It contains two coupled sub-networks: the Pixel LSTM (P-LSTM) and the Multi-scale Super-pixel LSTM (MS-LSTM), handling surface labeling and relation prediction, respectively. The two sub-networks provide complementary information to each other to exploit hierarchical scene contexts, and they are jointly optimized to boost performance. Our extensive experiments show that our model is capable of parsing scene geometric structures and outperforms several state-of-the-art methods by large margins. In addition, we show promising 3D reconstruction results from still images based on the geometric parsing. |
Tasks | 3D Reconstruction, Scene Labeling, Scene Parsing |
Published | 2016-04-07 |
URL | http://arxiv.org/abs/1604.01931v2 |
http://arxiv.org/pdf/1604.01931v2.pdf | |
PWC | https://paperswithcode.com/paper/geometric-scene-parsing-with-hierarchical |
Repo | |
Framework | |
Non-linear Label Ranking for Large-scale Prediction of Long-Term User Interests
Title | Non-linear Label Ranking for Large-scale Prediction of Long-Term User Interests |
Authors | Nemanja Djuric, Mihajlo Grbovic, Vladan Radosavljevic, Narayan Bhamidipati, Slobodan Vucetic |
Abstract | We consider the problem of personalization of online services from the viewpoint of ad targeting, where we seek to find the best ad categories to be shown to each user, resulting in improved user experience and increased advertisers’ revenue. We propose to address this problem as a task of ranking the ad categories depending on a user’s preference, and introduce a novel label ranking approach capable of efficiently learning non-linear, highly accurate models in large-scale settings. Experiments on a real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms the existing solutions in terms of both rank loss and top-K retrieval performance, strongly suggesting the benefit of using the proposed model on large-scale ranking problems. |
Tasks | |
Published | 2016-06-29 |
URL | http://arxiv.org/abs/1606.08963v1 |
http://arxiv.org/pdf/1606.08963v1.pdf | |
PWC | https://paperswithcode.com/paper/non-linear-label-ranking-for-large-scale |
Repo | |
Framework | |
Linking Image and Text with 2-Way Nets
Title | Linking Image and Text with 2-Way Nets |
Authors | Aviv Eisenschtat, Lior Wolf |
Abstract | Linking two data sources is a basic building block in numerous computer vision problems. Canonical Correlation Analysis (CCA) achieves this by utilizing a linear optimizer in order to maximize the correlation between the two views. Recent work makes use of non-linear models, including deep learning techniques, that optimize the CCA loss in some feature space. In this paper, we introduce a novel, bi-directional neural network architecture for the task of matching vectors from two data sources. Our approach employs two tied neural network channels that project the two views into a common, maximally correlated space using the Euclidean loss. We show a direct link between the correlation-based loss and Euclidean loss, enabling the use of Euclidean loss for correlation maximization. To overcome common Euclidean regression optimization problems, we modify well-known techniques to our problem, including batch normalization and dropout. We show state of the art results on a number of computer vision matching tasks including MNIST image matching and sentence-image matching on the Flickr8k, Flickr30k and COCO datasets. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.07973v3 |
http://arxiv.org/pdf/1608.07973v3.pdf | |
PWC | https://paperswithcode.com/paper/linking-image-and-text-with-2-way-nets |
Repo | |
Framework | |
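The paper's use of a Euclidean loss for correlation maximization rests on a simple identity: for zero-mean, unit-variance vectors, the mean squared distance equals `2 - 2*corr`, so minimizing one maximizes the other. A quick numerical check on synthetic vectors (not the paper's data, and without the deep tied channels):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated vectors, synthetic stand-ins for a pair of projected views.
n = 10_000
u = rng.standard_normal(n)
v = 0.7 * u + 0.3 * rng.standard_normal(n)

def standardize(a):
    # Zero mean, unit variance (population std, ddof=0).
    return (a - a.mean()) / a.std()

u, v = standardize(u), standardize(v)
corr = np.mean(u * v)            # Pearson correlation of standardized vectors
dist2 = np.mean((u - v) ** 2)    # per-sample squared Euclidean distance

# mean(u^2) = mean(v^2) = 1 after standardization, so dist2 = 2 - 2*corr exactly.
assert np.isclose(dist2, 2 - 2 * corr)
```

The 2-way net keeps the projections normalized (via batch normalization) precisely so that this equivalence holds inside the network.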
LLFR: A Lanczos-Based Latent Factor Recommender for Big Data Scenarios
Title | LLFR: A Lanczos-Based Latent Factor Recommender for Big Data Scenarios |
Authors | Maria Kalantzi |
Abstract | The purpose of this master’s thesis is to study and develop a new algorithmic framework for Collaborative Filtering to produce recommendations in the top-N recommendation problem. Thus, we propose the Lanczos Latent Factor Recommender (LLFR); a novel “big data friendly” collaborative filtering algorithm for top-N recommendation. Using a computationally efficient Lanczos-based procedure, LLFR builds a low-dimensional item similarity model that can be readily exploited to produce personalized ranking vectors over the item space. A number of experiments on real datasets indicate that LLFR outperforms other state-of-the-art top-N recommendation methods from a computational as well as a qualitative perspective. Our experimental results also show that its relative performance gains, compared to competing methods, increase as the data get sparser, as in the Cold Start Problem. More specifically, this is true both when the sparsity is generalized, as in the New Community Problem (a very common problem faced by real recommender systems in their early stages, when there is not yet a sufficient number of ratings for collaborative filtering algorithms to uncover similarities between items or users), and in the very interesting case where the sparsity is localized in a small fraction of the dataset, as in the New Users Problem (where newly introduced users have not rated many items, so the CF algorithm cannot yet make reliable personalized recommendations for them). |
Tasks | Recommendation Systems |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04335v1 |
http://arxiv.org/pdf/1606.04335v1.pdf | |
PWC | https://paperswithcode.com/paper/llfr-a-lanczos-based-latent-factor |
Repo | |
Framework | |
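A rough sketch of the low-rank item-similarity scoring that LLFR's output enables. Note the Lanczos procedure itself is replaced here by a plain truncated SVD (which yields the same factors, only less scalably), and the rating matrix is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-item feedback matrix (hypothetical implicit ratings).
R = (rng.random((50, 30)) < 0.1).astype(float)   # 50 users, 30 items

# Low-dimensional item factors. LLFR obtains these with a cheap
# Lanczos-based procedure; a full SVD stands in for it here.
f = 5                                             # latent dimension
_, _, Vt = np.linalg.svd(R, full_matrices=False)
V = Vt[:f].T                                      # top item factors, (30, f)

# Personalized ranking: project each user's history onto the low-rank
# item space, then rank items by score (masking seen items in practice).
scores = R @ V @ V.T                              # (users, items)
user0_top = np.argsort(-scores[0])                # item ranking for user 0
```

`V @ V.T` is the low-rank item similarity model; only the thin factor `V` needs to be stored, which is what makes the approach "big data friendly".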
Inference of High-dimensional Autoregressive Generalized Linear Models
Title | Inference of High-dimensional Autoregressive Generalized Linear Models |
Authors | Eric C. Hall, Garvesh Raskutti, Rebecca Willett |
Abstract | Vector autoregressive models characterize a variety of time series in which linear combinations of current and past observations can be used to accurately predict future observations. For instance, each element of an observation vector could correspond to a different node in a network, and the parameters of an autoregressive model would correspond to the impact of the network structure on the time series evolution. Often these models are used successfully in practice to learn the structure of social, epidemiological, financial, or biological neural networks. However, little is known about statistical guarantees on estimates of such models in non-Gaussian settings. This paper addresses the inference of the autoregressive parameters and associated network structure within a generalized linear model framework that includes Poisson and Bernoulli autoregressive processes. At the heart of this analysis is a sparsity-regularized maximum likelihood estimator. While sparsity-regularization is well-studied in the statistics and machine learning communities, those analysis methods cannot be applied to autoregressive generalized linear models because of the correlations and potential heteroscedasticity inherent in the observations. Sample complexity bounds are derived using a combination of martingale concentration inequalities and modern empirical process techniques for dependent random variables. These bounds, which are supported by several simulation studies, characterize the impact of various network parameters on estimator performance. |
Tasks | Time Series |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02693v2 |
http://arxiv.org/pdf/1605.02693v2.pdf | |
PWC | https://paperswithcode.com/paper/inference-of-high-dimensional-autoregressive |
Repo | |
Framework | |
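A toy instance of the sparsity-regularized MLE at the heart of the analysis, for the Poisson autoregressive case: a simulated three-node network with a single true edge, estimated by proximal gradient descent on the penalized negative log-likelihood. All sizes, rates and step sizes are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Simulate a small Poisson autoregressive network (hypothetical):
# x_{t+1,i} ~ Poisson(exp(nu_i + sum_j A_ij * x_{t,j})),
# with a single true edge A[0, 1] = 0.3 and nu = 0.
T, p = 500, 3
A_true = np.zeros((p, p)); A_true[0, 1] = 0.3
X = np.zeros((T, p)); x = np.ones(p)
for t in range(T):
    X[t] = x
    x = rng.poisson(np.exp(A_true @ x))

past, future = X[:-1], X[1:]

# Sparsity-regularized MLE: proximal gradient on the negative Poisson
# log-likelihood plus lam * ||A||_1.
lam, lr = 0.01, 0.001
A = np.zeros((p, p)); nu = np.zeros(p)
for _ in range(2000):
    rate = np.exp(nu + past @ A.T)               # conditional intensities
    resid = rate - future                        # d(-loglik)/d(linear term)
    A = soft_threshold(A - lr * (resid.T @ past) / (T - 1), lr * lam)
    nu -= lr * resid.mean(axis=0)
```

The paper's contribution is the sample-complexity analysis of exactly this kind of estimator under the dependence and heteroscedasticity of the autoregressive observations; the estimator itself is standard.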
All Fingers are not Equal: Intensity of References in Scientific Articles
Title | All Fingers are not Equal: Intensity of References in Scientific Articles |
Authors | Tanmoy Chakraborty, Ramasuri Narayanam |
Abstract | Research accomplishment is usually measured by considering all citations with equal importance, thus ignoring the wide variety of purposes for which an article is cited. Here, we posit that measuring the intensity of a reference is crucial not only to gain a better understanding of research endeavors, but also to improve the quality of citation-based applications. To this end, we collect a rich annotated dataset with references labeled by intensity, and propose GraLap, a novel graph-based semi-supervised model for labeling the intensity of references. Experiments with AAN datasets show a significant improvement over the baselines in recovering the true labels of the references (46% better correlation). Finally, we present four applications to demonstrate how knowledge of reference intensity leads to the design of better real-world applications. |
Tasks | |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00081v1 |
http://arxiv.org/pdf/1609.00081v1.pdf | |
PWC | https://paperswithcode.com/paper/all-fingers-are-not-equal-intensity-of |
Repo | |
Framework | |
Application of Convolutional Neural Network for Image Classification on Pascal VOC Challenge 2012 dataset
Title | Application of Convolutional Neural Network for Image Classification on Pascal VOC Challenge 2012 dataset |
Authors | Suyash Shetty |
Abstract | In this project we work on creating a model to classify images for the Pascal VOC Challenge 2012. We use convolutional neural networks trained on a single GPU instance provided by Amazon via their cloud service Amazon Web Services (AWS) to classify images in the Pascal VOC 2012 data set. We train multiple convolutional neural network models and finally settle on the best model which produced a validation accuracy of 85.6% and a testing accuracy of 85.24%. |
Tasks | Image Classification |
Published | 2016-07-13 |
URL | http://arxiv.org/abs/1607.03785v1 |
http://arxiv.org/pdf/1607.03785v1.pdf | |
PWC | https://paperswithcode.com/paper/application-of-convolutional-neural-network |
Repo | |
Framework | |
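For reference, the forward pass of a convolutional classifier of the kind trained in the paper reduces to convolution, nonlinearity, pooling and a softmax over the 20 Pascal VOC classes. A dependency-free sketch with random, untrained (hypothetical) weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, kernels):
    """Valid-mode 2-D correlation: img (H, W), kernels (K, kh, kw)."""
    K, kh, kw = kernels.shape
    H, W = img.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(img[i:i + kh, j:j + kw] * kernels[k])
    return out

def forward(img, kernels, W_fc):
    feat = np.maximum(conv2d(img, kernels), 0.0)   # conv + ReLU
    pooled = feat.mean(axis=(1, 2))                # global average pooling
    logits = pooled @ W_fc                         # fully connected layer
    e = np.exp(logits - logits.max())              # stable softmax
    return e / e.sum()                             # class probabilities

img = rng.random((8, 8))                           # toy grayscale input
kernels = rng.standard_normal((4, 3, 3))           # 4 learned filters (random here)
W_fc = rng.standard_normal((4, 20))                # 20 Pascal VOC classes
probs = forward(img, kernels, W_fc)
```

Real models stack many such layers and are trained by backpropagation on a GPU, as described in the abstract; this single-layer version only illustrates the computation being learned.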