Paper Group ANR 438
Weighted Unsupervised Learning for 3D Object Detection. An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition. Unbiased Sparse Subspace Clustering By Selective Pursuit. Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation. Investigating gated recurrent neural networks for speech …
Weighted Unsupervised Learning for 3D Object Detection
Title | Weighted Unsupervised Learning for 3D Object Detection |
Authors | Kamran Kowsari, Manal H. Alassaf |
Abstract | This paper introduces a novel weighted unsupervised learning method for object detection using an RGB-D camera. The technique can detect moving objects in noisy environments captured by an RGB-D camera. The main contribution of this paper is a real-time algorithm that uses weighted clustering to detect each object as a separate cluster. In a preprocessing step, the algorithm calculates the 3D position (X, Y, Z) and RGB color of each data point, and then calculates each data point’s normal vector from the point’s neighbors. After preprocessing, the algorithm calculates k weights for each data point; each weight indicates membership. The result is a clustering of the objects in the scene. |
Tasks | 3D Object Detection, Object Detection |
Published | 2016-02-18 |
URL | http://arxiv.org/abs/1602.05920v2 |
http://arxiv.org/pdf/1602.05920v2.pdf | |
PWC | https://paperswithcode.com/paper/weighted-unsupervised-learning-for-3d-object |
Repo | |
Framework | |
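The abstract above says each data point receives k weights that indicate membership, but it does not give the formula. As a minimal sketch only, the following uses the fuzzy c-means membership rule as a stand-in; the function name `membership_weights`, the fuzzifier `m`, and the use of plain Euclidean distance over the feature vector are all assumptions and not taken from the paper.

```python
import math


def membership_weights(point, centers, m=2.0):
    """Return one soft membership weight per cluster center (weights sum to 1).

    Assumption: fuzzy c-means style weights, used here only for illustration.
    `point` and every entry of `centers` are equal-length feature tuples,
    for example (x, y, z) or (x, y, z, r, g, b, nx, ny, nz).
    """
    dists = [math.dist(point, c) for c in centers]

    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]

    # Closer centers get larger weights; m > 1 controls how soft the split is.
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


weights = membership_weights((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)])
```

A hard clustering can be read off by assigning each point to the center with the largest weight.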
An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition
Title | An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition |
Authors | Yan Yan, Hanzi Wang, Cuihua Li, Chenhui Yang, Bineng Zhong |
Abstract | In this paper, an effective unconstrained correlation filter called Unconstrained Optimal Origin Tradeoff Filter (UOOTF) is presented and applied to robust face recognition. Compared with the conventional correlation filters in Class-dependence Feature Analysis (CFA), UOOTF improves the overall performance for unseen patterns by removing the hard constraints on the origin correlation outputs during the filter design. To handle non-linearly separable distributions between different classes, we further develop a non-linear extension of UOOTF based on the kernel technique. The kernel extension of UOOTF allows for higher flexibility of the decision boundary due to a wider range of non-linearity properties. Experimental results demonstrate the effectiveness of the proposed unconstrained correlation filter and its kernelization in the task of face recognition. |
Tasks | Face Recognition, Robust Face Recognition |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07800v1 |
http://arxiv.org/pdf/1603.07800v1.pdf | |
PWC | https://paperswithcode.com/paper/an-effective-unconstrained-correlation-filter |
Repo | |
Framework | |
Unbiased Sparse Subspace Clustering By Selective Pursuit
Title | Unbiased Sparse Subspace Clustering By Selective Pursuit |
Authors | Hanno Ackermann, Michael Ying Yang, Bodo Rosenhahn |
Abstract | Sparse subspace clustering (SSC) is an elegant approach for unsupervised segmentation if the data points of each cluster are located in linear subspaces. This model applies, for instance, in motion segmentation if some restrictions on the camera model hold. SSC requires that problems based on the $l_1$-norm are solved to infer which points belong to the same subspace. If these unknown subspaces are well-separated this algorithm is guaranteed to succeed. The algorithm rests upon the assumption that points on the same subspace are well spread. The question what happens if this condition is violated has not yet been investigated. In this work, the effect of particular distributions on the same subspace will be analyzed. It will be shown that SSC fails to infer correct labels if points on the same subspace fall into more than one cluster. |
Tasks | Motion Segmentation |
Published | 2016-09-16 |
URL | http://arxiv.org/abs/1609.05057v2 |
http://arxiv.org/pdf/1609.05057v2.pdf | |
PWC | https://paperswithcode.com/paper/unbiased-sparse-subspace-clustering-by |
Repo | |
Framework | |
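For orientation, the $l_1$-norm problem mentioned in the abstract above is usually written, in the standard noise-free SSC formulation from the literature, as one program per data point. The notation is an assumption, since the abstract gives none: $x_i$ is the $i$-th data point, $X$ is the matrix whose columns are all data points, and $c_i$ is the coefficient vector expressing $x_i$ in terms of the other points.

$$
\min_{c_i} \; \|c_i\|_1 \quad \text{subject to} \quad x_i = X c_i, \quad c_{ii} = 0
$$

The nonzero entries of $c_i$ ideally select points from the same subspace as $x_i$; stacking the $c_i$ into a matrix $C$, the symmetric affinity $|C| + |C|^\top$ is then passed to spectral clustering.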
Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation
Title | Object Detection, Tracking, and Motion Segmentation for Object-level Video Segmentation |
Authors | Benjamin Drayer, Thomas Brox |
Abstract | We present an approach for object segmentation in videos that combines frame-level object detection with concepts from object tracking and motion segmentation. The approach extracts temporally consistent object tubes based on an off-the-shelf detector. Besides the class label for each tube, this provides a location prior that is independent of motion. For the final video segmentation, we combine this information with motion cues. The method overcomes the typical problems of weakly supervised/unsupervised video segmentation, such as scenes with no motion, dominant camera motion, and objects that move as a unit. In contrast to most tracking methods, it provides an accurate, temporally consistent segmentation of each object. We report results on four video segmentation datasets: YouTube Objects, SegTrackv2, egoMotion, and FBMS. |
Tasks | Motion Segmentation, Object Detection, Object Tracking, Semantic Segmentation, Video Semantic Segmentation |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03066v1 |
http://arxiv.org/pdf/1608.03066v1.pdf | |
PWC | https://paperswithcode.com/paper/object-detection-tracking-and-motion |
Repo | |
Framework | |
Investigating gated recurrent neural networks for speech synthesis
Title | Investigating gated recurrent neural networks for speech synthesis |
Authors | Zhizheng Wu, Simon King |
Abstract | Recently, recurrent neural networks (RNNs) as powerful sequence models have re-emerged as a potential acoustic model for statistical parametric speech synthesis (SPSS). The long short-term memory (LSTM) architecture is particularly attractive because it addresses the vanishing gradient problem in standard RNNs, making them easier to train. Although recent studies have demonstrated that LSTMs can achieve significantly better performance on SPSS than deep feed-forward neural networks, little is known about why. Here we attempt to answer two questions: a) why do LSTMs work well as a sequence model for SPSS; b) which component (e.g., input gate, output gate, forget gate) is most important. We present a visual analysis alongside a series of experiments, resulting in a proposal for a simplified architecture. The simplified architecture has significantly fewer parameters than an LSTM, thus reducing generation complexity considerably without degrading quality. |
Tasks | Speech Synthesis |
Published | 2016-01-11 |
URL | http://arxiv.org/abs/1601.02539v1 |
http://arxiv.org/pdf/1601.02539v1.pdf | |
PWC | https://paperswithcode.com/paper/investigating-gated-recurrent-neural-networks |
Repo | |
Framework | |
A machine learning method for the large-scale evaluation of urban visual environment
Title | A machine learning method for the large-scale evaluation of urban visual environment |
Authors | Lun Liu, Hui Wang, Chunyang Wu |
Abstract | Given the size of modern cities in the urbanising age, it is beyond the perceptual capacity of most people to develop a good knowledge about the beauty and ugliness of the city at every street corner. Correspondingly, for planners, it is also difficult to accurately answer questions like ‘where are the worst-looking places in the city that regeneration should give first consideration’, or ‘in the fast urbanising cities, how is the city appearance changing’, etc. To address this issue, we here present a computer vision method for the large-scale and automatic evaluation of the urban visual environment, by leveraging state-of-the-art machine learning techniques and the wide-coverage street view images. From the various factors that are at work, we choose two key features, the visual quality of street facade and the continuity of street wall, as the starting point of this line of analysis. In order to test the validity of this method, we further compare the machine ratings with ratings collected on site from 752 passers-by on fifty-six locations. We show that the machine learning model can produce a good estimation of people’s real visual experience, and it holds much potential for various tasks in terms of urban design evaluation, culture identification, etc. |
Tasks | |
Published | 2016-08-11 |
URL | http://arxiv.org/abs/1608.03396v1 |
http://arxiv.org/pdf/1608.03396v1.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-method-for-the-large-scale |
Repo | |
Framework | |
Effective Computer Model For Recognizing Nationality From Frontal Image
Title | Effective Computer Model For Recognizing Nationality From Frontal Image |
Authors | Bat-Erdene Batsukh, Ganbat Tsend |
Abstract | We introduce a new, effective computer model for extracting nationality from a frontal image of a candidate using face part color, size, and distances, based on deep research. Determining face part size, color, and distances depends on a variety of factors, including image quality, lighting conditions, rotation angle, occlusion, and facial emotion. Therefore, we first detect a face in the image and then convert the image into the real input. After that, we can determine the candidate’s gender, face shape, key points, and face parts. Finally, we return the result, based on a comparison of sizes and distances with the sample measurement table database. While we were measuring samples, there were big differences between images by gender and face shape. Input images must be frontal face images with smooth lighting and no rotation angle. The model can be used in the military, police, defense, healthcare, and technology sectors. Finally, the computer can distinguish nationality from the face image. |
Tasks | |
Published | 2016-03-15 |
URL | http://arxiv.org/abs/1603.04550v1 |
http://arxiv.org/pdf/1603.04550v1.pdf | |
PWC | https://paperswithcode.com/paper/effective-computer-model-for-recognizing |
Repo | |
Framework | |
Data Analytics using Ontologies of Management Theories: Towards Implementing ‘From Theory to Practice’
Title | Data Analytics using Ontologies of Management Theories: Towards Implementing ‘From Theory to Practice’ |
Authors | Henry M. Kim, Jackie Ho Nam Cheung, Marek Laskowski, Iryna Gel |
Abstract | We explore how computational ontologies can be impactful vis-a-vis the developing discipline of “data science.” We posit an approach wherein management theories are represented as formal axioms, and then applied to draw inferences about data that reside in corporate databases. That is, management theories would be implemented as rules within a data analytics engine. We demonstrate a case study development of such an ontology by formally representing an accounting theory in First-Order Logic. Though quite preliminary, the idea that an information technology, namely ontologies, can potentially actualize the academic cliche, “From Theory to Practice,” and be applicable to the burgeoning domain of data analytics is novel and exciting. |
Tasks | |
Published | 2016-08-28 |
URL | http://arxiv.org/abs/1608.07846v1 |
http://arxiv.org/pdf/1608.07846v1.pdf | |
PWC | https://paperswithcode.com/paper/data-analytics-using-ontologies-of-management |
Repo | |
Framework | |
Causal Discovery for Manufacturing Domains
Title | Causal Discovery for Manufacturing Domains |
Authors | Katerina Marazopoulou, Rumi Ghosh, Prasanth Lade, David Jensen |
Abstract | Yield and quality improvement is of paramount importance to any manufacturing company. One of the ways of improving yield is through discovery of the root causal factors affecting yield. We propose the use of data-driven interpretable causal models to identify key factors affecting yield. We focus on factors that are measured in different stages of production and testing in the manufacturing cycle of a product. We apply causal structure learning techniques on real data collected from this line. Specifically, the goal of this work is to learn interpretable causal models from observational data produced by manufacturing lines. Emphasis has been given to the interpretability of the models to make them actionable in the field of manufacturing. We highlight the challenges presented by assembly line data and propose ways to alleviate them. We also identify unique characteristics of data originating from assembly lines and how to leverage them in order to improve causal discovery. Standard evaluation techniques for causal structure learning show that the learned causal models seem to closely represent the underlying latent causal relationship between different factors in the production process. These results were also validated by manufacturing domain experts who found them promising. This work demonstrates how data mining and knowledge discovery can be used for root cause analysis in the domain of manufacturing and connected industry. |
Tasks | Causal Discovery |
Published | 2016-05-13 |
URL | http://arxiv.org/abs/1605.04056v2 |
http://arxiv.org/pdf/1605.04056v2.pdf | |
PWC | https://paperswithcode.com/paper/causal-discovery-for-manufacturing-domains |
Repo | |
Framework | |
Deep neural networks are robust to weight binarization and other non-linear distortions
Title | Deep neural networks are robust to weight binarization and other non-linear distortions |
Authors | Paul Merolla, Rathinakumar Appuswamy, John Arthur, Steve K. Esser, Dharmendra Modha |
Abstract | Recent results show that deep neural networks achieve excellent performance even when, during training, weights are quantized and projected to a binary representation. Here, we show that this is just the tip of the iceberg: these same networks, during testing, also exhibit a remarkable robustness to distortions beyond quantization, including additive and multiplicative noise, and a class of non-linear projections where binarization is just a special case. To quantify this robustness, we show that one such network achieves 11% test error on CIFAR-10 even with 0.68 effective bits per weight. Furthermore, we find that a common training heuristic–namely, projecting quantized weights during backpropagation–can be altered (or even removed) and networks still achieve a base level of robustness during testing. Specifically, training with weight projections other than quantization also works, as does simply clipping the weights, both of which have never been reported before. We confirm our results for CIFAR-10 and ImageNet datasets. Finally, drawing from these ideas, we propose a stochastic projection rule that leads to a new state of the art network with 7.64% test error on CIFAR-10 using no data augmentation. |
Tasks | Data Augmentation, Quantization |
Published | 2016-06-07 |
URL | http://arxiv.org/abs/1606.01981v1 |
http://arxiv.org/pdf/1606.01981v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-are-robust-to-weight |
Repo | |
Framework | |
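The weight distortions named in the abstract above (binarization and clipping) can be illustrated with a small sketch. This is not the authors' code: the function names are hypothetical, and `stochastic_binarize` shows a commonly used stochastic rounding rule, which is an assumption and not necessarily the stochastic projection rule the paper proposes.

```python
import random


def binarize(weights):
    """Deterministic projection: map each weight to +1 or -1 by its sign (0 maps to +1)."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def clip(weights, lo=-1.0, hi=1.0):
    """Clipping: keep each weight inside the interval [lo, hi]."""
    return [min(max(w, lo), hi) for w in weights]


def stochastic_binarize(weights, rng):
    """Assumed stochastic rule: after clipping to [-1, 1], pick +1 with probability (w + 1) / 2."""
    out = []
    for w in clip(weights):
        p_plus = (w + 1.0) / 2.0
        out.append(1.0 if rng.random() < p_plus else -1.0)
    return out


rng = random.Random(0)  # fixed seed so repeated runs give the same draw
full_precision = [0.3, -0.2, 1.5, -2.0]
print(binarize(full_precision))
print(clip(full_precision))
print(stochastic_binarize(full_precision, rng))
```

In this sketch the stochastic rule agrees with the deterministic one at the extremes (weights at or beyond +1 or -1) and randomizes only in between.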
Factors in Finetuning Deep Model for object detection
Title | Factors in Finetuning Deep Model for object detection |
Authors | Wanli Ouyang, Xiaogang Wang, Cong Zhang, Xiaokang Yang |
Abstract | Finetuning from a pretrained deep model is found to yield state-of-the-art performance for many vision tasks. This paper investigates many factors that influence the performance in finetuning for object detection. There is a long-tailed distribution of sample numbers for classes in object detection. Our analysis and empirical results show that classes with more samples have higher impact on the feature learning. And it is better to make the sample number more uniform across classes. Generic object detection can be considered as multiple equally important tasks. Detection of each class is a task. These classes/tasks have their individuality in discriminative visual appearance representation. Taking this individuality into account, we cluster objects into visually similar class groups and learn deep representations for these groups separately. A hierarchical feature learning scheme is proposed. In this scheme, the knowledge from the group with large number of classes is transferred for learning features in its sub-groups. Finetuned on the GoogLeNet model, experimental results show 4.7% absolute mAP improvement of our approach on the ImageNet object detection dataset without increasing much computational cost at the testing stage. |
Tasks | Object Detection |
Published | 2016-01-20 |
URL | http://arxiv.org/abs/1601.05150v2 |
http://arxiv.org/pdf/1601.05150v2.pdf | |
PWC | https://paperswithcode.com/paper/factors-in-finetuning-deep-model-for-object |
Repo | |
Framework | |
The Effect of Heteroscedasticity on Regression Trees
Title | The Effect of Heteroscedasticity on Regression Trees |
Authors | Will Ruth, Thomas Loughin |
Abstract | Regression trees are becoming increasingly popular as omnibus predicting tools and as the basis of numerous modern statistical learning ensembles. Part of their popularity is their ability to create a regression prediction without ever specifying a structure for the mean model. However, the method implicitly assumes homogeneous variance across the entire explanatory-variable space. It is unknown how the algorithm behaves when faced with heteroscedastic data. In this study, we assess the performance of the most popular regression-tree algorithm in a single-variable setting under a very simple step-function model for heteroscedasticity. We use simulation to show that the locations of splits, and hence the ability to accurately predict means, are both adversely influenced by the change in variance. We identify the pruning algorithm as the main concern, although the effects on the splitting algorithm may be meaningful in some applications. |
Tasks | |
Published | 2016-06-16 |
URL | http://arxiv.org/abs/1606.05273v1 |
http://arxiv.org/pdf/1606.05273v1.pdf | |
PWC | https://paperswithcode.com/paper/the-effect-of-heteroscedasticity-on |
Repo | |
Framework | |
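The single-variable setting described in the abstract above can be approximated with a short simulation. The sketch below is an illustration under stated assumptions, not the study's protocol: `best_split` is a plain squared-error split search of the kind regression trees commonly use, `simulate` is a hypothetical data generator with a step in the mean and a step in the standard deviation, and pruning, which the abstract identifies as the main concern, is not modeled.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return the threshold on x that minimizes the total squared error of the two sides."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best_cost = cost
    return best_threshold


def simulate(n, mean_jump_at, var_jump_at, sd_low, sd_high, rng):
    """Draw x uniformly on [0, 1); the mean steps from 0 to 1 and the sd steps from low to high."""
    xs = [rng.random() for _ in range(n)]
    ys = []
    for x in xs:
        mean = 1.0 if x >= mean_jump_at else 0.0
        sd = sd_high if x >= var_jump_at else sd_low
        ys.append(rng.gauss(mean, sd))
    return xs, ys


rng = random.Random(42)
xs, ys = simulate(200, mean_jump_at=0.5, var_jump_at=0.5, sd_low=0.1, sd_high=2.0, rng=rng)
print("chosen split:", best_split(xs, ys))
```

Repeating the simulation over many seeds and comparing the chosen split with `mean_jump_at` gives a rough picture of how the variance step disturbs split placement.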
Subspace clustering based on low rank representation and weighted nuclear norm minimization
Title | Subspace clustering based on low rank representation and weighted nuclear norm minimization |
Authors | Yu Song, Yiquan Wu |
Abstract | Subspace clustering refers to the problem of segmenting a set of data points approximately drawn from a union of multiple linear subspaces. Aiming at the subspace clustering problem, various subspace clustering algorithms have been proposed, and low rank representation based subspace clustering is a very promising and efficient subspace clustering algorithm. The low rank representation method seeks the lowest rank representation among all the candidates that can represent the data points as linear combinations of the bases in a given dictionary. Nuclear norm minimization is adopted to minimize the rank of the representation matrix. However, the nuclear norm is not a very good approximation of the rank of a matrix, and the representation matrix thus obtained can be of high rank, which will affect the final clustering accuracy. The weighted nuclear norm (WNN) is a better approximation of the rank of a matrix, and WNN is adopted in this paper to describe the rank of the representation matrix. The convex program is solved via the conventional alternating direction method of multipliers (ADMM) and the linearized alternating direction method of multipliers (LADMM); the resulting methods are referred to as WNNM-LRR and WNNM-LRR(L), respectively. Experimental results show that, compared with the low rank representation method and several other state-of-the-art subspace clustering methods, WNNM-LRR and WNNM-LRR(L) can get higher clustering accuracy. |
Tasks | |
Published | 2016-10-12 |
URL | http://arxiv.org/abs/1610.03604v3 |
http://arxiv.org/pdf/1610.03604v3.pdf | |
PWC | https://paperswithcode.com/paper/subspace-clustering-based-on-low-rank |
Repo | |
Framework | |
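As background for the abstract above, the low rank representation problem is commonly stated in the literature as follows. The symbols are assumptions, since the abstract introduces none: $X$ is the data matrix, $A$ the dictionary, $Z$ the representation matrix, $E$ an error term, and $\lambda > 0$ a trade-off parameter.

$$
\min_{Z, E} \; \|Z\|_{*} + \lambda \|E\|_{2,1} \quad \text{subject to} \quad X = AZ + E
$$

The nuclear norm $\|Z\|_{*}$ is the sum of the singular values $\sigma_i(Z)$. The weighted nuclear norm replaces that plain sum with a weighted one, using non-negative weights $w_i$:

$$
\|Z\|_{w,*} = \sum_i w_i \, \sigma_i(Z)
$$

With all $w_i = 1$ this reduces to the nuclear norm; how the paper chooses the weights is not stated in the abstract.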
Crowd Counting via Weighted VLAD on Dense Attribute Feature Maps
Title | Crowd Counting via Weighted VLAD on Dense Attribute Feature Maps |
Authors | Biyun Sheng, Chunhua Shen, Guosheng Lin, Jun Li, Wankou Yang, Changyin Sun |
Abstract | Crowd counting is an important task in computer vision, which has many applications in video surveillance. Although the regression-based framework has achieved great improvements for crowd counting, how to improve the discriminative power of image representation is still an open problem. Conventional holistic features used in crowd counting often fail to capture semantic attributes and spatial cues of the image. In this paper, we propose integrating semantic information into learning locality-aware feature sets for accurate crowd counting. First, with the help of convolutional neural network (CNN), the original pixel space is mapped onto a dense attribute feature map, where each dimension of the pixel-wise feature indicates the probabilistic strength of a certain semantic class. Then, locality-aware features (LAF) built on the idea of spatial pyramids on neighboring patches are proposed to explore more spatial context and local information. Finally, the traditional VLAD encoding method is extended to a more generalized form in which diverse coefficient weights are taken into consideration. Experimental results validate the effectiveness of our presented method. |
Tasks | Crowd Counting |
Published | 2016-04-29 |
URL | http://arxiv.org/abs/1604.08660v1 |
http://arxiv.org/pdf/1604.08660v1.pdf | |
PWC | https://paperswithcode.com/paper/crowd-counting-via-weighted-vlad-on-dense |
Repo | |
Framework | |
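The VLAD encoding step mentioned in the abstract above can be sketched as follows. Standard VLAD assigns each local descriptor to its nearest codebook centroid and sums the residuals per centroid; the per-descriptor `weights` argument here is a hypothetical way of showing coefficient weights and is an assumption, since the abstract does not specify the generalized form.

```python
import math


def weighted_vlad(descriptors, centroids, weights=None):
    """Encode descriptors as concatenated, L2-normalized weighted residual sums.

    Assumption: one scalar weight per descriptor; uniform weights give plain VLAD.
    """
    dim = len(centroids[0])
    if weights is None:
        weights = [1.0] * len(descriptors)

    # One residual accumulator per centroid.
    blocks = [[0.0] * dim for _ in centroids]
    for desc, w in zip(descriptors, weights):
        # Hard assignment to the nearest centroid.
        k = min(range(len(centroids)), key=lambda j: math.dist(desc, centroids[j]))
        for d in range(dim):
            blocks[k][d] += w * (desc[d] - centroids[k][d])

    # Concatenate the blocks and L2-normalize the result.
    flat = [v for block in blocks for v in block]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


codebook = [(0.0, 0.0), (10.0, 10.0)]
local_features = [(3.0, 0.0), (0.0, 4.0)]
print(weighted_vlad(local_features, codebook))
```

The output length is the number of centroids times the descriptor dimension, so the encoding has a fixed size regardless of how many local descriptors an image produces.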
DeepSetNet: Predicting Sets with Deep Neural Networks
Title | DeepSetNet: Predicting Sets with Deep Neural Networks |
Authors | S. Hamid Rezatofighi, Vijay Kumar B G, Anton Milan, Ehsan Abbasnejad, Anthony Dick, Ian Reid |
Abstract | This paper addresses the task of set prediction using deep learning. This is important because the outputs of many computer vision tasks, including image tagging and object detection, are naturally expressed as sets of entities rather than vectors. As opposed to a vector, the size of a set is not fixed in advance, and it is invariant to the ordering of entities within it. We define a likelihood for a set distribution and learn its parameters using a deep neural network. We also derive a loss for predicting a discrete distribution corresponding to set cardinality. Set prediction is demonstrated on the problem of multi-class image classification. Moreover, we show that the proposed cardinality loss can also trivially be applied to the tasks of object counting and pedestrian detection. Our approach outperforms existing methods in all three cases on standard datasets. |
Tasks | Image Classification, Object Counting, Object Detection, Pedestrian Detection |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.08998v5 |
http://arxiv.org/pdf/1611.08998v5.pdf | |
PWC | https://paperswithcode.com/paper/deepsetnet-predicting-sets-with-deep-neural |
Repo | |
Framework | |
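To make the idea of a set-valued output concrete, the sketch below decodes a set from two hypothetical network outputs: a score per label and a discrete distribution over set cardinality. The decoding rule (take the most likely cardinality m, then keep the m highest-scoring labels) is an assumption about inference and is not stated in the abstract; `predict_set` and its argument names are illustrative only.

```python
def predict_set(label_scores, cardinality_probs):
    """Return the m highest-scoring labels, where m is the most likely set size.

    label_scores: dict mapping label -> score (hypothetical network output).
    cardinality_probs: list where entry k is the probability that the set has k elements.
    """
    m = max(range(len(cardinality_probs)), key=lambda k: cardinality_probs[k])
    ranked = sorted(label_scores, key=label_scores.get, reverse=True)
    return set(ranked[:m])


scores = {"cat": 0.9, "dog": 0.4, "car": 0.7}
size_distribution = [0.1, 0.2, 0.6, 0.1]  # most mass on a set of size 2
print(predict_set(scores, size_distribution))
```

Because the result is a Python `set`, it has no ordering and its size varies per input, matching the two properties the abstract highlights.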