Paper Group ANR 302
Learning Mixture of Gaussians with Streaming Data
Title | Learning Mixture of Gaussians with Streaming Data |
Authors | Aditi Raghunathan, Ravishankar Krishnaswamy, Prateek Jain |
Abstract | In this paper, we study the problem of learning a mixture of Gaussians with streaming data: given a stream of $N$ points in $d$ dimensions generated by an unknown mixture of $k$ spherical Gaussians, the goal is to estimate the model parameters using a single pass over the data stream. We analyze a streaming version of the popular Lloyd’s heuristic and show that the algorithm estimates all the unknown centers of the component Gaussians accurately if they are sufficiently separated. Assuming each pair of centers is $C\sigma$ distant with $C=\Omega((k\log k)^{1/4})$, where $\sigma^2$ is the maximum variance of any Gaussian component, we show that asymptotically the algorithm estimates the centers optimally (up to constants); our center separation requirement matches the best known result for spherical Gaussians (Vempala and Wang). For finite samples, we show that a bias term based on the initial estimate decreases at an $O(1/{\rm poly}(N))$ rate while the variance decreases at the nearly optimal rate of $\sigma^2 d/N$. Our analysis requires seeding the algorithm with a good initial estimate of the true cluster centers, for which we provide an online PCA based clustering algorithm. Indeed, the asymptotic per-step time complexity of our algorithm is the optimal $d\cdot k$ while the space complexity is $O(dk\log k)$. In addition to the bias and variance terms which tend to $0$, the hard-thresholding based updates of the streaming Lloyd’s algorithm are agnostic to the data distribution and hence incur an approximation error that cannot be avoided. However, by using a streaming version of the classical (soft-thresholding-based) EM method that exploits the Gaussian distribution explicitly, we show that for a mixture of two Gaussians the true means can be estimated consistently, with estimation error decreasing at a nearly optimal rate and tending to $0$ as $N\rightarrow \infty$. |
Tasks | |
Published | 2017-07-08 |
URL | http://arxiv.org/abs/1707.02391v1 |
http://arxiv.org/pdf/1707.02391v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-mixture-of-gaussians-with-streaming |
Repo | |
Framework | |
On the Discrepancy Between Kleinberg’s Clustering Axioms and $k$-Means Clustering Algorithm Behavior
Title | On the Discrepancy Between Kleinberg’s Clustering Axioms and $k$-Means Clustering Algorithm Behavior |
Authors | Robert Kłopotek, Mieczysław Kłopotek |
Abstract | This paper investigates the validity of Kleinberg’s axioms for clustering functions with respect to the popular clustering algorithm $k$-means. While Kleinberg’s axioms have been discussed heavily in the past, we concentrate here on the case predominantly relevant for the $k$-means algorithm, that is, behavior embedded in Euclidean space. We point at some contradictions and counterintuitive aspects of this axiomatic set within $\mathbb{R}^m$ that have evidently not been discussed so far. Our results suggest that, without clearly defining what kind of clusters we expect, we will not be able to construct a valid axiomatic system. In particular, we look at the shape of and the gaps between the clusters. Finally, we demonstrate that there exist several ways to reconcile the formulation of the axioms with their intended meaning, and that under this reformulation the axioms cease to be contradictory and the real-world $k$-means algorithm conforms to this axiomatic system. |
Tasks | |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04577v2 |
http://arxiv.org/pdf/1702.04577v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-discrepancy-between-kleinbergs |
Repo | |
Framework | |
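One of Kleinberg’s axioms, scale-invariance, is easy to probe empirically for $k$-means: multiplying all coordinates (hence all pairwise distances) by a constant should leave the returned partition unchanged. A minimal sketch — plain batch Lloyd’s with deterministic seeding, purely illustrative and not the paper’s analysis:

```python
def lloyd(points, k, iters=20):
    """Plain batch k-means (Lloyd's algorithm), deterministically seeded
    on the first k points; returns the final partition as a label list."""
    centers = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        for n, p in enumerate(points):
            labels[n] = min(range(k), key=lambda i: sum(
                (a - b) ** 2 for a, b in zip(p, centers[i])))
        for i in range(k):
            members = [p for n, p in enumerate(points) if labels[n] == i]
            if members:
                centers[i] = tuple(sum(cs) / len(members) for cs in zip(*members))
    return labels

def scale_invariant(points, k, alpha):
    """Kleinberg's scale-invariance axiom, checked empirically: rescaling
    every coordinate by alpha must not change the clustering."""
    scaled = [tuple(alpha * c for c in p) for p in points]
    return lloyd(points, k) == lloyd(scaled, k)
```

With the seeding scaled along with the data, the label sequence evolves identically, so this check passes; Kleinberg’s other two axioms (richness and consistency) are where fixed-$k$ algorithms run into trouble.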
Towards Speech Emotion Recognition “in the wild” using Aggregated Corpora and Deep Multi-Task Learning
Title | Towards Speech Emotion Recognition “in the wild” using Aggregated Corpora and Deep Multi-Task Learning |
Authors | Jaebok Kim, Gwenn Englebienne, Khiet P. Truong, Vanessa Evers |
Abstract | One of the challenges in Speech Emotion Recognition (SER) “in the wild” is the large mismatch between training and test data (e.g. speakers and tasks). In order to improve the generalisation capabilities of the emotion models, we propose to use Multi-Task Learning (MTL) with gender and naturalness as auxiliary tasks in deep neural networks. This method was evaluated in within-corpus and various cross-corpus classification experiments that simulate conditions “in the wild”. In comparison to state-of-the-art methods based on Single-Task Learning (STL), we found that our proposed MTL method improved performance significantly. In particular, models using both gender and naturalness achieved larger gains than those using either gender or naturalness separately. This benefit was also found in the high-level representations of the feature space obtained from our proposed method, where discriminative emotional clusters could be observed. |
Tasks | Emotion Recognition, Multi-Task Learning, Speech Emotion Recognition |
Published | 2017-08-13 |
URL | http://arxiv.org/abs/1708.03920v1 |
http://arxiv.org/pdf/1708.03920v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-speech-emotion-recognition-in-the |
Repo | |
Framework | |
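The multi-task setup amounts to one shared network trunk with an emotion head plus gender and naturalness heads, trained on a weighted sum of per-task losses. A minimal sketch of that objective — the auxiliary weight is a hypothetical knob, not a value from the paper:

```python
import math

def cross_entropy(probs, y):
    """Negative log-likelihood of the true class index y under distribution probs."""
    return -math.log(probs[y])

def mtl_loss(emotion_probs, gender_probs, natural_probs, targets, aux_weight=0.5):
    """Joint multi-task objective: the main emotion loss plus a weighted sum
    of the auxiliary gender and naturalness losses, all computed on heads
    that share one trunk. aux_weight is a hypothetical illustration."""
    e, g, n = targets
    return (cross_entropy(emotion_probs, e)
            + aux_weight * (cross_entropy(gender_probs, g)
                            + cross_entropy(natural_probs, n)))
```

Minimising this pushes the shared representation to encode speaker attributes alongside emotion, which is the mechanism behind the cross-corpus gains the abstract reports.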
Toward a Formal Model of Cognitive Synergy
Title | Toward a Formal Model of Cognitive Synergy |
Authors | Ben Goertzel |
Abstract | “Cognitive synergy” refers to a dynamic in which multiple cognitive processes, cooperating to control the same cognitive system, assist each other in overcoming bottlenecks encountered during their internal processing. Cognitive synergy has been posited as a key feature of real-world general intelligence, and has been used explicitly in the design of the OpenCog cognitive architecture. Here category theory and related concepts are used to give a formalization of the cognitive synergy concept. A series of formal models of intelligent agents is proposed, with increasing specificity and complexity: simple reinforcement learning agents; “cognit” agents with an abstract memory and processing model; hypergraph-based agents (in which “cognit” operations are carried out via hypergraphs); hypergraph agents with a rich language of nodes and hyperlinks (such as the OpenCog framework provides); “PGMC” agents whose rich hypergraphs are endowed with cognitive processes guided via Probabilistic Growth and Mining of Combinations; and finally variations of the PrimeAGI design, which is currently being built on top of OpenCog. A notion of cognitive synergy is developed for cognitive processes acting within PGMC agents, based on developing a formal notion of “stuckness,” and defining synergy as a relationship between cognitive processes in which they can help each other out when they get stuck. It is proposed that cognitive processes relating to each other synergetically associate in a certain way with functors that map into each other via natural transformations. Cognitive synergy is proposed to correspond to a certain inequality regarding the relative costs of different paths through certain commutation diagrams. Applications of this notion of cognitive synergy to particular cognitive phenomena, and specific cognitive processes in the PrimeAGI design, are discussed. |
Tasks | |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04361v1 |
http://arxiv.org/pdf/1703.04361v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-a-formal-model-of-cognitive-synergy |
Repo | |
Framework | |
Aircraft Fuselage Defect Detection using Deep Neural Networks
Title | Aircraft Fuselage Defect Detection using Deep Neural Networks |
Authors | Touba Malekzadeh, Milad Abdollahzadeh, Hossein Nejati, Ngai-Man Cheung |
Abstract | To ensure the flight safety of aircraft structures, regular maintenance using visual and nondestructive inspection (NDI) methods is necessary. In this paper, we propose an automatic image-based aircraft defect detection method using Deep Neural Networks (DNNs). To the best of our knowledge, this is the first work on aircraft defect detection using DNNs. We perform a comprehensive evaluation of state-of-the-art feature descriptors and show that the best performance is achieved by the vgg-f DNN as a feature extractor with a linear SVM classifier. To reduce the processing time, we propose to apply the SURF key point detector to identify defect patch candidates. Our experimental results suggest that we can achieve over 96% accuracy at around 15s processing time for a high-resolution (20-megapixel) image on a laptop. |
Tasks | |
Published | 2017-12-26 |
URL | http://arxiv.org/abs/1712.09213v1 |
http://arxiv.org/pdf/1712.09213v1.pdf | |
PWC | https://paperswithcode.com/paper/aircraft-fuselage-defect-detection-using-deep |
Repo | |
Framework | |
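The candidate-pruning idea — classify only patches around detected key points instead of scanning the whole 20-megapixel image — can be illustrated with a toy stand-in. The paper uses SURF; here local intensity variance plays the saliency role, purely to convey the principle of skipping uniform fuselage regions (real SURF responses differ):

```python
def candidate_patches(image, size=3, threshold=0.01):
    """Return (row, col) corners of patches whose local intensity variance
    exceeds a threshold; a hedged stand-in for SURF key point detection
    used to prune patch candidates before classification."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            vals = [image[r + i][c + j] for i in range(size) for j in range(size)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            if var > threshold:  # uniform paint regions fall below threshold
                out.append((r, c))
    return out
```

Only the surviving patches would then be passed through the feature extractor and SVM, which is what brings the per-image time down.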
Standard Steady State Genetic Algorithms Can Hillclimb Faster than Mutation-only Evolutionary Algorithms
Title | Standard Steady State Genetic Algorithms Can Hillclimb Faster than Mutation-only Evolutionary Algorithms |
Authors | Dogan Corus, Pietro S. Oliveto |
Abstract | Explaining to what extent the real power of genetic algorithms lies in the ability of crossover to recombine individuals into higher quality solutions is an important problem in evolutionary computation. In this paper we show how the interplay between mutation and crossover can make genetic algorithms hillclimb faster than their mutation-only counterparts. We devise a Markov chain framework that allows us to rigorously prove an upper bound on the runtime of standard steady state genetic algorithms hillclimbing the OneMax function. The bound establishes that steady-state genetic algorithms are 25% faster than all standard bit mutation-only evolutionary algorithms with static mutation rate, up to lower order terms, for moderate population sizes. The analysis also suggests that populations larger than 2 may be faster still. We present a lower bound for a greedy (2+1) GA that matches the upper bound for populations larger than 2, rigorously proving that 2 individuals cannot outperform larger population sizes under greedy selection and greedy crossover, up to lower order terms. In complementary experiments the best population size is greater than 2 and the greedy genetic algorithms are faster than standard ones, further suggesting that the derived lower bound also holds for the standard steady state (2+1) GA. |
Tasks | |
Published | 2017-08-04 |
URL | http://arxiv.org/abs/1708.01571v2 |
http://arxiv.org/pdf/1708.01571v2.pdf | |
PWC | https://paperswithcode.com/paper/standard-steady-state-genetic-algorithms-can |
Repo | |
Framework | |
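A minimal sketch of the algorithm class analysed: a steady-state (2+1) GA on OneMax, applying uniform crossover to the two parents, standard bit mutation at rate $1/n$, and discarding the worst of the three individuals. Parameter choices here are illustrative defaults, not the paper’s exact setup:

```python
import random

def onemax(x):
    """Number of one-bits: the fitness function being hillclimbed."""
    return sum(x)

def ga_2plus1(n, max_evals=100000, seed=0):
    """Steady-state (2+1) GA on OneMax; returns fitness evaluations used
    to reach the all-ones optimum (or max_evals on failure)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(2)]
    evals = 2
    while max(onemax(p) for p in pop) < n and evals < max_evals:
        a, b = pop
        child = [a[i] if rng.random() < 0.5 else b[i] for i in range(n)]  # uniform crossover
        child = [bit ^ (rng.random() < 1.0 / n) for bit in child]         # bitwise mutation
        evals += 1
        pop = sorted([a, b, child], key=onemax, reverse=True)[:2]         # drop the worst
    return evals
```

The interplay the abstract refers to is visible here: crossover can combine complementary one-bits held by the two parents, so progress is not limited to what mutation alone produces.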
Taking Visual Motion Prediction To New Heightfields
Title | Taking Visual Motion Prediction To New Heightfields |
Authors | Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi |
Abstract | While the basic laws of Newtonian mechanics are well understood, explaining a physical scenario still requires manually modeling the problem with suitable equations and estimating the associated parameters. In order to leverage the approximation capabilities of artificial intelligence techniques in such physics-related contexts, researchers have handcrafted the relevant states and then used neural networks to learn the state transitions using simulation runs as training data. Unfortunately, such approaches are unsuited for modeling complex real-world scenarios, where manually authoring relevant state spaces tends to be tedious and challenging. In this work, we investigate whether neural networks can implicitly learn physical states of real-world mechanical processes based only on visual data while internally modeling non-homogeneous environments, and in the process enable long-term physical extrapolation. We develop a recurrent neural network architecture for this task and also characterize the resultant uncertainties in the form of evolving variance estimates. We evaluate our setup on extrapolating the motion of rolling ball(s) on bowls of varying shape and orientation, and on arbitrary heightfields, using only images as input. We report significant improvements over existing image-based methods both in terms of accuracy of predictions and complexity of scenarios, and report competitive performance with approaches that, unlike us, assume access to internal physical states. |
Tasks | motion prediction |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.09448v1 |
http://arxiv.org/pdf/1712.09448v1.pdf | |
PWC | https://paperswithcode.com/paper/taking-visual-motion-prediction-to-new |
Repo | |
Framework | |
A Comparative Study of CNN, BoVW and LBP for Classification of Histopathological Images
Title | A Comparative Study of CNN, BoVW and LBP for Classification of Histopathological Images |
Authors | Meghana Dinesh Kumar, Morteza Babaie, Shujin Zhu, Shivam Kalra, H. R. Tizhoosh |
Abstract | Despite the progress made in the field of medical imaging, it remains a large area of open research, especially due to the variety of imaging modalities and disease-specific characteristics. This paper is a comparative study describing the potential of using local binary patterns (LBP), deep features and the bag-of-visual-words (BoVW) scheme for the classification of histopathological images. We introduce a new dataset, \emph{KIMIA Path960}, that contains 960 histopathology images belonging to 20 different classes (different tissue types). We make this dataset publicly available. The small size of the dataset and its inter- and intra-class variability make it ideal for initial investigations when comparing image descriptors for search and classification in complex medical imaging cases like histopathology. We investigate deep features, LBP histograms and BoVW to classify the images via leave-one-out validation. The accuracy of image classification obtained using LBP was 90.62% while the highest accuracy using deep features reached 94.72%. The dictionary approach (BoVW) achieved 96.50%. Deep solutions may be able to deliver higher accuracies but they need extensive training with a large number of (balanced) image datasets. |
Tasks | Image Classification |
Published | 2017-09-27 |
URL | http://arxiv.org/abs/1710.01249v1 |
http://arxiv.org/pdf/1710.01249v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comparative-study-of-cnn-bovw-and-lbp-for |
Repo | |
Framework | |
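Of the three descriptors compared, LBP is the simplest to reproduce: each interior pixel is coded by thresholding its 8 neighbours against the centre value, and the 256-bin histogram of codes is the texture feature. A minimal sketch of basic 8-neighbour LBP (the study may use a different variant or radius):

```python
def lbp_histogram(image):
    """8-neighbour local binary patterns over a 2-D grayscale grid,
    returning the 256-bin code histogram used as a texture descriptor."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]   # clockwise neighbour ring
    h, w = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit               # one bit per neighbour
            hist[code] += 1
    return hist
```

Because the code depends only on sign comparisons, the histogram is invariant to monotonic illumination changes, which is part of why LBP remains competitive on stain-varied histopathology images.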
Deep Multi-camera People Detection
Title | Deep Multi-camera People Detection |
Authors | Tatjana Chavdarova, François Fleuret |
Abstract | This paper addresses the problem of multi-view people occupancy map estimation. Existing solutions for this problem either operate per-view, or rely on a background subtraction pre-processing step. Both approaches lessen the detection performance as scenes become more crowded. The former does not exploit joint information, whereas the latter deals with ambiguous input due to the foreground blobs becoming more and more interconnected as the number of targets increases. Although deep learning algorithms have proven to excel on remarkably numerous computer vision tasks, such methods have not yet been applied to this problem, in large part due to the lack of large-scale multi-camera data-sets. The core of our method is an architecture which makes use of monocular pedestrian data-sets, available at larger scale than the multi-view ones, applies parallel processing to the multiple video streams, and jointly utilises them. Our end-to-end deep learning method outperforms existing methods by large margins on the commonly used PETS 2009 data-set. Furthermore, we make publicly available a new three-camera HD data-set. Our source code and trained models will be made available under an open-source license. |
Tasks | |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04593v3 |
http://arxiv.org/pdf/1702.04593v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-multi-camera-people-detection |
Repo | |
Framework | |
Navigator-free EPI Ghost Correction with Structured Low-Rank Matrix Models: New Theory and Methods
Title | Navigator-free EPI Ghost Correction with Structured Low-Rank Matrix Models: New Theory and Methods |
Authors | Rodrigo A. Lobos, Tae Hyung Kim, W. Scott Hoge, Justin P. Haldar |
Abstract | Structured low-rank matrix models have previously been introduced to enable calibrationless MR image reconstruction from sub-Nyquist data, and such ideas have recently been extended to enable navigator-free echo-planar imaging (EPI) ghost correction. This paper presents novel theoretical analysis which shows that, because of uniform subsampling, the structured low-rank matrix optimization problems for EPI data will always have either undesirable or non-unique solutions in the absence of additional constraints. This theory leads us to recommend and investigate problem formulations for navigator-free EPI that incorporate side information from either image-domain or k-space domain parallel imaging methods. The importance of using nonconvex low-rank matrix regularization is also identified. We demonstrate using phantom and \emph{in vivo} data that the proposed methods are able to eliminate ghost artifacts for several navigator-free EPI acquisition schemes, obtaining better performance in comparison to state-of-the-art methods across a range of different scenarios. Results are shown for both single-channel acquisition and highly accelerated multi-channel acquisition. |
Tasks | Image Reconstruction |
Published | 2017-08-16 |
URL | http://arxiv.org/abs/1708.05095v3 |
http://arxiv.org/pdf/1708.05095v3.pdf | |
PWC | https://paperswithcode.com/paper/navigator-free-epi-ghost-correction-with |
Repo | |
Framework | |
Sequential Multi-Class Labeling in Crowdsourcing
Title | Sequential Multi-Class Labeling in Crowdsourcing |
Authors | Qiyu Kang, Wee Peng Tay |
Abstract | We consider a crowdsourcing platform where workers’ responses to questions posed by a crowdsourcer are used to determine the hidden state of a multi-class labeling problem. As workers may be unreliable, we propose to perform sequential questioning in which the questions posed to the workers are designed based on previous questions and answers. We propose a Partially-Observable Markov Decision Process (POMDP) framework to determine the best questioning strategy, subject to the crowdsourcer’s budget constraint. As this POMDP formulation is in general intractable, we develop a suboptimal approach based on a $q$-ary Ulam-Rényi game. We also propose a sampling heuristic, based on our Ulam-Rényi strategy, which can be used in tandem with standard POMDP solvers. We demonstrate through simulations that our approaches outperform a non-sequential strategy based on error correction coding, which does not utilize workers’ previous responses. |
Tasks | |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.02128v2 |
http://arxiv.org/pdf/1711.02128v2.pdf | |
PWC | https://paperswithcode.com/paper/sequential-multi-class-labeling-in |
Repo | |
Framework | |
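The sequential principle — design the next question from the current belief over labels, which is updated from noisy answers — can be sketched with a Bayes update and a greedy mass-splitting question rule. The worker-reliability parameter and the greedy rule below are illustrative assumptions; the paper’s POMDP/Ulam-Rényi strategy is more sophisticated:

```python
def posterior_update(prior, answer_yes, asked_set, p_correct=0.8):
    """Bayes update of the belief over hidden labels after a noisy worker
    answers "is the true label in asked_set?". p_correct is a hypothetical
    worker-reliability parameter."""
    post = []
    for label, p in enumerate(prior):
        truthful_yes = label in asked_set
        likelihood = p_correct if answer_yes == truthful_yes else 1.0 - p_correct
        post.append(p * likelihood)
    z = sum(post)
    return [p / z for p in post]

def next_question(belief):
    """Greedy question design: ask about the most probable labels until
    roughly half the probability mass is covered (a binary-search flavour
    of the Ulam-Renyi idea)."""
    order = sorted(range(len(belief)), key=lambda i: -belief[i])
    chosen, mass = set(), 0.0
    for i in order:
        if mass >= 0.5:
            break
        chosen.add(i)
        mass += belief[i]
    return chosen
```

Alternating these two steps concentrates the belief on the true label using far fewer questions than a fixed, non-adaptive batch — the advantage the simulations quantify.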
A Solution for Crime Scene Reconstruction using Time-of-Flight Cameras
Title | A Solution for Crime Scene Reconstruction using Time-of-Flight Cameras |
Authors | Silvio Giancola, Daniele Piron, Pasquale Poppa, Remo Sala |
Abstract | In this work, we propose a method for the three-dimensional (3D) reconstruction of wide crime scenes, based on a Simultaneous Localization and Mapping (SLAM) approach. We used a Kinect V2 Time-of-Flight (TOF) RGB-D camera to provide colored dense point clouds at a 30 Hz frequency. This device is moved freely (6 degrees of freedom) during scene exploration. The implemented SLAM solution aligns successive point clouds using a 3D keypoint description and matching approach. This type of approach exploits both colorimetric and geometrical information, and permits reconstruction under poor illumination conditions. Our solution has been tested on indoor crime scene and outdoor archaeological site reconstruction, returning a mean error of around one centimeter. It is less precise than an environmental laser scanner solution, but more practical and portable, as well as less cumbersome. Also, the hardware is definitely cheaper. |
Tasks | 3D Reconstruction, Simultaneous Localization and Mapping |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.02033v1 |
http://arxiv.org/pdf/1708.02033v1.pdf | |
PWC | https://paperswithcode.com/paper/a-solution-for-crime-scene-reconstruction |
Repo | |
Framework | |
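Once keypoints are matched between successive point clouds, the alignment step reduces to a closed-form rigid registration. A 2-D sketch of that step — the 2-D analogue of the Kabsch/Procrustes solution, whereas the paper works with full 3-D coloured point clouds:

```python
import math

def align_2d(src, dst):
    """Closed-form rigid alignment of matched 2-D point pairs: returns the
    rotation angle and translation mapping src onto dst (2-D Kabsch)."""
    n = len(src)
    sx = sum(p[0] for p in src) / n; sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n; dy = sum(p[1] for p in dst) / n
    num = den = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        num += ax * by - ay * bx        # cross terms -> sin component
        den += ax * bx + ay * by        # dot terms  -> cos component
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    # translation that carries the rotated source centroid onto the target centroid
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)
```

Chaining such pairwise transforms over the stream of 30 Hz point clouds is what lets the sensor trajectory, and hence the whole scene, be recovered.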
Revisiting hand-crafted feature for action recognition: a set of improved dense trajectories
Title | Revisiting hand-crafted feature for action recognition: a set of improved dense trajectories |
Authors | Kenji Matsui, Toru Tamaki, Gwladys Auffret, Bisser Raytchev, Kazufumi Kaneda |
Abstract | We propose a feature for action recognition called Trajectory-Set (TS), built on top of the improved Dense Trajectory (iDT). The TS feature encodes only trajectories around densely sampled interest points, without any appearance features. Experimental results on the UCF50, UCF101, and HMDB51 action datasets demonstrate that TS is comparable to the state of the art and outperforms many other methods; on HMDB51 it achieves an accuracy of 85.4%, compared to the best accuracy of 80.2% obtained by a deep method. Our code is available on-line at https://github.com/Gauffret/TrajectorySet . |
Tasks | Temporal Action Localization |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10143v1 |
http://arxiv.org/pdf/1711.10143v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-hand-crafted-feature-for-action |
Repo | |
Framework | |
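The trajectory-shape part of (improved) dense trajectories encodes a tracked point as its concatenated frame-to-frame displacements, normalised by the sum of their magnitudes. A minimal sketch of that descriptor (the standard iDT-style normalisation; the TS feature then aggregates sets of such trajectories):

```python
import math

def trajectory_descriptor(track):
    """Trajectory-shape descriptor of one tracked point: concatenated
    frame-to-frame displacement vectors divided by their total magnitude,
    making the descriptor invariant to overall motion speed."""
    disp = [(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(track, track[1:])]
    total = sum(math.hypot(dx, dy) for dx, dy in disp) or 1.0
    return [c / total for d in disp for c in d]
```

A track of L+1 points yields a 2L-dimensional vector; TS builds its representation from such motion-only vectors, with no appearance channels at all.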
Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval
Title | Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval |
Authors | Eng-Jon Ong, Sameed Husain, Miroslaw Bober |
Abstract | This paper addresses the problem of large scale image retrieval, with the aim of accurately ranking the similarity of a large number of images to a given query image. To achieve this, we propose a novel Siamese network. This network consists of two computational strands, each comprising a CNN component followed by a Fisher vector component. The CNN component produces dense, deep convolutional descriptors that are then aggregated by the Fisher vector method. Crucially, we propose to simultaneously learn both the CNN filter weights and the Fisher vector model parameters. This allows us to account for the evolving distribution of deep descriptors over the course of the learning process. We show that the proposed approach gives significant improvements over state-of-the-art methods on the Oxford and Paris image retrieval datasets. Additionally, we provide a baseline performance measure for both these datasets with the inclusion of 1 million distractors. |
Tasks | Image Retrieval |
Published | 2017-02-01 |
URL | http://arxiv.org/abs/1702.00338v1 |
http://arxiv.org/pdf/1702.00338v1.pdf | |
PWC | https://paperswithcode.com/paper/siamese-network-of-deep-fisher-vector |
Repo | |
Framework | |
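The Fisher vector component aggregates the dense CNN descriptors against a GMM; its first-order part is, per component, the posterior-weighted average of whitened residuals. A minimal sketch of that first-order encoding (diagonal-covariance GMM, means-gradient only; the paper’s end-to-end version additionally backpropagates through these GMM parameters):

```python
import math

def fisher_vector_means(descs, weights, means, sigmas):
    """First-order Fisher vector of a descriptor set under a diagonal GMM:
    per component, the posterior-weighted sum of whitened residuals,
    normalised by N * sqrt(weight). Second-order terms are omitted."""
    K, D, N = len(weights), len(means[0]), len(descs)
    fv = [[0.0] * D for _ in range(K)]
    for x in descs:
        # soft assignment: posterior of x under each Gaussian component
        # (the shared (2*pi)^(-D/2) factor cancels in the normalisation)
        p = [weights[k] * math.exp(-0.5 * sum(
                 ((x[d] - means[k][d]) / sigmas[k][d]) ** 2 for d in range(D)))
             / math.prod(sigmas[k]) for k in range(K)]
        z = sum(p)
        for k in range(K):
            g = p[k] / z
            for d in range(D):
                fv[k][d] += g * (x[d] - means[k][d]) / sigmas[k][d]
    return [fv[k][d] / (N * math.sqrt(weights[k]))
            for k in range(K) for d in range(D)]
```

The output dimension is K·D regardless of how many local descriptors an image yields, which is what makes the encoding usable as a fixed-size retrieval signature.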
Aggregation of Classifiers: A Justifiable Information Granularity Approach
Title | Aggregation of Classifiers: A Justifiable Information Granularity Approach |
Authors | Tien Thanh Nguyen, Xuan Cuong Pham, Alan Wee-Chung Liew, Witold Pedrycz |
Abstract | In this study, we introduce a new approach to combining multiple classifiers in an ensemble system. Instead of using the numeric membership values encountered in fixed combining rules, we construct interval membership values associated with each class prediction at the level of the meta-data of an observation, using concepts of information granules. In the proposed method, the uncertainty (diversity) of findings produced by the base classifiers is quantified by interval-based information granules. The discriminative decision model is generated by considering both the bounds and the length of the obtained intervals. We select ten and then fifteen learning algorithms to build heterogeneous ensemble systems and conduct experiments on a number of UCI datasets. The experimental results demonstrate that the proposed approach performs better than benchmark algorithms including six fixed combining methods, one trainable combining method, AdaBoost, Bagging, and Random Subspace. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05411v1 |
http://arxiv.org/pdf/1703.05411v1.pdf | |
PWC | https://paperswithcode.com/paper/aggregation-of-classifiers-a-justifiable |
Repo | |
Framework | |
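The interval idea can be made concrete: per class, collect the base classifiers’ probabilities, form the [min, max] interval, and score it using both its bounds and its length. The scoring rule below is a hypothetical illustration of that principle, not the paper’s justifiable-granularity optimisation:

```python
def interval_scores(base_probs):
    """Per-class interval membership from base-classifier outputs: the
    interval is [min, max] of the classifiers' class probabilities, scored
    by midpoint minus a width penalty so that tight, high intervals win.
    The penalty weight 0.25 is an illustrative choice."""
    n_classes = len(base_probs[0])
    scores = []
    for c in range(n_classes):
        col = [p[c] for p in base_probs]
        lo, hi = min(col), max(col)
        scores.append((lo + hi) / 2 - 0.25 * (hi - lo))
    return scores

def predict(base_probs):
    """Predicted class: the one whose interval scores highest."""
    s = interval_scores(base_probs)
    return s.index(max(s))
```

The width penalty is where classifier diversity enters the decision: a class on which the base classifiers disagree gets a wide, and hence down-weighted, interval.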