May 5, 2019

Paper Group ANR 505

Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data. Indoor Space Recognition using Deep Convolutional Neural Network: A Case Study at MIT Campus. Seeing with Humans: Gaze-Assisted Neural Image Captioning. N-gram Opcode Analysis for Android Malware Detection. Cascaded Subpatch Networks for Effective CNNs. Th …

Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data


Title	Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data
Authors	Anurag Kumar, Bhiksha Raj
Abstract	In this paper we propose a novel learning framework called Supervised and Weakly Supervised Learning, where the goal is to learn simultaneously from weakly and strongly labeled data. Strongly labeled data can be understood as fully supervised data where labels for all instances are available. In weakly supervised learning the data is only weakly labeled, which prevents one from directly applying supervised learning methods. Our proposed framework is motivated by the fact that a small amount of strongly labeled data can give considerable improvement over purely weakly supervised learning. The primary problem domain of this paper is acoustic event and scene detection in audio recordings. We first propose a naive formulation for leveraging labeled data in both forms. We then propose a more general framework for Supervised and Weakly Supervised Learning (SWSL). Based on this general framework, we propose a graph based approach for SWSL. Our main method is based on manifold regularization on graphs, in which we show that the unified learning can be formulated as a constrained optimization problem that can be solved by an iterative concave-convex procedure (CCCP). Our experiments show that our proposed framework can address several concerns of audio content analysis using weakly labeled data.
Tasks	Scene Recognition
Published	2016-11-12
URL	http://arxiv.org/abs/1611.04871v3
PDF	http://arxiv.org/pdf/1611.04871v3.pdf
PWC	https://paperswithcode.com/paper/audio-event-and-scene-recognition-a-unified
Repo
Framework

Indoor Space Recognition using Deep Convolutional Neural Network: A Case Study at MIT Campus


Title	Indoor Space Recognition using Deep Convolutional Neural Network: A Case Study at MIT Campus
Authors	Fan Zhang, Fabio Duarte, Ruixian Ma, Dimitrios Milioris, Hui Lin, Carlo Ratti
Abstract	In this paper, we propose a robust and parsimonious approach using a Deep Convolutional Neural Network (DCNN) to recognize and interpret interior space. DCNNs have achieved incredible success in object and scene recognition. In this study we design and train a DCNN to classify a pre-zoned indoor space and to recognize the learned space features from a single phone photo, with no need for additional assistive technology. We collect more than 600,000 images inside MIT campus buildings to train our DCNN model, and achieve 97.9% accuracy on the validation dataset and 81.7% accuracy on the test dataset with a spatial-scale fixed model. Furthermore, the recognition accuracy and spatial resolution can potentially be improved through a multiscale classification model. We identify the discriminative image regions through the Class Activation Mapping (CAM) technique, to observe how the model recognizes space and to interpret its behavior in an abstract way. By evaluating the results with a misclassification matrix, we investigate the visual spatial features of interior space by looking into its visual similarity and visual distinctiveness, giving insights into interior design and into research on human indoor perception and wayfinding. The contribution of this paper is threefold. First, we propose a robust and parsimonious approach for indoor navigation using a DCNN. Second, we demonstrate that a DCNN also has a potential capability in space feature learning and recognition, even under severe appearance changes. Third, we introduce a DCNN based approach to look into the visual similarity and visual distinctiveness of interior space.
Tasks	Scene Recognition
Published	2016-10-07
URL	http://arxiv.org/abs/1610.02414v1
PDF	http://arxiv.org/pdf/1610.02414v1.pdf
PWC	https://paperswithcode.com/paper/indoor-space-recognition-using-deep
Repo
Framework

Seeing with Humans: Gaze-Assisted Neural Image Captioning


Title	Seeing with Humans: Gaze-Assisted Neural Image Captioning
Authors	Yusuke Sugano, Andreas Bulling
Abstract	Gaze reflects how humans process visual scenes and is therefore increasingly used in computer vision systems. Previous works demonstrated the potential of gaze for object-centric tasks, such as object localization and recognition, but it remains unclear if gaze can also be beneficial for scene-centric tasks, such as image captioning. We present a new perspective on gaze-assisted image captioning by studying the interplay between human gaze and the attention mechanism of deep neural networks. Using a public large-scale gaze dataset, we first assess the relationship between state-of-the-art object and scene recognition models, bottom-up visual saliency, and human gaze. We then propose a novel split attention model for image captioning. Our model integrates human gaze information into an attention-based long short-term memory architecture, and allows the algorithm to allocate attention selectively to both fixated and non-fixated image regions. Through evaluation on the COCO/SALICON datasets we show that our method improves image captioning performance and that gaze can complement machine attention for semantic scene understanding tasks.
Tasks	Image Captioning, Object Localization, Scene Recognition, Scene Understanding
Published	2016-08-18
URL	http://arxiv.org/abs/1608.05203v1
PDF	http://arxiv.org/pdf/1608.05203v1.pdf
PWC	https://paperswithcode.com/paper/seeing-with-humans-gaze-assisted-neural-image
Repo
Framework

N-gram Opcode Analysis for Android Malware Detection


Title	N-gram Opcode Analysis for Android Malware Detection
Authors	BooJoong Kang, Suleiman Y. Yerima, Sakir Sezer, Kieran McLaughlin
Abstract	Android malware has been on the rise in recent years due to the increasing popularity of Android and the proliferation of third party application markets. Emerging Android malware families are increasingly adopting sophisticated detection avoidance techniques and this calls for more effective approaches for Android malware detection. Hence, in this paper we present and evaluate an n-gram opcode features based approach that utilizes machine learning to identify and categorize Android malware. This approach enables automated feature discovery without relying on prior expert or domain knowledge for pre-determined features. Furthermore, by using a data segmentation technique for feature selection, our analysis is able to scale up to 10-gram opcodes. Our experiments on a dataset of 2520 samples showed an f-measure of 98% using the n-gram opcode based approach. We also provide empirical findings that illustrate factors that have probable impact on the overall n-gram opcodes performance trends.
Tasks	Android Malware Detection, Feature Selection, Malware Detection
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01445v1
PDF	http://arxiv.org/pdf/1612.01445v1.pdf
PWC	https://paperswithcode.com/paper/n-gram-opcode-analysis-for-android-malware
Repo
Framework
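
The n-gram opcode features described in the abstract above can be illustrated with a minimal sketch. The opcode trace, the function names, and the way the vocabulary is built are all assumptions made for illustration; the paper's own extraction pipeline and its data segmentation step for feature selection are not reproduced here.

```python
from collections import Counter
from typing import List, Tuple


def opcode_ngrams(opcodes: List[str], n: int) -> Counter:
    """Count every contiguous run of n opcodes in one sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    grams = (tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))
    return Counter(grams)


def to_feature_vector(counts: Counter, vocabulary: List[Tuple[str, ...]]) -> List[int]:
    """Project n-gram counts onto a fixed vocabulary (one column per n-gram)."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical opcode trace; a real one would come from disassembled app code.
trace = ["move", "invoke-virtual", "move-result",
         "move", "invoke-virtual", "return-void"]

bigrams = opcode_ngrams(trace, 2)
vocabulary = sorted(bigrams)          # here: every bigram seen in this one trace
features = to_feature_vector(bigrams, vocabulary)
```

In practice the vocabulary would be fixed from a training corpus, so that every sample maps to a vector of the same length before it is handed to a classifier.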

Cascaded Subpatch Networks for Effective CNNs


Title	Cascaded Subpatch Networks for Effective CNNs
Authors	Xiaoheng Jiang, Yanwei Pang, Manli Sun, Xuelong Li
Abstract	Conventional Convolutional Neural Networks (CNNs) use either a linear or non-linear filter to extract features from an image patch (region) of spatial size $H\times W$ (typically, $H$ is small and is equal to $W$, e.g., $H$ is 5 or 7). Generally, the size of the filter is equal to the size $H\times W$ of the input patch. We argue that the representation ability of this equal-size strategy is not strong enough. To overcome this drawback, we propose to use a subpatch filter whose spatial size $h\times w$ is smaller than $H\times W$. The proposed subpatch filter consists of two subsequent filters. The first one is a linear filter of spatial size $h\times w$ and is aimed at extracting features from the spatial domain. The second one is of spatial size $1\times 1$ and is used for strengthening the connection between different input feature channels and for reducing the number of parameters. The subpatch filter convolves with the input patch and the resulting network is called a subpatch network. Taking the output of one subpatch network as input, we repeatedly construct further subpatch networks until the output contains only one neuron in the spatial domain. These subpatch networks form a new network called Cascaded Subpatch Network (CSNet). The feature layer generated by CSNet is called a csconv layer. For the whole input image, we construct a deep neural network by stacking a sequence of csconv layers. Experimental results on four benchmark datasets demonstrate the effectiveness and compactness of the proposed CSNet. For example, our CSNet reaches a test error of $5.68\%$ on the CIFAR10 dataset without model averaging. To the best of our knowledge, this is the best result ever obtained on the CIFAR10 dataset.
Tasks
Published	2016-03-01
URL	http://arxiv.org/abs/1603.00128v1
PDF	http://arxiv.org/pdf/1603.00128v1.pdf
PWC	https://paperswithcode.com/paper/cascaded-subpatch-networks-for-effective-cnns
Repo
Framework
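
The parameter saving mentioned in the abstract above can be checked with simple arithmetic. The sketch below assumes square patches and filters, stride 1, no padding, no bias terms, and the same channel width at every layer; none of these choices come from the paper, and the function names are hypothetical.

```python
def conv_params(kh: int, kw: int, c_in: int, c_out: int) -> int:
    """Weights in one convolution layer (bias terms ignored)."""
    return kh * kw * c_in * c_out


def cascade_params(patch: int, sub: int, channels: int):
    """Stack (sub x sub conv, then 1 x 1 conv) stages until the patch shrinks to 1 x 1.

    Returns (number of stages, total weight count).
    """
    size, total, stages = patch, 0, 0
    while size > 1:
        if size < sub:
            raise ValueError("patch size is not reachable with this subpatch size")
        total += conv_params(sub, sub, channels, channels)   # spatial filter
        total += conv_params(1, 1, channels, channels)       # channel mixing
        size = size - sub + 1                                 # valid convolution, stride 1
        stages += 1
    return stages, total


# A 5 x 5 patch covered by 3 x 3 subpatch filters, 64 channels throughout.
stages, cascaded = cascade_params(5, 3, 64)      # (2, 81920)
single_filter = conv_params(5, 5, 64, 64)        # 102400
```

Under these assumptions two 3 x 3 stages cover a 5 x 5 patch with 20C² weights against 25C² for a single 5 x 5 filter, and the gap widens for a 7 x 7 patch (30C² against 49C²).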

The Sum-Product Theorem: A Foundation for Learning Tractable Models


Title	The Sum-Product Theorem: A Foundation for Learning Tractable Models
Authors	Abram L. Friesen, Pedro Domingos
Abstract	Inference in expressive probabilistic models is generally intractable, which makes them difficult to learn and limits their applicability. Sum-product networks are a class of deep models where, surprisingly, inference remains tractable even when an arbitrary number of hidden layers are present. In this paper, we generalize this result to a much broader set of learning problems: all those where inference consists of summing a function over a semiring. This includes satisfiability, constraint satisfaction, optimization, integration, and others. In any semiring, for summation to be tractable it suffices that the factors of every product have disjoint scopes. This unifies and extends many previous results in the literature. Enforcing this condition at learning time thus ensures that the learned models are tractable. We illustrate the power and generality of this approach by applying it to a new type of structured prediction problem: learning a nonconvex function that can be globally optimized in polynomial time. We show empirically that this greatly outperforms the standard approach of learning without regard to the cost of optimization.
Tasks	Structured Prediction
Published	2016-11-11
URL	http://arxiv.org/abs/1611.03553v1
PDF	http://arxiv.org/pdf/1611.03553v1.pdf
PWC	https://paperswithcode.com/paper/the-sum-product-theorem-a-foundation-for
Repo
Framework
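
The disjoint-scope condition in the abstract above rests on distributivity: when two factors share no variables, the semiring sum of their product equals the product of their separate sums. The following sketch checks this on a toy example in three semirings; the `Semiring` class, the factor functions, and the domain are illustrative assumptions, not the paper's formalism.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Sequence


@dataclass(frozen=True)
class Semiring:
    add: Callable[[float, float], float]
    mul: Callable[[float, float], float]
    zero: float   # identity for add


SUM_PRODUCT = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0)
MAX_PRODUCT = Semiring(max, lambda a, b: a * b, 0.0)          # nonnegative values only
MIN_SUM = Semiring(min, lambda a, b: a + b, float("inf"))


def brute_force(sr: Semiring, f, g, domain: Sequence[int]) -> float:
    """Sum f(x) * g(y) over every joint assignment: |domain|^2 terms."""
    total = sr.zero
    for x, y in product(domain, repeat=2):
        total = sr.add(total, sr.mul(f(x), g(y)))
    return total


def factored(sr: Semiring, f, g, domain: Sequence[int]) -> float:
    """Sum each factor separately, then combine: 2 * |domain| terms."""
    sum_f, sum_g = sr.zero, sr.zero
    for v in domain:
        sum_f = sr.add(sum_f, f(v))
        sum_g = sr.add(sum_g, g(v))
    return sr.mul(sum_f, sum_g)


f = lambda x: x + 1        # factor over x only
g = lambda y: 2 * y + 1    # factor over y only
domain = (0, 1, 2)

results = {
    name: (brute_force(sr, f, g, domain), factored(sr, f, g, domain))
    for name, sr in [("sum-product", SUM_PRODUCT),
                     ("max-product", MAX_PRODUCT),
                     ("min-sum", MIN_SUM)]
}
```

Each pair in `results` agrees, while the factored form touches each variable's domain once instead of enumerating joint assignments.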

Unsupervised Risk Estimation Using Only Conditional Independence Structure


Title	Unsupervised Risk Estimation Using Only Conditional Independence Structure
Authors	Jacob Steinhardt, Percy Liang
Abstract	We show how to estimate a model’s test error from unlabeled data, on distributions very different from the training distribution, while assuming only that certain conditional independencies are preserved between train and test. We do not need to assume that the optimal predictor is the same between train and test, or that the true distribution lies in any parametric family. We can also efficiently differentiate the error estimate to perform unsupervised discriminative learning. Our technical tool is the method of moments, which allows us to exploit conditional independencies in the absence of a fully-specified model. Our framework encompasses a large family of losses including the log and exponential loss, and extends to structured output settings such as hidden Markov models.
Tasks
Published	2016-06-16
URL	http://arxiv.org/abs/1606.05313v1
PDF	http://arxiv.org/pdf/1606.05313v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-risk-estimation-using-only
Repo
Framework

Quantum Clustering and Gaussian Mixtures


Title	Quantum Clustering and Gaussian Mixtures
Authors	Mahajabin Rahman, Davi Geiger
Abstract	The mixture of Gaussian distributions, a soft version of k-means, is considered a state-of-the-art clustering algorithm. It is widely used in computer vision for selecting classes, e.g., color, texture, and shapes. In this algorithm, each class is described by a Gaussian distribution, defined by its mean and covariance. The data is described by a weighted sum of these Gaussian distributions. We propose a new method, inspired by quantum interference in physics. Instead of modeling each class distribution directly, we model a class wave function such that its squared magnitude is the class Gaussian distribution. We then mix the class wave functions to create the mixture wave function. The final mixture distribution is then the squared magnitude of the mixture wave function. As a result, we observe quantum class interference phenomena that are not present in the Gaussian mixture model. We show that the quantum method outperforms the Gaussian mixture method in every aspect of the estimations. It provides more accurate estimates of all distribution parameters, with far fewer fluctuations, and it is also more robust to deviations of the data from the Gaussian assumptions. We illustrate our method for color segmentation as an example application.
Tasks
Published	2016-12-29
URL	http://arxiv.org/abs/1612.09199v1
PDF	http://arxiv.org/pdf/1612.09199v1.pdf
PWC	https://paperswithcode.com/paper/quantum-clustering-and-gaussian-mixtures
Repo
Framework
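
The construction in the abstract above (class wave functions whose squared magnitude is a Gaussian, mixed and then squared) can be sketched in one dimension. The real-valued, zero-phase wave functions, the mixing weights, and the numerical normalization on a grid are assumptions of this sketch; the paper's exact parameterization may differ.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def mixture_wave(x, weights, means, stds):
    """Weighted sum of class wave functions, each the square root of a Gaussian."""
    return sum(w * np.sqrt(gaussian_pdf(x, m, s))
               for w, m, s in zip(weights, means, stds))


x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
weights, means, stds = [0.6, 0.8], [-1.0, 1.5], [1.0, 1.2]   # illustrative values

# Squared magnitude of the mixture wave function (not yet normalized).
psi_sq = mixture_wave(x, weights, means, stds) ** 2

# The part a weighted sum of Gaussians would also contain.
classical = sum(w ** 2 * gaussian_pdf(x, m, s)
                for w, m, s in zip(weights, means, stds))

# What is left over is the cross term between the two classes.
interference = psi_sq - classical

# Normalize numerically so the result integrates to one on the grid.
density = psi_sq / (psi_sq.sum() * dx)
```

With two classes the leftover term equals twice the product of the weighted class wave functions, which is the interference contribution that a plain Gaussian mixture lacks.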

Probabilistic classifiers with low rank indefinite kernels


Title	Probabilistic classifiers with low rank indefinite kernels
Authors	Frank-Michael Schleif, Andrej Gisbrecht, Peter Tino
Abstract	Indefinite similarity measures can be frequently found in bio-informatics by means of alignment scores, but are also common in other fields like shape measures in image retrieval. Lacking an underlying vector space, the data are given as pairwise similarities only. The few algorithms available for such data do not scale to larger datasets. Focusing on probabilistic batch classifiers, the Indefinite Kernel Fisher Discriminant (iKFD) and the Probabilistic Classification Vector Machine (PCVM) are both effective algorithms for this type of data, but have cubic complexity. Here we propose an extension of iKFD and PCVM such that linear runtime and memory complexity is achieved for low rank indefinite kernels. Employing the Nyström approximation for indefinite kernels, we also propose a new, almost parameter-free approach to identify the landmarks, restricted to a supervised learning problem. Evaluations on several larger similarity datasets from various domains show that the proposed methods provide similar generalization capabilities while being easier to parametrize and substantially faster for large scale data.
Tasks	Image Retrieval
Published	2016-04-08
URL	http://arxiv.org/abs/1604.02264v1
PDF	http://arxiv.org/pdf/1604.02264v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-classifiers-with-low-rank
Repo
Framework
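
The Nyström approximation mentioned in the abstract above can be sketched with NumPy on a synthetic low-rank indefinite similarity matrix. The random landmark choice here is only a placeholder and is not the paper's landmark selection method; the matrix sizes, the eigenvalue signs, and the pseudo-inverse cutoff are likewise assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank, m = 50, 3, 10            # samples, true rank, number of landmarks

# Synthetic symmetric similarity matrix with mixed-sign eigenvalues (indefinite).
A = rng.standard_normal((n, rank))
signs = np.diag([2.0, 1.0, -1.5])
K = A @ signs @ A.T

# Placeholder landmark choice: a uniform random subset of the samples.
landmarks = rng.choice(n, size=m, replace=False)

K_nm = K[:, landmarks]                    # n x m: similarities to the landmarks
K_mm = K[np.ix_(landmarks, landmarks)]    # m x m: landmark-to-landmark block

# Pseudo-inverse with an explicit cutoff so numerically zero eigenvalues are dropped.
K_mm_pinv = np.linalg.pinv(K_mm, rcond=1e-10)

# Only K_nm and K_mm_pinv need to be stored: O(n * m) memory instead of O(n^2).
K_approx = K_nm @ K_mm_pinv @ K_nm.T      # formed here only to check the error
max_error = np.abs(K - K_approx).max()
```

Because the synthetic matrix has rank 3 and the landmark block generically has the same rank, the reconstruction is exact up to rounding; on real data the error depends on how well the landmarks span the matrix.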

Deep Learning Assessment of Tumor Proliferation in Breast Cancer Histological Images


Title	Deep Learning Assessment of Tumor Proliferation in Breast Cancer Histological Images
Authors	Manan Shah, Christopher Rubadue, David Suster, Dayong Wang
Abstract	Current analysis of tumor proliferation, the most salient prognostic biomarker for invasive breast cancer, is limited to subjective mitosis counting by pathologists in localized regions of tissue images. This study presents the first data-driven integrative approach to characterize the severity of tumor growth and spread on a categorical and molecular level, utilizing multiple biologically salient deep learning classifiers to develop a comprehensive prognostic model. Our approach achieves pathologist-level performance on three-class categorical tumor severity prediction. It additionally pioneers prediction of molecular expression data from a tissue image, obtaining a Spearman’s rank correlation coefficient of 0.60 with ex vivo mean calculated RNA expression. Furthermore, our framework is applied to identify over two hundred unprecedented biomarkers critical to the accurate assessment of tumor proliferation, validating our proposed integrative pipeline as the first to holistically and objectively analyze histopathological images.
Tasks
Published	2016-10-11
URL	http://arxiv.org/abs/1610.03467v1
PDF	http://arxiv.org/pdf/1610.03467v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-assessment-of-tumor
Repo
Framework

The Quality of the Covariance Selection Through Detection Problem and AUC Bounds


Title	The Quality of the Covariance Selection Through Detection Problem and AUC Bounds
Authors	Navid Tafaghodi Khajavi, Anthony Kuh
Abstract	We consider the problem of quantifying the quality of a model selection problem for a graphical model. We discuss this by formulating the problem as a detection problem. Model selection problems usually minimize a distance between the original distribution and the model distribution. For the special case of Gaussian distributions, the model selection problem simplifies to the covariance selection problem, which is widely discussed in the literature by Dempster [2], where the likelihood criterion is maximized or, equivalently, the Kullback-Leibler (KL) divergence is minimized to compute the model covariance matrix. While this solution is optimal for Gaussian distributions in the sense of the KL divergence, it is not optimal when compared with other information divergences and criteria such as Area Under the Curve (AUC). In this paper, we analytically compute upper and lower bounds for the AUC and discuss the quality of the model selection problem using the AUC and its bounds as an accuracy measure in the detection problem. We define the correlation approximation matrix (CAM) and show that analytical computation of the KL divergence, the AUC and its bounds depends only on the eigenvalues of the CAM. We also show the relationship between the AUC, the KL divergence and the ROC curve by optimizing with respect to the ROC curve. In the examples provided, we pick tree structures as the simplest graphical models. We perform simulations on fully-connected graphs and compute the tree structured models by applying the widely used Chow-Liu algorithm [3]. Examples show that the quality of tree approximation models is not good in general based on information divergences, the AUC and its bounds when the number of nodes in the graphical model is large. We show both analytically and by simulations that the 1-AUC for the tree approximation model decays exponentially as the dimension of the graphical model increases.
Tasks	Model Selection
Published	2016-05-18
URL	http://arxiv.org/abs/1605.05776v4
PDF	http://arxiv.org/pdf/1605.05776v4.pdf
PWC	https://paperswithcode.com/paper/the-quality-of-the-covariance-selection
Repo
Framework
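
The abstract above states that the KL divergence depends only on eigenvalues of a single matrix. For zero-mean Gaussians this can be checked with a short sketch using the matrix formed from the model and true covariances. That matrix is a stand-in chosen for illustration, since the abstract does not give the paper's CAM definition, and the 3 x 3 covariance and its tree approximation are made-up values.

```python
import numpy as np


def gaussian_kl_from_eigs(sigma, sigma_model):
    """KL(N(0, sigma) || N(0, sigma_model)) from eigenvalues of sigma_model^{-1} sigma."""
    M = np.linalg.solve(sigma_model, sigma)
    lam = np.linalg.eigvals(M).real        # real and positive for valid covariances
    return 0.5 * np.sum(lam - 1.0 - np.log(lam))


def gaussian_kl_direct(sigma, sigma_model):
    """Same divergence from the usual trace / log-determinant formula."""
    n = sigma.shape[0]
    M = np.linalg.solve(sigma_model, sigma)
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * (np.trace(M) - n - logdet)


# Illustrative correlation matrix of three variables.
sigma = np.array([[1.0, 0.5, 0.4],
                  [0.5, 1.0, 0.3],
                  [0.4, 0.3, 1.0]])

# Tree model keeping the two strongest correlations (edges 1-2 and 1-3);
# the remaining entry becomes the product along the path: 0.5 * 0.4 = 0.2.
sigma_tree = np.array([[1.0, 0.5, 0.4],
                       [0.5, 1.0, 0.2],
                       [0.4, 0.2, 1.0]])

kl_eigs = gaussian_kl_from_eigs(sigma, sigma_tree)
kl_direct = gaussian_kl_direct(sigma, sigma_tree)
```

Both routes give the same positive value, which shows that the divergence between the full model and its tree approximation is a function of those eigenvalues alone.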


On Tie Strength Augmented Social Correlation for Inferring Preference of Mobile Telco Users


Title	On Tie Strength Augmented Social Correlation for Inferring Preference of Mobile Telco Users
Authors	Shifeng Liu, Zheng Hu, Sujit Dey, Xin Ke
Abstract	For mobile telecom operators, it is critical to build preference profiles of their customers and connected users, which can help operators devise better marketing strategies and provide more personalized services. With the deployment of deep packet inspection (DPI) in telecom networks, it is possible for telco operators to obtain users' online preferences. However, DPI has its limitations, and user preference derived only from DPI faces sparsity and cold start problems. To better infer user preference, social correlation with regard to online preference is investigated in the telco user network derived from Call Detail Records (CDRs). Though widely verified in several online social networks, social correlation between the online preferences of users in mobile telco networks, where the CDR-derived relationships have fewer social properties and users' mobile internet surfing activities are not visible to their neighbourhood, has not been explored at a large scale. Based on a real world telecom dataset including CDRs and preferences of more than 550K users over several months, we verified that correlation does exist between online preferences in such an "ambiguous" social network. Furthermore, we found that the stronger the ties between users, the more similar their preferences tend to be. After defining the preference inference task as a Top-$K$ recommendation problem, we incorporated social correlation and tie strength based on call patterns into a Matrix Factorization Collaborative Filtering model to generate Top-$K$ preferred categories for users. The proposed Tie Strength Augmented Social Recommendation (TSASoRec) model takes data sparsity and cold start user problems into account, considering both the recorded and the missing category entries. The experiment on a real dataset shows that the proposed model can better infer user preference, especially for cold start users.
Tasks
Published	2016-03-01
URL	http://arxiv.org/abs/1603.00145v2
PDF	http://arxiv.org/pdf/1603.00145v2.pdf
PWC	https://paperswithcode.com/paper/on-tie-strength-augmented-social-correlation
Repo
Framework

Temporal Multinomial Mixture for Instance-Oriented Evolutionary Clustering


Title	Temporal Multinomial Mixture for Instance-Oriented Evolutionary Clustering
Authors	Young-Min Kim, Julien Velcin, Stéphane Bonnevay, Marian-Andrei Rizoiu
Abstract	Evolutionary clustering aims at capturing the temporal evolution of clusters. This issue is particularly important in the context of social media data, which are naturally temporally driven. In this paper, we propose a new probabilistic model-based evolutionary clustering technique. The Temporal Multinomial Mixture (TMM) is an extension of the classical mixture model that optimizes feature co-occurrences in a trade-off with temporal smoothness. Our model is evaluated on two recent case studies on opinion aggregation over time. We compare four different probabilistic clustering models and show the superiority of our proposal in the task of instance-oriented clustering.
Tasks
Published	2016-01-11
URL	http://arxiv.org/abs/1601.02300v1
PDF	http://arxiv.org/pdf/1601.02300v1.pdf
PWC	https://paperswithcode.com/paper/temporal-multinomial-mixture-for-instance
Repo
Framework

Learning Invariant Representations Of Planar Curves


Title	Learning Invariant Representations Of Planar Curves
Authors	Gautam Pai, Aaron Wetzler, Ron Kimmel
Abstract	We propose a metric learning framework for the construction of invariant geometric functions of planar curves for the Euclidean and Similarity groups of transformations. We leverage the representational power of convolutional neural networks to compute these geometric quantities. In comparison with axiomatic constructions, we show that the invariants approximated by the learning architectures have better numerical qualities such as robustness to noise, resiliency to sampling, as well as the ability to adapt to occlusion and partiality. Finally, we develop a novel multi-scale representation in a similarity metric learning paradigm.
Tasks	Metric Learning
Published	2016-11-23
URL	http://arxiv.org/abs/1611.07807v2
PDF	http://arxiv.org/pdf/1611.07807v2.pdf
PWC	https://paperswithcode.com/paper/learning-invariant-representations-of-planar
Repo
Framework

The Machine Learning Algorithm as Creative Musical Tool


Title	The Machine Learning Algorithm as Creative Musical Tool
Authors	Rebecca Fiebrink, Baptiste Caramiaux
Abstract	Machine learning is the capacity of a computational system to learn structures from datasets in order to make predictions on newly seen data. Such an approach offers a significant advantage in music scenarios in which musicians can teach the system to learn an idiosyncratic style, or can break the rules to explore the system’s capacity in unexpected ways. In this chapter we draw on music, machine learning, and human-computer interaction to elucidate an understanding of machine learning algorithms as creative tools for music and the sonic arts. We motivate a new understanding of learning algorithms as human-computer interfaces. We show that, like other interfaces, learning algorithms can be characterised by the ways their affordances intersect with goals of human users. We also argue that the nature of interaction between users and algorithms impacts the usability and usefulness of those algorithms in profound ways. This human-centred view of machine learning motivates our concluding discussion of what it means to employ machine learning as a creative tool.
Tasks
Published	2016-11-01
URL	http://arxiv.org/abs/1611.00379v1
PDF	http://arxiv.org/pdf/1611.00379v1.pdf
PWC	https://paperswithcode.com/paper/the-machine-learning-algorithm-as-creative
Repo
Framework