July 28, 2019

Paper Group ANR 303

Graph-Based Classification of Omnidirectional Images


Title	Graph-Based Classification of Omnidirectional Images
Authors	Renata Khasanova, Pascal Frossard
Abstract	Omnidirectional cameras are widely used in such areas as robotics and virtual reality as they provide a wide field of view. Their images are often processed with classical methods, which might unfortunately lead to non-optimal solutions as these methods are designed for planar images that have different geometrical properties than omnidirectional ones. In this paper we study the image classification task by taking into account the specific geometry of omnidirectional cameras with graph-based representations. In particular, we extend deep learning architectures to data on graphs; we propose a principled way of graph construction such that convolutional filters respond similarly for the same pattern at different positions of the image regardless of lens distortions. Our experiments show that the proposed method outperforms current techniques for the omnidirectional image classification problem.
Tasks	graph construction, Image Classification
Published	2017-07-26
URL	http://arxiv.org/abs/1707.08301v1
PDF	http://arxiv.org/pdf/1707.08301v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-classification-of-omnidirectional
Repo
Framework

Classification of Aerial Photogrammetric 3D Point Clouds


Title	Classification of Aerial Photogrammetric 3D Point Clouds
Authors	Carlos Becker, Nicolai Häni, Elena Rosinskaya, Emmanuel d’Angelo, Christoph Strecha
Abstract	We present a powerful method to extract per-point semantic class labels from aerial photogrammetry data. Labeling this kind of data is important for tasks such as environmental modelling, object classification and scene understanding. Unlike previous point cloud classification methods that rely exclusively on geometric features, we show that incorporating color information yields a significant increase in accuracy in detecting semantic classes. We test our classification method on three real-world photogrammetry datasets that were generated with Pix4Dmapper Pro, and with varying point densities. We show that off-the-shelf machine learning techniques coupled with our new features allow us to train highly accurate classifiers that generalize well to unseen data, processing point clouds containing 10 million points in less than 3 minutes on a desktop computer.
Tasks	Object Classification, Scene Understanding
Published	2017-05-23
URL	http://arxiv.org/abs/1705.08374v1
PDF	http://arxiv.org/pdf/1705.08374v1.pdf
PWC	https://paperswithcode.com/paper/classification-of-aerial-photogrammetric-3d
Repo
Framework

How hard can it be? Estimating the difficulty of visual search in an image


Title	How hard can it be? Estimating the difficulty of visual search in an image
Authors	Radu Tudor Ionescu, Bogdan Alexe, Marius Leordeanu, Marius Popescu, Dim P. Papadopoulos, Vittorio Ferrari
Abstract	We address the problem of estimating image difficulty defined as the human response time for solving a visual search task. We collect human annotations of image difficulty for the PASCAL VOC 2012 data set through a crowd-sourcing platform. We then analyze which human-interpretable image properties can have an impact on visual search difficulty, and how accurately those properties predict difficulty. Next, we build a regression model based on deep features learned with state-of-the-art convolutional neural networks and show better results for predicting the ground-truth visual search difficulty scores produced by human annotators. Our model is able to correctly rank about 75% of image pairs according to their difficulty score. We also show that our difficulty predictor generalizes well to new classes not seen during training. Finally, we demonstrate that our predicted difficulty scores are useful for weakly supervised object localization (8% improvement) and semi-supervised object classification (1% improvement).
Tasks	Object Classification, Object Localization, Weakly-Supervised Object Localization
Published	2017-05-23
URL	http://arxiv.org/abs/1705.08280v1
PDF	http://arxiv.org/pdf/1705.08280v1.pdf
PWC	https://paperswithcode.com/paper/how-hard-can-it-be-estimating-the-difficulty
Repo
Framework
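
The "about 75% of image pairs" figure in the abstract above is a pairwise ranking accuracy. A minimal sketch of how such a score can be computed is given below. It assumes that pairs with equal ground-truth scores are skipped, which may differ from the paper's protocol; the function name and the toy scores are hypothetical.

```python
from itertools import combinations


def pairwise_ranking_accuracy(true_scores, predicted_scores):
    """Fraction of item pairs whose predicted order matches the true order.

    Pairs with identical ground-truth scores are skipped (an assumption
    of this sketch, not something stated in the abstract).
    """
    if len(true_scores) != len(predicted_scores):
        raise ValueError("score lists must have the same length")
    correct = 0
    total = 0
    for i, j in combinations(range(len(true_scores)), 2):
        true_diff = true_scores[i] - true_scores[j]
        if true_diff == 0:
            continue  # no defined order for this pair
        pred_diff = predicted_scores[i] - predicted_scores[j]
        total += 1
        # Same sign means the predictor orders the pair like the annotators.
        if true_diff * pred_diff > 0:
            correct += 1
    return correct / total if total else float("nan")


# Hypothetical difficulty scores for four images.
human = [3.2, 4.1, 2.5, 5.0]
model = [3.0, 3.9, 4.5, 4.8]
accuracy = pairwise_ranking_accuracy(human, model)
print(f"pairwise ranking accuracy: {accuracy:.3f}")
```

A score of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, which is why a pairwise score is a convenient way to compare difficulty predictors.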

Improving content marketing processes with the approaches by artificial intelligence


Title	Improving content marketing processes with the approaches by artificial intelligence
Authors	Utku Kose, Selcuk Sert
Abstract	Content marketing is today one of the most remarkable approaches in the marketing processes of companies. The value of this kind of marketing has grown over time, thanks to the latest developments in computer and communication technologies. Nowadays, social media platforms in particular play an important role in enabling companies to design multimedia-oriented, interactive content. On the other hand, there is still more to be done to improve content marketing approaches. In this context, the objective of this study is to focus on intelligent content marketing, which can be done by using artificial intelligence. Artificial intelligence is today one of the most remarkable research fields, and it can readily be applied across disciplines. This study therefore aims to discuss its potential for improving content marketing. In detail, the study aims to raise readers' awareness of the intersection of content marketing and artificial intelligence. Furthermore, the authors introduce some example models of intelligent content marketing, which can be achieved by using current Web technologies and artificial intelligence techniques.
Tasks
Published	2017-04-07
URL	http://arxiv.org/abs/1704.02114v1
PDF	http://arxiv.org/pdf/1704.02114v1.pdf
PWC	https://paperswithcode.com/paper/improving-content-marketing-processes-with
Repo
Framework

Convolutional Spike Timing Dependent Plasticity based Feature Learning in Spiking Neural Networks


Title	Convolutional Spike Timing Dependent Plasticity based Feature Learning in Spiking Neural Networks
Authors	Priyadarshini Panda, Gopalakrishnan Srinivasan, Kaushik Roy
Abstract	Brain-inspired learning models attempt to mimic the cortical architecture and computations performed in the neurons and synapses constituting the human brain to achieve its efficiency in cognitive tasks. In this work, we present convolutional spike timing dependent plasticity based feature learning with biologically plausible leaky-integrate-and-fire neurons in Spiking Neural Networks (SNNs). We use shared weight kernels that are trained to encode representative features underlying the input patterns, thereby improving the sparsity as well as the robustness of the learning model. We demonstrate that the proposed unsupervised learning methodology learns several visual categories for object recognition with fewer examples and outperforms traditional fully-connected SNN architectures while yielding competitive accuracy. Additionally, we observe that the learning model performs out-of-set generalization, further making the proposed biologically plausible framework a viable and efficient architecture for future neuromorphic applications.
Tasks	Object Recognition
Published	2017-03-10
URL	http://arxiv.org/abs/1703.03854v2
PDF	http://arxiv.org/pdf/1703.03854v2.pdf
PWC	https://paperswithcode.com/paper/convolutional-spike-timing-dependent
Repo
Framework
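
The abstract above builds on leaky-integrate-and-fire (LIF) neurons. As a rough illustration of that neuron model only (not of the paper's convolutional STDP learning rule), the sketch below simulates a single discrete-time LIF neuron. The function name, the Euler update, and all parameter values are assumptions chosen for the example.

```python
def simulate_lif(input_current, tau=20.0, v_rest=0.0, v_thresh=1.0,
                 v_reset=0.0, dt=1.0):
    """Simulate one leaky-integrate-and-fire neuron with Euler steps.

    Returns the list of time-step indices at which the neuron spiked.
    All parameter values are illustrative, not taken from the paper.
    """
    v = v_rest
    spike_times = []
    for t, current in enumerate(input_current):
        # Leak pulls the potential toward rest; the input drives it up.
        v += (-(v - v_rest) + current) * dt / tau
        if v >= v_thresh:
            spike_times.append(t)
            v = v_reset  # reset the membrane potential after a spike
    return spike_times


# A constant supra-threshold input makes the neuron fire periodically.
spikes = simulate_lif([1.5] * 50)
print("spike times:", spikes)

# A constant sub-threshold input never reaches the threshold.
silent = simulate_lif([0.5] * 200)
print("spike times (weak input):", silent)
```

The leak is what separates this model from a plain integrator: a weak input settles below the threshold and never produces a spike, however long it is applied.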

Revisiting Graph Construction for Fast Image Segmentation


Title	Revisiting Graph Construction for Fast Image Segmentation
Authors	Zizhao Zhang, Fuyong Xing, Hanzi Wang, Yan Yan, Ying Huang, Xiaoshuang Shi, Lin Yang
Abstract	In this paper, we propose a simple but effective method for fast image segmentation. We re-examine the locality-preserving character of spectral clustering by constructing a graph over image regions with both global and local connections. Our novel approach to building graph connections relies on two key observations: 1) local region pairs that co-occur frequently will have a high probability of residing on a common object; 2) spatially distant regions in a common object often exhibit similar visual saliency, which implies their neighborship in a manifold. We present a novel energy function to efficiently conduct graph partitioning. Based on multiple high quality partitions, we show that the generated eigenvector histogram based representation can automatically drive effective unary potentials for a hierarchical random field model to produce multi-class segmentation. Extensive experiments on the BSDS500 benchmark and the large-scale PASCAL VOC and COCO datasets demonstrate the competitive segmentation accuracy and significantly improved efficiency of our proposed method compared with other state-of-the-art methods.
Tasks	graph construction, graph partitioning, Semantic Segmentation
Published	2017-02-18
URL	http://arxiv.org/abs/1702.05650v2
PDF	http://arxiv.org/pdf/1702.05650v2.pdf
PWC	https://paperswithcode.com/paper/revisiting-graph-construction-for-fast-image
Repo
Framework

A Family of Metrics for Clustering Algorithms


Title	A Family of Metrics for Clustering Algorithms
Authors	Clark Alexander, Sofya Akhmametyeva
Abstract	We give the motivation for scoring clustering algorithms and a metric $M : A \rightarrow \mathbb{N}$ from the set of clustering algorithms to the natural numbers which we realize as \begin{equation} M(A) = \sum_i \alpha_i f_i - \beta_i^{w_i} \end{equation} where $\alpha_i,\beta_i,w_i$ are parameters used for scoring the feature $f_i$, which is computed empirically. We give a method by which one can score features such as stability, noise sensitivity, etc., and derive the necessary parameters. We conclude by giving a sample set of scores.
Tasks
Published	2017-07-27
URL	http://arxiv.org/abs/1707.08912v1
PDF	http://arxiv.org/pdf/1707.08912v1.pdf
PWC	https://paperswithcode.com/paper/a-family-of-metrics-for-clustering-algorithms
Repo
Framework

Deep Generalized Canonical Correlation Analysis


Title	Deep Generalized Canonical Correlation Analysis
Authors	Adrian Benton, Huda Khayrallah, Biman Gujral, Dee Ann Reisinger, Sheng Zhang, Raman Arora
Abstract	We present Deep Generalized Canonical Correlation Analysis (DGCCA) – a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other. While methods for nonlinear two-view representation learning (Deep CCA, (Andrew et al., 2013)) and linear many-view representation learning (Generalized CCA (Horst, 1961)) exist, DGCCA is the first CCA-style multiview representation learning technique that combines the flexibility of nonlinear (deep) representation learning with the statistical power of incorporating information from many independent sources, or views. We present the DGCCA formulation as well as an efficient stochastic optimization algorithm for solving it. We learn DGCCA representations on two distinct datasets for three downstream tasks: phonetic transcription from acoustic and articulatory measurements, and recommending hashtags and friends on a dataset of Twitter users. We find that DGCCA representations soundly beat existing methods at phonetic transcription and hashtag recommendation, and in general perform no worse than standard linear many-view techniques.
Tasks	Representation Learning, Stochastic Optimization
Published	2017-02-08
URL	http://arxiv.org/abs/1702.02519v2
PDF	http://arxiv.org/pdf/1702.02519v2.pdf
PWC	https://paperswithcode.com/paper/deep-generalized-canonical-correlation
Repo
Framework

Learning-based Ensemble Average Propagator Estimation


Title	Learning-based Ensemble Average Propagator Estimation
Authors	Chuyang Ye
Abstract	By capturing the anisotropic water diffusion in tissue, diffusion magnetic resonance imaging (dMRI) provides a unique tool for noninvasively probing the tissue microstructure and orientation in the human brain. The diffusion profile can be described by the ensemble average propagator (EAP), which is inferred from observed diffusion signals. However, accurate EAP estimation using the number of diffusion gradients that is clinically practical can be challenging. In this work, we propose a deep learning algorithm for EAP estimation, which is named learning-based ensemble average propagator estimation (LEAPE). The EAP is commonly represented by a basis and its associated coefficients, and here we choose the SHORE basis and design a deep network to estimate the coefficients. The network comprises two cascaded components. The first component is a multiple layer perceptron (MLP) that simultaneously predicts the unknown coefficients. However, typical training loss functions, such as mean squared errors, may not properly represent the geometry of the possibly non-Euclidean space of the coefficients, which in particular causes problems for the extraction of directional information from the EAP. Therefore, to regularize the training, in the second component we compute an auxiliary output of approximated fiber orientation (FO) errors with the aid of a second MLP that is trained separately. We performed experiments using dMRI data that resemble clinically achievable $q$-space sampling, and observed promising results compared with the conventional EAP estimation method.
Tasks
Published	2017-06-20
URL	http://arxiv.org/abs/1706.06258v1
PDF	http://arxiv.org/pdf/1706.06258v1.pdf
PWC	https://paperswithcode.com/paper/learning-based-ensemble-average-propagator
Repo
Framework

A Spacetime Approach to Generalized Cognitive Reasoning in Multi-scale Learning


Title	A Spacetime Approach to Generalized Cognitive Reasoning in Multi-scale Learning
Authors	Mark Burgess
Abstract	In modern machine learning, pattern recognition replaces realtime semantic reasoning. The mapping from input to output is learned with fixed semantics by training outcomes deliberately. This is an expensive and static approach which depends heavily on the availability of a very particular kind of prior training data to make inferences in a single step. Conventional semantic network approaches, on the other hand, base multi-step reasoning on modal logics and handcrafted ontologies, which are ad hoc, expensive to construct, and fragile to inconsistency. Both approaches may be enhanced by a hybrid approach, which completely separates reasoning from pattern recognition. In this report, a quasi-linguistic approach to knowledge representation is discussed, motivated by spacetime structure. Tokenized patterns from diverse sources are integrated to build a lightly constrained and approximately scale-free network. This is then parsed with very simple recursive algorithms to generate ‘brainstorming’ sets of reasoned knowledge.
Tasks
Published	2017-02-12
URL	http://arxiv.org/abs/1702.04638v2
PDF	http://arxiv.org/pdf/1702.04638v2.pdf
PWC	https://paperswithcode.com/paper/a-spacetime-approach-to-generalized-cognitive
Repo
Framework

GPLAC: Generalizing Vision-Based Robotic Skills using Weakly Labeled Images


Title	GPLAC: Generalizing Vision-Based Robotic Skills using Weakly Labeled Images
Authors	Avi Singh, Larry Yang, Sergey Levine
Abstract	We tackle the problem of learning robotic sensorimotor control policies that can generalize to visually diverse and unseen environments. Achieving broad generalization typically requires large datasets, which are difficult to obtain for task-specific interactive processes such as reinforcement learning or learning from demonstration. However, much of the visual diversity in the world can be captured through passively collected datasets of images or videos. In our method, which we refer to as GPLAC (Generalized Policy Learning with Attentional Classifier), we use both interaction data and weakly labeled image data to augment the generalization capacity of sensorimotor policies. Our method combines multitask learning on action selection and an auxiliary binary classification objective, together with a convolutional neural network architecture that uses an attentional mechanism to avoid distractors. We show that pairing interaction data from just a single environment with a diverse dataset of weakly labeled data results in greatly improved generalization to unseen environments, and show that this generalization depends on both the auxiliary objective and the attentional architecture that we propose. We demonstrate our results in both simulation and on a real robotic manipulator, and demonstrate substantial improvement over standard convolutional architectures and domain adaptation methods.
Tasks	Domain Adaptation
Published	2017-08-07
URL	http://arxiv.org/abs/1708.02313v1
PDF	http://arxiv.org/pdf/1708.02313v1.pdf
PWC	https://paperswithcode.com/paper/gplac-generalizing-vision-based-robotic
Repo
Framework

Concept Drift and Anomaly Detection in Graph Streams


Title	Concept Drift and Anomaly Detection in Graph Streams
Authors	Daniele Zambon, Cesare Alippi, Lorenzo Livi
Abstract	Graph representations offer powerful and intuitive ways to describe data in a multitude of application domains. Here, we consider stochastic processes generating graphs and propose a methodology for detecting changes in stationarity of such processes. The methodology is general and considers a process generating attributed graphs with a variable number of vertices/edges, without the need to assume one-to-one correspondence between vertices at different time steps. The methodology acts by embedding every graph of the stream into a vector domain, where a conventional multivariate change detection procedure can be easily applied. We ground the soundness of our proposal by proving several theoretical results. In addition, we provide a specific implementation of the methodology and evaluate its effectiveness on several detection problems involving attributed graphs representing biological molecules and drawings. Experimental results are contrasted with respect to suitable baseline methods, demonstrating the effectiveness of our approach.
Tasks	Anomaly Detection
Published	2017-06-21
URL	http://arxiv.org/abs/1706.06941v3
PDF	http://arxiv.org/pdf/1706.06941v3.pdf
PWC	https://paperswithcode.com/paper/concept-drift-and-anomaly-detection-in-graph
Repo
Framework

Multi-frame image super-resolution with fast upscaling technique


Title	Multi-frame image super-resolution with fast upscaling technique
Authors	Longguang Wang, Zaiping Lin, Xinpu Deng, Wei An
Abstract	Multi-frame image super-resolution (MISR) aims to fuse information from a low-resolution (LR) image sequence to compose a high-resolution (HR) image, and has recently been applied extensively in many areas. Unlike single image super-resolution (SISR), sub-pixel transitions between multiple frames introduce additional information, attaching more significance to the fusion operator to alleviate the ill-posedness of MISR. For reconstruction-based approaches, the inevitable projection of reconstruction errors from LR space to HR space is commonly tackled by an interpolation operator; however, crude interpolation may not fit the natural image and may generate annoying blurring artifacts, especially after the fusion operator. In this paper, we propose an end-to-end fast upscaling technique to replace the interpolation operator: we design upscaling filters in LR space for each periodic sub-location and shuffle the filter results to derive the final reconstruction errors in HR space. The proposed fast upscaling technique not only reduces the computational complexity of the upscaling operation by using a shuffling operation to avoid complex operations in HR space, but also achieves superior performance with fewer blurring artifacts. Extensive experimental results demonstrate the effectiveness and efficiency of the proposed technique; moreover, when the proposed technique is combined with bilateral total variation (BTV) regularization, the MISR approach outperforms state-of-the-art methods.
Tasks	Image Super-Resolution, Super-Resolution
Published	2017-06-20
URL	http://arxiv.org/abs/1706.06266v2
PDF	http://arxiv.org/pdf/1706.06266v2.pdf
PWC	https://paperswithcode.com/paper/multi-frame-image-super-resolution-with-fast
Repo
Framework
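
The "shuffle the filter results" step in the abstract above can be read as a periodic (sub-pixel) shuffle: each of the r × r filter outputs computed in LR space fills one sub-location of every r × r block of the HR grid. The sketch below shows only that rearrangement with NumPy. The function name, the channel ordering (row offset first, then column offset), and the toy data are assumptions, and the upscaling filters themselves are not modelled.

```python
import numpy as np


def periodic_shuffle(lr_maps, scale):
    """Interleave scale*scale LR maps of shape (H, W) into one HR map.

    Map number dy * scale + dx supplies the HR pixels at positions
    (y * scale + dy, x * scale + dx). This ordering is an assumption.
    """
    lr_maps = np.asarray(lr_maps)
    channels, height, width = lr_maps.shape
    if channels != scale * scale:
        raise ValueError("expected scale * scale low-resolution maps")
    # Split the channel axis into (dy, dx), then reorder to (y, dy, x, dx).
    grid = lr_maps.reshape(scale, scale, height, width)
    return grid.transpose(2, 0, 3, 1).reshape(height * scale, width * scale)


# Four hypothetical 2x2 filter outputs, each filled with its own index.
lr = np.stack([np.full((2, 2), k) for k in range(4)])
hr = periodic_shuffle(lr, 2)
print(hr)
```

Because the rearrangement is a reshape and a transpose, no arithmetic is carried out at HR resolution, which matches the abstract's stated motivation of avoiding complex operations in HR space.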

Topology Reduction in Deep Convolutional Feature Extraction Networks


Title	Topology Reduction in Deep Convolutional Feature Extraction Networks
Authors	Thomas Wiatowski, Philipp Grohs, Helmut Bölcskei
Abstract	Deep convolutional neural networks (CNNs) used in practice employ potentially hundreds of layers and $10{,}000$s of nodes. Such network sizes entail significant computational complexity due to the large number of convolutions that need to be carried out; in addition, a large number of parameters needs to be learned and stored. Very deep and wide CNNs may therefore not be well suited to applications operating under severe resource constraints as is the case, e.g., in low-power embedded and mobile platforms. This paper aims at understanding the impact of CNN topology, specifically depth and width, on the network’s feature extraction capabilities. We address this question for the class of scattering networks that employ either Weyl-Heisenberg filters or wavelets, the modulus non-linearity, and no pooling. The exponential feature map energy decay results in Wiatowski et al., 2017, are generalized to $\mathcal{O}(a^{-N})$, where an arbitrary decay factor $a>1$ can be realized through suitable choice of the Weyl-Heisenberg prototype function or the mother wavelet. We then show how networks of fixed (possibly small) depth $N$ can be designed to guarantee that $((1-\varepsilon)\cdot 100)\%$ of the input signal’s energy are contained in the feature vector. Based on the notion of operationally significant nodes, we characterize, partly rigorously and partly heuristically, the topology-reducing effects of (effectively) band-limited input signals, band-limited filters, and feature map symmetries. Finally, for networks based on Weyl-Heisenberg filters, we determine the prototype function bandwidth that minimizes, for fixed network depth $N$, the average number of operationally significant nodes per layer.
Tasks
Published	2017-07-10
URL	http://arxiv.org/abs/1707.02711v2
PDF	http://arxiv.org/pdf/1707.02711v2.pdf
PWC	https://paperswithcode.com/paper/topology-reduction-in-deep-convolutional
Repo
Framework
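
The link between the decay rate $\mathcal{O}(a^{-N})$ and the depth needed to capture $((1-\varepsilon)\cdot 100)\%$ of the energy can be illustrated with a small calculation. The sketch below assumes that the fraction of input energy not yet captured after $N$ layers is bounded by $C \cdot a^{-N}$, and returns the smallest $N$ for which that bound drops to $\varepsilon$ or below. The constant $C$ (hidden in the big-O notation), the exact form of the bound, and the function name are assumptions, not results quoted from the paper.

```python
def min_depth(decay_factor, epsilon, constant=1.0):
    """Smallest depth N with constant * decay_factor**(-N) <= epsilon.

    Assumes the uncaptured energy fraction after N layers is bounded by
    constant * decay_factor**(-N); this form is an illustrative assumption.
    """
    if decay_factor <= 1.0:
        raise ValueError("decay_factor must be greater than 1")
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if constant <= 0.0:
        raise ValueError("constant must be positive")
    depth = 0
    bound = constant
    # Each extra layer shrinks the bound by one factor of decay_factor.
    while bound > epsilon:
        bound /= decay_factor
        depth += 1
    return depth


# Capturing 99% of the energy with decay factor a = 2 (hypothetical values).
print(min_depth(2.0, 0.01))
# A slower decay needs more layers for a looser 95% target.
print(min_depth(1.5, 0.05))
```

Under this assumed bound the required depth grows only logarithmically in $1/\varepsilon$, which is consistent with the abstract's point that networks of small depth can suffice when the decay factor is large.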

Beyond Low-Rank Representations: Orthogonal Clustering Basis Reconstruction with Optimized Graph Structure for Multi-view Spectral Clustering


Title	Beyond Low-Rank Representations: Orthogonal Clustering Basis Reconstruction with Optimized Graph Structure for Multi-view Spectral Clustering
Authors	Yang Wang, Lin Wu
Abstract	Low-Rank Representation (LRR) is arguably one of the most powerful paradigms for Multi-view spectral clustering, which elegantly encodes the multi-view local graph/manifold structures into an intrinsic low-rank self-expressive data similarity embedded in high-dimensional space, to yield a better graph partition than their single-view counterparts. In this paper we revisit it with a fundamentally different perspective by discovering LRR as essentially a latent clustered orthogonal projection based representation winged with an optimized local graph structure for spectral clustering; each column of the representation is fundamentally a cluster basis orthogonal to others to indicate its members, which intuitively projects the view-specific feature representation to be the one spanned by all orthogonal bases to characterize the cluster structures. Upon this finding, we propose our technique with the following contributions: (1) We decompose LRR into latent clustered orthogonal representation via low-rank matrix factorization, to encode more flexible cluster structures than LRR over primal data objects; (2) We convert the problem of LRR into that of simultaneously learning orthogonal clustered representation and optimized local graph structure for each view; (3) The learned orthogonal clustered representations and local graph structures enjoy the same magnitude for multi-view, so that the ideal multi-view consensus can be readily achieved. The experiments over multi-view datasets validate its superiority.
Tasks
Published	2017-08-04
URL	http://arxiv.org/abs/1708.02288v4
PDF	http://arxiv.org/pdf/1708.02288v4.pdf
PWC	https://paperswithcode.com/paper/beyond-low-rank-representations-orthogonal
Repo
Framework