July 28, 2019

Paper Group ANR 255

Gated Recurrent Neural Tensor Network. Visual-Interactive Similarity Search for Complex Objects by Example of Soccer Player Analysis. A Generalised Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation. A New 3D Segmentation Methodology for Lumbar Vertebral Bodies for the Measurement of BMD and Geometry. Mathema …

Gated Recurrent Neural Tensor Network


Title	Gated Recurrent Neural Tensor Network
Authors	Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura
Abstract	Recurrent Neural Networks (RNNs), which are a powerful scheme for modeling temporal and sequential data, need to capture long-term dependencies on datasets and represent them in hidden layers with a powerful model to capture more information from inputs. For modeling long-term dependencies in a dataset, the gating mechanism concept can help RNNs remember and forget previous information. Representing the hidden layers of an RNN with more expressive operations (i.e., tensor products) helps it learn a more complex relationship between the current input and the previous hidden layer information. These ideas can generally improve RNN performance. In this paper, we propose a novel RNN architecture that combines the concepts of the gating mechanism and the tensor product into a single model. By combining these two concepts into a single RNN, our proposed models learn long-term dependencies by modeling with gating units and obtain more expressive and direct interaction between input and hidden layers using a tensor product on 3-dimensional array (tensor) weight parameters. We use Long Short Term Memory (LSTM) RNN and Gated Recurrent Unit (GRU) RNN and combine them with a tensor product inside their formulations. Our proposed RNNs, which are called a Long-Short Term Memory Recurrent Neural Tensor Network (LSTMRNTN) and Gated Recurrent Unit Recurrent Neural Tensor Network (GRURNTN), are made by combining the LSTM and GRU RNN models with the tensor product. We conducted experiments with our proposed models on word-level and character-level language modeling tasks and revealed that our proposed models significantly improved their performance compared to our baseline models.
Tasks	Language Modelling
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02222v1
PDF	http://arxiv.org/pdf/1706.02222v1.pdf
PWC	https://paperswithcode.com/paper/gated-recurrent-neural-tensor-network
Repo
Framework
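
To make the tensor product mentioned in the abstract concrete, here is a minimal sketch in plain Python of a bilinear term in which a 3-dimensional weight array couples an input vector with a previous hidden vector. The function name `bilinear_tensor_product`, the shapes, and the toy numbers are illustrative assumptions; the paper's exact LSTMRNTN and GRURNTN equations, including how the gates enter, are not reproduced here.

```python
def bilinear_tensor_product(x, h, T):
    """Return y with y[k] = sum_i sum_j x[i] * T[k][i][j] * h[j].

    x: input vector of length d_in
    h: previous hidden vector of length d_hid
    T: weight tensor stored as a list of d_out slices, each d_in x d_hid
    """
    return [
        sum(x[i] * T_k[i][j] * h[j]
            for i in range(len(x))
            for j in range(len(h)))
        for T_k in T
    ]


# Toy values (hypothetical, chosen so the result is easy to check by hand).
x = [1.0, 2.0]
h = [0.5, -1.0, 2.0]
T = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # slice 0 picks x[0]*h[0] + x[1]*h[1]
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],  # slice 1 picks x[0]*h[2]
]

y = bilinear_tensor_product(x, h, T)
print(y)
```

Each output unit gets its own weight slice, so every pair of input and hidden components can interact multiplicatively, which is the "more expressive and direct interaction" the abstract refers to. In a gated model such a term would typically be added to the usual linear terms before the nonlinearity; that placement is an assumption here.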

Visual-Interactive Similarity Search for Complex Objects by Example of Soccer Player Analysis


Title	Visual-Interactive Similarity Search for Complex Objects by Example of Soccer Player Analysis
Authors	Jürgen Bernard, Christian Ritter, David Sessler, Matthias Zeppelzauer, Jörn Kohlhammer, Dieter Fellner
Abstract	The definition of similarity is a key prerequisite when analyzing complex data types in data mining, information retrieval, or machine learning. However, the meaningful definition is often hampered by the complexity of data objects and particularly by different notions of subjective similarity latent in targeted user groups. Taking the example of soccer players, we present a visual-interactive system that learns users’ mental models of similarity. In a visual-interactive interface, users are able to label pairs of soccer players with respect to their subjective notion of similarity. Our proposed similarity model automatically learns the respective concept of similarity using an active learning strategy. A visual-interactive retrieval technique is provided to validate the model and to execute downstream retrieval tasks for soccer player analysis. The applicability of the approach is demonstrated in different evaluation strategies, including usage scenarios and cross-validation tests.
Tasks	Active Learning, Information Retrieval
Published	2017-03-09
URL	http://arxiv.org/abs/1703.03385v1
PDF	http://arxiv.org/pdf/1703.03385v1.pdf
PWC	https://paperswithcode.com/paper/visual-interactive-similarity-search-for
Repo
Framework

A Generalised Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation


Title	A Generalised Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation
Authors	Nikolaos Mitianoudis
Abstract	Directional or circular statistics pertain to the analysis and interpretation of directions or rotations. In this work, a novel probability distribution is proposed to model multidimensional sparse directional data. The Generalised Directional Laplacian Distribution (DLD) is a hybrid between the Laplacian distribution and the von Mises-Fisher distribution. The distribution’s parameters are estimated using Maximum-Likelihood Estimation over a set of training data points. Mixtures of Directional Laplacian Distributions (MDLD) are also introduced in order to model multiple concentrations of sparse directional data. The author explores the application of the derived DLD mixture model to cluster sound sources that exist in an underdetermined instantaneous sound mixture. The proposed model can solve the general K x L (K<L) underdetermined instantaneous source separation problem, offering a fast and stable solution.
Tasks
Published	2017-08-16
URL	http://arxiv.org/abs/1708.04816v1
PDF	http://arxiv.org/pdf/1708.04816v1.pdf
PWC	https://paperswithcode.com/paper/a-generalised-directional-laplacian
Repo
Framework

A New 3D Segmentation Methodology for Lumbar Vertebral Bodies for the Measurement of BMD and Geometry


Title	A New 3D Segmentation Methodology for Lumbar Vertebral Bodies for the Measurement of BMD and Geometry
Authors	Andre Mastmeyer, Klaus Engelke, Willi Kalender
Abstract	In this paper a new technique is presented that extracts the geometry of lumbar vertebral bodies from spiral CT scans. Our new multi-step segmentation approach yields highly accurate and precise measurement of the bone mineral density (BMD) in different volumes of interest which are defined relative to a local anatomical coordinate system. The approach also enables the analysis of the geometry of the relevant vertebrae. Intra- and inter-operator precision for segmentation, BMD measurement and position of the coordinate system are below 1.5% in patient data; accuracy errors are below 1.5% for BMD and below 4% for volume in phantom data. The long-term goal of the approach is to improve fracture prediction in osteoporosis.
Tasks
Published	2017-05-19
URL	http://arxiv.org/abs/1705.07143v1
PDF	http://arxiv.org/pdf/1705.07143v1.pdf
PWC	https://paperswithcode.com/paper/a-new-3d-segmentation-methodology-for-lumbar
Repo
Framework

Mathematics of Deep Learning


Title	Mathematics of Deep Learning
Authors	Rene Vidal, Joan Bruna, Raja Giryes, Stefano Soatto
Abstract	Recently there has been a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for representation learning and classification. However, the mathematical reasons for this success remain elusive. This tutorial will review recent work that aims to provide a mathematical justification for several properties of deep networks, such as global optimality, geometric stability, and invariance of the learned representations.
Tasks	Representation Learning
Published	2017-12-13
URL	http://arxiv.org/abs/1712.04741v1
PDF	http://arxiv.org/pdf/1712.04741v1.pdf
PWC	https://paperswithcode.com/paper/mathematics-of-deep-learning
Repo
Framework

Collaborative vehicle routing: a survey


Title	Collaborative vehicle routing: a survey
Authors	Margaretha Gansterer, Richard F. Hartl
Abstract	In horizontal collaborations, carriers form coalitions in order to perform parts of their logistics operations jointly. By exchanging transportation requests among each other, they can operate more efficiently and in a more sustainable way. Collaborative vehicle routing has been extensively discussed in the literature. We identify three major streams of research: (i) centralized collaborative planning, (ii) decentralized planning without auctions, and (iii) auction-based decentralized planning. For each of them we give a structured overview on the state of knowledge and discuss future research directions.
Tasks
Published	2017-06-13
URL	http://arxiv.org/abs/1706.05254v1
PDF	http://arxiv.org/pdf/1706.05254v1.pdf
PWC	https://paperswithcode.com/paper/collaborative-vehicle-routing-a-survey
Repo
Framework

Ensembles of Deep LSTM Learners for Activity Recognition using Wearables


Title	Ensembles of Deep LSTM Learners for Activity Recognition using Wearables
Authors	Yu Guan, Thomas Ploetz
Abstract	Recently, deep learning (DL) methods have been introduced very successfully into human activity recognition (HAR) scenarios in ubiquitous and wearable computing. Especially the prospect of overcoming the need for manual feature design combined with superior classification capabilities renders deep neural networks very attractive for real-life HAR application. Even though DL-based approaches now outperform the state-of-the-art in a number of recognition tasks of the field, substantial challenges remain. Most prominently, issues with real-life datasets, typically including imbalanced datasets and problematic data quality, still limit the effectiveness of activity recognition using wearables. In this paper we tackle such challenges through Ensembles of deep Long Short Term Memory (LSTM) networks. We have developed modified training procedures for LSTM networks and combine sets of diverse LSTM learners into classifier collectives. We demonstrate, both formally and empirically, that Ensembles of deep LSTM learners outperform the individual LSTM networks. Through an extensive experimental evaluation on three standard benchmarks (Opportunity, PAMAP2, Skoda) we demonstrate the excellent recognition capabilities of our approach and its potential for real-life applications of human activity recognition.
Tasks	Activity Recognition, Human Activity Recognition
Published	2017-03-28
URL	http://arxiv.org/abs/1703.09370v1
PDF	http://arxiv.org/pdf/1703.09370v1.pdf
PWC	https://paperswithcode.com/paper/ensembles-of-deep-lstm-learners-for-activity
Repo
Framework
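
As an illustration of what combining diverse learners into a classifier collective can look like, the sketch below averages per-class probabilities from several ensemble members and picks the highest-scoring class. The abstract does not state which fusion rule the authors use, so mean-score fusion, the function name `fuse_by_mean_score`, and the example probabilities are all assumptions made for this sketch.

```python
def fuse_by_mean_score(prob_vectors):
    """Average class probabilities over ensemble members.

    prob_vectors: list of per-model probability lists, all of equal length.
    Returns (index of the winning class, list of mean scores).
    """
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    mean = [sum(p[c] for p in prob_vectors) / n_models
            for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical softmax outputs of three ensemble members for one time window.
member_outputs = [
    [0.6, 0.3, 0.1],  # member 1 prefers class 0
    [0.2, 0.5, 0.3],  # member 2 prefers class 1
    [0.1, 0.6, 0.3],  # member 3 prefers class 1
]

label, mean_scores = fuse_by_mean_score(member_outputs)
print(label, mean_scores)
```

Here a single confident member is outvoted by two moderately confident ones, which is the kind of error averaging that can make an ensemble more robust than its individual members.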

Fast and Efficient Calculations of Structural Invariants of Chirality


Title	Fast and Efficient Calculations of Structural Invariants of Chirality
Authors	He Zhang, Hanlin Mo, You Hao, Shirui Li, Hua Li
Abstract	Chirality plays an important role in physics, chemistry, biology, and other fields. It describes an essential symmetry in structure. However, chirality invariants are usually complicated in expression or difficult to evaluate. In this paper, we present five general three-dimensional chirality invariants based on the generating functions. The five chiral invariants have four characteristics: (1) They play an important role in the detection of symmetry, especially in the treatment of the ‘false zero’ problem. (2) Three of the five chiral invariants decode a universal chirality index. (3) Three of them are proposed for the first time. (4) The five chiral invariants have low order no bigger than 4, brief expression, low time complexity O(n) and can act as descriptors of three-dimensional objects in shape analysis. The five chiral invariants give a geometric view to study the chiral invariants. The experiments show that the five chirality invariants are effective and efficient; they can be used as a tool for symmetry detection or as features in shape analysis.
Tasks
Published	2017-10-20
URL	http://arxiv.org/abs/1711.05866v2
PDF	http://arxiv.org/pdf/1711.05866v2.pdf
PWC	https://paperswithcode.com/paper/fast-and-efficient-calculations-of-structural
Repo
Framework

Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery


Title	Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
Authors	Namrata Vaswani, Thierry Bouwmans, Sajid Javed, Praneeth Narayanamurthy
Abstract	PCA is one of the most widely used dimension reduction techniques. A related easier problem is “subspace learning” or “subspace estimation”. Given relatively clean data, both are easily solved via singular value decomposition (SVD). The problem of subspace learning or PCA in the presence of outliers is called robust subspace learning or robust PCA (RPCA). For long data sequences, if one tries to use a single lower dimensional subspace to represent the data, the required subspace dimension may end up being quite large. For such data, a better model is to assume that it lies in a low-dimensional subspace that can change over time, albeit gradually. The problem of tracking such data (and the subspaces) while being robust to outliers is called robust subspace tracking (RST). This article provides a magazine-style overview of the entire field of robust subspace learning and tracking. In particular, solutions for three problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition (S+LR), RST via S+LR, and “robust subspace recovery” (RSR). RSR assumes that an entire data vector is either an outlier or an inlier. The S+LR formulation instead assumes that outliers occur on only a few data vector indices and hence are well modeled as sparse corruptions.
Tasks	Dimensionality Reduction
Published	2017-11-26
URL	http://arxiv.org/abs/1711.09492v4
PDF	http://arxiv.org/pdf/1711.09492v4.pdf
PWC	https://paperswithcode.com/paper/robust-subspace-learning-robust-pca-robust
Repo
Framework

Living Together: Mind and Machine Intelligence


Title	Living Together: Mind and Machine Intelligence
Authors	Neil D. Lawrence
Abstract	In this paper we consider the nature of the machine intelligences we have created in the context of our human intelligence. We suggest that the fundamental difference between human and machine intelligence comes down to embodiment factors. We define embodiment factors as the ratio between an entity’s ability to communicate information vs compute information. We speculate on the role of embodiment factors in driving our own intelligence and consciousness. We briefly review dual process models of cognition and cast machine intelligence within that framework, characterising it as a dominant System Zero, which can drive behaviour through interfacing with us subconsciously. Driven by concerns about the consequence of such a system we suggest prophylactic courses of action that could be considered. Our main conclusion is that it is not sentient intelligence we should fear but non-sentient intelligence.
Tasks
Published	2017-05-22
URL	http://arxiv.org/abs/1705.07996v1
PDF	http://arxiv.org/pdf/1705.07996v1.pdf
PWC	https://paperswithcode.com/paper/living-together-mind-and-machine-intelligence
Repo
Framework

Generating Visual Representations for Zero-Shot Classification


Title	Generating Visual Representations for Zero-Shot Classification
Authors	Maxime Bucher, Stéphane Herbin, Frédéric Jurie
Abstract	This paper addresses the task of learning an image classifier when some categories are defined by semantic descriptions only (e.g. visual attributes) while the others are defined by exemplar images as well. This task is often referred to as the Zero-Shot classification task (ZSC). Most of the previous methods rely on learning a common embedding space allowing to compare visual features of unknown categories with semantic descriptions. This paper argues that these approaches are limited as i) efficient discriminative classifiers can’t be used ii) classification tasks with seen and unseen categories (Generalized Zero-Shot Classification or GZSC) can’t be addressed efficiently. In contrast, this paper suggests to address ZSC and GZSC by i) learning a conditional generator using seen classes ii) generate artificial training examples for the categories without exemplars. ZSC is then turned into a standard supervised learning problem. Experiments with 4 generative models and 5 datasets experimentally validate the approach, giving state-of-the-art results on both ZSC and GZSC.
Tasks	Zero-Shot Learning
Published	2017-08-23
URL	http://arxiv.org/abs/1708.06975v3
PDF	http://arxiv.org/pdf/1708.06975v3.pdf
PWC	https://paperswithcode.com/paper/generating-visual-representations-for-zero
Repo
Framework

Near-optimal linear decision trees for k-SUM and related problems


Title	Near-optimal linear decision trees for k-SUM and related problems
Authors	Daniel M. Kane, Shachar Lovett, Shay Moran
Abstract	We construct near optimal linear decision trees for a variety of decision problems in combinatorics and discrete geometry. For example, for any constant $k$, we construct linear decision trees that solve the $k$-SUM problem on $n$ elements using $O(n \log^2 n)$ linear queries. Moreover, the queries we use are comparison queries, which compare the sums of two $k$-subsets; when viewed as linear queries, comparison queries are $2k$-sparse and have only $\{-1,0,1\}$ coefficients. We give similar constructions for sorting sumsets $A+B$ and for solving the SUBSET-SUM problem, both with optimal number of queries, up to poly-logarithmic terms. Our constructions are based on the notion of “inference dimension”, recently introduced by the authors in the context of active classification with comparison queries. This can be viewed as another contribution to the fruitful link between machine learning and discrete geometry, which goes back to the discovery of the VC dimension.
Tasks
Published	2017-05-04
URL	http://arxiv.org/abs/1705.01720v1
PDF	http://arxiv.org/pdf/1705.01720v1.pdf
PWC	https://paperswithcode.com/paper/near-optimal-linear-decision-trees-for-k-sum
Repo
Framework
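
The remark that comparison queries are $2k$-sparse with coefficients in $\{-1,0,1\}$ can be checked with a small sketch. It only shows how one comparison between the sums of two $k$-subsets becomes a linear query on the input; it is not the decision-tree construction from the paper. The helper names and the sample input are hypothetical.

```python
def comparison_query_coefficients(n, subset_a, subset_b):
    """Coefficient vector c such that c . x = sum(x[A]) - sum(x[B]).

    subset_a, subset_b: sets of k distinct indices each.
    An index present in both subsets cancels to 0, so every
    coefficient stays in {-1, 0, 1}.
    """
    coeffs = [0] * n
    for i in subset_a:
        coeffs[i] += 1
    for i in subset_b:
        coeffs[i] -= 1
    return coeffs


def answer_query(values, coeffs):
    """Sign of the linear query: 1, 0 or -1."""
    s = sum(c * v for c, v in zip(coeffs, values))
    return (s > 0) - (s < 0)


values = [4, -1, 7, 3, -6, 2]  # hypothetical input of n = 6 elements
k = 2
coeffs = comparison_query_coefficients(len(values), {0, 1}, {3, 4})
sign = answer_query(values, coeffs)
print(coeffs, sign)
```

For this input the query compares 4 + (-1) = 3 against 3 + (-6) = -3, so the sign is positive, and the coefficient vector has exactly 2k = 4 nonzero entries.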

Linear Time Complexity Deep Fourier Scattering Network and Extension to Nonlinear Invariants


Title	Linear Time Complexity Deep Fourier Scattering Network and Extension to Nonlinear Invariants
Authors	Randall Balestriero, Herve Glotin
Abstract	In this paper we propose a scalable version of a state-of-the-art deterministic time-invariant feature extraction approach based on consecutive changes of basis and nonlinearities, namely, the scattering network. The first focus of the paper is to extend the scattering network to allow the use of higher order nonlinearities as well as extracting nonlinear and Fourier based statistics leading to the required invariants of any inherently structured input. In order to reach fast convolutions and to leverage the intrinsic structure of wavelets, we derive our complete model in the Fourier domain. In addition to providing fast computations, we are now able to exploit sparse matrices due to extremely high sparsity well localized in the Fourier domain. As a result, we are able to reach a true linear time complexity with inputs in the Fourier domain allowing fast and energy efficient solutions to machine learning tasks. Validation of the features and computational results will be presented through the use of these invariant coefficients to perform classification on audio recordings of bird songs captured in multiple different soundscapes. In the end, the applicability of the presented solutions to deep artificial neural networks is discussed.
Tasks
Published	2017-07-18
URL	http://arxiv.org/abs/1707.05841v1
PDF	http://arxiv.org/pdf/1707.05841v1.pdf
PWC	https://paperswithcode.com/paper/linear-time-complexity-deep-fourier
Repo
Framework

Iterative Thresholding for Demixing Structured Superpositions in High Dimensions


Title	Iterative Thresholding for Demixing Structured Superpositions in High Dimensions
Authors	Mohammadreza Soltani, Chinmay Hegde
Abstract	We consider the demixing problem of two (or more) high-dimensional vectors from nonlinear observations when the number of such observations is far less than the ambient dimension of the underlying vectors. Specifically, we demonstrate an algorithm that stably estimates the underlying components under general structured sparsity assumptions on these components. In particular, we show that for certain types of structured superposition models, our method provably recovers the components given merely $n = \mathcal{O}(s)$ samples where $s$ denotes the number of nonzero entries in the underlying components. Moreover, our method achieves a fast (linear) convergence rate, and also exhibits fast (near-linear) per-iteration complexity for certain types of structured models. We also provide a range of simulations to illustrate the performance of the proposed algorithm.
Tasks
Published	2017-01-23
URL	http://arxiv.org/abs/1701.06597v1
PDF	http://arxiv.org/pdf/1701.06597v1.pdf
PWC	https://paperswithcode.com/paper/iterative-thresholding-for-demixing
Repo
Framework
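
The thresholding step that gives this family of methods its name can be sketched as follows. This is plain $s$-sparse hard thresholding, shown only as an illustration: the paper works with structured sparsity models and a full demixing iteration, neither of which is reproduced here, and the function name and test vector are hypothetical.

```python
def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    if s <= 0:
        return [0.0] * len(v)
    # Indices sorted by decreasing magnitude; the first s are kept.
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:s])
    return [v[i] if i in keep else 0.0 for i in range(len(v))]


v = [0.1, -3.0, 0.4, 2.5, -0.2]  # hypothetical intermediate estimate
sparse_v = hard_threshold(v, 2)
print(sparse_v)
```

In an iterative thresholding scheme a projection like this is usually applied after each gradient-style update, so that every iterate stays within the assumed sparsity model.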

Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features


Title	Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features
Authors	Sharath Adavanne, Giambattista Parascandolo, Pasi Pertilä, Toni Heittola, Tuomas Virtanen
Abstract	In this paper, we propose the use of spatial and harmonic features in combination with a long short term memory (LSTM) recurrent neural network (RNN) for the automatic sound event detection (SED) task. Real life sound recordings typically have many overlapping sound events, making it hard to recognize them with just mono channel audio. Human listeners have been successfully recognizing the mixture of overlapping sound events using pitch cues and exploiting the stereo (multichannel) audio signal available at their ears to spatially localize these events. Traditionally SED systems have only been using mono channel audio; motivated by the human listener, we propose to extend them to use multichannel audio. The proposed SED system is compared against the state of the art mono channel method on the development subset of TUT sound events detection 2016 database. The usage of spatial and harmonic features is shown to improve the performance of SED.
Tasks	Sound Event Detection
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02293v1
PDF	http://arxiv.org/pdf/1706.02293v1.pdf
PWC	https://paperswithcode.com/paper/sound-event-detection-in-multichannel-audio
Repo
Framework