Paper Group ANR 255
Gated Recurrent Neural Tensor Network
Title | Gated Recurrent Neural Tensor Network |
Authors | Andros Tjandra, Sakriani Sakti, Ruli Manurung, Mirna Adriani, Satoshi Nakamura |
Abstract | Recurrent Neural Networks (RNNs), which are a powerful scheme for modeling temporal and sequential data, need to capture long-term dependencies in datasets and represent them in hidden layers with a powerful model to capture more information from inputs. For modeling long-term dependencies in a dataset, the gating mechanism concept can help RNNs remember and forget previous information. Representing the hidden layers of an RNN with more expressive operations (i.e., tensor products) helps it learn a more complex relationship between the current input and the previous hidden layer information. These ideas can generally improve RNN performance. In this paper, we propose a novel RNN architecture that combines the concepts of the gating mechanism and the tensor product into a single model. By combining these two concepts into a single RNN, our proposed models learn long-term dependencies by modeling with gating units and obtain more expressive and direct interaction between input and hidden layers using a tensor product on 3-dimensional array (tensor) weight parameters. We use Long Short Term Memory (LSTM) RNN and Gated Recurrent Unit (GRU) RNN and combine them with a tensor product inside their formulations. Our proposed RNNs, which are called a Long-Short Term Memory Recurrent Neural Tensor Network (LSTMRNTN) and Gated Recurrent Unit Recurrent Neural Tensor Network (GRURNTN), are made by combining the LSTM and GRU RNN models with the tensor product. We conducted experiments with our proposed models on word-level and character-level language modeling tasks and revealed that our proposed models significantly improved their performance compared to our baseline models. |
Tasks | Language Modelling |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02222v1 |
http://arxiv.org/pdf/1706.02222v1.pdf | |
PWC | https://paperswithcode.com/paper/gated-recurrent-neural-tensor-network |
Repo | |
Framework | |
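The tensor product mentioned in the abstract above can be illustrated with a generic bilinear form, in which each slice of a 3-dimensional weight array lets every input component interact directly with every hidden component. This is a minimal sketch under stated assumptions, not the formulation from the paper: the function name `bilinear_tensor_product`, the shapes, and the toy weights are hypothetical, and the gating units, bias terms, and nonlinearities are omitted.

```python
# Hypothetical illustration of a bilinear tensor product between an input
# vector x and a previous hidden state h. Not the paper's exact equations.

def bilinear_tensor_product(x, W, h):
    """Return z where z[k] = sum_i sum_j x[i] * W[k][i][j] * h[j].

    W is a 3-dimensional array stored as nested lists with shape
    (output_size, len(x), len(h)); each slice W[k] is one bilinear form.
    """
    return [
        sum(x[i] * slice_k[i][j] * h[j]
            for i in range(len(x))
            for j in range(len(h)))
        for slice_k in W
    ]


# Toy values chosen so the result is easy to check by hand.
x = [1.0, 2.0]            # current input (assumed size 2)
h = [3.0, 4.0, 5.0]       # previous hidden state (assumed size 3)
W = [
    [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # slice 0 keeps only x[0] * h[0]
    [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # slice 1 keeps only x[1] * h[2]
]
z = bilinear_tensor_product(x, W, h)
print(z)
```

In a gated architecture, a term of this kind would presumably be added inside the candidate-state computation alongside the usual linear terms; where exactly it enters is specified in the paper, not here.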
Visual-Interactive Similarity Search for Complex Objects by Example of Soccer Player Analysis
Title | Visual-Interactive Similarity Search for Complex Objects by Example of Soccer Player Analysis |
Authors | Jürgen Bernard, Christian Ritter, David Sessler, Matthias Zeppelzauer, Jörn Kohlhammer, Dieter Fellner |
Abstract | The definition of similarity is a key prerequisite when analyzing complex data types in data mining, information retrieval, or machine learning. However, the meaningful definition is often hampered by the complexity of data objects and particularly by different notions of subjective similarity latent in targeted user groups. Taking the example of soccer players, we present a visual-interactive system that learns users’ mental models of similarity. In a visual-interactive interface, users are able to label pairs of soccer players with respect to their subjective notion of similarity. Our proposed similarity model automatically learns the respective concept of similarity using an active learning strategy. A visual-interactive retrieval technique is provided to validate the model and to execute downstream retrieval tasks for soccer player analysis. The applicability of the approach is demonstrated in different evaluation strategies, including usage scenarios and cross-validation tests. |
Tasks | Active Learning, Information Retrieval |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03385v1 |
http://arxiv.org/pdf/1703.03385v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-interactive-similarity-search-for |
Repo | |
Framework | |
A Generalised Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation
Title | A Generalised Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation |
Authors | Nikolaos Mitianoudis |
Abstract | Directional or circular statistics pertain to the analysis and interpretation of directions or rotations. In this work, a novel probability distribution is proposed to model multidimensional sparse directional data. The Generalised Directional Laplacian Distribution (DLD) is a hybrid between the Laplacian distribution and the von Mises-Fisher distribution. The distribution’s parameters are estimated using Maximum-Likelihood Estimation over a set of training data points. Mixtures of Directional Laplacian Distributions (MDLD) are also introduced in order to model multiple concentrations of sparse directional data. The author explores the application of the derived DLD mixture model to cluster sound sources that exist in an underdetermined instantaneous sound mixture. The proposed model can solve the general K x L (K<L) underdetermined instantaneous source separation problem, offering a fast and stable solution. |
Tasks | |
Published | 2017-08-16 |
URL | http://arxiv.org/abs/1708.04816v1 |
http://arxiv.org/pdf/1708.04816v1.pdf | |
PWC | https://paperswithcode.com/paper/a-generalised-directional-laplacian |
Repo | |
Framework | |
A New 3D Segmentation Methodology for Lumbar Vertebral Bodies for the Measurement of BMD and Geometry
Title | A New 3D Segmentation Methodology for Lumbar Vertebral Bodies for the Measurement of BMD and Geometry |
Authors | Andre Mastmeyer, Klaus Engelke, Willi Kalender |
Abstract | In this paper a new technique is presented that extracts the geometry of lumbar vertebral bodies from spiral CT scans. Our new multi-step segmentation approach yields highly accurate and precise measurement of the bone mineral density (BMD) in different volumes of interest, which are defined relative to a local anatomical coordinate system. The approach also enables the analysis of the geometry of the relevant vertebrae. Intra- and inter-operator precision for segmentation, BMD measurement and position of the coordinate system are below 1.5% in patient data; accuracy errors are below 1.5% for BMD and below 4% for volume in phantom data. The long-term goal of the approach is to improve fracture prediction in osteoporosis. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07143v1 |
http://arxiv.org/pdf/1705.07143v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-3d-segmentation-methodology-for-lumbar |
Repo | |
Framework | |
Mathematics of Deep Learning
Title | Mathematics of Deep Learning |
Authors | Rene Vidal, Joan Bruna, Raja Giryes, Stefano Soatto |
Abstract | Recently there has been a dramatic increase in the performance of recognition systems due to the introduction of deep architectures for representation learning and classification. However, the mathematical reasons for this success remain elusive. This tutorial will review recent work that aims to provide a mathematical justification for several properties of deep networks, such as global optimality, geometric stability, and invariance of the learned representations. |
Tasks | Representation Learning |
Published | 2017-12-13 |
URL | http://arxiv.org/abs/1712.04741v1 |
http://arxiv.org/pdf/1712.04741v1.pdf | |
PWC | https://paperswithcode.com/paper/mathematics-of-deep-learning |
Repo | |
Framework | |
Collaborative vehicle routing: a survey
Title | Collaborative vehicle routing: a survey |
Authors | Margaretha Gansterer, Richard F. Hartl |
Abstract | In horizontal collaborations, carriers form coalitions in order to perform parts of their logistics operations jointly. By exchanging transportation requests among each other, they can operate more efficiently and in a more sustainable way. Collaborative vehicle routing has been extensively discussed in the literature. We identify three major streams of research: (i) centralized collaborative planning, (ii) decentralized planning without auctions, and (iii) auction-based decentralized planning. For each of them we give a structured overview of the state of knowledge and discuss future research directions. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.05254v1 |
http://arxiv.org/pdf/1706.05254v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-vehicle-routing-a-survey |
Repo | |
Framework | |
Ensembles of Deep LSTM Learners for Activity Recognition using Wearables
Title | Ensembles of Deep LSTM Learners for Activity Recognition using Wearables |
Authors | Yu Guan, Thomas Ploetz |
Abstract | Recently, deep learning (DL) methods have been introduced very successfully into human activity recognition (HAR) scenarios in ubiquitous and wearable computing. In particular, the prospect of overcoming the need for manual feature design, combined with superior classification capabilities, renders deep neural networks very attractive for real-life HAR applications. Even though DL-based approaches now outperform the state-of-the-art in a number of recognition tasks of the field, substantial challenges remain. Most prominently, issues with real-life datasets, typically including imbalanced datasets and problematic data quality, still limit the effectiveness of activity recognition using wearables. In this paper we tackle such challenges through Ensembles of deep Long Short Term Memory (LSTM) networks. We have developed modified training procedures for LSTM networks and combine sets of diverse LSTM learners into classifier collectives. We demonstrate, both formally and empirically, that Ensembles of deep LSTM learners outperform the individual LSTM networks. Through an extensive experimental evaluation on three standard benchmarks (Opportunity, PAMAP2, Skoda) we demonstrate the excellent recognition capabilities of our approach and its potential for real-life applications of human activity recognition. |
Tasks | Activity Recognition, Human Activity Recognition |
Published | 2017-03-28 |
URL | http://arxiv.org/abs/1703.09370v1 |
http://arxiv.org/pdf/1703.09370v1.pdf | |
PWC | https://paperswithcode.com/paper/ensembles-of-deep-lstm-learners-for-activity |
Repo | |
Framework | |
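The abstract above describes combining sets of diverse LSTM learners into classifier collectives. One common way to combine classifiers is to average their per-class probability scores and pick the class with the highest fused score; the sketch below shows that rule only. Whether this matches the fusion rule used in the paper is an assumption, the member scores are made-up toy values, and the function names `fuse_by_averaging` and `predict` are hypothetical.

```python
# Hypothetical score-level fusion for an ensemble of classifiers.
# Each member contributes one probability vector over the same classes.

def fuse_by_averaging(member_probs):
    """Average the class-probability vectors of all ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [
        sum(probs[c] for probs in member_probs) / n_members
        for c in range(n_classes)
    ]


def predict(member_probs):
    """Return the index of the class with the highest fused score."""
    fused = fuse_by_averaging(member_probs)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy outputs of three members for one window of sensor data (3 classes).
members = [
    [0.6, 0.3, 0.1],  # member 0 prefers class 0
    [0.2, 0.5, 0.3],  # member 1 prefers class 1
    [0.1, 0.6, 0.3],  # member 2 prefers class 1
]
print(predict(members))
```

In this toy case the first member disagrees with the other two, and the averaged scores still favour class 1, which is the kind of error correction an ensemble is meant to provide.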
Fast and Efficient Calculations of Structural Invariants of Chirality
Title | Fast and Efficient Calculations of Structural Invariants of Chirality |
Authors | He Zhang, Hanlin Mo, You Hao, Shirui Li, Hua Li |
Abstract | Chirality plays an important role in physics, chemistry, biology, and other fields. It describes an essential symmetry in structure. However, chirality invariants are usually complicated in expression or difficult to evaluate. In this paper, we present five general three-dimensional chirality invariants based on the generating functions. The five chiral invariants have four characteristics: (1) They play an important role in the detection of symmetry, especially in the treatment of the ‘false zero’ problem. (2) Three of the five chiral invariants decode a universal chirality index. (3) Three of them are proposed for the first time. (4) The five chiral invariants have low order (no bigger than 4), brief expressions, and low time complexity O(n), and can act as descriptors of three-dimensional objects in shape analysis. The five chiral invariants give a geometric view for studying chiral invariants. The experiments show that the five chirality invariants are effective and efficient; they can be used as a tool for symmetry detection or as features in shape analysis. |
Tasks | |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1711.05866v2 |
http://arxiv.org/pdf/1711.05866v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-efficient-calculations-of-structural |
Repo | |
Framework | |
Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery
Title | Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery |
Authors | Namrata Vaswani, Thierry Bouwmans, Sajid Javed, Praneeth Narayanamurthy |
Abstract | PCA is one of the most widely used dimension reduction techniques. A related easier problem is “subspace learning” or “subspace estimation”. Given relatively clean data, both are easily solved via singular value decomposition (SVD). The problem of subspace learning or PCA in the presence of outliers is called robust subspace learning or robust PCA (RPCA). For long data sequences, if one tries to use a single lower dimensional subspace to represent the data, the required subspace dimension may end up being quite large. For such data, a better model is to assume that it lies in a low-dimensional subspace that can change over time, albeit gradually. The problem of tracking such data (and the subspaces) while being robust to outliers is called robust subspace tracking (RST). This article provides a magazine-style overview of the entire field of robust subspace learning and tracking. In particular, solutions for three problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition (S+LR), RST via S+LR, and “robust subspace recovery (RSR)”. RSR assumes that an entire data vector is either an outlier or an inlier. The S+LR formulation instead assumes that outliers occur on only a few data vector indices and hence are well modeled as sparse corruptions. |
Tasks | Dimensionality Reduction |
Published | 2017-11-26 |
URL | http://arxiv.org/abs/1711.09492v4 |
http://arxiv.org/pdf/1711.09492v4.pdf | |
PWC | https://paperswithcode.com/paper/robust-subspace-learning-robust-pca-robust |
Repo | |
Framework | |
Living Together: Mind and Machine Intelligence
Title | Living Together: Mind and Machine Intelligence |
Authors | Neil D. Lawrence |
Abstract | In this paper we consider the nature of the machine intelligences we have created in the context of our human intelligence. We suggest that the fundamental difference between human and machine intelligence comes down to embodiment factors. We define embodiment factors as the ratio between an entity’s ability to communicate information vs compute information. We speculate on the role of embodiment factors in driving our own intelligence and consciousness. We briefly review dual process models of cognition and cast machine intelligence within that framework, characterising it as a dominant System Zero, which can drive behaviour through interfacing with us subconsciously. Driven by concerns about the consequence of such a system we suggest prophylactic courses of action that could be considered. Our main conclusion is that it is not sentient intelligence we should fear but non-sentient intelligence. |
Tasks | |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07996v1 |
http://arxiv.org/pdf/1705.07996v1.pdf | |
PWC | https://paperswithcode.com/paper/living-together-mind-and-machine-intelligence |
Repo | |
Framework | |
Generating Visual Representations for Zero-Shot Classification
Title | Generating Visual Representations for Zero-Shot Classification |
Authors | Maxime Bucher, Stéphane Herbin, Frédéric Jurie |
Abstract | This paper addresses the task of learning an image classifier when some categories are defined by semantic descriptions only (e.g. visual attributes) while the others are defined by exemplar images as well. This task is often referred to as the Zero-Shot classification task (ZSC). Most of the previous methods rely on learning a common embedding space that allows visual features of unknown categories to be compared with semantic descriptions. This paper argues that these approaches are limited as i) efficient discriminative classifiers can’t be used and ii) classification tasks with seen and unseen categories (Generalized Zero-Shot Classification or GZSC) can’t be addressed efficiently. In contrast, this paper suggests addressing ZSC and GZSC by i) learning a conditional generator using seen classes and ii) generating artificial training examples for the categories without exemplars. ZSC is then turned into a standard supervised learning problem. Experiments with 4 generative models and 5 datasets validate the approach, giving state-of-the-art results on both ZSC and GZSC. |
Tasks | Zero-Shot Learning |
Published | 2017-08-23 |
URL | http://arxiv.org/abs/1708.06975v3 |
http://arxiv.org/pdf/1708.06975v3.pdf | |
PWC | https://paperswithcode.com/paper/generating-visual-representations-for-zero |
Repo | |
Framework | |
Near-optimal linear decision trees for k-SUM and related problems
Title | Near-optimal linear decision trees for k-SUM and related problems |
Authors | Daniel M. Kane, Shachar Lovett, Shay Moran |
Abstract | We construct near-optimal linear decision trees for a variety of decision problems in combinatorics and discrete geometry. For example, for any constant $k$, we construct linear decision trees that solve the $k$-SUM problem on $n$ elements using $O(n \log^2 n)$ linear queries. Moreover, the queries we use are comparison queries, which compare the sums of two $k$-subsets; when viewed as linear queries, comparison queries are $2k$-sparse and have only $\{-1,0,1\}$ coefficients. We give similar constructions for sorting sumsets $A+B$ and for solving the SUBSET-SUM problem, both with optimal number of queries, up to poly-logarithmic terms. Our constructions are based on the notion of “inference dimension”, recently introduced by the authors in the context of active classification with comparison queries. This can be viewed as another contribution to the fruitful link between machine learning and discrete geometry, which goes back to the discovery of the VC dimension. |
Tasks | |
Published | 2017-05-04 |
URL | http://arxiv.org/abs/1705.01720v1 |
http://arxiv.org/pdf/1705.01720v1.pdf | |
PWC | https://paperswithcode.com/paper/near-optimal-linear-decision-trees-for-k-sum |
Repo | |
Framework | |
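The abstract above notes that a comparison query between the sums of two $k$-subsets, viewed as a linear query, is $2k$-sparse with coefficients in $\{-1,0,1\}$. The sketch below only illustrates that observation by building the coefficient vector for one query; it is not the decision tree construction from the paper. The function names and the toy input are hypothetical, and the two subsets are assumed to be given as sets of indices.

```python
# Hypothetical illustration: a comparison query "is sum over A larger than
# sum over B?" written as the sign of a sparse linear form <coeffs, x>.

def comparison_query_coefficients(n, subset_a, subset_b):
    """Coefficient vector with +1 on indices in A and -1 on indices in B.

    An index that appears in both subsets cancels to 0, so every entry
    stays in {-1, 0, 1} and at most |A| + |B| entries are nonzero.
    """
    coeffs = [0] * n
    for i in subset_a:
        coeffs[i] += 1
    for j in subset_b:
        coeffs[j] -= 1
    return coeffs


def answer_query(x, coeffs):
    """Return the sign (-1, 0, or 1) of the linear form <coeffs, x>."""
    value = sum(c * xi for c, xi in zip(coeffs, x))
    return (value > 0) - (value < 0)


# Toy instance with n = 6 and k = 2: compare x[0] + x[1] with x[3] + x[5].
x = [5, -2, 7, 1, -4, 3]
coeffs = comparison_query_coefficients(len(x), {0, 1}, {3, 5})
print(coeffs, answer_query(x, coeffs))
```

Here the left sum is 3 and the right sum is 4, so the query reports that the first subset has the smaller sum, and the coefficient vector has exactly $2k = 4$ nonzero entries.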
Linear Time Complexity Deep Fourier Scattering Network and Extension to Nonlinear Invariants
Title | Linear Time Complexity Deep Fourier Scattering Network and Extension to Nonlinear Invariants |
Authors | Randall Balestriero, Herve Glotin |
Abstract | In this paper we propose a scalable version of a state-of-the-art deterministic time-invariant feature extraction approach based on consecutive changes of basis and nonlinearities, namely, the scattering network. The first focus of the paper is to extend the scattering network to allow the use of higher order nonlinearities as well as extracting nonlinear and Fourier based statistics leading to the required invariants of any inherently structured input. In order to reach fast convolutions and to leverage the intrinsic structure of wavelets, we derive our complete model in the Fourier domain. In addition to providing fast computations, we are now able to exploit sparse matrices due to extremely high sparsity well localized in the Fourier domain. As a result, we are able to reach a true linear time complexity with inputs in the Fourier domain, allowing fast and energy efficient solutions to machine learning tasks. Validation of the features and computational results will be presented through the use of these invariant coefficients to perform classification on audio recordings of bird songs captured in multiple different soundscapes. In the end, the applicability of the presented solutions to deep artificial neural networks is discussed. |
Tasks | |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05841v1 |
http://arxiv.org/pdf/1707.05841v1.pdf | |
PWC | https://paperswithcode.com/paper/linear-time-complexity-deep-fourier |
Repo | |
Framework | |
Iterative Thresholding for Demixing Structured Superpositions in High Dimensions
Title | Iterative Thresholding for Demixing Structured Superpositions in High Dimensions |
Authors | Mohammadreza Soltani, Chinmay Hegde |
Abstract | We consider the demixing problem of two (or more) high-dimensional vectors from nonlinear observations when the number of such observations is far less than the ambient dimension of the underlying vectors. Specifically, we demonstrate an algorithm that stably estimates the underlying components under general structured sparsity assumptions on these components. In particular, we show that for certain types of structured superposition models, our method provably recovers the components given merely $n = \mathcal{O}(s)$ samples where $s$ denotes the number of nonzero entries in the underlying components. Moreover, our method achieves a fast (linear) convergence rate, and also exhibits fast (near-linear) per-iteration complexity for certain types of structured models. We also provide a range of simulations to illustrate the performance of the proposed algorithm. |
Tasks | |
Published | 2017-01-23 |
URL | http://arxiv.org/abs/1701.06597v1 |
http://arxiv.org/pdf/1701.06597v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-thresholding-for-demixing |
Repo | |
Framework | |
Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features
Title | Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features |
Authors | Sharath Adavanne, Giambattista Parascandolo, Pasi Pertilä, Toni Heittola, Tuomas Virtanen |
Abstract | In this paper, we propose the use of spatial and harmonic features in combination with a long short term memory (LSTM) recurrent neural network (RNN) for the automatic sound event detection (SED) task. Real life sound recordings typically have many overlapping sound events, making them hard to recognize with just mono channel audio. Human listeners have been successfully recognizing the mixture of overlapping sound events using pitch cues and exploiting the stereo (multichannel) audio signal available at their ears to spatially localize these events. Traditionally, SED systems have used only mono channel audio; motivated by the human listener, we propose to extend them to use multichannel audio. The proposed SED system is compared against the state-of-the-art mono channel method on the development subset of the TUT sound events detection 2016 database. The usage of spatial and harmonic features is shown to improve the performance of SED. |
Tasks | Sound Event Detection |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02293v1 |
http://arxiv.org/pdf/1706.02293v1.pdf | |
PWC | https://paperswithcode.com/paper/sound-event-detection-in-multichannel-audio |
Repo | |
Framework | |