January 27, 2020

Paper Group ANR 1328

Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation

Title Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation
Authors Tingle Li, Jiawei Chen, Haowen Hou, Ming Li
Abstract Recent studies in deep learning-based source separation follow two major approaches: one models the signal in the spectrogram domain and the other in the time domain, but both have relied on pure CNN or LSTM architectures. In this paper, we propose a Sliced Attention-based neural network (Sams-Net) in the spectrogram domain for the music source separation task, which allows feature interactions from the magnitude spectrogram to contribute differently to the separation. Sams-Net has two main advantages: it is easier to parallelize than an LSTM, and it has a larger receptive field than a CNN. Experiments indicate that the proposed Sams-Net outperforms most state-of-the-art methods, even though it has fewer parameters.
Tasks Music Source Separation
Published 2019-09-12
URL https://arxiv.org/abs/1909.05746v3
PDF https://arxiv.org/pdf/1909.05746v3.pdf
PWC https://paperswithcode.com/paper/tf-attention-net-an-end-to-end-neural-network
Repo
Framework
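
The abstract does not spell out how the sliced attention is computed, so the following is only a generic sketch of scaled dot-product attention, the standard building block that attention-based models rely on. It illustrates why every time frame can attend to every other frame (a wide receptive field) and why each query row can be computed independently of the others. The function names and the toy inputs are assumptions for illustration, not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Generic scaled dot-product attention (not the paper's sliced variant).

    queries, keys: lists of vectors with the same dimension d.
    values: list of vectors, one per key.
    Returns one output vector per query.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:  # each query row is independent, hence easy to parallelize
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(len(values[0]))]
        )
    return outputs


# Toy spectrogram-like input: three frames with two features each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(frames, frames, frames)
```

Each output frame in `mixed` is a weighted average of all input frames, with weights that depend on frame similarity.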

The Mode of Computing

Title The Mode of Computing
Authors Luis A. Pineda
Abstract The Turing Machine is the paradigmatic case of computing machines, but there are others, such as Artificial Neural Networks, Table Computing, Relational-Indeterminate Computing and diverse forms of analogical computing, each of which is based on a particular underlying intuition of the phenomenon of computing. This variety can be captured in terms of system levels, re-interpreting and generalizing Newell’s hierarchy, which includes the knowledge level at the top and the symbol level immediately below it. In this re-interpretation the knowledge level consists of human knowledge and the symbol level is generalized into a new level that here is called The Mode of Computing. Natural computing performed by the brains of humans and non-human animals with a sufficiently developed neural system should be understood in terms of a hierarchy of system levels too. By analogy with standard computing machinery, there must be a system level above the neural circuitry levels and directly below the knowledge level, which is named here The Mode of Natural Computing. A central question for Cognition is the characterization of this mode. The Mode of Computing provides a novel perspective on the phenomena of computing, interpreting, the representational and non-representational views of cognition, and consciousness.
Tasks
Published 2019-03-25
URL https://arxiv.org/abs/1903.10559v2
PDF https://arxiv.org/pdf/1903.10559v2.pdf
PWC https://paperswithcode.com/paper/the-mode-of-computing
Repo
Framework

Multi-Kernel Prediction Networks for Denoising of Burst Images

Title Multi-Kernel Prediction Networks for Denoising of Burst Images
Authors Talmaj Marinč, Vignesh Srinivasan, Serhan Gül, Cornelius Hellge, Wojciech Samek
Abstract In low-light or short-exposure photography the image is often corrupted by noise. While longer exposure helps reduce the noise, it can produce blurry results due to object and camera motion. The reconstruction of a noiseless image is an ill-posed problem. Recent approaches for image denoising aim to predict kernels which are convolved with a set of successively taken images (burst) to obtain a clear image. We propose a deep neural network-based approach called Multi-Kernel Prediction Networks (MKPN) for burst image denoising. MKPN predicts kernels of not just one size but of varying sizes and performs fusion of these different kernels, resulting in one kernel per pixel. The advantages of our method are twofold: (a) the differently sized kernels help in extracting different information from the image, which results in better reconstruction, and (b) kernel fusion ensures that the extracted information is retained while maintaining computational efficiency. Experimental results reveal that MKPN outperforms the state of the art on our synthetic datasets with different noise levels.
Tasks Denoising, Image Denoising
Published 2019-02-05
URL http://arxiv.org/abs/1902.05392v1
PDF http://arxiv.org/pdf/1902.05392v1.pdf
PWC https://paperswithcode.com/paper/multi-kernel-prediction-networks-for
Repo
Framework
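
To make the idea of "one kernel per pixel" concrete, here is a minimal sketch of how per-pixel kernels could be applied to a burst once a network has predicted them. It assumes square kernels of odd size, edge replication at the image border, and averaging across frames; these choices, the function name and the toy data are assumptions for illustration. The prediction network and the fusion of differently sized kernels are not shown, because the abstract does not specify them.

```python
def apply_per_pixel_kernels(burst, kernels):
    """Filter a burst with a separate K x K kernel at every pixel.

    burst[n][y][x]   -> pixel value of frame n.
    kernels[n][y][x] -> K x K nested list of weights for that pixel (K odd).
    Returns one image: the per-frame filtered results averaged over frames.
    """
    n_frames = len(burst)
    height, width = len(burst[0]), len(burst[0][0])
    k = len(kernels[0][0][0])
    radius = k // 2
    output = [[0.0] * width for _ in range(height)]
    for n in range(n_frames):
        for y in range(height):
            for x in range(width):
                acc = 0.0
                for dy in range(k):
                    for dx in range(k):
                        # Clamp coordinates so border pixels reuse the nearest valid pixel.
                        yy = min(max(y + dy - radius, 0), height - 1)
                        xx = min(max(x + dx - radius, 0), width - 1)
                        acc += kernels[n][y][x][dy][dx] * burst[n][yy][xx]
                output[y][x] += acc / n_frames
    return output


# Toy burst of two 2x2 frames with identity kernels (weight 1 at the centre).
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
burst = [[[1.0, 2.0], [3.0, 4.0]], [[3.0, 4.0], [5.0, 6.0]]]
kernels = [[[identity for _ in range(2)] for _ in range(2)] for _ in range(2)]
denoised = apply_per_pixel_kernels(burst, kernels)
```

With identity kernels the result reduces to the plain average of the frames, which is a convenient sanity check before plugging in learned kernels.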

Learning an Effective Equivariant 3D Descriptor Without Supervision

Title Learning an Effective Equivariant 3D Descriptor Without Supervision
Authors Riccardo Spezialetti, Samuele Salti, Luigi Di Stefano
Abstract Establishing correspondences between 3D shapes is a fundamental task in 3D Computer Vision, typically addressed by matching local descriptors. Recently, a few attempts at applying the deep learning paradigm to the task have shown promising results. Yet, the only explored way to learn rotation invariant descriptors has been to feed neural networks with highly engineered and invariant representations provided by existing hand-crafted descriptors, a path that goes in the opposite direction of end-to-end learning from raw data so successfully deployed for 2D images. In this paper, we explore the benefits of taking a step back in the direction of end-to-end learning of 3D descriptors by disentangling the creation of a robust and distinctive rotation equivariant representation, which can be learned from unoriented input data, and the definition of a good canonical orientation, required only at test time to obtain an invariant descriptor. To this end, we leverage two recent innovations: spherical convolutional neural networks to learn an equivariant descriptor and plane folding decoders to learn without supervision. The effectiveness of the proposed approach is experimentally validated by outperforming hand-crafted and learned descriptors on a standard benchmark.
Tasks
Published 2019-09-15
URL https://arxiv.org/abs/1909.06887v1
PDF https://arxiv.org/pdf/1909.06887v1.pdf
PWC https://paperswithcode.com/paper/learning-an-effective-equivariant-3d
Repo
Framework

Towards Pure End-to-End Learning for Recognizing Multiple Text Sequences from an Image

Title Towards Pure End-to-End Learning for Recognizing Multiple Text Sequences from an Image
Authors Xu Zhenlong, Zhou shuigeng, Cheng zhanzhan, Bai fan, Niu yi, Pu shiliang
Abstract Here we address a challenging problem: recognizing multiple text sequences from an image by pure end-to-end learning. The problem is twofold: 1) Multiple text sequence recognition. Each image may contain multiple text sequences of different content, location and orientation, and we try to recognize all the text sequences contained in the image. 2) Pure end-to-end (PEE) learning. We solve the problem in a pure end-to-end learning way where each training image is labeled by only text transcripts of all contained sequences, without any geometric annotations. Most existing works recognize multiple text sequences from an image in a non-end-to-end (NEE) or quasi-end-to-end (QEE) way, in which each image is trained with both text transcripts and text locations. Only recently, a PEE method was proposed to recognize text sequences from an image where the text sequence was split into several lines in the image. However, it cannot be directly applied to recognizing multiple text sequences from an image. So in this paper, we propose a pure end-to-end learning method to recognize multiple text sequences from an image. Our method directly learns multiple sequences of probability distribution conditioned on each input image, and outputs multiple text transcripts with a well-designed decoding strategy. To evaluate the proposed method, we constructed several datasets mainly based on an existing public dataset and two real application scenarios. Experimental results show that the proposed method can effectively recognize multiple text sequences from images, and outperforms CTC-based and attention-based baseline methods.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1907.12791v1
PDF https://arxiv.org/pdf/1907.12791v1.pdf
PWC https://paperswithcode.com/paper/towards-pure-end-to-end-learning-for
Repo
Framework

Non-Parametric Inference Adaptive to Intrinsic Dimension

Title Non-Parametric Inference Adaptive to Intrinsic Dimension
Authors Khashayar Khosravi, Greg Lewis, Vasilis Syrgkanis
Abstract We consider non-parametric estimation and inference of conditional moment models in high dimensions. We show that even when the dimension $D$ of the conditioning variable is larger than the sample size $n$, estimation and inference are feasible as long as the distribution of the conditioning variable has small intrinsic dimension $d$, as measured by locally low doubling measures. Our estimation is based on a sub-sampled ensemble of the $k$-nearest neighbors ($k$-NN) $Z$-estimator. We show that if the intrinsic dimension of the covariate distribution is equal to $d$, then the finite sample estimation error of our estimator is of order $n^{-1/(d+2)}$ and our estimate is $n^{1/(d+2)}$-asymptotically normal, irrespective of $D$. The sub-sampling size required for achieving these results depends on the unknown intrinsic dimension $d$. We propose an adaptive data-driven approach for choosing this parameter and prove that it achieves the desired rates. We discuss extensions and applications to heterogeneous treatment effect estimation.
Tasks
Published 2019-01-11
URL https://arxiv.org/abs/1901.03719v3
PDF https://arxiv.org/pdf/1901.03719v3.pdf
PWC https://paperswithcode.com/paper/non-parametric-inference-adaptive-to
Repo
Framework
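
As a rough illustration of a sub-sampled $k$-NN ensemble, the sketch below handles only the simplest conditional moment, the conditional mean $E[Y \mid X = x]$. It draws several sub-samples without replacement, computes a $k$-NN average on each, and averages the results. The function names, the fixed sub-sample size and the toy data are assumptions; the adaptive, data-driven choice of the sub-sample size described in the abstract is not implemented.

```python
import math
import random


def knn_mean(xs, ys, query, k):
    """Average the responses of the k points in xs closest to query."""
    order = sorted(range(len(xs)), key=lambda i: math.dist(xs[i], query))
    nearest = order[:k]
    return sum(ys[i] for i in nearest) / len(nearest)


def subsampled_knn(xs, ys, query, k, subsample_size, n_subsamples, seed=0):
    """Average k-NN estimates computed on random sub-samples (no replacement)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_subsamples):
        idx = rng.sample(range(len(xs)), subsample_size)
        estimates.append(
            knn_mean([xs[i] for i in idx], [ys[i] for i in idx], query, k)
        )
    return sum(estimates) / len(estimates)


# Toy data: the response depends only on the first of five coordinates.
rng = random.Random(1)
xs = [[rng.random() for _ in range(5)] for _ in range(200)]
ys = [2.0 * x[0] for x in xs]
estimate = subsampled_knn(xs, ys, [0.5] * 5, k=5, subsample_size=50, n_subsamples=20)
```

Smaller sub-samples push the selected neighbours further from the query point, so the sub-sample size controls a bias-variance trade-off. This is why the abstract treats the choice of that size as a central question.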

Classification with the matrix-variate-$t$ distribution

Title Classification with the matrix-variate-$t$ distribution
Authors Geoffrey Z. Thompson, Ranjan Maitra, William Q. Meeker, Ashraf Bastawros
Abstract Matrix-variate distributions can intuitively model the dependence structure of matrix-valued observations that arise in applications with multivariate time series, spatio-temporal or repeated measures. This paper develops an Expectation-Maximization algorithm for discriminant analysis and classification with matrix-variate $t$-distributions. The methodology shows promise on simulated datasets and when applied to the forensic matching of fractured surfaces or the classification of functional Magnetic Resonance, satellite or hand gesture images.
Tasks Time Series
Published 2019-07-22
URL https://arxiv.org/abs/1907.09565v2
PDF https://arxiv.org/pdf/1907.09565v2.pdf
PWC https://paperswithcode.com/paper/classification-with-the-matrix-variate-t
Repo
Framework

Unsupervised Deep Features for Privacy Image Classification

Title Unsupervised Deep Features for Privacy Image Classification
Authors Chiranjibi Sitaula, Yong Xiang, Sunil Aryal, Xuequan Lu
Abstract Sharing images online poses security threats to a wide range of users, who are often unaware of the private information the images contain. Deep features have been demonstrated to be a powerful representation for images. However, deep features are usually large and require a huge amount of data for fine-tuning. In contrast to normal images (e.g., scene images), privacy images are often limited in number because of the sensitive information they contain. In this paper, we propose a novel approach that can work on limited data and generate deep features of smaller size. For training images, we first extract the initial deep features from the pre-trained model and then employ the K-means clustering algorithm to learn the centroids of these initial deep features. We use the learned centroids from training features to extract the final features for each testing image and encode our final features with the triangle encoding. To improve the discriminability of the features, we further perform the fusion of two proposed unsupervised deep features obtained from different layers. Experimental results show that the proposed features outperform state-of-the-art deep features, in terms of both classification accuracy and testing time.
Tasks Image Classification
Published 2019-09-24
URL https://arxiv.org/abs/1909.10708v1
PDF https://arxiv.org/pdf/1909.10708v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-deep-features-for-privacy-image
Repo
Framework
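
The encoding step described in the abstract can be sketched as follows. The abstract does not define "triangle encoding", so this assumes the common soft K-means formulation in which each output component is max(0, mean distance to all centroids minus distance to centroid k). The centroids here are hard-coded stand-ins for ones learned by K-means, and the function name is hypothetical.

```python
import math


def triangle_encode(feature, centroids):
    """Encode a feature vector against K centroids (assumed triangle encoding).

    Component k is max(0, mean_distance - distance_to_centroid_k), so centroids
    farther away than average contribute exactly zero.
    """
    distances = [math.dist(feature, c) for c in centroids]
    mean_distance = sum(distances) / len(distances)
    return [max(0.0, mean_distance - d) for d in distances]


# Stand-in centroids; in the described pipeline these would come from K-means
# run on deep features extracted from the training images.
centroids = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]]
code = triangle_encode([0.0, 0.0], centroids)
```

The output has one component per centroid, so its length is set by the number of clusters and not by the size of the original deep feature, which is consistent with the smaller feature size the abstract mentions.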

Capsule-Based Persian/Arabic Robust Handwritten Digit Recognition Using EM Routing

Title Capsule-Based Persian/Arabic Robust Handwritten Digit Recognition Using EM Routing
Authors Ali Ghofrani, Rahil Mahdian Toroghi
Abstract In this paper, the problem of handwritten digit recognition is addressed. The underlying language is Persian/Arabic, and the system used for this task is a capsule network (CapsNet), which has recently emerged as a more advanced architecture than its ancestor, the CNN (Convolutional Neural Network). The architecture is trained on the Hoda dataset, which has been provided for Persian/Arabic handwritten digits. The output of the system clearly outperforms the results achieved by its ancestors, as well as other previously presented recognition algorithms.
Tasks Handwritten Digit Recognition
Published 2019-12-08
URL https://arxiv.org/abs/1912.03634v2
PDF https://arxiv.org/pdf/1912.03634v2.pdf
PWC https://paperswithcode.com/paper/capsule-based-persianarabic-robust
Repo
Framework

Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning

Title Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning
Authors Andrew Cohen, Lei Yu, Xingye Qiao, Xiangrong Tong
Abstract Two hitherto disconnected threads of research, diverse exploration (DE) and maximum entropy RL, have addressed a wide range of problems facing reinforcement learning algorithms via ostensibly distinct mechanisms. In this work, we identify a connection between these two approaches. First, a discriminator-based diversity objective is put forward and connected to commonly used divergence measures. We then extend this objective to the maximum entropy framework and propose an algorithm, Maximum Entropy Diverse Exploration (MEDE), which provides a principled method to learn diverse behaviors. A theoretical investigation shows that the set of policies learned by MEDE captures the same modalities as the optimal maximum entropy policy. In effect, the proposed algorithm disentangles the maximum entropy policy into its diverse, constituent policies. Experiments show that MEDE is superior to the state of the art in learning high-performing and diverse policies.
Tasks
Published 2019-11-03
URL https://arxiv.org/abs/1911.00828v1
PDF https://arxiv.org/pdf/1911.00828v1.pdf
PWC https://paperswithcode.com/paper/maximum-entropy-diverse-exploration
Repo
Framework

Sparse associative memory based on contextual code learning for disambiguating word senses

Title Sparse associative memory based on contextual code learning for disambiguating word senses
Authors Max Raphael Sobroza, Tales Marra, Deok-Hee Kim-Dufor, Claude Berrou
Abstract In recent literature, contextual pretrained Language Models (LMs) have demonstrated their potential in generalizing knowledge to several Natural Language Processing (NLP) tasks, including supervised Word Sense Disambiguation (WSD), a challenging problem in the field of Natural Language Understanding (NLU). However, word representations from these models are still very dense, costly in terms of memory footprint, and minimally interpretable. To address these issues, we propose a new supervised, biologically inspired technique for transferring large pre-trained language model representations into a compressed representation, for the case of WSD. The resulting representation helps increase the general interpretability of the framework and decrease its memory footprint, while enhancing performance.
Tasks Language Modelling, Word Sense Disambiguation
Published 2019-11-14
URL https://arxiv.org/abs/1911.06415v1
PDF https://arxiv.org/pdf/1911.06415v1.pdf
PWC https://paperswithcode.com/paper/sparse-associative-memory-based-on-contextual
Repo
Framework

Hyp-RL : Hyperparameter Optimization by Reinforcement Learning

Title Hyp-RL : Hyperparameter Optimization by Reinforcement Learning
Authors Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme
Abstract Hyperparameter tuning is an omnipresent problem in machine learning as it is an integral aspect of obtaining state-of-the-art performance for any model. Most often, hyperparameters are optimized just by training a model on a grid of possible hyperparameter values and taking the one that performs best on a validation sample (grid search). More recently, methods have been introduced that build a so-called surrogate model that predicts the validation loss for a specific hyperparameter setting, model and dataset, and then sequentially select the next hyperparameter to test based on a heuristic function of the expected value and the uncertainty of the surrogate model, called the acquisition function (sequential model-based Bayesian optimization, SMBO). In this paper we model the hyperparameter optimization problem as a sequential decision problem (which hyperparameter to test next) and address it with reinforcement learning. This way our model does not have to rely on a heuristic acquisition function like SMBO, but can learn which hyperparameters to test next based on the subsequent reduction in validation loss they will eventually lead to, either because they yield good models themselves or because they allow the hyperparameter selection policy to build a better surrogate model that is able to choose better hyperparameters later on. Experiments on a large battery of 50 data sets demonstrate that our method outperforms the state-of-the-art approaches for hyperparameter learning.
Tasks Hyperparameter Optimization
Published 2019-06-27
URL https://arxiv.org/abs/1906.11527v1
PDF https://arxiv.org/pdf/1906.11527v1.pdf
PWC https://paperswithcode.com/paper/hyp-rl-hyperparameter-optimization-by
Repo
Framework
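
For reference, the grid-search baseline that the abstract contrasts against can be sketched in a few lines. The hyperparameter names and the `toy_loss` function are hypothetical placeholders for training a model and measuring its validation loss; this is the simple baseline, not the reinforcement learning method the paper proposes.

```python
from itertools import product


def grid_search(grid, validation_loss):
    """Evaluate every combination in the grid and keep the lowest-loss one."""
    names = list(grid)
    best_config, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        loss = validation_loss(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


def toy_loss(config):
    """Hypothetical stand-in for 'train a model, return its validation loss'."""
    return (config["learning_rate"] - 0.1) ** 2 + (config["depth"] - 3) ** 2


grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [1, 3, 5]}
best_config, best_loss = grid_search(grid, toy_loss)
```

Grid search evaluates every combination regardless of earlier results. Sequential approaches such as SMBO, or the reinforcement learning formulation in the abstract, instead use the losses observed so far to decide which configuration to try next.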

A Dynamic Modelling Framework for Human Hand Gesture Task Recognition

Title A Dynamic Modelling Framework for Human Hand Gesture Task Recognition
Authors Sara Masoud, Bijoy Chowdhury, Young-Jun Son, Chieri Kubota, Russell Tronstad
Abstract Gesture recognition and hand motion tracking are important tasks in advanced gesture-based interaction systems. In this paper, we propose to apply a sliding-window filtering approach to sample the incoming streams of data from data gloves and a decision tree model to recognize the gestures in real time for a manual grafting operation in a vegetable seedling propagation facility. The sequence of these recognized gestures defines the tasks that are taking place, which helps to evaluate individuals’ performance and to identify any bottlenecks in real time. In this work, two pairs of data gloves are utilized, which report the location of the fingers, hands, and wrists wirelessly (i.e., via Bluetooth). To evaluate the performance of the proposed framework, a preliminary experiment was conducted in multiple lab settings of tomato grafting operations, where multiple subjects wore the data gloves while performing different tasks. Our results show an average accuracy of 91% for real-time gesture recognition with the proposed framework.
Tasks Gesture Recognition
Published 2019-11-10
URL https://arxiv.org/abs/1911.03923v2
PDF https://arxiv.org/pdf/1911.03923v2.pdf
PWC https://paperswithcode.com/paper/a-dynamic-modelling-framework-for-human-hand
Repo
Framework
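
The windowing step in the abstract can be sketched as below: a generator cuts a stream of glove readings into overlapping fixed-size windows, and each window is then passed to a classifier. The window size, step, the one-dimensional readings, the gesture labels and the single-threshold `classify` rule are all hypothetical; the rule merely stands in for the trained decision tree mentioned in the abstract.

```python
from collections import deque
from statistics import mean


def sliding_windows(stream, size, step):
    """Yield overlapping windows of `size` samples, advancing by `step`."""
    window = deque(maxlen=size)
    for i, sample in enumerate(stream):
        window.append(sample)
        # The window starting at index i - size + 1 is emitted every `step` samples.
        if len(window) == size and (i - size + 1) % step == 0:
            yield list(window)


def classify(window, threshold=0.5):
    """Hypothetical one-split rule standing in for a trained decision tree."""
    return "grasp" if mean(window) > threshold else "rest"


# Toy one-dimensional sensor stream (for example, a normalized finger-bend value).
readings = [0.1, 0.2, 0.1, 0.9, 0.8, 0.9, 0.1, 0.1]
labels = [classify(w) for w in sliding_windows(readings, size=4, step=2)]
```

Because the generator only keeps the most recent `size` samples, it can run on a live stream, which matches the real-time setting described in the abstract.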

Fusing heterogeneous data sets

Title Fusing heterogeneous data sets
Authors Yipeng Song
Abstract In systems biology, it is common to measure biochemical entities at different levels of the same biological system. One of the central problems for the data fusion of such data sets is the heterogeneity of the data. This thesis discusses two types of heterogeneity. The first is the type of data, such as metabolomics, proteomics and RNAseq data in genomics. These different omics data reflect the properties of the studied biological system from different perspectives. The second is the type of scale, which refers to measurements obtained at different scales, such as binary, ordinal, interval and ratio-scaled variables. In this thesis, we developed several statistical methods capable of fusing data sets with these two types of heterogeneity. The advantages of the proposed methods in comparison with other approaches are assessed using comprehensive simulations as well as the analysis of real biological data sets.
Tasks
Published 2019-08-23
URL https://arxiv.org/abs/1908.09653v1
PDF https://arxiv.org/pdf/1908.09653v1.pdf
PWC https://paperswithcode.com/paper/fusing-heterogeneous-data-sets
Repo
Framework

Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives

Title Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives
Authors Won Ik Cho, Young Ki Moon, Sangwhan Moon, Seok Min Kim, Nam Soo Kim
Abstract Modern dialog managers face the challenge of having to demonstrate human-level conversational skills as part of common user expectations, including but not limited to discourse with no clear objective. Along with these requirements, agents are expected to extrapolate intent from the user’s dialogue even when subjected to non-canonical forms of speech. This depends on the agent’s comprehension of paraphrased forms of such utterances. In low-resource languages, the lack of data is a bottleneck that prevents advances in the comprehension performance of these types of agents. In this paper, we demonstrate the necessity of being able to extract the intent argument of non-canonical directives, and also define guidelines for building paired corpora for this purpose. Following the guidelines, we label a dataset consisting of 30K instances of question/command-intent pairs, including annotations for a classification task for predicting the utterance type. We also propose a method for mitigating class imbalance in the final dataset, and demonstrate the potential applications of the corpus generation method and dataset.
Tasks
Published 2019-12-01
URL https://arxiv.org/abs/1912.00342v1
PDF https://arxiv.org/pdf/1912.00342v1.pdf
PWC https://paperswithcode.com/paper/machines-getting-with-the-program
Repo
Framework