October 19, 2019

3204 words 16 mins read

Paper Group ANR 192



Some New Layer Architectures for Graph CNN

Title Some New Layer Architectures for Graph CNN
Authors Shrey Gadiya, Deepak Anand, Amit Sethi
Abstract While convolutional neural networks (CNNs) have recently made great strides in supervised classification of data structured on a grid (e.g. images composed of pixel grids), in several interesting datasets, the relations between features can be better represented as a general graph instead of a regular grid. Although recent algorithms that adapt CNNs to graphs have shown promising results, they mostly neglect learning explicit operations for edge features while focusing on vertex features alone. We propose new formulations for convolutional, pooling, and fully connected layers for neural networks that make more comprehensive use of the information available in multi-dimensional graphs. Using these layers led to an improvement in classification accuracy over the state-of-the-art methods on benchmark graph datasets.
Tasks
Published 2018-10-31
URL http://arxiv.org/abs/1811.00052v1
PDF http://arxiv.org/pdf/1811.00052v1.pdf
PWC https://paperswithcode.com/paper/some-new-layer-architectures-for-graph-cnn
Repo
Framework
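
The central idea above, a convolution that mixes neighbouring vertex features together with the features of the connecting edges, can be sketched compactly. The weight shapes and the simple sum-then-ReLU aggregation below are illustrative assumptions, not the exact layer formulations proposed in the paper.

```python
import numpy as np

def edge_aware_graph_conv(X, E, A, W_self, W_nbr, W_edge):
    """One illustrative edge-aware graph convolution.

    X: (n, d_v) vertex features; E: (n, n, d_e) edge features (zero where no edge);
    A: (n, n) binary adjacency matrix; W_*: learned weight matrices mapping to d_out.
    """
    out = X @ W_self                          # transform each vertex's own features
    for v in range(X.shape[0]):
        for u in np.nonzero(A[v])[0]:
            # aggregate the neighbour's features together with the connecting edge's features
            out[v] += X[u] @ W_nbr + E[v, u] @ W_edge
    return np.maximum(out, 0.0)               # ReLU non-linearity
```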

Neonatal Pain Expression Recognition Using Transfer Learning

Title Neonatal Pain Expression Recognition Using Transfer Learning
Authors Ghada Zamzmi, Dmitry Goldgof, Rangachar Kasturi, Yu Sun
Abstract Transfer learning using pre-trained Convolutional Neural Networks (CNNs) has been successfully applied to images for different classification tasks. In this paper, we propose a new pipeline for pain expression recognition in neonates using transfer learning. Specifically, we propose to exploit a pre-trained CNN that was originally trained on a relatively similar dataset for face recognition (VGG Face) as well as CNNs that were pre-trained on a relatively different dataset for image classification (iVGG F, M, and S) to extract deep features from neonates’ faces. In the final stage, several supervised machine learning classifiers are trained to classify neonates’ facial expression into pain or no pain expression. The proposed pipeline achieved, on a testing dataset, 0.841 AUC and 90.34% accuracy, which is approximately 7% higher than the accuracy of handcrafted traditional features. We also propose to combine deep features with traditional features and hypothesize that the mixed features would improve pain classification performance. Combining deep features with traditional features achieved 92.71% accuracy and 0.948 AUC. These results show that transfer learning, which is a faster and more practical option than training a CNN from scratch, can be used to extract useful features for pain expression recognition in neonates. They also show that combining deep features with traditional handcrafted features is a good practice for improving the performance of pain expression recognition and possibly of similar applications.
Tasks Face Recognition, Image Classification, Transfer Learning
Published 2018-07-04
URL http://arxiv.org/abs/1807.01631v1
PDF http://arxiv.org/pdf/1807.01631v1.pdf
PWC https://paperswithcode.com/paper/neonatal-pain-expression-recognition-using
Repo
Framework
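
A minimal sketch of the feature-extraction stage described above: a pre-trained CNN with its final classification layer removed yields deep features, and a standard classifier is trained on top. ImageNet VGG-16 and an SVM stand in here for the VGG-Face/iVGG networks and the classifier choices studied in the paper; the neonatal data loading is omitted.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.svm import SVC

# Pre-trained CNN used as a frozen feature extractor (ImageNet VGG-16 stands in for VGG-Face).
backbone = models.vgg16(weights="DEFAULT")
backbone.classifier = torch.nn.Sequential(*list(backbone.classifier.children())[:-1])  # drop 1000-way layer
backbone.eval()

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor(),
                        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])])

def deep_features(pil_images):
    """Return an (N, 4096) array of deep features for a list of PIL face crops."""
    batch = torch.stack([preprocess(img) for img in pil_images])
    with torch.no_grad():
        return backbone(batch).numpy()

# The neonatal face crops and labels are not public here; usage would look like:
# feats = deep_features(train_faces)
# clf = SVC(kernel="rbf", probability=True).fit(feats, train_labels)   # pain vs. no-pain
# probs = clf.predict_proba(deep_features(test_faces))[:, 1]
```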

Data-driven forecasting of solar irradiance

Title Data-driven forecasting of solar irradiance
Authors Pierrick Bruneau, Philippe Pinheiro, Yoann Didry
Abstract This paper describes a flexible approach to short-term prediction of meteorological variables. In particular, we focus on the prediction of solar irradiance one hour ahead, a task that has high practical value when optimizing solar energy resources. As Défi EGC 2018 provides us with time series data for multiple sensors (e.g. solar irradiance, temperature, hygrometry), recorded every minute over two years at 5 geographical sites on La Réunion island, we test the value of using recently observed data as input for prediction models, as well as the performance of models across sites. After describing our data cleaning and normalization process, we combine a variable selection step based on AutoRegressive Integrated Moving Average (ARIMA) models with general-purpose regression techniques such as neural networks and regression trees.
Tasks Time Series
Published 2018-01-10
URL https://arxiv.org/abs/1801.03373v2
PDF https://arxiv.org/pdf/1801.03373v2.pdf
PWC https://paperswithcode.com/paper/data-driven-forecasting-of-solar-irradiance
Repo
Framework
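
A hedged sketch of such a pipeline: build lagged inputs from the minute-level sensor series, optionally let an ARIMA fit on the target guide which lags to keep, and train a general-purpose regressor to predict irradiance one hour ahead. The lag set, the ARIMA order, and the choice of a random forest are illustrative assumptions, not the paper's configuration.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from sklearn.ensemble import RandomForestRegressor

def lagged_dataset(df, target="irradiance", lags=(1, 5, 15, 30, 60), horizon=60):
    """Lagged sensor readings as inputs; the target shifted `horizon` minutes into the future."""
    feats = {f"{col}_lag{l}": df[col].shift(l) for col in df.columns for l in lags}
    X = pd.DataFrame(feats, index=df.index)
    y = df[target].shift(-horizon).rename("y")
    data = pd.concat([X, y], axis=1).dropna()
    return data.drop(columns="y"), data["y"]

# df is a minute-indexed DataFrame with columns such as irradiance, temperature, hygrometry.
# An ARIMA fit on the target can inform which lags are worth keeping (the order is a guess here):
# arima = ARIMA(df["irradiance"], order=(2, 0, 1)).fit(); print(arima.summary())
# X, y = lagged_dataset(df)
# model = RandomForestRegressor(n_estimators=200, n_jobs=-1).fit(X, y)
```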

Rethinking Recurrent Latent Variable Model for Music Composition

Title Rethinking Recurrent Latent Variable Model for Music Composition
Authors Eunjeong Stella Koh, Shlomo Dubnov, Dustin Wright
Abstract We present a model for capturing musical features and creating novel sequences of music, called the Convolutional Variational Recurrent Neural Network. To generate sequential data, the model uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music. Using the sequence-to-sequence model, our generative model can exploit samples from a prior distribution and generate a longer sequence of music. We compare the performance of our proposed model with other types of Neural Networks using the criteria of Information Rate that is implemented by Variable Markov Oracle, a method that allows statistical characterization of musical information dynamics and detection of motifs in a song. Our results suggest that the proposed model has a better statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals.
Tasks
Published 2018-10-07
URL http://arxiv.org/abs/1810.03226v1
PDF http://arxiv.org/pdf/1810.03226v1.pdf
PWC https://paperswithcode.com/paper/rethinking-recurrent-latent-variable-model
Repo
Framework
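
For concreteness, a stripped-down encoder-decoder with a latent bottleneck over note-token sequences is sketched below. GRUs, the layer sizes, and teacher-forced decoding are assumptions, and the convolutional components of the proposed model are omitted.

```python
import torch
import torch.nn as nn

class Seq2SeqMusicVAE(nn.Module):
    """Minimal encoder-decoder with a latent bottleneck for note sequences (illustrative only)."""
    def __init__(self, n_tokens=128, hidden=256, z_dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_tokens, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.to_mu, self.to_logvar = nn.Linear(hidden, z_dim), nn.Linear(hidden, z_dim)
        self.z_to_h = nn.Linear(z_dim, hidden)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_tokens)

    def forward(self, notes):                      # notes: (B, T) integer note tokens
        x = self.embed(notes)
        _, h = self.encoder(x)                     # h: (1, B, hidden)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation trick
        h0 = self.z_to_h(z).unsqueeze(0)
        dec, _ = self.decoder(x, h0)               # teacher forcing on the input sequence
        return self.out(dec), mu, logvar           # token logits + terms for the KL loss
```

At generation time, sampling z from the prior and decoding step by step yields new sequences in the style of the training data.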

A Geometric Approach for Real-time Monitoring of Dynamic Large Scale Graphs: AS-level graphs illustrated

Title A Geometric Approach for Real-time Monitoring of Dynamic Large Scale Graphs: AS-level graphs illustrated
Authors Loqman Salamatian, Dali Kaafar, Kavé Salamatian
Abstract The monitoring of large dynamic networks is a major challenge for a wide range of applications. The complexity stems from properties of the underlying graphs, in which slight local changes can lead to sizable variations of global properties, e.g., under certain conditions, a single link cut that may be overlooked during monitoring can result in splitting the graph into two disconnected components. Moreover, it is often difficult to determine whether a change will propagate globally or remain local. Traditional graph-theoretic measures such as the centrality or the assortativity of the graph are not sufficient to characterize global properties of the graph. In this paper, we tackle the problem of real-time monitoring of dynamic large scale graphs by developing a geometric approach that leverages notions of geometric curvature and recent developments in graph embeddings using Ollivier-Ricci curvature [47]. We illustrate the use of our method by considering the practical case of monitoring dynamic variations of the global Internet using topology change information provided by combining several BGP feeds. In particular, we use our method to detect major events and changes via the geometry of the embedding of the graph.
Tasks
Published 2018-06-02
URL http://arxiv.org/abs/1806.00676v1
PDF http://arxiv.org/pdf/1806.00676v1.pdf
PWC https://paperswithcode.com/paper/a-geometric-approach-for-real-time-monitoring
Repo
Framework
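
The geometric quantity at the heart of this approach, Ollivier-Ricci curvature, compares the random-walk neighbourhoods of an edge's endpoints via optimal transport; strongly negative curvature flags bridge-like edges whose removal can disconnect the graph. The sketch below computes it for a single edge with SciPy's LP solver; the laziness parameter and the toy barbell graph are illustrative choices, not the paper's monitoring pipeline.

```python
import networkx as nx
import numpy as np
from scipy.optimize import linprog

def ollivier_ricci(G, x, y, alpha=0.5):
    """Ollivier-Ricci curvature of edge (x, y) using lazy random-walk measures."""
    def measure(v):
        nbrs = list(G.neighbors(v))
        return [v] + nbrs, np.array([alpha] + [(1 - alpha) / len(nbrs)] * len(nbrs))

    sx, px = measure(x)
    sy, py = measure(y)
    # Cost matrix: shortest-path distances between the two supports.
    D = np.array([[nx.shortest_path_length(G, u, v) for v in sy] for u in sx], dtype=float)
    n, m = len(sx), len(sy)
    # Transport LP: minimise <D, P> subject to row sums = px and column sums = py.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    res = linprog(D.ravel(), A_eq=A_eq, b_eq=np.concatenate([px, py]), bounds=(0, None))
    w1 = res.fun                                  # Wasserstein-1 distance between the measures
    return 1.0 - w1 / nx.shortest_path_length(G, x, y)

G = nx.barbell_graph(5, 1)
# Intra-clique edge vs. the bridge edge: the bridge comes out with negative curvature.
print(ollivier_ricci(G, 0, 1), ollivier_ricci(G, 4, 5))
```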

Temporal Hockey Action Recognition via Pose and Optical Flows

Title Temporal Hockey Action Recognition via Pose and Optical Flows
Authors Zixi Cai, Helmut Neher, Kanav Vats, David Clausi, John Zelek
Abstract Recognizing actions in ice hockey using computer vision poses challenges due to bulky equipment and inadequate image quality. A novel two-stream framework has been designed to improve action recognition accuracy for hockey using three main components. First, pose is estimated via the Part Affinity Fields model to extract meaningful cues from the player. Second, optical flow (using LiteFlowNet) is used to extract temporal features. Third, pose and optical flow streams are fused and passed to fully-connected layers to estimate the hockey player’s action. A novel publicly available dataset named HARPET (Hockey Action Recognition Pose Estimation, Temporal) was created, composed of sequences of annotated actions and pose of hockey players including their hockey sticks as an extension of human body pose. Three contributions are recognized. (1) The novel two-stream architecture achieves 85% action recognition accuracy, with the inclusion of optical flows increasing accuracy by about 10%. (2) The unique localization of hand-held objects (e.g., hockey sticks) as part of pose increases accuracy by about 13%. (3) For pose estimation, a bigger and more general dataset, MSCOCO, is successfully used for transfer learning to a smaller and more specific dataset, HARPET, achieving a PCKh of 87%.
Tasks Optical Flow Estimation, Pose Estimation, Temporal Action Localization, Transfer Learning
Published 2018-12-22
URL http://arxiv.org/abs/1812.09533v1
PDF http://arxiv.org/pdf/1812.09533v1.pdf
PWC https://paperswithcode.com/paper/temporal-hockey-action-recognition-via-pose
Repo
Framework
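
The late-fusion step, concatenating a pose-stream representation with an optical-flow-stream representation before fully connected layers, can be sketched as below. The feature dimensions, layer widths, and number of actions are placeholders; the upstream Part Affinity Fields and LiteFlowNet extractors are not shown.

```python
import torch
import torch.nn as nn

class TwoStreamActionHead(nn.Module):
    """Late fusion of a pose stream and an optical-flow stream (dimensions are placeholders)."""
    def __init__(self, pose_dim=108, flow_dim=512, n_actions=4):
        super().__init__()
        self.pose_fc = nn.Sequential(nn.Linear(pose_dim, 256), nn.ReLU())
        self.flow_fc = nn.Sequential(nn.Linear(flow_dim, 256), nn.ReLU())
        self.classifier = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, n_actions))

    def forward(self, pose_feats, flow_feats):
        # pose_feats: flattened keypoint coordinates; flow_feats: pooled optical-flow features
        fused = torch.cat([self.pose_fc(pose_feats), self.flow_fc(flow_feats)], dim=1)
        return self.classifier(fused)              # action logits
```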

Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective

Title Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective
Authors Zhong-Qiu Wang, Ke Tan, DeLiang Wang
Abstract This study investigates phase reconstruction for deep learning based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain. The key observation is that, for a mixture of two sources, with their magnitudes accurately estimated and under a geometric constraint, the absolute phase difference between each source and the mixture can be uniquely determined; in addition, the source phases at each time-frequency (T-F) unit can be narrowed down to only two candidates. To pick the right candidate, we propose three algorithms based on iterative phase reconstruction, group delay estimation, and phase-difference sign prediction. State-of-the-art results are obtained on the publicly available wsj0-2mix and wsj0-3mix corpora.
Tasks Speaker Separation
Published 2018-11-22
URL http://arxiv.org/abs/1811.09010v1
PDF http://arxiv.org/pdf/1811.09010v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-phase-reconstruction-for
Repo
Framework
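
The geometric constraint is the law of cosines on the triangle formed by the mixture and the two sources: given the three magnitudes, the absolute phase difference between source 1 and the mixture is determined, leaving only a sign ambiguity. Below is a small NumPy sketch of the two resulting candidates; the paper's algorithms for picking the right one are not reproduced.

```python
import numpy as np

def phase_candidates(mag_mix, mag_s1, mag_s2, mix_phase):
    """Two candidate phases for source 1 at one T-F unit, assuming the mixture is the
    exact sum of the two sources and the magnitudes are (estimated as) exact."""
    cos_d = (mag_mix**2 + mag_s1**2 - mag_s2**2) / (2 * mag_mix * mag_s1)
    cos_d = np.clip(cos_d, -1.0, 1.0)            # guard against magnitude-estimation error
    delta = np.arccos(cos_d)                     # the absolute phase difference is unique ...
    return mix_phase + delta, mix_phase - delta  # ... but its sign leaves two candidates

# Toy check: build a mixture and verify that one candidate matches the true source phase.
s1, s2 = 0.8 * np.exp(1j * 1.2), 0.5 * np.exp(1j * -0.4)
y = s1 + s2
print(np.angle(s1), phase_candidates(abs(y), abs(s1), abs(s2), np.angle(y)))
```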

Joint Training for Neural Machine Translation Models with Monolingual Data

Title Joint Training for Neural Machine Translation Models with Monolingual Data
Authors Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen
Abstract Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough. In this paper, we propose a novel approach to better leveraging monolingual data for neural machine translation by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method. The training process starts with two initial NMT models pre-trained on parallel data for each direction, and these two models are iteratively updated by incrementally decreasing translation losses on training data. In each iteration step, both NMT models are first used to translate monolingual data from one language to the other, forming pseudo-training data for the other NMT model. Then two new NMT models are learnt from parallel data together with the pseudo-training data. Both NMT models are expected to improve, and better pseudo-training data can be generated in the next step. Experimental results on Chinese-English and English-German translation tasks show that our approach can simultaneously improve translation quality of source-to-target and target-to-source models, significantly outperforming strong baseline systems which are enhanced with monolingual data for model training, including back-translation.
Tasks Domain Adaptation, Machine Translation
Published 2018-03-01
URL http://arxiv.org/abs/1803.00353v1
PDF http://arxiv.org/pdf/1803.00353v1.pdf
PWC https://paperswithcode.com/paper/joint-training-for-neural-machine-translation
Repo
Framework
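
Structurally, the joint training is an iterative mutual back-translation loop. The sketch below captures that loop only; pretrain, train, and translate are caller-supplied placeholders standing in for an actual NMT toolkit, not a specific library API.

```python
def joint_em_training(parallel, mono_src, mono_tgt, pretrain, train, translate, rounds=3):
    """Iterative joint training via mutual back-translation (structural sketch only).

    `pretrain(pairs)`, `train(model, pairs)`, and `translate(model, sentence)` are
    caller-supplied callables wrapping a real NMT toolkit; `parallel` is a list of
    (source, target) sentence pairs.
    """
    s2t = pretrain(parallel)                                     # source -> target model
    t2s = pretrain([(t, s) for s, t in parallel])                # target -> source model
    for _ in range(rounds):
        # Each direction translates monolingual text into pseudo-parallel data for the other.
        pseudo_s2t = [(translate(t2s, t), t) for t in mono_tgt]  # (pseudo-source, target)
        pseudo_t2s = [(translate(s2t, s), s) for s in mono_src]  # (pseudo-target, source)
        s2t = train(s2t, parallel + pseudo_s2t)
        t2s = train(t2s, [(t, s) for s, t in parallel] + pseudo_t2s)
    return s2t, t2s
```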

Siamese networks for generating adversarial examples

Title Siamese networks for generating adversarial examples
Authors Mandar Kulkarni, Aria Abubakar
Abstract Machine learning models are vulnerable to adversarial examples. An adversary modifies the input data such that humans still assign the same label, yet machine learning models misclassify it. Previous approaches in the literature demonstrated that adversarial examples can even be generated for a remotely hosted model. In this paper, we propose a Siamese network based approach to generate adversarial examples for a multiclass target CNN. We assume that the adversary does not possess any knowledge of the target data distribution, and we use an unlabeled mismatched dataset to query the target; e.g., for the ResNet-50 target, we use the Food-101 dataset as the query. Initially, the target model assigns labels to the query dataset, and a Siamese network is trained on the image pairs derived from these multiclass labels. We learn the adversarial perturbations for the Siamese model and show that these perturbations are also adversarial w.r.t. the target model. Experimental results demonstrate the effectiveness of our approach on MNIST, CIFAR-10 and ImageNet targets with TinyImageNet/Food-101 query datasets.
Tasks
Published 2018-05-03
URL http://arxiv.org/abs/1805.01431v1
PDF http://arxiv.org/pdf/1805.01431v1.pdf
PWC https://paperswithcode.com/paper/siamese-networks-for-generating-adversarial
Repo
Framework
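
Once the Siamese network is trained on pairs labeled by the target, adversarial perturbations can be sought that push an input's embedding away from a same-class reference. The single FGSM-style step below is an illustrative stand-in for the paper's perturbation-learning procedure; siamese is any module mapping image batches to embeddings.

```python
import torch

def siamese_adversarial_step(siamese, x, x_ref, epsilon=8 / 255):
    """One FGSM-style step that increases the Siamese embedding distance between an
    input batch x and a same-class reference batch x_ref (illustrative only)."""
    x_adv = x.clone().requires_grad_(True)
    dist = torch.norm(siamese(x_adv) - siamese(x_ref), dim=1).sum()
    dist.backward()
    # Moving along the sign of the gradient pushes the embeddings apart; the hope
    # (tested empirically in the paper) is that this transfers to the target CNN.
    return (x + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```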

Stochastic Gradient Descent for Spectral Embedding with Implicit Orthogonality Constraint

Title Stochastic Gradient Descent for Spectral Embedding with Implicit Orthogonality Constraint
Authors Mireille El Gheche, Giovanni Chierchia, Pascal Frossard
Abstract In this paper, we propose a scalable algorithm for spectral embedding. The latter is a standard tool for graph clustering. However, its computational bottleneck is the eigendecomposition of the graph Laplacian matrix, which prevents its application to large-scale graphs. Our contribution consists of reformulating spectral embedding so that it can be solved via stochastic optimization. The idea is to replace the orthogonality constraint with an orthogonalization matrix injected directly into the criterion. As the gradient can be computed through a Cholesky factorization, our reformulation allows us to develop an efficient algorithm based on mini-batch gradient descent. Experimental results, on both synthetic and real data, confirm the efficiency of the proposed method in terms of execution speed with respect to similar existing techniques.
Tasks Graph Clustering, Stochastic Optimization
Published 2018-12-13
URL http://arxiv.org/abs/1812.05721v2
PDF http://arxiv.org/pdf/1812.05721v2.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-descent-for-spectral
Repo
Framework
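
The reformulation can be sketched as follows: keep an unconstrained matrix X, orthogonalize it on the fly through the Cholesky factor of X^T X, and minimize the Laplacian quadratic form of the orthogonalized embedding by gradient descent. The PyTorch sketch below uses full-batch gradients for brevity, whereas the paper's point is mini-batch training; the step size and the jitter term are assumptions.

```python
import torch

def spectral_embedding_sgd(L, k=8, steps=500, lr=1e-2):
    """Spectral embedding without an explicit orthogonality constraint.

    Y = X R^{-T} with R = chol(X^T X) is orthonormal by construction, so minimising
    tr(Y^T L Y) over an unconstrained X targets the bottom eigenspace of the Laplacian L.
    """
    n = L.shape[0]
    X = torch.randn(n, k, requires_grad=True)
    opt = torch.optim.SGD([X], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        R = torch.linalg.cholesky(X.T @ X + 1e-6 * torch.eye(k))   # lower-triangular factor
        Y = torch.linalg.solve_triangular(R, X.T, upper=False).T   # Y = X R^{-T}, so Y^T Y = I
        loss = torch.trace(Y.T @ L @ Y)
        loss.backward()
        opt.step()
    return Y.detach()
```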

On Uncensored Mean First-Passage-Time Performance Experiments with Multiwalk in $\mathbb{R}^p$: a New Stochastic Optimization Algorithm

Title On Uncensored Mean First-Passage-Time Performance Experiments with Multiwalk in $\mathbb{R}^p$: a New Stochastic Optimization Algorithm
Authors Franc Brglez
Abstract A rigorous empirical comparison of two stochastic solvers is important when one of the solvers is a prototype of a new algorithm such as multiwalk (MWA). When searching for global minima in $\mathbb{R}^p$, the key data structures of MWA include: $p$ rulers with each ruler assigned $m$ marks and a set of $p$ neighborhood matrices of size up to $m(m-2)$, where each entry represents absolute values of pairwise differences between $m$ marks. Before taking the next step, a controller links the tableau of neighborhood matrices and computes new and improved positions for each of the $m$ marks. The number of columns in each neighborhood matrix is denoted as the neighborhood radius $r_n \le m-2$. Any variant of the DEA (differential evolution algorithm) has an effective population neighborhood of radius not larger than 1. Uncensored first-passage-time performance experiments that vary the neighborhood radius of a MW-solver can thus be readily compared to existing variants of DE-solvers. The paper considers seven test cases of increasing complexity and demonstrates, under uncensored first-passage-time performance experiments: (1) significant variability in convergence rate for seven DE-based solver configurations, and (2) consistent, monotonic, and significantly faster rate of convergence for the MW-solver prototype as we increase the neighborhood radius from 4 to its maximum value.
Tasks Stochastic Optimization
Published 2018-12-06
URL http://arxiv.org/abs/1812.03075v1
PDF http://arxiv.org/pdf/1812.03075v1.pdf
PWC https://paperswithcode.com/paper/on-uncensored-mean-first-passage-time
Repo
Framework

Sample size estimation for power and accuracy in the experimental comparison of algorithms

Title Sample size estimation for power and accuracy in the experimental comparison of algorithms
Authors Felipe Campelo, Fernanda Takahashi
Abstract Experimental comparisons of performance represent an important aspect of research on optimization algorithms. In this work we present a methodology for defining the required sample sizes for designing experiments with desired statistical properties for the comparison of two methods on a given problem class. The proposed approach allows the experimenter to define desired levels of accuracy for estimates of mean performance differences on individual problem instances, as well as the desired statistical power for comparing mean performances over a problem class of interest. The method calculates the required number of problem instances, and runs the algorithms on each test instance so that the accuracy of the estimated differences in performance is controlled at the predefined level. Two examples illustrate the application of the proposed method, and its ability to achieve the desired statistical properties with a methodologically sound definition of the relevant sample sizes.
Tasks
Published 2018-08-09
URL http://arxiv.org/abs/1808.02997v2
PDF http://arxiv.org/pdf/1808.02997v2.pdf
PWC https://paperswithcode.com/paper/sample-size-estimation-for-power-and-accuracy
Repo
Framework
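
The flavour of the power calculation can be illustrated with the textbook normal-approximation formula for detecting a mean difference delta with standard deviation sigma. The paper's method is iterative and additionally controls the per-instance accuracy of the estimates, which this sketch omits.

```python
from math import ceil
from scipy.stats import norm

def normal_sample_size(delta, sigma, alpha=0.05, power=0.8, two_sided=True):
    """Number of instances needed to detect a mean performance difference `delta`
    given standard deviation `sigma`, at significance `alpha` and the desired power."""
    z_a = norm.ppf(1 - alpha / 2) if two_sided else norm.ppf(1 - alpha)
    z_b = norm.ppf(power)
    return ceil(((z_a + z_b) * sigma / delta) ** 2)

print(normal_sample_size(delta=0.5, sigma=1.0))   # -> 32 instances for a medium-sized effect
```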

R2RML Mappings in OBDA Systems: Enabling Comparison among OBDA Tools

Title R2RML Mappings in OBDA Systems: Enabling Comparison among OBDA Tools
Authors Manuel Namici
Abstract In today’s large enterprises there is a significant and increasing trend in the amount of data that has to be stored and processed. To complicate this scenario, the complexity of organizing and managing a large collection of data structured according to a single, unified schema means that there is almost never a single place to look to satisfy an information need. The Ontology-Based Data Access (OBDA) paradigm aims at mitigating this phenomenon by providing the users of the system with a unified and shared conceptual view of the domain of interest (ontology), while still enabling the data to be stored in different data sources, each managed by a relational database. In an OBDA system, the link between the data stored at the sources and the ontology is provided through a declarative specification given in terms of a set of mappings. In this work we focus on comparing two of the available systems for OBDA, namely Mastro and Ontop, by adopting OBDA specifications based on W3C recommendations. We first show how support for R2RML mappings has been integrated into Mastro, which was the last feature missing to enable the system to use specifications based solely on the W3C recommendations relevant to OBDA. We then compare these systems over two OBDA specifications, the NPD Benchmark and the ACI specification.
Tasks
Published 2018-04-04
URL http://arxiv.org/abs/1804.01405v1
PDF http://arxiv.org/pdf/1804.01405v1.pdf
PWC https://paperswithcode.com/paper/r2rml-mappings-in-obda-systems-enabling
Repo
Framework

A Bi-model based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling

Title A Bi-model based RNN Semantic Frame Parsing Model for Intent Detection and Slot Filling
Authors Yu Wang, Yilin Shen, Hongxia Jin
Abstract Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. Multiple deep learning based models have demonstrated good results on these tasks. The most effective algorithms are based on the structures of sequence-to-sequence models (or “encoder-decoder” models), and generate the intents and semantic tags either using separate models or a joint model. Most of the previous studies, however, either treat intent detection and slot filling as two separate parallel tasks, or use a sequence-to-sequence model to generate both semantic tags and intent. Most of these approaches use one (joint) NN based model (including the encoder-decoder structure) to model the two tasks, and hence may not fully take advantage of the cross-impact between them. In this paper, new Bi-model based RNN semantic frame parsing network structures are designed to perform the intent detection and slot filling tasks jointly, by considering their cross-impact on each other using two correlated bidirectional LSTMs (BLSTMs). Our Bi-model structure with a decoder achieves state-of-the-art results on the benchmark ATIS data, with about 0.5% intent accuracy improvement and 0.9% slot filling improvement.
Tasks Intent Detection, Slot Filling, Spoken Language Understanding
Published 2018-12-26
URL http://arxiv.org/abs/1812.10235v1
PDF http://arxiv.org/pdf/1812.10235v1.pdf
PWC https://paperswithcode.com/paper/a-bi-model-based-rnn-semantic-frame-parsing
Repo
Framework
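
A minimal sketch of the bi-model idea: two task-specific BiLSTMs read the same utterance, and each task's classifier also sees the other task's encoding. The widths and the exact way hidden states are exchanged are assumptions, and the paper's decoder variant is not reproduced.

```python
import torch
import torch.nn as nn

class BiModelSLU(nn.Module):
    """Two cross-connected BiLSTMs for joint intent detection and slot filling (sketch)."""
    def __init__(self, vocab, n_intents, n_slots, emb=100, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.intent_lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.slot_lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.intent_out = nn.Linear(4 * hid, n_intents)   # own + cross hidden states
        self.slot_out = nn.Linear(4 * hid, n_slots)

    def forward(self, tokens):                  # tokens: (B, T) word indices
        x = self.embed(tokens)
        h_int, _ = self.intent_lstm(x)          # (B, T, 2*hid)
        h_slot, _ = self.slot_lstm(x)
        # Cross-impact: each task's classifier reads both encodings.
        intent_logits = self.intent_out(torch.cat([h_int[:, -1], h_slot[:, -1]], dim=-1))
        slot_logits = self.slot_out(torch.cat([h_slot, h_int], dim=-1))   # per-token tags
        return intent_logits, slot_logits
```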

Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

Title Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network
Authors Wenjia Meng, Qian Zheng, Long Yang, Pengfei Li, Gang Pan
Abstract The deep Q-network (DQN) and return-based reinforcement learning are two promising algorithms proposed in recent years. DQN brings advances to complex sequential decision problems, while return-based algorithms have advantages in making use of sample trajectories. In this paper, we propose a general framework to combine DQN and most of the return-based reinforcement learning algorithms, named R-DQN. We show that the performance of the traditional DQN can be improved effectively by introducing return-based reinforcement learning. To further improve R-DQN, we design a strategy with two measurements which can qualitatively measure the policy discrepancy. Moreover, we give the two measurements’ bounds in the proposed R-DQN framework. We show that algorithms with our strategy can accurately express the trace coefficient and achieve a better approximation to the return. The experiments, conducted on several representative tasks from the OpenAI Gym library, validate the effectiveness of the proposed measurements. The results also show that the algorithms with our strategy outperform the state-of-the-art methods.
Tasks
Published 2018-06-14
URL https://arxiv.org/abs/1806.06953v3
PDF https://arxiv.org/pdf/1806.06953v3.pdf
PWC https://paperswithcode.com/paper/qualitative-measurements-of-policy
Repo
Framework
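
The simplest way to make a DQN target "return-based" is to replace the 1-step bootstrap with a truncated multi-step return. The sketch below computes such targets and is only a baseline instance of the idea; the trace coefficients and policy-discrepancy measurements studied in the paper are omitted.

```python
import numpy as np

def n_step_return_targets(rewards, next_q_max, dones, gamma=0.99, n=3):
    """n-step return targets for a trajectory segment.

    rewards[t], dones[t]: reward and terminal flag at step t;
    next_q_max[t]: target network's max_a Q(s_{t+1}, a).
    """
    T = len(rewards)
    targets = np.zeros(T)
    for t in range(T):
        G, discount, terminated, last = 0.0, 1.0, False, t
        for k in range(t, min(t + n, T)):
            G += discount * rewards[k]
            discount *= gamma
            last = k
            if dones[k]:
                terminated = True
                break
        if not terminated:
            G += discount * next_q_max[last]   # bootstrap from the target network
        targets[t] = G
    return targets
```

These targets would replace the usual one-step r + gamma * max_a Q(s', a) term in the DQN loss.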