May 7, 2019

Paper Group ANR 79

DNN-based Speech Synthesis for Indian Languages from ASCII text

Title DNN-based Speech Synthesis for Indian Languages from ASCII text
Authors Srikanth Ronanki, Siva Reddy, Bajibabu Bollepalli, Simon King
Abstract Text-to-Speech synthesis in Indian languages has seen a lot of progress over the decade, partly due to the annual Blizzard challenges. These systems assume the text to be written in Devanagari or Dravidian scripts, which are nearly phonemic orthography scripts. However, the most common form of computer interaction among Indians is ASCII written transliterated text. Such text is generally noisy, with many variations in spelling for the same word. In this paper we evaluate three approaches to synthesize speech from such noisy ASCII text: a naive Uni-Grapheme approach, a Multi-Grapheme approach, and a supervised Grapheme-to-Phoneme (G2P) approach. These methods first convert the ASCII text to a phonetic script, and then learn a Deep Neural Network to synthesize speech from that. We train and test our models on Blizzard Challenge datasets that were transliterated to ASCII using crowdsourcing. Our experiments on Hindi, Tamil and Telugu demonstrate that our models generate speech of competitive quality from ASCII text compared to the speech synthesized from the native scripts. All the accompanying transliterated datasets are released for public access.
Tasks Speech Synthesis, Text-To-Speech Synthesis
Published 2016-08-18
URL http://arxiv.org/abs/1608.05374v1
PDF http://arxiv.org/pdf/1608.05374v1.pdf
PWC https://paperswithcode.com/paper/dnn-based-speech-synthesis-for-indian
Repo
Framework

Combining Recurrent and Convolutional Neural Networks for Relation Classification

Title Combining Recurrent and Convolutional Neural Networks for Relation Classification
Authors Ngoc Thang Vu, Heike Adel, Pankaj Gupta, Hinrich Schütze
Abstract This paper investigates two different neural architectures for the task of relation classification: convolutional neural networks and recurrent neural networks. For both models, we demonstrate the effect of different architectural choices. We present a new context representation for convolutional neural networks for relation classification (extended middle context). Furthermore, we propose connectionist bi-directional recurrent neural networks and introduce ranking loss for their optimization. Finally, we show that combining convolutional and recurrent neural networks using a simple voting scheme is accurate enough to improve results. Our neural models achieve state-of-the-art results on the SemEval 2010 relation classification task.
Tasks Relation Classification
Published 2016-05-24
URL http://arxiv.org/abs/1605.07333v1
PDF http://arxiv.org/pdf/1605.07333v1.pdf
PWC https://paperswithcode.com/paper/combining-recurrent-and-convolutional-neural
Repo
Framework
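
The abstract above mentions combining the convolutional and recurrent models with "a simple voting scheme" but does not spell it out. The following is a minimal sketch of one plausible reading, plain majority voting over per-model label predictions. The function name `majority_vote` and the tie-breaking rule (the earliest model in list order wins) are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, one example at a time.

    predictions_per_model: list of equally long label lists, one per model.
    Tie-breaking (an assumption of this sketch): among the labels with the
    highest count, pick the one predicted by the earliest model in the list.
    """
    combined = []
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        best = max(counts.values())
        combined.append(next(label for label in labels if counts[label] == best))
    return combined


if __name__ == "__main__":
    cnn = ["Cause-Effect", "Other", "Component-Whole"]
    rnn_a = ["Cause-Effect", "Component-Whole", "Component-Whole"]
    rnn_b = ["Other", "Other", "Cause-Effect"]
    print(majority_vote([cnn, rnn_a, rnn_b]))
```

With an odd number of models and few classes, ties are rare; with an even number, the tie-breaking rule matters and should be chosen deliberately.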

Bit-pragmatic Deep Neural Network Computing

Title Bit-pragmatic Deep Neural Network Computing
Authors J. Albericio, P. Judd, A. Delmás, S. Sharify, A. Moshovos
Abstract We quantify a source of ineffectual computations when processing the multiplications of the convolutional layers in Deep Neural Networks (DNNs) and propose Pragmatic (PRA), an architecture that exploits it, improving performance and energy efficiency. The source of these ineffectual computations is best understood in the context of conventional multipliers, which internally generate multiple terms, that is, products of the multiplicand and powers of two, which added together produce the final product [1]. At runtime, many of these terms are zero, as they are generated when the multiplicand is combined with the zero-bits of the multiplicator. While conventional bit-parallel multipliers calculate all terms in parallel to reduce individual product latency, PRA calculates only the non-zero terms using a) on-the-fly conversion of the multiplicator representation into an explicit list of powers of two, and b) hybrid bit-parallel multiplicand/bit-serial multiplicator processing units. PRA exploits two sources of ineffectual computations: 1) the aforementioned zero product terms, which are the result of the lack of explicitness in the multiplicator representation, and 2) the excess in the representation precision used for both multiplicands and multiplicators, e.g., [2]. Measurements demonstrate that for the convolutional layers, a straightforward variant of PRA improves performance by 2.6x over the DaDianNao (DaDN) accelerator [3] and by 1.4x over STR [4]. Similarly, PRA improves energy efficiency by 28% and 10% on average compared to DaDN and STR. An improved cross-lane synchronization scheme boosts performance improvements to 3.1x over DaDN. Finally, Pragmatic benefits persist even with an 8-bit quantized representation [5].
Tasks
Published 2016-10-20
URL http://arxiv.org/abs/1610.06920v1
PDF http://arxiv.org/pdf/1610.06920v1.pdf
PWC https://paperswithcode.com/paper/bit-pragmatic-deep-neural-network-computing
Repo
Framework
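
The idea in the abstract above, that a product is a sum of shifted copies of the multiplicand and only the non-zero bits of the multiplicator contribute, can be illustrated in software. This is a minimal sketch for non-negative integers, not a model of the PRA hardware; the function names `nonzero_powers` and `multiply_effectual_terms` are hypothetical.

```python
def nonzero_powers(multiplicator):
    """Return the exponents of the set bits of a non-negative integer.

    This mirrors converting the multiplicator into an explicit list of
    powers of two.
    """
    powers = []
    position = 0
    while multiplicator:
        if multiplicator & 1:
            powers.append(position)
        multiplicator >>= 1
        position += 1
    return powers


def multiply_effectual_terms(multiplicand, multiplicator):
    """Multiply by summing only the terms for non-zero multiplicator bits.

    Returns (product, number_of_terms_actually_added).
    """
    powers = nonzero_powers(multiplicator)
    product = sum(multiplicand << p for p in powers)
    return product, len(powers)


if __name__ == "__main__":
    # 10 is 0b1010, so only two of its four bit positions yield non-zero terms.
    print(multiply_effectual_terms(13, 10))
```

A conventional 16-bit bit-parallel multiplier would generate 16 terms for this product regardless of the operand value; here only two are added.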

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

Title Embedding Words and Senses Together via Joint Knowledge-Enhanced Training
Authors Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli
Abstract Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora. However, their creation process does not allow the different meanings of a word to be automatically separated, as it conflates them into a single vector. We address this issue by proposing a new model which learns word and sense embeddings jointly. Our model exploits large corpora and knowledge from semantic networks in order to produce a unified vector space of word and sense embeddings. We evaluate the main features of our approach both qualitatively and quantitatively in a variety of tasks, highlighting the advantages of the proposed method in comparison to state-of-the-art word- and sense-based models.
Tasks Word Embeddings
Published 2016-12-08
URL http://arxiv.org/abs/1612.02703v2
PDF http://arxiv.org/pdf/1612.02703v2.pdf
PWC https://paperswithcode.com/paper/embedding-words-and-senses-together-via-joint
Repo
Framework

Evaluation of Deep Learning based Pose Estimation for Sign Language Recognition

Title Evaluation of Deep Learning based Pose Estimation for Sign Language Recognition
Authors Srujana Gattupalli, Amir Ghaderi, Vassilis Athitsos
Abstract Human body pose estimation and hand detection are two important tasks for systems that perform computer vision-based sign language recognition (SLR). However, both tasks are challenging, especially when the input is color videos, with no depth information. Many algorithms have been proposed in the literature for these tasks, and some of the most successful recent algorithms are based on deep learning. In this paper, we introduce a dataset for human pose estimation for the SLR domain. We evaluate the performance of two deep learning based pose estimation methods by performing user-independent experiments on our dataset. We also perform transfer learning, and we obtain results that demonstrate that transfer learning can improve pose estimation accuracy. The dataset and results from these methods can create a useful baseline for future works.
Tasks Pose Estimation, Sign Language Recognition, Transfer Learning
Published 2016-02-29
URL http://arxiv.org/abs/1602.09065v3
PDF http://arxiv.org/pdf/1602.09065v3.pdf
PWC https://paperswithcode.com/paper/evaluation-of-deep-learning-based-pose
Repo
Framework

Estimating the concentration of gold nanoparticles incorporated on Natural Rubber membranes using Multi-Level Starlet Optimal Segmentation

Title Estimating the concentration of gold nanoparticles incorporated on Natural Rubber membranes using Multi-Level Starlet Optimal Segmentation
Authors Alexandre Fioravante de Siqueira, Flávio Camargo Cabrera, Aylton Pagamisse, Aldo Eloizo Job
Abstract This study consolidates Multi-Level Starlet Segmentation (MLSS) and Multi-Level Starlet Optimal Segmentation (MLSOS), techniques for photomicrograph segmentation that use starlet wavelet detail levels to separate areas of interest in an input image. Several segmentation levels can be obtained using Multi-Level Starlet Segmentation; after that, the Matthews correlation coefficient (MCC) is used to choose an optimal segmentation level, giving rise to Multi-Level Starlet Optimal Segmentation. In this paper, MLSOS is employed to estimate the concentration of gold nanoparticles with diameter around 47 nm, reduced on natural rubber membranes. These samples were used in the construction of SERS/SERRS substrates and in the study of the influence of natural rubber membranes with incorporated gold nanoparticles on Leishmania braziliensis physiology. Precision, recall and accuracy are used to evaluate the segmentation performance, and MLSOS presents accuracy greater than 88% for this application.
Tasks
Published 2016-10-26
URL http://arxiv.org/abs/1610.08436v1
PDF http://arxiv.org/pdf/1610.08436v1.pdf
PWC https://paperswithcode.com/paper/estimating-the-concentration-of-gold
Repo
Framework
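
The abstract above selects an optimal segmentation level by the Matthews correlation coefficient. As a minimal sketch of that selection step only (the starlet wavelet segmentation itself is not reproduced), the code below scores candidate binary segmentations against a ground-truth mask and picks the best one. Images are assumed to be flat lists of 0/1 pixels, and the names `confusion`, `mcc` and `best_level` are hypothetical.

```python
import math


def confusion(segmentation, ground_truth):
    """Count (tp, tn, fp, fn) for two equally long lists of 0/1 pixels."""
    tp = tn = fp = fn = 0
    for predicted, actual in zip(segmentation, ground_truth):
        if predicted and actual:
            tp += 1
        elif not predicted and not actual:
            tn += 1
        elif predicted:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def best_level(segmentations, ground_truth):
    """Return (index of the highest-MCC segmentation, list of all scores)."""
    scores = [mcc(*confusion(s, ground_truth)) for s in segmentations]
    return max(range(len(scores)), key=scores.__getitem__), scores


if __name__ == "__main__":
    truth = [1, 1, 0, 0]
    levels = [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]]
    print(best_level(levels, truth))
```

MCC ranges from -1 to 1 and stays informative when foreground and background areas are very unequal, which is a common reason to prefer it over plain accuracy for choosing among segmentations.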

Local Multiple Directional Pattern of Palmprint Image

Title Local Multiple Directional Pattern of Palmprint Image
Authors Lunke Fei, Jie Wen, Zheng Zhang, Ke Yan, Zuofeng Zhong
Abstract Lines are the most essential and discriminative features of palmprint images, which motivates researchers to propose various line direction based methods for palmprint recognition. Conventional methods usually capture only the single most dominant direction at each point of a palmprint image. However, a number of points in palmprint images have two or more dominant directions because of the many crossing lines in palmprint images. In this paper, we propose a local multiple directional pattern (LMDP) to effectively characterize the multiple direction features of palmprint images. LMDP can not only exactly denote the number and positions of dominant directions but also effectively reflect the confidence of each dominant direction. Then, a simple and effective coding scheme is designed to represent the LMDP, and a block-wise LMDP descriptor is used as the feature space of palmprint images in palmprint recognition. Extensive experimental results demonstrate the superiority of the LMDP over the conventional powerful descriptors and the state-of-the-art direction based methods in palmprint recognition.
Tasks
Published 2016-07-21
URL http://arxiv.org/abs/1607.06166v1
PDF http://arxiv.org/pdf/1607.06166v1.pdf
PWC https://paperswithcode.com/paper/local-multiple-directional-pattern-of
Repo
Framework

KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics

Title KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics
Authors Evan R. Sparks, Shivaram Venkataraman, Tomer Kaftan, Michael J. Franklin, Benjamin Recht
Abstract Modern advanced analytics applications make use of machine learning techniques and contain multiple steps of domain-specific and general-purpose processing with high resource requirements. We present KeystoneML, a system that captures and optimizes the end-to-end large-scale machine learning applications for high-throughput training in a distributed environment with a high-level API. This approach offers increased ease of use and higher performance over existing systems for large scale learning. We demonstrate the effectiveness of KeystoneML in achieving high quality statistical accuracy and scalable training using real world datasets in several domains. By optimizing execution KeystoneML achieves up to 15x training throughput over unoptimized execution on a real image classification application.
Tasks Image Classification
Published 2016-10-29
URL http://arxiv.org/abs/1610.09451v1
PDF http://arxiv.org/pdf/1610.09451v1.pdf
PWC https://paperswithcode.com/paper/keystoneml-optimizing-pipelines-for-large
Repo
Framework

Bayesian Model Selection Methods for Mutual and Symmetric $k$-Nearest Neighbor Classification

Title Bayesian Model Selection Methods for Mutual and Symmetric $k$-Nearest Neighbor Classification
Authors Hyun-Chul Kim
Abstract The $k$-nearest neighbor classification method ($k$-NNC) is one of the simplest nonparametric classification methods. The mutual $k$-NN classification method (M$k$NNC) is a variant of $k$-NNC based on mutual neighborship. We propose another variant of $k$-NNC, the symmetric $k$-NN classification method (S$k$NNC), based on both mutual neighborship and one-sided neighborship. The performance of M$k$NNC and S$k$NNC depends on the parameter $k$, as that of $k$-NNC does. We propose ways in which M$k$NN and S$k$NN classification can be performed based on Bayesian mutual and symmetric $k$-NN regression methods with selection schemes for the parameter $k$. Bayesian mutual and symmetric $k$-NN regression methods are based on Gaussian process models, and it turns out that they can do M$k$NN and S$k$NN classification with new encodings of target values (class labels). The simulation results show that the proposed methods are better than or comparable to $k$-NNC, M$k$NNC and S$k$NNC with the parameter $k$ selected by the leave-one-out cross validation method, not only for an artificial data set but also for real world data sets.
Tasks Model Selection
Published 2016-08-14
URL http://arxiv.org/abs/1608.04063v1
PDF http://arxiv.org/pdf/1608.04063v1.pdf
PWC https://paperswithcode.com/paper/bayesian-model-selection-methods-for-mutual
Repo
Framework
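
The notion of mutual neighborship used in the abstract above can be made concrete with a small sketch. It covers only the neighbor relation, not the Bayesian model selection or the Gaussian process formulation. The setup is an assumption chosen for brevity: points are 1-D floats, distance is the absolute difference, and ties are broken by the smaller index. The function names are hypothetical.

```python
def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    ranked = sorted(others, key=lambda j: (abs(points[i] - points[j]), j))
    return set(ranked[:k])


def mutual_neighbors(points, i, k):
    """Neighbors j of i such that i is also among the k nearest of j."""
    return {j for j in k_nearest(points, i, k) if i in k_nearest(points, j, k)}


if __name__ == "__main__":
    points = [0.0, 1.0, 2.0, 10.0]
    for i in range(len(points)):
        print(i, k_nearest(points, i, 1), mutual_neighbors(points, i, 1))
```

In this example the isolated point at 10.0 has a nearest neighbor but no mutual neighbor, which is the property that makes mutual neighborship useful for discounting outliers.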

A note on the triangle inequality for the Jaccard distance

Title A note on the triangle inequality for the Jaccard distance
Authors Sven Kosub
Abstract Two simple proofs of the triangle inequality for the Jaccard distance in terms of nonnegative, monotone, submodular functions are given and discussed.
Tasks
Published 2016-12-08
URL http://arxiv.org/abs/1612.02696v1
PDF http://arxiv.org/pdf/1612.02696v1.pdf
PWC https://paperswithcode.com/paper/a-note-on-the-triangle-inequality-for-the
Repo
Framework
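
For readers unfamiliar with the quantity in the note above, the Jaccard distance between finite sets A and B is 1 - |A ∩ B| / |A ∪ B|, taken to be 0 when both sets are empty. The sketch below is a brute-force numerical check of the triangle inequality on all subsets of a small universe; it is not one of the paper's proofs, and the function names are hypothetical. Exact rational arithmetic avoids floating-point rounding in the comparison.

```python
from fractions import Fraction
from itertools import combinations, product


def jaccard_distance(a, b):
    """Jaccard distance between two finite sets, as an exact fraction."""
    union = a | b
    if not union:
        return Fraction(0)  # convention: two empty sets are at distance 0
    return 1 - Fraction(len(a & b), len(union))


def subsets(universe):
    """All subsets of the given universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def triangle_inequality_holds(universe):
    """Check d(A, C) <= d(A, B) + d(B, C) for every triple of subsets."""
    sets = subsets(universe)
    return all(
        jaccard_distance(a, c) <= jaccard_distance(a, b) + jaccard_distance(b, c)
        for a, b, c in product(sets, repeat=3)
    )


if __name__ == "__main__":
    print(jaccard_distance(frozenset({1, 2}), frozenset({2, 3})))
    print(triangle_inequality_holds({1, 2, 3, 4}))
```

The exhaustive check grows as 8^n in the universe size n, so it is only a sanity test for small n; the general statement needs a proof such as those in the note.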

Image segmentation with superpixel-based covariance descriptors in low-rank representation

Title Image segmentation with superpixel-based covariance descriptors in low-rank representation
Authors Xianbin Gu, Jeremiah D. Deng, Martin K. Purvis
Abstract This paper investigates the problem of image segmentation using superpixels. We propose two approaches to enhance the discriminative ability of the superpixel’s covariance descriptors. In the first one, we employ the Log-Euclidean distance as the metric on the covariance manifolds, and then use the RBF kernel to measure the similarities between covariance descriptors. The second method is focused on extracting the subspace structure of the set of covariance descriptors by extending a low rank representation algorithm onto the covariance manifolds. Experiments are carried out with the Berkeley Segmentation Dataset, and compared with the state-of-the-art segmentation algorithms, both methods are competitive.
Tasks Semantic Segmentation
Published 2016-05-18
URL http://arxiv.org/abs/1605.05466v1
PDF http://arxiv.org/pdf/1605.05466v1.pdf
PWC https://paperswithcode.com/paper/image-segmentation-with-superpixel-based
Repo
Framework

An Efficient Approach to Boosting Performance of Deep Spiking Network Training

Title An Efficient Approach to Boosting Performance of Deep Spiking Network Training
Authors Seongsik Park, Sang-gil Lee, Hyunha Nam, Sungroh Yoon
Abstract Nowadays deep learning is dominating the field of machine learning with state-of-the-art performance in various application areas. Recently, spiking neural networks (SNNs) have been attracting a great deal of attention, notably owing to their power efficiency, which can potentially allow us to implement a low-power deep learning engine suitable for real-time/mobile applications. However, implementing SNN-based deep learning remains challenging, especially gradient-based training of SNNs by error backpropagation. We cannot simply propagate errors through SNNs in the conventional way because SNNs process discrete data in the form of a series. Consequently, most of the previous studies employ a workaround technique, which first trains a conventional weighted-sum deep neural network and then maps the learned weights to the SNN under training, instead of training SNN parameters directly. In order to eliminate this workaround, a new class of SNN named deep spiking networks (DSNs) was recently proposed, which can be trained directly (without a mapping from conventional deep networks) by error backpropagation with stochastic gradient descent. In this paper, we show that the initialization of the membrane potential on the backward path is an important step in DSN training, through diverse experiments performed under various conditions. Furthermore, we propose a simple and efficient method that can improve DSN training by controlling the initial membrane potential on the backward path. In our experiments, adopting the proposed approach allowed us to boost the performance of DSN training in terms of convergence time and accuracy.
Tasks
Published 2016-11-08
URL http://arxiv.org/abs/1611.02416v2
PDF http://arxiv.org/pdf/1611.02416v2.pdf
PWC https://paperswithcode.com/paper/an-efficient-approach-to-boosting-performance
Repo
Framework

Neural Machine Transliteration: Preliminary Results

Title Neural Machine Transliteration: Preliminary Results
Authors Amir H. Jadidinejad
Abstract Machine transliteration is the process of automatically transforming the script of a word from a source language to a target language, while preserving pronunciation. Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. In this paper a character-based encoder-decoder model has been proposed that consists of two Recurrent Neural Networks. The encoder is a Bidirectional recurrent neural network that encodes a sequence of symbols into a fixed-length vector representation, and the decoder generates the target sequence using an attention-based recurrent neural network. The encoder, the decoder and the attention mechanism are jointly trained to maximize the conditional probability of a target sequence given a source sequence. Our experiments on different datasets show that the proposed encoder-decoder model is able to achieve significantly higher transliteration quality over traditional statistical models.
Tasks Transliteration
Published 2016-09-14
URL http://arxiv.org/abs/1609.04253v1
PDF http://arxiv.org/pdf/1609.04253v1.pdf
PWC https://paperswithcode.com/paper/neural-machine-transliteration-preliminary
Repo
Framework

Robust Bayesian Method for Simultaneous Block Sparse Signal Recovery with Applications to Face Recognition

Title Robust Bayesian Method for Simultaneous Block Sparse Signal Recovery with Applications to Face Recognition
Authors Igor Fedorov, Ritwik Giri, Bhaskar D. Rao, Truong Q. Nguyen
Abstract In this paper, we present a novel Bayesian approach to recover simultaneously block sparse signals in the presence of outliers. The key advantage of our proposed method is the ability to handle non-stationary outliers, i.e. outliers which have time varying support. We validate our approach with empirical results showing the superiority of the proposed method over competing approaches in synthetic data experiments as well as the multiple measurement face recognition problem.
Tasks Face Recognition
Published 2016-05-06
URL http://arxiv.org/abs/1605.02057v2
PDF http://arxiv.org/pdf/1605.02057v2.pdf
PWC https://paperswithcode.com/paper/robust-bayesian-method-for-simultaneous-block
Repo
Framework

Adaptive Gray World-Based Color Normalization of Thin Blood Film Images

Title Adaptive Gray World-Based Color Normalization of Thin Blood Film Images
Authors F. Boray Tek, Andrew G. Dempster, İzzet Kale
Abstract This paper presents an effective color normalization method for thin blood film images of peripheral blood specimens. Thin blood film images can easily be separated into foreground (cell) and background (plasma) parts. The color of the plasma region is used to estimate and reduce the differences arising from different illumination conditions. A second stage normalization based on the database-gray world algorithm transforms the color of the foreground objects to match a reference color character. The quantitative experiments demonstrate the effectiveness of the method and its advantages over two other general purpose color correction methods: simple gray world and Retinex.
Tasks
Published 2016-07-14
URL http://arxiv.org/abs/1607.04032v1
PDF http://arxiv.org/pdf/1607.04032v1.pdf
PWC https://paperswithcode.com/paper/adaptive-gray-world-based-color-normalization
Repo
Framework
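
The abstract above compares its method against the simple gray world algorithm. For orientation, here is a minimal sketch of that simple baseline, not of the paper's adaptive two-stage method: each color channel is rescaled so that its mean matches the mean over all three channels. The image is assumed to be a non-empty list of (r, g, b) tuples with values in 0 to 255, and the function name `gray_world` is hypothetical.

```python
def gray_world(pixels):
    """Rescale each channel so its mean equals the overall mean intensity.

    pixels: non-empty list of (r, g, b) tuples with values in [0, 255].
    Values are clipped at 255 after scaling; a channel whose mean is zero
    is left unchanged.
    """
    count = len(pixels)
    means = [sum(p[c] for p in pixels) / count for c in range(3)]
    gray = sum(means) / 3
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [
        tuple(min(255.0, p[c] * gains[c]) for c in range(3))
        for p in pixels
    ]


if __name__ == "__main__":
    # A reddish cast: the red channel is brighter than green and blue.
    image = [(100, 50, 25), (200, 100, 50)]
    for pixel in gray_world(image):
        print(pixel)
```

Gray world assumes the average scene color is neutral, which can fail when one color dominates the image content; that limitation is part of the motivation for using a known background region, as the abstract describes for the plasma.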