October 19, 2019

2962 words 14 mins read

Paper Group ANR 356

Global-scale phylogenetic linguistic inference from lexical resources. SparCML: High-Performance Sparse Communication for Machine Learning. 2D-Densely Connected Convolution Neural Networks for automatic Liver and Tumor Segmentation. Augmenting Bottleneck Features of Deep Neural Network Employing Motor State for Speech Recognition at Humanoid Robots …

Global-scale phylogenetic linguistic inference from lexical resources

Title Global-scale phylogenetic linguistic inference from lexical resources
Authors Gerhard Jäger
Abstract Automatic phylogenetic inference plays an increasingly important role in computational historical linguistics. Most pertinent work is currently based on expert cognate judgments, which limits the scope of this approach to a small number of well-studied language families. We used machine learning techniques to compile data suitable for phylogenetic inference from the ASJP database, a collection of almost 7,000 phonetically transcribed word lists over 40 concepts, covering two-thirds of extant worldwide linguistic diversity. First, we estimated Pointwise Mutual Information scores between sound classes using weighted sequence alignment and general-purpose optimization. From these scores we computed a dissimilarity matrix over all ASJP word lists, suitable for distance-based phylogenetic inference. Second, we applied cognate clustering to the ASJP data, using supervised training of an SVM classifier on expert cognacy judgments. Third, we defined two types of binary characters, based on automatically inferred cognate classes and on sound-class occurrences. Several tests are reported demonstrating the suitability of these characters for character-based phylogenetic inference.
Tasks
Published 2018-02-17
URL http://arxiv.org/abs/1802.06079v1
PDF http://arxiv.org/pdf/1802.06079v1.pdf
PWC https://paperswithcode.com/paper/global-scale-phylogenetic-linguistic
Repo
Framework
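The PMI estimation step can be sketched in plain Python. This is a toy illustration only, not the paper's weighted-alignment pipeline: it assumes the aligned sound-class pairs have already been harvested from pairwise word alignments.

```python
import math
from collections import Counter

def pmi_scores(aligned_pairs):
    """PMI(a, b) = log[ p(a, b) / (p(a) * p(b)) ] over aligned
    sound-class pairs harvested from pairwise word alignments."""
    pair_counts = Counter(aligned_pairs)
    a_counts = Counter(a for a, _ in aligned_pairs)
    b_counts = Counter(b for _, b in aligned_pairs)
    n = len(aligned_pairs)
    return {
        (a, b): math.log((c / n) / ((a_counts[a] / n) * (b_counts[b] / n)))
        for (a, b), c in pair_counts.items()
    }

# Toy data: 't' aligns with 'd' far more often than chance,
# while 't'/'k' alignments are rare.
pairs = [("t", "d")] * 6 + [("t", "k")] + [("p", "k")] + [("p", "b")] * 2
scores = pmi_scores(pairs)  # scores[("t", "d")] > 0 > scores[("t", "k")]
```

Positive PMI marks sound-class pairs that co-occur in alignments more often than chance, which is what makes the scores usable as substitution weights for the dissimilarity matrix.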

SparCML: High-Performance Sparse Communication for Machine Learning

Title SparCML: High-Performance Sparse Communication for Machine Learning
Authors Cedric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler
Abstract Applying machine learning techniques to the quickly growing data in science and industry requires highly scalable algorithms. Large datasets are most commonly processed “data parallel”, distributed across many nodes. Each node’s contribution to the overall gradient is summed using a global allreduce. This allreduce is the single communication, and thus scalability, bottleneck for most machine learning workloads. We observe that frequently, many gradient values are (close to) zero, leading to sparse or sparsifiable communications. To exploit this insight, we analyze, design, and implement a set of communication-efficient protocols for sparse input data, in conjunction with efficient machine learning algorithms which can leverage these primitives. Our communication protocols generalize standard collective operations by allowing processes to contribute arbitrary sparse input data vectors. Our generic communication library, SparCML, extends MPI to support additional features, such as non-blocking (asynchronous) operations and low-precision data representations. As such, SparCML and its techniques will form the basis of future highly scalable machine learning frameworks.
Tasks
Published 2018-02-22
URL https://arxiv.org/abs/1802.08021v3
PDF https://arxiv.org/pdf/1802.08021v3.pdf
PWC https://paperswithcode.com/paper/sparcml-high-performance-sparse-communication
Repo
Framework
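The core primitive, an allreduce over sparse gradient vectors, can be sketched as follows. This is a single-process toy of the semantics only, not SparCML's MPI implementation: each node's mostly-zero gradient is represented as an index-to-value dict, and the reduction sums overlapping entries.

```python
def sparse_allreduce(contributions):
    """Sum sparse gradient contributions from all nodes.

    contributions: list of dicts {index: value}, one per node,
    each holding only that node's non-zero gradient entries.
    Returns the summed gradient as a sparse dict -- the result
    every node would receive after the allreduce.
    """
    total = {}
    for grad in contributions:
        for idx, val in grad.items():
            total[idx] = total.get(idx, 0.0) + val
    return total

# Three nodes, mostly-zero gradients over a large parameter vector:
node_grads = [{3: 0.5, 900000: -0.1}, {3: 0.25}, {17: 1.0}]
reduced = sparse_allreduce(node_grads)
# reduced == {3: 0.75, 900000: -0.1, 17: 1.0}
```

The communication win comes from exchanging only (index, value) pairs instead of the full dense vector; the hard part the paper addresses is doing this efficiently across many processes.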

2D-Densely Connected Convolution Neural Networks for automatic Liver and Tumor Segmentation

Title 2D-Densely Connected Convolution Neural Networks for automatic Liver and Tumor Segmentation
Authors Krishna Chaitanya Kaluva, Mahendra Khened, Avinash Kori, Ganapathy Krishnamurthi
Abstract In this paper we propose a fully automatic two-stage cascaded approach for segmentation of the liver and its tumors in CT (computed tomography) images using densely connected fully convolutional neural networks (DenseNets). We independently train liver and tumor segmentation models and cascade them for a combined segmentation of the liver and its tumors. The first stage segments the liver; the second stage uses the first stage’s output to localize the liver and then segment tumors inside the liver region. The liver model was trained on down-sampled axial slices $(256 \times 256)$, whereas the tumor model used no down-sampling and was instead trained on CT axial slices windowed at three different Hounsfield unit (HU) levels. On the test set our model achieved global Dice scores of 0.923 for the liver and 0.625 for tumors. The computed tumor burden had an RMSE of 0.044.
Tasks Automatic Liver And Tumor Segmentation
Published 2018-01-05
URL http://arxiv.org/abs/1802.02182v1
PDF http://arxiv.org/pdf/1802.02182v1.pdf
PWC https://paperswithcode.com/paper/2d-densely-connected-convolution-neural
Repo
Framework
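The Dice score reported above is a standard overlap metric; a minimal sketch over binary masks flattened to 0/1 lists:

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks,
    2|P intersect T| / (|P| + |T|), given as flat 0/1 lists."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2.0 * inter / total if total else 1.0

# Prediction hits one of two true-positive pixels, plus one false positive:
score = dice([1, 1, 0, 0], [1, 0, 1, 0])  # 2*1 / (2 + 2) = 0.5
```

A global Dice score, as in the paper's evaluation, pools the intersection and mask sizes over the whole test set before dividing, rather than averaging per-case scores.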

Augmenting Bottleneck Features of Deep Neural Network Employing Motor State for Speech Recognition at Humanoid Robots

Title Augmenting Bottleneck Features of Deep Neural Network Employing Motor State for Speech Recognition at Humanoid Robots
Authors Moa Lee, Joon Hyuk Chang
Abstract For humanoid robots, the internal noise generated by motors, fans, and mechanical components when the robot is moving or shaking its body severely degrades speech recognition accuracy. In this paper, a novel speech recognition system robust to ego-noise is proposed for humanoid robots, in which the on/off state of the motors is employed as auxiliary information for finding relevant input features. For this, we consider bottleneck features, which have been successfully applied to deep neural network (DNN) based automatic speech recognition (ASR) systems. When learning the bottleneck features, we first exploit the motor on/off state as supplementary information, in addition to the acoustic features, as the input of the first DNN for preliminary acoustic modeling. Then, the second DNN, for primary acoustic modeling, employs both the bottleneck features passed from the first DNN and the acoustic features. When the proposed method is evaluated in terms of phoneme error rate (PER) on the TIMIT database, the experimental results show that a clear improvement (11% relative) is achieved by our algorithm over conventional systems.
Tasks Speech Recognition
Published 2018-08-27
URL http://arxiv.org/abs/1808.08702v1
PDF http://arxiv.org/pdf/1808.08702v1.pdf
PWC https://paperswithcode.com/paper/augmenting-bottleneck-features-of-deep-neural
Repo
Framework
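The auxiliary-input idea, appending the motor on/off state to each acoustic frame before the first DNN, can be sketched as follows. This is a simplification for illustration; the paper's actual feature pipeline is richer.

```python
def augment_with_motor_state(acoustic_frames, motor_on):
    """Append the motor on/off state (1.0/0.0) to every acoustic
    feature frame, forming the input of the first (bottleneck) DNN."""
    flag = 1.0 if motor_on else 0.0
    return [frame + [flag] for frame in acoustic_frames]

# Two toy 2-dimensional acoustic frames, recorded while motors run:
frames = [[0.2, -0.5], [0.1, 0.7]]
augmented = augment_with_motor_state(frames, motor_on=True)
# augmented == [[0.2, -0.5, 1.0], [0.1, 0.7, 1.0]]
```

The flag lets the network condition its learned bottleneck representation on whether ego-noise is present, without any change to the acoustic front end.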

The universal approximation power of finite-width deep ReLU networks

Title The universal approximation power of finite-width deep ReLU networks
Authors Dmytro Perekrestenko, Philipp Grohs, Dennis Elbrächter, Helmut Bölcskei
Abstract We show that finite-width deep ReLU neural networks yield rate-distortion optimal approximation (Bölcskei et al., 2018) of polynomials, windowed sinusoidal functions, one-dimensional oscillatory textures, and the Weierstrass function, a fractal function which is continuous but nowhere differentiable. Together with their recently established universal approximation property of affine function systems (Bölcskei et al., 2018), this shows that deep neural networks approximate vastly different signal structures generated by the affine group, the Weyl-Heisenberg group, or through warping, and even certain fractals, all with approximation error decaying exponentially in the number of neurons. We also prove that in the approximation of sufficiently smooth functions, finite-width deep networks require strictly smaller connectivity than finite-depth wide networks.
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01528v1
PDF http://arxiv.org/pdf/1806.01528v1.pdf
PWC https://paperswithcode.com/paper/the-universal-approximation-power-of-finite
Repo
Framework
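The flavor of such exponential-rate results can be seen in the classic sawtooth approximation of x² on [0, 1] (Yarotsky, 2017), where a depth-m ReLU construction achieves error decaying like 4^(-m). This is an illustrative sketch of that well-known construction, not the paper's own proof technique.

```python
def hat(x):
    """Tent map g(x) = 2x on [0, 1/2], 2(1 - x) on [1/2, 1];
    realizable with two ReLU units: 2*relu(x) - 4*relu(x - 1/2)."""
    return 2 * x if x <= 0.5 else 2 * (1 - x)

def relu_square(x, depth):
    """x**2 = x - sum_{s>=1} g^(s)(x) / 4**s on [0, 1], truncated
    after `depth` compositions; the error is bounded by 4**(-depth),
    i.e. exponentially small in the network depth."""
    approx, g = x, x
    for s in range(1, depth + 1):
        g = hat(g)          # s-fold composition of the tent map
        approx -= g / 4 ** s
    return approx

relu_square(0.25, 6)  # exact at dyadic points once depth suffices
```

Each extra `hat` composition costs a constant number of ReLU neurons but quarters the error bound, which is the "error decaying exponentially in the number of neurons" phenomenon in miniature.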

Compressive sensing adaptation for polynomial chaos expansions

Title Compressive sensing adaptation for polynomial chaos expansions
Authors Panagiotis Tsilifis, Xun Huan, Cosmin Safta, Khachik Sargsyan, Guilhem Lacaze, Joseph C. Oefelein, Habib N. Najm, Roger G. Ghanem
Abstract Basis adaptation in Homogeneous Chaos spaces relies on a suitable rotation of the underlying Gaussian germ. Several rotations have been proposed in the literature, resulting in adaptations with different convergence properties. In this paper we present a new adaptation mechanism that builds on compressive sensing algorithms, resulting in a reduced polynomial chaos approximation with optimal sparsity. The developed adaptation algorithm consists of a two-step optimization procedure that computes the optimal coefficients and the input projection matrix of a low-dimensional chaos expansion with respect to an optimally rotated basis. We demonstrate the attractive features of our algorithm through several numerical examples, including an application to Large-Eddy Simulation (LES) calculations of turbulent combustion in a HIFiRE scramjet engine.
Tasks Compressive Sensing
Published 2018-01-06
URL http://arxiv.org/abs/1801.01961v2
PDF http://arxiv.org/pdf/1801.01961v2.pdf
PWC https://paperswithcode.com/paper/compressive-sensing-adaptation-for-polynomial
Repo
Framework

Automatic Identification of Ineffective Online Student Questions in Computing Education

Title Automatic Identification of Ineffective Online Student Questions in Computing Education
Authors Qiang Hao, April Galyardt, Bradley Barnes, Robert Maribe Branch, Ewan Wright
Abstract This Research Full Paper explores automatic identification of ineffective learning questions in the context of large-scale computer science classes. The immediate and accurate identification of ineffective learning questions opens the door to possible automated facilitation on a large scale, such as alerting learners to revise questions and providing adaptive question revision suggestions. To achieve this, 983 questions were collected from a question & answer platform implemented by an introductory programming course over three semesters in a large research university in the Southeastern United States. Questions were first manually classified into three hierarchical categories: 1) learning-irrelevant questions, 2) effective learning-relevant questions, and 3) ineffective learning-relevant questions. The inter-rater reliability of the manual classification (Cohen’s Kappa) was .88. Four different machine learning algorithms were then used to automatically classify the questions: Naive Bayes Multinomial, Logistic Regression, Support Vector Machines, and Boosted Decision Tree. Both flat and single-path strategies were explored, and the most effective algorithms under both strategies were identified and discussed. This study contributes to the automatic determination of learning question quality in computer science, and provides evidence for the feasibility of automated facilitation of online question & answer in large-scale computer science classes.
Tasks
Published 2018-07-18
URL http://arxiv.org/abs/1807.07173v3
PDF http://arxiv.org/pdf/1807.07173v3.pdf
PWC https://paperswithcode.com/paper/automatic-identification-of-ineffective
Repo
Framework
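The single-path hierarchical strategy can be sketched as two chained binary decisions. The classifier callables below are hypothetical stand-ins for the trained models (Naive Bayes, SVM, etc.); only the routing logic is the point.

```python
def single_path_classify(question, is_relevant, is_effective):
    """Single-path strategy: first decide learning-relevant vs.
    irrelevant; only relevant questions reach the second,
    effective-vs-ineffective classifier."""
    if not is_relevant(question):
        return "learning-irrelevant"
    if is_effective(question):
        return "effective learning-relevant"
    return "ineffective learning-relevant"

# Toy stand-ins: a question is "relevant" if it mentions code,
# "effective" if it includes an error message.
relevant = lambda q: "code" in q
effective = lambda q: "error" in q
label = single_path_classify("my code raises a TypeError error", relevant, effective)
```

By contrast, a flat strategy would train one three-way classifier directly over all questions; the paper compares both.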

ASIC Implementation of Time-Domain Digital Backpropagation with Deep-Learned Chromatic Dispersion Filters

Title ASIC Implementation of Time-Domain Digital Backpropagation with Deep-Learned Chromatic Dispersion Filters
Authors Christoffer Fougstedt, Christian Häger, Lars Svensson, Henry D. Pfister, Per Larsson-Edefors
Abstract We consider time-domain digital backpropagation with chromatic dispersion filters jointly optimized and quantized using machine-learning techniques. Compared to the baseline implementations, we show improved BER performance and >40% power dissipation reductions in 28-nm CMOS.
Tasks
Published 2018-06-19
URL http://arxiv.org/abs/1806.07223v2
PDF http://arxiv.org/pdf/1806.07223v2.pdf
PWC https://paperswithcode.com/paper/asic-implementation-of-time-domain-digital
Repo
Framework

Multi-Step Prediction of Dynamic Systems with Recurrent Neural Networks

Title Multi-Step Prediction of Dynamic Systems with Recurrent Neural Networks
Authors Nima Mohajerin, Steven L. Waslander
Abstract Recurrent Neural Networks (RNNs) can encode rich dynamics, which makes them suitable for modeling dynamic systems. To train an RNN for multi-step prediction of dynamic systems, it is crucial to efficiently address the state initialization problem, which seeks proper values for the RNN initial states at the beginning of each prediction interval. In this work, the state initialization problem is addressed using Neural Networks (NNs) to effectively train a variety of RNNs for modeling two aerial vehicles, a helicopter and a quadrotor, from experimental data. It is shown that the RNN initialized by the NN-based initialization method outperforms the state of the art. Further, a comprehensive study of RNNs trained for multi-step prediction of the two aerial vehicles is presented. The multi-step prediction of the quadrotor is enhanced using a hybrid model which combines a simplified physics-based motion model of the vehicle with RNNs. While the maximum translational and rotational velocities in the quadrotor dataset are about 4 m/s and 3.8 rad/s, respectively, the hybrid model produces predictions, over 1.9 seconds, which remain within 9 cm/s and 0.12 rad/s of the measured translational and rotational velocities, with 99% confidence on the test dataset.
Tasks
Published 2018-05-20
URL http://arxiv.org/abs/1806.00526v1
PDF http://arxiv.org/pdf/1806.00526v1.pdf
PWC https://paperswithcode.com/paper/multi-step-prediction-of-dynamic-systems-with
Repo
Framework
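The hybrid multi-step scheme can be sketched as a rollout in which a simplified physics model propagates the state and a learned model adds a residual correction at each step. This is a structural sketch under assumed interfaces, not the paper's RNN; both models are plain callables here.

```python
def hybrid_rollout(state, controls, physics_step, residual_model):
    """Multi-step prediction: at each step the physics model
    propagates the state, then the learned residual model
    corrects what the simplified physics misses."""
    trajectory = []
    for u in controls:
        state = physics_step(state, u)
        correction = residual_model(state, u)
        state = [s + c for s, c in zip(state, correction)]
        trajectory.append(state)
    return trajectory

# Toy 1-D "vehicle": velocity integrates the control; zero residual.
physics = lambda s, u: [s[0] + u]
no_residual = lambda s, u: [0.0]
traj = hybrid_rollout([0.0], [1.0, 1.0, 0.5], physics, no_residual)
# traj == [[1.0], [2.0], [2.5]]
```

Because errors compound over the prediction horizon, the quality of the initial state (the paper's NN-based initialization) and of the residual model together determine multi-step accuracy.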

Reliable uncertainty estimate for antibiotic resistance classification with Stochastic Gradient Langevin Dynamics

Title Reliable uncertainty estimate for antibiotic resistance classification with Stochastic Gradient Langevin Dynamics
Authors Md-Nafiz Hamid, Iddo Friedberg
Abstract Antibiotic resistance monitoring is of paramount importance in the face of this ongoing global epidemic. Deep learning models trained with traditional optimization algorithms (e.g. Adam, SGD) provide poor posterior estimates when tested against out-of-distribution (OoD) antibiotic-resistant/non-resistant genes. In this paper, we introduce a deep learning model trained with Stochastic Gradient Langevin Dynamics (SGLD) to classify antibiotic resistance genes. The model provides better uncertainty estimates when tested against OoD data compared to traditional optimization methods such as Adam.
Tasks
Published 2018-11-27
URL http://arxiv.org/abs/1811.11145v1
PDF http://arxiv.org/pdf/1811.11145v1.pdf
PWC https://paperswithcode.com/paper/reliable-uncertainty-estimate-for-antibiotic
Repo
Framework
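The SGLD update is a one-line modification of SGD: a gradient step on the (minibatch-estimated) log posterior plus Gaussian noise whose variance matches the step size (Welling & Teh, 2011), so the iterates sample the posterior rather than collapsing to a point estimate. A minimal sketch:

```python
import math
import random

def sgld_step(params, grad_log_post, lr, rng=random):
    """theta <- theta + (lr/2) * grad log p(theta | data) + N(0, lr).
    With the noise term removed this is plain gradient ascent."""
    grads = grad_log_post(params)
    return [p + 0.5 * lr * g + rng.gauss(0.0, math.sqrt(lr))
            for p, g in zip(params, grads)]

# Toy standard-normal "posterior": grad log p(theta) = -theta.
rng = random.Random(0)
theta = [2.0]
for _ in range(1000):
    theta = sgld_step(theta, lambda p: [-x for x in p], 0.01, rng)
# theta wanders near 0 rather than converging exactly to the mode,
# and the spread of such iterates is the uncertainty estimate.
```

Averaging predictions over the sampled iterates is what yields the better-calibrated OoD uncertainty the abstract describes.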

An Overview of Computational Approaches for Interpretation Analysis

Title An Overview of Computational Approaches for Interpretation Analysis
Authors Philipp Blandfort, Jörn Hees, Desmond U. Patton
Abstract It is said that beauty is in the eye of the beholder. But how exactly can we characterize such discrepancies in interpretation? For example, are there any specific features of an image that makes person A regard an image as beautiful while person B finds the same image displeasing? Such questions ultimately aim at explaining our individual ways of interpretation, an intention that has been of fundamental importance to the social sciences from the beginning. More recently, advances in computer science brought up two related questions: First, can computational tools be adopted for analyzing ways of interpretation? Second, what if the “beholder” is a computer model, i.e., how can we explain a computer model’s point of view? Numerous efforts have been made regarding both of these points, while many existing approaches focus on particular aspects and are still rather separate. With this paper, in order to connect these approaches we introduce a theoretical framework for analyzing interpretation, which is applicable to interpretation of both human beings and computer models. We give an overview of relevant computational approaches from various fields, and discuss the most common and promising application areas. The focus of this paper lies on interpretation of text and image data, while many of the presented approaches are applicable to other types of data as well.
Tasks
Published 2018-11-09
URL https://arxiv.org/abs/1811.04028v2
PDF https://arxiv.org/pdf/1811.04028v2.pdf
PWC https://paperswithcode.com/paper/an-overview-of-computational-approaches-for
Repo
Framework

Unmixing urban hyperspectral imagery with a Gaussian mixture model on endmember variability

Title Unmixing urban hyperspectral imagery with a Gaussian mixture model on endmember variability
Authors Yuan Zhou, Erin B. Wetherley, Paul D. Gader
Abstract In this paper, we model a pixel as a linear combination of endmembers sampled from probability distributions of Gaussian mixture models (GMM). The parameters of the GMM distributions are estimated using spectral libraries. Abundances are estimated based on the distribution parameters. The advantage of this algorithm is that the model size grows very slowly as a function of the library size. To validate this method, we used data collected by the AVIRIS sensor over the Santa Barbara region: two 16 m spatial resolution and two 4 m spatial resolution images. 64 validated regions of interest (ROI) (180 m by 180 m) were used to assess estimate accuracy. Ground truth was obtained using 1 m images leading to the following 6 classes: turfgrass, non-photosynthetic vegetation (NPV), paved, roof, soil, and tree. Spectral libraries were built by manually identifying and extracting pure spectra from both resolution images, resulting in 3,287 spectra at 16 m and 15,426 spectra at 4 m. We then unmixed ROIs of each resolution using the following unmixing algorithms: the set-based algorithms MESMA and AAM, and the distribution-based algorithms GMM, NCM, and BCM. The original libraries were used for the distribution-based algorithms whereas set-based methods required a sophisticated reduction method, resulting in reduced libraries of 61 spectra at 16 m and 95 spectra at 4 m. The results show that GMM performs best among the distribution-based methods, producing comparable accuracy to MESMA, and may be more robust across datasets.
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1801.08513v1
PDF http://arxiv.org/pdf/1801.08513v1.pdf
PWC https://paperswithcode.com/paper/unmixing-urban-hyperspectral-imagery-with-a
Repo
Framework
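The generative model, a pixel as a linear combination of endmembers each drawn from that material's Gaussian mixture, can be sketched as follows. This is a simplified, hypothetical parameterization (one scalar standard deviation per component) for illustration only.

```python
import random

def sample_pixel(abundances, materials, rng=random):
    """Draw one pixel: for each material, pick a GMM component
    (weight, mean_spectrum, std), sample an endmember spectrum,
    and sum the abundance-weighted spectra across materials."""
    bands = len(materials[0][0][1])
    pixel = [0.0] * bands
    for a, components in zip(abundances, materials):
        weights = [w for w, _, _ in components]
        _, mean, std = rng.choices(components, weights=weights)[0]
        for i, m in enumerate(mean):
            pixel[i] += a * rng.gauss(m, std)
    return pixel

# Two materials over two bands, one zero-variance component each,
# so the mixture is deterministic: 0.5*[2,4] + 0.5*[0,2] = [1,3].
materials = [[(1.0, [2.0, 4.0], 0.0)], [(1.0, [0.0, 2.0], 0.0)]]
pixel = sample_pixel([0.5, 0.5], materials)
```

Unmixing inverts this model: given the pixel and the per-material GMM parameters estimated from spectral libraries, estimate the abundances.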

SpiNNTools: The Execution Engine for the SpiNNaker Platform

Title SpiNNTools: The Execution Engine for the SpiNNaker Platform
Authors Andrew G. D. Rowley, Christian Brenninkmeijer, Simon Davidson, Donal Fellows, Andrew Gait, David R. Lester, Luis A. Plana, Oliver Rhodes, Alan B. Stokes, Steve B. Furber
Abstract Distributed systems are becoming more commonplace, as computers typically contain multiple processors. The SpiNNaker architecture is such a distributed architecture, containing millions of cores connected with a unique communication network, making it one of the largest neuromorphic computing platforms in the world. Utilising these processors efficiently usually requires expert knowledge of the architecture to generate executable code. This work introduces a set of tools (SpiNNTools) that can map computational work, described as a graph, into executable code that runs on this novel machine. The SpiNNaker architecture is highly scalable, which in turn produces unique challenges in loading data, executing the mapped problem, and retrieving the results. In this paper we describe these challenges in detail and the solutions implemented.
Tasks
Published 2018-10-16
URL http://arxiv.org/abs/1810.06835v1
PDF http://arxiv.org/pdf/1810.06835v1.pdf
PWC https://paperswithcode.com/paper/spinntools-the-execution-engine-for-the
Repo
Framework

Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions

Title Recurrent Neural Networks with Flexible Gates using Kernel Activation Functions
Authors Simone Scardapane, Steven Van Vaerenbergh, Danilo Comminiello, Simone Totaro, Aurelio Uncini
Abstract Gated recurrent neural networks have achieved remarkable results in the analysis of sequential data. Inside these networks, gates are used to control the flow of information, allowing even very long-term dependencies in the data to be modeled. In this paper, we investigate whether the original gate equation (a linear projection followed by an element-wise sigmoid) can be improved. In particular, we design a more flexible architecture, with a small number of adaptable parameters, which is able to model a wider range of gating functions than the classical one. To this end, we replace the sigmoid function in the standard gate with a non-parametric formulation extending the recently proposed kernel activation function (KAF), with the addition of a residual skip-connection. A set of experiments on sequential variants of the MNIST dataset shows that the novel gate improves accuracy with a negligible cost in computational power and a large reduction in the number of training iterations required.
Tasks
Published 2018-07-11
URL http://arxiv.org/abs/1807.04065v1
PDF http://arxiv.org/pdf/1807.04065v1.pdf
PWC https://paperswithcode.com/paper/recurrent-neural-networks-with-flexible-gates
Repo
Framework
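The modified gate can be sketched as a Gaussian kernel expansion over a fixed dictionary plus the sigmoid as a residual skip-connection. This is a scalar sketch of the idea; the paper's exact KAF parameterization may differ in detail.

```python
import math

def kaf_gate(s, alphas, dictionary, gamma=1.0):
    """Flexible gate: KAF(s) + sigmoid(s), where KAF is a
    Gaussian-kernel mixture with learnable weights `alphas`
    over fixed dictionary points. With alphas all zero the
    gate reduces to the classical sigmoid gate."""
    kaf = sum(a * math.exp(-gamma * (s - d) ** 2)
              for a, d in zip(alphas, dictionary))
    return kaf + 1.0 / (1.0 + math.exp(-s))

# Zero mixing weights recover the standard sigmoid gate exactly:
kaf_gate(0.0, [0.0, 0.0], [-1.0, 1.0])  # == sigmoid(0) == 0.5
```

Initializing `alphas` at zero means training starts from the familiar sigmoid gate and only departs from it where the data demands, which is what the residual skip-connection buys.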

Candidate Labeling for Crowd Learning

Title Candidate Labeling for Crowd Learning
Authors Iker Beñaran-Muñoz, Jerónimo Hernández-González, Aritz Pérez
Abstract Crowdsourcing has become very popular in the machine learning community as a way to obtain labels from which a ground truth can be estimated for a given dataset. In most approaches that use crowdsourced labels, annotators are asked to provide a single class label for each presented instance. Such a request can be inefficient: given that labelers may not be experts, this procedure may fail to take real advantage of their knowledge. In this paper, the use of candidate labeling for crowd learning is proposed, where annotators may provide more than a single label per instance so as not to miss the real label. The main hypothesis is that, by allowing candidate labeling, knowledge can be extracted from the labelers more efficiently than in the standard crowd learning scenario. Empirical evidence supporting this hypothesis is presented.
Tasks
Published 2018-04-26
URL http://arxiv.org/abs/1804.10023v2
PDF http://arxiv.org/pdf/1804.10023v2.pdf
PWC https://paperswithcode.com/paper/candidate-labeling-for-crowd-learning
Repo
Framework
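The candidate-labeling idea can be sketched with fractional voting: each annotator's candidate set spreads one vote evenly over its members. This is a toy aggregation for illustration, not the paper's learning model.

```python
from collections import Counter

def aggregate_candidates(annotations):
    """Each annotation is a set of candidate labels; each candidate
    receives 1/len(set) of that annotator's vote. Returns the label
    with the highest total vote."""
    votes = Counter()
    for candidates in annotations:
        for label in candidates:
            votes[label] += 1.0 / len(candidates)
    return votes.most_common(1)[0][0]

# Three annotators; the second hedges between two labels:
label = aggregate_candidates([{"cat"}, {"cat", "dog"}, {"dog", "bird"}])
# label == "cat"  (votes: cat 1.5, dog 1.0, bird 0.5)
```

The point of candidate labeling is visible even in this toy: the hedging annotator still contributes half a vote toward the true label instead of a possibly wrong forced choice.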