October 18, 2019

3069 words 15 mins read

Paper Group ANR 650


Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models

Title Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models
Authors Rainier Barrett, Shaoyi Jiang, Andrew D White
Abstract Bayesian network models are finding success in characterizing enzyme-catalyzed reactions, slow conformational changes, predicting enzyme inhibition, and genomics. In this work, we apply them to statistical modeling of peptides by simultaneously identifying amino acid sequence motifs and using a motif-based model to clarify the role motifs may play in antimicrobial activity. We construct models of increasing sophistication, demonstrating how chemical knowledge of a peptide system may be embedded without requiring new derivation of model fitting equations after changing model structure. These models are used to construct classifiers with good performance (94% accuracy, Matthews correlation coefficient of 0.87) at predicting antimicrobial activity in peptides, while at the same time being built of interpretable parameters. We demonstrate use of these models to identify peptides that are potentially both antimicrobial and antifouling, and show that the background distribution of amino acids could play a greater role in activity than sequence motifs do. This provides an advancement in the type of peptide activity modeling that can be done and the ease with which models can be constructed.
Tasks
Published 2018-04-17
URL http://arxiv.org/abs/1804.06327v1
PDF http://arxiv.org/pdf/1804.06327v1.pdf
PWC https://paperswithcode.com/paper/classifying-antimicrobial-and-multifunctional
Repo
Framework
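
The reported 94% accuracy and Matthews correlation coefficient (MCC) of 0.87 describe a binary antimicrobial/non-antimicrobial classifier. As a reference for how that metric behaves, here is a minimal NumPy sketch of MCC on a toy confusion matrix; the paper's Bayesian model itself is not reproduced, and the toy labels are invented.

```python
import numpy as np

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels in {0, 1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0

# Toy example: 94 of 100 predictions correct on balanced classes.
y_true = np.array([1] * 50 + [0] * 50)
y_pred = y_true.copy()
y_pred[:3] = 0   # 3 false negatives
y_pred[-3:] = 1  # 3 false positives
print(matthews_corrcoef(y_true, y_pred))  # 0.88, in the same range as the reported 0.87
```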

Viewpoint: Artificial Intelligence and Labour

Title Viewpoint: Artificial Intelligence and Labour
Authors Spyridon Samothrakis
Abstract The welfare of modern societies has been intrinsically linked to wage labour. With some exceptions, the modern human has to sell her labour-power to be able to reproduce biologically and socially. Thus, a lingering fear of technological unemployment features predominantly as a theme among Artificial Intelligence researchers. In this short paper we show that, if past trends are anything to go by, this fear is irrational. On the contrary, we argue that the main problem humanity will be facing is the normalisation of extremely long working hours.
Tasks
Published 2018-03-17
URL http://arxiv.org/abs/1803.06563v1
PDF http://arxiv.org/pdf/1803.06563v1.pdf
PWC https://paperswithcode.com/paper/viewpoint-artificial-intelligence-and-labour
Repo
Framework

Exploring Conversational Language Generation for Rich Content about Hotels

Title Exploring Conversational Language Generation for Rich Content about Hotels
Authors Marilyn A. Walker, Albry Smither, Shereen Oraby, Vrindavan Harrison, Hadar Shemtov
Abstract Dialogue systems for hotel and tourist information have typically simplified the richness of the domain, focusing system utterances on only a few selected attributes such as price, location and type of rooms. However, much more content is typically available for hotels, often as many as 50 distinct instantiated attributes for an individual entity. New methods are needed to use this content to generate natural dialogues for hotel information, and in general for any domain with such rich complex content. We describe three experiments aimed at collecting data that can inform an NLG system for hotel dialogues, and show, not surprisingly, that the sentences in the original written hotel descriptions provided on webpages for each hotel are stylistically not a very good match for conversational interaction. We quantify the stylistic features that characterize the differences between the original textual data and the collected dialogic data. We plan to use these in stylistic models for generation, and for scoring retrieved utterances for use in hotel dialogues.
Tasks Text Generation
Published 2018-05-01
URL http://arxiv.org/abs/1805.00551v1
PDF http://arxiv.org/pdf/1805.00551v1.pdf
PWC https://paperswithcode.com/paper/exploring-conversational-language-generation
Repo
Framework
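
The abstract does not list the stylistic features used, so the sketch below profiles text with two hypothetical stand-ins (average sentence length and a pronoun rate) purely to illustrate how written hotel descriptions and dialogic data might be compared.

```python
import re

def stylistic_features(text):
    """Toy stylistic profile; the features used in the paper may differ."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    pronouns = {"i", "you", "we", "your", "our", "my"}
    return {
        "avg_sentence_len": len(tokens) / max(len(sentences), 1),
        "pronoun_rate": sum(t in pronouns for t in tokens) / max(len(tokens), 1),
    }

written = "The hotel offers 24-hour room service, a rooftop pool, and complimentary Wi-Fi."
dialogic = "You'll love it! We stayed there and the rooftop pool is great."
print(stylistic_features(written))
print(stylistic_features(dialogic))
```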

Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks

Title Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks
Authors Panos Stinis, Tobias Hagge, Alexandre M. Tartakovsky, Enoch Yeung
Abstract We suggest ways to enforce given constraints in the output of a Generative Adversarial Network (GAN) generator both for interpolation and extrapolation (prediction). For the case of dynamical systems, given a time series, we wish to train GAN generators that can be used to predict trajectories starting from a given initial condition. In this setting, the constraints can be in algebraic and/or differential form. Even though we are predominantly interested in the case of extrapolation, we will see that the tasks of interpolation and extrapolation are related. However, they need to be treated differently. For the case of interpolation, the incorporation of constraints is built into the training of the GAN. The incorporation of the constraints respects the primary game-theoretic setup of a GAN so it can be combined with existing algorithms. However, it can exacerbate the problem of instability during training that is well-known for GANs. We suggest adding small noise to the constraints as a simple remedy that has performed well in our numerical experiments. The case of extrapolation (prediction) is more involved. During training, the GAN generator learns to interpolate a noisy version of the data and we enforce the constraints. This approach has connections with model reduction that we can utilize to improve the efficiency and accuracy of the training. Depending on the form of the constraints, we may enforce them also during prediction through a projection step. We provide examples of linear and nonlinear systems of differential equations to illustrate the various constructions.
Tasks Time Series
Published 2018-03-22
URL https://arxiv.org/abs/1803.08182v2
PDF https://arxiv.org/pdf/1803.08182v2.pdf
PWC https://paperswithcode.com/paper/enforcing-constraints-for-interpolation-and
Repo
Framework
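
A minimal sketch of the interpolation-side idea from the abstract: add a soft penalty for constraint violation to the generator loss and perturb the constraint with small noise for training stability. The unit-circle constraint, the penalty weight, and the adversarial-loss value are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def constraint_residual(x):
    """Example algebraic constraint g(x) = 0: samples should lie on the unit circle.
    A stand-in only; the paper handles algebraic and/or differential constraints."""
    return x[:, 0] ** 2 + x[:, 1] ** 2 - 1.0

def constrained_generator_loss(adv_loss, samples, lam=1.0, noise_std=0.01):
    """Adversarial loss plus a soft constraint penalty, with small noise added to the
    constraint residual in the spirit of the paper's stabilisation trick."""
    residual = constraint_residual(samples) + rng.normal(0.0, noise_std, size=len(samples))
    return adv_loss + lam * np.mean(residual ** 2)

samples = rng.normal(size=(8, 2))   # pretend generator output
print(constrained_generator_loss(adv_loss=0.7, samples=samples))
```

For extrapolation, the abstract additionally mentions a projection step at prediction time, which is not sketched here.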

Learning Optical Flow via Dilated Networks and Occlusion Reasoning

Title Learning Optical Flow via Dilated Networks and Occlusion Reasoning
Authors Yi Zhu, Shawn Newsam
Abstract Despite the significant progress that has been made on estimating optical flow recently, most estimation methods, including classical and deep learning approaches, still have difficulty with multi-scale estimation, real-time computation, and/or occlusion reasoning. In this paper, we introduce dilated convolution and occlusion reasoning into unsupervised optical flow estimation to address these issues. The dilated convolution allows our network to avoid upsampling via deconvolution and the resulting gridding artifacts. Dilated convolution also results in a smaller memory footprint which speeds up inference. The occlusion reasoning prevents our network from learning incorrect deformations due to occluded image regions during training. Our proposed method outperforms state-of-the-art unsupervised approaches on the KITTI benchmark. We also demonstrate its generalization capability by applying it to action recognition in video.
Tasks Optical Flow Estimation, Temporal Action Localization
Published 2018-05-07
URL http://arxiv.org/abs/1805.02733v1
PDF http://arxiv.org/pdf/1805.02733v1.pdf
PWC https://paperswithcode.com/paper/learning-optical-flow-via-dilated-networks
Repo
Framework
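
The resolution-preserving property of dilation that lets the network avoid deconvolution can be checked in a couple of lines of PyTorch; the channel counts and image size below are arbitrary, not the paper's architecture.

```python
import torch
import torch.nn as nn

# A 3x3 convolution with dilation=2 covers a 5x5 receptive field yet, with padding=2,
# keeps the spatial resolution, so no deconvolution (and none of its gridding artifacts)
# is needed to produce a dense flow field.
x = torch.randn(1, 64, 96, 320)                        # batch, channels, H, W
dilated = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)
print(dilated(x).shape)                                # torch.Size([1, 64, 96, 320])
```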

Coopetitive Soft Gating Ensemble

Title Coopetitive Soft Gating Ensemble
Authors Stephan Deist, Maarten Bieshaar, Jens Schreiber, Andre Gensler, Bernhard Sick
Abstract In this article, we propose the Coopetitive Soft Gating Ensemble or CSGE for general machine learning tasks and interwoven systems. The goal of machine learning is to create models that generalize well for unknown datasets. Often, however, the problems are too complex to be solved with a single model, so several models are combined. Similarly, Autonomic Computing requires the integration of different systems. Here, especially, the local, temporal online evaluation and the resulting (re-)weighting scheme of the CSGE makes the approach highly applicable for self-improving system integrations. To achieve the best potential performance, the CSGE can be optimized according to arbitrary loss functions, making it accessible for a broader range of problems. We introduce a novel training procedure including a hyper-parameter initialisation at its heart. We show that the CSGE approach reaches state-of-the-art performance for both classification and regression tasks. Furthermore, the CSGE provides a human-readable quantification of the influence of all base estimators employing the three weighting aspects. Moreover, we provide a scikit-learn compatible implementation.
Tasks
Published 2018-07-03
URL http://arxiv.org/abs/1807.01020v2
PDF http://arxiv.org/pdf/1807.01020v2.pdf
PWC https://paperswithcode.com/paper/coopetitive-soft-gating-ensemble
Repo
Framework
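
A much simplified gating rule in the spirit of the abstract: base estimators with lower recent error receive higher weight, and an exponent controls how "soft" the gating is. The real CSGE combines global, local and time-dependent weighting aspects and is optimised against arbitrary losses; this sketch shows only the inverse-error weighting idea.

```python
import numpy as np

def soft_gating_weights(recent_errors, eta=2.0, eps=1e-12):
    """Simplified soft gating: lower recent error -> higher weight.
    eta controls how sharply the ensemble favours the best estimator
    (eta=0 gives equal weights). Not the full CSGE weighting scheme."""
    inv = 1.0 / (np.asarray(recent_errors, dtype=float) + eps)
    w = inv ** eta
    return w / w.sum()

preds = np.array([0.9, 1.4, 1.1])        # predictions of three base estimators
errors = np.array([0.10, 0.40, 0.20])    # their recent (local) errors
w = soft_gating_weights(errors)
print(w, float(w @ preds))               # weights and the combined ensemble prediction
```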

Comparing morphological complexity of Spanish, Otomi and Nahuatl

Title Comparing morphological complexity of Spanish, Otomi and Nahuatl
Authors Ximena Gutierrez-Vasques, Victor Mijangos
Abstract We use two small parallel corpora for comparing the morphological complexity of Spanish, Otomi and Nahuatl. These languages belong to different linguistic families, and the latter two are low-resourced. We take into account two quantitative criteria: on the one hand, the distribution of types over tokens in a corpus; on the other, perplexity and entropy as indicators of word structure predictability. We show that a language can be complex in terms of how many different morphological word forms it can produce, yet less complex in terms of the predictability of the internal structure of its words.
Tasks
Published 2018-08-13
URL http://arxiv.org/abs/1808.04314v1
PDF http://arxiv.org/pdf/1808.04314v1.pdf
PWC https://paperswithcode.com/paper/comparing-morphological-complexity-of-spanish
Repo
Framework
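
The two quantitative criteria from the abstract, the type/token distribution and entropy, can be sketched in a few lines of Python. The toy token list below merely stands in for a tokenised side of the parallel corpus; the profile would be computed on each language's side and compared.

```python
import math
from collections import Counter

def corpus_profile(tokens):
    """Two simple complexity indicators: type-token ratio (morphological productivity)
    and unigram word entropy in bits (predictability of word forms)."""
    counts = Counter(tokens)
    n = len(tokens)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"type_token_ratio": len(counts) / n, "word_entropy_bits": entropy}

tokens = "los perros corren y los gatos duermen y los perros duermen".split()
print(corpus_profile(tokens))
# Run the same profile on the Otomi and Nahuatl sides of the parallel corpus to compare.
```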

Scale-recurrent Network for Deep Image Deblurring

Title Scale-recurrent Network for Deep Image Deblurring
Authors Xin Tao, Hongyun Gao, Yi Wang, Xiaoyong Shen, Jue Wang, Jiaya Jia
Abstract In single image deblurring, the “coarse-to-fine” scheme, i.e. gradually restoring the sharp image on different resolutions in a pyramid, is very successful in both traditional optimization-based methods and recent neural-network-based approaches. In this paper, we investigate this strategy and propose a Scale-recurrent Network (SRN-DeblurNet) for this deblurring task. Compared with the many recent learning-based approaches in [25], it has a simpler network structure, a smaller number of parameters and is easier to train. We evaluate our method on large-scale deblurring datasets with complex motion. Results show that our method can produce better quality results than state-of-the-art methods, both quantitatively and qualitatively.
Tasks Deblurring
Published 2018-02-06
URL http://arxiv.org/abs/1802.01770v1
PDF http://arxiv.org/pdf/1802.01770v1.pdf
PWC https://paperswithcode.com/paper/scale-recurrent-network-for-deep-image
Repo
Framework
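
A skeletal version of the coarse-to-fine loop described in the abstract, with a trivial stand-in for the network body. The real SRN shares weights across scales and carries a recurrent hidden state between them, which is omitted here.

```python
import torch
import torch.nn.functional as F

def coarse_to_fine(blurred, net, n_scales=3):
    """Sketch of the coarse-to-fine scheme: restore at the coarsest scale first, then
    upsample the estimate and reuse the same network at each finer scale.
    `net(blurred_s, prev_s)` is a placeholder for the SRN body."""
    b, c, h, w = blurred.shape
    estimate = None
    for s in reversed(range(n_scales)):          # s = 2 (coarsest) ... 0 (finest)
        size = (h // 2 ** s, w // 2 ** s)
        blurred_s = F.interpolate(blurred, size=size, mode="bilinear", align_corners=False)
        prev = blurred_s if estimate is None else F.interpolate(
            estimate, size=size, mode="bilinear", align_corners=False)
        estimate = net(blurred_s, prev)
    return estimate

net = lambda x, prev: 0.5 * (x + prev)           # trivial stand-in "restorer"
print(coarse_to_fine(torch.randn(1, 3, 64, 64), net).shape)  # torch.Size([1, 3, 64, 64])
```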

Pancreas Segmentation in CT and MRI Images via Domain Specific Network Designing and Recurrent Neural Contextual Learning

Title Pancreas Segmentation in CT and MRI Images via Domain Specific Network Designing and Recurrent Neural Contextual Learning
Authors Jinzheng Cai, Le Lu, Fuyong Xing, Lin Yang
Abstract Automatic pancreas segmentation in radiology images, e.g., computed tomography (CT) and magnetic resonance imaging (MRI), is frequently required by computer-aided screening, diagnosis, and quantitative assessment. Yet the pancreas is a challenging abdominal organ to segment due to the high inter-patient anatomical variability in both shape and volume metrics. Recently, convolutional neural networks (CNNs) have demonstrated promising performance on accurate segmentation of the pancreas. However, CNN-based methods often suffer from segmentation discontinuity for reasons such as noisy image quality and blurry pancreatic boundaries. To address this, we propose to introduce recurrent neural networks (RNNs) to tackle the problem of spatial non-smoothness of inter-slice pancreas segmentation across adjacent image slices. To infer an initial segmentation, we first train a 2D CNN sub-network, where we modify its network architecture with deep supervision and multi-scale feature map aggregation so that it can be trained from scratch with small-sized training data and achieves superior performance to transferred models. Thereafter, the successive CNN outputs are processed by another RNN sub-network, which refines the consistency of segmented shapes. More specifically, the RNN sub-network consists of convolutional long short-term memory (CLSTM) units in both top-down and bottom-up directions, which regularizes the segmentation of an image by integrating predictions of its neighboring slices. We train the stacked CNN-RNN model end-to-end and perform quantitative evaluations on both CT and MRI images.
Tasks Computed Tomography (CT), Pancreas Segmentation
Published 2018-03-30
URL http://arxiv.org/abs/1803.11303v1
PDF http://arxiv.org/pdf/1803.11303v1.pdf
PWC https://paperswithcode.com/paper/pancreas-segmentation-in-ct-and-mri-images
Repo
Framework
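
A minimal convolutional LSTM (CLSTM) cell and a single top-down pass over per-slice CNN logits, sketching the inter-slice refinement idea; the paper's sub-network is bidirectional and trained end-to-end with the CNN, and the channel sizes here are arbitrary.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: all four gates are computed by one convolution
    over the concatenated input and hidden state."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)
        h = o * torch.tanh(c)
        return h, c

# Refine per-slice CNN logits slice by slice (top-down pass only; the paper adds bottom-up).
cnn_logits = torch.randn(10, 1, 1, 64, 64)   # 10 adjacent slices of 1-channel logits
cell = ConvLSTMCell(in_ch=1, hid_ch=8)
h = torch.zeros(1, 8, 64, 64)
c = torch.zeros(1, 8, 64, 64)
for s in range(cnn_logits.shape[0]):
    h, c = cell(cnn_logits[s], h, c)
print(h.shape)                               # torch.Size([1, 8, 64, 64])
```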

GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks

Title GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks
Authors Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh
Abstract Generative Adversarial Networks (GANs) are one of the most recent deep learning models that generate synthetic data from limited genuine datasets. GANs are on the frontier as further extension of deep learning into many domains (e.g., medicine, robotics, content synthesis) requires massive sets of labeled data that is generally either unavailable or prohibitively costly to collect. Although GANs are gaining prominence in various fields, there are no accelerators for these new models. In fact, GANs leverage a new operator, called transposed convolution, that exposes unique challenges for hardware acceleration. This operator first inserts zeros within the multidimensional input, then convolves a kernel over this expanded array to add information to the embedded zeros. Even though there is a convolution stage in this operator, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed. We propose the GANAX architecture to alleviate the sources of inefficiency associated with the acceleration of GANs using conventional convolution accelerators, making the first GAN accelerator design possible. We propose a reorganization of the output computations to allocate compute rows with similar patterns of zeros to adjacent processing engines, which also avoids inconsequential multiply-adds on the zeros. This compulsory adjacency reclaims data reuse across these neighboring processing engines, which had otherwise diminished due to the inserted zeros. The reordering breaks the full SIMD execution model, which is prominent in convolution accelerators. Therefore, we propose a unified MIMD-SIMD design for GANAX that leverages repeated patterns in the computation to create distinct microprograms that execute concurrently in SIMD mode.
Tasks
Published 2018-05-10
URL http://arxiv.org/abs/1806.01107v1
PDF http://arxiv.org/pdf/1806.01107v1.pdf
PWC https://paperswithcode.com/paper/ganax-a-unified-mimd-simd-acceleration-for
Repo
Framework
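
The zero-insertion step of transposed convolution that GANAX targets can be made concrete in NumPy. For stride 2, most values in the expanded array are inserted zeros (asymptotically three out of every four), which is where a conventional convolution accelerator wastes its multiply-adds.

```python
import numpy as np

def insert_zeros(x, stride=2):
    """Zero-insertion step of a transposed convolution: each input pixel is spaced
    out by (stride - 1) zeros before a regular convolution is applied."""
    h, w = x.shape
    out = np.zeros((h * stride - (stride - 1), w * stride - (stride - 1)), dtype=x.dtype)
    out[::stride, ::stride] = x
    return out

x = np.arange(1, 10, dtype=float).reshape(3, 3)
expanded = insert_zeros(x)
print(expanded)
print(f"zero fraction: {np.mean(expanded == 0):.2f}")  # 0.64 for this 3x3 example
```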

Automated, predictive, and interpretable inference of C. elegans escape dynamics

Title Automated, predictive, and interpretable inference of C. elegans escape dynamics
Authors Bryan C. Daniels, William S. Ryu, Ilya Nemenman
Abstract The roundworm C. elegans exhibits robust escape behavior in response to rapidly rising temperature. The behavior lasts for a few seconds, shows history dependence, involves both sensory and motor systems, and is too complicated to model mechanistically using currently available knowledge. Instead we model the process phenomenologically, and we use the Sir Isaac dynamical inference platform to infer the model in a fully automated fashion directly from experimental data. The inferred model requires incorporation of an unobserved dynamical variable, and is biologically interpretable. The model makes accurate predictions about the dynamics of the worm behavior, and it can be used to characterize the functional logic of the dynamical system underlying the escape response. This work illustrates the power of modern artificial intelligence to aid in discovery of accurate and interpretable models of complex natural systems.
Tasks
Published 2018-09-25
URL http://arxiv.org/abs/1809.09321v1
PDF http://arxiv.org/pdf/1809.09321v1.pdf
PWC https://paperswithcode.com/paper/automated-predictive-and-interpretable
Repo
Framework

Analysis of Cellular Feature Differences of Astrocytomas with Distinct Mutational Profiles Using Digitized Histopathology Images

Title Analysis of Cellular Feature Differences of Astrocytomas with Distinct Mutational Profiles Using Digitized Histopathology Images
Authors Mousumi Roy, Fusheng Wang, George Teodoro, Jose Velazqeuz Vega, Daniel Brat, Jun Kong
Abstract Cellular phenotypic features derived from histopathology images are the basis of pathologic diagnosis and are thought to be related to underlying molecular profiles. Due to overwhelming cell numbers and population heterogeneity, it remains challenging to quantitatively compute and compare features of cells with distinct molecular signatures. In this study, we propose a self-reliant and efficient analysis framework that supports quantitative analysis of cellular phenotypic difference across distinct molecular groups. To demonstrate efficacy, we quantitatively analyze astrocytomas that are molecularly characterized as either Isocitrate Dehydrogenase (IDH) mutant (MUT) or wildtype (WT) using imaging data from The Cancer Genome Atlas database. Representative cell instances that are phenotypically different between these two groups are retrieved after segmentation, feature computation, data pruning, dimensionality reduction, and unsupervised clustering. Our analysis is generic and can be applied to a wide set of cell-based biomedical research.
Tasks Dimensionality Reduction
Published 2018-06-24
URL http://arxiv.org/abs/1806.09093v1
PDF http://arxiv.org/pdf/1806.09093v1.pdf
PWC https://paperswithcode.com/paper/analysis-of-cellular-feature-differences-of
Repo
Framework
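
The last two stages of the pipeline, dimensionality reduction and unsupervised clustering, sketched with scikit-learn on random stand-in feature vectors; the actual per-cell phenotypic features, component counts and cluster numbers used in the paper are not reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Stand-in for per-cell phenotypic feature vectors obtained after segmentation,
# feature computation and data pruning.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 30))

embedded = PCA(n_components=2).fit_transform(features)                # dimensionality reduction
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedded)  # clustering
print(np.bincount(labels))
# Representative cells per cluster could then be compared across IDH MUT and WT groups.
```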

A Dirichlet Process Mixture Model of Discrete Choice

Title A Dirichlet Process Mixture Model of Discrete Choice
Authors Rico Krueger, Akshay Vij, Taha H. Rashidi
Abstract We present a mixed multinomial logit (MNL) model, which leverages the truncated stick-breaking process representation of the Dirichlet process as a flexible nonparametric mixing distribution. The proposed model is a Dirichlet process mixture model and accommodates discrete representations of heterogeneity, like a latent class MNL model. Yet, unlike a latent class MNL model, the proposed discrete choice model does not require the analyst to fix the number of mixture components prior to estimation, as the complexity of the discrete mixing distribution is inferred from the evidence. For posterior inference in the proposed Dirichlet process mixture model of discrete choice, we derive an expectation maximisation algorithm. In a simulation study, we demonstrate that the proposed model framework can flexibly capture differently-shaped taste parameter distributions. Furthermore, we empirically validate the model framework in a case study on motorists’ route choice preferences and find that the proposed Dirichlet process mixture model of discrete choice outperforms a latent class MNL model and mixed MNL models with common parametric mixing distributions in terms of both in-sample fit and out-of-sample predictive ability. Compared to extant modelling approaches, the proposed discrete choice model substantially abbreviates specification searches, as it relies on less restrictive parametric assumptions and does not require the analyst to specify the complexity of the discrete mixing distribution prior to estimation.
Tasks
Published 2018-01-19
URL http://arxiv.org/abs/1801.06296v1
PDF http://arxiv.org/pdf/1801.06296v1.pdf
PWC https://paperswithcode.com/paper/a-dirichlet-process-mixture-model-of-discrete
Repo
Framework
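
A sketch of the truncated stick-breaking construction that the model uses as its nonparametric mixing distribution. This only draws the mixture-component weights; the class-specific MNL parameters and the EM algorithm for posterior inference are not shown.

```python
import numpy as np

def truncated_stick_breaking(alpha, K, rng=None):
    """Draw mixture weights from a truncated stick-breaking representation of a
    Dirichlet process with concentration alpha and truncation level K."""
    rng = rng or np.random.default_rng()
    betas = rng.beta(1.0, alpha, size=K)
    betas[-1] = 1.0                        # close the truncation so the weights sum to 1
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * remaining

w = truncated_stick_breaking(alpha=1.0, K=10, rng=np.random.default_rng(0))
print(w, w.sum())   # many components receive negligible mass, mimicking an inferred complexity
```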

Bayesian Incremental Learning for Deep Neural Networks

Title Bayesian Incremental Learning for Deep Neural Networks
Authors Max Kochurov, Timur Garipov, Dmitry Podoprikhin, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
Abstract In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep neural networks are prone to getting stuck in a suboptimal solution when trained on only new data as compared to the full dataset. Our work focuses on a continuous learning setup where the task is always the same and new parts of data arrive sequentially. We apply a Bayesian approach to update the posterior approximation with each new piece of data and find this method to outperform the traditional approach in our experiments.
Tasks
Published 2018-02-20
URL http://arxiv.org/abs/1802.07329v3
PDF http://arxiv.org/pdf/1802.07329v3.pdf
PWC https://paperswithcode.com/paper/bayesian-incremental-learning-for-deep-neural
Repo
Framework
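
The incremental-posterior idea in its simplest conjugate form: the posterior after the first batch becomes the prior for the next, and the result matches training on all data at once. For deep networks the paper works with approximate posteriors rather than this closed form; the Gaussian-mean example below is only a toy illustration.

```python
import numpy as np

def update_gaussian_mean(prior_mu, prior_var, data, noise_var=1.0):
    """Conjugate update for the mean of a Gaussian with known noise variance."""
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mu = post_var * (prior_mu / prior_var + np.sum(data) / noise_var)
    return post_mu, post_var

rng = np.random.default_rng(0)
batch1, batch2 = rng.normal(2.0, 1.0, 50), rng.normal(2.0, 1.0, 50)

# Incremental: the posterior after batch 1 becomes the prior for batch 2 ...
mu, var = update_gaussian_mean(0.0, 10.0, batch1)
mu, var = update_gaussian_mean(mu, var, batch2)
# ... and matches fitting on all data at once.
mu_full, var_full = update_gaussian_mean(0.0, 10.0, np.concatenate([batch1, batch2]))
print(mu, mu_full)   # identical up to floating point
```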

A Survey on Methods and Theories of Quantized Neural Networks

Title A Survey on Methods and Theories of Quantized Neural Networks
Authors Yunhui Guo
Abstract Deep neural networks are the state-of-the-art methods for many real-world tasks, such as computer vision, natural language processing and speech recognition. For all their popularity, deep neural networks are also criticized for consuming a lot of memory and draining battery life of devices during training and inference. This makes it hard to deploy these models on mobile or embedded devices which have tight resource constraints. Quantization is recognized as one of the most effective approaches to satisfy the extreme memory requirements that deep neural network models demand. Instead of adopting 32-bit floating point format to represent weights, quantized representations store weights using more compact formats such as integers or even binary numbers. Despite a possible degradation in predictive performance, quantization provides a potential solution to greatly reduce the model size and the energy consumption. In this survey, we give a thorough review of different aspects of quantized neural networks. Current challenges and trends of quantized neural networks are also discussed.
Tasks Quantization, Speech Recognition
Published 2018-08-13
URL http://arxiv.org/abs/1808.04752v2
PDF http://arxiv.org/pdf/1808.04752v2.pdf
PWC https://paperswithcode.com/paper/a-survey-on-methods-and-theories-of-quantized
Repo
Framework
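
A minimal example of the storage saving the survey discusses: symmetric linear quantization of float32 weights to int8, one of many schemes the survey covers. The tensor and scale choice are illustrative only.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization of a weight tensor to int8, with the scale needed
    to dequantize. Storage drops from 32-bit floats to 8-bit integers at the cost of
    rounding error."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=1000).astype(np.float32)
q, scale = quantize_int8(weights)
dequant = q.astype(np.float32) * scale
print(q.nbytes / weights.nbytes)           # 0.25: four-fold memory reduction
print(np.max(np.abs(weights - dequant)))   # small quantization error
```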