Paper Group ANR 650
Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models
Title | Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models |
Authors | Rainier Barrett, Shaoyi Jiang, Andrew D White |
Abstract | Bayesian network models are finding success in characterizing enzyme-catalyzed reactions, slow conformational changes, predicting enzyme inhibition, and genomics. In this work, we apply them to statistical modeling of peptides by simultaneously identifying amino acid sequence motifs and using a motif-based model to clarify the role motifs may play in antimicrobial activity. We construct models of increasing sophistication, demonstrating how chemical knowledge of a peptide system may be embedded without requiring new derivation of model fitting equations after changing model structure. These models are used to construct classifiers with good performance (94% accuracy, Matthews correlation coefficient of 0.87) at predicting antimicrobial activity in peptides, while being built from interpretable parameters. We demonstrate the use of these models to identify peptides that are potentially both antimicrobial and antifouling, and show that the background distribution of amino acids could play a greater role in activity than sequence motifs do. This advances both the type of peptide activity modeling that can be done and the ease with which models can be constructed. |
Tasks | |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06327v1 |
http://arxiv.org/pdf/1804.06327v1.pdf | |
PWC | https://paperswithcode.com/paper/classifying-antimicrobial-and-multifunctional |
Repo | |
Framework | |
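The reported figures can be reproduced from a binary confusion matrix. Below is a minimal sketch of both metrics; the counts are hypothetical, chosen only to illustrate the computation, and are not the paper's data:

```python
import numpy as np

def accuracy_and_mcc(tp, tn, fp, fn):
    """Accuracy and Matthews correlation coefficient from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = np.sqrt(float(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

# Hypothetical counts for illustration (not taken from the paper)
acc, mcc = accuracy_and_mcc(tp=47, tn=47, fp=3, fn=3)
```

Unlike plain accuracy, MCC stays near zero for a classifier that ignores a minority class, which is why abstracts often report both.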
Viewpoint: Artificial Intelligence and Labour
Title | Viewpoint: Artificial Intelligence and Labour |
Authors | Spyridon Samothrakis |
Abstract | The welfare of modern societies has been intrinsically linked to wage labour. With some exceptions, the modern human has to sell her labour-power to be able to reproduce biologically and socially. Thus, a lingering fear of technological unemployment features predominantly as a theme among Artificial Intelligence researchers. In this short paper we show that, if past trends are anything to go by, this fear is irrational. On the contrary, we argue that the main problem humanity will be facing is the normalisation of extremely long working hours. |
Tasks | |
Published | 2018-03-17 |
URL | http://arxiv.org/abs/1803.06563v1 |
http://arxiv.org/pdf/1803.06563v1.pdf | |
PWC | https://paperswithcode.com/paper/viewpoint-artificial-intelligence-and-labour |
Repo | |
Framework | |
Exploring Conversational Language Generation for Rich Content about Hotels
Title | Exploring Conversational Language Generation for Rich Content about Hotels |
Authors | Marilyn A. Walker, Albry Smither, Shereen Oraby, Vrindavan Harrison, Hadar Shemtov |
Abstract | Dialogue systems for hotel and tourist information have typically simplified the richness of the domain, focusing system utterances on only a few selected attributes such as price, location and type of rooms. However, much more content is typically available for hotels, often as many as 50 distinct instantiated attributes for an individual entity. New methods are needed to use this content to generate natural dialogues for hotel information, and in general for any domain with such rich complex content. We describe three experiments aimed at collecting data that can inform an NLG system for hotel dialogues, and show, not surprisingly, that the sentences in the original written hotel descriptions provided on webpages for each hotel are stylistically not a very good match for conversational interaction. We quantify the stylistic features that characterize the differences between the original textual data and the collected dialogic data. We plan to use these in stylistic models for generation, and for scoring retrieved utterances for use in hotel dialogues. |
Tasks | Text Generation |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00551v1 |
http://arxiv.org/pdf/1805.00551v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-conversational-language-generation |
Repo | |
Framework | |
Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks
Title | Enforcing constraints for interpolation and extrapolation in Generative Adversarial Networks |
Authors | Panos Stinis, Tobias Hagge, Alexandre M. Tartakovsky, Enoch Yeung |
Abstract | We suggest ways to enforce given constraints in the output of a Generative Adversarial Network (GAN) generator both for interpolation and extrapolation (prediction). For the case of dynamical systems, given a time series, we wish to train GAN generators that can be used to predict trajectories starting from a given initial condition. In this setting, the constraints can be in algebraic and/or differential form. Even though we are predominantly interested in the case of extrapolation, we will see that the tasks of interpolation and extrapolation are related. However, they need to be treated differently. For the case of interpolation, the incorporation of constraints is built into the training of the GAN. The incorporation of the constraints respects the primary game-theoretic setup of a GAN so it can be combined with existing algorithms. However, it can exacerbate the problem of instability during training that is well-known for GANs. We suggest adding small noise to the constraints as a simple remedy that has performed well in our numerical experiments. The case of extrapolation (prediction) is more involved. During training, the GAN generator learns to interpolate a noisy version of the data and we enforce the constraints. This approach has connections with model reduction that we can utilize to improve the efficiency and accuracy of the training. Depending on the form of the constraints, we may enforce them also during prediction through a projection step. We provide examples of linear and nonlinear systems of differential equations to illustrate the various constructions. |
Tasks | Time Series |
Published | 2018-03-22 |
URL | https://arxiv.org/abs/1803.08182v2 |
https://arxiv.org/pdf/1803.08182v2.pdf | |
PWC | https://paperswithcode.com/paper/enforcing-constraints-for-interpolation-and |
Repo | |
Framework | |
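The abstract mentions enforcing constraints at prediction time through a projection step. For linear algebraic constraints, one standard construction is the Euclidean projection onto an affine set; the particular constraint below (components summing to one, as in a conservation law) is an illustrative assumption, not an example from the paper:

```python
import numpy as np

def project_onto_affine(x, A, b):
    """Euclidean projection of x onto the affine set {z : A z = b} --
    one way to enforce linear constraints on a generator's output."""
    residual = A @ x - b
    correction = A.T @ np.linalg.solve(A @ A.T, residual)
    return x - correction

A = np.array([[1.0, 1.0, 1.0]])   # hypothetical constraint: components...
b = np.array([1.0])               # ...must sum to one (e.g. conservation)
x = np.array([0.5, 0.9, 0.2])     # unconstrained generator output
x_proj = project_onto_affine(x, A, b)
```

The projection moves the output by the minimum Euclidean distance needed to satisfy the constraint, leaving the generator's training untouched.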
Learning Optical Flow via Dilated Networks and Occlusion Reasoning
Title | Learning Optical Flow via Dilated Networks and Occlusion Reasoning |
Authors | Yi Zhu, Shawn Newsam |
Abstract | Despite the significant progress that has been made on estimating optical flow recently, most estimation methods, including classical and deep learning approaches, still have difficulty with multi-scale estimation, real-time computation, and/or occlusion reasoning. In this paper, we introduce dilated convolution and occlusion reasoning into unsupervised optical flow estimation to address these issues. The dilated convolution allows our network to avoid upsampling via deconvolution and the resulting gridding artifacts. Dilated convolution also results in a smaller memory footprint which speeds up inference. The occlusion reasoning prevents our network from learning incorrect deformations due to occluded image regions during training. Our proposed method outperforms state-of-the-art unsupervised approaches on the KITTI benchmark. We also demonstrate its generalization capability by applying it to action recognition in video. |
Tasks | Optical Flow Estimation, Temporal Action Localization |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02733v1 |
http://arxiv.org/pdf/1805.02733v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-optical-flow-via-dilated-networks |
Repo | |
Framework | |
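The receptive-field effect of dilation that the abstract relies on can be sketched in one dimension: inserting gaps between kernel taps enlarges the span each output sees without any pooling or upsampling layers. A minimal illustration (not the paper's network):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """'Valid' 1-D cross-correlation with a dilated kernel. Dilation
    spreads the kernel taps apart, growing the receptive field from
    k to (k - 1) * dilation + 1 with the same parameter count."""
    k = len(kernel)
    span = (k - 1) * dilation + 1          # effective receptive field
    out_len = len(x) - span + 1
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(out_len)
    ])

x = np.arange(10, dtype=float)
y1 = dilated_conv1d(x, [1.0, 1.0, 1.0], dilation=1)  # receptive field 3
y2 = dilated_conv1d(x, [1.0, 1.0, 1.0], dilation=2)  # receptive field 5
```

Stacking layers with increasing dilation rates grows the receptive field exponentially, which is how such networks avoid deconvolution and its gridding artifacts.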
Coopetitive Soft Gating Ensemble
Title | Coopetitive Soft Gating Ensemble |
Authors | Stephan Deist, Maarten Bieshaar, Jens Schreiber, Andre Gensler, Bernhard Sick |
Abstract | In this article, we propose the Coopetitive Soft Gating Ensemble (CSGE) for general machine learning tasks and interwoven systems. The goal of machine learning is to create models that generalize well for unknown datasets. Often, however, the problems are too complex to be solved with a single model, so several models are combined. Similarly, Autonomic Computing requires the integration of different systems. Here, especially, the local, temporal online evaluation and the resulting (re-)weighting scheme of the CSGE make the approach highly applicable to self-improving system integrations. To achieve the best potential performance, the CSGE can be optimized according to arbitrary loss functions, making it accessible for a broader range of problems. We introduce a novel training procedure with a hyper-parameter initialisation at its heart. We show that the CSGE approach reaches state-of-the-art performance for both classification and regression tasks. Furthermore, the CSGE provides a human-readable quantification of the influence of all base estimators via the three weighting aspects. Moreover, we provide a scikit-learn compatible implementation. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01020v2 |
http://arxiv.org/pdf/1807.01020v2.pdf | |
PWC | https://paperswithcode.com/paper/coopetitive-soft-gating-ensemble |
Repo | |
Framework | |
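The core idea of soft gating, combining base estimators with weights that decay with their tracked errors, can be sketched as below. This is a simplified stand-in under assumed conventions (a single exponential weighting, a hypothetical `eta` temperature), not the CSGE's actual three-aspect (global, local, time-dependent) scheme:

```python
import numpy as np

def soft_gating(preds, errors, eta=1.0):
    """Combine base-estimator predictions with soft weights that shrink
    as an estimator's tracked error grows; eta controls sharpness."""
    weights = np.exp(-eta * np.asarray(errors))
    weights /= weights.sum()                 # normalise to a convex combination
    return weights @ np.asarray(preds), weights

preds = [2.0, 3.0, 10.0]      # three base estimators' predictions
errors = [0.1, 0.2, 5.0]      # their recent errors; the third is poor
y, w = soft_gating(preds, errors)
```

Because the weights are an explicit convex combination, each estimator's influence on the final prediction is directly readable, which is the interpretability the abstract highlights.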
Comparing morphological complexity of Spanish, Otomi and Nahuatl
Title | Comparing morphological complexity of Spanish, Otomi and Nahuatl |
Authors | Ximena Gutierrez-Vasques, Victor Mijangos |
Abstract | We use two small parallel corpora to compare the morphological complexity of Spanish, Otomi and Nahuatl. These languages belong to different linguistic families, and the latter two are low-resourced. We take into account two quantitative criteria: on the one hand, the distribution of types over tokens in a corpus; on the other, perplexity and entropy as indicators of word-structure predictability. We show that a language can be complex in terms of how many different morphological word forms it can produce, yet less complex in terms of the predictability of the internal structure of its words. |
Tasks | |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04314v1 |
http://arxiv.org/pdf/1808.04314v1.pdf | |
PWC | https://paperswithcode.com/paper/comparing-morphological-complexity-of-spanish |
Repo | |
Framework | |
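Both criteria the abstract names can be computed directly from tokenised text. A minimal sketch with toy corpora (the example word lists are invented for illustration and are not the paper's data):

```python
import math
from collections import Counter

def ttr(tokens):
    """Type/token ratio: distinct word forms over total tokens."""
    return len(set(tokens)) / len(tokens)

def unigram_entropy(tokens):
    """Shannon entropy (bits) of the empirical unigram distribution."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy corpora: the second repeats no form, mimicking richer morphology
analytic  = "the dog sees the dog and the dog sees".split()
synthetic = "perro perros perrito perritos veo ves vemos ven vio".split()
```

A high type/token ratio signals many distinct word forms, while entropy and perplexity capture how predictable the next form is; the two can disagree, which is the paper's point.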
Scale-recurrent Network for Deep Image Deblurring
Title | Scale-recurrent Network for Deep Image Deblurring |
Authors | Xin Tao, Hongyun Gao, Yi Wang, Xiaoyong Shen, Jue Wang, Jiaya Jia |
Abstract | In single image deblurring, the “coarse-to-fine” scheme, i.e. gradually restoring the sharp image on different resolutions in a pyramid, is very successful in both traditional optimization-based methods and recent neural-network-based approaches. In this paper, we investigate this strategy and propose a Scale-recurrent Network (SRN-DeblurNet) for this deblurring task. Compared with the many recent learning-based approaches in [25], it has a simpler network structure, a smaller number of parameters and is easier to train. We evaluate our method on large-scale deblurring datasets with complex motion. Results show that our method can produce higher-quality results than state-of-the-art methods, both quantitatively and qualitatively. |
Tasks | Deblurring |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.01770v1 |
http://arxiv.org/pdf/1802.01770v1.pdf | |
PWC | https://paperswithcode.com/paper/scale-recurrent-network-for-deep-image |
Repo | |
Framework | |
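The coarse-to-fine control flow, restore at the coarsest scale, then upsample each estimate to initialise the next sharper level, can be sketched independently of any particular network. The `restore` callable below is a hypothetical placeholder for the per-scale restoration step, not the SRN-DeblurNet itself:

```python
import numpy as np

def pyramid(img, levels):
    """Coarse-to-fine list of images via repeated 2x average-pooling."""
    out = [img]
    for _ in range(levels - 1):
        h, w = out[-1].shape
        out.append(out[-1].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return out[::-1]  # coarsest first

def coarse_to_fine(blurry, restore, levels=3):
    """Run a per-scale restoration from coarsest to finest resolution,
    upsampling each estimate to initialise the next, sharper level."""
    est = None
    for level in pyramid(blurry, levels):
        if est is not None:
            est = np.kron(est, np.ones((2, 2)))  # nearest-neighbour upsample
            level = (level + est) / 2            # fuse previous estimate
        est = restore(level)
    return est

img = np.arange(64, dtype=float).reshape(8, 8)
est = coarse_to_fine(img, restore=lambda patch: patch, levels=3)
```

The scale-recurrent idea in the paper additionally shares the restoration weights across scales, which is what shrinks the parameter count.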
Pancreas Segmentation in CT and MRI Images via Domain Specific Network Designing and Recurrent Neural Contextual Learning
Title | Pancreas Segmentation in CT and MRI Images via Domain Specific Network Designing and Recurrent Neural Contextual Learning |
Authors | Jinzheng Cai, Le Lu, Fuyong Xing, Lin Yang |
Abstract | Automatic pancreas segmentation in radiology images, e.g., computed tomography (CT) and magnetic resonance imaging (MRI), is frequently required by computer-aided screening, diagnosis, and quantitative assessment. Yet the pancreas is a challenging abdominal organ to segment due to the high inter-patient anatomical variability in both shape and volume metrics. Recently, convolutional neural networks (CNNs) have demonstrated promising performance on accurate segmentation of the pancreas. However, CNN-based methods often suffer from segmentation discontinuity for reasons such as noisy image quality and blurry pancreatic boundaries. To this end, we propose to introduce recurrent neural networks (RNNs) to address the problem of spatial non-smoothness of inter-slice pancreas segmentation across adjacent image slices. To infer an initial segmentation, we first train a 2D CNN sub-network, where we modify its network architecture with deep supervision and multi-scale feature map aggregation so that it can be trained from scratch with small-sized training data and achieves superior performance to transferred models. Thereafter, the successive CNN outputs are processed by another RNN sub-network, which refines the consistency of segmented shapes. More specifically, the RNN sub-network consists of convolutional long short-term memory (CLSTM) units in both top-down and bottom-up directions, which regularizes the segmentation of an image by integrating predictions of its neighboring slices. We train the stacked CNN-RNN model end-to-end and perform quantitative evaluations on both CT and MRI images. |
Tasks | Computed Tomography (CT), Pancreas Segmentation |
Published | 2018-03-30 |
URL | http://arxiv.org/abs/1803.11303v1 |
http://arxiv.org/pdf/1803.11303v1.pdf | |
PWC | https://paperswithcode.com/paper/pancreas-segmentation-in-ct-and-mri-images |
Repo | |
Framework | |
GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks
Title | GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks |
Authors | Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh |
Abstract | Generative Adversarial Networks (GANs) are one of the most recent deep learning models that generate synthetic data from limited genuine datasets. GANs are on the frontier as further extension of deep learning into many domains (e.g., medicine, robotics, content synthesis) requires massive sets of labeled data that is generally either unavailable or prohibitively costly to collect. Although GANs are gaining prominence in various fields, there are no accelerators for these new models. In fact, GANs leverage a new operator, called transposed convolution, that exposes unique challenges for hardware acceleration. This operator first inserts zeros within the multidimensional input, then convolves a kernel over this expanded array to add information to the embedded zeros. Even though there is a convolution stage in this operator, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed. We propose the GANAX architecture to alleviate the sources of inefficiency associated with the acceleration of GANs using conventional convolution accelerators, making the first GAN accelerator design possible. We propose a reorganization of the output computations to allocate compute rows with similar patterns of zeros to adjacent processing engines, which also avoids inconsequential multiply-adds on the zeros. This compulsory adjacency reclaims data reuse across these neighboring processing engines, which had otherwise diminished due to the inserted zeros. The reordering breaks the full SIMD execution model, which is prominent in convolution accelerators. Therefore, we propose a unified MIMD-SIMD design for GANAX that leverages repeated patterns in the computation to create distinct microprograms that execute concurrently in SIMD mode. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1806.01107v1 |
http://arxiv.org/pdf/1806.01107v1.pdf | |
PWC | https://paperswithcode.com/paper/ganax-a-unified-mimd-simd-acceleration-for |
Repo | |
Framework | |
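The transposed-convolution behaviour the abstract describes, insert zeros within the input, then convolve a kernel over the expanded array, can be sketched in one dimension. This also makes the accelerator problem visible: with stride 2, roughly half of the multiply-adds hit inserted zeros. A minimal illustration (not GANAX itself):

```python
import numpy as np

def transposed_conv1d(x, kernel, stride=2):
    """Transposed convolution via explicit zero-insertion: place stride-1
    zeros between input elements, then run an ordinary 'full' convolution.
    The inserted zeros are what underutilise a conventional accelerator."""
    expanded = np.zeros((len(x) - 1) * stride + 1)
    expanded[::stride] = x                      # zero-insertion step
    return np.convolve(expanded, kernel, mode="full")

x = np.array([1.0, 2.0, 3.0])
k = np.array([1.0, 1.0])
y = transposed_conv1d(x, k, stride=2)
```

GANAX's reorganisation groups output positions that touch the same zero pattern so the zero multiply-adds are skipped rather than computed.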
Automated, predictive, and interpretable inference of C. elegans escape dynamics
Title | Automated, predictive, and interpretable inference of C. elegans escape dynamics |
Authors | Bryan C. Daniels, William S. Ryu, Ilya Nemenman |
Abstract | The roundworm C. elegans exhibits robust escape behavior in response to rapidly rising temperature. The behavior lasts for a few seconds, shows history dependence, involves both sensory and motor systems, and is too complicated to model mechanistically using currently available knowledge. Instead we model the process phenomenologically, and we use the Sir Isaac dynamical inference platform to infer the model in a fully automated fashion directly from experimental data. The inferred model requires incorporation of an unobserved dynamical variable, and is biologically interpretable. The model makes accurate predictions about the dynamics of the worm behavior, and it can be used to characterize the functional logic of the dynamical system underlying the escape response. This work illustrates the power of modern artificial intelligence to aid in discovery of accurate and interpretable models of complex natural systems. |
Tasks | |
Published | 2018-09-25 |
URL | http://arxiv.org/abs/1809.09321v1 |
http://arxiv.org/pdf/1809.09321v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-predictive-and-interpretable |
Repo | |
Framework | |
Analysis of Cellular Feature Differences of Astrocytomas with Distinct Mutational Profiles Using Digitized Histopathology Images
Title | Analysis of Cellular Feature Differences of Astrocytomas with Distinct Mutational Profiles Using Digitized Histopathology Images |
Authors | Mousumi Roy, Fusheng Wang, George Teodoro, Jose Velazqeuz Vega, Daniel Brat, Jun Kong |
Abstract | Cellular phenotypic features derived from histopathology images are the basis of pathologic diagnosis and are thought to be related to underlying molecular profiles. Due to overwhelming cell numbers and population heterogeneity, it remains challenging to quantitatively compute and compare features of cells with distinct molecular signatures. In this study, we propose a self-reliant and efficient analysis framework that supports quantitative analysis of cellular phenotypic difference across distinct molecular groups. To demonstrate efficacy, we quantitatively analyze astrocytomas that are molecularly characterized as either Isocitrate Dehydrogenase (IDH) mutant (MUT) or wildtype (WT) using imaging data from The Cancer Genome Atlas database. Representative cell instances that are phenotypically different between these two groups are retrieved after segmentation, feature computation, data pruning, dimensionality reduction, and unsupervised clustering. Our analysis is generic and can be applied to a wide set of cell-based biomedical research. |
Tasks | Dimensionality Reduction |
Published | 2018-06-24 |
URL | http://arxiv.org/abs/1806.09093v1 |
http://arxiv.org/pdf/1806.09093v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-cellular-feature-differences-of |
Repo | |
Framework | |
A Dirichlet Process Mixture Model of Discrete Choice
Title | A Dirichlet Process Mixture Model of Discrete Choice |
Authors | Rico Krueger, Akshay Vij, Taha H. Rashidi |
Abstract | We present a mixed multinomial logit (MNL) model, which leverages the truncated stick-breaking process representation of the Dirichlet process as a flexible nonparametric mixing distribution. The proposed model is a Dirichlet process mixture model and accommodates discrete representations of heterogeneity, like a latent class MNL model. Yet, unlike a latent class MNL model, the proposed discrete choice model does not require the analyst to fix the number of mixture components prior to estimation, as the complexity of the discrete mixing distribution is inferred from the evidence. For posterior inference in the proposed Dirichlet process mixture model of discrete choice, we derive an expectation maximisation algorithm. In a simulation study, we demonstrate that the proposed model framework can flexibly capture differently-shaped taste parameter distributions. Furthermore, we empirically validate the model framework in a case study on motorists’ route choice preferences and find that the proposed Dirichlet process mixture model of discrete choice outperforms a latent class MNL model and mixed MNL models with common parametric mixing distributions in terms of both in-sample fit and out-of-sample predictive ability. Compared to extant modelling approaches, the proposed discrete choice model substantially abbreviates specification searches, as it relies on less restrictive parametric assumptions and does not require the analyst to specify the complexity of the discrete mixing distribution prior to estimation. |
Tasks | |
Published | 2018-01-19 |
URL | http://arxiv.org/abs/1801.06296v1 |
http://arxiv.org/pdf/1801.06296v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dirichlet-process-mixture-model-of-discrete |
Repo | |
Framework | |
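The truncated stick-breaking representation the abstract builds on can be sketched directly: break off Beta(1, alpha) fractions of a unit stick, with the last component absorbing the remainder so the truncated weights still sum to one. A minimal illustration under assumed values of `alpha` and the truncation level `k`:

```python
import numpy as np

def stick_breaking_weights(alpha, k, rng):
    """Truncated stick-breaking weights for a Dirichlet process: each
    Beta(1, alpha) draw breaks off a fraction of the remaining stick;
    the k-th component keeps whatever is left."""
    betas = rng.beta(1.0, alpha, size=k - 1)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)])
    weights = remaining.copy()
    weights[:-1] *= betas          # fraction broken off at each step
    return weights                 # weights[-1] absorbs the remainder

rng = np.random.default_rng(0)
w = stick_breaking_weights(alpha=2.0, k=10, rng=rng)
```

Small `alpha` concentrates mass on the first few components, which is how the model infers the effective number of mixture components from the data rather than fixing it a priori.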
Bayesian Incremental Learning for Deep Neural Networks
Title | Bayesian Incremental Learning for Deep Neural Networks |
Authors | Max Kochurov, Timur Garipov, Dmitry Podoprikhin, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov |
Abstract | In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep neural networks are prone to getting stuck in a suboptimal solution when trained on only new data as compared to the full dataset. Our work focuses on a continuous learning setup where the task is always the same and new parts of data arrive sequentially. We apply a Bayesian approach to update the posterior approximation with each new piece of data and find this method to outperform the traditional approach in our experiments. |
Tasks | |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07329v3 |
http://arxiv.org/pdf/1802.07329v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-incremental-learning-for-deep-neural |
Repo | |
Framework | |
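The sequential posterior-update idea can be illustrated exactly in a conjugate toy model: a Gaussian prior over an unknown mean with known noise variance, where updating on two chunks in sequence matches updating on all the data at once. This is a minimal sketch of the principle, not the paper's neural-network approximation:

```python
import numpy as np

def update_gaussian(mu, tau2, data, sigma2=1.0):
    """Conjugate update of a N(mu, tau2) prior over an unknown mean,
    given observations with known noise variance sigma2. The returned
    posterior serves as the prior for the next chunk of data."""
    n = len(data)
    post_tau2 = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mu = post_tau2 * (mu / tau2 + np.sum(data) / sigma2)
    return post_mu, post_tau2

rng = np.random.default_rng(1)
data = rng.normal(3.0, 1.0, size=100)

# Two sequential updates on halves vs. one batch update on all data
mu, t2 = update_gaussian(0.0, 10.0, data[:50])
mu_seq, t2_seq = update_gaussian(mu, t2, data[50:])
mu_batch, t2_batch = update_gaussian(0.0, 10.0, data)
```

For deep networks the posterior is intractable, so the paper propagates an approximation instead, but the prior-becomes-posterior chaining is the same.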
A Survey on Methods and Theories of Quantized Neural Networks
Title | A Survey on Methods and Theories of Quantized Neural Networks |
Authors | Yunhui Guo |
Abstract | Deep neural networks are the state-of-the-art methods for many real-world tasks, such as computer vision, natural language processing and speech recognition. For all their popularity, deep neural networks are also criticized for consuming a lot of memory and draining battery life of devices during training and inference. This makes it hard to deploy these models on mobile or embedded devices which have tight resource constraints. Quantization is recognized as one of the most effective approaches to satisfy the extreme memory requirements that deep neural network models demand. Instead of adopting 32-bit floating point format to represent weights, quantized representations store weights using more compact formats such as integers or even binary numbers. Despite a possible degradation in predictive performance, quantization provides a potential solution to greatly reduce the model size and the energy consumption. In this survey, we give a thorough review of different aspects of quantized neural networks. Current challenges and trends of quantized neural networks are also discussed. |
Tasks | Quantization, Speech Recognition |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04752v2 |
http://arxiv.org/pdf/1808.04752v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-methods-and-theories-of-quantized |
Repo | |
Framework | |
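The basic trade-off the survey covers, compact integer storage versus a bounded loss of precision, is easy to see in the simplest scheme: symmetric linear quantization of float32 weights to int8 with a single scale factor. A minimal sketch (one of many schemes the survey reviews, not a recommended implementation):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization: map floats to int8 codes plus a
    scale, so that w ~= q * scale with error at most scale / 2."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=1000).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

Storage drops 4x (int8 vs. float32) while the per-weight rounding error stays below half the quantization step; binary and ternary schemes push this trade-off further.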