Paper Group ANR 397
Field Of Interest Proposal for Augmented Mitotic Cell Count: Comparison of two Convolutional Networks
Title | Field Of Interest Proposal for Augmented Mitotic Cell Count: Comparison of two Convolutional Networks |
Authors | Marc Aubreville, Christof A. Bertram, Robert Klopfleisch, Andreas Maier |
Abstract | Most tumor grading systems for human as well as veterinary histopathology are based upon the absolute count of mitotic figures in a certain reference area of a histology slide. Since time for prognostication is limited in a diagnostic setting, the pathologist will often almost arbitrarily choose a certain field of interest assumed to have the highest mitotic activity. However, as mitotic figures are commonly very sparse on the slide and often have a patchy distribution, this poses a sampling problem which is known to influence the tumor prognostication. On the other hand, automatic detection of mitotic figures can’t yet be considered reliable enough for clinical application. In order to aid the work of the human expert and at the same time reduce variance in tumor grading, it is beneficial to assess the whole slide image (WSI) for the highest mitotic activity and use this as a reference region for human counting. For this task, we compare two methods for region of interest proposal, both based on convolutional neural networks (CNN). For both approaches, the CNN performs a segmentation of the WSI to assess mitotic activity. The first method performs a segmentation at the original image resolution, while the second approach performs a segmentation operation at a significantly reduced resolution, cutting down on processing complexity. We evaluate the approach using a dataset of 32 completely annotated whole slide images of canine mast cell tumors, where 22 were used for training of the network and 10 for testing. Our results indicate that, while the overall correlation to the ground truth mitotic activity is considerably higher (0.94 vs. 0.83) for the approach based upon the fine resolution network, the field of interest choices are only marginally better. Both approaches propose fields of interest that contain a mitotic count in the upper quartile of respective slides. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09197v1 |
PDF | http://arxiv.org/pdf/1810.09197v1.pdf |
PWC | https://paperswithcode.com/paper/field-of-interest-proposal-for-augmented |
Repo | |
Framework | |
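The field-of-interest selection described in the abstract can be illustrated with a small sketch. It assumes the CNN output has already been reduced to a grid of estimated mitotic counts per tile; the grid, the function name `best_field_of_interest`, and the window size are hypothetical and chosen only for illustration, not taken from the authors' implementation. The sketch uses a summed-area table so that each candidate window is scored in constant time.

```python
def best_field_of_interest(counts, win_h, win_w):
    """Return (row, col, total) of the win_h x win_w window with the largest sum.

    `counts` is a hypothetical 2D grid of estimated mitotic counts per tile.
    """
    rows, cols = len(counts), len(counts[0])
    if win_h > rows or win_w > cols:
        raise ValueError("window larger than map")

    # Summed-area table with a zero border:
    # sat[r][c] holds the sum of counts over rows < r and columns < c.
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (
                counts[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c]
            )

    best = None
    for r in range(rows - win_h + 1):
        for c in range(cols - win_w + 1):
            total = (
                sat[r + win_h][c + win_w]
                - sat[r][c + win_w]
                - sat[r + win_h][c]
                + sat[r][c]
            )
            # Keep the first window that reaches the highest total.
            if best is None or total > best[2]:
                best = (r, c, total)
    return best


# Toy activity map (made-up values, not data from the paper).
activity = [
    [0, 1, 0, 0],
    [0, 2, 3, 0],
    [0, 1, 4, 0],
    [0, 0, 0, 1],
]
proposal = best_field_of_interest(activity, 2, 2)
print(proposal)
```

On a real whole slide image the window would correspond to the reference area used for manual counting, and the proposed region would be handed to the pathologist rather than used as a final count.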
Deep Neural Net with Attention for Multi-channel Multi-touch Attribution
Title | Deep Neural Net with Attention for Multi-channel Multi-touch Attribution |
Authors | Ning Li, Sai Kumar Arava, Chen Dong, Zhenyu Yan, Abhishek Pani |
Abstract | Customers are usually exposed to online digital advertisement channels, such as email marketing, display advertising, and paid search engine marketing, along their way to purchasing or subscribing to products (a.k.a. conversion). Marketers track all the customer journey data and try to measure the effectiveness of each advertising channel. The inference about the influence of each channel plays an important role in budget allocation and inventory pricing decisions. Several simplistic rule-based strategies and data-driven algorithmic strategies have been widely used in the marketing field, but they do not address issues such as channel interaction, time dependency, and user characteristics. In this paper, we propose a novel attribution algorithm based on deep learning to assess the impact of each advertising channel. We present the Deep Neural Net With Attention multi-touch attribution (DNAMTA) model, trained in a supervised fashion to predict whether a series of events leads to conversion, which gives us a deep understanding of the dynamic interaction effects between media channels. DNAMTA also incorporates user-context information, such as user demographics and behavior, as control variables to reduce the estimation biases of media effects. We use computational experiments on a large real-world marketing dataset to demonstrate that our proposed model is superior to existing methods in both conversion prediction and media channel influence evaluation. |
Tasks | |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02230v1 |
PDF | http://arxiv.org/pdf/1809.02230v1.pdf |
PWC | https://paperswithcode.com/paper/deep-neural-net-with-attention-for-multi |
Repo | |
Framework | |
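One common way to read channel credit out of an attention model is to normalize per-touchpoint scores with a softmax and sum the resulting weights per channel. The sketch below shows only that read-out step. The scores are made-up numbers standing in for what a trained network would produce, and the function names are hypothetical; the abstract does not specify DNAMTA's architecture in this detail, so this should not be taken as the authors' method.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attribution(journey, scores):
    """Sum attention weights per channel for one customer journey.

    `journey` is a list of channel names in touch order; `scores` are the
    (assumed) unnormalized attention scores for the same touchpoints.
    """
    weights = softmax(scores)
    credit = defaultdict(float)
    for channel, w in zip(journey, weights):
        credit[channel] += w
    return dict(credit)


# Hypothetical journey and scores, for illustration only.
journey = ["email", "display", "search", "email"]
scores = [0.2, 0.1, 1.5, 0.7]
credit = channel_attribution(journey, scores)
print(credit)
```

Because the weights sum to one within a journey, the per-channel credits can be averaged over converting journeys to obtain an aggregate share per channel.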
Trivial Transfer Learning for Low-Resource Neural Machine Translation
Title | Trivial Transfer Learning for Low-Resource Neural Machine Translation |
Authors | Tom Kocmi, Ondřej Bojar |
Abstract | Transfer learning has been proven as an effective technique for neural machine translation under low-resource conditions. Existing methods require a common target language, language relatedness, or specific training tricks and regimes. We present a simple transfer learning method, where we first train a “parent” model for a high-resource language pair and then continue the training on a low-resource pair only by replacing the training corpus. This “child” model performs significantly better than the baseline trained on the low-resource pair only. We are the first to show this for targeting different languages, and we observe the improvements even for unrelated languages with different alphabets. |
Tasks | Low-Resource Neural Machine Translation, Machine Translation, Transfer Learning |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00357v1 |
PDF | http://arxiv.org/pdf/1809.00357v1.pdf |
PWC | https://paperswithcode.com/paper/trivial-transfer-learning-for-low-resource |
Repo | |
Framework | |
WNGrad: Learn the Learning Rate in Gradient Descent
Title | WNGrad: Learn the Learning Rate in Gradient Descent |
Authors | Xiaoxia Wu, Rachel Ward, Léon Bottou |
Abstract | Adjusting the learning rate schedule in stochastic gradient methods is an important unresolved problem which requires tuning in practice. If certain parameters of the loss function such as smoothness or strong convexity constants are known, theoretical learning rate schedules can be applied. However, in practice, such parameters are not known, and the loss function of interest is not convex in any case. The recently proposed batch normalization reparametrization is widely adopted in most neural network architectures today because, among other advantages, it is robust to the choice of Lipschitz constant of the gradient of the loss function, allowing one to set a large learning rate without worry. Inspired by batch normalization, we propose a general nonlinear update rule for the learning rate in batch and stochastic gradient descent so that the learning rate can be initialized at a high value, and is subsequently decreased according to gradient observations along the way. The proposed method is shown to achieve robustness to the relationship between the learning rate and the Lipschitz constant, and near-optimal convergence rates in both the batch and stochastic settings ($O(1/T)$ for smooth loss in the batch setting, and $O(1/\sqrt{T})$ for convex loss in the stochastic setting). We also show through numerical evidence that such robustness of the proposed method extends to highly nonconvex and possibly non-smooth loss functions in deep learning problems. Our analysis establishes some first theoretical understanding into the observed robustness for batch normalization and weight normalization. |
Tasks | |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02865v1 |
PDF | http://arxiv.org/pdf/1803.02865v1.pdf |
PWC | https://paperswithcode.com/paper/wngrad-learn-the-learning-rate-in-gradient |
Repo | |
Framework | |
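The abstract does not spell out the update rule. As a minimal sketch, assume the batch rule takes a step x ← x − ∇f(x)/b and then grows the inverse learning rate with b ← b + ‖∇f(x)‖²/b evaluated at the new point; this is our reading of the method and should be checked against the paper. The scalar example below compares that rule with plain gradient descent on f(x) = (L/2)x², starting from a step size 1/b that is too large for plain gradient descent to converge. The function names and constants are chosen for illustration.

```python
def wngrad(grad, x0, b0, steps):
    """Scalar sketch of a WNGrad-style update (assumed form, see lead-in).

    The learning rate is 1/b; b only grows, driven by observed gradients.
    """
    x, b = x0, b0
    for _ in range(steps):
        x = x - grad(x) / b          # gradient step with learning rate 1/b
        b = b + grad(x) ** 2 / b     # increase b using the new gradient
    return x, b


def plain_gd(grad, x0, lr, steps):
    """Fixed-step gradient descent for comparison."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x


L = 10.0                             # curvature of f(x) = (L / 2) * x**2


def grad_f(x):
    return L * x


# Initial learning rate 1/b0 = 0.25 exceeds 2/L = 0.2, so plain GD diverges.
x_wn, b_final = wngrad(grad_f, x0=1.0, b0=4.0, steps=200)
x_gd = plain_gd(grad_f, x0=1.0, lr=1.0 / 4.0, steps=200)
print(x_wn, b_final, x_gd)
```

In this toy run the first step overshoots, b jumps above L in response, and the iterates then contract toward the minimizer, while the fixed-step baseline grows without bound. This mirrors the robustness claim in the abstract but is not a substitute for the paper's analysis.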
Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks
Title | Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks |
Authors | Martin Schmitt, Simon Steinheber, Konrad Schreiber, Benjamin Roth |
Abstract | In this work, we propose a new model for aspect-based sentiment analysis. In contrast to previous approaches, we jointly model the detection of aspects and the classification of their polarity in an end-to-end trainable neural network. We conduct experiments with different neural architectures and word representations on the recent GermEval 2017 dataset. We were able to show considerable performance gains by using the joint modeling approach in all settings compared to pipeline approaches. The combination of a convolutional neural network and fasttext embeddings outperformed the best submission of the shared task in 2017, establishing a new state of the art. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09238v1 |
PDF | http://arxiv.org/pdf/1808.09238v1.pdf |
PWC | https://paperswithcode.com/paper/joint-aspect-and-polarity-classification-for |
Repo | |
Framework | |
Proximal boosting and its acceleration
Title | Proximal boosting and its acceleration |
Authors | Erwan Fouillen, Claire Boyer, Maxime Sangnier |
Abstract | Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model. From an optimization point of view, the learning procedure of gradient boosting mimics a gradient descent on a functional variable. This paper proposes to build upon the proximal point algorithm when the empirical risk to minimize is not differentiable to introduce a novel boosting approach, called proximal boosting. Besides being motivated by non-differentiable optimization, the proposed algorithm benefits from Nesterov’s acceleration in the same way as gradient boosting [Biau et al., 2018]. This leads to a variant, called accelerated proximal boosting. Advantages of leveraging proximal methods for boosting are illustrated by numerical experiments on simulated and real-world data. In particular, we exhibit a favorable comparison over gradient boosting regarding convergence rate and prediction accuracy. |
Tasks | |
Published | 2018-08-29 |
URL | https://arxiv.org/abs/1808.09670v2 |
PDF | https://arxiv.org/pdf/1808.09670v2.pdf |
PWC | https://paperswithcode.com/paper/accelerated-proximal-boosting |
Repo | |
Framework | |
Theory of Parameter Control for Discrete Black-Box Optimization: Provable Performance Gains Through Dynamic Parameter Choices
Title | Theory of Parameter Control for Discrete Black-Box Optimization: Provable Performance Gains Through Dynamic Parameter Choices |
Authors | Benjamin Doerr, Carola Doerr |
Abstract | Parameter control aims at realizing performance gains through a dynamic choice of the parameters which determine the behavior of the underlying optimization algorithm. In the context of evolutionary algorithms this research line has for a long time been dominated by empirical approaches. With the significant advances in running time analysis achieved in the last ten years, the parameter control question has become accessible to theoretical investigations. A number of running time results for a broad range of different parameter control mechanisms have been obtained in recent years. This book chapter surveys these works, and puts them into context, by proposing an updated classification scheme for parameter control. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05650v2 |
PDF | http://arxiv.org/pdf/1804.05650v2.pdf |
PWC | https://paperswithcode.com/paper/theory-of-parameter-control-for-discrete |
Repo | |
Framework | |
Structure-Based Networks for Drug Validation
Title | Structure-Based Networks for Drug Validation |
Authors | Cătălina Cangea, Arturas Grauslys, Pietro Liò, Francesco Falciani |
Abstract | Classifying chemicals according to putative modes of action (MOAs) is of paramount importance in the context of risk assessment. However, current methods are only able to handle a very small proportion of the existing chemicals. We address this issue by proposing an integrative deep learning architecture that learns a joint representation from molecular structures of drugs and their effects on human cells. Our choice of architecture is motivated by the significant influence of a drug’s chemical structure on its MOA. We improve on the strong ability of a unimodal architecture (F1 score of 0.803) to classify drugs by their toxic MOAs (Verhaar scheme) through adding another learning stream that processes transcriptional responses of human cells affected by drugs. Our integrative model achieves an even higher classification performance on the LINCS L1000 dataset - the error is reduced by 4.6%. We believe that our method can be used to extend the current Verhaar scheme and constitute a basis for fast drug validation and risk assessment. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.09714v1 |
PDF | http://arxiv.org/pdf/1811.09714v1.pdf |
PWC | https://paperswithcode.com/paper/structure-based-networks-for-drug-validation |
Repo | |
Framework | |
Bayesian Inference of Regular Expressions from Human-Generated Example Strings
Title | Bayesian Inference of Regular Expressions from Human-Generated Example Strings |
Authors | Long Ouyang |
Abstract | In programming by example, users “write” programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs. We consider a challenging problem in this domain: learning regular expressions (regexes) from positive and negative example strings. This problem is challenging, as (1) user-generated examples may not be informative enough to sufficiently constrain the hypothesis space, and (2) even if user-generated examples are in principle informative, there is still a massive search space to examine. We frame regex induction as the problem of inferring a probabilistic regular grammar and propose an efficient inference approach that uses a novel stochastic process recognition model. This model incrementally “grows” a grammar using positive examples as a scaffold. We show that this approach is competitive with human ability to learn regexes from examples. |
Tasks | Bayesian Inference |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08427v2 |
PDF | http://arxiv.org/pdf/1805.08427v2.pdf |
PWC | https://paperswithcode.com/paper/bayesian-inference-of-regular-expressions |
Repo | |
Framework | |
Speaker-Follower Models for Vision-and-Language Navigation
Title | Speaker-Follower Models for Vision-and-Language Navigation |
Authors | Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, Trevor Darrell |
Abstract | Navigation guided by natural language instructions presents a challenging reasoning problem for instruction followers. Natural language instructions typically identify only a few high-level decisions and landmarks rather than complete low-level motor behaviors; much of the missing information must be inferred based on perceptual context. In machine learning settings, this is doubly challenging: it is difficult to collect enough annotated data to enable learning of this reasoning process from scratch, and also difficult to implement the reasoning process using generic sequence models. Here we describe an approach to vision-and-language navigation that addresses both these issues with an embedded speaker model. We use this speaker model to (1) synthesize new instructions for data augmentation and to (2) implement pragmatic reasoning, which evaluates how well candidate action sequences explain an instruction. Both steps are supported by a panoramic action space that reflects the granularity of human-generated instructions. Experiments show that all three components of this approach—speaker-driven data augmentation, pragmatic reasoning and panoramic action space—dramatically improve the performance of a baseline instruction follower, more than doubling the success rate over the best existing approach on a standard benchmark. |
Tasks | Data Augmentation |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02724v2 |
PDF | http://arxiv.org/pdf/1806.02724v2.pdf |
PWC | https://paperswithcode.com/paper/speaker-follower-models-for-vision-and |
Repo | |
Framework | |
Robust Dual View Deep Agent
Title | Robust Dual View Deep Agent |
Authors | Ibrahim M. Sobh, Nevin M. Darwish |
Abstract | Motivated by recent advances in machine learning using deep reinforcement learning, this paper proposes a modified architecture that produces more robust agents and speeds up the training process. Our architecture is based on the Asynchronous Advantage Actor-Critic (A3C) algorithm, where the total input dimensionality is halved by dividing the input into two independent streams. We use ViZDoom, 3D world software based on the classic first-person shooter video game Doom, as a test case. The experiments show that, in comparison to single-input agents, the proposed architecture achieves the same playing performance and shows more robust behavior, with a significant reduction of almost 30% in the number of training parameters. |
Tasks | |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.05120v2 |
PDF | http://arxiv.org/pdf/1804.05120v2.pdf |
PWC | https://paperswithcode.com/paper/robust-dual-view-deep-agent |
Repo | |
Framework | |
Blockchain and Artificial Intelligence
Title | Blockchain and Artificial Intelligence |
Authors | Tshilidzi Marwala, Bo Xing |
Abstract | It is undeniable that artificial intelligence (AI) and blockchain concepts are spreading at a phenomenal rate. Both technologies have distinct degrees of technological complexity and multi-dimensional business implications. However, a common misunderstanding about the blockchain concept, in particular, is that blockchain is decentralized and is not controlled by anyone. But the underlying development of a blockchain system is still attributed to a cluster of core developers. Take a smart contract as an example: it is essentially a collection of code (or functions) and data (or states) that is programmed and deployed on a blockchain (say, Ethereum) by different human programmers. It is thus, unfortunately, less likely to be free of loopholes and flaws. In this article, through a brief overview of how artificial intelligence could be used to deliver bug-free smart contracts so as to achieve the goal of blockchain 2.0, we emphasize that blockchain implementation can be assisted or enhanced via various AI techniques. The alliance of AI and blockchain is expected to create numerous possibilities. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04451v2 |
PDF | http://arxiv.org/pdf/1802.04451v2.pdf |
PWC | https://paperswithcode.com/paper/blockchain-and-artificial-intelligence |
Repo | |
Framework | |
Blockchain Enabled Trustless API Marketplace
Title | Blockchain Enabled Trustless API Marketplace |
Authors | Vijay Arya, Sayandeep Sen, Palani Kodeswaran |
Abstract | There has been an unprecedented surge in the number of service providers offering a wide range of machine learning prediction APIs for tasks such as image classification and language translation, thereby monetizing the underlying data and trained models. Typically, a data owner (API provider) develops a model, often over proprietary data, and leverages the infrastructure services of a cloud vendor for hosting and serving API requests. Clearly, this model assumes complete trust between the API provider and the cloud vendor. On the other hand, a malicious or buggy cloud vendor may copy the APIs and offer an identical service, under-report model usage metrics, or unfairly discriminate between different API providers by offering them a nominal share of the revenue. In this work, we present the design of a blockchain-based decentralized trustless API marketplace that enables all the stakeholders in the API ecosystem to audit the behavior of the parties without having to trust a single centralized entity. In particular, our system divides an AI model into multiple pieces and deploys them among multiple cloud vendors who then collaboratively execute the APIs. Our design ensures that cloud vendors cannot collude with each other to steal the combined model, while individual cloud vendors and clients cannot repudiate their input or model executions. |
Tasks | Image Classification |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.02154v1 |
PDF | http://arxiv.org/pdf/1812.02154v1.pdf |
PWC | https://paperswithcode.com/paper/blockchain-enabled-trustless-api-marketplace |
Repo | |
Framework | |
Parameter Learning and Change Detection Using a Particle Filter With Accelerated Adaptation
Title | Parameter Learning and Change Detection Using a Particle Filter With Accelerated Adaptation |
Authors | Karol Gellert, Erik Schlögl |
Abstract | This paper presents the construction of a particle filter, which incorporates elements inspired by genetic algorithms, in order to achieve accelerated adaptation of the estimated posterior distribution to changes in model parameters. Specifically, the filter is designed for the situation where the subsequent data in online sequential filtering does not match the model posterior filtered based on data up to a current point in time. The examples considered encompass parameter regime shifts and stochastic volatility. The filter adapts to regime shifts extremely rapidly and delivers a clear heuristic for distinguishing between regime shifts and stochastic volatility, even though the model dynamics assumed by the filter exhibit neither of those features. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05387v1 |
PDF | http://arxiv.org/pdf/1806.05387v1.pdf |
PWC | https://paperswithcode.com/paper/parameter-learning-and-change-detection-using |
Repo | |
Framework | |
Code Review Comments: Language Matters
Title | Code Review Comments: Language Matters |
Authors | Vasiliki Efstathiou, Diomidis Spinellis |
Abstract | Recent research provides evidence that effective communication in collaborative software development has a significant impact on the software development lifecycle. Although related qualitative and quantitative studies point out textual characteristics of well-formed messages, the underlying semantics of the intertwined linguistic structures still remain largely misinterpreted or ignored. In particular, regarding the quality of code reviews, the importance of thorough feedback and explicit rationale is often mentioned but rarely linked with related linguistic features. As a first step towards addressing this shortcoming, we propose grounding these studies on theories of linguistics. We particularly focus on linguistic structures of coherent speech and explain how they can be exploited in practice. We reflect on related approaches and examine, through a preliminary study of four open source projects, possible links between existing findings and the directions we suggest for detecting textual features of useful code reviews. |
Tasks | |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02205v1 |
PDF | http://arxiv.org/pdf/1803.02205v1.pdf |
PWC | https://paperswithcode.com/paper/code-review-comments-language-matters |
Repo | |
Framework | |