October 19, 2019

Paper Group ANR 397

Field Of Interest Proposal for Augmented Mitotic Cell Count: Comparison of two Convolutional Networks


Title	Field Of Interest Proposal for Augmented Mitotic Cell Count: Comparison of two Convolutional Networks
Authors	Marc Aubreville, Christof A. Bertram, Robert Klopfleisch, Andreas Maier
Abstract	Most tumor grading systems for human as for veterinary histopathology are based upon the absolute count of mitotic figures in a certain reference area of a histology slide. Since time for prognostication is limited in a diagnostic setting, the pathologist will often almost arbitrarily choose a certain field of interest assumed to have the highest mitotic activity. However, as mitotic figures are commonly very sparse on the slide and often have a patchy distribution, this poses a sampling problem which is known to be able to influence the tumor prognostication. On the other hand, automatic detection of mitotic figures can’t yet be considered reliable enough for clinical application. In order to aid the work of the human expert and at the same time reduce variance in tumor grading, it is beneficial to assess the whole slide image (WSI) for the highest mitotic activity and use this as a reference region for human counting. For this task, we compare two methods for region of interest proposal, both based on convolutional neural networks (CNN). For both approaches, the CNN performs a segmentation of the WSI to assess mitotic activity. The first method performs a segmentation at the original image resolution, while the second approach performs a segmentation operation at a significantly reduced resolution, cutting down on processing complexity. We evaluate the approach using a dataset of 32 completely annotated whole slide images of canine mast cell tumors, where 22 were used for training of the network and 10 for test. Our results indicate that, while the overall correlation to the ground truth mitotic activity is considerably higher (0.94 vs. 0.83) for the approach based upon the fine resolution network, the field of interest choices are only marginally better. Both approaches propose fields of interest that contain a mitotic count in the upper quartile of respective slides.
Tasks
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09197v1
PDF	http://arxiv.org/pdf/1810.09197v1.pdf
PWC	https://paperswithcode.com/paper/field-of-interest-proposal-for-augmented
Repo
Framework
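
The abstract above describes picking the field of interest with the highest mitotic activity from a whole-slide activity map. A minimal sketch of that selection step follows, assuming the CNN output has already been reduced to a 2D array of per-tile activity scores. The function name `best_field`, the window size, and the toy map are hypothetical and not taken from the paper.

```python
import numpy as np

def best_field(activity, h, w):
    """Return (row, col, total) of the h-by-w window with the largest summed activity."""
    a = np.asarray(activity, dtype=float)
    # Summed-area table with a zero border, so every window sum costs four lookups.
    s = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    s[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    # sums[r, c] is the total activity of the window whose top-left tile is (r, c).
    sums = s[h:, w:] - s[:-h, w:] - s[h:, :-w] + s[:-h, :-w]
    r, c = np.unravel_index(np.argmax(sums), sums.shape)
    return int(r), int(c), float(sums[r, c])

# Toy activity map: sparse, patchy "mitotic" scores on a 6x8 grid of tiles.
toy = np.zeros((6, 8))
toy[0, 0] = 1.0
toy[2, 3] = 1.0
toy[3, 4] = 2.0
row, col, total = best_field(toy, h=2, w=2)
```

The summed-area table keeps the search linear in the number of tiles, which matters when the map covers an entire slide.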

Deep Neural Net with Attention for Multi-channel Multi-touch Attribution


Title	Deep Neural Net with Attention for Multi-channel Multi-touch Attribution
Authors	Ning Li, Sai Kumar Arava, Chen Dong, Zhenyu Yan, Abhishek Pani
Abstract	Customers are usually exposed to online digital advertisement channels, such as email marketing, display advertising, and paid search engine marketing, along their way to purchasing or subscribing to products (a.k.a. conversion). Marketers track all the customer journey data and try to measure the effectiveness of each advertising channel. The inference about the influence of each channel plays an important role in budget allocation and inventory pricing decisions. Several simplistic rule-based strategies and data-driven algorithmic strategies have been widely used in the marketing field, but they do not address issues such as channel interaction, time dependency, and user characteristics. In this paper, we propose a novel attribution algorithm based on deep learning to assess the impact of each advertising channel. We present the Deep Neural Net With Attention multi-touch attribution model (DNAMTA), trained in a supervised fashion to predict whether a series of events leads to conversion, which gives us a deep understanding of the dynamic interaction effects between media channels. DNAMTA also incorporates user-context information, such as user demographics and behavior, as control variables to reduce the estimation biases of media effects. We used computational experiments on a large real-world marketing dataset to demonstrate that our proposed model is superior to existing methods in both conversion prediction and media channel influence evaluation.
Tasks
Published	2018-09-06
URL	http://arxiv.org/abs/1809.02230v1
PDF	http://arxiv.org/pdf/1809.02230v1.pdf
PWC	https://paperswithcode.com/paper/deep-neural-net-with-attention-for-multi
Repo
Framework

Trivial Transfer Learning for Low-Resource Neural Machine Translation


Title	Trivial Transfer Learning for Low-Resource Neural Machine Translation
Authors	Tom Kocmi, Ondřej Bojar
Abstract	Transfer learning has been proven as an effective technique for neural machine translation under low-resource conditions. Existing methods require a common target language, language relatedness, or specific training tricks and regimes. We present a simple transfer learning method, where we first train a “parent” model for a high-resource language pair and then continue the training on a low-resource pair only by replacing the training corpus. This “child” model performs significantly better than the baseline trained for the low-resource pair only. We are the first to show this for targeting different languages, and we observe the improvements even for unrelated languages with different alphabets.
Tasks	Low-Resource Neural Machine Translation, Machine Translation, Transfer Learning
Published	2018-09-02
URL	http://arxiv.org/abs/1809.00357v1
PDF	http://arxiv.org/pdf/1809.00357v1.pdf
PWC	https://paperswithcode.com/paper/trivial-transfer-learning-for-low-resource
Repo
Framework
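
The parent-then-child recipe in the abstract above can be illustrated with a toy stand-in. The sketch below replaces the NMT model with a one-variable linear regressor trained by gradient descent, so it shows only the training schedule (keep the parameters, swap the corpus) and says nothing about translation quality. All data, names, and hyperparameters are assumptions made for illustration.

```python
import numpy as np

def mse(params, x, y):
    """Mean squared error of the linear model y ~ w * x + b."""
    w, b = params
    return float(np.mean((w * x + b - y) ** 2))

def train(params, x, y, steps, lr=0.1):
    """Full-batch gradient descent on MSE, starting from the given parameters."""
    w, b = params
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * 2.0 * float(np.mean(err * x))
        b -= lr * 2.0 * float(np.mean(err))
    return w, b

# "High-resource" parent task: many examples, long training.
x_parent = np.linspace(-1.0, 1.0, 200)
y_parent = 2.0 * x_parent + 1.0
# "Low-resource" child task: few examples, related but different target.
x_child = np.linspace(-1.0, 1.0, 10)
y_child = 2.2 * x_child + 0.9

parent = train((0.0, 0.0), x_parent, y_parent, steps=500)
# Child: same parameters, training simply continues on the new corpus.
child = train(parent, x_child, y_child, steps=5)
# Baseline: the same small budget, but starting from scratch.
baseline = train((0.0, 0.0), x_child, y_child, steps=5)

child_loss = mse(child, x_child, y_child)
baseline_loss = mse(baseline, x_child, y_child)
```

In this toy setting the child ends with a lower loss than the baseline because it starts close to the child optimum. Whether that carries over to a real translation system is the empirical question the paper addresses.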

WNGrad: Learn the Learning Rate in Gradient Descent


Title	WNGrad: Learn the Learning Rate in Gradient Descent
Authors	Xiaoxia Wu, Rachel Ward, Léon Bottou
Abstract	Adjusting the learning rate schedule in stochastic gradient methods is an important unresolved problem which requires tuning in practice. If certain parameters of the loss function such as smoothness or strong convexity constants are known, theoretical learning rate schedules can be applied. However, in practice, such parameters are not known, and the loss function of interest is not convex in any case. The recently proposed batch normalization reparametrization is widely adopted in most neural network architectures today because, among other advantages, it is robust to the choice of Lipschitz constant of the gradient of the loss function, allowing one to set a large learning rate without worry. Inspired by batch normalization, we propose a general nonlinear update rule for the learning rate in batch and stochastic gradient descent so that the learning rate can be initialized at a high value, and is subsequently decreased according to gradient observations along the way. The proposed method is shown to achieve robustness to the relationship between the learning rate and the Lipschitz constant, and near-optimal convergence rates in both the batch and stochastic settings ($O(1/T)$ for smooth loss in the batch setting, and $O(1/\sqrt{T})$ for convex loss in the stochastic setting). We also show through numerical evidence that such robustness of the proposed method extends to highly nonconvex and possibly non-smooth loss functions in deep learning problems. Our analysis establishes some first theoretical understanding of the observed robustness of batch normalization and weight normalization.
Tasks
Published	2018-03-07
URL	http://arxiv.org/abs/1803.02865v1
PDF	http://arxiv.org/pdf/1803.02865v1.pdf
PWC	https://paperswithcode.com/paper/wngrad-learn-the-learning-rate-in-gradient
Repo
Framework
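
The abstract above describes a learning rate that starts high and is then decreased according to observed gradients. A minimal sketch of one update rule with that behaviour is given below. The specific form used here (step size $1/b$, with $b$ increased by $\|g\|^2 / b$ at each step) and the ordering of the two updates are assumptions for illustration and may differ from the rule analysed in the paper. The quadratic test problem is also hypothetical.

```python
import numpy as np

def wngrad_like(grad, x0, b0=1.0, steps=300):
    """Gradient descent with step size 1/b, where b grows with observed gradients."""
    x = np.asarray(x0, dtype=float)
    b = float(b0)
    b_history = [b]
    for _ in range(steps):
        g = grad(x)
        # b never decreases, so the effective learning rate 1/b never increases.
        b = b + float(g @ g) / b
        x = x - g / b
        b_history.append(b)
    return x, b_history

# Toy smooth convex problem: f(x) = 0.5 * sum(A * x^2), gradient A * x.
A = np.array([1.0, 10.0])
loss = lambda x: 0.5 * float(np.sum(A * x * x))
grad = lambda x: A * x

x_start = np.array([1.0, 1.0])
x_final, b_history = wngrad_like(grad, x_start)
```

Starting with a small `b0` corresponds to a large initial learning rate. The first large gradient pushes `b` up sharply, which is the self-correcting behaviour the abstract refers to.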

Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks


Title	Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks
Authors	Martin Schmitt, Simon Steinheber, Konrad Schreiber, Benjamin Roth
Abstract	In this work, we propose a new model for aspect-based sentiment analysis. In contrast to previous approaches, we jointly model the detection of aspects and the classification of their polarity in an end-to-end trainable neural network. We conduct experiments with different neural architectures and word representations on the recent GermEval 2017 dataset. We were able to show considerable performance gains by using the joint modeling approach in all settings compared to pipeline approaches. The combination of a convolutional neural network and fasttext embeddings outperformed the best submission of the shared task in 2017, establishing a new state of the art.
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2018-08-28
URL	http://arxiv.org/abs/1808.09238v1
PDF	http://arxiv.org/pdf/1808.09238v1.pdf
PWC	https://paperswithcode.com/paper/joint-aspect-and-polarity-classification-for
Repo
Framework

Proximal boosting and its acceleration


Title	Proximal boosting and its acceleration
Authors	Erwan Fouillen, Claire Boyer, Maxime Sangnier
Abstract	Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model. From an optimization point of view, the learning procedure of gradient boosting mimics a gradient descent on a functional variable. This paper proposes to build upon the proximal point algorithm when the empirical risk to minimize is not differentiable to introduce a novel boosting approach, called proximal boosting. Besides being motivated by non-differentiable optimization, the proposed algorithm benefits from Nesterov’s acceleration in the same way as gradient boosting [Biau et al., 2018]. This leads to a variant, called accelerated proximal boosting. Advantages of leveraging proximal methods for boosting are illustrated by numerical experiments on simulated and real-world data. In particular, we exhibit a favorable comparison over gradient boosting regarding convergence rate and prediction accuracy.
Tasks
Published	2018-08-29
URL	https://arxiv.org/abs/1808.09670v2
PDF	https://arxiv.org/pdf/1808.09670v2.pdf
PWC	https://paperswithcode.com/paper/accelerated-proximal-boosting
Repo
Framework
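
The abstract above builds on the proximal point algorithm for non-differentiable objectives. The sketch below shows only that building block on a scalar problem, minimising $f(x) = |x - t|$ by repeatedly applying the proximal operator. It is not proximal boosting itself, which works on a functional variable with weak learners. The function names, target `t`, and step parameter `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * |x|: shrink v toward zero by lam."""
    return np.sign(v) * max(abs(v) - lam, 0.0)

def prox_abs_shifted(v, t, lam):
    """Proximal operator of lam * |x - t|, obtained by shifting soft-thresholding."""
    return t + soft_threshold(v - t, lam)

def proximal_point(x0, t, lam=1.0, steps=5):
    """Proximal point iterations x <- prox_{lam f}(x) for f(x) = |x - t|."""
    x = float(x0)
    trajectory = [x]
    for _ in range(steps):
        x = float(prox_abs_shifted(x, t, lam))
        trajectory.append(x)
    return x, trajectory

x_min, trajectory = proximal_point(x0=0.0, t=3.0)
```

No gradient of $f$ is evaluated at any point, which is why this family of methods remains usable at the kink of the absolute value where plain gradient steps are undefined.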

Theory of Parameter Control for Discrete Black-Box Optimization: Provable Performance Gains Through Dynamic Parameter Choices


Title	Theory of Parameter Control for Discrete Black-Box Optimization: Provable Performance Gains Through Dynamic Parameter Choices
Authors	Benjamin Doerr, Carola Doerr
Abstract	Parameter control aims at realizing performance gains through a dynamic choice of the parameters which determine the behavior of the underlying optimization algorithm. In the context of evolutionary algorithms this research line has for a long time been dominated by empirical approaches. With the significant advances in running time analysis achieved in the last ten years, the parameter control question has become accessible to theoretical investigations. A number of running time results for a broad range of different parameter control mechanisms have been obtained in recent years. This book chapter surveys these works, and puts them into context, by proposing an updated classification scheme for parameter control.
Tasks
Published	2018-04-16
URL	http://arxiv.org/abs/1804.05650v2
PDF	http://arxiv.org/pdf/1804.05650v2.pdf
PWC	https://paperswithcode.com/paper/theory-of-parameter-control-for-discrete
Repo
Framework

Structure-Based Networks for Drug Validation


Title	Structure-Based Networks for Drug Validation
Authors	Cătălina Cangea, Arturas Grauslys, Pietro Liò, Francesco Falciani
Abstract	Classifying chemicals according to putative modes of action (MOAs) is of paramount importance in the context of risk assessment. However, current methods are only able to handle a very small proportion of the existing chemicals. We address this issue by proposing an integrative deep learning architecture that learns a joint representation from molecular structures of drugs and their effects on human cells. Our choice of architecture is motivated by the significant influence of a drug’s chemical structure on its MOA. We improve on the strong ability of a unimodal architecture (F1 score of 0.803) to classify drugs by their toxic MOAs (Verhaar scheme) through adding another learning stream that processes transcriptional responses of human cells affected by drugs. Our integrative model achieves an even higher classification performance on the LINCS L1000 dataset - the error is reduced by 4.6%. We believe that our method can be used to extend the current Verhaar scheme and constitute a basis for fast drug validation and risk assessment.
Tasks
Published	2018-11-21
URL	http://arxiv.org/abs/1811.09714v1
PDF	http://arxiv.org/pdf/1811.09714v1.pdf
PWC	https://paperswithcode.com/paper/structure-based-networks-for-drug-validation
Repo
Framework

Bayesian Inference of Regular Expressions from Human-Generated Example Strings


Title	Bayesian Inference of Regular Expressions from Human-Generated Example Strings
Authors	Long Ouyang
Abstract	In programming by example, users “write” programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs. We consider a challenging problem in this domain: learning regular expressions (regexes) from positive and negative example strings. This problem is challenging, as (1) user-generated examples may not be informative enough to sufficiently constrain the hypothesis space, and (2) even if user-generated examples are in principle informative, there is still a massive search space to examine. We frame regex induction as the problem of inferring a probabilistic regular grammar and propose an efficient inference approach that uses a novel stochastic process recognition model. This model incrementally “grows” a grammar using positive examples as a scaffold. We show that this approach is competitive with human ability to learn regexes from examples.
Tasks	Bayesian Inference
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08427v2
PDF	http://arxiv.org/pdf/1805.08427v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-inference-of-regular-expressions
Repo
Framework

Speaker-Follower Models for Vision-and-Language Navigation


Title	Speaker-Follower Models for Vision-and-Language Navigation
Authors	Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, Trevor Darrell
Abstract	Navigation guided by natural language instructions presents a challenging reasoning problem for instruction followers. Natural language instructions typically identify only a few high-level decisions and landmarks rather than complete low-level motor behaviors; much of the missing information must be inferred based on perceptual context. In machine learning settings, this is doubly challenging: it is difficult to collect enough annotated data to enable learning of this reasoning process from scratch, and also difficult to implement the reasoning process using generic sequence models. Here we describe an approach to vision-and-language navigation that addresses both these issues with an embedded speaker model. We use this speaker model to (1) synthesize new instructions for data augmentation and to (2) implement pragmatic reasoning, which evaluates how well candidate action sequences explain an instruction. Both steps are supported by a panoramic action space that reflects the granularity of human-generated instructions. Experiments show that all three components of this approach—speaker-driven data augmentation, pragmatic reasoning and panoramic action space—dramatically improve the performance of a baseline instruction follower, more than doubling the success rate over the best existing approach on a standard benchmark.
Tasks	Data Augmentation
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02724v2
PDF	http://arxiv.org/pdf/1806.02724v2.pdf
PWC	https://paperswithcode.com/paper/speaker-follower-models-for-vision-and
Repo
Framework

Robust Dual View Deep Agent


Title	Robust Dual View Deep Agent
Authors	Ibrahim M. Sobh, Nevin M. Darwish
Abstract	Motivated by recent advances in machine learning using Deep Reinforcement Learning, this paper proposes a modified architecture that produces more robust agents and speeds up the training process. Our architecture is based on the Asynchronous Advantage Actor-Critic (A3C) algorithm, where the total input dimensionality is halved by dividing the input into two independent streams. We use ViZDoom, 3D world software that is based on the classical first-person shooter video game Doom, as a test case. The experiments show that, in comparison to single-input agents, the proposed architecture achieves the same playing performance and shows more robust behavior, while reducing the number of training parameters by almost 30%.
Tasks
Published	2018-04-13
URL	http://arxiv.org/abs/1804.05120v2
PDF	http://arxiv.org/pdf/1804.05120v2.pdf
PWC	https://paperswithcode.com/paper/robust-dual-view-deep-agent
Repo
Framework

Blockchain and Artificial Intelligence


Title	Blockchain and Artificial Intelligence
Authors	Tshilidzi Marwala, Bo Xing
Abstract	It is undeniable that artificial intelligence (AI) and blockchain concepts are spreading at a phenomenal rate. Both technologies have distinct degrees of technological complexity and multi-dimensional business implications. However, a common misunderstanding about the blockchain concept, in particular, is that blockchain is decentralized and is not controlled by anyone. But the underlying development of a blockchain system is still attributed to a cluster of core developers. Take a smart contract as an example: it is essentially a collection of code (or functions) and data (or states) that is programmed and deployed on a blockchain (say, Ethereum) by different human programmers. It is thus, unfortunately, less likely to be free of loopholes and flaws. In this article, through a brief overview of how artificial intelligence could be used to deliver bug-free smart contracts so as to achieve the goal of blockchain 2.0, we emphasize that blockchain implementation can be assisted or enhanced via various AI techniques. The alliance of AI and blockchain is expected to create numerous possibilities.
Tasks
Published	2018-02-13
URL	http://arxiv.org/abs/1802.04451v2
PDF	http://arxiv.org/pdf/1802.04451v2.pdf
PWC	https://paperswithcode.com/paper/blockchain-and-artificial-intelligence
Repo
Framework

Blockchain Enabled Trustless API Marketplace


Title	Blockchain Enabled Trustless API Marketplace
Authors	Vijay Arya, Sayandeep Sen, Palani Kodeswaran
Abstract	There has been an unprecedented surge in the number of service providers offering a wide range of machine learning prediction APIs for tasks such as image classification, language translation, etc. thereby monetizing the underlying data and trained models. Typically, a data owner (API provider) develops a model, often over proprietary data, and leverages the infrastructure services of a cloud vendor for hosting and serving API requests. Clearly, this model assumes complete trust between the API Provider and cloud vendor. On the other hand, a malicious/buggy cloud vendor may copy the APIs and offer an identical service, under-report model usage metrics, or unfairly discriminate between different API providers by offering them a nominal share of the revenue. In this work, we present the design of a blockchain based decentralized trustless API marketplace that enables all the stakeholders in the API ecosystem to audit the behavior of the parties without having to trust a single centralized entity. In particular, our system divides an AI model into multiple pieces and deploys them among multiple cloud vendors who then collaboratively execute the APIs. Our design ensures that cloud vendors cannot collude with each other to steal the combined model, while individual cloud vendors and clients cannot repudiate their input or model executions.
Tasks	Image Classification
Published	2018-12-05
URL	http://arxiv.org/abs/1812.02154v1
PDF	http://arxiv.org/pdf/1812.02154v1.pdf
PWC	https://paperswithcode.com/paper/blockchain-enabled-trustless-api-marketplace
Repo
Framework

Parameter Learning and Change Detection Using a Particle Filter With Accelerated Adaptation


Title	Parameter Learning and Change Detection Using a Particle Filter With Accelerated Adaptation
Authors	Karol Gellert, Erik Schlögl
Abstract	This paper presents the construction of a particle filter, which incorporates elements inspired by genetic algorithms, in order to achieve accelerated adaptation of the estimated posterior distribution to changes in model parameters. Specifically, the filter is designed for the situation where the subsequent data in online sequential filtering does not match the model posterior filtered based on data up to a current point in time. The examples considered encompass parameter regime shifts and stochastic volatility. The filter adapts to regime shifts extremely rapidly and delivers a clear heuristic for distinguishing between regime shifts and stochastic volatility, even though the model dynamics assumed by the filter exhibit neither of those features.
Tasks
Published	2018-06-14
URL	http://arxiv.org/abs/1806.05387v1
PDF	http://arxiv.org/pdf/1806.05387v1.pdf
PWC	https://paperswithcode.com/paper/parameter-learning-and-change-detection-using
Repo
Framework

Code Review Comments: Language Matters


Title	Code Review Comments: Language Matters
Authors	Vasiliki Efstathiou, Diomidis Spinellis
Abstract	Recent research provides evidence that effective communication in collaborative software development has significant impact on the software development lifecycle. Although related qualitative and quantitative studies point out textual characteristics of well-formed messages, the underlying semantics of the intertwined linguistic structures still remain largely misinterpreted or ignored. In particular, regarding the quality of code reviews, the importance of thorough feedback and explicit rationale is often mentioned but rarely linked with related linguistic features. As a first step towards addressing this shortcoming, we propose grounding these studies on theories of linguistics. We particularly focus on linguistic structures of coherent speech and explain how they can be exploited in practice. We reflect on related approaches and examine, through a preliminary study on four open source projects, possible links between existing findings and the directions we suggest for detecting textual features of useful code reviews.
Tasks
Published	2018-03-06
URL	http://arxiv.org/abs/1803.02205v1
PDF	http://arxiv.org/pdf/1803.02205v1.pdf
PWC	https://paperswithcode.com/paper/code-review-comments-language-matters
Repo
Framework