January 29, 2020

Paper Group ANR 517

Adversary-resilient Distributed and Decentralized Statistical Inference and Machine Learning

Title Adversary-resilient Distributed and Decentralized Statistical Inference and Machine Learning
Authors Zhixiong Yang, Arpita Gang, Waheed U. Bajwa
Abstract While the last few decades have witnessed a huge body of work devoted to inference and learning in distributed and decentralized setups, much of this work assumes a non-adversarial setting in which individual nodes—apart from occasional statistical failures—operate as intended within the algorithmic framework. In recent years, however, cybersecurity threats from malicious non-state actors and rogue entities have forced practitioners and researchers to rethink the robustness of distributed and decentralized algorithms against adversarial attacks. As a result, we now have a plethora of algorithmic approaches that guarantee robustness of distributed and/or decentralized inference and learning under different adversarial threat models. Driven in part by the world’s growing appetite for data-driven decision making, however, securing of distributed/decentralized frameworks for inference and learning against adversarial threats remains a rapidly evolving research area. In this article, we provide an overview of some of the most recent developments in this area under the threat model of Byzantine attacks.
Tasks Decision Making
Published 2019-08-23
URL https://arxiv.org/abs/1908.08649v2
PDF https://arxiv.org/pdf/1908.08649v2.pdf
PWC https://paperswithcode.com/paper/adversary-resilient-inference-and-machine
Repo
Framework
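
The abstract above surveys Byzantine-resilient methods without naming a specific one. As a rough illustration only, the sketch below uses coordinate-wise median aggregation, one commonly discussed robust aggregation rule; choosing it here is an assumption, and the function name and toy data are hypothetical rather than taken from the paper.

```python
from statistics import median
from typing import List, Sequence


def coordinate_wise_median(updates: Sequence[Sequence[float]]) -> List[float]:
    """Aggregate per-node update vectors by taking the median of each coordinate.

    Hypothetical helper for illustration; not the survey's own algorithm.
    """
    if not updates:
        raise ValueError("need at least one update")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same dimension")
    return [median([u[i] for u in updates]) for i in range(dim)]


def coordinate_wise_mean(updates: Sequence[Sequence[float]]) -> List[float]:
    """Plain averaging, shown only as a non-robust baseline for comparison."""
    dim = len(updates[0])
    return [sum(u[i] for u in updates) / len(updates) for i in range(dim)]


# Toy data: three honest nodes report similar vectors, one Byzantine node lies.
honest = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1]]
byzantine = [[1000.0, -1000.0]]

robust = coordinate_wise_median(honest + byzantine)
naive = coordinate_wise_mean(honest + byzantine)
print("median:", robust)
print("mean:  ", naive)
```

In this toy run the median stays close to the honest values while the mean is pulled far away by the single corrupted vector, which is the intuition behind robust aggregation when only a minority of nodes is corrupted.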

Dialog on a canvas with a machine

Title Dialog on a canvas with a machine
Authors Vivien Cabannes, Thomas Kerdreux, Louis Thiry, Tina Campana, Charly Ferrandes
Abstract We propose a new form of human-machine interaction. It is a pictorial game consisting of interactive rounds of creation between artists and a machine, who paint repeatedly, one after the other. On its turns, the computer partially completes the drawing using machine learning algorithms and projects its additions directly onto the canvas, which the artists are free to insert or modify. Alongside fostering creativity, the process is designed to question the growing interaction between humans and machines.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04386v2
PDF https://arxiv.org/pdf/1910.04386v2.pdf
PWC https://paperswithcode.com/paper/dialog-on-a-canvas-with-a-machine
Repo
Framework

An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems

Title An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems
Authors Heinke Hihn, Sebastian Gottwald, Daniel A. Braun
Abstract Information-theoretic bounded rationality describes utility-optimizing decision-makers whose limited information-processing capabilities are formalized by information constraints. One of the consequences of bounded rationality is that resource-limited decision-makers can join together to solve decision-making problems that are beyond the capabilities of each individual. Here, we study an information-theoretic principle that drives division of labor and specialization when decision-makers with information constraints are joined together. We devise an on-line learning rule of this principle that learns a partitioning of the problem space such that it can be solved by specialized linear policies. We demonstrate the approach for decision-making problems whose complexity exceeds the capabilities of individual decision-makers, but can be solved by combining the decision-makers optimally. The strength of the model is that it is abstract and principled, yet has direct applications in classification, regression, reinforcement learning and adaptive control.
Tasks Decision Making
Published 2019-07-26
URL https://arxiv.org/abs/1907.11452v3
PDF https://arxiv.org/pdf/1907.11452v3.pdf
PWC https://paperswithcode.com/paper/an-information-theoretic-on-line-learning
Repo
Framework

DAL: Dual Adversarial Learning for Dialogue Generation

Title DAL: Dual Adversarial Learning for Dialogue Generation
Authors Shaobo Cui, Rongzhong Lian, Di Jiang, Yuanfeng Song, Siqi Bao, Yong Jiang
Abstract In open-domain dialogue systems, generative approaches have attracted much attention for response generation. However, existing methods are heavily plagued by generating safe responses and unnatural responses. To alleviate these two problems, we propose a novel framework named Dual Adversarial Learning (DAL) for high-quality response generation. DAL is the first work to utilize the duality between query generation and response generation to avoid safe responses and increase the diversity of the generated responses. Additionally, DAL uses adversarial learning to mimic human judges and guide the system to generate natural responses. Experimental results demonstrate that DAL effectively improves both the diversity and the overall quality of the generated responses. DAL outperforms the state-of-the-art methods on automatic metrics and human evaluations.
Tasks Dialogue Generation
Published 2019-06-23
URL https://arxiv.org/abs/1906.09556v1
PDF https://arxiv.org/pdf/1906.09556v1.pdf
PWC https://paperswithcode.com/paper/dal-dual-adversarial-learning-for-dialogue-1
Repo
Framework

Automated pulmonary nodule detection using 3D deep convolutional neural networks

Title Automated pulmonary nodule detection using 3D deep convolutional neural networks
Authors Hao Tang, Daniel R. Kim, Xiaohui Xie
Abstract Early detection of pulmonary nodules in computed tomography (CT) images is essential for successful outcomes among lung cancer patients. Much attention has been given to deep convolutional neural network (DCNN)-based approaches to this task, but models have relied at least partly on 2D or 2.5D components for inherently 3D data. In this paper, we introduce a novel DCNN approach, consisting of two stages, that is fully three-dimensional end-to-end and utilizes the state-of-the-art in object detection. First, nodule candidates are identified with a U-Net-inspired 3D Faster R-CNN trained using online hard negative mining. Second, false positive reduction is performed by 3D DCNN classifiers trained on difficult examples produced during candidate screening. Finally, we introduce a method to ensemble models from both stages via consensus to give the final predictions. By using this framework, we ranked first of 2887 teams in Season One of Alibaba’s 2017 TianChi AI Competition for Healthcare.
Tasks Computed Tomography (CT), Object Detection
Published 2019-03-23
URL http://arxiv.org/abs/1903.09876v1
PDF http://arxiv.org/pdf/1903.09876v1.pdf
PWC https://paperswithcode.com/paper/automated-pulmonary-nodule-detection-using-3d
Repo
Framework

Recurrent Neural Filters: Learning Independent Bayesian Filtering Steps for Time Series Prediction

Title Recurrent Neural Filters: Learning Independent Bayesian Filtering Steps for Time Series Prediction
Authors Bryan Lim, Stefan Zohren, Stephen Roberts
Abstract Despite the recent popularity of deep generative state space models, few comparisons have been made between network architectures and the inference steps of the Bayesian filtering framework – with most models simultaneously approximating both state transition and update steps with a single recurrent neural network (RNN). In this paper, we introduce the Recurrent Neural Filter (RNF), a novel recurrent autoencoder architecture that learns distinct representations for each Bayesian filtering step, captured by a series of encoders and decoders. Testing this on three real-world time series datasets, we demonstrate that the decoupled representations learnt not only improve the accuracy of one-step-ahead forecasts while providing realistic uncertainty estimates, but also facilitate multistep prediction through the separation of encoder stages.
Tasks Time Series, Time Series Prediction
Published 2019-01-23
URL https://arxiv.org/abs/1901.08096v5
PDF https://arxiv.org/pdf/1901.08096v5.pdf
PWC https://paperswithcode.com/paper/recurrent-neural-filters-learning-independent
Repo
Framework

Predicting Weather Uncertainty with Deep Convnets

Title Predicting Weather Uncertainty with Deep Convnets
Authors Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, Torsten Hoefler
Abstract Modern weather forecast models perform uncertainty quantification using ensemble prediction systems, which collect nonparametric statistics based on multiple perturbed simulations. To provide accurate estimation, dozens of such computationally intensive simulations must be run. We show that deep neural networks can be used on a small set of numerical weather simulations to estimate the spread of a weather forecast, significantly reducing computational cost. To train the system, we both modify the 3D U-Net architecture and explore models that incorporate temporal data. Our models serve as a starting point to improve uncertainty quantification in current real-time weather forecasting systems, which is vital for predicting extreme events.
Tasks Weather Forecasting
Published 2019-11-02
URL https://arxiv.org/abs/1911.00630v2
PDF https://arxiv.org/pdf/1911.00630v2.pdf
PWC https://paperswithcode.com/paper/predicting-weather-uncertainty-with-deep
Repo
Framework
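
For readers unfamiliar with the quantity being predicted, the sketch below computes an ensemble spread directly from a set of perturbed forecasts. It assumes that "spread" means the per-grid-point standard deviation across ensemble members (population form); the abstract does not give the exact definition, and the function name and data are hypothetical.

```python
from statistics import pstdev
from typing import List, Sequence


def ensemble_spread(members: Sequence[Sequence[float]]) -> List[float]:
    """Per-grid-point standard deviation across ensemble members.

    `members[k][i]` is the forecast of member k at grid point i.
    Assumption: population standard deviation; a sample estimate is also plausible.
    """
    if not members:
        raise ValueError("need at least one ensemble member")
    n_points = len(members[0])
    if any(len(m) != n_points for m in members):
        raise ValueError("all members must cover the same grid points")
    return [pstdev([m[i] for m in members]) for i in range(n_points)]


# Three toy members forecasting temperature at two grid points.
members = [
    [10.0, 20.0],
    [12.0, 20.0],
    [14.0, 20.0],
]
spread = ensemble_spread(members)
print(spread)  # members disagree at the first point and agree at the second
```

The paper's networks aim to estimate this kind of statistic from only a few simulations; the direct computation here is just the reference quantity, not the learned model.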

Persona-Aware Tips Generation

Title Persona-Aware Tips Generation
Authors Piji Li, Zihao Wang, Lidong Bing, Wai Lam
Abstract Tips, as a compact and concise form of reviews, have received less attention from researchers. In this paper, we investigate the task of tips generation by considering the 'persona' information, which captures the intrinsic language style of the users or the different characteristics of the product items. In order to exploit the persona information, we propose a framework based on adversarial variational auto-encoders (aVAE) for persona modeling from the historical tips and reviews of users and items. The latent variables from aVAE are regarded as persona embeddings. Besides representing persona using the latent embeddings, we design a persona memory for storing the persona-related words for users and items. A Pointer Network is used to retrieve persona wordings from the memory when generating tips. Moreover, the persona embeddings are used as latent factors by a rating prediction component to predict the sentiment of a user over an item. Finally, the persona embeddings and the sentiment information are incorporated into a recurrent neural network-based tips generation component. Extensive experimental results are reported and discussed to elaborate the peculiarities of our framework.
Tasks
Published 2019-03-06
URL http://arxiv.org/abs/1903.02156v2
PDF http://arxiv.org/pdf/1903.02156v2.pdf
PWC https://paperswithcode.com/paper/persona-aware-tips-generation
Repo
Framework

Land Cover Change Detection via Semantic Segmentation

Title Land Cover Change Detection via Semantic Segmentation
Authors Renee Su, Rong Chen
Abstract This paper presents a change detection method that identifies land cover changes from aerial imagery, using semantic segmentation, a machine learning approach. We present a land cover classification training pipeline with Deeplab v3+, state-of-the-art semantic segmentation technology, including data preparation, model training for seven land cover types, and model exporting modules. In the land cover change detection system, the inputs are images retrieved from Google Earth at the same location but from different times. The system then predicts semantic segmentation results on these images using the trained model and calculates the land cover class percentage for each input image. We see an improvement in the accuracy of the land cover semantic segmentation model, with a mean IoU of 0.756 compared to 0.433, as reported in the DeepGlobe land cover classification challenge. The land cover change detection system that leverages the state-of-the-art semantic segmentation technology is proposed and can be used for deforestation analysis, land management, and urban planning.
Tasks Semantic Segmentation
Published 2019-11-28
URL https://arxiv.org/abs/1911.12903v1
PDF https://arxiv.org/pdf/1911.12903v1.pdf
PWC https://paperswithcode.com/paper/land-cover-change-detection-via-semantic
Repo
Framework
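
The post-processing steps mentioned in the abstract (class percentage per image, and mean IoU as the quality metric) can be sketched as follows. This is a minimal illustration on tiny label masks; the function names, the two-class toy data, and the choice to skip classes absent from both masks when averaging IoU are assumptions, not details from the paper.

```python
from collections import Counter
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # 2D grid of integer class labels


def class_percentages(mask: Mask, num_classes: int) -> List[float]:
    """Percentage of pixels assigned to each class in one segmentation mask."""
    flat = [label for row in mask for label in row]
    counts = Counter(flat)
    return [100.0 * counts.get(c, 0) / len(flat) for c in range(num_classes)]


def percentage_change(before: Mask, after: Mask, num_classes: int) -> List[float]:
    """Change in class share (percentage points) between two acquisition times."""
    p_before = class_percentages(before, num_classes)
    p_after = class_percentages(after, num_classes)
    return [b - a for a, b in zip(p_before, p_after)]


def mean_iou(pred: Mask, truth: Mask, num_classes: int) -> float:
    """Mean intersection-over-union, skipping classes absent from both masks."""
    p = [label for row in pred for label in row]
    t = [label for row in truth for label in row]
    ious = []
    for c in range(num_classes):
        inter = sum(1 for a, b in zip(p, t) if a == c and b == c)
        union = sum(1 for a, b in zip(p, t) if a == c or b == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes (say 0 = forest, 1 = urban; labels are hypothetical).
before = [[0, 0], [1, 1]]
after = [[0, 1], [1, 1]]

change = percentage_change(before, after, num_classes=2)
score = mean_iou(after, before, num_classes=2)
print(change)
print(score)
```

Here class 0 loses 25 percentage points of the image and class 1 gains the same amount, which is the kind of per-class shift a deforestation or urban-growth analysis would report.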

Sentiment-Aware Recommendation System for Healthcare using Social Media

Title Sentiment-Aware Recommendation System for Healthcare using Social Media
Authors Alan Aipe, Mukuntha Narayanan Sundararaman, Asif Ekbal
Abstract Over the last decade, health communities (known as forums) have evolved into platforms where more and more users share their medical experiences, thereby seeking guidance and interacting with people of the community. The shared content, though informal and unstructured in nature, contains valuable medical and/or health-related information and can be leveraged to produce structured suggestions for the general public. In this paper, we first propose a stacked deep learning model for sentiment analysis of medical forum data. The stacked model comprises a Convolutional Neural Network (CNN) followed by a Long Short-Term Memory (LSTM) network and then by another CNN. For a blog classified with positive sentiment, we retrieve the top-n similar posts. Thereafter, we develop a probabilistic model for suggesting suitable treatments or procedures for a particular disease or health condition. We believe that integrating medical sentiment and suggestion would help users find relevant content regarding medications and medical conditions, without having to manually scroll through a large amount of unstructured content.
Tasks Sentiment Analysis
Published 2019-09-18
URL https://arxiv.org/abs/1909.08686v1
PDF https://arxiv.org/pdf/1909.08686v1.pdf
PWC https://paperswithcode.com/paper/sentiment-aware-recommendation-system-for
Repo
Framework

Learning Explicit and Implicit Structures for Targeted Sentiment Analysis

Title Learning Explicit and Implicit Structures for Targeted Sentiment Analysis
Authors Hao Li, Wei Lu
Abstract Targeted sentiment analysis is the task of jointly predicting target entities and their associated sentiment information. Existing research efforts mostly regard this joint task as a sequence labeling problem, building models that can capture explicit structures in the output space. However, the importance of capturing implicit global structural information that resides in the input space is largely unexplored. In this work, we argue that both types of information (implicit and explicit structural information) are crucial for building a successful targeted sentiment analysis model. Our experimental results show that properly capturing both types of information leads to better performance than competitive existing approaches. We also conduct extensive experiments to investigate our model’s effectiveness and robustness.
Tasks Sentiment Analysis
Published 2019-09-17
URL https://arxiv.org/abs/1909.07593v1
PDF https://arxiv.org/pdf/1909.07593v1.pdf
PWC https://paperswithcode.com/paper/learning-explicit-and-implicit-structures-for
Repo
Framework

Deep topic modeling by multilayer bootstrap network and lasso

Title Deep topic modeling by multilayer bootstrap network and lasso
Authors Jianyu Wang, Xiao-Lei Zhang
Abstract Topic modeling is widely studied for the dimension reduction and analysis of documents. However, it is formulated as a difficult optimization problem. Current approximate solutions also suffer from inaccurate model- or data-assumptions. To deal with the above problems, we propose a polynomial-time deep topic model with no model and data assumptions. Specifically, we first apply multilayer bootstrap network (MBN), which is an unsupervised deep model, to reduce the dimension of documents, and then use the low-dimensional data representations or their clustering results as the target of supervised Lasso for topic word discovery. To our knowledge, this is the first time that MBN and Lasso are applied to unsupervised topic modeling. Experimental comparison results with five representative topic models on the 20-newsgroups and TDT2 corpora illustrate the effectiveness of the proposed algorithm.
Tasks Dimensionality Reduction, Topic Models
Published 2019-10-24
URL https://arxiv.org/abs/1910.10953v1
PDF https://arxiv.org/pdf/1910.10953v1.pdf
PWC https://paperswithcode.com/paper/deep-topic-modeling-by-multilayer-bootstrap
Repo
Framework

Accepted or Abandoned? Predicting the Fate of Code Changes

Title Accepted or Abandoned? Predicting the Fate of Code Changes
Authors Md. Khairul Islam, Toufique Ahmed, Fahim Ahmed, Dr. Anindya Iqbal
Abstract Many mature Open-Source Software (OSS) and commercial organizations have adopted peer code review as an integral part of the development process to ensure the quality of the product. Of particular interest are code changes that end up “abandoned,” either because they are rejected, or (more commonly) because they are never accepted at all (at least not through the review tool). Several factors such as resource allocation, job environment, and efficiency mismatch between the author and the reviewer may cause a code change to be abandoned even after months of effort from the developers and the reviewers. Predicting the review outcome of such code changes can ease the prioritization of tasks and the utilization of limited resources by saving time spent on low-quality code changes. In this paper, we conducted a comprehensive study to predict whether a code change is merged or abandoned and applied various well-known supervised machine learning algorithms. We propose PredCR, a Random Forest-based model that predicts the review outcome of a code change at the beginning of the code change with an F-measure of 0.91 on the test set. It also improves predictions of abandoned changes by 27%-103% and of merged changes by 5%-11%. Our model accurately classifies 93% of the top 25% of code changes that go longest without being merged (average duration 196 days). PredCR can also adapt to changes in feature values at different stages of the review process, although it already achieves very high performance at the very early stage (within 10% of the review process). This way, prediction quality for a particular code change can improve as the code review progresses. We also conducted a study to find out the properties of an ideal training set for our tool. We found that training with instances from the same projects yields a 9%-25% performance increase.
Tasks
Published 2019-12-07
URL https://arxiv.org/abs/1912.03437v1
PDF https://arxiv.org/pdf/1912.03437v1.pdf
PWC https://paperswithcode.com/paper/accepted-or-abandoned-predicting-the-fate-of
Repo
Framework

A Human-Centered Approach to Interactive Machine Learning

Title A Human-Centered Approach to Interactive Machine Learning
Authors Kory W. Mathewson
Abstract The interactive machine learning (IML) community aims to augment humans’ ability to learn and make decisions over time through the development of automated decision-making systems. This interaction represents a collaboration between multiple intelligent systems—humans and machines. A lack of appropriate consideration for the humans involved can lead to problematic system behaviour, and issues of fairness, accountability, and transparency. This work presents a human-centred thinking approach to applying IML methods. This guide is intended to be used by AI practitioners who incorporate human factors in their work. These practitioners are responsible for the health, safety, and well-being of interacting humans. An obligation of responsibility for public interaction means acting with integrity, honesty, fairness, and abiding by applicable legal statutes. With these values and principles in mind, we as a research community can better achieve the collective goal of augmenting human ability. This practical guide aims to support many of the responsible decisions necessary throughout iterative design, development, and dissemination of IML systems.
Tasks Decision Making
Published 2019-05-15
URL https://arxiv.org/abs/1905.06289v1
PDF https://arxiv.org/pdf/1905.06289v1.pdf
PWC https://paperswithcode.com/paper/a-human-centered-approach-to-interactive
Repo
Framework

Reversible designs for extreme memory cost reduction of CNN training

Title Reversible designs for extreme memory cost reduction of CNN training
Authors Tristan Hascoet, Quentin Febvre, Yasuo Ariki, Tetsuya Takiguchi
Abstract Training Convolutional Neural Networks (CNNs) is a resource-intensive task that requires specialized hardware for efficient computation. One of the most limiting bottlenecks of CNN training is the memory cost associated with storing the activation values of hidden layers needed for the computation of the weight gradients during the backward pass of the backpropagation algorithm. Recently, reversible architectures have been proposed to reduce the memory cost of training large CNNs by reconstructing the input activation values of hidden layers from their output during the backward pass, circumventing the need to accumulate these activations in memory during the forward pass. In this paper, we push this idea to the extreme and analyze reversible network designs yielding minimal training memory footprint. We investigate the propagation of numerical errors in long chains of invertible operations and analyze their effect on training. We introduce the notion of pixel-wise memory cost to characterize the memory footprint of model training, and propose a new model architecture able to efficiently train arbitrarily deep neural networks with a minimum memory cost of 352 bytes per input pixel. This new kind of architecture enables training large neural networks on very limited memory, opening the door for neural network training on embedded devices or non-specialized hardware. For instance, we demonstrate training of our model to 93.3% accuracy on the CIFAR10 dataset within 67 minutes on a low-end Nvidia GTX750 GPU with only 1GB of memory.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.11127v1
PDF https://arxiv.org/pdf/1910.11127v1.pdf
PWC https://paperswithcode.com/paper/reversible-designs-for-extreme-memory-cost
Repo
Framework
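
The core idea in the abstract above, recovering a layer's input from its output so activations need not be stored, can be illustrated with an additive coupling block. The coupling form and the toy functions `f` and `g` are assumptions borrowed from the general reversible-network literature; the abstract does not specify the paper's exact layer design.

```python
import math
from typing import List, Tuple

Vector = List[float]


def f(v: Vector) -> Vector:
    """Stand-in for an arbitrary sub-network (hypothetical choice)."""
    return [math.tanh(2.0 * a) for a in v]


def g(v: Vector) -> Vector:
    """Second stand-in sub-network; neither f nor g needs to be invertible."""
    return [0.5 * a * a for a in v]


def forward(x1: Vector, x2: Vector) -> Tuple[Vector, Vector]:
    """Additive coupling: y1 = x1 + f(x2), y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1: Vector, y2: Vector) -> Tuple[Vector, Vector]:
    """Recover the inputs from the outputs without any stored activations."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


x1, x2 = [0.3, -1.2], [0.7, 0.05]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)

# Reconstruction is exact up to floating-point rounding.
max_err = max(abs(a - b) for a, b in zip(x1 + x2, r1 + r2))
print(max_err)
```

The small residual in `max_err` is floating-point rounding; in a long chain of such blocks these residuals can accumulate, which relates to the numerical error propagation the abstract says the authors investigate.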