January 26, 2020

3135 words · 15 min read

Paper Group ANR 1428

Computationally Efficient Neural Image Compression

Title Computationally Efficient Neural Image Compression
Authors Nick Johnston, Elad Eban, Ariel Gordon, Johannes Ballé
Abstract Image compression using neural networks has reached or exceeded non-neural methods (such as JPEG, WebP, BPG). While these networks are state of the art in rate-distortion performance, the computational feasibility of these models remains a challenge. We apply automatic network optimization techniques to reduce the computational complexity of a popular architecture used in neural image compression, analyze the decoder complexity in execution runtime, and explore the trade-offs between two distortion metrics, rate-distortion performance and run-time performance, to design and research more computationally efficient neural image compression. We find that our method decreases the decoder run-time requirements by over 50% for a state-of-the-art neural architecture.
Tasks Image Compression
Published 2019-12-18
URL https://arxiv.org/abs/1912.08771v1
PDF https://arxiv.org/pdf/1912.08771v1.pdf
PWC https://paperswithcode.com/paper/computationally-efficient-neural-image
Repo
Framework
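Most of the decoder cost in such architectures sits in its convolution layers, so reducing channel widths is where the runtime savings come from. As a rough back-of-the-envelope sketch (the layer shapes and kernel size below are hypothetical, not taken from the paper), halving the channel counts of a small convolutional decoder roughly quarters its multiply-accumulate cost:

```python
def conv_macs(h, w, cin, cout, k):
    """Multiply-accumulates of one stride-1 convolution layer."""
    return h * w * cin * cout * k * k

# hypothetical 3-layer decoder producing a 512x768 RGB image (5x5 kernels)
layers = [(128, 128), (128, 128), (128, 3)]
full = sum(conv_macs(512, 768, cin, cout, 5) for cin, cout in layers)

# halving channel widths (keeping the 3 output channels) roughly quarters the cost
pruned = sum(conv_macs(512, 768, cin // 2, max(cout // 2, 3), 5)
             for cin, cout in layers)
```

This quadratic dependence on width is why automatic width-reduction techniques can cut decoder runtime by large factors with modest rate-distortion loss.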

Differential Description Length for Hyperparameter Selection in Machine Learning

Title Differential Description Length for Hyperparameter Selection in Machine Learning
Authors Mojtaba Abolfazli, Anders Host-Madsen, June Zhang
Abstract This paper introduces a new method for model selection, and more generally hyperparameter selection, in machine learning. Minimum description length (MDL) is an established method for model selection; however, it is not directly aimed at minimizing generalization error, which is often the primary goal in machine learning. The paper demonstrates a relationship between generalization error and a difference of description lengths of the training data; we call this difference the differential description length (DDL). This allows prediction of generalization error from the training data alone by encoding the training data. DDL can then be used for model selection by choosing the model with the smallest predicted generalization error. We show how this method can be used for linear regression as well as for neural networks and deep learning. Experimental results show that DDL leads to smaller generalization error than cross-validation and traditional MDL and Bayes methods.
Tasks Model Selection
Published 2019-02-13
URL https://arxiv.org/abs/1902.04699v2
PDF https://arxiv.org/pdf/1902.04699v2.pdf
PWC https://paperswithcode.com/paper/differential-description-length-for
Repo
Framework
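The underlying quantity, how much extra codelength the later part of the training data costs once the earlier part has been encoded, can be illustrated with a toy prequential (sequential) code for binary data. This is an illustrative proxy only, not the paper's exact DDL estimator:

```python
import math

def prequential_codelength(bits):
    """Code length (in bits) of a binary sequence under sequential
    Laplace (add-one) Bernoulli prediction: each symbol is coded with
    the predictive probability fitted to the symbols seen so far."""
    ones, total, length = 0, 0, 0.0
    for b in bits:
        p = (ones + 1) / (total + 2)          # predictive probability of a 1
        length += -math.log2(p if b else 1 - p)
        ones += b
        total += 1
    return length

bits = [1, 0, 1, 1, 0, 1, 1, 1]
full = prequential_codelength(bits)
half = prequential_codelength(bits[: len(bits) // 2])
ddl_proxy = full - half   # cost of the second half given the first half
```

A model that generalizes well encodes the second half cheaply given the first, so a small difference of description lengths signals low predicted generalization error.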

Electroencephalogram (EEG) for Delineating Objective Measure of Autism Spectrum Disorder (ASD) (Extended Version)

Title Electroencephalogram (EEG) for Delineating Objective Measure of Autism Spectrum Disorder (ASD) (Extended Version)
Authors Yasith Jayawardana, Mark Jaime, Sashi Thapaliya, Sampath Jayarathna
Abstract Autism Spectrum Disorder (ASD) is a developmental disorder that often impairs a child’s normal brain development. According to the CDC, an estimated 1 in 6 children in the US suffer from developmental disorders, and 1 in 68 children in the US suffer from ASD. This condition has a negative impact on a person’s ability to hear, socialize, and communicate. Overall, ASD has a broad range of symptoms and severity; hence the term spectrum is used. Genetics is known to be one of the main contributors to ASD. To date, no suitable cure for ASD has been found. Early diagnosis is crucial for the long-term treatment of ASD, but this is challenging due to the lack of proper objective measures. Subjective measures take more time and resources, and suffer from false positives or false negatives. There is a need for efficient objective measures that can help diagnose this disorder as early as possible with less effort. EEG measures the electrical signals of the brain via electrodes placed at various locations on the scalp. These signals can be used to study complex neuropsychiatric issues. Studies have shown that EEG has the potential to be used as a biomarker for various neurological conditions, including ASD. This chapter outlines the use of EEG measurement for the classification of ASD using machine learning algorithms.
Tasks EEG
Published 2019-06-26
URL https://arxiv.org/abs/1907.01515v1
PDF https://arxiv.org/pdf/1907.01515v1.pdf
PWC https://paperswithcode.com/paper/electroencephalogram-eeg-for-delineating
Repo
Framework

Domain Adaptive Dialog Generation via Meta Learning

Title Domain Adaptive Dialog Generation via Meta Learning
Authors Kun Qian, Zhou Yu
Abstract Domain adaptation is an essential task in dialog system building because many new dialog tasks are created for different needs every day. Collecting and annotating training data for these new tasks is costly since it involves real user interactions. We propose a domain adaptive dialog generation method based on meta-learning (DAML). DAML is an end-to-end trainable dialog system model that learns from multiple rich-resource tasks and then adapts to new domains with minimal training samples. We train a dialog system model on multiple rich-resource single-domain dialog datasets by applying the model-agnostic meta-learning algorithm to the dialog domain. The model is capable of efficiently learning a competitive dialog system for a new domain with only a few training examples. The two-step gradient update in DAML enables the model to learn general features across multiple tasks. We evaluate our method on a simulated dialog dataset and achieve state-of-the-art performance, which generalizes to new tasks.
Tasks Domain Adaptation, Meta-Learning
Published 2019-06-08
URL https://arxiv.org/abs/1906.03520v2
PDF https://arxiv.org/pdf/1906.03520v2.pdf
PWC https://paperswithcode.com/paper/domain-adaptive-dialog-generation-via-meta
Repo
Framework
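The "two-step gradient update" is the standard MAML pattern: an inner, task-specific adaptation step followed by an outer meta-update through the adapted parameters. A minimal first-order sketch on toy scalar "tasks" (the quadratic losses and targets are invented for illustration; nothing here is dialog-specific):

```python
def grad(theta, target):
    """Gradient of the toy task loss (theta - target)^2."""
    return 2.0 * (theta - target)

def maml_step(theta, tasks, inner_lr=0.1, outer_lr=0.05):
    """One meta-update: step 1 adapts per task, step 2 takes the meta
    gradient through the adapted parameter (first-order approximation)."""
    meta_grad = 0.0
    for target in tasks:
        theta_i = theta - inner_lr * grad(theta, target)   # step 1: adapt
        meta_grad += grad(theta_i, target)                 # step 2: meta gradient
    return theta - outer_lr * meta_grad / len(tasks)

theta = 0.0
for _ in range(100):
    theta = maml_step(theta, tasks=[1.0, 2.0, 3.0])
# theta converges toward an initialization from which each task is
# reachable in one inner step (here, the mean of the targets)
```

In DAML the same pattern is applied with dialog-generation losses per domain, so the meta-learned initialization adapts to a new domain from a few examples.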

ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons

Title ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons
Authors Margaret Li, Jason Weston, Stephen Roller
Abstract While dialogue remains an important end-goal of natural language research, the difficulty of evaluation is an oft-quoted reason why it remains troublesome to make real progress towards its solution. Evaluation difficulties are actually two-fold: not only do automatic metrics not correlate well with human judgments, but also human judgments themselves are in fact difficult to measure. The two most used human judgment tests, single-turn pairwise evaluation and multi-turn Likert scores, both have serious flaws as we discuss in this work. We instead provide a novel procedure involving comparing two full dialogues, where a human judge is asked to pay attention to only one speaker within each, and make a pairwise judgment. The questions themselves are optimized to maximize the robustness of judgments across different annotators, resulting in better tests. We also show how these tests work in self-play model chat setups, resulting in faster, cheaper tests. We hope these tests become the de facto standard, and will release open-source code to that end.
Tasks
Published 2019-09-06
URL https://arxiv.org/abs/1909.03087v1
PDF https://arxiv.org/pdf/1909.03087v1.pdf
PWC https://paperswithcode.com/paper/acute-eval-improved-dialogue-evaluation-with
Repo
Framework

Ten AI Stepping Stones for Cybersecurity

Title Ten AI Stepping Stones for Cybersecurity
Authors Ricardo Morla
Abstract With the turmoil in cybersecurity and the mind-blowing advances in AI, it is only natural that cybersecurity practitioners consider further employing learning techniques to help secure their organizations and improve the efficiency of their security operation centers. But with great fears come great opportunities for both the good and the evil, and a myriad of bad deals. This paper discusses ten issues in cybersecurity that hopefully will make it easier for practitioners to ask detailed questions about what they want from an AI system in their cybersecurity operations. We draw on the state of the art to provide factual arguments for a discussion on well-established AI in cybersecurity issues, including the current scope of AI and its application to cybersecurity, the impact of privacy concerns on the cybersecurity data that can be collected and shared externally to the organization, how an AI decision can be explained to the person running the operations center, and the implications of the adversarial nature of cybersecurity in the learning techniques. We then discuss the use of AI by attackers on a level playing field including several issues in an AI battlefield, and an AI perspective on the old cat-and-mouse game including how the adversary may assess your AI power.
Tasks
Published 2019-12-14
URL https://arxiv.org/abs/1912.06817v1
PDF https://arxiv.org/pdf/1912.06817v1.pdf
PWC https://paperswithcode.com/paper/ten-ai-stepping-stones-for-cybersecurity
Repo
Framework

Bounds for Approximate Regret-Matching Algorithms

Title Bounds for Approximate Regret-Matching Algorithms
Authors Ryan D’Orazio, Dustin Morrill, James R. Wright
Abstract A dominant approach to solving large imperfect-information games is Counterfactual Regret Minimization (CFR). In CFR, many regret minimization problems are combined to solve the game. For very large games, abstraction is typically needed to render CFR tractable. Abstractions are often manually tuned, possibly removing important strategic differences in the full game and harming performance. Function approximation provides a natural solution to finding good abstractions to approximate the full game. A common approach to incorporating function approximation is to learn the inputs needed for a regret minimizing algorithm, allowing for generalization across many regret minimization problems. This paper gives regret bounds when a regret minimizing algorithm uses estimates instead of true values. This form of analysis is the first to generalize to a larger class of $(\Phi, f)$-regret matching algorithms, and includes different forms of regret such as swap, internal, and external regret. We demonstrate how these results give a slightly tighter bound for Regression Regret-Matching (RRM), and present a novel bound for combining regression with Hedge.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01706v2
PDF https://arxiv.org/pdf/1910.01706v2.pdf
PWC https://paperswithcode.com/paper/bounds-for-approximate-regret-matching
Repo
Framework
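Regret matching itself is simple: play each action with probability proportional to its positive cumulative regret. The sketch below feeds the update noisy utility estimates rather than true values, in the spirit of the paper's analysis; the bandit-style setup, utilities, and noise level are invented for illustration, not taken from the paper:

```python
import random

def regret_matching_strategy(regrets):
    """Play each action with probability proportional to positive regret;
    fall back to uniform when no regret is positive."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    n = len(regrets)
    return [p / total for p in pos] if total > 0 else [1.0 / n] * n

random.seed(0)
true_utils = [0.2, 0.5, 0.9]        # fixed action utilities (illustrative)
regrets = [0.0] * 3
avg = [0.0] * 3
T = 2000
for _ in range(T):
    sigma = regret_matching_strategy(regrets)
    ev = sum(p * u for p, u in zip(sigma, true_utils))
    for a in range(3):
        est = true_utils[a] + random.gauss(0.0, 0.1)   # estimated utility
        regrets[a] += est - ev                          # regret update from estimates
    avg = [x + p for x, p in zip(avg, sigma)]
avg = [x / T for x in avg]   # average strategy concentrates on the best action
```

With zero-mean estimation noise the average strategy still concentrates on the highest-utility action; the paper's bounds quantify how estimation error of this kind degrades the regret guarantee.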

Accelerating Federated Learning via Momentum Gradient Descent

Title Accelerating Federated Learning via Momentum Gradient Descent
Authors Wei Liu, Li Chen, Yunfei Chen, Wenyi Zhang
Abstract Federated learning (FL) provides a communication-efficient approach to solving machine learning problems over distributed data without sending raw data to a central server. However, existing works on FL use only first-order gradient descent (GD) and do not incorporate preceding iterations into the gradient update, which can potentially accelerate convergence. In this paper, we consider a momentum term that incorporates the previous iteration. The proposed momentum federated learning (MFL) uses momentum gradient descent (MGD) in the local update step of the FL system. We establish global convergence properties of MFL and derive an upper bound on the MFL convergence rate. Comparing the upper bounds on the MFL and FL convergence rates, we provide conditions under which MFL accelerates convergence. For different machine learning models, the convergence performance of MFL is evaluated in experiments on the MNIST dataset. Simulation results confirm that MFL is globally convergent and further reveal significant convergence improvement over FL.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.03197v2
PDF https://arxiv.org/pdf/1910.03197v2.pdf
PWC https://paperswithcode.com/paper/accelerating-federated-learning-via-momentum
Repo
Framework
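The MFL recipe, run momentum (heavy-ball) gradient descent locally on each client and then average on the server, can be sketched on a toy 1-D least-squares problem. The client data, step sizes, and round counts below are invented for illustration:

```python
def local_mgd(w, data, lr=0.05, beta=0.9, steps=10):
    """Local momentum-GD (heavy-ball) steps on a 1-D least-squares loss."""
    v = 0.0
    for _ in range(steps):
        g = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        v = beta * v + g          # momentum: carry over the last iteration
        w -= lr * v
    return w

def federated_round(w, clients):
    """Each client runs local MGD from the global model; the server averages."""
    return sum(local_mgd(w, d) for d in clients) / len(clients)

# two clients whose data share the underlying model y = 2x
clients = [[(1.0, 2.0), (2.0, 4.0)], [(1.5, 3.0), (0.5, 1.0)]]
w = 0.0
for _ in range(30):
    w = federated_round(w, clients)
# w converges toward the shared solution w = 2
```

The only change relative to plain FL with local GD is the momentum accumulator `v` in the local update, which is exactly where the paper's accelerated convergence analysis applies.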

Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors

Title Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors
Authors Jakub Nalepa, Pablo Ribalta Lorenzo, Michal Marcinkiewicz, Barbara Bobek-Billewicz, Pawel Wawrzyniak, Maksym Walczak, Michal Kawulok, Wojciech Dudzik, Grzegorz Mrukwa, Pawel Ulrych, Michael P. Hayball
Abstract Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) plays an important role in the diagnosis and grading of brain tumors. Although manual DCE biomarker extraction algorithms boost the diagnostic yield of DCE-MRI by providing quantitative information on tumor prognosis and prediction, they are time-consuming and prone to human error. In this paper, we propose a fully-automated, end-to-end system for DCE-MRI analysis of brain tumors. Our deep learning-powered technique does not require any user interaction, yields reproducible results, and is rigorously validated against benchmark (BraTS’17 for tumor segmentation, and a test dataset released by the Quantitative Imaging Biomarkers Alliance for contrast-concentration fitting) and clinical (44 low-grade glioma patients) data. Also, we introduce a cubic model of the vascular input function used for pharmacokinetic modeling, which significantly decreases the fitting error compared with the state of the art, alongside a real-time algorithm for determining the vascular input region. An extensive experimental study, backed up with statistical tests, showed that our system delivers state-of-the-art results (in terms of segmentation accuracy and contrast-concentration fitting) while requiring less than 3 minutes to process an entire input DCE-MRI study using a single GPU.
Tasks
Published 2019-07-18
URL https://arxiv.org/abs/1907.08303v1
PDF https://arxiv.org/pdf/1907.08303v1.pdf
PWC https://paperswithcode.com/paper/fully-automated-deep-learning-powered-system
Repo
Framework
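The abstract does not specify the details of the cubic vascular input function model, but as a generic sketch, fitting a cubic to sampled contrast-concentration values reduces to a small least-squares problem. The sample times and the underlying cubic below are hypothetical:

```python
def fit_cubic(ts, ys):
    """Least-squares cubic fit via the normal equations, solved with
    Gaussian elimination and partial pivoting (stdlib only)."""
    n = len(ts)
    X = [[t ** k for k in range(4)] for t in ts]          # design matrix
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(4)]
         for r in range(4)]                               # X^T X
    b = [sum(X[i][r] * ys[i] for i in range(n)) for r in range(4)]  # X^T y
    for col in range(4):                                  # forward elimination
        piv = max(range(col, 4), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 4):
            f = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * 4                                    # back substitution
    for r in range(3, -1, -1):
        coeffs[r] = (b[r] - sum(A[r][c] * coeffs[c]
                                for c in range(r + 1, 4))) / A[r][r]
    return coeffs

# hypothetical concentration samples generated from a known cubic
true = [1.0, 2.0, -0.5, 0.1]
ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [sum(c * t ** k for k, c in enumerate(true)) for t in ts]
fit = fit_cubic(ts, ys)
```

A low-degree parametric form like this is cheap to fit per voxel, which is consistent with the paper's real-time processing claims.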

Protein Classification using Machine Learning and Statistical Techniques: A Comparative Analysis

Title Protein Classification using Machine Learning and Statistical Techniques: A Comparative Analysis
Authors Chhote Lal Prasad Gupta, Anand Bihari, Sudhakar Tripathi
Abstract In the current era, predicting the enzyme class of an unknown protein is one of the challenging tasks in bioinformatics. The number of known proteins grows daily; as a result, enzyme class prediction presents a new opportunity for bioinformatics researchers. The prime objective of this article is to apply machine learning classification techniques for feature selection and prediction, and to identify an appropriate classification technique for function prediction. Seven different classification techniques, CRT, QUEST, CHAID, C5.0, ANN (Artificial Neural Network), SVM, and Bayesian, were applied to 4,368 protein records extracted from the UniProtKB databank and categorized into six classes. The protein data are high-dimensional sequence data containing a maximum of 48 features. SPSS was used as the experimental tool to handle the high-dimensional sequential protein data with the different classification techniques. The techniques give different results for each model and show that the data are imbalanced for classes C4, C5, and C6, which degrades model performance: for these three classes, precision and recall are very low or negligible. The experimental results highlight that the C5.0 classification technique is best suited for protein feature classification and prediction, achieving 95.56% accuracy with high precision and recall. Finally, we conclude that the selected features can be used for function prediction.
Tasks Feature Selection
Published 2019-01-18
URL http://arxiv.org/abs/1901.06152v1
PDF http://arxiv.org/pdf/1901.06152v1.pdf
PWC https://paperswithcode.com/paper/protein-classification-using-machine-learning
Repo
Framework

Hallucinating Beyond Observation: Learning to Complete with Partial Observation and Unpaired Prior Knowledge

Title Hallucinating Beyond Observation: Learning to Complete with Partial Observation and Unpaired Prior Knowledge
Authors Chenyang Lu, Gijs Dubbelman
Abstract We propose a novel single-step training strategy that allows convolutional encoder-decoder networks with skip connections to complete partially observed data by means of hallucination. This strategy is demonstrated for the tasks of completing 2-D road layouts and 3-D vehicle shapes. As input, it takes data from a partially observed domain, for which no ground truth is available, and data from an unpaired prior knowledge domain, and trains the network in an end-to-end manner. Our single-step training strategy is compared against two state-of-the-art baselines, one using a two-step auto-encoder training strategy and one using an adversarial strategy. Our novel strategy achieves an improvement of up to +12.2% F-measure on the Cityscapes dataset. The learned network intrinsically generalizes better than the baselines on unseen datasets, demonstrated by an improvement of up to +23.8% F-measure on the unseen KITTI dataset. Moreover, our approach outperforms the baselines using the same backbone network on the 3-D shape completion benchmark by a margin of 0.006 Hamming distance.
Tasks
Published 2019-07-23
URL https://arxiv.org/abs/1907.09786v2
PDF https://arxiv.org/pdf/1907.09786v2.pdf
PWC https://paperswithcode.com/paper/hallucinating-beyond-observation-learning-to
Repo
Framework

Information Privacy Opinions on Twitter: A Cross-Language Study

Title Information Privacy Opinions on Twitter: A Cross-Language Study
Authors Felipe González, Andrea Figueroa, Claudia López, Cecilia Aragón
Abstract The Cambridge Analytica scandal triggered a conversation on Twitter about data practices and their implications. Our research proposes to leverage this conversation to extend the understanding of how information privacy is framed by users worldwide. We collected tweets about the scandal written in Spanish and English between April and July 2018. We created a word embedding to create a reduced multi-dimensional representation of the tweets in each language. For each embedding, we conducted open coding to characterize the semantic contexts of key concepts: “information”, “privacy”, “company” and “users” (and their Spanish translations). Through a comparative analysis, we found a broader emphasis on privacy-related words associated with companies in English. We also identified more terms related to data collection in English and fewer associated with security mechanisms, control, and risks. Our findings hint at the potential of cross-language comparisons of text to extend the understanding of worldwide differences in information privacy perspectives.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02852v1
PDF https://arxiv.org/pdf/1912.02852v1.pdf
PWC https://paperswithcode.com/paper/information-privacy-opinions-on-twitter-a
Repo
Framework

On Understanding Knowledge Graph Representation

Title On Understanding Knowledge Graph Representation
Authors Carl Allen, Ivana Balazevic, Timothy M. Hospedales
Abstract Many methods have been developed to represent knowledge graph data, which implicitly exploit low-rank latent structure in the data to encode known information and enable unknown facts to be inferred. To predict whether a relationship holds between entities, their embeddings are typically compared in the latent space following a relation-specific mapping. Whilst link prediction has steadily improved, the latent structure, and hence why such models capture semantic information, remains unexplained. We build on recent theoretical interpretation of word embeddings as a basis to consider an explicit structure for representations of relations between entities. For identifiable relation types, we are able to predict properties and justify the relative performance of leading knowledge graph representation methods, including their often overlooked ability to make independent predictions.
Tasks Link Prediction, Word Embeddings
Published 2019-09-25
URL https://arxiv.org/abs/1909.11611v1
PDF https://arxiv.org/pdf/1909.11611v1.pdf
PWC https://paperswithcode.com/paper/on-understanding-knowledge-graph-1
Repo
Framework
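The "relation-specific mapping" the abstract refers to can be made concrete with the translational (TransE-style) scoring used by several leading knowledge graph methods: a triple (h, r, t) is plausible when the head embedding translated by the relation embedding, h + r, lands near the tail embedding t. The 2-D embeddings below are made up purely for illustration:

```python
import math

def transe_score(h, r, t):
    """Plausibility of triple (h, r, t): negative Euclidean distance
    between the translated head h + r and the tail t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# toy 2-D embeddings (illustrative values only)
paris      = [0.9, 0.1]
berlin     = [0.1, 0.2]
france     = [1.0, 0.9]
capital_of = [0.1, 0.8]   # roughly france - paris

true_score  = transe_score(paris, capital_of, france)    # plausible triple
false_score = transe_score(berlin, capital_of, france)   # implausible triple
```

Link prediction then amounts to ranking candidate tails by this score; the paper's contribution is explaining the latent structure that makes such comparisons capture semantics.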

Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders

Title Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders
Authors Ioana Bica, Ahmed M. Alaa, Mihaela van der Schaar
Abstract The estimation of treatment effects is a pervasive problem in medicine. Existing methods for estimating treatment effects from longitudinal observational data assume that there are no hidden confounders. This assumption is not testable in practice and, if it does not hold, leads to biased estimates. In this paper, we develop the Time Series Deconfounder, a method that leverages the assignment of multiple treatments over time to enable the estimation of treatment effects even in the presence of hidden confounders. The Time Series Deconfounder uses a novel recurrent neural network architecture with multitask output to build a factor model over time and infer substitute confounders that render the assigned treatments conditionally independent. Then it performs causal inference using the substitute confounders. We provide a theoretical analysis for obtaining unbiased causal effects of time-varying exposures using the Time Series Deconfounder. Using simulations we show the effectiveness of our method in deconfounding the estimation of treatment responses in longitudinal data.
Tasks Causal Inference, Time Series
Published 2019-02-01
URL https://arxiv.org/abs/1902.00450v2
PDF https://arxiv.org/pdf/1902.00450v2.pdf
PWC https://paperswithcode.com/paper/time-series-deconfounder-estimating-treatment
Repo
Framework

Training Variational Autoencoders with Buffered Stochastic Variational Inference

Title Training Variational Autoencoders with Buffered Stochastic Variational Inference
Authors Rui Shu, Hung H. Bui, Jay Whang, Stefano Ermon
Abstract The recognition network in deep latent variable models such as variational autoencoders (VAEs) relies on amortized inference for efficient posterior approximation that can scale up to large datasets. However, this technique has also been demonstrated to select suboptimal variational parameters, often resulting in considerable additional error called the amortization gap. To close the amortization gap and improve the training of the generative model, recent works have introduced an additional refinement step that applies stochastic variational inference (SVI) to improve upon the variational parameters returned by the amortized inference model. In this paper, we propose Buffered Stochastic Variational Inference (BSVI), a new refinement procedure that makes use of SVI’s sequence of intermediate variational proposal distributions and their corresponding importance weights to construct a new generalized importance-weighted lower bound. We demonstrate empirically that training variational autoencoders with BSVI consistently outperforms SVI, yielding an improved training procedure for VAEs.
Tasks Latent Variable Models
Published 2019-02-27
URL http://arxiv.org/abs/1902.10294v1
PDF http://arxiv.org/pdf/1902.10294v1.pdf
PWC https://paperswithcode.com/paper/training-variational-autoencoders-with
Repo
Framework
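The family of generalized importance-weighted lower bounds BSVI builds on can be illustrated with the standard IWAE-style estimator: averaging K importance weights inside the log gives a bound at least as tight as the K=1 ELBO. The toy Gaussian model and deliberately suboptimal proposal below are invented to mimic an amortization gap; this sketches the bound family, not BSVI's buffered construction itself:

```python
import math
import random

random.seed(0)

def log_norm(x, mu, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

x = 1.0       # one observed data point
q_mu = 0.3    # suboptimal proposal mean, mimicking an amortization gap

def log_weight(z):
    # log p(x, z) - log q(z), with prior N(0, 1) and likelihood N(x; z, 1)
    return log_norm(z, 0.0, 1.0) + log_norm(x, z, 1.0) - log_norm(z, q_mu, 1.0)

def iw_bound(K, n_outer=4000):
    """Monte Carlo estimate of the K-sample importance-weighted lower bound."""
    total = 0.0
    for _ in range(n_outer):
        ws = [log_weight(random.gauss(q_mu, 1.0)) for _ in range(K)]
        m = max(ws)   # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(w - m) for w in ws) / K)
    return total / n_outer

elbo = iw_bound(K=1)    # standard ELBO
iwb  = iw_bound(K=10)   # tighter importance-weighted bound
```

BSVI's twist is that the K proposals are not i.i.d. draws from one q but the sequence of intermediate distributions produced by SVI refinement, reusing computation that would otherwise be discarded.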