October 17, 2019

Paper Group ANR 825

Supervised Domain Enablement Attention for Personalized Domain Classification

Title Supervised Domain Enablement Attention for Personalized Domain Classification
Authors Joo-Kyung Kim, Young-Bum Kim
Abstract In large-scale domain classification for natural language understanding, leveraging each user’s domain enablement information, which refers to the domains preferred or authenticated by the user, with an attention mechanism has been shown to improve the overall domain classification performance. In this paper, we propose a supervised enablement attention mechanism, which utilizes sigmoid activation for the attention weighting so that the attention can be computed with more expressive power without the weight sum constraint of softmax attention. The attention weights are explicitly encouraged to be similar to the corresponding elements of the ground-truth’s one-hot vector by supervised attention, and the attention information of the other enabled domains is leveraged through self-distillation. By evaluating on the actual utterances from a large-scale IPDA, we show that our approach significantly improves domain classification performance.
Tasks
Published 2018-12-18
URL http://arxiv.org/abs/1812.07546v1
PDF http://arxiv.org/pdf/1812.07546v1.pdf
PWC https://paperswithcode.com/paper/supervised-domain-enablement-attention-for
Repo
Framework
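
The contrast the abstract draws between softmax and sigmoid attention weights can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example scores, and the choice of binary cross-entropy as the supervision signal are assumptions made here for illustration only.

```python
import math

def softmax_weights(scores):
    # Softmax attention: weights are positive and constrained to sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid_weights(scores):
    # Sigmoid attention: each weight lies in (0, 1) independently,
    # so there is no sum-to-one constraint across enabled domains.
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

def supervised_attention_loss(weights, target_index):
    # ASSUMPTION: binary cross-entropy against the one-hot ground truth.
    # The abstract only says the weights are encouraged to match it.
    eps = 1e-12
    loss = 0.0
    for i, w in enumerate(weights):
        y = 1.0 if i == target_index else 0.0
        loss -= y * math.log(w + eps) + (1.0 - y) * math.log(1.0 - w + eps)
    return loss / len(weights)

# Hypothetical attention scores for three enabled domains.
scores = [2.0, 1.5, -3.0]
soft = softmax_weights(scores)
sig = sigmoid_weights(scores)
```

With these scores, two domains can both receive a high sigmoid weight, whereas softmax forces them to share a total mass of 1.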

Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy

Title Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy
Authors Florian Scheidegger, Roxana Istrate, Giovanni Mariani, Luca Benini, Costas Bekas, Cristiano Malossi
Abstract In the deep-learning community, new algorithms are published at an incredible pace. Therefore, solving an image classification problem for new datasets becomes a challenging task, as it requires re-evaluating published algorithms and their different configurations in order to find a close-to-optimal classifier. To facilitate this process, before biasing our decision towards a class of neural networks or running an expensive search over the network space, we propose to estimate the classification difficulty of the dataset. Our method computes a single number that characterizes the dataset difficulty 27x faster than training state-of-the-art networks. The proposed method can be used in combination with network topology and hyper-parameter search optimizers to efficiently drive the search towards promising neural-network configurations.
Tasks Image Classification
Published 2018-03-26
URL http://arxiv.org/abs/1803.09588v1
PDF http://arxiv.org/pdf/1803.09588v1.pdf
PWC https://paperswithcode.com/paper/efficient-image-dataset-classification
Repo
Framework

A Genetic Programming Framework for 2D Platform AI

Title A Genetic Programming Framework for 2D Platform AI
Authors Swen E. Gaudl
Abstract There currently exists a wide range of techniques to model and evolve artificial players for games. Existing techniques range from black box neural networks to entirely hand-designed solutions. In this paper, we demonstrate the feasibility of a genetic programming framework using human controller input to derive meaningful artificial players which can, later on, be optimised by hand. The current state of the art in game character design relies heavily on human designers to manually create and edit scripts and rules for game characters. To address this manual editing bottleneck, current computational intelligence techniques approach the issue with fully autonomous character generators, replacing most of the design process using black box solutions such as neural networks or the like. Our GP approach to this problem creates character controllers which can be further authored and developed by a designer. It also allows designers to include their play style without the need to use a programming language. This keeps the designer in the loop while reducing repetitive manual labour. Our system also provides insights into how players express themselves in games and into deriving appropriate models for representing those insights. We present our framework, supporting findings and open challenges.
Tasks
Published 2018-03-05
URL http://arxiv.org/abs/1803.01648v1
PDF http://arxiv.org/pdf/1803.01648v1.pdf
PWC https://paperswithcode.com/paper/a-genetic-programming-framework-for-2d
Repo
Framework

Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks

Title Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks
Authors Bin Han, Lianghai Ji, Hans D. Schotten
Abstract In the context of Fifth Generation (5G) mobile networks, the concept of “Slice as a Service” (SlaaS) encourages mobile network operators to flexibly share infrastructure with mobile service providers and stakeholders. However, it also creates an emerging demand for efficient online algorithms to optimize the request-and-decision-based inter-slice resource management strategy. Based on genetic algorithms, this paper presents a novel online optimizer that efficiently approaches the ideal slicing strategy with maximized long-term network utility. The proposed method encodes slicing strategies into binary sequences to cope with the request-and-decision mechanism. It requires no a priori knowledge about the traffic/utility models, and therefore supports heterogeneous slices, while providing solid effectiveness, good robustness against non-stationary service scenarios, and high scalability.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04491v3
PDF http://arxiv.org/pdf/1802.04491v3.pdf
PWC https://paperswithcode.com/paper/slice-as-an-evolutionary-service-genetic
Repo
Framework

Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space

Title Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space
Authors Geoffrey Wolfer, Aryeh Kontorovich
Abstract We investigate the statistical complexity of estimating the parameters of a discrete-state Markov chain kernel from a single long sequence of state observations. In the finite case, we characterize (modulo logarithmic factors) the minimax sample complexity of estimation with respect to the operator infinity norm, while in the countably infinite case, we analyze the problem with respect to a natural entry-wise norm derived from total variation. We show that in both cases, the sample complexity is governed by the mixing properties of the unknown chain, for which, in the finite-state case, there are known finite-sample estimators with fully empirical confidence intervals.
Tasks
Published 2018-09-13
URL https://arxiv.org/abs/1809.05014v3
PDF https://arxiv.org/pdf/1809.05014v3.pdf
PWC https://paperswithcode.com/paper/minimax-learning-of-ergodic-markov-chains
Repo
Framework
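
To make the estimation problem in the abstract concrete, here is a minimal sketch of the plain count-based estimate of a transition kernel from one observed trajectory, together with the operator infinity norm (maximum absolute row sum) used to measure error in the finite case. It is a textbook illustration, not necessarily the estimator analyzed in the paper; the function names and the uniform fallback for unvisited states are assumptions.

```python
from collections import Counter

def estimate_kernel(sequence, num_states):
    # Count transitions i -> j along the single observed trajectory.
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # Number of times each state is left (the last observation has no successor).
    visit_counts = Counter(sequence[:-1])
    kernel = []
    for i in range(num_states):
        n_i = visit_counts[i]
        if n_i == 0:
            # ASSUMPTION: fall back to a uniform row for states never visited.
            row = [1.0 / num_states] * num_states
        else:
            row = [pair_counts[(i, j)] / n_i for j in range(num_states)]
        kernel.append(row)
    return kernel

def operator_inf_norm_error(a, b):
    # Operator infinity norm of (a - b): the largest absolute row sum.
    return max(
        sum(abs(x - y) for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(a, b)
    )

# Hypothetical two-state trajectory.
observed = [0, 1, 0, 1, 1, 0]
estimated = estimate_kernel(observed, num_states=2)
error = operator_inf_norm_error(estimated, [[0.0, 1.0], [0.5, 0.5]])
```

How long the trajectory must be for this error to be small is what the abstract ties to the mixing properties of the chain.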

Concept Learning with Energy-Based Models

Title Concept Learning with Energy-Based Models
Authors Igor Mordatch
Abstract Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and capacity for language require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning. We present a framework that defines a concept by an energy function over events in the environment, as well as an attention mask over entities participating in the event. Given a few demonstration events, our method uses an inference-time optimization procedure to generate events involving similar concepts or identify entities involved in the concept. We evaluate our framework on learning visual, quantitative, relational, and temporal concepts from demonstration events in an unsupervised manner. Our approach is able to successfully generate and identify concepts in a few-shot setting, and the resulting learned concepts can be reused across environments. Example videos of our results are available at sites.google.com/site/energyconceptmodels.
Tasks
Published 2018-11-06
URL http://arxiv.org/abs/1811.02486v1
PDF http://arxiv.org/pdf/1811.02486v1.pdf
PWC https://paperswithcode.com/paper/concept-learning-with-energy-based-models
Repo
Framework

On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection

Title On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Authors Vivian Lai, Chenhao Tan
Abstract Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction, to medical diagnosis, to fighting against fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, these tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation, and develop varying levels of machine assistance along the spectrum that gradually increase the influence of machine predictions. We find that without showing predicted labels, explanations alone slightly improve human performance in the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a similar level of accuracy as an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff.
Tasks Deception Detection, Medical Diagnosis
Published 2018-11-19
URL http://arxiv.org/abs/1811.07901v4
PDF http://arxiv.org/pdf/1811.07901v4.pdf
PWC https://paperswithcode.com/paper/on-human-predictions-with-explanations-and
Repo
Framework

A characterization of the Edge of Criticality in Binary Echo State Networks

Title A characterization of the Edge of Criticality in Binary Echo State Networks
Authors Pietro Verzelli, Lorenzo Livi, Cesare Alippi
Abstract Echo State Networks (ESNs) are simplified recurrent neural network models composed of a reservoir and a linear, trainable readout layer. The reservoir is tunable by some hyper-parameters that control the network behaviour. ESNs are known to be effective in solving tasks when configured on a region in (hyper-)parameter space called Edge of Criticality (EoC), where the system is maximally sensitive to perturbations hence affecting its behaviour. In this paper, we propose binary ESNs, which are architecturally equivalent to standard ESNs but consider binary activation functions and binary recurrent weights. For these networks, we derive a closed-form expression for the EoC in the autonomous case and perform simulations in order to assess their behavior in the case of noisy neurons and in the presence of a signal. We propose a theoretical explanation for the fact that the variance of the input plays a major role in characterizing the EoC.
Tasks
Published 2018-10-03
URL http://arxiv.org/abs/1810.01742v1
PDF http://arxiv.org/pdf/1810.01742v1.pdf
PWC https://paperswithcode.com/paper/a-characterization-of-the-edge-of-criticality
Repo
Framework

Convergence Problems with Generative Adversarial Networks (GANs)

Title Convergence Problems with Generative Adversarial Networks (GANs)
Authors Samuel A. Barnett
Abstract Generative adversarial networks (GANs) are a novel approach to generative modelling, a task whose goal it is to learn a distribution of real data points. They have often proved difficult to train: GANs are unlike many techniques in machine learning, in that they are best described as a two-player game between a discriminator and generator. This has yielded both unreliability in the training process, and a general lack of understanding as to how GANs converge, and if so, to what. The purpose of this dissertation is to provide an account of the theory of GANs suitable for the mathematician, highlighting both positive and negative results. This involves identifying the problems when training GANs, and how topological and game-theoretic perspectives of GANs have contributed to our understanding and improved our techniques in recent years.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11382v1
PDF http://arxiv.org/pdf/1806.11382v1.pdf
PWC https://paperswithcode.com/paper/convergence-problems-with-generative
Repo
Framework

A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation

Title A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
Authors Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
Abstract The convergence rate and final performance of common deep learning models have significantly benefited from heuristics such as learning rate schedules, knowledge distillation, skip connections, and normalization layers. In the absence of theoretical underpinnings, controlled experiments aimed at explaining these strategies can aid our understanding of deep learning landscapes and the training dynamics. Existing approaches for empirical analysis rely on tools of linear interpolation and visualizations with dimensionality reduction, each with their limitations. Instead, we revisit such analysis of heuristics through the lens of recently proposed methods for loss surface and representation analysis, viz., mode connectivity and canonical correlation analysis (CCA), and hypothesize reasons for the success of the heuristics. In particular, we explore knowledge distillation and learning rate heuristics of (cosine) restarts and warmup using mode connectivity and CCA. Our empirical analysis suggests that: (a) the reasons often quoted for the success of cosine annealing are not evidenced in practice; (b) that the effect of learning rate warmup is to prevent the deeper layers from creating training instability; and (c) that the latent knowledge shared by the teacher is primarily disbursed to the deeper layers.
Tasks Dimensionality Reduction
Published 2018-10-29
URL https://arxiv.org/abs/1810.13243v1
PDF https://arxiv.org/pdf/1810.13243v1.pdf
PWC https://paperswithcode.com/paper/a-closer-look-at-deep-learning-heuristics-1
Repo
Framework
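
For readers unfamiliar with the learning rate heuristics the abstract examines, the following is a minimal sketch of a schedule that combines linear warmup with cosine annealing and periodic restarts. All parameter names and default values are assumptions chosen for illustration, and the fixed cycle length is a simplification; the paper's exact schedules may differ.

```python
import math

def learning_rate(step, base_lr=0.1, min_lr=0.0, warmup_steps=5, cycle_length=10):
    # Linear warmup: ramp from base_lr / warmup_steps up to base_lr.
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    # Position inside the current cosine cycle; wrapping to 0 is a restart.
    t = (step - warmup_steps) % cycle_length
    # Cosine annealing from base_lr towards min_lr within one cycle.
    cosine = 0.5 * (1.0 + math.cos(math.pi * t / cycle_length))
    return min_lr + (base_lr - min_lr) * cosine

# Learning rates for the first 25 steps: warmup, then two full cosine cycles.
schedule = [learning_rate(s) for s in range(25)]
```

The rate climbs during warmup, decays along a half cosine, and jumps back to `base_lr` at each restart.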

Robust Machine Comprehension Models via Adversarial Training

Title Robust Machine Comprehension Models via Adversarial Training
Authors Yicheng Wang, Mohit Bansal
Abstract It is shown that many published models for the Stanford Question Answering Dataset (Rajpurkar et al., 2016) lack robustness, suffering an over 50% decrease in F1 score during adversarial evaluation based on the AddSent (Jia and Liang, 2017) algorithm. It has also been shown that retraining models on data generated by AddSent has limited effect on their robustness. We propose a novel alternative adversary-generation algorithm, AddSentDiverse, that significantly increases the variance within the adversarial training data by providing effective examples that punish the model for making certain superficial assumptions. Further, in order to improve robustness to AddSent’s semantic perturbations (e.g., antonyms), we jointly improve the model’s semantic-relationship learning capabilities in addition to our AddSentDiverse-based adversarial training data augmentation. With these additions, we show that we can make a state-of-the-art model significantly more robust, achieving a 36.5% increase in F1 score under many different types of adversarial evaluation while maintaining performance on the regular SQuAD task.
Tasks Data Augmentation, Question Answering, Reading Comprehension
Published 2018-04-17
URL http://arxiv.org/abs/1804.06473v1
PDF http://arxiv.org/pdf/1804.06473v1.pdf
PWC https://paperswithcode.com/paper/robust-machine-comprehension-models-via
Repo
Framework

Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data

Title Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data
Authors Samuel J. Weisenthal, Caroline Quill, Samir Farooq, Henry Kautz, Martin S. Zand
Abstract Acute Kidney Injury (AKI), a sudden decline in kidney function, is associated with increased mortality, morbidity, length of stay, and hospital cost. Since AKI is sometimes preventable, there is great interest in prediction. Most existing studies consider all patients and therefore restrict to features available in the first hours of hospitalization. Here, the focus is instead on rehospitalized patients, a cohort in which rich longitudinal features from prior hospitalizations can be analyzed. Our objective is to provide a risk score directly at hospital re-entry. Gradient boosting, penalized logistic regression (with and without stability selection), and a recurrent neural network are trained on two years of adult inpatient EHR data (3,387 attributes for 34,505 patients who generated 90,013 training samples with 5,618 cases and 84,395 controls). Predictions are internally evaluated with 50 iterations of 5-fold grouped cross-validation with special emphasis on calibration, an analysis of which is performed at the patient as well as hospitalization level. Error is assessed with respect to diagnosis, race, age, gender, AKI identification method, and hospital utilization. In an additional experiment, the regularization penalty is severely increased to induce parsimony and interpretability. Predictors identified for rehospitalized patients are also reported with a special analysis of medications that might be modifiable risk factors. Insights from this study might be used to construct a predictive tool for AKI in rehospitalized patients. An accurate estimate of AKI risk at hospital entry might serve as a prior for an admitting provider or another predictive algorithm.
Tasks Calibration
Published 2018-07-25
URL http://arxiv.org/abs/1807.09865v2
PDF http://arxiv.org/pdf/1807.09865v2.pdf
PWC https://paperswithcode.com/paper/predicting-acute-kidney-injury-at-hospital-re
Repo
Framework

Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval

Title Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval
Authors Sonakshi Mathur, Mallika Chaudhary, Hemant Verma, Murari Mandal, S. K. Vipparthi, Subrahmanyam Murala
Abstract A novel color feature descriptor, Multichannel Distributed Local Pattern (MDLP), is proposed in this manuscript. The MDLP combines the salient features of both local binary and local mesh patterns in the neighborhood. The multi-distance information computed by the MDLP aids in robust extraction of the texture arrangement. Further, MDLP features are extracted for each color channel of an image. The retrieval performance of the MDLP is evaluated on three benchmark datasets for CBIR, namely Corel-5000, Corel-10000 and MIT-Color Vistex. The proposed technique attains substantial improvement as compared to other state-of-the-art feature descriptors in terms of various evaluation parameters such as ARP and ARR on the respective databases.
Tasks
Published 2018-05-07
URL http://arxiv.org/abs/1805.02679v1
PDF http://arxiv.org/pdf/1805.02679v1.pdf
PWC https://paperswithcode.com/paper/multichannel-distributed-local-pattern-for
Repo
Framework

Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV

Title Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV
Authors Xiaoliang Wang, Peng Cheng, Xinchuan Liu, Benedict Uzochukwu
Abstract Unmanned Aerial Vehicles (UAVs) have intrigued people from all walks of life because of their pervasive computing capabilities. A UAV equipped with vision techniques could be leveraged to establish autonomous navigation control for the UAV itself. Also, object detection from UAVs could broaden the use of drones to provide ubiquitous surveillance and monitoring services for military operations, urban administration and agriculture management. As data-driven technologies have evolved, machine learning algorithms, especially deep learning approaches, have been intensively used to solve different traditional computer vision research problems. Modern Convolutional Neural Network based object detectors can be divided into two major categories: one-stage object detectors and two-stage object detectors. In this study, we use some representative CNN-based object detectors to carry out the computer vision task on the Stanford Drone Dataset (SDD). State-of-the-art performance has been achieved using an approach based on RetinaNet, the focal-loss dense detector, for object detection from UAVs in a fast and accurate manner.
Tasks Object Detection
Published 2018-08-16
URL http://arxiv.org/abs/1808.05756v2
PDF http://arxiv.org/pdf/1808.05756v2.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-convolutional-neural
Repo
Framework

How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology

Title How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology
Authors Dhaval Adjodah, Dan Calacci, Abhimanyu Dubey, Peter Krafft, Esteban Moro, Alex ‘Sandy’ Pentland
Abstract In this empirical paper, we investigate how learning agents can be arranged in more efficient communication topologies for improved learning. This is an important problem because a common technique to improve speed and robustness of learning in deep reinforcement learning and many other machine learning algorithms is to run multiple learning agents in parallel. The standard communication architecture typically involves all agents intermittently communicating with each other (fully connected topology) or with a centralized server (star topology). Unfortunately, optimizing the topology of communication over the space of all possible graphs is a hard problem, so we borrow results from the networked optimization and collective intelligence literatures which suggest that certain families of network topologies can lead to strong improvements over fully-connected networks. We start by introducing alternative network topologies to DRL benchmark tasks under the Evolution Strategies paradigm which we call Network Evolution Strategies. We explore the relative performance of the four main graph families and observe that one such family (Erdos-Renyi random graphs) empirically outperforms all other families, including the de facto fully-connected communication topologies. Additionally, the use of alternative network topologies has a multiplicative performance effect: we observe that when 1000 learning agents are arranged in a carefully designed communication topology, they can compete with 3000 agents arranged in the de facto fully-connected topology. Overall, our work suggests that distributed machine learning algorithms would learn more efficiently if the communication topology between learning agents was optimized.
Tasks
Published 2018-11-30
URL http://arxiv.org/abs/1811.12556v2
PDF http://arxiv.org/pdf/1811.12556v2.pdf
PWC https://paperswithcode.com/paper/how-to-organize-your-deep-reinforcement
Repo
Framework
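
The topologies compared in the abstract can be pictured with a minimal sketch that builds an Erdos-Renyi random communication graph and the fully-connected baseline over the same set of agents. This only illustrates the graph construction; it is not the authors' Network Evolution Strategies code, and the function names, agent count, edge probability, and seed are assumptions.

```python
import random

def erdos_renyi_topology(num_agents, edge_prob, seed=0):
    # Each unordered pair of agents is linked independently with
    # probability edge_prob (the G(n, p) random graph model).
    rng = random.Random(seed)
    neighbors = {a: set() for a in range(num_agents)}
    for a in range(num_agents):
        for b in range(a + 1, num_agents):
            if rng.random() < edge_prob:
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors

def fully_connected_topology(num_agents):
    # Baseline: every agent communicates with every other agent.
    return {a: set(range(num_agents)) - {a} for a in range(num_agents)}

def count_edges(neighbors):
    # Each undirected link appears in two neighbor sets.
    return sum(len(n) for n in neighbors.values()) // 2

# Hypothetical setup: 20 agents, sparse random links versus all-to-all.
sparse = erdos_renyi_topology(20, edge_prob=0.2, seed=0)
dense = fully_connected_topology(20)
```

In such a setup each agent would exchange updates only with the agents in its neighbor set, so the sparse graph uses far fewer communication links than the all-to-all baseline.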