Paper Group ANR 825
Supervised Domain Enablement Attention for Personalized Domain Classification. Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy. A Genetic Programming Framework for 2D Platform AI. Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks. Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space. Concept Learning with Energy-Based Models. On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection. A characterization of the Edge of Criticality in Binary Echo State Networks. Convergence Problems with Generative Adversarial Networks (GANs). A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation. Robust Machine Comprehension Models via Adversarial Training. Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data. Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval. Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV. How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology.
Supervised Domain Enablement Attention for Personalized Domain Classification
Title | Supervised Domain Enablement Attention for Personalized Domain Classification |
Authors | Joo-Kyung Kim, Young-Bum Kim |
Abstract | In large-scale domain classification for natural language understanding, leveraging each user’s domain enablement information, which refers to the domains preferred or authenticated by the user, with an attention mechanism has been shown to improve the overall domain classification performance. In this paper, we propose a supervised enablement attention mechanism, which utilizes sigmoid activation for the attention weighting so that the attention can be computed with more expressive power without the weight-sum constraint of softmax attention. The attention weights are explicitly encouraged to be similar to the corresponding elements of the ground-truth’s one-hot vector by supervised attention, and the attention information of the other enabled domains is leveraged through self-distillation. By evaluating on actual utterances from a large-scale intelligent personal digital assistant (IPDA), we show that our approach significantly improves domain classification performance. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07546v1 |
http://arxiv.org/pdf/1812.07546v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-domain-enablement-attention-for |
Repo | |
Framework | |
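The mechanism described in the abstract lends itself to a compact sketch. The NumPy snippet below is not the authors' code; the dimensions, scoring function, and binary cross-entropy attention loss are illustrative assumptions. It shows sigmoid-gated attention over a user's enabled domains plus a supervised attention term that pushes the weights toward the ground-truth one-hot vector; the self-distillation term is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def enablement_attention(query, enabled_domain_embs, ground_truth_onehot):
    """Sigmoid (not softmax) attention over a user's enabled domains.

    query:               (d,)   utterance representation
    enabled_domain_embs: (k, d) embeddings of the k enabled domains
    ground_truth_onehot: (k,)   1 for the true domain, 0 elsewhere
    """
    scores = enabled_domain_embs @ query            # (k,) unnormalized scores
    weights = sigmoid(scores)                       # sigmoid gating: no sum-to-one constraint
    summary = weights @ enabled_domain_embs         # weighted sum of enabled-domain embeddings

    # Supervised attention: push each weight toward the matching one-hot target
    # with a binary cross-entropy term (illustrative choice of loss).
    eps = 1e-8
    att_loss = -np.mean(
        ground_truth_onehot * np.log(weights + eps)
        + (1 - ground_truth_onehot) * np.log(1 - weights + eps)
    )
    return summary, weights, att_loss

# toy usage
rng = np.random.default_rng(0)
q = rng.normal(size=16)
E = rng.normal(size=(4, 16))          # user has 4 enabled domains
y = np.array([0.0, 1.0, 0.0, 0.0])    # second enabled domain is the ground truth
summary, w, loss = enablement_attention(q, E, y)
print(w, loss)
```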
Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy
Title | Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy |
Authors | Florian Scheidegger, Roxana Istrate, Giovanni Mariani, Luca Benini, Costas Bekas, Cristiano Malossi |
Abstract | In the deep-learning community, new algorithms are published at an incredible pace. Therefore, solving an image classification problem for new datasets becomes a challenging task, as it requires re-evaluating published algorithms and their different configurations in order to find a close-to-optimal classifier. To facilitate this process, before biasing our decision towards a class of neural networks or running an expensive search over the network space, we propose to estimate the classification difficulty of the dataset. Our method computes a single number that characterizes the dataset difficulty 27x faster than training state-of-the-art networks. The proposed method can be used in combination with network topology and hyper-parameter search optimizers to efficiently drive the search towards promising neural-network configurations. |
Tasks | Image Classification |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09588v1 |
http://arxiv.org/pdf/1803.09588v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-image-dataset-classification |
Repo | |
Framework | |
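The abstract does not reveal how the difficulty score is computed, so the sketch below is only a generic stand-in: a cheap k-nearest-neighbour probe whose error rate serves as a single-number difficulty proxy. It illustrates the idea of scoring a dataset far more cheaply than training a deep network, not the paper's estimator.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def difficulty_proxy(X, y, seed=0):
    """Return a single number in [0, 1]: 1 - accuracy of a cheap k-NN probe.

    Higher values suggest a harder dataset. This is a generic proxy,
    not the estimator proposed in the paper.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    return 1.0 - clf.score(X_te, y_te)

# toy usage on random data (expected to look "hard")
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 64))
y = rng.integers(0, 10, size=600)
print(difficulty_proxy(X, y))
```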
A Genetic Programming Framework for 2D Platform AI
Title | A Genetic Programming Framework for 2D Platform AI |
Authors | Swen E. Gaudl |
Abstract | There currently exists a wide range of techniques to model and evolve artificial players for games. Existing techniques range from black-box neural networks to entirely hand-designed solutions. In this paper, we demonstrate the feasibility of a genetic programming framework using human controller input to derive meaningful artificial players which can, later on, be optimised by hand. The current state of the art in game character design relies heavily on human designers to manually create and edit scripts and rules for game characters. To address this manual editing bottleneck, current computational intelligence techniques approach the issue with fully autonomous character generators, replacing most of the design process with black-box solutions such as neural networks or the like. Our GP approach to this problem creates character controllers which can be further authored and developed by a designer; it also allows designers to include their play style without the need to use a programming language. This keeps the designer in the loop while reducing repetitive manual labour. Our system also provides insights into how players express themselves in games and into deriving appropriate models for representing those insights. We present our framework, supporting findings and open challenges. |
Tasks | |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01648v1 |
http://arxiv.org/pdf/1803.01648v1.pdf | |
PWC | https://paperswithcode.com/paper/a-genetic-programming-framework-for-2d |
Repo | |
Framework | |
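As a rough illustration of the approach, the following toy genetic-programming loop evolves small expression-tree controllers whose fitness is agreement with recorded human controller input. The primitives, trace format, and evolutionary operators are placeholder assumptions, not the framework's actual design.

```python
import random

# Toy GP loop: controllers are tiny expression trees over game observations,
# scored by how often their chosen action matches recorded human play.

CONDITIONS = ["if_gap_ahead", "if_enemy_near"]
ACTIONS = ["jump", "move_right", "move_left"]

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return random.choice(ACTIONS)                       # terminal node: an action
    return [random.choice(CONDITIONS), random_tree(depth - 1), random_tree(depth - 1)]

def act(tree, obs):
    if isinstance(tree, str):
        return tree
    cond, then_branch, else_branch = tree
    return act(then_branch if obs.get(cond, False) else else_branch, obs)

def fitness(tree, trace):
    """Fraction of frames where the evolved controller reproduces the human action."""
    return sum(act(tree, obs) == action for obs, action in trace) / len(trace)

def evolve(trace, pop_size=50, gens=30):
    pop = [random_tree() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda t: fitness(t, trace), reverse=True)
        survivors = pop[: pop_size // 2]                    # truncation selection
        offspring = [random_tree() if random.random() < 0.3 else random.choice(survivors)
                     for _ in survivors]                    # crude mutation-by-replacement
        pop = survivors + offspring
    return max(pop, key=lambda t: fitness(t, trace))

# toy trace of (observation, human action) pairs recorded from play
trace = [({"if_gap_ahead": True}, "jump"),
         ({"if_enemy_near": True}, "move_left"),
         ({}, "move_right")] * 10
best = evolve(trace)
print(fitness(best, trace))
```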
Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks
Title | Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks |
Authors | Bin Han, Lianghai Ji, Hans D. Schotten |
Abstract | In the context of Fifth Generation (5G) mobile networks, the concept of “Slice as a Service” (SlaaS) encourages mobile network operators to flexibly share infrastructure with mobile service providers and stakeholders. However, it also raises an emerging demand for efficient online algorithms to optimize the request-and-decision-based inter-slice resource management strategy. Based on genetic algorithms, this paper presents a novel online optimizer that efficiently approaches the ideal slicing strategy with maximized long-term network utility. The proposed method encodes slicing strategies into binary sequences to cope with the request-and-decision mechanism. It requires no a priori knowledge about the traffic/utility models, and therefore supports heterogeneous slices, while providing solid effectiveness, good robustness against non-stationary service scenarios, and high scalability. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04491v3 |
http://arxiv.org/pdf/1802.04491v3.pdf | |
PWC | https://paperswithcode.com/paper/slice-as-an-evolutionary-service-genetic |
Repo | |
Framework | |
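A minimal sketch of the genetic optimization idea, assuming a placeholder utility model: slicing strategies are encoded as binary accept/reject sequences and evolved with one-point crossover and bit-flip mutation toward higher network utility. The request types, capacity budget, and utility function are illustrative, not the paper's model.

```python
import random

def utility(strategy, demand, revenue, capacity):
    """Placeholder utility: revenue of accepted requests, zero if overbooked."""
    used = sum(d for d, bit in zip(demand, strategy) if bit)
    if used > capacity:
        return 0.0
    return sum(r for r, bit in zip(revenue, strategy) if bit)

def evolve(demand, revenue, capacity, pop_size=40, gens=100, p_mut=0.05):
    n = len(demand)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda s: utility(s, demand, revenue, capacity), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)                                  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]    # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda s: utility(s, demand, revenue, capacity))

# toy instance: 8 request types competing for a shared capacity budget
demand = [4, 2, 5, 1, 3, 6, 2, 4]
revenue = [5, 3, 6, 1, 4, 7, 2, 5]
best = evolve(demand, revenue, capacity=12)
print(best, utility(best, demand, revenue, capacity=12))
```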
Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space
Title | Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space |
Authors | Geoffrey Wolfer, Aryeh Kontorovich |
Abstract | We investigate the statistical complexity of estimating the parameters of a discrete-state Markov chain kernel from a single long sequence of state observations. In the finite case, we characterize (modulo logarithmic factors) the minimax sample complexity of estimation with respect to the operator infinity norm, while in the countably infinite case, we analyze the problem with respect to a natural entry-wise norm derived from total variation. We show that in both cases, the sample complexity is governed by the mixing properties of the unknown chain, for which, in the finite-state case, there are known finite-sample estimators with fully empirical confidence intervals. |
Tasks | |
Published | 2018-09-13 |
URL | https://arxiv.org/abs/1809.05014v3 |
https://arxiv.org/pdf/1809.05014v3.pdf | |
PWC | https://paperswithcode.com/paper/minimax-learning-of-ergodic-markov-chains |
Repo | |
Framework | |
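The paper's minimax analysis and fully empirical confidence intervals cannot be reproduced from the abstract, but the object being estimated is easy to show: the plain maximum-likelihood (row-normalized count) estimate of a transition kernel from a single long trajectory, evaluated here in the operator infinity norm. The toy chain below is an illustrative assumption.

```python
import numpy as np

def empirical_kernel(trajectory, n_states, smoothing=0.0):
    """Row-normalized count matrix: the plain MLE of the transition kernel
    from a single observed state sequence (optionally Laplace-smoothed)."""
    counts = np.full((n_states, n_states), smoothing, dtype=float)
    for s, s_next in zip(trajectory[:-1], trajectory[1:]):
        counts[s, s_next] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0        # avoid division by zero for unvisited states
    return counts / row_sums

# toy usage: sample from a known 3-state chain, then estimate it back
rng = np.random.default_rng(0)
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])
traj = [0]
for _ in range(20000):
    traj.append(rng.choice(3, p=P[traj[-1]]))
P_hat = empirical_kernel(traj, 3)
print(np.abs(P_hat - P).sum(axis=1).max())   # operator infinity-norm error
```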
Concept Learning with Energy-Based Models
Title | Concept Learning with Energy-Based Models |
Authors | Igor Mordatch |
Abstract | Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and capacity for language, require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning. We present a framework that defines a concept by an energy function over events in the environment, as well as an attention mask over entities participating in the event. Given a few demonstration events, our method uses an inference-time optimization procedure to generate events involving similar concepts or to identify entities involved in the concept. We evaluate our framework on learning visual, quantitative, relational, and temporal concepts from demonstration events in an unsupervised manner. Our approach is able to successfully generate and identify concepts in a few-shot setting, and the resulting learned concepts can be reused across environments. Example videos of our results are available at sites.google.com/site/energyconceptmodels |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02486v1 |
http://arxiv.org/pdf/1811.02486v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-learning-with-energy-based-models |
Repo | |
Framework | |
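The core mechanic, inference as optimization of an energy function, can be illustrated with a toy example. Below, a hand-written quadratic energy encodes the concept "attended entities cluster together", and a new event is generated by gradient descent on that energy; the learned energy networks and the paper's exact procedure are not reproduced.

```python
import numpy as np

def energy(positions, attention):
    """Low energy when attended entities are near their attended centroid."""
    w = attention / attention.sum()
    center = (w[:, None] * positions).sum(axis=0)
    return float((w * ((positions - center) ** 2).sum(axis=1)).sum())

def grad_energy(positions, attention):
    # centroid treated as constant (a deliberate simplification of the true gradient)
    w = attention / attention.sum()
    center = (w[:, None] * positions).sum(axis=0)
    return 2.0 * w[:, None] * (positions - center)

def generate_event(n_entities, attention, steps=200, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_entities, 2))             # random initial event
    for _ in range(steps):
        x = x - lr * grad_energy(x, attention)       # inference-time optimization
    return x

attention = np.array([1.0, 1.0, 1.0, 0.0])           # concept involves the first 3 entities
event = generate_event(4, attention)
print(energy(event, attention))                      # near zero after optimization
```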
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Title | On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection |
Authors | Vivian Lai, Chenhao Tan |
Abstract | Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction, to medical diagnosis, to fighting against fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, these tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation, and develop varying levels of machine assistance along the spectrum that gradually increase the influence of machine predictions. We find that without showing predicted labels, explanations alone slightly improve human performance in the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a similar level of accuracy as an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff. |
Tasks | Deception Detection, Medical Diagnosis |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07901v4 |
http://arxiv.org/pdf/1811.07901v4.pdf | |
PWC | https://paperswithcode.com/paper/on-human-predictions-with-explanations-and |
Repo | |
Framework | |
A characterization of the Edge of Criticality in Binary Echo State Networks
Title | A characterization of the Edge of Criticality in Binary Echo State Networks |
Authors | Pietro Verzelli, Lorenzo Livi, Cesare Alippi |
Abstract | Echo State Networks (ESNs) are simplified recurrent neural network models composed of a reservoir and a linear, trainable readout layer. The reservoir is tunable by some hyper-parameters that control the network behaviour. ESNs are known to be effective in solving tasks when configured on a region in (hyper-)parameter space called the Edge of Criticality (EoC), where the system is maximally sensitive to perturbations, which in turn affect its behaviour. In this paper, we propose binary ESNs, which are architecturally equivalent to standard ESNs but consider binary activation functions and binary recurrent weights. For these networks, we derive a closed-form expression for the EoC in the autonomous case and perform simulations in order to assess their behavior in the case of noisy neurons and in the presence of a signal. We propose a theoretical explanation for the fact that the variance of the input plays a major role in characterizing the EoC. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01742v1 |
http://arxiv.org/pdf/1810.01742v1.pdf | |
PWC | https://paperswithcode.com/paper/a-characterization-of-the-edge-of-criticality |
Repo | |
Framework | |
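A small numerical sketch of a binary echo state network, assuming sign activations, random +/-1 weights, and a ridge-regression readout; the sizes, scaling, and readout recipe are illustrative choices, not the paper's configuration.

```python
import numpy as np

def run_binary_esn(u, n_res=200, seed=0):
    """Drive a reservoir with binary (+/-1) weights and sign activations."""
    rng = np.random.default_rng(seed)
    W = rng.choice([-1.0, 1.0], size=(n_res, n_res)) / np.sqrt(n_res)
    W_in = rng.choice([-1.0, 1.0], size=n_res)
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.sign(W @ x + W_in * u_t)              # binary activation function
        states.append(x.copy())
    return np.array(states)

def fit_readout(states, targets, ridge=1e-2):
    """Linear readout fit by ridge regression on collected reservoir states."""
    S = states
    return np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ targets)

# toy task: one-step-ahead prediction of a noisy sine input
t = np.arange(2000)
u = np.sin(0.1 * t) + 0.05 * np.random.default_rng(1).normal(size=t.size)
states = run_binary_esn(u)
w_out = fit_readout(states[:-1], u[1:])
pred = states[:-1] @ w_out
print(np.mean((pred - u[1:]) ** 2))
```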
Convergence Problems with Generative Adversarial Networks (GANs)
Title | Convergence Problems with Generative Adversarial Networks (GANs) |
Authors | Samuel A. Barnett |
Abstract | Generative adversarial networks (GANs) are a novel approach to generative modelling, a task whose goal it is to learn a distribution of real data points. They have often proved difficult to train: GANs are unlike many techniques in machine learning, in that they are best described as a two-player game between a discriminator and generator. This has yielded both unreliability in the training process, and a general lack of understanding as to how GANs converge, and if so, to what. The purpose of this dissertation is to provide an account of the theory of GANs suitable for the mathematician, highlighting both positive and negative results. This involves identifying the problems when training GANs, and how topological and game-theoretic perspectives of GANs have contributed to our understanding and improved our techniques in recent years. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11382v1 |
http://arxiv.org/pdf/1806.11382v1.pdf | |
PWC | https://paperswithcode.com/paper/convergence-problems-with-generative |
Repo | |
Framework | |
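For readers new to the framing, the standard two-player minimax objective that the dissertation's game-theoretic analysis starts from (the original GAN formulation) is:

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator D maximizes V while the generator G minimizes it; the convergence problems discussed above arise because simultaneous gradient updates on this game need not settle at an equilibrium.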
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
Title | A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation |
Authors | Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher |
Abstract | The convergence rate and final performance of common deep learning models have significantly benefited from heuristics such as learning rate schedules, knowledge distillation, skip connections, and normalization layers. In the absence of theoretical underpinnings, controlled experiments aimed at explaining these strategies can aid our understanding of deep learning landscapes and the training dynamics. Existing approaches for empirical analysis rely on tools of linear interpolation and visualizations with dimensionality reduction, each with their limitations. Instead, we revisit such analysis of heuristics through the lens of recently proposed methods for loss surface and representation analysis, viz., mode connectivity and canonical correlation analysis (CCA), and hypothesize reasons for the success of the heuristics. In particular, we explore knowledge distillation and learning rate heuristics of (cosine) restarts and warmup using mode connectivity and CCA. Our empirical analysis suggests that: (a) the reasons often quoted for the success of cosine annealing are not evidenced in practice; (b) that the effect of learning rate warmup is to prevent the deeper layers from creating training instability; and (c) that the latent knowledge shared by the teacher is primarily disbursed to the deeper layers. |
Tasks | Dimensionality Reduction |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.13243v1 |
https://arxiv.org/pdf/1810.13243v1.pdf | |
PWC | https://paperswithcode.com/paper/a-closer-look-at-deep-learning-heuristics-1 |
Repo | |
Framework | |
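The two learning-rate heuristics under study are simple to restate in code. The schedule below combines linear warmup with cosine annealing and warm restarts; the constants are placeholders, not the paper's experimental settings.

```python
import math

def lr_schedule(step, base_lr=0.1, warmup_steps=500, cycle_steps=2000, min_lr=1e-4):
    """Linear warmup followed by cosine annealing with warm restarts."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps           # linear warmup
    t = (step - warmup_steps) % cycle_steps                  # position within the cycle
    cos = 0.5 * (1.0 + math.cos(math.pi * t / cycle_steps))  # cosine decay
    return min_lr + (base_lr - min_lr) * cos                 # restarts when t wraps to 0

# print a few points of the schedule
for s in [0, 250, 500, 1500, 2499, 2500, 3500]:
    print(s, round(lr_schedule(s), 5))
```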
Robust Machine Comprehension Models via Adversarial Training
Title | Robust Machine Comprehension Models via Adversarial Training |
Authors | Yicheng Wang, Mohit Bansal |
Abstract | It is shown that many published models for the Stanford Question Answering Dataset (Rajpurkar et al., 2016) lack robustness, suffering an over 50% decrease in F1 score during adversarial evaluation based on the AddSent (Jia and Liang, 2017) algorithm. It has also been shown that retraining models on data generated by AddSent has limited effect on their robustness. We propose a novel alternative adversary-generation algorithm, AddSentDiverse, that significantly increases the variance within the adversarial training data by providing effective examples that punish the model for making certain superficial assumptions. Further, in order to improve robustness to AddSent’s semantic perturbations (e.g., antonyms), we jointly improve the model’s semantic-relationship learning capabilities in addition to our AddSentDiverse-based adversarial training data augmentation. With these additions, we show that we can make a state-of-the-art model significantly more robust, achieving a 36.5% increase in F1 score under many different types of adversarial evaluation while maintaining performance on the regular SQuAD task. |
Tasks | Data Augmentation, Question Answering, Reading Comprehension |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06473v1 |
http://arxiv.org/pdf/1804.06473v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-machine-comprehension-models-via |
Repo | |
Framework | |
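The AddSentDiverse procedure itself is not specified in the abstract, so the snippet below only sketches the general AddSent-style idea it builds on: appending a distractor sentence that mimics the question's wording but alters key details, so it cannot answer the question yet misleads models that rely on surface word overlap. The entities and substitutions are purely illustrative.

```python
# Schematic stand-in for AddSent-style adversarial augmentation, not the
# AddSentDiverse algorithm: build a distractor from the question's own wording
# with perturbed details, and append it to the context with the gold answer unchanged.

def add_distractor(example, fake_entity="Jeff Dean", fake_year="1889"):
    q = example["question"]
    distractor = (q.rstrip("?")
                   .replace("Who", fake_entity)      # wrong entity
                   .replace("1880", fake_year)       # perturbed detail
                  + ".")
    return {
        "context": example["context"] + " " + distractor,
        "question": q,
        "answer": example["answer"],                  # unchanged gold answer
    }

example = {
    "context": "Tesla moved to the city of Prague in 1880 to study.",
    "question": "Who moved to Prague in 1880?",
    "answer": "Tesla",
}
print(add_distractor(example)["context"])
# -> "... Jeff Dean moved to Prague in 1889."
```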
Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data
Title | Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data |
Authors | Samuel J. Weisenthal, Caroline Quill, Samir Farooq, Henry Kautz, Martin S. Zand |
Abstract | Acute Kidney Injury (AKI), a sudden decline in kidney function, is associated with increased mortality, morbidity, length of stay, and hospital cost. Since AKI is sometimes preventable, there is great interest in prediction. Most existing studies consider all patients and therefore restrict to features available in the first hours of hospitalization. Here, the focus is instead on rehospitalized patients, a cohort in which rich longitudinal features from prior hospitalizations can be analyzed. Our objective is to provide a risk score directly at hospital re-entry. Gradient boosting, penalized logistic regression (with and without stability selection), and a recurrent neural network are trained on two years of adult inpatient EHR data (3,387 attributes for 34,505 patients who generated 90,013 training samples with 5,618 cases and 84,395 controls). Predictions are internally evaluated with 50 iterations of 5-fold grouped cross-validation with special emphasis on calibration, an analysis of which is performed at the patient as well as hospitalization level. Error is assessed with respect to diagnosis, race, age, gender, AKI identification method, and hospital utilization. In an additional experiment, the regularization penalty is severely increased to induce parsimony and interpretability. Predictors identified for rehospitalized patients are also reported with a special analysis of medications that might be modifiable risk factors. Insights from this study might be used to construct a predictive tool for AKI in rehospitalized patients. An accurate estimate of AKI risk at hospital entry might serve as a prior for an admitting provider or another predictive algorithm. |
Tasks | Calibration |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09865v2 |
http://arxiv.org/pdf/1807.09865v2.pdf | |
PWC | https://paperswithcode.com/paper/predicting-acute-kidney-injury-at-hospital-re |
Repo | |
Framework | |
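The evaluation pattern described above (penalized logistic regression, grouped cross-validation by patient, calibration-aware scoring) can be sketched with scikit-learn on synthetic data; the features, hyper-parameters, and metrics below are placeholders, not the study's EHR setup.

```python
import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss

# synthetic stand-in for EHR features, labels, and patient identifiers
rng = np.random.default_rng(0)
n, d = 2000, 50
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 1.5).astype(int)
patient_id = rng.integers(0, 400, size=n)            # several samples per patient

# L1-penalized logistic regression, folds grouped by patient so the same
# patient never appears in both the training and test split
clf = LogisticRegression(penalty="l1", C=0.5, solver="liblinear", max_iter=1000)
aucs, briers = [], []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=patient_id):
    clf.fit(X[train_idx], y[train_idx])
    p = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], p))
    briers.append(brier_score_loss(y[test_idx], p))  # lower is better calibrated

print("AUC:   %.3f +/- %.3f" % (np.mean(aucs), np.std(aucs)))
print("Brier: %.3f +/- %.3f" % (np.mean(briers), np.std(briers)))
```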
Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval
Title | Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval |
Authors | Sonakshi Mathur, Mallika Chaudhary, Hemant Verma, Murari Mandal, S. K. Vipparthi, Subrahmanyam Murala |
Abstract | A novel color feature descriptor, the Multichannel Distributed Local Pattern (MDLP), is proposed in this manuscript. The MDLP combines the salient features of both local binary and local mesh patterns in the neighborhood. The multi-distance information computed by the MDLP aids in robust extraction of the texture arrangement. Further, MDLP features are extracted for each color channel of an image. The retrieval performance of the MDLP is evaluated on three benchmark datasets for CBIR, namely Corel-5000, Corel-10000 and MIT-Color Vistex, respectively. The proposed technique attains substantial improvement as compared to other state-of-the-art feature descriptors in terms of various evaluation parameters such as ARP and ARR on the respective databases. |
Tasks | |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02679v1 |
http://arxiv.org/pdf/1805.02679v1.pdf | |
PWC | https://paperswithcode.com/paper/multichannel-distributed-local-pattern-for |
Repo | |
Framework | |
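The full MDLP is not reproduced here, but its per-channel building block can be sketched: a plain local binary pattern histogram computed independently for each color channel and concatenated. The local mesh patterns and multi-distance neighbourhoods that MDLP adds are omitted.

```python
import numpy as np

def lbp_histogram(channel):
    """8-neighbour LBP codes for one channel, returned as a normalized 256-bin histogram."""
    c = channel.astype(np.int32)
    center = c[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = c[1 + dy:c.shape[0] - 1 + dy, 1 + dx:c.shape[1] - 1 + dx]
        code |= ((neighbour >= center).astype(np.int32) << bit)
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / hist.sum()

def multichannel_descriptor(image):
    """Concatenate per-channel LBP histograms (image is H x W x 3)."""
    return np.concatenate([lbp_histogram(image[:, :, k]) for k in range(image.shape[2])])

img = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3))
print(multichannel_descriptor(img).shape)   # (768,)
```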
Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV
Title | Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV |
Authors | Xiaoliang Wang, Peng Cheng, Xinchuan Liu, Benedict Uzochukwu |
Abstract | Unmanned Aerial Vehicles (UAVs) have intrigued people from all walks of life because of their pervasive computing capabilities. A UAV equipped with vision techniques can be leveraged to establish autonomous navigation control for the UAV itself. Moreover, object detection from a UAV can be used to broaden the utilization of drones to provide ubiquitous surveillance and monitoring services for military operation, urban administration and agriculture management. As data-driven technologies have evolved, machine learning algorithms, especially deep learning approaches, have been intensively utilized to solve traditional computer vision research problems. Modern Convolutional Neural Network based object detectors can be divided into two major categories: one-stage object detectors and two-stage object detectors. In this study, we utilize some representative CNN-based object detectors to execute the computer vision task over the Stanford Drone Dataset (SDD). State-of-the-art performance has been achieved by utilizing the focal-loss dense detector RetinaNet based approach for object detection from UAVs in a fast and accurate manner. |
Tasks | Object Detection |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05756v2 |
http://arxiv.org/pdf/1808.05756v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-convolutional-neural |
Repo | |
Framework | |
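For a concrete starting point, an off-the-shelf RetinaNet (the focal-loss dense detector mentioned above) can be run via torchvision. This inference-only sketch uses COCO-pretrained weights and a dummy image; fine-tuning on the Stanford Drone Dataset (class remapping, anchor tuning, and so on) is not shown and is assumed to follow separately.

```python
import torch
import torchvision

# Off-the-shelf RetinaNet as a stand-in for the detector used in the study.
# Note: on older torchvision releases, pass pretrained=True instead of weights=.
model = torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT")
model.eval()

# dummy aerial frame; replace with a real UAV image tensor scaled to [0, 1]
image = torch.rand(3, 800, 800)

with torch.no_grad():
    predictions = model([image])              # one dict per input image

boxes = predictions[0]["boxes"]               # (N, 4) boxes in xyxy format
scores = predictions[0]["scores"]             # (N,) confidence scores
labels = predictions[0]["labels"]             # (N,) COCO class indices
keep = scores > 0.5
print(boxes[keep].shape, labels[keep].tolist())
```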
How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology
Title | How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology |
Authors | Dhaval Adjodah, Dan Calacci, Abhimanyu Dubey, Peter Krafft, Esteban Moro, Alex 'Sandy' Pentland |
Abstract | In this empirical paper, we investigate how learning agents can be arranged in more efficient communication topologies for improved learning. This is an important problem because a common technique to improve speed and robustness of learning in deep reinforcement learning and many other machine learning algorithms is to run multiple learning agents in parallel. The standard communication architecture typically involves all agents intermittently communicating with each other (fully connected topology) or with a centralized server (star topology). Unfortunately, optimizing the topology of communication over the space of all possible graphs is a hard problem, so we borrow results from the networked optimization and collective intelligence literatures which suggest that certain families of network topologies can lead to strong improvements over fully-connected networks. We start by introducing alternative network topologies to DRL benchmark tasks under the Evolution Strategies paradigm which we call Network Evolution Strategies. We explore the relative performance of the four main graph families and observe that one such family (Erdos-Renyi random graphs) empirically outperforms all other families, including the de facto fully-connected communication topologies. Additionally, the use of alternative network topologies has a multiplicative performance effect: we observe that when 1000 learning agents are arranged in a carefully designed communication topology, they can compete with 3000 agents arranged in the de facto fully-connected topology. Overall, our work suggests that distributed machine learning algorithms would learn more efficiently if the communication topology between learning agents was optimized. |
Tasks | |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12556v2 |
http://arxiv.org/pdf/1811.12556v2.pdf | |
PWC | https://paperswithcode.com/paper/how-to-organize-your-deep-reinforcement |
Repo | |
Framework | |
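The topology idea can be illustrated with a consensus-style toy: agents hold parameter vectors and, instead of averaging over all peers (fully connected), each mixes only with its neighbours on an Erdos-Renyi communication graph. The update rule below is a generic placeholder, not the paper's Network Evolution Strategies algorithm.

```python
import numpy as np

def erdos_renyi_adjacency(n_agents, p_edge, seed=0):
    """Symmetric 0/1 adjacency matrix of an Erdos-Renyi graph with self-loops."""
    rng = np.random.default_rng(seed)
    A = (rng.random((n_agents, n_agents)) < p_edge).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                   # undirected graph
    np.fill_diagonal(A, 1.0)                      # every agent listens to itself
    return A

def communication_round(params, A):
    """Each agent replaces its parameters with the mean over its neighbourhood."""
    degrees = A.sum(axis=1, keepdims=True)
    return (A @ params) / degrees

n_agents, dim = 100, 10
A = erdos_renyi_adjacency(n_agents, p_edge=0.1)
params = np.random.default_rng(1).normal(size=(n_agents, dim))
for _ in range(5):
    params = communication_round(params, A)
print(params.std(axis=0).mean())                  # spread shrinks as agents mix
```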