Paper Group ANR 825
Supervised Domain Enablement Attention for Personalized Domain Classification. Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy. A Genetic Programming Framework for 2D Platform AI. Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks. Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space. Concept Learning with Energy-Based Models. On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection. A characterization of the Edge of Criticality in Binary Echo State Networks. Convergence Problems with Generative Adversarial Networks (GANs). A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation. Robust Machine Comprehension Models via Adversarial Training. Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data. Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval. Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV. How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology.
Supervised Domain Enablement Attention for Personalized Domain Classification
Title | Supervised Domain Enablement Attention for Personalized Domain Classification |
Authors | Joo-Kyung Kim, Young-Bum Kim |
Abstract | In large-scale domain classification for natural language understanding, leveraging each user’s domain enablement information, which refers to the domains preferred or authenticated by the user, with an attention mechanism has been shown to improve the overall domain classification performance. In this paper, we propose a supervised enablement attention mechanism, which utilizes sigmoid activation for the attention weighting so that the attention can be computed with more expressive power without the weight-sum constraint of softmax attention. The attention weights are explicitly encouraged to be similar to the corresponding elements of the ground-truth’s one-hot vector by supervised attention, and the attention information of the other enabled domains is leveraged through self-distillation. By evaluating on actual utterances from a large-scale intelligent personal digital assistant (IPDA), we show that our approach significantly improves domain classification performance. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07546v1 |
http://arxiv.org/pdf/1812.07546v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-domain-enablement-attention-for |
Repo | |
Framework | |
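The mechanism described in the abstract lends itself to a compact sketch. The NumPy snippet below is not the authors' code; the dimensions, scoring function, and binary cross-entropy attention loss are illustrative assumptions. It shows sigmoid-gated attention over a user's enabled domains plus a supervised attention term that pushes the weights toward the ground-truth one-hot vector; the self-distillation term is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def enablement_attention(query, enabled_domain_embs, ground_truth_onehot):
    """Sigmoid (not softmax) attention over a user's enabled domains.

    query:               (d,)   utterance representation
    enabled_domain_embs: (k, d) embeddings of the k enabled domains
    ground_truth_onehot: (k,)   1 for the true domain, 0 elsewhere
    """
    scores = enabled_domain_embs @ query            # (k,) unnormalized scores
    weights = sigmoid(scores)                       # sigmoid gating: no sum-to-one constraint
    summary = weights @ enabled_domain_embs         # weighted sum of enabled-domain embeddings

    # Supervised attention: push each weight toward the matching one-hot target
    # with a binary cross-entropy term (illustrative choice of loss).
    eps = 1e-8
    att_loss = -np.mean(
        ground_truth_onehot * np.log(weights + eps)
        + (1 - ground_truth_onehot) * np.log(1 - weights + eps)
    )
    return summary, weights, att_loss

# toy usage
rng = np.random.default_rng(0)
q = rng.normal(size=16)
E = rng.normal(size=(4, 16))          # user has 4 enabled domains
y = np.array([0.0, 1.0, 0.0, 0.0])    # second enabled domain is the ground truth
summary, w, loss = enablement_attention(q, E, y)
print(w, loss)
```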
Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy
Title | Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy |
Authors | Florian Scheidegger, Roxana Istrate, Giovanni Mariani, Luca Benini, Costas Bekas, Cristiano Malossi |
Abstract | In the deep-learning community, new algorithms are published at an incredible pace. Therefore, solving an image classification problem for new datasets becomes a challenging task, as it requires re-evaluating published algorithms and their different configurations in order to find a close-to-optimal classifier. To facilitate this process, before biasing our decision towards a class of neural networks or running an expensive search over the network space, we propose to estimate the classification difficulty of the dataset. Our method computes a single number that characterizes the dataset difficulty 27x faster than training state-of-the-art networks. The proposed method can be used in combination with network topology and hyper-parameter search optimizers to efficiently drive the search towards promising neural-network configurations. |
Tasks | Image Classification |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09588v1 |
http://arxiv.org/pdf/1803.09588v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-image-dataset-classification |
Repo | |
Framework | |
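The abstract does not reveal how the difficulty score is computed, so the sketch below is only a generic stand-in: a cheap k-nearest-neighbour probe whose error rate serves as a single-number difficulty proxy. It illustrates the idea of scoring a dataset far more cheaply than training a deep network, not the paper's estimator.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def difficulty_proxy(X, y, seed=0):
    """Return a single number in [0, 1]: 1 - accuracy of a cheap k-NN probe.

    Higher values suggest a harder dataset. This is a generic proxy,
    not the estimator proposed in the paper.
    """
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
    return 1.0 - clf.score(X_te, y_te)

# toy usage on random data (expected to look "hard")
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 64))
y = rng.integers(0, 10, size=600)
print(difficulty_proxy(X, y))
```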
A Genetic Programming Framework for 2D Platform AI
Title | A Genetic Programming Framework for 2D Platform AI |
Authors | Swen E. Gaudl |
Abstract | There currently exists a wide range of techniques to model and evolve artificial players for games. Existing techniques range from black-box neural networks to entirely hand-designed solutions. In this paper, we demonstrate the feasibility of a genetic programming framework using human controller input to derive meaningful artificial players which can, later on, be optimised by hand. The current state of the art in game character design relies heavily on human designers to manually create and edit scripts and rules for game characters. To address this manual editing bottleneck, current computational intelligence techniques approach the issue with fully autonomous character generators, replacing most of the design process with black-box solutions such as neural networks or the like. Our GP approach to this problem creates character controllers which can be further authored and developed by a designer; it also allows designers to include their play style without the need to use a programming language. This keeps the designer in the loop while reducing repetitive manual labour. Our system also provides insights into how players express themselves in games and into deriving appropriate models for representing those insights. We present our framework, supporting findings and open challenges. |
Tasks | |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01648v1 |
http://arxiv.org/pdf/1803.01648v1.pdf | |
PWC | https://paperswithcode.com/paper/a-genetic-programming-framework-for-2d |
Repo | |
Framework | |
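As a rough illustration of the approach, the following toy genetic-programming loop evolves small expression-tree controllers whose fitness is agreement with recorded human controller input. The primitives, trace format, and evolutionary operators are placeholder assumptions, not the framework's actual design.

```python
import random

# Toy GP loop: controllers are tiny expression trees over game observations,
# scored by how often their chosen action matches recorded human play.

CONDITIONS = ["if_gap_ahead", "if_enemy_near"]
ACTIONS = ["jump", "move_right", "move_left"]

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return random.choice(ACTIONS)                       # terminal node: an action
    return [random.choice(CONDITIONS), random_tree(depth - 1), random_tree(depth - 1)]

def act(tree, obs):
    if isinstance(tree, str):
        return tree
    cond, then_branch, else_branch = tree
    return act(then_branch if obs.get(cond, False) else else_branch, obs)

def fitness(tree, trace):
    """Fraction of frames where the evolved controller reproduces the human action."""
    return sum(act(tree, obs) == action for obs, action in trace) / len(trace)

def evolve(trace, pop_size=50, gens=30):
    pop = [random_tree() for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda t: fitness(t, trace), reverse=True)
        survivors = pop[: pop_size // 2]                    # truncation selection
        offspring = [random_tree() if random.random() < 0.3 else random.choice(survivors)
                     for _ in survivors]                    # crude mutation-by-replacement
        pop = survivors + offspring
    return max(pop, key=lambda t: fitness(t, trace))

# toy trace of (observation, human action) pairs recorded from play
trace = [({"if_gap_ahead": True}, "jump"),
         ({"if_enemy_near": True}, "move_left"),
         ({}, "move_right")] * 10
best = evolve(trace)
print(fitness(best, trace))
```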
Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks
Title | Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks |
Authors | Bin Han, Lianghai Ji, Hans D. Schotten |
Abstract | In the context of Fifth Generation (5G) mobile networks, the concept of “Slice as a Service” (SlaaS) encourages mobile network operators to flexibly share infrastructure with mobile service providers and stakeholders. However, it also raises an emerging demand for efficient online algorithms to optimize the request-and-decision-based inter-slice resource management strategy. Based on genetic algorithms, this paper presents a novel online optimizer that efficiently approaches the ideal slicing strategy with maximized long-term network utility. The proposed method encodes slicing strategies into binary sequences to cope with the request-and-decision mechanism. It requires no a priori knowledge about the traffic/utility models, and therefore supports heterogeneous slices, while providing solid effectiveness, good robustness against non-stationary service scenarios, and high scalability. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04491v3 |
http://arxiv.org/pdf/1802.04491v3.pdf | |
PWC | https://paperswithcode.com/paper/slice-as-an-evolutionary-service-genetic |
Repo | |
Framework | |
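A minimal sketch of the genetic optimization idea, assuming a placeholder utility model: slicing strategies are encoded as binary accept/reject sequences and evolved with one-point crossover and bit-flip mutation toward higher network utility. The request types, capacity budget, and utility function are illustrative, not the paper's model.

```python
import random

def utility(strategy, demand, revenue, capacity):
    """Placeholder utility: revenue of accepted requests, zero if overbooked."""
    used = sum(d for d, bit in zip(demand, strategy) if bit)
    if used > capacity:
        return 0.0
    return sum(r for r, bit in zip(revenue, strategy) if bit)

def evolve(demand, revenue, capacity, pop_size=40, gens=100, p_mut=0.05):
    n = len(demand)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda s: utility(s, demand, revenue, capacity), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)                                  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]    # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda s: utility(s, demand, revenue, capacity))

# toy instance: 8 request types competing for a shared capacity budget
demand = [4, 2, 5, 1, 3, 6, 2, 4]
revenue = [5, 3, 6, 1, 4, 7, 2, 5]
best = evolve(demand, revenue, capacity=12)
print(best, utility(best, demand, revenue, capacity=12))
```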
Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space
Title | Statistical Estimation of Ergodic Markov Chain Kernel over Discrete State Space |
Authors | Geoffrey Wolfer, Aryeh Kontorovich |
Abstract | We investigate the statistical complexity of estimating the parameters of a discrete-state Markov chain kernel from a single long sequence of state observations. In the finite case, we characterize (modulo logarithmic factors) the minimax sample complexity of estimation with respect to the operator infinity norm, while in the countably infinite case, we analyze the problem with respect to a natural entry-wise norm derived from total variation. We show that in both cases, the sample complexity is governed by the mixing properties of the unknown chain, for which, in the finite-state case, there are known finite-sample estimators with fully empirical confidence intervals. |
Tasks | |
Published | 2018-09-13 |
URL | https://arxiv.org/abs/1809.05014v3 |
https://arxiv.org/pdf/1809.05014v3.pdf | |
PWC | https://paperswithcode.com/paper/minimax-learning-of-ergodic-markov-chains |
Repo | |
Framework | |
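The paper's minimax analysis and fully empirical confidence intervals cannot be reproduced from the abstract, but the object being estimated is easy to show: the plain maximum-likelihood (row-normalized count) estimate of a transition kernel from a single long trajectory, evaluated here in the operator infinity norm. The toy chain below is an illustrative assumption.

```python
import numpy as np

def empirical_kernel(trajectory, n_states, smoothing=0.0):
    """Row-normalized count matrix: the plain MLE of the transition kernel
    from a single observed state sequence (optionally Laplace-smoothed)."""
    counts = np.full((n_states, n_states), smoothing, dtype=float)
    for s, s_next in zip(trajectory[:-1], trajectory[1:]):
        counts[s, s_next] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0        # avoid division by zero for unvisited states
    return counts / row_sums

# toy usage: sample from a known 3-state chain, then estimate it back
rng = np.random.default_rng(0)
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])
traj = [0]
for _ in range(20000):
    traj.append(rng.choice(3, p=P[traj[-1]]))
P_hat = empirical_kernel(traj, 3)
print(np.abs(P_hat - P).sum(axis=1).max())   # operator infinity-norm error
```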
Concept Learning with Energy-Based Models
Title | Concept Learning with Energy-Based Models |
Authors | Igor Mordatch |
Abstract | Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and capacity for language, require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning. We present a framework that defines a concept by an energy function over events in the environment, as well as an attention mask over entities participating in the event. Given a few demonstration events, our method uses an inference-time optimization procedure to generate events involving similar concepts or to identify entities involved in the concept. We evaluate our framework on learning visual, quantitative, relational, and temporal concepts from demonstration events in an unsupervised manner. Our approach is able to successfully generate and identify concepts in a few-shot setting, and the resulting learned concepts can be reused across environments. Example videos of our results are available at sites.google.com/site/energyconceptmodels |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02486v1 |
http://arxiv.org/pdf/1811.02486v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-learning-with-energy-based-models |
Repo | |
Framework | |
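The core mechanic, inference as optimization of an energy function, can be illustrated with a toy example. Below, a hand-written quadratic energy encodes the concept "attended entities cluster together", and a new event is generated by gradient descent on that energy; the learned energy networks and the paper's exact procedure are not reproduced.

```python
import numpy as np

def energy(positions, attention):
    """Low energy when attended entities are near their attended centroid."""
    w = attention / attention.sum()
    center = (w[:, None] * positions).sum(axis=0)
    return float((w * ((positions - center) ** 2).sum(axis=1)).sum())

def grad_energy(positions, attention):
    # centroid treated as constant (a deliberate simplification of the true gradient)
    w = attention / attention.sum()
    center = (w[:, None] * positions).sum(axis=0)
    return 2.0 * w[:, None] * (positions - center)

def generate_event(n_entities, attention, steps=200, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_entities, 2))             # random initial event
    for _ in range(steps):
        x = x - lr * grad_energy(x, attention)       # inference-time optimization
    return x

attention = np.array([1.0, 1.0, 1.0, 0.0])           # concept involves the first 3 entities
event = generate_event(4, attention)
print(energy(event, attention))                      # near zero after optimization
```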
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Title | On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection |
Authors | Vivian Lai, Chenhao Tan |
Abstract | Humans are the final decision makers in critical tasks that involve ethical and legal concerns, ranging from recidivism prediction, to medical diagnosis, to fighting against fake news. Although machine learning models can sometimes achieve impressive performance in these tasks, these tasks are not amenable to full automation. To realize the potential of machine learning for improving human decisions, it is important to understand how assistance from machine learning models affects human performance and human agency. In this paper, we use deception detection as a testbed and investigate how we can harness explanations and predictions of machine learning models to improve human performance while retaining human agency. We propose a spectrum between full human agency and full automation, and develop varying levels of machine assistance along the spectrum that gradually increase the influence of machine predictions. We find that without showing predicted labels, explanations alone slightly improve human performance in the end task. In comparison, human performance is greatly improved by showing predicted labels (>20% relative improvement) and can be further improved by explicitly suggesting strong machine performance. Interestingly, when predicted labels are shown, explanations of machine predictions induce a similar level of accuracy as an explicit statement of strong machine performance. Our results demonstrate a tradeoff between human performance and human agency and show that explanations of machine predictions can moderate this tradeoff. |
Tasks | Deception Detection, Medical Diagnosis |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07901v4 |
http://arxiv.org/pdf/1811.07901v4.pdf | |
PWC | https://paperswithcode.com/paper/on-human-predictions-with-explanations-and |
Repo | |
Framework | |
A characterization of the Edge of Criticality in Binary Echo State Networks
Title | A characterization of the Edge of Criticality in Binary Echo State Networks |
Authors | Pietro Verzelli, Lorenzo Livi, Cesare Alippi |
Abstract | Echo State Networks (ESNs) are simplified recurrent neural network models composed of a reservoir and a linear, trainable readout layer. The reservoir is tunable by some hyper-parameters that control the network behaviour. ESNs are known to be effective in solving tasks when configured on a region in (hyper-)parameter space called the Edge of Criticality (EoC), where the system is maximally sensitive to perturbations, which in turn affect its behaviour. In this paper, we propose binary ESNs, which are architecturally equivalent to standard ESNs but consider binary activation functions and binary recurrent weights. For these networks, we derive a closed-form expression for the EoC in the autonomous case and perform simulations in order to assess their behavior in the case of noisy neurons and in the presence of a signal. We propose a theoretical explanation for the fact that the variance of the input plays a major role in characterizing the EoC. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01742v1 |
http://arxiv.org/pdf/1810.01742v1.pdf | |
PWC | https://paperswithcode.com/paper/a-characterization-of-the-edge-of-criticality |
Repo | |
Framework | |
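A small numerical sketch of a binary echo state network, assuming sign activations, random +/-1 weights, and a ridge-regression readout; the sizes, scaling, and readout recipe are illustrative choices, not the paper's configuration.

```python
import numpy as np

def run_binary_esn(u, n_res=200, seed=0):
    """Drive a reservoir with binary (+/-1) weights and sign activations."""
    rng = np.random.default_rng(seed)
    W = rng.choice([-1.0, 1.0], size=(n_res, n_res)) / np.sqrt(n_res)
    W_in = rng.choice([-1.0, 1.0], size=n_res)
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.sign(W @ x + W_in * u_t)              # binary activation function
        states.append(x.copy())
    return np.array(states)

def fit_readout(states, targets, ridge=1e-2):
    """Linear readout fit by ridge regression on collected reservoir states."""
    S = states
    return np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ targets)

# toy task: one-step-ahead prediction of a noisy sine input
t = np.arange(2000)
u = np.sin(0.1 * t) + 0.05 * np.random.default_rng(1).normal(size=t.size)
states = run_binary_esn(u)
w_out = fit_readout(states[:-1], u[1:])
pred = states[:-1] @ w_out
print(np.mean((pred - u[1:]) ** 2))
```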
Convergence Problems with Generative Adversarial Networks (GANs)
Title | Convergence Problems with Generative Adversarial Networks (GANs) |
Authors | Samuel A. Barnett |
Abstract | Generative adversarial networks (GANs) are a novel approach to generative modelling, a task whose goal it is to learn a distribution of real data points. They have often proved difficult to train: GANs are unlike many techniques in machine learning, in that they are best described as a two-player game between a discriminator and generator. This has yielded both unreliability in the training process, and a general lack of understanding as to how GANs converge, and if so, to what. The purpose of this dissertation is to provide an account of the theory of GANs suitable for the mathematician, highlighting both positive and negative results. This involves identifying the problems when training GANs, and how topological and game-theoretic perspectives of GANs have contributed to our understanding and improved our techniques in recent years. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11382v1 |
http://arxiv.org/pdf/1806.11382v1.pdf | |
PWC | https://paperswithcode.com/paper/convergence-problems-with-generative |
Repo | |
Framework | |
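For readers new to the framing, the standard two-player minimax objective that the dissertation's game-theoretic analysis starts from (the original GAN formulation) is:

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator D maximizes V while the generator G minimizes it; the convergence problems discussed above arise because simultaneous gradient updates on this game need not settle at an equilibrium.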
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
Title | A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation |
Authors | Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher |
Abstract | The convergence rate and final performance of common deep learning models have significantly benefited from heuristics such as learning rate schedules, knowledge distillation, skip connections, and normalization layers. In the absence of theoretical underpinnings, controlled experiments aimed at explaining these strategies can aid our understanding of deep learning landscapes and the training dynamics. Existing approaches for empirical analysis rely on tools of linear interpolation and visualizations with dimensionality reduction, each with their limitations. Instead, we revisit such analysis of heuristics through the lens of recently proposed methods for loss surface and representation analysis, viz., mode connectivity and canonical correlation analysis (CCA), and hypothesize reasons for the success of the heuristics. In particular, we explore knowledge distillation and learning rate heuristics of (cosine) restarts and warmup using mode connectivity and CCA. Our empirical analysis suggests that: (a) the reasons often quoted for the success of cosine annealing are not evidenced in practice; (b) that the effect of learning rate warmup is to prevent the deeper layers from creating training instability; and (c) that the latent knowledge shared by the teacher is primarily disbursed to the deeper layers. |
Tasks | Dimensionality Reduction |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.13243v1 |
https://arxiv.org/pdf/1810.13243v1.pdf | |
PWC | https://paperswithcode.com/paper/a-closer-look-at-deep-learning-heuristics-1 |
Repo | |
Framework | |
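The two learning-rate heuristics under study are simple to restate in code. The schedule below combines linear warmup with cosine annealing and warm restarts; the constants are placeholders, not the paper's experimental settings.

```python
import math

def lr_schedule(step, base_lr=0.1, warmup_steps=500, cycle_steps=2000, min_lr=1e-4):
    """Linear warmup followed by cosine annealing with warm restarts."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps           # linear warmup
    t = (step - warmup_steps) % cycle_steps                  # position within the cycle
    cos = 0.5 * (1.0 + math.cos(math.pi * t / cycle_steps))  # cosine decay
    return min_lr + (base_lr - min_lr) * cos                 # restarts when t wraps to 0

# print a few points of the schedule
for s in [0, 250, 500, 1500, 2499, 2500, 3500]:
    print(s, round(lr_schedule(s), 5))
```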
Robust Machine Comprehension Models via Adversarial Training
Title | Robust Machine Comprehension Models via Adversarial Training |
Authors | Yicheng Wang, Mohit Bansal |
Abstract | It is shown that many published models for the Stanford Question Answering Dataset (Rajpurkar et al., 2016) lack robustness, suffering an over 50% decrease in F1 score during adversarial evaluation based on the AddSent (Jia and Liang, 2017) algorithm. It has also been shown that retraining models on data generated by AddSent has limited effect on their robustness. We propose a novel alternative adversary-generation algorithm, AddSentDiverse, that significantly increases the variance within the adversarial training data by providing effective examples that punish the model for making certain superficial assumptions. Further, in order to improve robustness to AddSent’s semantic perturbations (e.g., antonyms), we jointly improve the model’s semantic-relationship learning capabilities in addition to our AddSentDiverse-based adversarial training data augmentation. With these additions, we show that we can make a state-of-the-art model significantly more robust, achieving a 36.5% increase in F1 score under many different types of adversarial evaluation while maintaining performance on the regular SQuAD task. |
Tasks | Data Augmentation, Question Answering, Reading Comprehension |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06473v1 |
http://arxiv.org/pdf/1804.06473v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-machine-comprehension-models-via |
Repo | |
Framework | |
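The AddSentDiverse procedure itself is not specified in the abstract, so the snippet below only sketches the general AddSent-style idea it builds on: appending a distractor sentence that mimics the question's wording but alters key details, so it cannot answer the question yet misleads models that rely on surface word overlap. The entities and substitutions are purely illustrative.

```python
# Schematic stand-in for AddSent-style adversarial augmentation, not the
# AddSentDiverse algorithm: build a distractor from the question's own wording
# with perturbed details, and append it to the context with the gold answer unchanged.

def add_distractor(example, fake_entity="Jeff Dean", fake_year="1889"):
    q = example["question"]
    distractor = (q.rstrip("?")
                   .replace("Who", fake_entity)      # wrong entity
                   .replace("1880", fake_year)       # perturbed detail
                  + ".")
    return {
        "context": example["context"] + " " + distractor,
        "question": q,
        "answer": example["answer"],                  # unchanged gold answer
    }

example = {
    "context": "Tesla moved to the city of Prague in 1880 to study.",
    "question": "Who moved to Prague in 1880?",
    "answer": "Tesla",
}
print(add_distractor(example)["context"])
# -> "... Jeff Dean moved to Prague in 1889."
```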
Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data
Title | Predicting Acute Kidney Injury at Hospital Re-entry Using High-dimensional Electronic Health Record Data |
Authors | Samuel J. Weisenthal, Caroline Quill, Samir Farooq, Henry Kautz, Martin S. Zand |
Abstract | Acute Kidney Injury (AKI), a sudden decline in kidney function, is associated with increased mortality, morbidity, length of stay, and hospital cost. Since AKI is sometimes preventable, there is great interest in prediction. Most existing studies consider all patients and therefore restrict to features available in the first hours of hospitalization. Here, the focus is instead on rehospitalized patients, a cohort in which rich longitudinal features from prior hospitalizations can be analyzed. Our objective is to provide a risk score directly at hospital re-entry. Gradient boosting, penalized logistic regression (with and without stability selection), and a recurrent neural network are trained on two years of adult inpatient EHR data (3,387 attributes for 34,505 patients who generated 90,013 training samples with 5,618 cases and 84,395 controls). Predictions are internally evaluated with 50 iterations of 5-fold grouped cross-validation with special emphasis on calibration, an analysis of which is performed at the patient as well as hospitalization level. Error is assessed with respect to diagnosis, race, age, gender, AKI identification method, and hospital utilization. In an additional experiment, the regularization penalty is severely increased to induce parsimony and interpretability. Predictors identified for rehospitalized patients are also reported with a special analysis of medications that might be modifiable risk factors. Insights from this study might be used to construct a predictive tool for AKI in rehospitalized patients. An accurate estimate of AKI risk at hospital entry might serve as a prior for an admitting provider or another predictive algorithm. |
Tasks | Calibration |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09865v2 |
http://arxiv.org/pdf/1807.09865v2.pdf | |
PWC | https://paperswithcode.com/paper/predicting-acute-kidney-injury-at-hospital-re |
Repo | |
Framework | |
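The evaluation pattern described above (penalized logistic regression, grouped cross-validation by patient, calibration-aware scoring) can be sketched with scikit-learn on synthetic data; the features, hyper-parameters, and metrics below are placeholders, not the study's EHR setup.

```python
import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss

# synthetic stand-in for EHR features, labels, and patient identifiers
rng = np.random.default_rng(0)
n, d = 2000, 50
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 1.5).astype(int)
patient_id = rng.integers(0, 400, size=n)            # several samples per patient

# L1-penalized logistic regression, folds grouped by patient so the same
# patient never appears in both the training and test split
clf = LogisticRegression(penalty="l1", C=0.5, solver="liblinear", max_iter=1000)
aucs, briers = [], []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=patient_id):
    clf.fit(X[train_idx], y[train_idx])
    p = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], p))
    briers.append(brier_score_loss(y[test_idx], p))  # lower is better calibrated

print("AUC:   %.3f +/- %.3f" % (np.mean(aucs), np.std(aucs)))
print("Brier: %.3f +/- %.3f" % (np.mean(briers), np.std(briers)))
```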
Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval
Title | Multichannel Distributed Local Pattern for Content Based Indexing and Retrieval |
Authors | Sonakshi Mathur, Mallika Chaudhary, Hemant Verma, Murari Mandal, S. K. Vipparthi, Subrahmanyam Murala |
Abstract | A novel color feature descriptor, the Multichannel Distributed Local Pattern (MDLP), is proposed in this manuscript. The MDLP combines the salient features of both local binary and local mesh patterns in the neighborhood. The multi-distance information computed by the MDLP aids in robust extraction of the texture arrangement. Further, MDLP features are extracted for each color channel of an image. The retrieval performance of the MDLP is evaluated on three benchmark datasets for CBIR, namely Corel-5000, Corel-10000 and MIT-Color Vistex, respectively. The proposed technique attains substantial improvement as compared to other state-of-the-art feature descriptors in terms of various evaluation parameters such as ARP and ARR on the respective databases. |
Tasks | |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02679v1 |
http://arxiv.org/pdf/1805.02679v1.pdf | |
PWC | https://paperswithcode.com/paper/multichannel-distributed-local-pattern-for |
Repo | |
Framework | |
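The full MDLP is not reproduced here, but its per-channel building block can be sketched: a plain local binary pattern histogram computed independently for each color channel and concatenated. The local mesh patterns and multi-distance neighbourhoods that MDLP adds are omitted.

```python
import numpy as np

def lbp_histogram(channel):
    """8-neighbour LBP codes for one channel, returned as a normalized 256-bin histogram."""
    c = channel.astype(np.int32)
    center = c[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = c[1 + dy:c.shape[0] - 1 + dy, 1 + dx:c.shape[1] - 1 + dx]
        code |= ((neighbour >= center).astype(np.int32) << bit)
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / hist.sum()

def multichannel_descriptor(image):
    """Concatenate per-channel LBP histograms (image is H x W x 3)."""
    return np.concatenate([lbp_histogram(image[:, :, k]) for k in range(image.shape[2])])

img = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3))
print(multichannel_descriptor(img).shape)   # (768,)
```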
Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV
Title | Fast and Accurate, Convolutional Neural Network Based Approach for Object Detection from UAV |
Authors | Xiaoliang Wang, Peng Cheng, Xinchuan Liu, Benedict Uzochukwu |
Abstract | Unmanned Aerial Vehicles (UAVs) have intrigued people from all walks of life because of their pervasive computing capabilities. A UAV equipped with vision techniques can be leveraged to establish autonomous navigation control for the UAV itself. Moreover, object detection from a UAV can be used to broaden the utilization of drones to provide ubiquitous surveillance and monitoring services for military operation, urban administration and agriculture management. As data-driven technologies have evolved, machine learning algorithms, especially deep learning approaches, have been intensively utilized to solve traditional computer vision research problems. Modern Convolutional Neural Network based object detectors can be divided into two major categories: one-stage object detectors and two-stage object detectors. In this study, we utilize some representative CNN-based object detectors to execute the computer vision task over the Stanford Drone Dataset (SDD). State-of-the-art performance has been achieved by utilizing the focal-loss dense detector RetinaNet based approach for object detection from UAVs in a fast and accurate manner. |
Tasks | Object Detection |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05756v2 |
http://arxiv.org/pdf/1808.05756v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-convolutional-neural |
Repo | |
Framework | |
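For a concrete starting point, an off-the-shelf RetinaNet (the focal-loss dense detector mentioned above) can be run via torchvision. This inference-only sketch uses COCO-pretrained weights and a dummy image; fine-tuning on the Stanford Drone Dataset (class remapping, anchor tuning, and so on) is not shown and is assumed to follow separately.

```python
import torch
import torchvision

# Off-the-shelf RetinaNet as a stand-in for the detector used in the study.
# Note: on older torchvision releases, pass pretrained=True instead of weights=.
model = torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT")
model.eval()

# dummy aerial frame; replace with a real UAV image tensor scaled to [0, 1]
image = torch.rand(3, 800, 800)

with torch.no_grad():
    predictions = model([image])              # one dict per input image

boxes = predictions[0]["boxes"]               # (N, 4) boxes in xyxy format
scores = predictions[0]["scores"]             # (N,) confidence scores
labels = predictions[0]["labels"]             # (N,) COCO class indices
keep = scores > 0.5
print(boxes[keep].shape, labels[keep].tolist())
```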
How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology
Title | How to Organize your Deep Reinforcement Learning Agents: The Importance of Communication Topology |
Authors | Dhaval Adjodah, Dan Calacci, Abhimanyu Dubey, Peter Krafft, Esteban Moro, Alex 'Sandy' Pentland |
Abstract | In this empirical paper, we investigate how learning agents can be arranged in more efficient communication topologies for improved learning. This is an important problem because a common technique to improve speed and robustness of learning in deep reinforcement learning and many other machine learning algorithms is to run multiple learning agents in parallel. The standard communication architecture typically involves all agents intermittently communicating with each other (fully connected topology) or with a centralized server (star topology). Unfortunately, optimizing the topology of communication over the space of all possible graphs is a hard problem, so we borrow results from the networked optimization and collective intelligence literatures which suggest that certain families of network topologies can lead to strong improvements over fully-connected networks. We start by introducing alternative network topologies to DRL benchmark tasks under the Evolution Strategies paradigm which we call Network Evolution Strategies. We explore the relative performance of the four main graph families and observe that one such family (Erdos-Renyi random graphs) empirically outperforms all other families, including the de facto fully-connected communication topologies. Additionally, the use of alternative network topologies has a multiplicative performance effect: we observe that when 1000 learning agents are arranged in a carefully designed communication topology, they can compete with 3000 agents arranged in the de facto fully-connected topology. Overall, our work suggests that distributed machine learning algorithms would learn more efficiently if the communication topology between learning agents was optimized. |
Tasks | |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12556v2 |
http://arxiv.org/pdf/1811.12556v2.pdf | |
PWC | https://paperswithcode.com/paper/how-to-organize-your-deep-reinforcement |
Repo | |
Framework | |
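The topology idea can be illustrated with a consensus-style toy: agents hold parameter vectors and, instead of averaging over all peers (fully connected), each mixes only with its neighbours on an Erdos-Renyi communication graph. The update rule below is a generic placeholder, not the paper's Network Evolution Strategies algorithm.

```python
import numpy as np

def erdos_renyi_adjacency(n_agents, p_edge, seed=0):
    """Symmetric 0/1 adjacency matrix of an Erdos-Renyi graph with self-loops."""
    rng = np.random.default_rng(seed)
    A = (rng.random((n_agents, n_agents)) < p_edge).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                   # undirected graph
    np.fill_diagonal(A, 1.0)                      # every agent listens to itself
    return A

def communication_round(params, A):
    """Each agent replaces its parameters with the mean over its neighbourhood."""
    degrees = A.sum(axis=1, keepdims=True)
    return (A @ params) / degrees

n_agents, dim = 100, 10
A = erdos_renyi_adjacency(n_agents, p_edge=0.1)
params = np.random.default_rng(1).normal(size=(n_agents, dim))
for _ in range(5):
    params = communication_round(params, A)
print(params.std(axis=0).mean())                  # spread shrinks as agents mix
```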