Paper Group ANR 249
GritNet: Student Performance Prediction with Deep Learning
Title | GritNet: Student Performance Prediction with Deep Learning |
Authors | Byung-Hak Kim, Ethan Vizitei, Varun Ganapathi |
Abstract | Student performance prediction - where a machine forecasts the future performance of students as they interact with online coursework - is a challenging problem. Reliable early-stage predictions of a student’s future performance could be critical to facilitate timely educational interventions during a course. However, very few prior studies have explored this problem from a deep learning perspective. In this paper, we recast the student performance prediction problem as a sequential event prediction problem and propose a new deep-learning-based algorithm, termed GritNet, which builds upon the bidirectional long short-term memory (BLSTM). Our results, from real Udacity students’ graduation predictions, show that GritNet not only consistently outperforms the standard logistic-regression-based method, but that its improvements are especially pronounced in the first few weeks, when accurate predictions are most challenging. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07405v1 |
PDF | http://arxiv.org/pdf/1804.07405v1.pdf |
PWC | https://paperswithcode.com/paper/gritnet-student-performance-prediction-with |
Repo | |
Framework | |
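The core of the approach is a BLSTM over each student's ordered activity events. A minimal PyTorch sketch of that framing is below; the event vocabulary size, embedding width, and the max-over-time readout are illustrative assumptions, not details taken from the paper.

```python
# Sketch of a BLSTM-based sequential event classifier in the spirit of GritNet.
# Vocabulary size, dimensions, and the pooling readout are illustrative assumptions.
import torch
import torch.nn as nn

class GritNetSketch(nn.Module):
    def __init__(self, num_events=5000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_events, embed_dim, padding_idx=0)
        self.blstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                             bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, 1)      # graduation: yes/no

    def forward(self, event_ids):
        # event_ids: (batch, seq_len) integer-coded student events, 0 = padding
        h, _ = self.blstm(self.embed(event_ids))      # (batch, seq_len, 2*hidden)
        pooled, _ = h.max(dim=1)                      # max over time
        return torch.sigmoid(self.out(pooled)).squeeze(-1)

model = GritNetSketch()
events = torch.randint(1, 5000, (8, 120))             # 8 students, 120 events each
prob_graduate = model(events)                          # per-student probability
```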
ICPRAI 2018 SI: On dynamic ensemble selection and data preprocessing for multi-class imbalance learning
Title | ICPRAI 2018 SI: On dynamic ensemble selection and data preprocessing for multi-class imbalance learning |
Authors | Rafael M. O. Cruz, Mariana A. Souza, Robert Sabourin, George D. C. Cavalcanti |
Abstract | Class-imbalance refers to classification problems in which many more instances are available for certain classes than for others. Such imbalanced datasets require special attention because traditional classifiers generally favor the majority class, which has a large number of instances. Ensembles of classifiers have been reported to yield promising results. However, the majority of ensemble methods applied to imbalanced learning are static ones. Moreover, they only deal with binary imbalanced problems. Hence, this paper presents an empirical analysis of dynamic selection techniques and data preprocessing methods for dealing with multi-class imbalanced problems. We considered five variations of preprocessing methods and fourteen dynamic selection schemes. Our experiments conducted on 26 multi-class imbalanced problems show that the dynamic ensemble improves the AUC and the G-mean as compared to the static ensemble. Moreover, data preprocessing plays an important role in such cases. |
Tasks | |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.10481v2 |
PDF | http://arxiv.org/pdf/1811.10481v2.pdf |
PWC | https://paperswithcode.com/paper/icprai-2018-si-on-dynamic-ensemble-selection |
Repo | |
Framework | |
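For intuition, here is one simple dynamic-selection scheme (a KNORA-Union-style competence weighting) combined with naive random oversampling, sketched with scikit-learn on synthetic data. It is a stand-in for the fourteen selection schemes and five preprocessing variants the paper actually compares, not a reproduction of any of them.

```python
# One simple dynamic-selection scheme with naive random oversampling (a sketch).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_classes=3, n_informative=6,
                           weights=[0.7, 0.2, 0.1], random_state=0)
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.5, random_state=0)
X_dsel, X_te, y_dsel, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5,
                                              random_state=0)

# Preprocessing: randomly oversample every class up to the majority class size.
rng = np.random.RandomState(0)
counts = np.bincount(y_tr)
idx = np.concatenate([rng.choice(np.where(y_tr == c)[0], counts.max(), replace=True)
                      for c in range(len(counts))])
pool = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                         random_state=0).fit(X_tr[idx], y_tr[idx])

# Dynamic selection: weight each base classifier by how many of the test point's
# k nearest DSEL neighbours it classifies correctly, then take a weighted vote.
neigh = NearestNeighbors(n_neighbors=7).fit(X_dsel).kneighbors(
    X_te, return_distance=False)
preds_dsel = np.stack([est.predict(X_dsel) for est in pool.estimators_])
preds_te = np.stack([est.predict(X_te) for est in pool.estimators_])

y_hat = []
for i in range(len(X_te)):
    competence = (preds_dsel[:, neigh[i]] == y_dsel[neigh[i]]).sum(axis=1)
    votes = np.bincount(preds_te[:, i].astype(int), weights=competence, minlength=3)
    y_hat.append(int(votes.argmax()))
print("dynamic-ensemble accuracy:", np.mean(np.array(y_hat) == y_te))
```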
XOR_p A maximally intertwined p-classes problem used as a benchmark with built-in truth for neural networks gradient descent optimization
Title | XOR_p A maximally intertwined p-classes problem used as a benchmark with built-in truth for neural networks gradient descent optimization |
Authors | Danielle Thierry-Mieg, Jean Thierry-Mieg |
Abstract | A natural p-class generalization of the eXclusive OR problem, the subtraction modulo p, where p is prime, is presented and solved using a single fully connected hidden layer with p neurons. Although the problem is very simple, the landscape is intricate and challenging and represents an interesting benchmark for gradient descent optimization algorithms. Testing 9 optimizers and 9 activation functions up to p = 191, the method converging most often and fastest to a perfect classification is the Adam optimizer combined with the ELU activation function. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07538v1 |
PDF | http://arxiv.org/pdf/1812.07538v1.pdf |
PWC | https://paperswithcode.com/paper/xor_p-a-maximally-intertwined-p-classes |
Repo | |
Framework | |
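The benchmark itself is easy to reproduce in a few lines. The sketch below builds the (a - b) mod p dataset, uses a single hidden layer of p ELU units, and trains with Adam, matching the setup described above; the one-hot input encoding and the hyperparameters are assumptions for illustration.

```python
# Sketch of the XOR_p benchmark: classify (a - b) mod p with one hidden layer
# of p ELU units trained by Adam. The one-hot input encoding is an assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

p = 7
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))   # (p*p, 2)
labels = (pairs[:, 0] - pairs[:, 1]) % p
x = torch.cat([F.one_hot(pairs[:, 0], p),
               F.one_hot(pairs[:, 1], p)], dim=1).float()

model = nn.Sequential(nn.Linear(2 * p, p), nn.ELU(), nn.Linear(p, p))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(x), labels)
    loss.backward()
    opt.step()

acc = (model(x).argmax(dim=1) == labels).float().mean().item()
print(f"training accuracy: {acc:.3f}")   # perfect classification is the target
```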
On Deep Domain Adaptation: Some Theoretical Understandings
Title | On Deep Domain Adaptation: Some Theoretical Understandings |
Authors | Trung Le, Khanh Nguyen, Nhat Ho, Hung Bui, Dinh Phung |
Abstract | Compared with shallow domain adaptation, recent progress in deep domain adaptation has shown that it can achieve higher predictive performance and stronger capacity to tackle structural data (e.g., image and sequential data). The underlying idea of deep domain adaptation is to bridge the gap between source and target domains in a joint space so that a supervised classifier trained on labeled source data can be nicely transferred to the target domain. This idea is certainly intuitive and powerful; however, limited theoretical understanding has been developed to support its underpinning principle. In this paper, we provide a rigorous framework to explain why it is possible to close the gap between the target and source domains in the joint space. More specifically, we first study the loss incurred when performing transfer learning from the source to the target domain. This provides a theory that explains and generalizes existing work in deep domain adaptation, which was mainly empirical. This enables us to further explain why closing the gap in the joint space can directly minimize the loss incurred for transfer learning between the two domains. To our knowledge, this offers the first theoretical result that characterizes a direct bound on the joint space and the gain of transfer learning via deep domain adaptation. |
Tasks | Domain Adaptation, Transfer Learning |
Published | 2018-11-15 |
URL | https://arxiv.org/abs/1811.06199v3 |
PDF | https://arxiv.org/pdf/1811.06199v3.pdf |
PWC | https://paperswithcode.com/paper/theoretical-perspective-of-deep-domain |
Repo | |
Framework | |
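For background only (this is not the paper's result), the classical input-space bound of Ben-David et al. that this line of work refines has the shape below; the paper's contribution is an analogous guarantee stated in the joint space learned by deep domain adaptation.

```latex
% Classical domain adaptation bound (Ben-David et al., 2010), shown as background.
\[
  \varepsilon_T(h) \;\le\; \varepsilon_S(h)
  \;+\; \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  \;+\; \lambda^{\ast},
\]
% where \varepsilon_S, \varepsilon_T are the source and target risks of hypothesis h,
% d_{\mathcal{H}\Delta\mathcal{H}} measures the divergence between the two input
% distributions, and \lambda^{\ast} is the risk of the best joint hypothesis.
```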
Bringing personalized learning into computer-aided question generation
Title | Bringing personalized learning into computer-aided question generation |
Authors | Yi-Ting Huang, Meng Chang Chen, Yeali S. Sun |
Abstract | This paper proposes a novel statistical method of ability estimation based on acquisition distributions for personalized computer-aided question generation. This method captures learning outcomes over time and provides a flexible measurement based on the acquisition distributions instead of pre-calibration. Compared to previous studies, the proposed method is robust, especially when the ability of a student is unknown. The results from the empirical data show that the estimated abilities match the actual abilities of learners, and the pre-test and post-test of the experimental group show significant improvement. These results suggest that this method can serve as the ability estimator for a personalized computer-aided testing environment. |
Tasks | Question Generation |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09735v1 |
PDF | http://arxiv.org/pdf/1808.09735v1.pdf |
PWC | https://paperswithcode.com/paper/bringing-personalized-learning-into-computer |
Repo | |
Framework | |
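The acquisition-distribution estimator itself is not specified in the abstract, so no attempt is made to reproduce it here. As a point of contrast, the sketch below shows a conventional IRT-style (Rasch/1PL) grid posterior for ability, which depends on pre-calibrated item difficulties, exactly the dependency the proposed method is said to avoid. All numbers are invented.

```python
# NOT the paper's acquisition-distribution method: a standard Rasch (1PL)
# grid posterior over ability, shown only to make "ability estimation" concrete.
import numpy as np

def rasch_posterior(responses, difficulties, grid=np.linspace(-4, 4, 161)):
    """responses[i] in {0, 1}; difficulties[i] is item i's calibrated difficulty."""
    theta = grid[:, None]                                   # ability grid, column
    p_correct = 1.0 / (1.0 + np.exp(-(theta - difficulties)))
    lik = np.prod(np.where(responses, p_correct, 1 - p_correct), axis=1)
    post = lik / lik.sum()                                  # flat prior over the grid
    return grid, post

grid, post = rasch_posterior(np.array([1, 1, 0, 1, 0]),
                             np.array([-1.0, -0.5, 0.0, 0.5, 1.0]))
print("posterior mean ability:", float((grid * post).sum()))
```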
Verifying Controllers Against Adversarial Examples with Bayesian Optimization
Title | Verifying Controllers Against Adversarial Examples with Bayesian Optimization |
Authors | Shromona Ghosh, Felix Berkenkamp, Gireeja Ranade, Shaz Qadeer, Ashish Kapoor |
Abstract | Recent successes in reinforcement learning have led to the development of complex controllers for real-world robots. As these robots are deployed in safety-critical applications and interact with humans, it becomes critical to ensure safety in order to avoid causing harm. A first step in this direction is to test the controllers in simulation. To be able to do this, we need to capture what we mean by safety and then efficiently search the space of all behaviors to see if they are safe. In this paper, we present an active-testing framework based on Bayesian Optimization. We specify safety constraints using logic and exploit structure in the problem in order to test the system for adversarial counterexamples that violate the safety specifications. These specifications are defined as complex boolean combinations of smooth functions on the trajectories and, unlike reward functions in reinforcement learning, are expressive and impose hard constraints on the system. In our framework, we exploit regularity assumptions on individual functions in the form of a Gaussian Process (GP) prior. We combine these into a coherent optimization framework using problem structure. The resulting algorithm is able to provably verify complex safety specifications or alternatively find counterexamples. Experimental results show that the proposed method is able to find adversarial examples quickly. |
Tasks | |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08678v2 |
PDF | http://arxiv.org/pdf/1802.08678v2.pdf |
PWC | https://paperswithcode.com/paper/verifying-controllers-against-adversarial |
Repo | |
Framework | |
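A toy version of the active-testing loop is sketched below: a GP surrogate is fit to a scalar safety margin over initial conditions of a simple simulated closed-loop system, and the next test is chosen where the lower confidence bound is smallest. The dynamics, the margin, and the acquisition rule are illustrative assumptions; the paper's logic-based specifications and structured GP decomposition are not reproduced.

```python
# Toy active-testing loop: fit a GP to a safety margin over initial conditions
# and query where the lower confidence bound is smallest (most likely unsafe).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def safety_margin(x0):
    """Toy closed-loop rollout; margin > 0 means the final state is safe (|pos| < 1)."""
    pos, vel = x0, 0.0
    for _ in range(50):
        vel += -0.5 * pos * 0.1          # simple proportional controller, dt = 0.1
        pos += vel * 0.1
    return 1.0 - abs(pos)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5, 1))                     # initial random tests
y = np.array([safety_margin(float(x[0])) for x in X])

candidates = np.linspace(-3, 3, 301).reshape(-1, 1)
for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    mu, std = gp.predict(candidates, return_std=True)
    x_next = candidates[np.argmin(mu - 2.0 * std)]      # lower confidence bound
    X = np.vstack([X, [x_next]])
    y = np.append(y, safety_margin(float(x_next[0])))

print("worst margin found:", y.min(), "at x0 =", float(X[np.argmin(y), 0]))
```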
Incorporating Word Embeddings into Open Directory Project based Large-scale Classification
Title | Incorporating Word Embeddings into Open Directory Project based Large-scale Classification |
Authors | Kang-Min Kim, Aliyeva Dinara, Byung-Ju Choi, SangKeun Lee |
Abstract | Recently, implicit representation models, such as embedding or deep learning, have been successfully adopted for text classification due to their outstanding performance. However, these approaches are limited to small- or moderate-scale text classification. Explicit representation models are often used in large-scale text classification, like the Open Directory Project (ODP)-based text classification. However, the performance of these models is limited by the associated knowledge bases. In this paper, we incorporate word embeddings into ODP-based large-scale classification. To this end, we first generate category vectors, which represent the semantics of ODP categories, by jointly modeling word embeddings and the ODP-based text classification. We then propose a novel semantic similarity measure, which utilizes the category and word vectors obtained from the joint model and word embeddings, respectively. The evaluation results clearly show the efficacy of our methodology in large-scale text classification. The proposed scheme exhibits significant improvements of 10% and 28% in terms of macro-averaging F1-score and precision at k, respectively, over state-of-the-art techniques. |
Tasks | Semantic Similarity, Semantic Textual Similarity, Text Classification, Word Embeddings |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.00828v1 |
PDF | http://arxiv.org/pdf/1804.00828v1.pdf |
PWC | https://paperswithcode.com/paper/incorporating-word-embeddings-into-open |
Repo | |
Framework | |
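To make the category-vector idea concrete, the sketch below represents each category by the average of its words' embeddings and ranks categories for a document by cosine similarity. The paper's joint training of category vectors and its similarity measure are not reproduced; the tiny 4-dimensional embedding table is invented for illustration.

```python
# Simplified picture only: represent each ODP-style category by the average of
# its words' embeddings and rank categories for a document by cosine similarity.
import numpy as np

embeddings = {                       # hypothetical tiny embedding table
    "football": np.array([0.9, 0.1, 0.0, 0.0]),
    "league":   np.array([0.8, 0.2, 0.1, 0.0]),
    "election": np.array([0.0, 0.9, 0.1, 0.0]),
    "senate":   np.array([0.1, 0.8, 0.2, 0.0]),
    "goal":     np.array([0.7, 0.0, 0.2, 0.1]),
}

def avg_vector(words):
    vecs = [embeddings[w] for w in words if w in embeddings]
    return np.mean(vecs, axis=0)

categories = {
    "Sports/Soccer": avg_vector(["football", "league", "goal"]),
    "Society/Politics": avg_vector(["election", "senate"]),
}

def classify(doc_words):
    d = avg_vector(doc_words)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(categories, key=lambda c: cos(d, categories[c]))

print(classify(["goal", "league"]))   # -> "Sports/Soccer"
```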
Data-dependent PAC-Bayes priors via differential privacy
Title | Data-dependent PAC-Bayes priors via differential privacy |
Authors | Gintare Karolina Dziugaite, Daniel M. Roy |
Abstract | The Probably Approximately Correct (PAC) Bayes framework (McAllester, 1999) can incorporate knowledge about the learning algorithm and (data) distribution through the use of distribution-dependent priors, yielding tighter generalization bounds on data-dependent posteriors. Using this flexibility, however, is difficult, especially when the data distribution is presumed to be unknown. We show how an ε-differentially private data-dependent prior yields a valid PAC-Bayes bound, and then show how non-private mechanisms for choosing priors can also yield generalization bounds. As an application of this result, we show that a Gaussian prior mean chosen via stochastic gradient Langevin dynamics (SGLD; Welling and Teh, 2011) leads to a valid PAC-Bayes bound given control of the 2-Wasserstein distance to an ε-differentially private stationary distribution. We study our data-dependent bounds empirically, and show that they can be nonvacuous even when other distribution-dependent bounds are vacuous. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09583v2 |
PDF | http://arxiv.org/pdf/1802.09583v2.pdf |
PWC | https://paperswithcode.com/paper/data-dependent-pac-bayes-priors-via |
Repo | |
Framework | |
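For reference, a standard PAC-Bayes bound of the McAllester/Maurer form is reproduced below as background (it is not the paper's theorem): with probability at least 1 - δ over an i.i.d. sample of size n, simultaneously for all posteriors ρ,

```latex
% McAllester/Maurer-style PAC-Bayes bound, stated as background.
\[
  \mathbb{E}_{h \sim \rho}\big[L(h)\big]
  \;\le\;
  \mathbb{E}_{h \sim \rho}\big[\hat{L}_n(h)\big]
  \;+\;
  \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}.
\]
% L is the true risk, \hat{L}_n the empirical risk, and \pi the prior.
```

Classically the prior π must be chosen independently of the data; the abstract's point is that a prior chosen through an ε-differentially private, data-dependent mechanism still yields a valid bound of this type.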
Fine Grained Classification of Personal Data Entities
Title | Fine Grained Classification of Personal Data Entities |
Authors | Riddhiman Dasgupta, Balaji Ganesan, Aswin Kannan, Berthold Reinwald, Arun Kumar |
Abstract | Entity Type Classification can be defined as the task of assigning category labels to entity mentions in documents. While neural networks have recently improved the classification of general entity mentions, pattern matching and other systems continue to be used for classifying personal data entities (e.g. classifying an organization as a media company or a government institution for GDPR and HIPAA compliance). We propose a neural model to expand the class of personal data entities that can be classified at a fine-grained level, using the output of existing pattern matching systems as additional contextual features. We introduce new resources, a personal data entities hierarchy with 134 types, and two datasets from the Wikipedia pages of elected representatives and Enron emails. We hope these resources will aid research in the area of personal data discovery, and to that effect, we provide baseline results on these datasets and compare our method with state-of-the-art models on the OntoNotes dataset. |
Tasks | |
Published | 2018-11-23 |
URL | http://arxiv.org/abs/1811.09368v1 |
PDF | http://arxiv.org/pdf/1811.09368v1.pdf |
PWC | https://paperswithcode.com/paper/fine-grained-classification-of-personal-data |
Repo | |
Framework | |
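The feature-combination idea (appending the outputs of existing pattern matchers to the text representation before classification) can be illustrated with a linear model, as in the sketch below. The paper's neural fine-grained typer is not reproduced, and the regexes, mentions, and labels are invented.

```python
# Only the feature-combination idea: append the (binary) outputs of existing
# pattern matchers to text features before classification.
import re
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

mentions = ["Contact john.doe@example.com for details",
            "The Ministry of Health issued new guidance",
            "Acme Broadcasting announced a new show"]
labels = ["person/contact", "org/government", "org/media"]

patterns = [r"[\w.]+@[\w.]+",        # email-like string
            r"\bMinistry\b",         # government cue
            r"\bBroadcasting\b"]     # media cue

def pattern_features(texts):
    return csr_matrix([[1 if re.search(p, t) else 0 for p in patterns]
                       for t in texts])

vec = TfidfVectorizer()
X = hstack([vec.fit_transform(mentions), pattern_features(mentions)])
clf = LogisticRegression(max_iter=1000).fit(X, labels)

test = ["Reach the team at press@acme-broadcasting.com"]
X_test = hstack([vec.transform(test), pattern_features(test)])
print(clf.predict(X_test))
```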
The Modeling of SDL Aiming at Knowledge Acquisition in Automatic Driving
Title | The Modeling of SDL Aiming at Knowledge Acquisition in Automatic Driving |
Authors | Zecang Gu, Yin Liang, Zhaoxi Zhang |
Abstract | In this paper we propose a theory for solving the multi-target control problem by introducing it into a machine learning framework for automatic driving, which explores the implementation of excellent drivers’ knowledge acquisition. Several core problems in automatic driving have not yet been fully addressed by researchers, such as the optimal way to control the multi-target objective functions of energy saving, safe driving, headway-distance control and comfortable driving, as well as the resolvability of the networks that automatic driving relies on and of high-performance chips such as GPUs in complex driving environments. To address these problems, we develop a new theory that maps multi-target objective functions in different spaces into the same space, and on this basis introduce a machine learning framework, SDL (Super Deep Learning), for optimal multi-target control based on knowledge acquisition. We present optimal multi-target control by combining the fuzzy relationship of each multi-target objective function with the implementation of excellent drivers’ knowledge acquired by machine learning. Theoretically, the impact of this method will exceed that of the fuzzy control methods used in automatic trains. |
Tasks | |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03007v1 |
PDF | http://arxiv.org/pdf/1812.03007v1.pdf |
PWC | https://paperswithcode.com/paper/the-modeling-of-sdl-aiming-at-knowledge |
Repo | |
Framework | |
Improving Skin Condition Classification with a Question Answering Model
Title | Improving Skin Condition Classification with a Question Answering Model |
Authors | Mohamed Akrout, Amir-massoud Farahmand, Tory Jarmain |
Abstract | We present a skin condition classification methodology based on a sequential pipeline of a pre-trained Convolutional Neural Network (CNN) and a Question Answering (QA) model. This method not only increases the classification confidence and accuracy of the deployed CNN system, but also emulates the conventional approach of doctors asking relevant questions to refine the ultimate diagnosis and differential. By combining the CNN output, in the form of classification probabilities, as a prior for the QA model with the textual description of the image, we greedily ask the symptom question that maximizes the information gain. We demonstrate that combining the QA model with the CNN increases the accuracy by up to 10% as compared to the CNN alone, and by more than 30% as compared to the QA model alone. |
Tasks | Question Answering |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06165v1 |
PDF | http://arxiv.org/pdf/1811.06165v1.pdf |
PWC | https://paperswithcode.com/paper/improving-skin-condition-classification-with |
Repo | |
Framework | |
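The greedy question-selection step can be made concrete with a small worked example: starting from the CNN's class probabilities as a prior and a table of symptom likelihoods per condition, pick the symptom whose answer is expected to reduce the entropy of the posterior the most. The conditions, symptoms, and probabilities below are invented for illustration.

```python
# Greedy question selection by expected information gain, starting from the
# CNN's class probabilities as a prior. All numbers are invented for the sketch.
import numpy as np

conditions = ["eczema", "psoriasis", "tinea"]
prior = np.array([0.5, 0.3, 0.2])                 # stand-in for CNN output
# p_symptom[s, c] = P(symptom s is present | condition c)
p_symptom = np.array([[0.8, 0.3, 0.4],            # "itching"
                      [0.2, 0.7, 0.1],            # "silvery scales"
                      [0.3, 0.2, 0.8]])           # "ring-shaped rash"
symptoms = ["itching", "silvery scales", "ring-shaped rash"]

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def best_question(prior):
    gains = []
    for s in range(len(symptoms)):
        p_yes = float(p_symptom[s] @ prior)
        post_yes = p_symptom[s] * prior / p_yes
        post_no = (1 - p_symptom[s]) * prior / (1 - p_yes)
        expected = p_yes * entropy(post_yes) + (1 - p_yes) * entropy(post_no)
        gains.append(entropy(prior) - expected)   # information gain of asking s
    return symptoms[int(np.argmax(gains))], gains

question, gains = best_question(prior)
print("ask about:", question)
```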
Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale
Title | Corpus Conversion Service: A Machine Learning Platform to Ingest Documents at Scale |
Authors | Peter W J Staar, Michele Dolfi, Christoph Auer, Costas Bekas |
Abstract | Over the past few decades, the amount of scientific articles and technical literature has increased exponentially in size. Consequently, there is a great need for systems that can ingest these documents at scale and make the contained knowledge discoverable. Unfortunately, both the format of these documents (e.g. the PDF format or bitmap images) and the presentation of the data (e.g. complex tables) make the extraction of qualitative and quantitative data extremely challenging. In this paper, we present a modular, cloud-based platform to ingest documents at scale. This platform, called the Corpus Conversion Service (CCS), implements a pipeline which allows users to parse and annotate documents (i.e. collect ground-truth), train machine-learning classification algorithms and ultimately convert any type of PDF or bitmap document to a structured content representation format. We will show that each of the modules is scalable due to an asynchronous microservice architecture and can therefore handle massive amounts of documents. Furthermore, we will show that our capability to gather ground-truth is accelerated by machine-learning algorithms by at least one order of magnitude. This allows us both to gather large amounts of ground-truth in very little time and to obtain very good precision/recall metrics in the range of 99% with regard to content conversion to structured output. The CCS platform is currently deployed on IBM internal infrastructure and serves more than 250 active users for knowledge-engineering project engagements. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1806.02284v1 |
PDF | http://arxiv.org/pdf/1806.02284v1.pdf |
PWC | https://paperswithcode.com/paper/corpus-conversion-service-a-machine-learning-1 |
Repo | |
Framework | |
Which Emoji Talks Best for My Picture?
Title | Which Emoji Talks Best for My Picture? |
Authors | Anurag Illendula, Kv Manohar, Manish Reddy Yedulla |
Abstract | Emojis have evolved as complementary sources for expressing emotion in social-media platforms, where posts are mostly composed of texts and images. In order to increase the expressiveness of their social media posts, users associate relevant emojis with their posts. Incorporating domain knowledge has improved machine understanding of text. In this paper, we investigate whether domain knowledge for emoji can improve the accuracy of the emoji recommendation task for multimedia posts composed of image and text. Our emoji recommendation system can suggest accurate emojis by exploiting both visual and textual content from social media posts as well as domain knowledge from Emojinet. Experimental results using pre-trained image classifiers and pre-trained word embedding models on a Twitter dataset show that our method outperforms the current state of the art by 9.6%. We also present a user study evaluating our recommendation system on a set of images chosen from the MSCOCO dataset. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08891v1 |
PDF | http://arxiv.org/pdf/1808.08891v1.pdf |
PWC | https://paperswithcode.com/paper/which-emoji-talks-best-for-my-picture |
Repo | |
Framework | |
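As a toy illustration of fusing visual and textual signals with emoji domain knowledge, the sketch below scores each candidate emoji by the overlap of its keyword set with the image classifier's tags and with the post's words. The keyword sets are invented stand-ins for Emojinet senses, and the paper's embedding-based models are not reproduced.

```python
# Late-fusion sketch: score each emoji by keyword overlap with image tags and
# post words. Keyword sets are invented stand-ins for Emojinet senses.
emoji_keywords = {
    "⚽": {"soccer", "ball", "football", "match", "goal"},
    "🎂": {"cake", "birthday", "party", "candles"},
    "🏖": {"beach", "sand", "sea", "vacation", "sun"},
}

def recommend(image_tags, post_words, alpha=0.5):
    image_tags, post_words = set(image_tags), set(post_words)
    def score(keywords):
        return (alpha * len(keywords & image_tags) / len(keywords)
                + (1 - alpha) * len(keywords & post_words) / len(keywords))
    return max(emoji_keywords, key=lambda e: score(emoji_keywords[e]))

print(recommend(image_tags=["ball", "grass", "stadium"],
                post_words=["great", "match", "tonight"]))   # -> ⚽
```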
Towards a Grounded Dialog Model for Explainable Artificial Intelligence
Title | Towards a Grounded Dialog Model for Explainable Artificial Intelligence |
Authors | Prashan Madumal, Tim Miller, Frank Vetere, Liz Sonenberg |
Abstract | To generate trust with their users, Explainable Artificial Intelligence (XAI) systems need to include an explanation model that can communicate the internal decisions, behaviours and actions to the interacting humans. Successful explanation involves both cognitive and social processes. In this paper we focus on the challenge of meaningful interaction between an explainer and an explainee, and investigate the structural aspects of an explanation in order to propose a human explanation dialog model. We follow a bottom-up approach to derive the model by analysing transcripts of 398 explanation dialogs of different types. We use grounded theory to code and identify the key components of which an explanation dialog consists. We carry out further analysis to identify the relationships between components, as well as the sequences and cycles that occur in a dialog. We present a generalized state model obtained by the analysis and compare it with an existing conceptual dialog model of explanation. |
Tasks | |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08055v1 |
PDF | http://arxiv.org/pdf/1806.08055v1.pdf |
PWC | https://paperswithcode.com/paper/towards-a-grounded-dialog-model-for |
Repo | |
Framework | |
Regularized Contextual Bandits
Title | Regularized Contextual Bandits |
Authors | Xavier Fontaine, Quentin Berthet, Vianney Perchet |
Abstract | We consider the stochastic contextual bandit problem with additional regularization. The motivation comes from problems where the policy of the agent must be close to some baseline policy that is known to perform well on the task. To tackle this problem we use a nonparametric model and propose an algorithm splitting the context space into bins and solving simultaneously - and independently - regularized multi-armed bandit instances in each bin. We derive slow and fast rates of convergence, depending on the unknown complexity of the problem. We also consider a new relevant margin condition to obtain problem-independent convergence rates, ending up with intermediate convergence rates that interpolate between the aforementioned slow and fast rates. |
Tasks | Multi-Armed Bandits |
Published | 2018-10-11 |
URL | https://arxiv.org/abs/1810.05065v2 |
PDF | https://arxiv.org/pdf/1810.05065v2.pdf |
PWC | https://paperswithcode.com/paper/regularized-contextual-bandits |
Repo | |
Framework | |
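The binning idea can be sketched directly: discretize the context, keep per-bin empirical means, and in each bin play the distribution solving max over q of <q, mu> - lambda * KL(q || baseline), whose closed form is proportional to baseline * exp(mu / lambda). The paper's estimators, exploration schedule, and margin-condition analysis are not reproduced; this is only a minimal illustration of regularizing toward a baseline policy per bin.

```python
# Sketch of the binning idea with KL regularization toward a baseline policy.
# Closed form used below: argmax_q <q, mu> - lam * KL(q || baseline)
#                         is q proportional to baseline * exp(mu / lam).
import numpy as np

rng = np.random.default_rng(0)
n_arms, n_bins, lam = 3, 5, 0.2
baseline = np.array([0.6, 0.2, 0.2])              # known, well-performing policy

def true_mean(context, arm):                       # unknown to the learner
    return 0.5 + 0.4 * np.sin(3 * context + arm)

counts = np.ones((n_bins, n_arms))                 # one fictitious pull per arm
sums = np.full((n_bins, n_arms), 0.5)

for t in range(5000):
    context = rng.uniform(0, 1)
    b = min(int(context * n_bins), n_bins - 1)     # which bin the context falls in
    mu_hat = sums[b] / counts[b]
    q = baseline * np.exp(mu_hat / lam)            # KL-regularized best response
    q /= q.sum()
    arm = rng.choice(n_arms, p=q)
    reward = true_mean(context, arm) + 0.1 * rng.standard_normal()
    counts[b, arm] += 1
    sums[b, arm] += reward

print("per-bin learned policies:")
for b in range(n_bins):
    mu_hat = sums[b] / counts[b]
    q = baseline * np.exp(mu_hat / lam)
    print(b, np.round(q / q.sum(), 2))
```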