Paper Group ANR 278
A Survey on Open Information Extraction
Title | A Survey on Open Information Extraction |
Authors | Christina Niklaus, Matthias Cetto, André Freitas, Siegfried Handschuh |
Abstract | We provide a detailed overview of the various approaches that were proposed to date to solve the task of Open Information Extraction. We present the major challenges that such systems face, show the evolution of the suggested approaches over time and depict the specific issues they address. In addition, we provide a critique of the commonly applied evaluation procedures for assessing the performance of Open IE systems and highlight some directions for future work. |
Tasks | Open Information Extraction |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05599v1 |
PDF | http://arxiv.org/pdf/1806.05599v1.pdf |
PWC | https://paperswithcode.com/paper/a-survey-on-open-information-extraction |
Repo | |
Framework | |
Exploring Hierarchy-Aware Inverse Reinforcement Learning
Title | Exploring Hierarchy-Aware Inverse Reinforcement Learning |
Authors | Chris Cundy, Daniel Filan |
Abstract | We introduce a new generative model for human planning under the Bayesian Inverse Reinforcement Learning (BIRL) framework which takes into account the fact that humans often plan using hierarchical strategies. We describe the Bayesian Inverse Hierarchical RL (BIHRL) algorithm for inferring the values of hierarchical planners, and use an illustrative toy model to show that BIHRL retains accuracy where standard BIRL fails. Furthermore, BIHRL is able to accurately predict the goals of ‘Wikispeedia’ game players, with inclusion of hierarchical structure in the model resulting in a large boost in accuracy. We show that BIHRL is able to significantly outperform BIRL even when we only have a weak prior on the hierarchical structure of the plans available to the agent, and discuss the significant challenges that remain for scaling up this framework to more realistic settings. |
Tasks | BIRL |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.05037v1 |
PDF | http://arxiv.org/pdf/1807.05037v1.pdf |
PWC | https://paperswithcode.com/paper/exploring-hierarchy-aware-inverse |
Repo | |
Framework | |
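The Bayesian IRL machinery that BIHRL extends can be illustrated with a minimal goal-inference sketch (this is not the authors' implementation; the 1-D dynamics and all names here are invented for illustration): a Boltzmann-rational agent walks on a line, and we score candidate goals by the likelihood of its observed moves.

```python
import math

def goal_posterior(trajectory, goals, beta=2.0):
    """Posterior over candidate goal states for a Boltzmann-rational agent
    on a 1-D line, given its observed moves (illustrative toy model)."""
    log_post = {g: 0.0 for g in goals}  # uniform prior over goals
    for s, s_next in zip(trajectory, trajectory[1:]):
        for g in goals:
            # utility of each possible move (+1 or -1) under goal g
            utils = {a: -abs(s + a - g) for a in (-1, 1)}
            z = sum(math.exp(beta * u) for u in utils.values())
            # log-likelihood of the move actually taken
            log_post[g] += beta * utils[s_next - s] - math.log(z)
    z = sum(math.exp(lp) for lp in log_post.values())
    return {g: math.exp(lp) / z for g, lp in log_post.items()}

# an agent that walks to state 5 and then hovers there
posterior = goal_posterior([0, 1, 2, 3, 4, 5, 4, 5], goals=[0, 5, 10])
```

A trajectory that advances to state 5 and hovers there concentrates the posterior on goal 5; this is the kind of evidence BIRL-style methods exploit, with BIHRL additionally modelling hierarchical sub-goals.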
Simplifying Reward Design through Divide-and-Conquer
Title | Simplifying Reward Design through Divide-and-Conquer |
Authors | Ellis Ratner, Dylan Hadfield-Menell, Anca D. Dragan |
Abstract | Designing a good reward function is essential to robot planning and reinforcement learning, but it can also be challenging and frustrating. The reward needs to work across multiple different environments, and that often requires many iterations of tuning. We introduce a novel divide-and-conquer approach that enables the designer to specify a reward separately for each environment. By treating these separate reward functions as observations about the underlying true reward, we derive an approach to infer a common reward across all environments. We conduct user studies in an abstract grid world domain and in a motion planning domain for a 7-DOF manipulator that measure user effort and solution quality. We show that our method is faster, easier to use, and produces a higher quality solution than the typical method of designing a reward jointly across all environments. We additionally conduct a series of experiments that measure the sensitivity of these results to different properties of the reward design task, such as the number of environments, the number of feasible solutions per environment, and the fraction of the total features that vary within each environment. We find that independent reward design outperforms the standard, joint, reward design process but works best when the design problem can be divided into simpler subproblems. |
Tasks | Motion Planning |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02501v1 |
PDF | http://arxiv.org/pdf/1806.02501v1.pdf |
PWC | https://paperswithcode.com/paper/simplifying-reward-design-through-divide-and |
Repo | |
Framework | |
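The divide-and-conquer idea — treating each per-environment reward as a noisy observation of the true reward — can be sketched minimally. Under an i.i.d. Gaussian noise assumption the combined maximum-likelihood reward is just the per-feature mean; the paper's actual inference is more general, so treat this only as an illustration of the setup.

```python
def combine_rewards(env_weights):
    """Combine per-environment reward weight vectors into one estimate.

    Each designer-specified weight vector is treated as a noisy observation
    of the true reward; under an i.i.d. Gaussian noise model, the combined
    maximum-likelihood reward is the per-feature mean.  (Illustrative only;
    the paper derives a more general inference procedure.)
    """
    n = len(env_weights)
    return [sum(w[i] for w in env_weights) / n
            for i in range(len(env_weights[0]))]

# weights elicited separately in three environments (two reward features)
combined = combine_rewards([[1.0, 0.2], [0.8, 0.4], [1.2, 0.3]])
```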
Handwritten Isolated Bangla Compound Character Recognition: a new benchmark using a novel deep learning approach
Title | Handwritten Isolated Bangla Compound Character Recognition: a new benchmark using a novel deep learning approach |
Authors | Saikat Roy, Nibaran Das, Mahantapas Kundu, Mita Nasipuri |
Abstract | In this work, a novel deep learning technique for the recognition of handwritten Bangla isolated compound characters is presented and a new benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported. Greedy layer-wise training of deep neural networks has helped to make significant strides in various pattern recognition problems. We apply layer-wise training to Deep Convolutional Neural Networks (DCNN) in a supervised fashion and augment the training process with the RMSProp algorithm to achieve faster convergence. We compare results with those obtained from standard shallow learning methods with predefined features, as well as standard DCNNs. Supervised layer-wise trained DCNNs are found to outperform standard shallow learning models such as Support Vector Machines as well as regular DCNNs of similar architecture by achieving an error rate of 9.67%, thereby setting a new benchmark on CMATERdb 3.1.3.3 with a recognition accuracy of 90.33%, representing an improvement of nearly 10%. |
Tasks | |
Published | 2018-02-02 |
URL | http://arxiv.org/abs/1802.00671v1 |
PDF | http://arxiv.org/pdf/1802.00671v1.pdf |
PWC | https://paperswithcode.com/paper/handwritten-isolated-bangla-compound |
Repo | |
Framework | |
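The RMSProp update used above to speed up convergence is standard and easy to state: each parameter's step is scaled by a running root-mean-square of its own gradients. A minimal sketch (not the paper's DCNN training code), shown converging on a toy quadratic:

```python
def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp update: each parameter's step is scaled by a running
    root-mean-square of its own gradients."""
    cache = [decay * c + (1 - decay) * g * g for c, g in zip(cache, grad)]
    w = [wi - lr * g / (c ** 0.5 + eps) for wi, g, c in zip(w, grad, cache)]
    return w, cache

# minimise f(w) = (w0 - 3)^2 + (w1 + 1)^2 starting from the origin
w, cache = [0.0, 0.0], [0.0, 0.0]
for _ in range(2000):
    grad = [2 * (w[0] - 3), 2 * (w[1] + 1)]
    w, cache = rmsprop_step(w, grad, cache)
```

Because the step is normalized per-parameter, RMSProp makes similar progress along both coordinates regardless of their gradient scales, which is what yields the faster convergence the abstract refers to.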
Representation Learning for Resource Usage Prediction
Title | Representation Learning for Resource Usage Prediction |
Authors | Florian Schmidt, Mathias Niepert, Felipe Huici |
Abstract | Creating a model of a computer system that can be used for tasks such as predicting future resource usage and detecting anomalies is a challenging problem. Most current systems rely on heuristics and overly simplistic assumptions about the workloads and system statistics. These heuristics are typically a one-size-fits-all solution so as to be applicable in a wide range of applications and systems environments. With this paper, we present our ongoing work on integrating systems telemetry, ranging from standard resource usage statistics to kernel and library calls of applications, into a machine learning model. Intuitively, such an ML model approximates, at any point in time, the state of a system and allows us to solve tasks such as resource usage prediction and anomaly detection. To achieve this goal, we leverage readily-available information that does not require any changes to the applications run on the system. We train recurrent neural networks to learn a model of the system under consideration. As a proof of concept, we train models specifically to predict future resource usage of running applications. |
Tasks | Anomaly Detection, Representation Learning |
Published | 2018-02-02 |
URL | http://arxiv.org/abs/1802.00673v1 |
PDF | http://arxiv.org/pdf/1802.00673v1.pdf |
PWC | https://paperswithcode.com/paper/representation-learning-for-resource-usage |
Repo | |
Framework | |
The Diagrammatic AI Language (DIAL): Version 0.1
Title | The Diagrammatic AI Language (DIAL): Version 0.1 |
Authors | Guy Marshall, André Freitas |
Abstract | Currently, there is no consistent model for visually or formally representing the architecture of AI systems. This lack of representation brings interpretability, correctness and completeness challenges in the description of existing models and systems. DIAL (The Diagrammatic AI Language) has been created with the aspiration of being an “engineering schematic” for AI Systems. It is presented here as a starting point for a community dialogue towards a common diagrammatic language for AI Systems. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.11142v1 |
PDF | http://arxiv.org/pdf/1812.11142v1.pdf |
PWC | https://paperswithcode.com/paper/the-diagrammatic-ai-language-dial-version-01 |
Repo | |
Framework | |
Cell Identity Codes: Understanding Cell Identity from Gene Expression Profiles using Deep Neural Networks
Title | Cell Identity Codes: Understanding Cell Identity from Gene Expression Profiles using Deep Neural Networks |
Authors | Farzad Abdolhosseini, Behrooz Azarkhalili, Abbas Maazallahi, Aryan Kamal, Seyed Abolfazl Motahari, Ali Sharifi-Zarchi, Hamidreza Chitsaz |
Abstract | Understanding cell identity is an important task in many biomedical areas. Expression patterns of specific marker genes have been used to characterize some limited cell types, but exclusive markers are not available for many cell types. A second approach is to use machine learning to discriminate cell types based on the whole gene expression profiles (GEPs). The accuracies of simple classification algorithms such as linear discriminators or support vector machines are limited due to the complexity of biological systems. We used deep neural networks to analyze 1040 GEPs from 16 different human tissues and cell types. After comparing different architectures, we identified a specific structure of deep autoencoders that can encode a GEP into a vector of 30 numeric values, which we call the cell identity code (CIC). The original GEP can be reproduced from the CIC with an accuracy comparable to technical replicates of the same experiment. Although we use an unsupervised approach to train the autoencoder, we show that different values of the CIC are connected to different biological aspects of the cell, such as different pathways or biological processes. This network can use the CIC to reproduce the GEP of cell types it has never seen during training. It can also tolerate some noise in the measurement of the GEP. Furthermore, we introduce the classifier autoencoder, an architecture that can accurately identify cell type based on the GEP or the CIC. |
Tasks | |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.04863v1 |
PDF | http://arxiv.org/pdf/1806.04863v1.pdf |
PWC | https://paperswithcode.com/paper/cell-identity-codes-understanding-cell |
Repo | |
Framework | |
SlideNet: Fast and Accurate Slide Quality Assessment Based on Deep Neural Networks
Title | SlideNet: Fast and Accurate Slide Quality Assessment Based on Deep Neural Networks |
Authors | Teng Zhang, Johanna Carvajal, Daniel F. Smith, Kun Zhao, Arnold Wiliem, Peter Hobson, Anthony Jennings, Brian C. Lovell |
Abstract | This work tackles the automatic fine-grained slide quality assessment problem for digitized direct smear tests using the Gram staining protocol. Automatic quality assessment can provide useful information for the pathologists and the whole digital pathology workflow. For instance, if the system finds a slide to have low staining quality, it could send a request to the automatic slide preparation system to remake the slide. If the system detects severe damage in the slides, it could notify the experts that manual microscope reading may be required. In order to address the quality assessment problem, we propose a deep neural network based framework to automatically assess the slide quality in a semantic way. Specifically, the first step of our framework is to perform dense fine-grained region classification on the whole slide and calculate the region distribution histogram. Next, our framework will generate assessments of the slide quality from various perspectives: staining quality, information density, damage level and which regions are more valuable for subsequent high-magnification analysis. To make the information more accessible, we present our results in the form of a heat map and text summaries. Additionally, in order to stimulate research in this direction, we propose a novel dataset for slide quality assessment. Experiments show that the proposed framework outperforms recent related works. |
Tasks | |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07240v1 |
PDF | http://arxiv.org/pdf/1803.07240v1.pdf |
PWC | https://paperswithcode.com/paper/slidenet-fast-and-accurate-slide-quality |
Repo | |
Framework | |
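The first stage described above — turning dense region classifications into a region distribution histogram and then into human-readable assessments — can be sketched as follows. The region labels and thresholds here are hypothetical: SlideNet derives its assessments from learned models, not fixed rules.

```python
from collections import Counter

def slide_summary(region_labels):
    """Build a region-distribution histogram from dense per-region labels
    and derive a coarse quality summary.  (Label names and thresholds are
    hypothetical stand-ins for SlideNet's learned assessments.)"""
    hist = Counter(region_labels)
    total = len(region_labels)
    frac = {label: count / total for label, count in hist.items()}
    notes = []
    if frac.get("damaged", 0.0) > 0.2:
        notes.append("severe damage: manual microscope reading recommended")
    if frac.get("weak_stain", 0.0) > 0.3:
        notes.append("low staining quality: consider remaking the slide")
    if not notes:
        notes.append("slide acceptable for high-magnification analysis")
    return frac, notes

# dense region labels for a toy slide: 6 tissue, 3 damaged, 1 weakly stained
labels = ["tissue"] * 6 + ["damaged"] * 3 + ["weak_stain"] * 1
frac, notes = slide_summary(labels)
```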
How did the discussion go: Discourse act classification in social media conversations
Title | How did the discussion go: Discourse act classification in social media conversations |
Authors | Subhabrata Dutta, Tanmoy Chakraborty, Dipankar Das |
Abstract | We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussion using textual meanings beyond sentence level. The task is unique in its complete categorization of the possible pragmatic roles in informal textual discussions, in contrast to role-specific tasks such as question-answer extraction, stance detection, or sarcasm identification. An early attempt was made on a Reddit discussion dataset. We train our model on the same data, and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperformed the previous one in terms of domain independence; without using platform-dependent structural features, our hierarchical LSTM with word relevance attention mechanism achieved F1-scores of 71% and 66%, respectively, in predicting the discourse roles of comments in Reddit and Facebook discussions. We also present and analyze the efficiency of recurrent and convolutional architectures at learning discursive representations on the same task, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into the relevance ordering of text segments according to their roles in discourse. We present a human annotator experiment to unveil important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we inquire into how heterogeneous non-textual features like location, time, leaning of information etc. play their roles in characterizing online discussions on Facebook. |
Tasks | Stance Detection |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02290v1 |
PDF | http://arxiv.org/pdf/1808.02290v1.pdf |
PWC | https://paperswithcode.com/paper/how-did-the-discussion-go-discourse-act |
Repo | |
Framework | |
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
Title | Learning Longer-term Dependencies in RNNs with Auxiliary Losses |
Authors | Trieu H. Trinh, Andrew M. Dai, Minh-Thang Luong, Quoc V. Le |
Abstract | Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability to capture long-term dependencies in RNNs by adding an unsupervised auxiliary loss to the original objective. This auxiliary loss forces RNNs to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full BPTT. We evaluate our method on a variety of settings, including pixel-by-pixel image classification with sequence lengths up to 16,000, and a real document classification benchmark. Our results highlight good performance and resource efficiency of this approach over competitive baselines, including other recurrent models and a comparably sized Transformer. Further analyses reveal beneficial effects of the auxiliary loss on optimization and regularization, as well as extreme cases where there is little to no backpropagation. |
Tasks | Document Classification, Image Classification |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00144v3 |
PDF | http://arxiv.org/pdf/1803.00144v3.pdf |
PWC | https://paperswithcode.com/paper/learning-longer-term-dependencies-in-rnns |
Repo | |
Framework | |
Taylor’s law for Human Linguistic Sequences
Title | Taylor’s law for Human Linguistic Sequences |
Authors | Tatsuru Kobayashi, Kumiko Tanaka-Ishii |
Abstract | Taylor’s law describes the fluctuation characteristics underlying a system in which the variance of an event within a time span grows by a power law with respect to the mean. Although Taylor’s law has been applied in many natural and social systems, its application to language has been scarce. This article describes a new quantification of Taylor’s law in natural language and reports an analysis of over 1100 texts across 14 languages. The Taylor exponents of written natural language texts were found to exhibit almost the same value. The exponent was also compared for other language-related data, such as child-directed speech, music, and programming language code. The results show how the Taylor exponent serves to quantify the fundamental structural complexity underlying linguistic time series. The article also shows the applicability of these findings in evaluating language models. |
Tasks | Time Series |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07893v2 |
PDF | http://arxiv.org/pdf/1804.07893v2.pdf |
PWC | https://paperswithcode.com/paper/taylors-law-for-human-linguistic-sequences |
Repo | |
Framework | |
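A Taylor exponent of the kind the paper measures can be estimated by segmenting a text, computing each word's mean and variance of counts across segments, and fitting a slope in log-log space. A minimal sketch (the paper's exact segmentation and fitting protocol may differ):

```python
import math

def taylor_exponent(segment_counts):
    """Fit the Taylor exponent alpha in  var(c_w) ~ mean(c_w)^alpha  over
    per-segment word counts, via least squares in log-log space."""
    n_seg = len(segment_counts)
    xs, ys = [], []
    for word in segment_counts[0]:
        counts = [seg.get(word, 0) for seg in segment_counts]
        mean = sum(counts) / n_seg
        var = sum((c - mean) ** 2 for c in counts) / n_seg
        if mean > 0 and var > 0:
            xs.append(math.log(mean))
            ys.append(math.log(var))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den  # slope = Taylor exponent

# synthetic segments whose counts scale together: var is proportional to
# mean^2 by construction, so the fitted exponent is exactly 2
segments = [{"the": 10 * s, "of": 4 * s, "language": s} for s in (1, 2, 3, 4)]
alpha = taylor_exponent(segments)
```

In the synthetic example all word counts rise and fall together across segments, the extreme of the correlated behaviour the exponent quantifies; independent Poisson-like fluctuations would instead give an exponent near 1.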
Machine Learning Accelerated Likelihood-Free Event Reconstruction in Dark Matter Direct Detection
Title | Machine Learning Accelerated Likelihood-Free Event Reconstruction in Dark Matter Direct Detection |
Authors | U. Simola, B. Pelssers, D. Barge, J. Conrad, J. Corander |
Abstract | Reconstructing the position of an interaction for any dual-phase time projection chamber (TPC) with the best precision is key to directly detecting Dark Matter. Using the likelihood-free framework, a new algorithm to reconstruct the 2-D (x, y) position and the size of the charge signal (e) of an interaction is presented. The algorithm uses the charge signal (S2) light distribution obtained by simulating events using a waveform generator. To deal with the computational effort required by the likelihood-free approach, we employ the Bayesian Optimization for Likelihood-Free Inference (BOLFI) algorithm. Together with BOLFI, prior distributions for the parameters of interest (x, y, e) and highly informative discrepancy measures to perform the analyses are introduced. We evaluate the quality of the proposed algorithm by a comparison against the currently existing alternative methods using a large-scale simulation study. BOLFI provides a natural probabilistic uncertainty measure for the reconstruction, and it improved the accuracy of the reconstruction over the next best algorithm by up to 15% when focusing on events at large radii (R > 30 cm, the outer 37% of the detector). In addition, BOLFI provides the smallest uncertainties among all the tested methods. |
Tasks | |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09930v3 |
PDF | http://arxiv.org/pdf/1810.09930v3.pdf |
PWC | https://paperswithcode.com/paper/machine-learning-accelerated-likelihood-free |
Repo | |
Framework | |
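The likelihood-free reconstruction idea can be illustrated with plain rejection ABC on a toy detector: sample positions from the prior, simulate an S2-like light pattern, and keep the samples whose simulated pattern best matches the observation. BOLFI replaces this brute-force loop with a Gaussian-process surrogate and Bayesian optimization, so this shows only the underlying principle; the sensor layout and light model below are invented.

```python
import random

def simulate(x, y, sensors):
    """Toy S2 light pattern: intensity falls off with squared distance."""
    return [1.0 / ((x - sx) ** 2 + (y - sy) ** 2 + 0.1) for sx, sy in sensors]

def abc_reconstruct(observed, sensors, n_draws=5000, keep=50, seed=0):
    """Likelihood-free (x, y) reconstruction via rejection ABC: keep the
    prior draws whose simulated light patterns best match the observation,
    and report their mean as the position estimate."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)  # uniform prior
        sim = simulate(x, y, sensors)
        disc = sum((a - b) ** 2 for a, b in zip(sim, observed))
        draws.append((disc, x, y))
    best = sorted(draws)[:keep]  # smallest discrepancy first
    return (sum(x for _, x, _ in best) / keep,
            sum(y for _, _, y in best) / keep)

sensors = [(-1, -1), (-1, 1), (1, -1), (1, 1), (0, 0)]
observed = simulate(0.3, -0.2, sensors)  # synthetic "event" at (0.3, -0.2)
x_hat, y_hat = abc_reconstruct(observed, sensors)
```

The spread of the retained samples also gives the kind of probabilistic uncertainty measure the abstract highlights as a benefit of the likelihood-free approach.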
Counterfactual Normalization: Proactively Addressing Dataset Shift and Improving Reliability Using Causal Mechanisms
Title | Counterfactual Normalization: Proactively Addressing Dataset Shift and Improving Reliability Using Causal Mechanisms |
Authors | Adarsh Subbaswamy, Suchi Saria |
Abstract | Predictive models can fail to generalize from training to deployment environments because of dataset shift, posing a threat to model reliability and the safety of downstream decisions made in practice. Instead of using samples from the target distribution to reactively correct dataset shift, we use graphical knowledge of the causal mechanisms relating variables in a prediction problem to proactively remove relationships that do not generalize across environments, even when these relationships may depend on unobserved variables (violations of the “no unobserved confounders” assumption). To accomplish this, we identify variables with unstable paths of statistical influence and remove them from the model. We also augment the causal graph with latent counterfactual variables that isolate unstable paths of statistical influence, allowing us to retain stable paths that would otherwise be removed. Our experiments demonstrate that models that remove vulnerable variables and use estimates of the latent variables transfer better, often outperforming in the target domain despite some accuracy loss in the training domain. |
Tasks | |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03253v1 |
PDF | http://arxiv.org/pdf/1808.03253v1.pdf |
PWC | https://paperswithcode.com/paper/counterfactual-normalization-proactively |
Repo | |
Framework | |
Multi-Cycle Assignment Problems with Rotational Diversity
Title | Multi-Cycle Assignment Problems with Rotational Diversity |
Authors | Helge Spieker, Arnaud Gotlieb, Morten Mossige |
Abstract | Multi-cycle assignment problems address scenarios where a series of general assignment problems has to be solved sequentially. Subsequent cycles can differ from previous ones due to changing availability or creation of tasks and agents, which makes an upfront static schedule infeasible and introduces uncertainty into the task-agent assignment process. We consider the setting where, besides profit maximization, it is also desired to maintain diverse assignments for tasks and agents, such that all tasks have been assigned to all agents over subsequent cycles. This problem of multi-cycle assignment with rotational diversity is approached via two sub-problems: the outer problem augments the original profit-maximization objective with additional information about the state of rotational diversity, while the inner problem solves the adjusted general assignment problem in a single execution of the model. We discuss strategies to augment the profit values and evaluate them experimentally. The method’s efficacy is shown in three case studies: multi-cycle variants of the multiple knapsack and the multiple subset sum problems, and a real-world case study on the test case selection and assignment problem from the software engineering domain. |
Tasks | |
Published | 2018-11-08 |
URL | https://arxiv.org/abs/1811.03496v2 |
PDF | https://arxiv.org/pdf/1811.03496v2.pdf |
PWC | https://paperswithcode.com/paper/rotational-diversity-in-multi-cycle |
Repo | |
Framework | |
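The outer/inner decomposition can be sketched on a toy instance: the outer step augments each task-agent profit with a bonus for pairs not yet covered, and the inner step solves the adjusted assignment problem exactly (here by brute force over permutations). The additive-bonus scheme is a simple stand-in for the augmentation strategies the paper actually evaluates.

```python
from itertools import permutations

def diverse_assignment(profit, covered, bonus=100.0):
    """One cycle: choose the task->agent assignment maximising profit plus
    a bonus for task-agent pairs not yet covered.  (Brute-force inner
    solver; the bonus is a simple stand-in for the paper's strategies.)"""
    n = len(profit)
    def value(perm):
        return sum(profit[t][perm[t]]
                   + (bonus if (t, perm[t]) not in covered else 0.0)
                   for t in range(n))
    best = max(permutations(range(n)), key=value)
    covered.update((t, best[t]) for t in range(n))
    return best

# 3 tasks x 3 agents; covered tracks the rotational-diversity state
profit = [[3.0, 1.0, 2.0], [2.0, 3.0, 1.0], [1.0, 2.0, 3.0]]
covered = set()
schedule = [diverse_assignment(profit, covered) for _ in range(3)]
```

With a bonus that dominates the base profits, three cycles on this 3×3 instance rotate through all nine task-agent pairs while still breaking ties by profit.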
Flexible and Scalable Deep Learning with MMLSpark
Title | Flexible and Scalable Deep Learning with MMLSpark |
Authors | Mark Hamilton, Sudarshan Raghunathan, Akshaya Annavajhala, Danil Kirsanov, Eduardo de Leon, Eli Barzilay, Ilya Matiach, Joe Davison, Maureen Busch, Miruna Oprescu, Ratan Sur, Roope Astala, Tong Wen, ChangYoung Park |
Abstract | In this work we detail a novel open source library, called MMLSpark, that combines the flexible deep learning library Cognitive Toolkit with the distributed computing framework Apache Spark. To achieve this, we have contributed Java language bindings to the Cognitive Toolkit, and added several new components to the Spark ecosystem. In addition, we integrate the popular image processing library OpenCV with Spark, and present a tool for the automated generation of PySpark wrappers from any SparkML estimator, which we use to expose all of this work to the PySpark ecosystem. Finally, we provide a large library of tools for working and developing within the Spark ecosystem. We apply this work to the automated classification of snow leopards from camera trap images, and provide an end-to-end solution for the non-profit conservation organization, the Snow Leopard Trust. |
Tasks | |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.04031v1 |
PDF | http://arxiv.org/pdf/1804.04031v1.pdf |
PWC | https://paperswithcode.com/paper/flexible-and-scalable-deep-learning-with |
Repo | |
Framework | |