Paper Group ANR 230
Uncovering Sociological Effect Heterogeneity using Machine Learning. Unsupervised Doodling and Painting with Improved SPIRAL. Active Learning within Constrained Environments through Imitation of an Expert Questioner. Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering. In Plain Sight: Media Bias Through the Lens of Factual Reporting …
Uncovering Sociological Effect Heterogeneity using Machine Learning
Title | Uncovering Sociological Effect Heterogeneity using Machine Learning |
Authors | Jennie E. Brand, Jiahui Xu, Bernard Koch, Pablo Geraldo |
Abstract | Individuals do not respond uniformly to treatments, events, or interventions. Sociologists routinely partition samples into subgroups to explore how the effects of treatments vary by covariates like race, gender, and socioeconomic status. In so doing, analysts determine the key subpopulations based on theoretical priors. Data-driven discoveries are also routine, yet the analyses by which sociologists typically go about them are problematic and seldom move us beyond our expectations, and biases, to explore new meaningful subgroups. Emerging machine learning methods allow researchers to explore sources of variation that they may not have previously considered, or envisaged. In this paper, we use causal trees to recursively partition the sample and uncover sources of treatment effect heterogeneity. We use honest estimation, splitting the sample into a training sample to grow the tree and an estimation sample to estimate leaf-specific effects. Assessing a central topic in the social inequality literature, college effects on wages, we compare what we learn from conventional approaches for exploring variation in effects to causal trees. Given our use of observational data, we use leaf-specific matching and sensitivity analyses to address confounding and offer interpretations of effects based on observed and unobserved heterogeneity. We encourage researchers to follow similar practices in their work on variation in sociological effects. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.09138v1 |
https://arxiv.org/pdf/1909.09138v1.pdf | |
PWC | https://paperswithcode.com/paper/uncovering-sociological-effect-heterogeneity |
Repo | |
Framework | |
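A minimal sketch of the honest-estimation idea described in the abstract: one half of the sample grows the tree, the other half estimates leaf-specific treatment effects. A plain regression tree stands in for the causal-tree splitting criterion, and all data, column meanings, and hyperparameters are hypothetical.

```python
# Hedged sketch of "honest" leaf-specific effect estimation, not the authors' code.
# A plain regression tree stands in for the causal tree; the data are simulated.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 5))            # covariates (e.g., family background, test scores)
treat = rng.integers(0, 2, size=n)     # 1 = completed college, 0 = did not
y = X[:, 0] + treat * (1.0 + X[:, 1]) + rng.normal(size=n)  # wages with a heterogeneous effect

# Honest split: one subsample grows the tree, the other estimates leaf effects.
idx_train, idx_est = train_test_split(np.arange(n), test_size=0.5, random_state=0)

tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=100)
tree.fit(X[idx_train], y[idx_train])   # partition structure learned on the training half

leaves = tree.apply(X[idx_est])        # leaf assignment for the estimation half
for leaf in np.unique(leaves):
    in_leaf = leaves == leaf
    t, c = treat[idx_est] == 1, treat[idx_est] == 0
    effect = y[idx_est][in_leaf & t].mean() - y[idx_est][in_leaf & c].mean()
    print(f"leaf {leaf}: estimated effect {effect:.2f}")
```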
Unsupervised Doodling and Painting with Improved SPIRAL
Title | Unsupervised Doodling and Painting with Improved SPIRAL |
Authors | John F. J. Mellor, Eunbyung Park, Yaroslav Ganin, Igor Babuschkin, Tejas Kulkarni, Dan Rosenbaum, Andy Ballard, Theophane Weber, Oriol Vinyals, S. M. Ali Eslami |
Abstract | We investigate using reinforcement learning agents as generative models of images (extending arXiv:1804.01118). A generative agent controls a simulated painting environment, and is trained with rewards provided by a discriminator network simultaneously trained to assess the realism of the agent’s samples, either unconditional or reconstructions. Compared to prior work, we make a number of improvements to the architectures of the agents and discriminators that lead to intriguing and at times surprising results. We find that when sufficiently constrained, generative agents can learn to produce images with a degree of visual abstraction, despite having only ever seen real photographs (no human brush strokes). And given enough time with the painting environment, they can produce images with considerable realism. These results show that, under the right circumstances, some aspects of human drawing can emerge from simulated embodiment, without the need for external supervision, imitation or social cues. Finally, we note the framework’s potential for use in creative applications. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01007v1 |
https://arxiv.org/pdf/1910.01007v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-doodling-and-painting-with |
Repo | |
Framework | |
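A hedged sketch of the discriminator-as-reward setup the abstract describes: a discriminator scores the agent's canvas, and that score becomes the episode reward while the discriminator itself is trained against real images. The painting environment, network sizes, and image shapes below are toy stand-ins, not the SPIRAL implementation.

```python
# Toy illustration of reward-from-a-discriminator; not the paper's architecture.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 16 * 16, 1))

    def forward(self, x):
        return self.net(x)   # realism logit, used as the agent's episode reward

disc = Discriminator()
canvas = torch.zeros(1, 1, 64, 64)   # rendering produced by the painting agent
real = torch.rand(1, 1, 64, 64)      # stand-in for a batch of real photographs

# The agent would be trained with RL to maximize disc(canvas); the discriminator is
# trained simultaneously to separate real images from agent renderings.
reward = disc(canvas).item()
loss_d = nn.functional.binary_cross_entropy_with_logits(disc(real), torch.ones(1, 1)) \
    + nn.functional.binary_cross_entropy_with_logits(disc(canvas), torch.zeros(1, 1))
print(reward, loss_d.item())
```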
Active Learning within Constrained Environments through Imitation of an Expert Questioner
Title | Active Learning within Constrained Environments through Imitation of an Expert Questioner |
Authors | Kalesha Bullard, Yannick Schroecker, Sonia Chernova |
Abstract | Active learning agents typically employ a query selection algorithm which solely considers the agent’s learning objectives. However, this may be insufficient in more realistic human domains. This work uses imitation learning to enable an agent in a constrained environment to concurrently reason about both its internal learning goals and environmental constraints externally imposed, all within its objective function. Experiments are conducted on a concept learning task to test generalization of the proposed algorithm to different environmental conditions and analyze how time and resource constraints impact efficacy of solving the learning problem. Our findings show the environmentally-aware learning agent is able to statistically outperform all other active learners explored under most of the constrained conditions. A key implication is adaptation for active learning agents to more realistic human environments, where constraints are often externally imposed on the learner. |
Tasks | Active Learning, Imitation Learning |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00921v1 |
https://arxiv.org/pdf/1907.00921v1.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-within-constrained |
Repo | |
Framework | |
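A minimal sketch of constraint-aware query selection in the spirit of the abstract: an uncertainty score is traded off against an externally imposed per-query cost. The cost model, trade-off weight, and data are assumptions; the paper instead learns this behaviour by imitating an expert questioner.

```python
# Hedged sketch: uncertainty sampling penalized by a per-query cost (not the learned policy).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_lab, y_lab = rng.normal(size=(20, 4)), rng.integers(0, 2, size=20)
X_pool = rng.normal(size=(200, 4))
query_cost = rng.uniform(1.0, 5.0, size=200)   # e.g., time or resource cost of each question

clf = LogisticRegression().fit(X_lab, y_lab)
uncertainty = 1.0 - clf.predict_proba(X_pool).max(axis=1)   # least-confident sampling

budget_weight = 0.1                            # trade-off imposed by the constrained environment
score = uncertainty - budget_weight * query_cost
best = int(np.argmax(score))
print("query instance", best, "with cost", round(query_cost[best], 2))
```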
Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering
Title | Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering |
Authors | Sewon Min, Danqi Chen, Luke Zettlemoyer, Hannaneh Hajishirzi |
Abstract | This paper presents a general approach for open-domain question answering (QA) that models interactions between paragraphs using structural information from a knowledge base. We first describe how to construct a graph of passages from a large corpus, where the relations are either from the knowledge base or the internal structure of Wikipedia. We then introduce a reading comprehension model which takes this graph as an input, to better model relationships across pairs of paragraphs. This approach consistently outperforms competitive baselines in three open-domain QA datasets, WebQuestions, Natural Questions and TriviaQA, improving the pipeline-based state-of-the-art by 3–13%. |
Tasks | Open-Domain Question Answering, Question Answering, Reading Comprehension |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03868v1 |
https://arxiv.org/pdf/1911.03868v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-guided-text-retrieval-and-reading |
Repo | |
Framework | |
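A hedged sketch of the passage-graph construction the abstract outlines: nodes are retrieved passages and edges come either from knowledge-base relations or from Wikipedia's internal structure. The passages and relation labels below are illustrative placeholders.

```python
# Illustrative passage graph; a graph-aware reader would then score answer spans
# while modelling relationships across connected passage pairs.
import networkx as nx

passages = {
    "p1": "Marie Curie was born in Warsaw ...",
    "p2": "Warsaw is the capital of Poland ...",
    "p3": "The Nobel Prize in Physics 1903 ...",
}

graph = nx.Graph()
graph.add_nodes_from(passages)
graph.add_edge("p1", "p2", relation="born_in")         # relation taken from a knowledge base
graph.add_edge("p1", "p3", relation="wiki_hyperlink")  # internal Wikipedia structure

for u, v, data in graph.edges(data=True):
    print(u, "--", data["relation"], "->", v)
```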
In Plain Sight: Media Bias Through the Lens of Factual Reporting
Title | In Plain Sight: Media Bias Through the Lens of Factual Reporting |
Authors | Lisa Fan, Marshall White, Eva Sharma, Ruisi Su, Prafulla Kumar Choubey, Ruihong Huang, Lu Wang |
Abstract | The increasing prevalence of political bias in news media calls for greater public awareness of it, as well as robust methods for its detection. While prior work in NLP has primarily focused on the lexical bias captured by linguistic attributes such as word choice and syntax, other types of bias stem from the actual content selected for inclusion in the text. In this work, we investigate the effects of informational bias: factual content that can nevertheless be deployed to sway reader opinion. We first produce a new dataset, BASIL, of 300 news articles annotated with 1,727 bias spans and find evidence that informational bias appears in news articles more frequently than lexical bias. We further study our annotations to observe how informational bias surfaces in news articles by different media outlets. Lastly, a baseline model for informational bias prediction is presented by fine-tuning BERT on our labeled data, indicating the challenges of the task and future directions. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02670v1 |
https://arxiv.org/pdf/1909.02670v1.pdf | |
PWC | https://paperswithcode.com/paper/in-plain-sight-media-bias-through-the-lens-of |
Repo | |
Framework | |
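A minimal sketch of the BERT fine-tuning baseline mentioned in the abstract, framed here as sentence-level bias classification. The example sentence and the binary label scheme are illustrative assumptions; the actual BASIL annotations are bias spans within articles.

```python
# Hedged sketch of fine-tuning BERT for informational-bias prediction (one toy step).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentence = "The senator, who has repeatedly dodged questions, announced a new plan."
batch = tokenizer(sentence, return_tensors="pt")
labels = torch.tensor([1])                  # 1 = informational bias, 0 = neutral (assumed scheme)

outputs = model(**batch, labels=labels)
outputs.loss.backward()                     # an optimizer.step() would complete the update
print(outputs.logits)
```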
A Weak Supervision Approach to Detecting Visual Anomalies for Automated Testing of Graphics Units
Title | A Weak Supervision Approach to Detecting Visual Anomalies for Automated Testing of Graphics Units |
Authors | Adi Szeskin, Lev Faivishevsky, Ashwin K Muppalla, Amitai Armon, Tom Hope |
Abstract | We present a deep learning system for testing graphics units by detecting novel visual corruptions in videos. Unlike previous work in which manual tagging was required to collect labeled training data, our weak supervision method is fully automatic and needs no human labelling. This is achieved by reproducing driver bugs that increase the probability of generating corruptions, and by making use of ideas and methods from the Multiple Instance Learning (MIL) setting. In our experiments, we significantly outperform unsupervised methods such as GAN-based models and discover novel corruptions undetected by baselines, while adhering to strict requirements on accuracy and efficiency of our real-time system. |
Tasks | Multiple Instance Learning |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04138v1 |
https://arxiv.org/pdf/1912.04138v1.pdf | |
PWC | https://paperswithcode.com/paper/a-weak-supervision-approach-to-detecting |
Repo | |
Framework | |
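A hedged sketch of the Multiple Instance Learning framing implied by the abstract: a video is a bag of frames, the bag label comes from whether a corruption-inducing driver bug was reproduced, and a per-frame scorer is pooled over the bag. The tiny CNN and data here are stand-ins for the paper's real-time system.

```python
# Toy MIL with max-pooling over per-frame corruption scores; not the production model.
import torch
import torch.nn as nn

frame_scorer = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))

video = torch.rand(30, 3, 64, 64)        # 30 frames from one recorded video (one bag)
bag_label = torch.tensor([1.0])          # 1 = recorded with the bug-inducing driver

frame_scores = frame_scorer(video)       # (30, 1) per-frame corruption scores
bag_score = frame_scores.max()           # MIL pooling: a bag is positive if any frame is
loss = nn.functional.binary_cross_entropy_with_logits(bag_score.view(1), bag_label)
loss.backward()
print(float(bag_score))
```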
Multiple Anchor Learning for Visual Object Detection
Title | Multiple Anchor Learning for Visual Object Detection |
Authors | Wei Ke, Tianliang Zhang, Zeyi Huang, Qixiang Ye, Jianzhuang Liu, Dong Huang |
Abstract | Classification and localization are two pillars of visual object detectors. However, in CNN-based detectors, these two modules are usually optimized under a fixed set of candidate (or anchor) bounding boxes. This configuration significantly limits the possibility to jointly optimize classification and localization. In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector. Our approach, referred to as Multiple Anchor Learning (MAL), constructs anchor bags and selects the most representative anchors from each bag. Such an iterative selection process is potentially NP-hard to optimize. To address this issue, we solve MAL by repetitively depressing the confidence of selected anchors by perturbing their corresponding features. In an adversarial selection-depression manner, MAL not only pursues optimal solutions but also fully leverages multiple anchors/features to learn a detection model. Experiments show that MAL improves the baseline RetinaNet with significant margins on the commonly used MS-COCO object detection benchmark and achieves new state-of-the-art detection performance compared with recent methods. |
Tasks | Multiple Instance Learning, Object Detection |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02252v1 |
https://arxiv.org/pdf/1912.02252v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-anchor-learning-for-visual-object |
Repo | |
Framework | |
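A very rough sketch of the selection-depression loop described in the abstract: the highest-scoring anchor in a bag is selected, then its confidence is depressed so that other anchors also get optimized. Random scores replace a real detector, and the depression factor is an arbitrary assumption rather than MAL's feature perturbation.

```python
# Toy selection-depression over one anchor bag; illustrative only.
import numpy as np

rng = np.random.default_rng(2)
anchor_scores = rng.uniform(size=(1, 9))   # one object's bag with 9 candidate anchors

selected_history = []
for step in range(3):
    best = int(np.argmax(anchor_scores[0]))
    selected_history.append(best)
    # Depress the selected anchor's confidence so other anchors in the bag are also
    # explored, standing in for the adversarial feature perturbation used in MAL.
    anchor_scores[0, best] *= 0.5
print("anchors selected across iterations:", selected_history)
```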
Music Auto-tagging Using CNNs and Mel-spectrograms With Reduced Frequency and Time Resolution
Title | Music Auto-tagging Using CNNs and Mel-spectrograms With Reduced Frequency and Time Resolution |
Authors | Andres Ferraro, Dmitry Bogdanov, Jay Ho Jeon, Jason Yoon, Xavier Serra |
Abstract | Automatic tagging of music is an important research topic in Music Information Retrieval that has achieved improvements with advances in deep learning. In particular, many state-of-the-art systems use Convolutional Neural Networks and operate on mel-spectrogram representations of the audio. In this paper, we compare commonly used mel-spectrogram representations and evaluate the model performance that can be achieved by reducing the input size in terms of both fewer frequency bands and lower time resolution. We use the MagnaTagaTune dataset for comprehensive performance comparisons and then compare selected configurations on the larger Million Song Dataset. The results of this study can serve researchers and practitioners in their trade-off decisions between model accuracy, data storage size, and training and inference times. |
Tasks | Information Retrieval, Music Auto-Tagging, Music Information Retrieval |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04824v2 |
https://arxiv.org/pdf/1911.04824v2.pdf | |
PWC | https://paperswithcode.com/paper/music-auto-tagging-using-cnns-and-mel |
Repo | |
Framework | |
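A minimal sketch of the input-size trade-off the paper studies: the same audio rendered as mel-spectrograms with fewer mel bands and a larger hop (lower time resolution). The synthetic tone and the specific settings below are assumptions, not the configurations evaluated in the paper.

```python
# Hedged comparison of mel-spectrogram resolutions as CNN input sizes.
import numpy as np
import librosa

sr = 16000
t = np.linspace(0, 5, 5 * sr, endpoint=False)
audio = 0.5 * np.sin(2 * np.pi * 440 * t)      # 5 s of a 440 Hz tone as dummy audio

full = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=96, hop_length=256)
reduced = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=48, hop_length=1024)

print("full resolution:", full.shape)        # (n_mels, n_frames)
print("reduced resolution:", reduced.shape)  # fewer bands, fewer frames -> smaller input
```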
Charge-Based Prison Term Prediction with Deep Gating Network
Title | Charge-Based Prison Term Prediction with Deep Gating Network |
Authors | Huajie Chen, Deng Cai, Wei Dai, Zehui Dai, Yadong Ding |
Abstract | Judgment prediction for legal cases has attracted much research effort for its practical use, the ultimate goal of which is prison term prediction. While existing work merely predicts the total prison term, in reality a defendant is often charged with multiple crimes. In this paper, we argue that charge-based prison term prediction (CPTP) not only better fits realistic needs, but also makes total prison term prediction more accurate and interpretable. We collect the first large-scale structured dataset for CPTP and evaluate several competitive baselines. Based on the observation that fine-grained feature selection is the key to achieving good performance, we propose the Deep Gating Network (DGN) for charge-specific feature selection and aggregation. Experiments show that DGN achieves state-of-the-art performance. |
Tasks | Feature Selection |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11521v1 |
https://arxiv.org/pdf/1908.11521v1.pdf | |
PWC | https://paperswithcode.com/paper/charge-based-prison-term-prediction-with-deep |
Repo | |
Framework | |
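A hedged sketch of one plausible reading of "charge-specific feature selection and aggregation": a sigmoid gate conditioned on the charge softly selects dimensions of a shared case encoding before a per-charge term regressor. All sizes and the wiring are assumptions, not the paper's DGN architecture.

```python
# Illustrative gating layer for charge-specific feature selection; not the published DGN.
import torch
import torch.nn as nn

num_charges, feat_dim = 20, 128
charge_emb = nn.Embedding(num_charges, feat_dim)
gate_layer = nn.Linear(feat_dim, feat_dim)
term_head = nn.Linear(feat_dim, 1)

case_features = torch.randn(1, feat_dim)     # encoding of the case's fact description
charge = torch.tensor([3])                   # one of the charges in the case

gate = torch.sigmoid(gate_layer(charge_emb(charge)))   # charge-specific soft selection
charge_term = term_head(gate * case_features)          # predicted term for this charge
print(float(charge_term))                    # summing over charges would give the total term
```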
MPP: Model Performance Predictor
Title | MPP: Model Performance Predictor |
Authors | Sindhu Ghanta, Sriram Subramanian, Lior Khermosh, Harshil Shah, Yakov Goldberg, Swaminathan Sundararaman, Drew Roselli, Nisha Talagala |
Abstract | Operations, involving the monitoring and management of real-time prediction quality, is a key challenge in machine learning pipeline deployments. Typically, metrics such as accuracy and RMSE are used to track the performance of models in deployment. However, these metrics cannot be calculated in production due to the absence of labels. We propose using an ML algorithm, the Model Performance Predictor (MPP), to track the performance of models in deployment. We argue that an ensemble of such metrics can be used to create a score representing prediction quality in production. This in turn facilitates the formulation and customization of ML alerts that can be escalated by an operations team to the data science team. Such a score automates monitoring and enables ML deployments at scale. |
Tasks | |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08638v1 |
http://arxiv.org/pdf/1902.08638v1.pdf | |
PWC | https://paperswithcode.com/paper/mpp-model-performance-predictor |
Repo | |
Framework | |
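A minimal sketch of the idea behind MPP: a second model is trained on held-out labeled data to predict whether the primary model's prediction is correct, so its output can proxy accuracy on unlabeled production traffic. The feature choice (the primary model's confidences) and all data are assumptions for illustration.

```python
# Hedged sketch of a model-performance predictor; not the authors' system.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=10, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_mpp, X_prod, y_mpp, y_prod = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

primary = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Train MPP on held-out labeled data: target = "was the primary model correct?"
correct = (primary.predict(X_mpp) == y_mpp).astype(int)
mpp = LogisticRegression().fit(primary.predict_proba(X_mpp), correct)

# In production there are no labels; MPP's mean predicted correctness tracks accuracy.
predicted_acc = mpp.predict_proba(primary.predict_proba(X_prod))[:, 1].mean()
true_acc = (primary.predict(X_prod) == y_prod).mean()
print(f"MPP estimate {predicted_acc:.3f} vs true accuracy {true_acc:.3f}")
```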
Understanding the efficacy, reliability and resiliency of computer vision techniques for malware detection and future research directions
Title | Understanding the efficacy, reliability and resiliency of computer vision techniques for malware detection and future research directions |
Authors | Li Chen |
Abstract | My research lies at the intersection of security and machine learning. This overview summarizes one component of my research: combining computer vision with malware exploit detection for enhanced security solutions. I will present the perspectives of efficacy, reliability and resiliency used to formulate threat detection as a computer vision problem and develop state-of-the-art image-based malware classification. Representing a malware binary as an image provides a direct visualization of data samples, reduces the effort of feature extraction, and consumes the whole binary for holistic structural analysis. Transferring deep neural networks that are effective for large-scale image classification to malware classification demonstrates superior classification efficacy compared with classical machine learning algorithms. To enhance the reliability of these vision-based malware detectors, interpretation frameworks can be constructed on the malware visual representations and used to extract faithful explanations, so that security practitioners have confidence in the model before deployment. In cyber-security applications, we should always assume that a malware writer constantly modifies code to bypass detection. Addressing the resiliency of malware detectors is therefore as important as efficacy and reliability. By understanding the attack surfaces of machine learning models used for malware detection, we can greatly improve the robustness of the algorithms to combat malware adversaries in the wild. Finally, I will discuss future research directions worth pursuing in this research community. |
Tasks | Image Classification, Malware Classification, Malware Detection, Transfer Learning |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.10504v1 |
http://arxiv.org/pdf/1904.10504v1.pdf | |
PWC | https://paperswithcode.com/paper/190410504 |
Repo | |
Framework | |
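A small sketch of the byte-to-image representation described in the overview: raw binary bytes are reshaped into a grayscale image that a pretrained CNN could then classify via transfer learning. The random bytes and the fixed width are illustrative assumptions, not a real binary or the author's preprocessing.

```python
# Hedged byte-to-grayscale-image conversion; the downstream CNN step is only noted in comments.
import numpy as np

raw_bytes = np.random.default_rng(0).integers(0, 256, size=64 * 64, dtype=np.uint8)
width = 64
height = len(raw_bytes) // width
image = raw_bytes[: height * width].reshape(height, width)   # one byte -> one grayscale pixel

print(image.shape, image.dtype)   # (64, 64) uint8, ready to be resized for an ImageNet-style CNN
```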
Zero-shot Feature Selection via Exploiting Semantic Knowledge
Title | Zero-shot Feature Selection via Exploiting Semantic Knowledge |
Authors | Zheng Wang, Qiao Wang, Tingzhang Zhao, Xiaojun Ye |
Abstract | Feature selection plays an important role in pattern recognition and machine learning systems. Supervised knowledge can significantly improve performance. However, confronted with the rapid growth of newly-emerging concepts, existing supervised methods may easily suffer from the scarcity of labeled data for training. Therefore, this paper studies the problem of Zero-Shot Feature Selection, i.e., building a feature selection model that generalizes well to “unseen” concepts with limited training data of “seen” concepts. To address this, inspired by zero-shot learning, we use class-semantic descriptions (i.e., attributes), which provide additional semantic information about unseen concepts, as supervision. In addition, to seek more reliable discriminative features, we further propose a novel loss function (named the center-characteristic loss) which encourages the selected features to capture the central characteristics of seen concepts. Experimental results on three benchmarks demonstrate the superiority of the proposed method. |
Tasks | Feature Selection, Zero-Shot Learning |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03464v2 |
https://arxiv.org/pdf/1908.03464v2.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-feature-selection-via-exploiting |
Repo | |
Framework | |
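A rough sketch in the spirit of the abstract's "center-characteristic loss": a soft feature-selection vector is penalized when, in the selected subspace, instances drift from their class centers, plus a sparsity term. The exact loss in the paper may differ substantially; everything below is an illustrative assumption.

```python
# Illustrative center-style selection loss; not the published objective.
import torch

torch.manual_seed(0)
n, d, classes = 100, 20, 4
X = torch.randn(n, d)
y = torch.randint(0, classes, (n,))

w = torch.rand(d, requires_grad=True)             # soft feature-selection weights

centers = torch.stack([X[y == c].mean(dim=0) for c in range(classes)])
selected = X * w                                   # data seen through the selection weights
center_char_loss = ((selected - centers[y] * w) ** 2).sum(dim=1).mean()
sparsity = w.abs().sum()                           # keep only a few features "on"

loss = center_char_loss + 0.1 * sparsity
loss.backward()
print(float(loss))
```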
Cost-Efficient Hierarchical Knowledge Extraction with Deep Reinforcement Learning
Title | Cost-Efficient Hierarchical Knowledge Extraction with Deep Reinforcement Learning |
Authors | Jaromír Janisch, Tomáš Pevný, Viliam Lisý |
Abstract | We present a new classification task where a sample is represented by a tree – a hierarchy of sets of objects and their properties. Individually for each sample, the task is to sequentially request pieces of information to build the hierarchy, where each new piece of information can be further analyzed, and eventually to provide a classification decision. Each piece of information has a real-valued cost, and the objective is to maximize accuracy in the presence of a per-sample budget. Many problems can be represented in this manner, such as targeted advertising, medical diagnosis or malware detection. We build our method with a deep reinforcement learning algorithm and a set of techniques to process the hierarchical input and the complex action space. We demonstrate the method on seven relational classification datasets. |
Tasks | Malware Detection, Medical Diagnosis, Multiple Instance Learning |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08756v2 |
https://arxiv.org/pdf/1911.08756v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-with-explicitly |
Repo | |
Framework | |
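A minimal sketch of the sequential, budgeted information-gathering loop the abstract describes. A random policy stands in for the deep RL agent, and flat per-feature costs stand in for the hierarchical input the paper handles; all numbers are assumptions.

```python
# Hedged sketch of per-sample budgeted feature acquisition.
import numpy as np

rng = np.random.default_rng(3)
true_features = rng.normal(size=8)           # the sample's full, initially hidden information
costs = rng.uniform(0.5, 2.0, size=8)        # real-valued cost of revealing each piece
budget = 4.0

revealed = np.full(8, np.nan)
spent = 0.0
while True:
    affordable = [i for i in range(8) if np.isnan(revealed[i]) and spent + costs[i] <= budget]
    if not affordable:
        break
    i = int(rng.choice(affordable))          # the RL agent would choose this action instead
    revealed[i] = true_features[i]
    spent += costs[i]

# A classifier would now act on the partially revealed sample.
print("revealed features:", np.where(~np.isnan(revealed))[0], "budget used:", round(spent, 2))
```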
Clusters in Explanation Space: Inferring disease subtypes from model explanations
Title | Clusters in Explanation Space: Inferring disease subtypes from model explanations |
Authors | Marc-Andre Schulz, Matt Chapman-Rounds, Manisha Verma, Danilo Bzdok, Konstantinos Georgatzis |
Abstract | Identification of disease subtypes and corresponding biomarkers can substantially improve clinical diagnosis and treatment selection. Discovering these subtypes in noisy, high dimensional biomedical data is often impossible for humans and challenging for machines. We introduce a new approach to facilitate the discovery of disease subtypes: Instead of analyzing the original data, we train a diagnostic classifier (healthy vs. diseased) and extract instance-wise explanations for the classifier’s decisions. The distribution of instances in the explanation space of our diagnostic classifier amplifies the different reasons for belonging to the same class - resulting in a representation that is uniquely useful for discovering latent subtypes. We compare our ability to recover subtypes via cluster analysis on model explanations to classical cluster analysis on the original data. In multiple datasets with known ground-truth subclasses, most compellingly on UK Biobank brain imaging data and transcriptome data from the Cancer Genome Atlas, we show that cluster analysis on model explanations substantially outperforms the classical approach. While we believe clustering in explanation space to be particularly valuable for inferring disease subtypes, the method is more general and applicable to any kind of sub-type identification. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08755v1 |
https://arxiv.org/pdf/1912.08755v1.pdf | |
PWC | https://paperswithcode.com/paper/clusters-in-explanation-space-inferring |
Repo | |
Framework | |
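A hedged sketch of the pipeline in the abstract: train a healthy-vs-diseased classifier, compute instance-wise explanations, and cluster in explanation space rather than on the raw data. A linear model with per-feature contributions (coefficient times feature value) stands in for the explanation method used in the paper, and the two synthetic subtypes are assumptions.

```python
# Illustrative clustering in explanation space vs. raw data on synthetic subtypes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(4)
healthy = rng.normal(size=(200, 6))
subtype_a = rng.normal(size=(100, 6)); subtype_a[:, 0] += 3.0   # driven by feature 0
subtype_b = rng.normal(size=(100, 6)); subtype_b[:, 3] += 3.0   # driven by feature 3
X = np.vstack([healthy, subtype_a, subtype_b])
diagnosis = np.array([0] * 200 + [1] * 200)
subtype = np.array([0] * 100 + [1] * 100)          # hidden ground truth for diseased cases

clf = LogisticRegression().fit(X, diagnosis)
explanations = X * clf.coef_[0]                    # per-instance, per-feature contributions

diseased = diagnosis == 1
clusters_expl = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(explanations[diseased])
clusters_raw = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[diseased])
print("ARI in explanation space:", adjusted_rand_score(subtype, clusters_expl))
print("ARI on raw data:        ", adjusted_rand_score(subtype, clusters_raw))
```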
Activation Analysis of a Byte-Based Deep Neural Network for Malware Classification
Title | Activation Analysis of a Byte-Based Deep Neural Network for Malware Classification |
Authors | Scott E. Coull, Christopher Gardner |
Abstract | Feature engineering is one of the most costly aspects of developing effective machine learning models, and that cost is even greater in specialized problem domains, like malware classification, where expert skills are necessary to identify useful features. Recent work, however, has shown that deep learning models can be used to automatically learn feature representations directly from the raw, unstructured bytes of the binaries themselves. In this paper, we explore what these models are learning about malware. To do so, we examine the learned features at multiple levels of resolution, from individual byte embeddings to end-to-end analysis of the model. At each step, we connect these byte-oriented activations to their original semantics through parsing and disassembly of the binary to arrive at human-understandable features. Through our results, we identify several interesting features learned by the model and their connection to manually-derived features typically used by traditional machine learning models. Additionally, we explore the impact of training data volume and regularization on the quality of the learned features and the efficacy of the classifiers, revealing the somewhat paradoxical insight that better generalization does not necessarily result in better performance for byte-based malware classifiers. |
Tasks | Feature Engineering, Malware Classification |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.04717v2 |
http://arxiv.org/pdf/1903.04717v2.pdf | |
PWC | https://paperswithcode.com/paper/activation-analysis-of-a-byte-based-deep |
Repo | |
Framework | |
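A small sketch of a byte-level classifier of the kind the paper analyzes: learned byte embeddings feed a convolutional network whose pooled activations drive the malware score. The layer sizes and the random byte sequence are illustrative, not the production model whose activations the authors inspect.

```python
# Hedged byte-embedding classifier sketch; activations of such layers are what the paper studies.
import torch
import torch.nn as nn

class ByteClassifier(nn.Module):
    def __init__(self, emb_dim=8):
        super().__init__()
        self.embed = nn.Embedding(256, emb_dim)        # one learned vector per byte value
        self.conv = nn.Conv1d(emb_dim, 32, kernel_size=16, stride=8)
        self.head = nn.Linear(32, 1)

    def forward(self, byte_seq):
        x = self.embed(byte_seq).transpose(1, 2)       # (batch, emb_dim, length)
        x = torch.relu(self.conv(x))
        x = x.max(dim=2).values                        # global max pool over byte positions
        return self.head(x)                            # malware score (logit)

model = ByteClassifier()
fake_binary = torch.randint(0, 256, (1, 4096))         # raw bytes of a fake executable
print(model(fake_binary).shape)
```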