July 29, 2019


Paper Group ANR 99

Stem-ming the Tide: Predicting STEM attrition using student transcript data. Adversarial and Clean Data Are Not Twins. EMFET: E-mail Features Extraction Tool. Improving Deep Learning by Inverse Square Root Linear Units (ISRLUs). fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs. Discovering Potential Correl …

Stem-ming the Tide: Predicting STEM attrition using student transcript data


Title	Stem-ming the Tide: Predicting STEM attrition using student transcript data
Authors	Lovenoor Aulck, Rohan Aras, Lysia Li, Coulter L’Heureux, Peter Lu, Jevin West
Abstract	Science, technology, engineering, and math (STEM) fields play growing roles in national and international economies by driving innovation and generating high salary jobs. Yet, the US is lagging behind other highly industrialized nations in terms of STEM education and training. Furthermore, many economic forecasts predict a rising shortage of domestic STEM-trained professionals in the US for years to come. One potential solution to this deficit is to decrease the rates at which students leave STEM-related fields in higher education, as currently over half of all students intending to graduate with a STEM degree eventually attrite. However, little quantitative research at scale has looked at causes of STEM attrition, let alone the use of machine learning to examine how well this phenomenon can be predicted. In this paper, we detail our efforts to model and predict dropout from STEM fields using one of the largest known datasets used for research on students at a traditional campus setting. Our results suggest that attrition from STEM fields can be accurately predicted with data that is routinely collected at universities using only information on students’ first academic year. We also propose a method to model student STEM intentions for each academic term to better understand the timing of STEM attrition events. We believe these results show great promise in using machine learning to improve STEM retention in traditional and non-traditional campus settings.
Tasks
Published	2017-08-28
URL	http://arxiv.org/abs/1708.09344v1
PDF	http://arxiv.org/pdf/1708.09344v1.pdf
PWC	https://paperswithcode.com/paper/stem-ming-the-tide-predicting-stem-attrition
Repo
Framework

Adversarial and Clean Data Are Not Twins


Title	Adversarial and Clean Data Are Not Twins
Authors	Zhitao Gong, Wenlu Wang, Wei-Shinn Ku
Abstract	Adversarial attacks have cast a shadow on the massive success of deep neural networks. Despite being almost visually identical to the clean data, adversarial images can fool deep neural networks into wrong predictions with very high confidence. In this paper, however, we show that we can build a simple binary classifier that separates adversarial data from clean data with accuracy over 99%. We also empirically show that the binary classifier is robust to a second-round adversarial attack. In other words, it is difficult to disguise adversarial samples to bypass the binary classifier. Furthermore, we empirically investigate the generalization limitation which lingers on all current defensive methods, including the binary classifier approach. We hypothesize that this is the result of an intrinsic property of adversarial crafting algorithms.
Tasks	Adversarial Attack
Published	2017-04-17
URL	http://arxiv.org/abs/1704.04960v1
PDF	http://arxiv.org/pdf/1704.04960v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-and-clean-data-are-not-twins
Repo
Framework

EMFET: E-mail Features Extraction Tool


Title	EMFET: E-mail Features Extraction Tool
Authors	Wadi’ Hijawi, Hossam Faris, Ja’far Alqatawna, Ibrahim Aljarah, Ala’ M. Al-Zoubi, Maria Habib
Abstract	EMFET is an open source and flexible tool that can be used to extract a large number of features from any email corpus with emails saved in EML format. The extracted features can be categorized into three main groups: header features, payload (body) features, and attachment features. The purpose of the tool is to help practitioners and researchers build datasets that can be used for training machine learning models for spam detection. So far, 140 features can be extracted using EMFET. EMFET is extensible and easy to use. The source code of EMFET is publicly available on GitHub (https://github.com/WadeaHijjawi/EmailFeaturesExtraction).
Tasks
Published	2017-11-22
URL	http://arxiv.org/abs/1711.08521v1
PDF	http://arxiv.org/pdf/1711.08521v1.pdf
PWC	https://paperswithcode.com/paper/emfet-e-mail-features-extraction-tool
Repo
Framework
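
The three feature groups named in the EMFET abstract (header, payload, attachment) can be illustrated with Python's standard `email` package. This is a minimal sketch, not EMFET's implementation: the function name `extract_features` and the six feature names are hypothetical and were chosen only for illustration.

```python
from email import message_from_string


def extract_features(raw_eml: str) -> dict:
    """Return a few illustrative header, payload, and attachment features."""
    msg = message_from_string(raw_eml)

    body_parts = []
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_disposition() == "attachment":
            attachments.append(part)
        elif part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            body_parts.append(payload.decode(charset, errors="replace"))
    body = "\n".join(body_parts)

    return {
        # header features (hypothetical names)
        "subject_length": len(msg.get("Subject", "")),
        "num_received_headers": len(msg.get_all("Received", [])),
        # payload (body) features (hypothetical names)
        "body_word_count": len(body.split()),
        "body_url_count": body.count("http://") + body.count("https://"),
        # attachment features (hypothetical names)
        "num_attachments": len(attachments),
        "attachment_types": sorted({a.get_content_type() for a in attachments}),
    }
```

Calling `extract_features` on the text of each `.eml` file and collecting the dictionaries row by row would give a small feature table of the kind the abstract describes, though with far fewer than 140 columns.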

Improving Deep Learning by Inverse Square Root Linear Units (ISRLUs)


Title	Improving Deep Learning by Inverse Square Root Linear Units (ISRLUs)
Authors	Brad Carlile, Guy Delamarter, Paul Kinney, Akiko Marti, Brian Whitney
Abstract	We introduce the “inverse square root linear unit” (ISRLU) to speed up learning in deep neural networks. ISRLU has better performance than ELU but has many of the same benefits. ISRLU and ELU have similar curves and characteristics. Both have negative values, allowing them to push mean unit activation closer to zero and bring the normal gradient closer to the unit natural gradient, ensuring a noise-robust deactivation state and lessening the overfitting risk. The significant performance advantage of ISRLU on traditional CPUs also carries over to more efficient HW implementations on HW/SW codesign for CNNs/RNNs. In experiments with TensorFlow, ISRLU leads to faster learning and better generalization than ReLU on CNNs. This work also suggests a computationally efficient variant called the “inverse square root unit” (ISRU) which can be used for RNNs. Many RNNs use either long short-term memory (LSTM) or gated recurrent units (GRU), which are implemented with tanh and sigmoid activation functions. ISRU has less computational complexity but still has a similar curve to tanh and sigmoid.
Tasks
Published	2017-10-27
URL	http://arxiv.org/abs/1710.09967v2
PDF	http://arxiv.org/pdf/1710.09967v2.pdf
PWC	https://paperswithcode.com/paper/improving-deep-learning-by-inverse-square
Repo
Framework
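
The ISRLU abstract does not state the activation formulas. As defined in the paper, ISRU is x / sqrt(1 + αx²), and ISRLU is the identity for x ≥ 0 and ISRU for x < 0. The NumPy sketch below implements those two definitions; the function names and the default `alpha=1.0` are choices made here for illustration, and it is not the authors' TensorFlow code.

```python
import numpy as np


def isru(x, alpha=1.0):
    """Inverse square root unit: x / sqrt(1 + alpha * x^2)."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(1.0 + alpha * x * x)


def isrlu(x, alpha=1.0):
    """Identity for x >= 0, ISRU for x < 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0.0, x, isru(x, alpha))
```

Both functions avoid the exponential that ELU, tanh, and sigmoid require, which is where the computational savings mentioned in the abstract come from. The negative branch of `isrlu` saturates at -1/sqrt(alpha).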

fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs


Title	fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs
Authors	Stylianos I. Venieris, Christos-Savvas Bouganis
Abstract	In recent years, Convolutional Neural Networks (ConvNets) have become an enabling technology for a wide range of novel embedded Artificial Intelligence systems. Across the range of applications, the performance needs vary significantly, from high-throughput video surveillance to the very low-latency requirements of autonomous cars. In this context, FPGAs can provide a potential platform that can be optimally configured based on the different performance needs. However, the complexity of ConvNet models keeps increasing, making their mapping to an FPGA device a challenging task. This work presents fpgaConvNet, an end-to-end framework for mapping ConvNets on FPGAs. The proposed framework employs an automated design methodology based on the Synchronous Dataflow (SDF) paradigm and defines a set of SDF transformations in order to efficiently explore the architectural design space. By selectively optimising for throughput, latency or multiobjective criteria, the presented tool is able to efficiently explore the design space and generate hardware designs from high-level ConvNet specifications, explicitly optimised for the performance metric of interest. Overall, our framework yields designs that improve the performance by up to 6.65x over highly optimised embedded GPU designs for the same power constraints in embedded environments.
Tasks
Published	2017-11-23
URL	http://arxiv.org/abs/1711.08740v1
PDF	http://arxiv.org/pdf/1711.08740v1.pdf
PWC	https://paperswithcode.com/paper/fpgaconvnet-a-toolflow-for-mapping-diverse
Repo
Framework

Discovering Potential Correlations via Hypercontractivity


Title	Discovering Potential Correlations via Hypercontractivity
Authors	Hyeji Kim, Weihao Gao, Sreeram Kannan, Sewoong Oh, Pramod Viswanath
Abstract	Discovering a correlation from one variable to another variable is of fundamental scientific and practical interest. While existing correlation measures are suitable for discovering average correlation, they fail to discover hidden or potential correlations. To bridge this gap, (i) we postulate a set of natural axioms that we expect a measure of potential correlation to satisfy; (ii) we show that the rate of information bottleneck, i.e., the hypercontractivity coefficient, satisfies all the proposed axioms; (iii) we provide a novel estimator to estimate the hypercontractivity coefficient from samples; and (iv) we provide numerical experiments demonstrating that this proposed estimator discovers potential correlations among various indicators of WHO datasets, is robust in discovering gene interactions from gene expression time series data, and is statistically more powerful than the estimators for other correlation measures in binary hypothesis testing of canonical examples of potential correlations.
Tasks	Time Series
Published	2017-09-12
URL	http://arxiv.org/abs/1709.04024v3
PDF	http://arxiv.org/pdf/1709.04024v3.pdf
PWC	https://paperswithcode.com/paper/discovering-potential-correlations-via
Repo
Framework

AMR Parsing using Stack-LSTMs


Title	AMR Parsing using Stack-LSTMs
Authors	Miguel Ballesteros, Yaser Al-Onaizan
Abstract	We present a transition-based AMR parser that directly generates AMR parses from plain text. We use Stack-LSTMs to represent our parser state and make decisions greedily. In our experiments, we show that our parser achieves very competitive scores on English using only AMR training data. Adding additional information, such as POS tags and dependency trees, improves the results further.
Tasks	Amr Parsing
Published	2017-07-24
URL	http://arxiv.org/abs/1707.07755v2
PDF	http://arxiv.org/pdf/1707.07755v2.pdf
PWC	https://paperswithcode.com/paper/amr-parsing-using-stack-lstms
Repo
Framework

ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing


Title	ShotgunWSD: An unsupervised algorithm for global word sense disambiguation inspired by DNA sequencing
Authors	Andrei M. Butnaru, Radu Tudor Ionescu, Florentina Hristea
Abstract	In this paper, we present a novel unsupervised algorithm for word sense disambiguation (WSD) at the document level. Our algorithm is inspired by a widely-used approach in the field of genetics for whole genome sequencing, known as the Shotgun sequencing technique. The proposed WSD algorithm is based on three main steps. First, a brute-force WSD algorithm is applied to short context windows (up to 10 words) selected from the document in order to generate a short list of likely sense configurations for each window. In the second step, these local sense configurations are assembled into longer composite configurations based on suffix and prefix matching. Finally, the resulting configurations are ranked by their length, and the sense of each word is chosen based on a voting scheme that considers only the top k configurations in which the word appears. We compare our algorithm with other state-of-the-art unsupervised WSD algorithms and demonstrate better performance, sometimes by a very large margin. We also show that our algorithm can yield better performance than the Most Common Sense (MCS) baseline on one data set. Moreover, our algorithm has a very small number of parameters, is robust to parameter tuning, and, unlike other bio-inspired methods, it gives a deterministic solution (it does not involve random choices).
Tasks	Common Sense Reasoning, Word Sense Disambiguation
Published	2017-07-25
URL	http://arxiv.org/abs/1707.08084v1
PDF	http://arxiv.org/pdf/1707.08084v1.pdf
PWC	https://paperswithcode.com/paper/shotgunwsd-an-unsupervised-algorithm-for
Repo
Framework
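
The assembly and voting steps of the ShotgunWSD abstract can be sketched on toy data. This is a simplified illustration under stated assumptions, not the authors' implementation: a configuration is represented here as a hypothetical `(start, senses)` pair, where `start` is the position of its first word and `senses` is a tuple of sense labels. The brute-force first step and the relatedness scoring used in the paper are omitted.

```python
from collections import Counter


def merge(a, b, min_overlap=1):
    """Merge two (start, senses) configurations if a suffix of `a` matches a prefix of `b`.

    Returns the merged configuration, or None when they do not overlap consistently.
    """
    (start_a, senses_a), (start_b, senses_b) = a, b
    offset = start_b - start_a
    overlap = len(senses_a) - offset
    if offset <= 0 or overlap < min_overlap or overlap > len(senses_b):
        return None
    if senses_a[offset:] != senses_b[:overlap]:
        return None  # the two windows disagree on a shared word
    return (start_a, senses_a + senses_b[overlap:])


def vote(configs, position, k=3):
    """Pick the sense at `position` by majority vote over the k longest covering configurations."""
    covering = [(s, c) for s, c in configs if s <= position < s + len(c)]
    covering.sort(key=lambda sc: len(sc[1]), reverse=True)
    votes = Counter(c[position - s] for s, c in covering[:k])
    return votes.most_common(1)[0][0] if votes else None
```

For example, merging `(0, ("s1", "s2", "s3"))` with `(1, ("s2", "s3", "s4"))` gives one configuration covering four words, much as overlapping reads are assembled in shotgun sequencing. Because neither function uses randomness, the result is deterministic, as the abstract notes.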

Prediction of Daytime Hypoglycemic Events Using Continuous Glucose Monitoring Data and Classification Technique


Title	Prediction of Daytime Hypoglycemic Events Using Continuous Glucose Monitoring Data and Classification Technique
Authors	Miyeon Jung, You-Bin Lee, Sang-Man Jin, Sung-Min Park
Abstract	Daytime hypoglycemia should be accurately predicted to achieve normoglycemia and to avoid disastrous situations. Hypoglycemia, an abnormally low blood glucose (BG) level, is divided into daytime hypoglycemia and nocturnal hypoglycemia. Many studies of hypoglycemia prevention deal with nocturnal hypoglycemia. In this paper, we propose new predictor variables to predict daytime hypoglycemia using continuous glucose monitoring (CGM) data. We apply classification and regression tree (CART) as a prediction method. The independent variables of our prediction model are the rate of decrease from a peak and the absolute level of the BG at the decision point. The evaluation results showed that our model was able to detect almost 80% of hypoglycemic events 15 min in advance, which was higher than existing methods under similar conditions. The proposed method could achieve real-time prediction and could be embedded into a BG monitoring device.
Tasks
Published	2017-04-27
URL	http://arxiv.org/abs/1704.08769v1
PDF	http://arxiv.org/pdf/1704.08769v1.pdf
PWC	https://paperswithcode.com/paper/prediction-of-daytime-hypoglycemic-events
Repo
Framework
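
The two predictor variables named in the hypoglycemia abstract can be computed from a CGM trace as sketched below. The abstract does not give exact definitions, so everything specific here is an assumption: the peak is taken as the maximum of the supplied window, the rate is the drop from that peak divided by the elapsed time, readings are in mg/dL, and the sampling interval defaults to 5 minutes. The function name `cgm_features` is hypothetical.

```python
def cgm_features(readings, interval_min=5.0):
    """Return (rate_of_decrease, current_level) at the last reading.

    readings: glucose values (assumed mg/dL), oldest first, evenly sampled.
    rate_of_decrease: (peak - current) / minutes since the peak;
    0.0 when the last reading is itself the peak.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    current = readings[-1]
    peak = max(readings)
    # Use the most recent occurrence of the peak value.
    peak_index = len(readings) - 1 - readings[::-1].index(peak)
    minutes = (len(readings) - 1 - peak_index) * interval_min
    rate = (peak - current) / minutes if minutes > 0 else 0.0
    return rate, current
```

A feature pair like this, computed at each decision point, is the kind of input a CART classifier would split on. The tree itself is not shown here.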

Context-aware Path Ranking for Knowledge Base Completion


Title	Context-aware Path Ranking for Knowledge Base Completion
Authors	Sahisnu Mazumder, Bing Liu
Abstract	Knowledge base (KB) completion aims to infer missing facts from existing ones in a KB. Among various approaches, path ranking (PR) algorithms have received increasing attention in recent years. PR algorithms enumerate paths between entity pairs in a KB and use those paths as features to train a model for missing fact prediction. Owing to their good performance and high model interpretability, several such methods have been proposed. However, most existing methods suffer from scalability (high RAM consumption) and feature explosion (training on an exponentially large number of features) problems. This paper proposes a Context-aware Path Ranking (C-PR) algorithm to solve these problems by introducing a selective path exploration strategy. C-PR learns global semantics of entities in the KB using word embedding and leverages the knowledge of entity semantics to enumerate contextually relevant paths using bidirectional random walk. Experimental results on three large KBs show that the path features (fewer in number) discovered by C-PR not only improve predictive performance but also are more interpretable than existing baselines.
Tasks	Knowledge Base Completion
Published	2017-12-20
URL	http://arxiv.org/abs/1712.07745v1
PDF	http://arxiv.org/pdf/1712.07745v1.pdf
PWC	https://paperswithcode.com/paper/context-aware-path-ranking-for-knowledge-base
Repo
Framework

Image reconstruction with imperfect forward models and applications in deblurring


Title	Image reconstruction with imperfect forward models and applications in deblurring
Authors	Yury Korolev, Jan Lellmann
Abstract	We present and analyse an approach to image reconstruction problems with imperfect forward models based on partially ordered spaces - Banach lattices. In this approach, errors in the data and in the forward models are described using order intervals. The method can be characterised as the lattice analogue of the residual method, where the feasible set is defined by linear inequality constraints. The study of this feasible set is the main contribution of this paper. Convexity of this feasible set is examined in several settings and modifications for introducing additional information about the forward operator are considered. Numerical examples demonstrate the performance of the method in deblurring with errors in the blurring kernel.
Tasks	Deblurring, Image Reconstruction
Published	2017-08-03
URL	http://arxiv.org/abs/1708.01244v3
PDF	http://arxiv.org/pdf/1708.01244v3.pdf
PWC	https://paperswithcode.com/paper/image-reconstruction-with-imperfect-forward
Repo
Framework

An Analysis of Dropout for Matrix Factorization


Title	An Analysis of Dropout for Matrix Factorization
Authors	Jacopo Cavazza, Connor Lane, Benjamin D. Haeffele, Vittorio Murino, René Vidal
Abstract	Dropout is a simple yet effective algorithm for regularizing neural networks by randomly dropping out units through Bernoulli multiplicative noise, and for some restricted problem classes, such as linear or logistic regression, several theoretical studies have demonstrated the equivalence between dropout and a fully deterministic optimization problem with data-dependent Tikhonov regularization. This work presents a theoretical analysis of dropout for matrix factorization, where Bernoulli random variables are used to drop a factor, thereby attempting to control the size of the factorization. While recent work has demonstrated the empirical effectiveness of dropout for matrix factorization, a theoretical understanding of the regularization properties of dropout in this context remains elusive. This work demonstrates the equivalence between dropout and a fully deterministic model for matrix factorization in which the factors are regularized by the sum of the product of the norms of the columns. While the resulting regularizer is closely related to a variational form of the nuclear norm, suggesting that dropout may limit the size of the factorization, we show that it is possible to trivially lower the objective value by doubling the size of the factorization. We show that this problem is caused by the use of a fixed dropout rate, which motivates the use of a rate that increases with the size of the factorization. Synthetic experiments validate our theoretical findings.
Tasks
Published	2017-10-10
URL	http://arxiv.org/abs/1710.03487v1
PDF	http://arxiv.org/pdf/1710.03487v1.pdf
PWC	https://paperswithcode.com/paper/an-analysis-of-dropout-for-matrix
Repo
Framework
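
The equivalence described in the dropout abstract can be checked numerically on a small example. The sketch assumes i.i.d. Bernoulli(θ) masks on the columns of the factors and a 1/θ rescaling, and it writes the regularizer with squared Euclidean column norms, which is what a direct variance computation gives. The function names are chosen here for illustration. The expectation is computed exactly by enumerating all 2^d masks, so it only suits small d.

```python
import itertools

import numpy as np


def expected_dropout_loss(X, U, V, theta):
    """Exact E_r ||X - (1/theta) U diag(r) V^T||_F^2 with r_k ~ Bernoulli(theta) i.i.d."""
    d = U.shape[1]
    total = 0.0
    for mask in itertools.product([0, 1], repeat=d):
        r = np.array(mask, dtype=float)
        prob = np.prod(np.where(r == 1, theta, 1.0 - theta))
        residual = X - (U * r) @ V.T / theta  # U * r zeroes the dropped columns
        total += prob * np.sum(residual ** 2)
    return total


def regularized_loss(X, U, V, theta):
    """||X - U V^T||_F^2 + (1 - theta)/theta * sum_k ||u_k||^2 ||v_k||^2."""
    fit = np.sum((X - U @ V.T) ** 2)
    penalty = np.sum(np.sum(U ** 2, axis=0) * np.sum(V ** 2, axis=0))
    return fit + (1.0 - theta) / theta * penalty
```

For random `X`, `U`, `V` and any θ in (0, 1], the two functions agree up to floating-point error. The penalty weight (1 - θ)/θ does not depend on the number of columns, which is consistent with the abstract's point that a fixed dropout rate fails to penalise larger factorizations.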

Self-supervised learning of visual features through embedding images into text topic spaces


Title	Self-supervised learning of visual features through embedding images into text topic spaces
Authors	Lluis Gomez, Yash Patel, Marçal Rusiñol, Dimosthenis Karatzas, C. V. Jawahar
Abstract	End-to-end training from scratch of current deep architectures for new computer vision problems would require ImageNet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning of visual features by mining a large scale corpus of multi-modal (text and image) documents. We show that discriminative visual features can be learnt efficiently by training a CNN to predict the semantic context in which a particular image is more probable to appear as an illustration. For this we leverage the hidden semantic structures discovered in the text corpus with a well-known topic modeling technique. Our experiments demonstrate state-of-the-art performance in image classification, object detection, and multi-modal retrieval compared to recent self-supervised or natural-supervised approaches.
Tasks	Image Classification, Object Detection
Published	2017-05-24
URL	http://arxiv.org/abs/1705.08631v1
PDF	http://arxiv.org/pdf/1705.08631v1.pdf
PWC	https://paperswithcode.com/paper/self-supervised-learning-of-visual-features
Repo
Framework

Interval Arithmetic and Interval-Aware Operators for Genetic Programming


Title	Interval Arithmetic and Interval-Aware Operators for Genetic Programming
Authors	Grant Dick
Abstract	Symbolic regression via genetic programming is a flexible approach to machine learning that does not require up-front specification of model structure. However, traditional approaches to symbolic regression require the use of protected operators, which can lead to perverse model characteristics and poor generalisation. In this paper, we revisit interval arithmetic as one possible solution to allow genetic programming to perform regression using unprotected operators. Using standard benchmarks, we show that using interval arithmetic within model evaluation does not prevent invalid solutions from entering the population, meaning that search performance remains compromised. We extend the basic interval arithmetic concept with ‘safe’ search operators that integrate interval information into their process, thereby greatly reducing the number of invalid solutions produced during search. The resulting algorithms are able to more effectively identify good models that generalise well to unseen data. We conclude with an analysis of the sensitivity of interval arithmetic-based operators with respect to the accuracy of the supplied input feature intervals.
Tasks
Published	2017-04-17
URL	http://arxiv.org/abs/1704.04998v1
PDF	http://arxiv.org/pdf/1704.04998v1.pdf
PWC	https://paperswithcode.com/paper/interval-arithmetic-and-interval-aware
Repo
Framework
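
The idea of using interval arithmetic to vet expressions built from unprotected operators can be sketched as follows. This is a minimal illustration rather than the paper's system: the `Interval` class and the `is_valid` helper are hypothetical names, and only addition, multiplication, and division are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        # Unprotected division is undefined when the denominator can be zero.
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("denominator interval contains zero")
        return self * Interval(1.0 / other.hi, 1.0 / other.lo)


def is_valid(expr, feature_intervals):
    """Return True if `expr` evaluates over the feature intervals without error."""
    try:
        expr(*feature_intervals)
        return True
    except ZeroDivisionError:
        return False
```

With features bounded by `Interval(1, 2)` and `Interval(-1, 3)`, dividing the first by the second is flagged as invalid because the denominator can reach zero, while the reverse division is accepted. A check like this depends on how well the supplied feature intervals match the data, which is the sensitivity the abstract's final sentence refers to.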

Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images


Title	Weakly-supervised localization of diabetic retinopathy lesions in retinal fundus images
Authors	Waleed M. Gondal, Jan M. Köhler, René Grzeszick, Gernot A. Fink, Michael Hirsch
Abstract	Convolutional neural networks (CNNs) show impressive performance for image classification and detection, extending heavily to the medical image domain. Nevertheless, medical experts are sceptical of these predictions, as the nonlinear multilayer structure that produces a classification outcome is not directly graspable. Recently, approaches have been presented that help the user understand which discriminative regions within an image are decisive for the CNN’s choice of class. Although these approaches could help build trust in the CNN’s predictions, they have rarely been shown to work with medical image data, which often poses a challenge because the decision for a class relies on different lesion areas scattered across the entire image. Using the DiaretDB1 dataset, we show that on retina images different lesion areas fundamental for diabetic retinopathy are detected on an image level with high accuracy, comparable to or exceeding supervised methods. On the lesion level, we achieve few false positives with high sensitivity, even though the network is trained solely on image-level labels, which do not include information about existing lesions. Classifying between diseased and healthy images, we achieve an AUC of 0.954 on DiaretDB1.
Tasks	Image Classification
Published	2017-06-29
URL	http://arxiv.org/abs/1706.09634v1
PDF	http://arxiv.org/pdf/1706.09634v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-localization-of-diabetic
Repo
Framework