February 1, 2020


Paper Group AWR 231

Rule-Guided Compositional Representation Learning on Knowledge Graphs. DataLearner: A Data Mining and Knowledge Discovery Tool for Android Smartphones and Tablets. Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search. Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State. Bridging …

Rule-Guided Compositional Representation Learning on Knowledge Graphs


Title	Rule-Guided Compositional Representation Learning on Knowledge Graphs
Authors	Guanglin Niu, Yongfei Zhang, Bo Li, Peng Cui, Si Liu, Jingyang Li, Xiaowei Zhang
Abstract	Representation learning on a knowledge graph (KG) aims to embed entities and relations of a KG into low-dimensional continuous vector spaces. Early KG embedding methods only pay attention to structured information encoded in triples, which limits performance because of the structural sparseness of KGs. Some recent attempts consider path information to expand the structure of KGs but lack explainability in the process of obtaining the path representations. In this paper, we propose a novel Rule and Path-based Joint Embedding (RPJE) scheme, which takes full advantage of the explainability and accuracy of logic rules, the generalization of KG embedding as well as the supplementary semantic structure of paths. Specifically, logic rules of different lengths (the number of relations in rule body) in the form of Horn clauses are first mined from the KG and elaborately encoded for representation learning. Then, the rules of length 2 are applied to compose paths accurately while the rules of length 1 are explicitly employed to create semantic associations among relations and constrain relation embeddings. Besides, the confidence level of each rule is also considered in optimization to guarantee the availability of applying the rule to representation learning. Extensive experimental results illustrate that RPJE outperforms other state-of-the-art baselines on the KG completion task, which also demonstrates the superiority of utilizing logic rules as well as paths for improving the accuracy and explainability of representation learning.
Tasks	Knowledge Graphs, Representation Learning
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08935v2
PDF	https://arxiv.org/pdf/1911.08935v2.pdf
PWC	https://paperswithcode.com/paper/rule-guided-compositional-representation
Repo	https://github.com/ngl567/RPJE
Framework	none
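
The path composition step described in the abstract (length-2 rules collapse two consecutive relations into one) can be illustrated with a small sketch. This is not the RPJE implementation: the function name `compose_path`, the rule dictionary layout, the product of confidences, and the additive fallback for relations that no rule covers are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def compose_path(
    path: List[str],
    rules: Dict[Tuple[str, str], Tuple[str, float]],
    rel_emb: Dict[str, Vector],
) -> Tuple[Vector, float]:
    """Collapse a relation path with length-2 rules of the form (r1, r2) => r3.

    `rules` maps a pair of consecutive relations to (head relation, confidence).
    Returns the path embedding and the product of the confidences of all rules
    that were applied (1.0 if none applied).
    """
    relations = list(path)
    confidence = 1.0
    changed = True
    while changed and len(relations) > 1:
        changed = False
        for i in range(len(relations) - 1):
            pair = (relations[i], relations[i + 1])
            if pair in rules:
                head, rule_conf = rules[pair]
                relations[i:i + 2] = [head]  # replace the pair by the rule head
                confidence *= rule_conf
                changed = True
                break
    # Assumed fallback: sum the embeddings of whatever relations remain.
    dim = len(next(iter(rel_emb.values())))
    embedding = [0.0] * dim
    for rel in relations:
        embedding = [a + b for a, b in zip(embedding, rel_emb[rel])]
    return embedding, confidence


# Hypothetical relations and embeddings, for illustration only.
rules = {("born_in", "city_of"): ("nationality", 0.9)}
rel_emb = {
    "born_in": [1.0, 0.0],
    "city_of": [0.0, 1.0],
    "nationality": [0.5, 0.5],
}
print(compose_path(["born_in", "city_of"], rules, rel_emb))
```

When a rule matches, the path is represented by the embedding of the rule head, and the returned confidence could be used to weight the corresponding loss term.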

DataLearner: A Data Mining and Knowledge Discovery Tool for Android Smartphones and Tablets


Title	DataLearner: A Data Mining and Knowledge Discovery Tool for Android Smartphones and Tablets
Authors	Darren Yates, Md Zahidul Islam, Junbin Gao
Abstract	Smartphones have become the ultimate ‘personal’ computer, yet despite this, general-purpose data-mining and knowledge discovery tools for mobile devices are surprisingly rare. DataLearner is a new data-mining application designed specifically for Android devices that imports the Weka data-mining engine and augments it with algorithms developed by Charles Sturt University. Moreover, DataLearner can be expanded with additional algorithms. Combined, DataLearner delivers 40 classification, clustering and association rule mining algorithms for model training and evaluation without need for cloud computing resources or network connectivity. It provides the same classification accuracy as PCs and laptops, while doing so with acceptable processing speed and consuming negligible battery life. With its ability to provide easy-to-use data-mining on a phone-size screen, DataLearner is a new portable, self-contained data-mining tool for remote, personalised and learning applications alike. DataLearner features four elements - this paper, the app available on Google Play, the GPL3-licensed source code on GitHub and a short video on YouTube.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03773v1
PDF	https://arxiv.org/pdf/1906.03773v1.pdf
PWC	https://paperswithcode.com/paper/datalearner-a-data-mining-and-knowledge
Repo	https://github.com/darrenyatesau/DataLearner
Framework	none

Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search


Title	Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search
Authors	Xin Li, Yiming Zhou, Zheng Pan, Jiashi Feng
Abstract	Achieving good speed and accuracy trade-off on a target platform is very important in deploying deep neural networks in real world scenarios. However, most existing automatic architecture search approaches only concentrate on high performance. In this work, we propose an algorithm that can offer better speed/accuracy trade-off of searched networks, which is termed “Partial Order Pruning”. It prunes the architecture search space with a partial order assumption to automatically search for the architectures with the best speed and accuracy trade-off. Our algorithm explicitly takes profile information about the inference speed on the target platform into consideration. With the proposed algorithm, we present several Dongfeng (DF) networks that provide high accuracy and fast inference speed on various application GPU platforms. By further searching decoder architectures, our DF-Seg real-time segmentation networks yield state-of-the-art speed/accuracy trade-off on both the target embedded device and the high-end GPU.
Tasks	Neural Architecture Search
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03777v2
PDF	http://arxiv.org/pdf/1903.03777v2.pdf
PWC	https://paperswithcode.com/paper/partial-order-pruning-for-best-speedaccuracy
Repo	https://github.com/lixincn2015/Partial-Order-Pruning
Framework	none
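
To make the idea of pruning a search space with a partial order more concrete, here is a minimal sketch covering only the latency side. It assumes a hypothetical encoding in which an architecture is a tuple of stage widths, and it assumes that latency never decreases when any stage gets wider. The names `precedes` and `prune_by_latency` are illustrative and do not come from the paper or its repository.

```python
from typing import Callable, Iterable, List, Tuple

Arch = Tuple[int, ...]  # hypothetical encoding: one width per stage


def precedes(a: Arch, b: Arch) -> bool:
    """True if `a` is no wider than `b` in every stage (component-wise <=)."""
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


def prune_by_latency(
    candidates: Iterable[Arch],
    measure_latency: Callable[[Arch], float],
    budget_ms: float,
) -> Tuple[List[Arch], int]:
    """Keep architectures within the latency budget.

    Under the assumed monotonicity, any architecture that dominates a known
    too-slow architecture is also too slow, so it is skipped without profiling.
    Returns the kept architectures and the number of latency measurements made.
    """
    kept: List[Arch] = []
    too_slow: List[Arch] = []
    measured = 0
    # Visiting by total width guarantees predecessors are seen first.
    for arch in sorted(candidates, key=sum):
        if any(precedes(slow, arch) for slow in too_slow):
            continue  # pruned by the partial order
        measured += 1
        if measure_latency(arch) > budget_ms:
            too_slow.append(arch)
        else:
            kept.append(arch)
    return kept, measured


# Toy latency model (assumption): latency equals the total width.
space = [(1, 1), (2, 1), (1, 2), (2, 2), (3, 3)]
print(prune_by_latency(space, lambda a: float(sum(a)), budget_ms=3.0))
```

In this toy run the architecture `(3, 3)` is never profiled, because `(2, 2)` already exceeded the budget. The same reasoning could be mirrored for accuracy, pruning architectures dominated by one that is already too inaccurate.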

Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State


Title	Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State
Authors	Richard Futrell, Ethan Wilcox, Takashi Morita, Peng Qian, Miguel Ballesteros, Roger Levy
Abstract	We deploy the methods of controlled psycholinguistic experimentation to shed light on the extent to which the behavior of neural network language models reflects incremental representations of syntactic state. To do so, we examine model behavior on artificial sentences containing a variety of syntactically complex structures. We test four models: two publicly available LSTM sequence models of English (Jozefowicz et al., 2016; Gulordava et al., 2018) trained on large datasets; an RNNG (Dyer et al., 2016) trained on a small, parsed dataset; and an LSTM trained on the same small corpus as the RNNG. We find evidence that the LSTMs trained on large datasets represent syntactic state over large spans of text in a way that is comparable to the RNNG, while the LSTM trained on the small dataset does not or does so only weakly.
Tasks
Published	2019-03-08
URL	http://arxiv.org/abs/1903.03260v1
PDF	http://arxiv.org/pdf/1903.03260v1.pdf
PWC	https://paperswithcode.com/paper/neural-language-models-as-psycholinguistic
Repo	https://github.com/langprocgroup/nn_syntactic_state
Framework	none

Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions


Title	Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions
Authors	Omid Rohanian, Shiva Taslimipoor, Samaneh Kouchaki, Le An Ha, Ruslan Mitkov
Abstract	We introduce a new method to tag Multiword Expressions (MWEs) using a linguistically interpretable language-independent deep learning architecture. We specifically target discontinuity, an under-explored aspect that poses a significant challenge to computational treatment of MWEs. Two neural architectures are explored: Graph Convolutional Network (GCN) and multi-head self-attention. GCN leverages dependency parse information, and self-attention attends to long-range relations. We finally propose a combined model that integrates complementary information from both through a gating mechanism. The experiments on a standard multilingual dataset for verbal MWEs show that our model outperforms the baselines not only in the case of discontinuous MWEs but also in overall F-score.
Tasks
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10667v2
PDF	http://arxiv.org/pdf/1902.10667v2.pdf
PWC	https://paperswithcode.com/paper/bridging-the-gap-attending-to-discontinuity
Repo	https://github.com/kawu/vine
Framework	none
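
The gating mechanism mentioned in the abstract is not specified there. A common form, shown below as an assumption rather than the authors' exact formulation, computes a per-dimension gate from the concatenated GCN and self-attention vectors and uses it to interpolate between them. The names `gated_combine`, `w_gate` and `b_gate` are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_combine(
    h_gcn: List[float],
    h_att: List[float],
    w_gate: List[List[float]],
    b_gate: List[float],
) -> List[float]:
    """Combine two token representations with an element-wise gate.

    g = sigmoid(W [h_gcn; h_att] + b)
    out = g * h_gcn + (1 - g) * h_att
    `w_gate` has one row per output dimension and 2 * dim columns.
    """
    concat = h_gcn + h_att  # list concatenation, length 2 * dim
    out = []
    for i, (a, b) in enumerate(zip(h_gcn, h_att)):
        z = sum(w * x for w, x in zip(w_gate[i], concat)) + b_gate[i]
        g = sigmoid(z)
        out.append(g * a + (1.0 - g) * b)
    return out


# With zero weights and bias the gate is 0.5, so the output is the average.
zeros = [[0.0] * 4 for _ in range(2)]
print(gated_combine([1.0, 3.0], [3.0, 1.0], zeros, [0.0, 0.0]))
```

A large positive bias pushes the gate toward the GCN vector, and a large negative bias toward the self-attention vector. In a trained model the weights would decide this per token and per dimension.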

Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning from Radiology Reports and Label Ontology


Title	Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning from Radiology Reports and Label Ontology
Authors	Ke Yan, Yifan Peng, Veit Sandfort, Mohammadhadi Bagheri, Zhiyong Lu, Ronald M. Summers
Abstract	In radiologists’ routine work, one major task is to read a medical image, e.g., a CT scan, find significant lesions, and describe them in the radiology report. In this paper, we study the lesion description or annotation problem. Given a lesion image, our aim is to predict a comprehensive set of relevant labels, such as the lesion’s body part, type, and attributes, which may assist downstream fine-grained diagnosis. To address this task, we first design a deep learning module to extract relevant semantic labels from the radiology reports associated with the lesion images. With the images and text-mined labels, we propose a lesion annotation network (LesaNet) based on a multilabel convolutional neural network (CNN) to learn all labels holistically. Hierarchical relations and mutually exclusive relations between the labels are leveraged to improve the label prediction accuracy. The relations are utilized in a label expansion strategy and a relational hard example mining algorithm. We also attach a simple score propagation layer on LesaNet to enhance recall and explore implicit relation between labels. Multilabel metric learning is combined with classification to enable interpretable prediction. We evaluated LesaNet on the public DeepLesion dataset, which contains over 32K diverse lesion images. Experiments show that LesaNet can precisely annotate the lesions using an ontology of 171 fine-grained labels with an average AUC of 0.9344.
Tasks	Metric Learning
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04661v2
PDF	http://arxiv.org/pdf/1904.04661v2.pdf
PWC	https://paperswithcode.com/paper/holistic-and-comprehensive-annotation-of
Repo	https://github.com/leeh43/MULAN_universal_lesion_analysis
Framework	pytorch
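
The label expansion strategy built on hierarchical relations can be sketched as follows: whenever a fine-grained label is present, every ancestor in the ontology is added as well. This is an illustrative sketch, not LesaNet code. The function name `expand_labels` and the example label names are assumptions.

```python
from typing import Dict, Set


def expand_labels(labels: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Add every ancestor of each label, following child -> parents links."""
    expanded = set(labels)
    stack = list(labels)
    while stack:
        label = stack.pop()
        for parent in parents.get(label, set()):
            if parent not in expanded:
                expanded.add(parent)
                stack.append(parent)
    return expanded


# Hypothetical fragment of a label ontology.
parents = {"left lung": {"lung"}, "lung": {"chest"}}
print(sorted(expand_labels({"left lung"}, parents)))
```

Expanding the text-mined labels this way gives the classifier consistent positives along the hierarchy, which is the intuition behind using hierarchical relations to improve prediction accuracy.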

Entity Recognition at First Sight: Improving NER with Eye Movement Information


Title	Entity Recognition at First Sight: Improving NER with Eye Movement Information
Authors	Nora Hollenstein, Ce Zhang
Abstract	Previous research shows that eye-tracking data contains information about the lexical and syntactic properties of text, which can be used to improve natural language processing models. In this work, we leverage eye movement features from three corpora with recorded gaze information to augment a state-of-the-art neural model for named entity recognition (NER) with gaze embeddings. These corpora were manually annotated with named entity labels. Moreover, we show how gaze features, generalized on word type level, eliminate the need for recorded eye-tracking data at test time. The gaze-augmented models for NER using token-level and type-level features outperform the baselines. We present the benefits of eye-tracking features by evaluating the NER models on both individual datasets as well as in cross-domain settings.
Tasks	Eye Tracking, Named Entity Recognition
Published	2019-02-26
URL	http://arxiv.org/abs/1902.10068v2
PDF	http://arxiv.org/pdf/1902.10068v2.pdf
PWC	https://paperswithcode.com/paper/entity-recognition-at-first-sight-improving
Repo	https://github.com/DS3Lab/ner-at-first-sight
Framework	none
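
Generalizing gaze features to the word type level can be pictured as building a lexicon from the eye-tracking corpora: the features of all occurrences of a word are averaged, and at test time a word is looked up in that lexicon instead of requiring a new recording. The sketch below assumes lowercasing as the type normalization and a zero vector for unseen words. Both choices, and the names `build_type_lexicon` and `lookup`, are illustrative assumptions rather than the authors' procedure.

```python
from collections import defaultdict
from typing import Dict, List


def build_type_lexicon(
    tokens: List[str], features: List[List[float]]
) -> Dict[str, List[float]]:
    """Average token-level gaze feature vectors per (lowercased) word type."""
    grouped: Dict[str, List[List[float]]] = defaultdict(list)
    for token, feat in zip(tokens, features):
        grouped[token.lower()].append(feat)
    return {
        word: [sum(col) / len(col) for col in zip(*rows)]
        for word, rows in grouped.items()
    }


def lookup(word: str, lexicon: Dict[str, List[float]], dim: int) -> List[float]:
    """Return type-level features, or a zero vector for unseen words."""
    return lexicon.get(word.lower(), [0.0] * dim)


# Hypothetical features: [fixation duration in ms, number of fixations].
tokens = ["Paris", "is", "paris"]
features = [[200.0, 1.0], [100.0, 1.0], [300.0, 3.0]]
lexicon = build_type_lexicon(tokens, features)
print(lookup("PARIS", lexicon, dim=2))
```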

Scaling up the randomized gradient-free adversarial attack reveals overestimation of robustness using established attacks


Title	Scaling up the randomized gradient-free adversarial attack reveals overestimation of robustness using established attacks
Authors	Francesco Croce, Jonas Rauber, Matthias Hein
Abstract	Modern neural networks are highly non-robust against adversarial manipulation. A significant amount of work has been invested in techniques to compute lower bounds on robustness through formal guarantees and to build provably robust models. However, it is still difficult to get guarantees for larger networks or robustness against larger perturbations. Thus attack strategies are needed to provide tight upper bounds on the actual robustness. We significantly improve the randomized gradient-free attack for ReLU networks [9], in particular by scaling it up to large networks. We show that our attack achieves similar or significantly smaller robust accuracy than state-of-the-art attacks like PGD or the one of Carlini and Wagner, thus revealing an overestimation of the robustness by these state-of-the-art methods. Our attack is not based on a gradient descent scheme and in this sense gradient-free, which makes it less sensitive to the choice of hyperparameters as no careful selection of the stepsize is required.
Tasks	Adversarial Attack
Published	2019-03-27
URL	https://arxiv.org/abs/1903.11359v2
PDF	https://arxiv.org/pdf/1903.11359v2.pdf
PWC	https://paperswithcode.com/paper/scaling-up-the-randomized-gradient-free
Repo	https://github.com/jonasrauber/linear-region-attack
Framework	jax

Fast Solar Image Classification Using Deep Learning and its Importance for Automation in Solar Physics


Title	Fast Solar Image Classification Using Deep Learning and its Importance for Automation in Solar Physics
Authors	John A. Armstrong, Lyndsay Fletcher
Abstract	The volume of data being collected in solar physics has exponentially increased over the past decade and with the introduction of the Daniel K. Inouye Solar Telescope (DKIST) we will be entering the age of petabyte solar data. Automated feature detection will be an invaluable tool for post-processing of solar images to create catalogues of data ready for researchers to use. We propose a deep learning model to accomplish this; a deep convolutional neural network is adept at feature extraction and processing images quickly. We train our network using data from Hinode/Solar Optical Telescope (SOT) Hα images of a small subset of solar features with different geometries: filaments, prominences, flare ribbons, sunspots and the quiet Sun (i.e. the absence of any of the other four features). We achieve near perfect performance on classifying unseen images from SOT (≈99.9%) in 4.66 seconds. We also for the first time explore transfer learning in a solar context. Transfer learning uses pre-trained deep neural networks to help train new deep learning models i.e. it teaches a new model. We show that our network is robust to changes in resolution by degrading images from SOT resolution (≈0.33″ at λ=6563 Å) to Solar Dynamics Observatory/Atmospheric Imaging Assembly (SDO/AIA) resolution (≈1.2″) without a change in performance of our network. However, we also observe where the network fails to generalise to sunspots from SDO/AIA bands 1600/1700 Å due to small-scale brightenings around the sunspots and prominences in SDO/AIA 304 Å due to coronal emission.
Tasks	Image Classification, Transfer Learning
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13575v1
PDF	https://arxiv.org/pdf/1905.13575v1.pdf
PWC	https://paperswithcode.com/paper/fast-solar-image-classification-using-deep
Repo	https://github.com/rhero12/Slic
Framework	pytorch

Morphological Tagging and Lemmatization of Albanian: A Manually Annotated Corpus and Neural Models


Title	Morphological Tagging and Lemmatization of Albanian: A Manually Annotated Corpus and Neural Models
Authors	Nelda Kote, Marenglen Biba, Jenna Kanerva, Samuel Rönnqvist, Filip Ginter
Abstract	In this paper, we present the first publicly available part-of-speech and morphologically tagged corpus for the Albanian language, as well as a neural morphological tagger and lemmatizer trained on it. There is currently a lack of available NLP resources for Albanian, and its complex grammar and morphology present challenges to their development. We have created an Albanian part-of-speech corpus based on the Universal Dependencies schema for morphological annotation, containing about 118,000 tokens of naturally occurring text collected from different text sources, with an addition of 67,000 tokens of artificially created simple sentences used only in training. On this corpus, we subsequently train and evaluate segmentation, morphological tagging and lemmatization models, using the Turku Neural Parser Pipeline. On the held-out evaluation set, the model achieves 92.74% accuracy on part-of-speech tagging, 85.31% on morphological tagging, and 89.95% on lemmatization. The manually annotated corpus, as well as the trained models are available under an open license.
Tasks	Lemmatization, Morphological Tagging, Part-Of-Speech Tagging
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00991v1
PDF	https://arxiv.org/pdf/1912.00991v1.pdf
PWC	https://paperswithcode.com/paper/morphological-tagging-and-lemmatization-of
Repo	https://github.com/NeldaKote/Albanian-POS
Framework	none

Improving Lemmatization of Non-Standard Languages with Joint Learning


Title	Improving Lemmatization of Non-Standard Languages with Joint Learning
Authors	Enrique Manjavacas, Ákos Kádár, Mike Kestemont
Abstract	Lemmatization of standard languages is concerned with (i) abstracting over morphological differences and (ii) resolving token-lemma ambiguities of inflected words in order to map them to a dictionary headword. In the present paper we aim to improve lemmatization performance on a set of non-standard historical languages in which the difficulty is increased by an additional aspect (iii): spelling variation due to lacking orthographic standards. We approach lemmatization as a string-transduction task with an encoder-decoder architecture which we enrich with sentence context information using a hierarchical sentence encoder. We show significant improvements over the state-of-the-art when training the sentence encoder jointly for lemmatization and language modeling. Crucially, our architecture does not require POS or morphological annotations, which are not always available for historical corpora. Additionally, we also test the proposed model on a set of typologically diverse standard languages, showing results on par with or better than a model without enhanced sentence representations and previous state-of-the-art systems. Finally, to encourage future work on processing of non-standard varieties, we release the dataset of non-standard languages underlying the present study, based on openly accessible sources.
Tasks	Language Modelling, Lemmatization
Published	2019-03-16
URL	http://arxiv.org/abs/1903.06939v1
PDF	http://arxiv.org/pdf/1903.06939v1.pdf
PWC	https://paperswithcode.com/paper/improving-lemmatization-of-non-standard
Repo	https://github.com/emanjavacas/pie-data
Framework	none

Scientific Image Restoration Anywhere


Title	Scientific Image Restoration Anywhere
Authors	Vibhatha Abeykoon, Zhengchun Liu, Rajkumar Kettimuthu, Geoffrey Fox, Ian Foster
Abstract	The use of deep learning models within scientific experimental facilities frequently requires low-latency inference, so that, for example, quality control operations can be performed while data are being collected. Edge computing devices can be useful in this context, as their low cost and compact form factor permit them to be co-located with the experimental apparatus. Can such devices, with their limited resources, perform neural network feed-forward computations efficiently and effectively? We explore this question by evaluating the performance and accuracy of a scientific image restoration model, for which both model input and output are images, on edge computing devices. Specifically, we evaluate deployments of TomoGAN, an image-denoising model based on generative adversarial networks developed for low-dose x-ray imaging, on the Google Edge TPU and NVIDIA Jetson. We adapt TomoGAN for edge execution, evaluate model inference performance, and propose methods to address the accuracy drop caused by model quantization. We show that these edge computing devices can deliver accuracy comparable to that of a full-fledged CPU or GPU model, at speeds that are more than adequate for use in the intended deployments, denoising a 1024 x 1024 image in less than a second. Our experiments also show that the Edge TPU models can provide 3x faster inference response than a CPU-based model and 1.5x faster than an edge GPU-based model. This combination of high speed and low cost permits image restoration anywhere.
Tasks	Denoising, Image Denoising, Image Restoration, Quantization
Published	2019-11-12
URL	https://arxiv.org/abs/1911.05878v1
PDF	https://arxiv.org/pdf/1911.05878v1.pdf
PWC	https://paperswithcode.com/paper/scientific-image-restoration-anywhere
Repo	https://github.com/ramsesproject/TomoGAN
Framework	tf
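
The accuracy drop caused by model quantization comes from rounding real-valued numbers onto a small integer grid. The sketch below shows generic affine 8-bit quantization and the resulting round-trip error. It is a textbook-style illustration under stated assumptions, not the TomoGAN adaptation or the Edge TPU toolchain, and the function names `quantize` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize(values: List[float], num_bits: int = 8) -> Tuple[List[int], float, int]:
    """Map floats to unsigned integers with an affine (scale, zero-point) scheme."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(qmin - lo / scale)
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats from the integer representation."""
    return [(q - zero_point) * scale for q in quantized]


values = [-1.0, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize(values)
restored = dequantize(q, scale, zero_point)
errors = [abs(a - b) for a, b in zip(values, restored)]
print(q, max(errors))
```

Each restored value differs from the original by at most about one quantization step (`scale`). These small errors accumulate through the layers of a network, which is why an image-to-image model can lose visible quality after quantization.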

Variational Learning with Disentanglement-PyTorch


Title	Variational Learning with Disentanglement-PyTorch
Authors	Amir H. Abdi, Purang Abolmaesumi, Sidney Fels
Abstract	Unsupervised learning of disentangled representations is an open problem in machine learning. The Disentanglement-PyTorch library is developed to facilitate research, implementation, and testing of new variational algorithms. In this modular library, neural architectures, dimensionality of the latent space, and the training algorithms are fully decoupled, allowing for independent and consistent experiments across variational methods. The library handles the training scheduling, logging, and visualizations of reconstructions and latent space traversals. It also evaluates the encodings based on various disentanglement metrics. The library, so far, includes implementations of the following unsupervised algorithms: VAE, Beta-VAE, Factor-VAE, DIP-I-VAE, DIP-II-VAE, Info-VAE, and Beta-TCVAE, as well as conditional approaches such as CVAE and IFCVAE. The library is compatible with the Disentanglement Challenge of NeurIPS 2019, hosted on AICrowd, and achieved the 3rd rank in both the first and second stages of the challenge.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05184v1
PDF	https://arxiv.org/pdf/1912.05184v1.pdf
PWC	https://paperswithcode.com/paper/variational-learning-with-disentanglement
Repo	https://github.com/amir-abdi/disentanglement-pytorch
Framework	pytorch
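
As a reminder of what the simplest algorithms in that list optimize, the sketch below computes the Beta-VAE objective for a diagonal Gaussian posterior: a reconstruction term plus the KL divergence to a standard normal prior weighted by beta (beta = 1 gives the plain VAE). It is written in plain Python for clarity and does not use or reproduce the Disentanglement-PyTorch API. The function names are hypothetical.

```python
import math
from typing import List


def gaussian_kl(mu: List[float], log_var: List[float]) -> float:
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def beta_vae_loss(
    recon_error: float, mu: List[float], log_var: List[float], beta: float = 4.0
) -> float:
    """Reconstruction error plus beta-weighted KL term."""
    return recon_error + beta * gaussian_kl(mu, log_var)


# A posterior equal to the prior has zero KL, so only reconstruction remains.
print(beta_vae_loss(1.5, mu=[0.0, 0.0], log_var=[0.0, 0.0]))
print(beta_vae_loss(1.5, mu=[1.0], log_var=[0.0], beta=2.0))
```

Raising beta penalizes latent codes that stray from the prior more strongly, which is the lever Beta-VAE uses to encourage disentangled factors at some cost in reconstruction quality.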

A Benchmark Study on Machine Learning Methods for Fake News Detection


Title	A Benchmark Study on Machine Learning Methods for Fake News Detection
Authors	Junaed Younus Khan, Md. Tawkat Islam Khondaker, Anindya Iqbal, Sadia Afroz
Abstract	The proliferation of fake news and its propagation on social media have become a major concern due to its ability to create devastating impacts. Different machine learning approaches have been attempted to detect it. However, most of those focused on a special type of news (such as political) and did not apply many advanced techniques. In this research, we conduct a benchmark study to assess the performance of different applicable approaches on three different datasets where the largest and most diversified one was developed by us. We also implemented some advanced deep learning models that have shown promising results.
Tasks	Fake News Detection
Published	2019-05-12
URL	https://arxiv.org/abs/1905.04749v1
PDF	https://arxiv.org/pdf/1905.04749v1.pdf
PWC	https://paperswithcode.com/paper/a-benchmark-study-on-machine-learning-methods
Repo	https://github.com/Tawkat/Fake-News-Detection
Framework	none

Adversarial Variational Embedding for Robust Semi-supervised Learning


Title	Adversarial Variational Embedding for Robust Semi-supervised Learning
Authors	Xiang Zhang, Lina Yao, Feng Yuan
Abstract	Semi-supervised learning is sought for leveraging the unlabelled data when labelled data is difficult or expensive to acquire. Deep generative models (e.g., Variational Autoencoder (VAE)) and semi-supervised Generative Adversarial Networks (GANs) have recently shown promising performance in semi-supervised classification for the excellent discriminative representing ability. However, the latent code learned by the traditional VAE is not exclusive (repeatable) for a specific input sample, which prevents it from excellent classification performance. In particular, the learned latent representation depends on a non-exclusive component which is stochastically sampled from the prior distribution. Moreover, the semi-supervised GAN models generate data from a pre-defined distribution (e.g., Gaussian noise) which is independent of the input data distribution; this may obstruct convergence and makes it difficult to control the distribution of the generated data. To address the aforementioned issues, we propose a novel Adversarial Variational Embedding (AVAE) framework for robust and effective semi-supervised learning to leverage both the advantage of GAN as a high quality generative model and VAE as a posterior distribution learner. The proposed approach first produces an exclusive latent code by the model which we call VAE++, and meanwhile, provides a meaningful prior distribution for the generator of GAN. The proposed approach is evaluated over four different real-world applications and we show that our method outperforms the state-of-the-art models, which confirms that the combination of VAE++ and GAN can provide significant improvements in semi-supervised classification.
Tasks
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02361v2
PDF	https://arxiv.org/pdf/1905.02361v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-variational-embedding-for-robust
Repo	https://github.com/xiangzhang1015/Adversarial-Variational-Semi-supervised-Learning
Framework	tf