Paper Group ANR 64
Efficient predicate invention using shared “NeMuS”
Title | Efficient predicate invention using shared “NeMuS” |
Authors | Edjard Mota, Jacob M. Howe, Ana Schramm, Artur d’Avila Garcez |
Abstract | Amao is a cognitive agent framework that tackles the invention of predicates with a different strategy from recent Inductive Logic Programming (ILP) approaches such as the Meta-Interpretive Learning (MIL) technique. It uses a Neural Multi-Space (NeMuS) graph structure to anti-unify atoms from the Herbrand base, which passes in the inductive momentum check. Inductive Clause Learning (ICL), as it is called, is extended here by using the weights of logical components, already present in NeMuS, to support inductive learning by expanding clause candidates with anti-unified atoms. An efficient invention mechanism is achieved, including the learning of recursive hypotheses, while restricting the shape of the hypothesis by adding bias definitions or idiosyncrasies of the language. |
Tasks | |
Published | 2019-06-15 |
URL | https://arxiv.org/abs/1906.06455v1 |
https://arxiv.org/pdf/1906.06455v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-predicate-invention-using-shared |
Repo | |
Framework | |
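The abstract above relies on anti-unification of atoms. As a rough illustration only, the sketch below implements textbook first-order anti-unification (least general generalization) on terms written as nested tuples. It is not the NeMuS graph-based procedure; the tuple encoding and the variable names `X0`, `X1`, ... are assumptions made for this example.

```python
def anti_unify(s, t, table=None):
    """Return the least general generalization of two first-order terms.

    Terms are assumed to be either strings (constants) or tuples of the
    form (functor, arg1, arg2, ...). This encoding is hypothetical.
    """
    if table is None:
        table = {}
    # Identical terms generalize to themselves.
    if s == t:
        return s
    # Same functor and arity: generalize argument by argument.
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        args = tuple(anti_unify(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    # Otherwise introduce a variable, reusing it for a repeated pair
    # so that shared structure is preserved.
    if (s, t) not in table:
        table[(s, t)] = f"X{len(table)}"
    return table[(s, t)]


generalized = anti_unify(("parent", "ann", "bob"), ("parent", "ann", "carl"))
```

Reusing one variable per mismatching pair is what keeps the result least general: `p(a, a)` and `p(b, b)` generalize to `p(X0, X0)` rather than `p(X0, X1)`.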
An entropic feature selection method in perspective of Turing formula
Title | An entropic feature selection method in perspective of Turing formula |
Authors | Jingyi Shi, Jialin Zhang, Yaorong Ge |
Abstract | Health data are generally complex in type and small in sample size. Such domain-specific challenges make it difficult to capture information reliably and contribute further to the issue of generalization. To assist the analytics of healthcare datasets, we develop a feature selection method based on the concept of Coverage Adjusted Standardized Mutual Information (CASMI). The main advantages of the proposed method are: 1) it selects features more efficiently with the help of an improved entropy estimator, particularly when the sample size is small, and 2) it automatically learns the number of features to be selected based on the information from sample data. Additionally, the proposed method handles feature redundancy from the perspective of joint-distribution. The proposed method focuses on non-ordinal data, while it works with numerical data with an appropriate binning method. A simulation study comparing the proposed method to six widely cited feature selection methods shows that the proposed method performs better when measured by the Information Recovery Ratio, particularly when the sample size is small. |
Tasks | Feature Selection |
Published | 2019-02-19 |
URL | http://arxiv.org/abs/1902.07115v1 |
http://arxiv.org/pdf/1902.07115v1.pdf | |
PWC | https://paperswithcode.com/paper/an-entropic-feature-selection-method-in |
Repo | |
Framework | |
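For orientation, the quantity underlying entropic feature selection is mutual information between a feature and the outcome. The sketch below uses the plain plug-in (empirical frequency) estimator, which is known to be biased for small samples. It is not CASMI and does not use the improved entropy estimator the abstract refers to; the function names are chosen for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Plug-in Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


feature = [0, 0, 1, 1]
outcome = [0, 0, 1, 1]
score = mutual_information(feature, outcome)
```

A feature that determines the outcome exactly scores the full entropy of the outcome, while an independent feature scores zero in the large-sample limit.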
Efficient Feature Selection techniques for Sentiment Analysis
Title | Efficient Feature Selection techniques for Sentiment Analysis |
Authors | Avinash Madasu, Sivasankar E |
Abstract | Sentiment analysis is a domain of study that focuses on identifying and classifying the ideas expressed in the form of text into positive, negative and neutral polarities. Feature selection is a crucial process in machine learning. In this paper, we aim to study the performance of different feature selection techniques for sentiment analysis. Term Frequency Inverse Document Frequency (TF-IDF) is used as the feature extraction technique for creating the feature vocabulary. Various Feature Selection (FS) techniques are tested to select the best set of features from the feature vocabulary. The selected features are used to train different machine learning classifiers: Logistic Regression (LR), Support Vector Machines (SVM), Decision Tree (DT) and Naive Bayes (NB). The ensemble techniques Bagging and Random Subspace are applied to the classifiers to enhance performance on sentiment analysis. We show that the best FS techniques, when combined with ensemble methods, achieve remarkable results on sentiment analysis. We also compare the performance of FS methods trained using Bagging and Random Subspace with varied neural network architectures. We show that FS techniques trained using ensemble classifiers outperform neural networks while requiring significantly less training time and fewer parameters, thereby eliminating the need for extensive hyper-parameter tuning. |
Tasks | Feature Selection, Sentiment Analysis |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00288v2 |
https://arxiv.org/pdf/1911.00288v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-feature-selection-techniques-for |
Repo | |
Framework | |
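To make the TF-IDF step concrete, here is a minimal pure-Python sketch that weights terms by term frequency times log(N / document frequency) and then keeps the k terms with the largest total weight. The ranking rule is an illustrative stand-in, not one of the feature selection techniques evaluated in the paper, and whitespace tokenization is an assumption.

```python
from collections import Counter
from math import log


def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def top_k_terms(weights, k):
    """Keep the k terms with the largest summed TF-IDF weight (illustrative rule)."""
    totals = Counter()
    for w in weights:
        totals.update(w)
    return [term for term, _ in totals.most_common(k)]


weights = tfidf(["good movie", "bad movie"])
selected = top_k_terms(weights, 2)
```

A term that occurs in every document, such as "movie" here, gets weight zero and is therefore never preferred by the ranking.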
Deep radiomic features from MRI scans predict survival outcome of recurrent glioblastoma
Title | Deep radiomic features from MRI scans predict survival outcome of recurrent glioblastoma |
Authors | Ahmad Chaddad, Saima Rathore, Mingli Zhang, Christian Desrosiers, Tamim Niazi |
Abstract | This paper proposes to use deep radiomic features (DRFs) from a convolutional neural network (CNN) to model fine-grained texture signatures in the radiomic analysis of recurrent glioblastoma (rGBM). We use DRFs to predict survival of rGBM patients with preoperative T1-weighted post-contrast MR images (n=100). DRFs are extracted from regions of interest labelled by a radiation oncologist and used to compare between short-term and long-term survival patient groups. Random forest (RF) classification is employed to predict survival outcome (i.e., short or long survival), as well as to identify highly group-informative descriptors. Classification using DRFs results in an area under the ROC curve (AUC) of 89.15% (p<0.01) in predicting rGBM patient survival, compared to 78.07% (p<0.01) when using standard radiomic features (SRF). These results indicate the potential of DRFs as a prognostic marker for patients with rGBM. |
Tasks | |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.06687v1 |
https://arxiv.org/pdf/1911.06687v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-radiomic-features-from-mri-scans-predict |
Repo | |
Framework | |
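The results above are reported as area under the ROC curve. As a reminder of what that number measures, this sketch computes AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, counting ties as half. The labels and scores are made-up illustration values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    # Each correctly ordered pair counts 1, each tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = long survival) and classifier scores.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This pairwise form is quadratic in the number of cases, which is fine for a cohort of about a hundred patients.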
Hyperspectral City V1.0 Dataset and Benchmark
Title | Hyperspectral City V1.0 Dataset and Benchmark |
Authors | Shaodi You, Erqi Huang, Shuaizhe Liang, Yongrong Zheng, Yunxiang Li, Fan Wang, Sen Lin, Qiu Shen, Xun Cao, Diming Zhang, Yuanjiang Li, Yu Li, Ying Fu, Boxin Shi, Feng Lu, Yinqiang Zheng, Robby T. Tan |
Abstract | This document introduces the background and the usage of the Hyperspectral City Dataset and the benchmark. The documentation starts with the background and motivation of the dataset. Following that, we briefly describe the method of collecting the dataset and the processing from the raw dataset to the final release dataset, specifically version 1.0. We also provide the detailed usage of the dataset and the evaluation metric for submitting results to the 2019 Hyperspectral City Challenge. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10270v4 |
https://arxiv.org/pdf/1907.10270v4.pdf | |
PWC | https://paperswithcode.com/paper/hyperspectral-city-v10-dataset-and-benchmark |
Repo | |
Framework | |
Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models
Title | Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models |
Authors | Abhinav Garg, Dhananjaya Gowda, Ankur Kumar, Kwangyoun Kim, Mehul Kumar, Chanwoo Kim |
Abstract | In this paper, we propose a refined multi-stage multi-task training strategy to improve the performance of online attention-based encoder-decoder (AED) models. A three-stage training based on three levels of architectural granularity namely, character encoder, byte pair encoding (BPE) based encoder, and attention decoder, is proposed. Also, multi-task learning based on two-levels of linguistic granularity namely, character and BPE, is used. We explore different pre-training strategies for the encoders including transfer learning from a bidirectional encoder. Our encoder-decoder models with online attention show 35% and 10% relative improvement over their baselines for smaller and bigger models, respectively. Our models achieve a word error rate (WER) of 5.04% and 4.48% on the Librispeech test-clean data for the smaller and bigger models respectively after fusion with long short-term memory (LSTM) based external language model (LM). |
Tasks | Language Modelling, Multi-Task Learning, Transfer Learning |
Published | 2019-12-28 |
URL | https://arxiv.org/abs/1912.12384v1 |
https://arxiv.org/pdf/1912.12384v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-multi-stage-training-of-online |
Repo | |
Framework | |
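For readers unfamiliar with the two metrics quoted above, the sketch below computes word error rate as word-level edit distance divided by reference length, and relative improvement as the fractional reduction from a baseline. The example sentences and the numbers passed to `relative_improvement` are hypothetical, not figures from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the processed prefix of ref
    # and hyp[:j]; it starts as the distance from the empty prefix.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline, improved):
    """Fractional reduction of an error metric relative to the baseline."""
    return (baseline - improved) / baseline


wer = word_error_rate("the cat sat", "the dog sat")
gain = relative_improvement(10.0, 6.5)  # hypothetical error rates
```

Relative improvement is a ratio of error rates, so a 35% relative gain does not mean the absolute WER fell by 35 points.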
Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning
Title | Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning |
Authors | Nuri Mert Vural, Fatih Ilhan, Suleyman S. Kozat |
Abstract | We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long short-term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations – introduced due to decoupling – stay bounded, DEKF learns LSTM parameters with convergence and stability properties similar to those of the global extended Kalman filter learning algorithm. We verify our results with several numerical simulations and compare DEKF with other LSTM training methods. In our simulations, we also observe that the well-known hyper-parameter selection approaches used for DEKF in the literature satisfy our conditions. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.12258v3 |
https://arxiv.org/pdf/1911.12258v3.pdf | |
PWC | https://paperswithcode.com/paper/stability-of-the-decoupled-extended-kalman |
Repo | |
Framework | |
Training Deep Neural Networks to Detect Repeatable 2D Features Using Large Amounts of 3D World Capture Data
Title | Training Deep Neural Networks to Detect Repeatable 2D Features Using Large Amounts of 3D World Capture Data |
Authors | Alexander Mai, Joseph Menke, Allen Yang |
Abstract | Image space feature detection is the act of selecting points or parts of an image that are easy to distinguish from the surrounding image region. By combining a repeatable point detection with a descriptor, parts of an image can be matched with one another, which is useful in applications like estimating pose from camera input or rectifying images. Recently, precise indoor tracking has started to become important for Augmented and Virtual reality as it is necessary to allow positioning of a headset in 3D space without the need for external tracking devices. Several modern feature detectors use homographies to simulate different viewpoints, not only to train feature detection and description, but to test them as well. The problem is that, often, views of indoor spaces contain high depth disparity. This makes the approximation that a homography applied to an image represents a viewpoint change inaccurate. We claim that in order to train detectors to work well in indoor environments, they must be robust to this type of geometry, and repeatable under true viewpoint change instead of homographies. Here we focus on the problem of detecting repeatable feature locations under true viewpoint change. To this end, we generate labeled 2D images from a photo-realistic 3D dataset. These images are used for training a neural network based feature detector. We further present an algorithm for automatically generating labels of repeatable 2D features, and present a fast, easy-to-use test algorithm for evaluating a detector in a 3D environment. |
Tasks | |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04384v1 |
https://arxiv.org/pdf/1912.04384v1.pdf | |
PWC | https://paperswithcode.com/paper/training-deep-neural-networks-to-detect |
Repo | |
Framework | |
Encoders Help You Disambiguate Word Senses in Neural Machine Translation
Title | Encoders Help You Disambiguate Word Senses in Neural Machine Translation |
Authors | Gongbo Tang, Rico Sennrich, Joakim Nivre |
Abstract | Neural machine translation (NMT) has achieved new state-of-the-art performance in translating ambiguous words. However, it is still unclear which component dominates the process of disambiguation. In this paper, we explore the ability of NMT encoders and decoders to disambiguate word senses by evaluating hidden states and investigating the distributions of self-attention. We train a classifier to predict whether a translation is correct given the representation of an ambiguous noun. We find that encoder hidden states outperform word embeddings significantly, which indicates that encoders adequately encode relevant information for disambiguation into hidden states. In contrast to encoders, the effect of the decoder differs across models with different architectures. Moreover, the attention weights and attention entropy show that self-attention can detect ambiguous nouns and distribute more attention to the context. |
Tasks | Machine Translation, Word Embeddings |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11771v1 |
https://arxiv.org/pdf/1908.11771v1.pdf | |
PWC | https://paperswithcode.com/paper/encoders-help-you-disambiguate-word-senses-in |
Repo | |
Framework | |
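The abstract uses attention entropy as evidence of how widely attention is spread. A minimal sketch of that measure, assuming it is the Shannon entropy of one row of attention weights (a probability distribution over source positions), is given below; the example weight vectors are invented.

```python
from math import log


def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution.

    Zero weights are skipped, following the convention 0 * log(0) = 0.
    """
    return -sum(w * log(w) for w in weights if w > 0)


focused = attention_entropy([0.97, 0.01, 0.01, 0.01])  # attention on one token
spread = attention_entropy([0.25, 0.25, 0.25, 0.25])   # attention on context
```

Higher entropy means the weights are distributed over more positions, which is the sense in which an ambiguous noun can be said to attend more to its context.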
Machine Learning-based Estimation of Forest Carbon Stocks to increase Transparency of Forest Preservation Efforts
Title | Machine Learning-based Estimation of Forest Carbon Stocks to increase Transparency of Forest Preservation Efforts |
Authors | Björn Lütjens, Lucas Liebenwein, Katharina Kramer |
Abstract | An increasing number of companies and cities plan to become CO2-neutral, which requires them to invest in renewable energies and carbon emission offsetting solutions. One of the cheapest carbon offsetting solutions is preventing deforestation in developing nations, a major contributor to global greenhouse gas emissions. However, forest preservation projects historically display an issue of trust and transparency, which drives companies to invest in transparent but expensive air carbon capture facilities. Preservation projects could conduct accurate forest inventories (tree diameter, species, height, etc.) to transparently estimate the biomass and amount of stored carbon. However, current rainforest inventories are too inaccurate, because they are often based on a few expensive ground-based samples and/or low-resolution satellite imagery. LiDAR-based solutions, used in US forests, are accurate, but cost-prohibitive and hardly accessible in the Amazon rainforest. We propose accurate and cheap forest inventory analyses through Deep Learning-based processing of drone imagery. The more transparent estimation of stored carbon will create higher transparency towards clients and thereby increase trust and investment in forest preservation projects. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.07850v1 |
https://arxiv.org/pdf/1912.07850v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-estimation-of-forest |
Repo | |
Framework | |
Using LSTMs to Model the Java Programming Language
Title | Using LSTMs to Model the Java Programming Language |
Authors | Brendon Boldt |
Abstract | Recurrent neural networks (RNNs), specifically long short-term memory networks (LSTMs), can model natural language effectively. This research investigates the ability of these same LSTMs to perform next “word” prediction on the Java programming language. Java source code from four different repositories undergoes a transformation that preserves the logical structure of the source code and removes the code’s various specificities such as variable names and literal values. Such datasets and an additional English language corpus are used to train and test standard LSTMs’ ability to predict the next element in a sequence. Results suggest that LSTMs can effectively model Java code, achieving perplexities under 22 and accuracies above 0.47, which is an improvement over the LSTMs’ performance on the English language, which showed a perplexity of 85 and an accuracy of 0.27. This research may be applicable in other areas such as syntactic template suggestion and automated bug patching. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.11685v1 |
https://arxiv.org/pdf/1908.11685v1.pdf | |
PWC | https://paperswithcode.com/paper/using-lstms-to-model-the-java-programming |
Repo | |
Framework | |
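Perplexity, the metric quoted above, is the exponential of the average negative log-probability that a model assigns to each true next token. The sketch below shows the calculation on a hand-written list of probabilities; those values are illustrative and do not come from the paper.

```python
from math import exp, log


def perplexity(probabilities):
    """Perplexity of a sequence given the model's probability of each true token."""
    avg_neg_log = -sum(log(p) for p in probabilities) / len(probabilities)
    return exp(avg_neg_log)


# A model that always gives the correct token probability 1/4 behaves like
# a uniform choice among four options.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
```

Read this way, a perplexity under 22 means the model is on average about as uncertain as a uniform choice among fewer than 22 tokens.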
F10-SGD: Fast Training of Elastic-net Linear Models for Text Classification and Named-entity Recognition
Title | F10-SGD: Fast Training of Elastic-net Linear Models for Text Classification and Named-entity Recognition |
Authors | Stanislav Peshterliev, Alexander Hsieh, Imre Kiss |
Abstract | Voice-assistant text classification and named-entity recognition (NER) models are trained on millions of example utterances. Because of the large datasets, long training time is one of the bottlenecks for releasing improved models. In this work, we develop F10-SGD, a fast optimizer for text classification and NER elastic-net linear models. On internal datasets, F10-SGD provides a 4x reduction in training time compared to the OWL-QN optimizer without loss of accuracy or increase in model size. Furthermore, we incorporate biased sampling that prioritizes harder examples towards the end of the training. As a result, in addition to faster training, we were able to obtain statistically significant accuracy improvements for NER. On public datasets, F10-SGD obtains 22% faster training time compared to FastText for text classification, and a 4x reduction in training time compared to CRFSuite OWL-QN for NER. |
Tasks | Named Entity Recognition, Text Classification |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10649v1 |
http://arxiv.org/pdf/1902.10649v1.pdf | |
PWC | https://paperswithcode.com/paper/f10-sgd-fast-training-of-elastic-net-linear |
Repo | |
Framework | |
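Elastic-net models combine an L1 and an L2 penalty on the weights. One common way to apply both penalties inside stochastic gradient descent is a proximal step after each gradient update, sketched below for a single weight. This is a generic textbook update, not the F10-SGD algorithm, whose details the abstract does not give; the parameter names `lr`, `l1` and `l2` are assumptions.

```python
from math import copysign


def elastic_net_prox(w, lr, l1, l2):
    """Proximal step for the penalty l1 * |w| + (l2 / 2) * w**2 with step size lr.

    Soft-thresholding handles the L1 term (and produces exact zeros);
    the division applies the L2 shrinkage.
    """
    shrunk = max(abs(w) - lr * l1, 0.0)
    return copysign(shrunk, w) / (1.0 + lr * l2)


def sgd_step(w, grad, lr, l1, l2):
    """One SGD update on the data loss followed by the elastic-net proximal step."""
    return elastic_net_prox(w - lr * grad, lr, l1, l2)


small = elastic_net_prox(0.05, lr=0.1, l1=1.0, l2=0.0)  # clipped to zero
large = elastic_net_prox(1.0, lr=0.1, l1=1.0, l2=1.0)   # shrunk towards zero
```

The exact zeros produced by the soft-threshold are what keep the model sparse, which matters when model size is a constraint.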
Learning Visual Features Under Motion Invariance
Title | Learning Visual Features Under Motion Invariance |
Authors | Alessandro Betti, Marco Gori, Stefano Melacci |
Abstract | Humans are continuously exposed to a stream of visual data with a natural temporal structure. However, most successful computer vision algorithms work at image level, completely discarding the precious information carried by motion. In this paper, we claim that processing visual streams naturally leads to the formulation of the motion invariance principle, which enables the construction of a new theory of learning that originates from variational principles, just like in physics. Such a principled approach is well suited for a discussion of a number of interesting questions that arise in vision, and it offers a well-posed computational scheme for the discovery of convolutional filters over the retina. Unlike traditional convolutional networks, which need massive supervision, the proposed theory offers a truly new scenario for the unsupervised processing of video signals, where features are extracted in a multi-layer architecture with motion invariance. While the theory enables the implementation of novel computer vision systems, it also sheds light on the role of information-based principles in driving possible biological solutions. |
Tasks | |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00350v1 |
https://arxiv.org/pdf/1909.00350v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-visual-features-under-motion |
Repo | |
Framework | |
Richer and Deeper Supervision Network for Salient Object Detection
Title | Richer and Deeper Supervision Network for Salient Object Detection |
Authors | Sen Jia, Neil D. B. Bruce |
Abstract | Recent Salient Object Detection (SOD) systems are mostly based on Convolutional Neural Networks (CNNs). Specifically, the Deeply Supervised Saliency (DSS) system has shown that it is very useful to add short connections to the network and to supervise the side outputs. In this work, we propose a new SOD system which aims at designing a more efficient and effective way to pass back global information. Richer and Deeper Supervision (RDS) is applied to better combine features from each side output without demanding much extra computational space. Meanwhile, the backbone network used for SOD is normally pre-trained on the object classification dataset, ImageNet. However, the pre-trained model has been trained on cropped images in order to focus only on distinguishing features within the region of the object, whereas the ignored background information is also significant in the task of SOD. We try to solve this problem by introducing training data designed for object detection. Coarse global information is learned from an entire image with its bounding box before training on the SOD dataset. The large scale of object images can slightly improve the performance of SOD. Our experiment shows the proposed RDS network achieves state-of-the-art results on five public SOD datasets. |
Tasks | Object Classification, Object Detection, Salient Object Detection |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02425v1 |
http://arxiv.org/pdf/1901.02425v1.pdf | |
PWC | https://paperswithcode.com/paper/richer-and-deeper-supervision-network-for |
Repo | |
Framework | |
Efficient Supervision for Robot Learning via Imitation, Simulation, and Adaptation
Title | Efficient Supervision for Robot Learning via Imitation, Simulation, and Adaptation |
Authors | Markus Wulfmeier |
Abstract | Recent successes in machine learning have led to a shift in the design of autonomous systems, improving performance on existing tasks and rendering new applications possible. Data-focused approaches gain relevance across diverse, intricate applications when developing data collection and curation pipelines becomes more effective than manual behaviour design. The following work aims at increasing the efficiency of this pipeline in two principal ways: by utilising more powerful sources of informative data and by extracting additional information from existing data. In particular, we target three orthogonal fronts: imitation learning, domain adaptation, and transfer from simulation. |
Tasks | Domain Adaptation, Imitation Learning |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07346v1 |
http://arxiv.org/pdf/1904.07346v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-supervision-for-robot-learning-via |
Repo | |
Framework | |