Paper Group ANR 464
Adaptive Pruning of Neural Language Models for Mobile Devices
Title | Adaptive Pruning of Neural Language Models for Mobile Devices |
Authors | Raphael Tang, Jimmy Lin |
Abstract | Neural language models (NLMs) exist in an accuracy-efficiency tradeoff space where better perplexity typically comes at the cost of greater computational complexity. In a software keyboard application on mobile devices, this translates into higher power consumption and shorter battery life. This paper represents the first attempt, to our knowledge, at exploring accuracy-efficiency tradeoffs for NLMs. Building on quasi-recurrent neural networks (QRNNs), we apply pruning techniques to provide a “knob” to select different operating points. In addition, we propose a simple technique to recover some perplexity using a negligible amount of memory. Our empirical evaluations consider both perplexity and energy consumption on a Raspberry Pi, where we demonstrate which methods provide the best perplexity-power consumption operating point. At one operating point, one of the techniques is able to provide energy savings of 40% over the state of the art with only a 17% relative increase in perplexity. |
Tasks | |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10282v1 |
http://arxiv.org/pdf/1809.10282v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-pruning-of-neural-language-models |
Repo | |
Framework | |
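The pruning “knob” described in the abstract can be illustrated with a generic magnitude-pruning sketch. This is an assumption for illustration only, not the authors' QRNN-specific method: the target sparsity plays the role of the operating-point knob.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude weights until roughly the
    target sparsity (fraction of zeroed weights) is reached.
    Ties at the threshold may prune slightly more than requested."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold = magnitude of the n_prune-th smallest weight.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Prune half of a toy weight vector.
pruned = prune_by_magnitude([0.5, -0.1, 0.9, 0.05, -0.7, 0.2], sparsity=0.5)
```

Turning the `sparsity` knob up trades perplexity for less computation, which is the tradeoff the paper measures on the Raspberry Pi.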
Splitting Epistemic Logic Programs
Title | Splitting Epistemic Logic Programs |
Authors | Pedro Cabalar, Jorge Fandinno, Luis Fariñas del Cerro |
Abstract | Epistemic logic programs constitute an extension of the stable models semantics to deal with new constructs called subjective literals. Informally speaking, a subjective literal allows checking whether some regular literal is true in all stable models or in some stable model. As can be imagined, the associated semantics has proved to be non-trivial, as the truth of the subjective literal may interfere with the set of stable models it is supposed to query. As a consequence, no clear agreement has been reached and different semantic proposals have been made in the literature. Unfortunately, comparison among these proposals has been limited to a study of their effect on individual examples, rather than identifying general properties to be checked. In this paper, we propose an extension of the well-known splitting property for logic programs to the epistemic case. To this end, we formally define when an arbitrary semantics satisfies the epistemic splitting property and examine some of the consequences that can be derived from that, including its relation to conformant planning and to epistemic constraints. Interestingly, we prove (through counterexamples) that most of the existing proposals fail to fulfill the epistemic splitting property, except the original semantics proposed by Gelfond in 1991. |
Tasks | |
Published | 2018-12-20 |
URL | https://arxiv.org/abs/1812.08763v2 |
https://arxiv.org/pdf/1812.08763v2.pdf | |
PWC | https://paperswithcode.com/paper/splitting-epistemic-logic-programs |
Repo | |
Framework | |
Uniform regret bounds over $R^d$ for the sequential linear regression problem with the square loss
Title | Uniform regret bounds over $R^d$ for the sequential linear regression problem with the square loss |
Authors | Pierre Gaillard, Sébastien Gerchinovitz, Malo Huard, Gilles Stoltz |
Abstract | We consider the setting of online linear regression for arbitrary deterministic sequences, with the square loss. We are interested in the aim set by Bartlett et al. (2015): obtain regret bounds that hold uniformly over all competitor vectors. When the feature sequence is known at the beginning of the game, they provided closed-form regret bounds of $2d B^2 \ln T + \mathcal{O}_T(1)$, where $T$ is the number of rounds and $B$ is a bound on the observations. Instead, we derive bounds with an optimal constant of $1$ in front of the $d B^2 \ln T$ term. In the case of sequentially revealed features, we also derive an asymptotic regret bound of $d B^2 \ln T$ for any individual sequence of features and bounded observations. All our algorithms are variants of the online non-linear ridge regression forecaster, either with a data-dependent regularization or with almost no regularization. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11386v2 |
http://arxiv.org/pdf/1805.11386v2.pdf | |
PWC | https://paperswithcode.com/paper/uniform-regret-bounds-over-rd-for-the |
Repo | |
Framework | |
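The forecaster family the abstract describes can be sketched in one dimension: before round $t$, predict with the ridge estimate fitted on rounds $1..t-1$. This is a minimal sketch of the plain online ridge forecaster; the paper's variants (data-dependent regularization, almost no regularization) are not reproduced here.

```python
def online_ridge_forecaster(xs, ys, lam=1.0):
    """One-dimensional online ridge regression: at each round,
    predict y_t from the ridge solution on all previous rounds,
    then reveal (x_t, y_t) and update the sufficient statistics."""
    sxx = 0.0  # sum of x_s^2 over past rounds
    sxy = 0.0  # sum of x_s * y_s over past rounds
    preds = []
    for x, y in zip(xs, ys):
        # Ridge estimate theta_t = sxy / (lam + sxx); predict theta_t * x.
        preds.append((sxy / (lam + sxx)) * x)
        sxx += x * x
        sxy += x * y
    return preds

# Constant target: predictions shrink toward y = 1 as evidence accumulates.
preds = online_ridge_forecaster([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

The regret of such forecasters against any fixed comparator vector is what the paper bounds by $d B^2 \ln T$ with leading constant $1$.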
Unsupervised Pseudo-Labeling for Extractive Summarization on Electronic Health Records
Title | Unsupervised Pseudo-Labeling for Extractive Summarization on Electronic Health Records |
Authors | Xiangan Liu, Keyang Xu, Pengtao Xie, Eric Xing |
Abstract | Extractive summarization is very useful for physicians to better manage and digest Electronic Health Records (EHRs). However, the training of a supervised model requires disease-specific medical background and is thus very expensive. We studied how to utilize the intrinsic correlation between multiple EHRs to generate pseudo-labels and train a supervised model with no external annotation. Experiments on real-patient data validate that our model is effective in summarizing crucial disease-specific information for patients. |
Tasks | |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08040v3 |
http://arxiv.org/pdf/1811.08040v3.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-pseudo-labeling-for-extractive |
Repo | |
Framework | |
Algorithms for screening of Cervical Cancer: A chronological review
Title | Algorithms for screening of Cervical Cancer: A chronological review |
Authors | Yasha Singh, Dhruv Srivastava, P. S. Chandranand, Dr. Surinder Singh |
Abstract | There are various algorithms and methodologies used for automated screening of cervical cancer by segmenting and classifying cervical cancer cells into different categories. This study presents a critical review of research papers that integrate AI methods into cervical cancer screening via different approaches, analyzed in terms of typical metrics such as dataset size, accuracy, and drawbacks. An attempt has been made to furnish the reader with an insight into machine learning algorithms such as SVM (Support Vector Machines), GLCM (Gray Level Co-occurrence Matrix), k-NN (k-Nearest Neighbours), MARS (Multivariate Adaptive Regression Splines), CNNs (Convolutional Neural Networks), spatial fuzzy clustering algorithms, PNNs (Probabilistic Neural Networks), Genetic Algorithms, RFT (Random Forest Trees), C5.0, CART (Classification and Regression Trees) and hierarchical clustering algorithms for feature extraction, cell segmentation and classification. This paper also covers the publicly available datasets related to cervical cancer. It presents a holistic review, in chronological order, of the computational methods for the detection of malignant cells that have evolved over time. |
Tasks | Cell Segmentation |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00849v1 |
http://arxiv.org/pdf/1811.00849v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithms-for-screening-of-cervical-cancer-a |
Repo | |
Framework | |
Structured Inhomogeneous Density Map Learning for Crowd Counting
Title | Structured Inhomogeneous Density Map Learning for Crowd Counting |
Authors | Hanhui Li, Xiangjian He, Hefeng Wu, Saeed Amirgholipour Kasmani, Ruomei Wang, Xiaonan Luo, Liang Lin |
Abstract | In this paper, we aim at tackling the problem of crowd counting in extremely high-density scenes, which contain hundreds or even thousands of people. We begin with a comprehensive analysis of the most widely used density map-based methods, and demonstrate how easily existing methods are affected by the inhomogeneous density distribution problem, e.g., causing them to be sensitive to outliers or hard to optimize. We then present an extremely simple solution to the inhomogeneous density distribution problem, which can be intuitively summarized as extending the density map from 2D to 3D, with the extra dimension implicitly indicating the density level. Such a solution can be implemented by a single Density-Aware Network, which is not only easy to train, but also achieves state-of-the-art performance on various challenging datasets. |
Tasks | Crowd Counting |
Published | 2018-01-20 |
URL | http://arxiv.org/abs/1801.06642v1 |
http://arxiv.org/pdf/1801.06642v1.pdf | |
PWC | https://paperswithcode.com/paper/structured-inhomogeneous-density-map-learning |
Repo | |
Framework | |
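One way to read “extending the density map from 2D to 3D” is to let the extra axis encode which density level each pixel falls into. The binning scheme below is my own illustrative assumption; the paper's Density-Aware Network learns this structure rather than applying fixed bins.

```python
def density_2d_to_3d(density, levels):
    """Lift a 2D density map to 3D: for each pixel, place its density
    value in the channel of the density-level bin it falls into
    (levels = sorted upper bounds of the bins), zeros elsewhere."""
    k = len(levels)
    volume = []
    for row in density:
        out_row = []
        for d in row:
            # First bin whose upper bound covers d; last bin catches the rest.
            bin_idx = next((i for i, ub in enumerate(levels) if d <= ub), k - 1)
            out_row.append([d if i == bin_idx else 0.0 for i in range(k)])
        volume.append(out_row)
    return volume

# A tiny 2x2 density map split into low / medium / high density levels.
density = [[0.0, 0.2], [1.5, 3.0]]
volume = density_2d_to_3d(density, levels=[0.5, 2.0, 10.0])
```

Separating density levels into channels means outliers in very dense regions no longer dominate the loss on sparse regions, which is the sensitivity problem the abstract points to.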
Bayesian Optimization in AlphaGo
Title | Bayesian Optimization in AlphaGo |
Authors | Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver, Nando de Freitas |
Abstract | During the development of AlphaGo, its many hyper-parameters were tuned with Bayesian optimization multiple times. This automatic tuning process resulted in substantial improvements in playing strength. For example, prior to the match with Lee Sedol, we tuned the latest AlphaGo agent and this improved its win-rate from 50% to 66.5% in self-play games. This tuned version was deployed in the final match. Of course, since we tuned AlphaGo many times during its development cycle, the compounded contribution was even higher than this percentage. It is our hope that this brief case study will be of interest to Go fans, and also provide Bayesian optimization practitioners with some insights and inspiration. |
Tasks | |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.06855v1 |
http://arxiv.org/pdf/1812.06855v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-optimization-in-alphago |
Repo | |
Framework | |
The Best of Both Worlds: Lexical Resources To Improve Low-Resource Part-of-Speech Tagging
Title | The Best of Both Worlds: Lexical Resources To Improve Low-Resource Part-of-Speech Tagging |
Authors | Barbara Plank, Sigrid Klerke, Zeljko Agic |
Abstract | In natural language processing, the deep learning revolution has shifted the focus from conventional hand-crafted symbolic representations to dense inputs, which are adequate representations learned automatically from corpora. However, particularly when working with low-resource languages, small amounts of symbolic lexical resources such as user-generated lexicons are often available even when gold-standard corpora are not. Such additional linguistic information, though, is often neglected, and recent neural approaches to cross-lingual tagging typically rely only on word and subword embeddings. While these representations are effective, our recent work has shown clear benefits of combining the best of both worlds: integrating conventional lexical information improves neural cross-lingual part-of-speech (PoS) tagging. However, little is known about how complementary such additional information is, and to what extent improvements depend on the coverage and quality of these external resources. This paper seeks to fill this gap by providing the first thorough analysis of the contributions of lexical resources for cross-lingual PoS tagging in neural times. |
Tasks | Part-Of-Speech Tagging |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08757v1 |
http://arxiv.org/pdf/1811.08757v1.pdf | |
PWC | https://paperswithcode.com/paper/the-best-of-both-worlds-lexical-resources-to |
Repo | |
Framework | |
Semantics Meet Saliency: Exploring Domain Affinity and Models for Dual-Task Prediction
Title | Semantics Meet Saliency: Exploring Domain Affinity and Models for Dual-Task Prediction |
Authors | Md Amirul Islam, Mahmoud Kalash, Neil D. B. Bruce |
Abstract | Much research has examined models for prediction of semantic labels or instances including dense pixel-wise prediction. The problem of predicting salient objects or regions of an image has also been examined in a similar light. With that said, there is an apparent relationship between these two problem domains in that the composition of a scene and associated semantic categories is certain to play into what is deemed salient. In this paper, we explore the relationship between these two problem domains. This is carried out by constructing deep neural networks that perform both predictions together, albeit with different configurations for the flow of conceptual information related to each distinct problem. This is accompanied by a detailed analysis of object co-occurrences that sheds light on dataset bias and semantic precedence specific to individual categories. |
Tasks | |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09430v1 |
http://arxiv.org/pdf/1807.09430v1.pdf | |
PWC | https://paperswithcode.com/paper/semantics-meet-saliency-exploring-domain |
Repo | |
Framework | |
Rollable Latent Space for Azimuth Invariant SAR Target Recognition
Title | Rollable Latent Space for Azimuth Invariant SAR Target Recognition |
Authors | Kazutoshi Sagi, Takahiro Toizumi, Yuzo Senda |
Abstract | This paper proposes a rollable latent space (RLS) for azimuth-invariant synthetic aperture radar (SAR) target recognition. Scarce labeled data and limited viewing directions are critical issues in SAR target recognition. The RLS is a designed space in which rolling of latent features corresponds to 3D rotation of an object. Thus latent features of an arbitrary view can be inferred using those of different views. This characteristic further enables us to augment data from limited viewpoints in the RLS. RLS-based classifiers with and without data augmentation, and a conventional classifier trained with target front shots, are evaluated over untrained target back shots. Results show that the RLS-based classifier with augmentation improves accuracy by 30% compared to the conventional classifier. |
Tasks | Data Augmentation |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.01821v2 |
http://arxiv.org/pdf/1802.01821v2.pdf | |
PWC | https://paperswithcode.com/paper/rollable-latent-space-for-azimuth-invariant |
Repo | |
Framework | |
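The core idea — rolling latent features to simulate azimuth rotation — can be sketched with a circular shift. The mapping of one shift step to one azimuth increment is an illustrative assumption; the paper designs the latent space so this correspondence holds.

```python
def roll_latent(z, steps):
    """Circularly shift a latent vector; in a rollable latent space
    this corresponds to rotating the object's azimuth."""
    steps %= len(z)
    if steps == 0:
        return list(z)
    return z[-steps:] + z[:-steps]

def augment_views(z, n_views):
    """Infer latent codes for n_views evenly spaced azimuths from a
    single observed view, simply by rolling its latent code."""
    step = len(z) // n_views
    return [roll_latent(z, i * step) for i in range(n_views)]

# One observed view expands to four virtual views for training.
views = augment_views([1.0, 2.0, 3.0, 4.0], n_views=4)
```

This is the augmentation mechanism the abstract credits with the 30% accuracy gain over a classifier trained only on front shots.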
A general learning system based on neuron bursting and tonic firing
Title | A general learning system based on neuron bursting and tonic firing |
Authors | Hin Wai Lui |
Abstract | This paper proposes a framework for the biological learning mechanism as a general learning system. The proposal is as follows. The bursting and tonic modes of firing patterns found in many neuron types in the brain correspond to two separate modes of information processing, with one mode resulting in awareness, and another mode being subliminal. In such a coding scheme, a neuron in bursting state codes for the highest level of perceptual abstraction representing a pattern of sensory stimuli, or volitional abstraction representing a pattern of muscle contraction sequences. Within the 50-250 ms minimum integration time of experience, the bursting neurons form synchrony ensembles to allow for binding of related percepts. The degree to which different bursting neurons can be merged into the same synchrony ensemble depends on the underlying cortical connections that represent the degree of perceptual similarity. These synchrony ensembles compete for selective attention to remain active. The dominant synchrony ensemble triggers episodic memory recall in the hippocampus, while forming new episodic memory with current sensory stimuli, resulting in a stream of thoughts. Neuromodulation modulates both top-down selection of synchrony ensembles, and memory formation. Episodic memory stored in the hippocampus is transferred to semantic and procedural memory in the cortex during rapid eye movement sleep, by updating cortical neuron synaptic weights with spike-timing-dependent plasticity. With the update of synaptic weights, new neurons become bursting while previous bursting neurons become tonic, allowing bursting neurons to move up to a higher level of perceptual abstraction. Finally, the proposed learning mechanism is compared with the back-propagation algorithm used in deep neural networks, and we outline how the credit assignment problem can be addressed within this framework. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09084v1 |
http://arxiv.org/pdf/1810.09084v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-learning-system-based-on-neuron |
Repo | |
Framework | |
TrISec: Training Data-Unaware Imperceptible Security Attacks on Deep Neural Networks
Title | TrISec: Training Data-Unaware Imperceptible Security Attacks on Deep Neural Networks |
Authors | Faiq Khalid, Muhammad Abdullah Hanif, Semeen Rehman, Rehan Ahmed, Muhammad Shafique |
Abstract | Most of the data manipulation attacks on deep neural networks (DNNs) during the training stage introduce a perceptible noise that can be countered by preprocessing during inference. Therefore, data poisoning attacks during inference (e.g., adversarial attacks) are becoming more popular. However, they do not consider the imperceptibility factor in their optimization algorithms, and can be detected by correlation and structural testing. Therefore, in this paper, we propose a novel methodology which automatically generates imperceptible attack images by using the back-propagation algorithm on pre-trained DNNs. We present a case study on traffic sign detection using the VGGNet and the German Traffic Sign Recognition Benchmarks dataset in an autonomous driving use case. Our results demonstrate that the generated attack images successfully cause misclassification while remaining imperceptible in both subjective and objective quality tests. |
Tasks | Autonomous Driving, Autonomous Vehicles, data poisoning, Image Generation, Traffic Sign Recognition |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.01031v2 |
http://arxiv.org/pdf/1811.01031v2.pdf | |
PWC | https://paperswithcode.com/paper/isa4ml-training-data-unaware-imperceptible |
Repo | |
Framework | |
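The general mechanism — perturbing an input along the gradient of a pre-trained model's output, with a per-pixel cap to keep the change small — can be sketched on a toy linear scorer. This is an FGSM-style sketch of gradient-based attack generation, not TrISec itself, whose objective additionally optimizes perceptual quality metrics.

```python
def attack_linear(x, w, eps):
    """Gradient-sign perturbation of input x against a linear score
    w.x: step each feature by -eps * sign(d(score)/dx_i), so the
    change is capped at eps per feature (the imperceptibility budget).
    For this linear model, d(w.x)/dx_i is simply w_i."""
    return [xi - eps * (1 if wi > 0 else -1 if wi < 0 else 0)
            for xi, wi in zip(x, w)]

# Small budget eps: the score of the "correct" class drops while
# each feature moves by at most 0.05.
x = [0.5, 0.2, 0.8]
w = [1.0, -2.0, 0.0]
x_adv = attack_linear(x, w, eps=0.05)
```

With a real DNN the per-feature gradient comes from back-propagation instead of being read off a weight vector, but the perturb-and-cap structure is the same.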
PlusEmo2Vec at SemEval-2018 Task 1: Exploiting emotion knowledge from emoji and #hashtags
Title | PlusEmo2Vec at SemEval-2018 Task 1: Exploiting emotion knowledge from emoji and #hashtags |
Authors | Ji Ho Park, Peng Xu, Pascale Fung |
Abstract | This paper describes our system that has been submitted to SemEval-2018 Task 1: Affect in Tweets (AIT) to solve five subtasks. We focus on modeling both sentence and word level representations of emotion inside texts through large distantly labeled corpora with emojis and hashtags. We transfer the emotional knowledge by exploiting neural network models as feature extractors and use these representations for traditional machine learning models such as support vector regression (SVR) and logistic regression to solve the competition tasks. Our system placed among the top three for all subtasks in which we participated. |
Tasks | |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08280v1 |
http://arxiv.org/pdf/1804.08280v1.pdf | |
PWC | https://paperswithcode.com/paper/plusemo2vec-at-semeval-2018-task-1-exploiting |
Repo | |
Framework | |
Aggregating Strategies for Long-term Forecasting
Title | Aggregating Strategies for Long-term Forecasting |
Authors | Alexander Korotin, Vladimir V’yugin, Evgeny Burnaev |
Abstract | The article investigates the application of aggregating algorithms to the problem of long-term forecasting. We examine the classic aggregating algorithms based on exponential reweighting. For Vovk’s general aggregating algorithm we provide a generalization for long-term forecasting. For the special basic case of Vovk’s algorithm we provide two modifications for long-term forecasting. The first one is theoretically close to an optimal algorithm and is based on replication of independent copies. It provides a time-independent regret bound with respect to the best expert in the pool. The second one is not optimal but is more practical and has an $O(\sqrt{T})$ regret bound, where $T$ is the length of the game. |
Tasks | |
Published | 2018-03-18 |
URL | http://arxiv.org/abs/1803.06727v1 |
http://arxiv.org/pdf/1803.06727v1.pdf | |
PWC | https://paperswithcode.com/paper/aggregating-strategies-for-long-term |
Repo | |
Framework | |
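The exponential reweighting the abstract builds on can be sketched with the exponentially weighted average forecaster — a weighted-average simplification, not Vovk's full aggregating algorithm with its substitution function:

```python
from math import exp

def exp_weights_predict(expert_preds, losses, eta=1.0):
    """Exponential reweighting: weight each expert by
    exp(-eta * cumulative loss) and predict the weighted average,
    so experts that have performed badly are discounted."""
    weights = [exp(-eta * l) for l in losses]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, expert_preds)) / total

# Equal past losses: predict the plain average of the two experts.
pred_equal = exp_weights_predict([0.0, 1.0], losses=[0.0, 0.0])
# Expert 2 has accumulated large loss: prediction moves toward expert 1.
pred_biased = exp_weights_predict([0.0, 1.0], losses=[0.0, 10.0])
```

In the long-term setting the same update is applied with predictions and losses aggregated over a forecast horizon rather than a single round; the replication-of-copies modification runs independent instances of this scheme, one per step of the horizon.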
A Robust Deep Learning Approach for Automatic Classification of Seizures Against Non-seizures
Title | A Robust Deep Learning Approach for Automatic Classification of Seizures Against Non-seizures |
Authors | X. Yao, X. Li, Q. Ye, Y. Huang, Q. Cheng, G. -Q. Zhang |
Abstract | Identifying epileptic seizures through analysis of the electroencephalography (EEG) signal has become a standard method for the diagnosis of epilepsy. Manual seizure identification on EEG by trained neurologists is time-consuming, labor-intensive and error-prone, so a reliable automatic seizure/non-seizure classification method is needed. One of the challenges in automatic seizure/non-seizure classification is that seizure morphologies exhibit considerable variability. In order to capture essential seizure patterns, this paper leverages an attention mechanism and a bidirectional long short-term memory (BiLSTM) network to exploit both spatial and temporal discriminating features and overcome seizure variability. The attention mechanism captures spatial features according to the contributions of different brain regions to seizures. The BiLSTM extracts discriminating temporal features in the forward and backward directions. Cross-validation experiments and cross-patient experiments over the noisy CHB-MIT data are performed to evaluate our proposed approach. The obtained average sensitivity of 87.00%, specificity of 88.60% and precision of 88.63% in cross-validation experiments are higher than those of the current state-of-the-art methods, and the standard deviations of our approach are lower. The results of the cross-patient experiments indicate that our approach performs better than the current state-of-the-art methods and is more robust across patients. |
Tasks | EEG, Seizure Detection |
Published | 2018-12-17 |
URL | https://arxiv.org/abs/1812.06562v2 |
https://arxiv.org/pdf/1812.06562v2.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-deep-learning-approach-for-automatic |
Repo | |
Framework | |
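The spatial attention over brain regions can be sketched as softmax-weighted pooling of per-channel features. The score function is learned in the paper; here the scores are given directly as an illustrative assumption.

```python
from math import exp

def attention_pool(channel_feats, scores):
    """Softmax-attention pooling over EEG channels: channels with
    higher scores (e.g. stronger contribution to the seizure)
    dominate the pooled feature vector."""
    m = max(scores)
    w = [exp(s - m) for s in scores]   # numerically stable softmax
    z = sum(w)
    alphas = [wi / z for wi in w]
    dim = len(channel_feats[0])
    return [sum(a * f[j] for a, f in zip(alphas, channel_feats))
            for j in range(dim)]

# Two channels with 2-D features; equal scores give a plain average.
pooled = attention_pool([[1.0, 0.0], [0.0, 1.0]], scores=[0.0, 0.0])
```

In the full model these attention weights feed the BiLSTM, which then extracts temporal features from the pooled sequence in both directions.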