January 29, 2020

3129 words 15 mins read

Paper Group ANR 532
Text Classification for Azerbaijani Language Using Machine Learning and Embedding

Title Text Classification for Azerbaijani Language Using Machine Learning and Embedding
Authors Umid Suleymanov, Behnam Kiani Kalejahi, Elkhan Amrahov, Rashid Badirkhanli
Abstract Text classification systems can help solve the text clustering problem for the Azerbaijani language. Although text-classification applications exist for other languages, we built a new system to address this problem for Azerbaijani. First, we identified potential areas of application. The system will be useful in many areas, most notably news feed categorization: news websites can automatically sort articles into classes such as sports, business, education, and science. The system can also be used for sentiment analysis of product reviews. For example, when a company shares a photo of a new product on Facebook and receives thousands of comments, the system classifies them into categories such as positive or negative. It can likewise be applied to recommender systems, spam filtering, and similar tasks. Various machine learning techniques, such as Naive Bayes, SVM, and Decision Trees, have been devised to solve the text classification problem for the Azerbaijani language.
Tasks Sentiment Analysis, Text Classification, Text Clustering
Published 2019-12-26
URL https://arxiv.org/abs/1912.13362v1
PDF https://arxiv.org/pdf/1912.13362v1.pdf
PWC https://paperswithcode.com/paper/text-classification-for-azerbaijani-language
Repo
Framework
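
As a rough illustration of the classical techniques the abstract lists, here is a minimal multinomial Naive Bayes text classifier in pure Python. The toy English corpus is a placeholder standing in for the paper's Azerbaijani data; none of it comes from the paper itself.

```python
from collections import Counter
import math

class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        self.vocab = set()
        for doc, label in zip(docs, labels):
            for word in doc.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        self.total = len(labels)

    def predict(self, doc):
        best, best_lp = None, -math.inf
        for c in self.classes:
            # log prior + sum of smoothed log likelihoods
            lp = math.log(self.class_counts[c] / self.total)
            n_c = sum(self.word_counts[c].values())
            for word in doc.lower().split():
                lp += math.log((self.word_counts[c][word] + 1) /
                               (n_c + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Toy corpus (placeholder English text standing in for Azerbaijani documents)
docs = ["the team won the match", "stocks rose sharply today",
        "the player scored a goal", "the market fell on earnings"]
labels = ["sports", "business", "sports", "business"]
clf = NaiveBayesTextClassifier()
clf.fit(docs, labels)
print(clf.predict("the team scored"))   # expected: sports
```

In practice one would add tokenization and TF-IDF weighting for a language like Azerbaijani, but the decision rule is the same.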

Mutual Information Scaling and Expressive Power of Sequence Models

Title Mutual Information Scaling and Expressive Power of Sequence Models
Authors Huitao Shen
Abstract Sequence models assign probabilities to variable-length sequences such as natural language texts. The ability of sequence models to capture temporal dependence can be characterized by the temporal scaling of correlation and mutual information. In this paper, we study the mutual information of recurrent neural networks (RNNs) including long short-term memories and self-attention networks such as Transformers. Through a combination of theoretical study of linear RNNs and empirical study of nonlinear RNNs, we find their mutual information decays exponentially in temporal distance. On the other hand, Transformers can capture long-range mutual information more efficiently, making them preferable in modeling sequences with slow power-law mutual information, such as natural languages and stock prices. We discuss the connection of these results with statistical mechanics. We also point out the non-uniformity problem in many natural language datasets. We hope this work provides a new perspective on understanding the expressive power of sequence models and sheds new light on improving their architectures.
Tasks
Published 2019-05-10
URL https://arxiv.org/abs/1905.04271v1
PDF https://arxiv.org/pdf/1905.04271v1.pdf
PWC https://paperswithcode.com/paper/mutual-information-scaling-and-expressive
Repo
Framework
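
The exponential mutual-information decay the paper attributes to recurrent models can be observed even in a simple Markov chain. The sketch below uses a plug-in estimator on a synthetic binary chain; the chain parameters and distances are illustrative choices, not the paper's setup.

```python
import random
from collections import Counter
from math import log

def sample_markov(n, p_stay=0.9, seed=0):
    """Binary Markov chain: stay in the current state with prob p_stay."""
    rng = random.Random(seed)
    s, seq = 0, []
    for _ in range(n):
        seq.append(s)
        if rng.random() > p_stay:
            s = 1 - s
    return seq

def mutual_information(seq, d):
    """Plug-in estimate of I(X_t; X_{t+d}) in nats."""
    pairs = Counter(zip(seq, seq[d:]))
    n = sum(pairs.values())
    px = Counter(x for x, _ in pairs.elements())
    py = Counter(y for _, y in pairs.elements())
    mi = 0.0
    for (x, y), c in pairs.items():
        pxy = c / n
        mi += pxy * log(pxy / (px[x] / n * py[y] / n))
    return mi

seq = sample_markov(200_000)
mis = [mutual_information(seq, d) for d in (1, 4, 16)]
print(mis)  # decays roughly exponentially with distance d
```

For this chain the correlation decays as (2·p_stay − 1)^d, so the estimated mutual information drops sharply between d = 1 and d = 16; natural language, by contrast, is argued in the paper to show slow power-law decay.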

Clustering of Deep Contextualized Representations for Summarization of Biomedical Texts

Title Clustering of Deep Contextualized Representations for Summarization of Biomedical Texts
Authors Milad Moradi, Matthias Samwald
Abstract In recent years, summarizers that incorporate domain knowledge into the process of text summarization have outperformed generic methods, especially for summarization of biomedical texts. However, construction and maintenance of domain knowledge bases are resource-intense tasks requiring significant manual annotation. In this paper, we demonstrate that contextualized representations extracted from the pre-trained deep language model BERT can be effectively used to measure the similarity between sentences and to quantify their informative content. The results show that our BERT-based summarizer can improve the performance of biomedical summarization. Although the summarizer does not use any sources of domain knowledge, it can capture the context of sentences more accurately than the comparison methods. The source code and data are available at https://github.com/BioTextSumm/BERT-based-Summ.
Tasks Language Modelling, Text Summarization
Published 2019-08-06
URL https://arxiv.org/abs/1908.02286v2
PDF https://arxiv.org/pdf/1908.02286v2.pdf
PWC https://paperswithcode.com/paper/clustering-of-deep-contextualized
Repo
Framework
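
A minimal sketch of the clustering-based extractive pipeline the abstract describes: embed sentences, cluster the embeddings, and keep one representative sentence per cluster. Bag-of-words vectors stand in here for the paper's BERT representations, and plain k-means for its clustering step.

```python
import math, random

def embed(sentence, vocab):
    """Bag-of-words vector; a crude stand-in for BERT sentence embeddings."""
    words = sentence.lower().split()
    return [words.count(w) for w in vocab]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centroids[i]))].append(p)
        centroids = [[sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

def summarize(sentences, k=2):
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vecs = [embed(s, vocab) for s in sentences]
    centroids = kmeans(vecs, k)
    # one representative sentence per cluster: the one closest to each centroid
    picks = {min(range(len(vecs)), key=lambda i: dist(vecs[i], c)) for c in centroids}
    return [sentences[i] for i in sorted(picks)]

sentences = [
    "The protein binds to the receptor.",
    "Binding of the protein alters receptor activity.",
    "Trials enrolled two hundred patients.",
    "Patients in the trials showed improvement.",
]
print(summarize(sentences, k=2))
```

Swapping `embed` for real BERT pooled outputs is the essential difference between this toy and the paper's method.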

Siamese Networks for Large-Scale Author Identification

Title Siamese Networks for Large-Scale Author Identification
Authors Chakaveh Saedi, Mark Dras
Abstract Authorship attribution is the process of identifying the author of a text. Classification-based approaches work well for small numbers of candidate authors, but only similarity-based methods are applicable for larger numbers of authors or for authors beyond the training set. While deep learning methods have been applied to classification-based approaches, their application to similarity-based methods has been limited, and most similarity-based methods only embody static notions of similarity. Siamese networks have been used to develop learned notions of similarity in one-shot image tasks, and also for tasks of mostly semantic relatedness in NLP. We examine their application to the stylistic task of authorship attribution on datasets with large numbers of authors, looking at multiple energy functions and neural network architectures, and show that they can substantially outperform both classification- and existing similarity-based approaches. We also find an unexpected relationship between choice of energy function and number of authors, in terms of performance.
Tasks Text Classification
Published 2019-12-23
URL https://arxiv.org/abs/1912.10616v2
PDF https://arxiv.org/pdf/1912.10616v2.pdf
PWC https://paperswithcode.com/paper/siamese-networks-for-large-scale-author
Repo
Framework
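
The core of a Siamese approach is an energy function over pairs of embeddings trained with a contrastive objective. A minimal sketch, assuming Euclidean energy and hand-made 3-d "style" vectors in place of a learned tower's output:

```python
import math

def energy(u, v):
    """Euclidean energy between two author-style embeddings (lower = more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, same_author, margin=1.0):
    """Pulls same-author pairs together, pushes different-author pairs
    at least `margin` apart (Hadsell-style contrastive loss)."""
    e = energy(u, v)
    if same_author:
        return 0.5 * e ** 2
    return 0.5 * max(0.0, margin - e) ** 2

# Toy 3-d "style" embeddings (placeholders for a Siamese tower's output)
doc_a = [0.9, 0.1, 0.2]
doc_b = [0.8, 0.2, 0.1]   # similar style
doc_c = [0.1, 0.9, 0.7]   # different style

print(contrastive_loss(doc_a, doc_b, same_author=True))
print(contrastive_loss(doc_a, doc_c, same_author=False))
```

The paper compares several such energy functions; Euclidean distance is just one of the candidates.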

Tackling Unit Commitment and Load Dispatch Problems Considering All Constraints with Evolutionary Computation

Title Tackling Unit Commitment and Load Dispatch Problems Considering All Constraints with Evolutionary Computation
Authors Danilo Vasconcellos Vargas, Junichi Murata, Hirotaka Takano
Abstract Unit commitment and load dispatch problems are important and complex problems in power system operations that have traditionally been solved separately. In this paper, both problems are solved together without approximations or simplifications. In fact, the problem solved has a massive number of grid-connected photovoltaic units, four pumped-storage hydro plants as energy storage units, and ten thermal power plants, each with its own set of operational requirements that must be satisfied. To face such a complex constrained optimization problem, an adaptive repair method is proposed. By including the repair method itself as a parameter to be optimized, the proposed adaptive repair method avoids any bias in repair choices. Moreover, this results in a repair method that adapts to the problem and improves together with the solution during optimization. Experiments reveal that the proposed method is capable of surpassing exact-method solutions on a simplified version of the problem with approximations, as well as solving the otherwise intractable complete problem without simplifications. Moreover, since the proposed approach can be applied to other problems in general and it may not be obvious how to choose the constraint handling for a given constraint, a guideline is provided explaining the reasoning behind these choices. Thus, this paper opens further possibilities for dealing with the ever-changing types of generation units and other similarly complex operation/schedule optimization problems with many difficult constraints.
Tasks
Published 2019-03-06
URL http://arxiv.org/abs/1903.09304v1
PDF http://arxiv.org/pdf/1903.09304v1.pdf
PWC https://paperswithcode.com/paper/tackling-unit-commitment-and-load-dispatch
Repo
Framework
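
The adaptive-repair idea, making the choice of repair method itself part of the genome, can be sketched on a toy dispatch problem: three generators must jointly meet a demand within per-unit limits. Everything below (costs, limits, the two repair heuristics, the mutate-best/replace-worst loop) is an illustrative assumption, not the paper's formulation.

```python
import random

DEMAND = 100.0
LIMITS = [(10, 60), (10, 60), (10, 60)]
COST = [1.0, 1.5, 2.0]  # toy linear fuel costs per unit of output

def repair_scale(x):
    """Repair by scaling all outputs toward demand, then clipping to limits."""
    s = sum(x) or 1.0
    return [min(max(v * DEMAND / s, lo), hi) for v, (lo, hi) in zip(x, LIMITS)]

def repair_cheapest(x):
    """Repair by pushing the residual onto the cheapest units first."""
    x = [min(max(v, lo), hi) for v, (lo, hi) in zip(x, LIMITS)]
    for i in sorted(range(len(x)), key=lambda j: COST[j]):
        lo, hi = LIMITS[i]
        x[i] = min(max(x[i] + DEMAND - sum(x), lo), hi)
    return x

REPAIRS = [repair_scale, repair_cheapest]

def fitness(ind):
    outputs = REPAIRS[ind["repair"]](ind["x"])
    cost = sum(c * v for c, v in zip(COST, outputs))
    return cost + 1000 * abs(sum(outputs) - DEMAND)  # penalize residual mismatch

rng = random.Random(0)
pop = [{"x": [rng.uniform(lo, hi) for lo, hi in LIMITS],
        "repair": rng.randrange(len(REPAIRS))} for _ in range(30)]
for _ in range(50):  # steady-state loop: mutate the best, replace the worst
    parent = min(pop, key=fitness)
    child = {"x": [v + rng.gauss(0, 2) for v in parent["x"]],
             "repair": rng.randrange(len(REPAIRS)) if rng.random() < 0.1
                       else parent["repair"]}
    pop[pop.index(max(pop, key=fitness))] = child
best = min(pop, key=fitness)
print(fitness(best), REPAIRS[best["repair"]].__name__)
```

Because the penalty punishes any leftover supply-demand mismatch, evolution tends to settle on whichever repair method restores feasibility exactly, which is the adaptive behaviour the paper exploits at much larger scale.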

Generalized Adaptation for Few-Shot Learning

Title Generalized Adaptation for Few-Shot Learning
Authors Liang Song, Jinlu Liu, Yongqiang Qin
Abstract Many few-shot learning research works have two stages: pre-training a base model and adapting it to a novel model. In this paper, we propose to use a closed-form base learner, which constrains the adapting stage with the pre-trained base model to obtain a better-generalized novel model. A subsequent theoretical analysis proves its rationality and indicates how to train a well-generalized base model. We then conduct experiments on four benchmarks and achieve state-of-the-art performance in all cases. Notably, we achieve an accuracy of 87.75% on 5-shot miniImageNet, which outperforms existing methods by approximately 10%.
Tasks Few-Shot Image Classification, Few-Shot Learning
Published 2019-11-25
URL https://arxiv.org/abs/1911.10807v2
PDF https://arxiv.org/pdf/1911.10807v2.pdf
PWC https://paperswithcode.com/paper/fast-and-generalized-adaptation-for-few-shot
Repo
Framework
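
A closed-form base learner is typically a ridge regression solved analytically on the support set. A minimal 2-way, 2-shot sketch with 2-d features (placeholders for a pre-trained backbone's output), assuming ridge regression onto one-hot labels; the 2x2 inverse is written out explicitly to keep the block dependency-free:

```python
def ridge_classifier(X, y_onehot, lam=0.1):
    """Closed-form ridge 'base learner': W = (X^T X + lam*I)^-1 X^T Y.
    Hard-coded for 2-d features so the 2x2 inverse stays explicit."""
    a = sum(x[0] * x[0] for x in X) + lam          # A = X^T X + lam*I
    b = sum(x[0] * x[1] for x in X)
    d = sum(x[1] * x[1] for x in X) + lam
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    n_cls = len(y_onehot[0])
    B = [[sum(X[i][r] * y_onehot[i][c] for i in range(len(X)))  # B = X^T Y
          for c in range(n_cls)] for r in range(2)]
    return [[sum(inv[r][k] * B[k][c] for k in range(2)) for c in range(n_cls)]
            for r in range(2)]                      # W = A^-1 B

def predict(W, x):
    scores = [sum(x[r] * W[r][c] for r in range(2)) for c in range(len(W[0]))]
    return scores.index(max(scores))

# Toy 2-way 2-shot episode; features are placeholders for backbone outputs
support = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
labels = [[1, 0], [1, 0], [0, 1], [0, 1]]
W = ridge_classifier(support, labels)
print(predict(W, [0.95, 0.15]))  # expected: class 0
```

The appeal of the closed form is that adaptation is a single linear solve per episode, so it can be differentiated through when training the base model.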

Software Sustainability: A Systematic Literature Review and Comprehensive Analysis

Title Software Sustainability: A Systematic Literature Review and Comprehensive Analysis
Authors Asif Imran, Tevfik Kosar
Abstract Software Engineering is a constantly evolving subject area that faces new challenges every day as it tries to automate newer business processes. One of the key challenges to the success of a software solution is attaining sustainability. The inability of numerous software products to remain sustainable for the desired length of time is caused by the limited consideration given to sustainability during the stages of software development. This review aims to present a detailed and inclusive study covering both the technical and non-technical challenges and approaches of software sustainability. A systematic and comprehensive literature review was conducted based on 107 relevant studies that were selected using the Evidence-Based Software Engineering (EBSE) technique. The study showed that sustainability can be achieved by conducting specific activities at the technical and non-technical levels. The technical level consists of software design, coding, and user experience attributes. The non-technical level consists of documentation, sustainability manifestos, training of software engineers, funding software projects, and leadership skills of project managers to achieve sustainability. This paper groups the existing research efforts based on the above aspects. Next, how those aspects affect open and closed source software is tabulated. Based on the findings of this review, both technical and non-technical sustainability aspects are equally important; taking one into consideration while ignoring the other will threaten the sustainability of software products.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.06109v1
PDF https://arxiv.org/pdf/1910.06109v1.pdf
PWC https://paperswithcode.com/paper/software-sustainability-a-systematic
Repo
Framework

Generative Models for Novelty Detection: Applications in abnormal event and situational change detection from data series

Title Generative Models for Novelty Detection: Applications in abnormal event and situational change detection from data series
Authors Mahdyar Ravanbakhsh
Abstract Novelty detection is the process of distinguishing observations that differ in some respect from the observations the model is trained on. Novelty detection is one of the fundamental requirements of a good classification or identification system, since the test data sometimes contains observations that were not known at training time. In other words, the novelty class is often not present during the training phase, or is not well defined. In light of the above, one-class classifiers and generative methods can efficiently model such problems. However, due to the unavailability of data from the novelty class, training an end-to-end model is a challenging task in itself. Therefore, detecting novel classes in unsupervised and semi-supervised settings is a crucial step in such tasks. In this thesis, we propose several methods to model the novelty detection problem in an unsupervised and semi-supervised fashion. The proposed frameworks are applied to different related applications of anomaly and outlier detection. The results show the superiority of our proposed methods compared to baselines and state-of-the-art methods.
Tasks Outlier Detection
Published 2019-04-09
URL http://arxiv.org/abs/1904.04741v1
PDF http://arxiv.org/pdf/1904.04741v1.pdf
PWC https://paperswithcode.com/paper/generative-models-for-novelty-detection
Repo
Framework
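
The one-class setting the abstract describes, training only on normal data and scoring deviations from it, can be illustrated with the simplest possible density model. A Gaussian fitted to 1-d "normal" samples stands in here for the thesis's generative models:

```python
import math, random

def fit_gaussian(samples):
    """Fit a 1-d Gaussian to 'normal' training data (no novel examples needed)."""
    mu = sum(samples) / len(samples)
    var = sum((x - mu) ** 2 for x in samples) / len(samples)
    return mu, math.sqrt(var)

def novelty_score(x, mu, sigma):
    """Distance from the learned normal model, in standard deviations."""
    return abs(x - mu) / sigma

rng = random.Random(0)
normal = [rng.gauss(5.0, 1.0) for _ in range(1000)]   # training: normal class only
mu, sigma = fit_gaussian(normal)

threshold = 3.0   # flag anything beyond 3 sigma as novel
print(novelty_score(5.2, mu, sigma) > threshold)   # typical observation
print(novelty_score(13.0, mu, sigma) > threshold)  # novel observation
```

Generative models such as GANs or autoencoders replace the Gaussian with a learned density or reconstruction error, but the score-then-threshold structure is the same.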

Learning sparse representations in reinforcement learning

Title Learning sparse representations in reinforcement learning
Authors Jacob Rafati, David C. Noelle
Abstract Reinforcement learning (RL) algorithms allow artificial agents to improve their selection of actions to increase rewarding experiences in their environments. Temporal Difference (TD) Learning – a model-free RL method – is a leading account of the midbrain dopamine system and the basal ganglia in reinforcement learning. These algorithms typically learn a mapping from the agent’s current sensed state to a selected action (known as a policy function) via learning a value function (expected future rewards). TD Learning methods have been very successful on a broad range of control tasks, but learning can become intractably slow as the state space of the environment grows. This has motivated methods that learn internal representations of the agent’s state, effectively reducing the size of the state space and restructuring state representations in order to support generalization. However, TD Learning coupled with an artificial neural network, as a function approximator, has been shown to fail to learn some fairly simple control tasks, challenging this explanation of reward-based learning. We hypothesize that such failures do not arise in the brain because of the ubiquitous presence of lateral inhibition in the cortex, producing sparse distributed internal representations that support the learning of expected future reward. The sparse conjunctive representations can avoid catastrophic interference while still supporting generalization. We provide support for this conjecture through computational simulations, demonstrating the benefits of learned sparse representations for three problematic classic control tasks: Puddle-world, Mountain-car, and Acrobot.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.01575v1
PDF https://arxiv.org/pdf/1909.01575v1.pdf
PWC https://paperswithcode.com/paper/learning-sparse-representations-in-1
Repo
Framework
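
A cheap stand-in for the lateral-inhibition mechanism the abstract hypothesizes is k-winners-take-all sparsification: keep the k strongest units and silence the rest, so similar states share active units while dissimilar states do not. The activations below are hand-made toys, not outputs of a trained network:

```python
def k_winners_take_all(activations, k):
    """Lateral-inhibition-style sparsification: keep the k strongest units,
    silence the rest."""
    threshold = sorted(activations, reverse=True)[k - 1]
    return [a if a >= threshold else 0.0 for a in activations]

def overlap(u, v):
    """Number of units active in both sparse codes."""
    return sum(1 for a, b in zip(u, v) if a > 0 and b > 0)

# Two similar states and one dissimilar state (toy hidden-layer activations)
state_a = [0.9, 0.8, 0.1, 0.2, 0.7, 0.1]
state_b = [0.8, 0.9, 0.2, 0.1, 0.6, 0.2]   # similar to state_a
state_c = [0.1, 0.2, 0.9, 0.8, 0.1, 0.7]   # dissimilar

sa, sb, sc = (k_winners_take_all(s, k=3) for s in (state_a, state_b, state_c))
print(overlap(sa, sb), overlap(sa, sc))  # similar states share more active units
```

This non-overlap between dissimilar states is what lets a value function update one region of the state space without catastrophically interfering with another.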

Towards Logical Specification of Statistical Machine Learning

Title Towards Logical Specification of Statistical Machine Learning
Authors Yusuke Kawamoto
Abstract We introduce a logical approach to formalizing statistical properties of machine learning. Specifically, we propose a formal model for statistical classification based on a Kripke model, and formalize various notions of classification performance, robustness, and fairness of classifiers by using epistemic logic. Then we show some relationships among properties of classifiers and those between classification performance and robustness, which suggests robustness-related properties that have not been formalized in the literature as far as we know. To formalize fairness properties, we define a notion of counterfactual knowledge and show techniques to formalize conditional indistinguishability by using counterfactual epistemic operators. As far as we know, this is the first work that uses logical formulas to express statistical properties of machine learning, and that provides epistemic (resp. counterfactually epistemic) views on robustness (resp. fairness) of classifiers.
Tasks
Published 2019-07-24
URL https://arxiv.org/abs/1907.10327v2
PDF https://arxiv.org/pdf/1907.10327v2.pdf
PWC https://paperswithcode.com/paper/towards-logical-specification-of-statistical
Repo
Framework
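
To give a toy flavor of the epistemic-logic view: in a Kripke model the agent cannot distinguish worlds that look the same to it, and the knowledge operator K quantifies over those accessible worlds. The sketch below, where worlds are (true label, predicted label) pairs and the agent only observes the prediction, is didactic and not the paper's formal model:

```python
def knows(access, w, prop):
    """Epistemic operator K: `prop` is known at world w iff it holds in
    every world the agent considers possible from w."""
    return all(prop(v) for v in access[w])

# Toy Kripke model: worlds are (true_label, predicted_label) pairs;
# the agent cannot distinguish worlds that share a predicted label.
worlds = [("cat", "cat"), ("dog", "cat"), ("dog", "dog")]
access = {w: [v for v in worlds if v[1] == w[1]] for w in worlds}

correct = lambda w: w[0] == w[1]
# At world ("dog", "dog") the classifier's correctness is known...
print(knows(access, ("dog", "dog"), correct))
# ...but at ("cat", "cat") it is not: ("dog", "cat") is indistinguishable.
print(knows(access, ("cat", "cat"), correct))
```

Classification-performance and robustness properties in the paper are formulas built from operators like this one, evaluated over richer, probabilistic Kripke structures.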

Predicting Confusion from Eye-Tracking Data with Recurrent Neural Networks

Title Predicting Confusion from Eye-Tracking Data with Recurrent Neural Networks
Authors Shane D. Sims, Vanessa Putnam, Cristina Conati
Abstract Encouraged by the success of deep learning in a variety of domains, we investigate the suitability and effectiveness of Recurrent Neural Networks (RNNs) in a domain where deep learning has not yet been used, namely detecting confusion from eye-tracking data. Through experiments with a dataset of user interactions with ValueChart (an interactive visualization tool), we found that RNNs learn a feature representation from the raw data that allows for a more powerful classifier than previous methods that use engineered features. This is evidenced by the stronger performance of the RNN (0.74/0.71 sensitivity/specificity), as compared to a Random Forest classifier (0.51/0.70 sensitivity/specificity), when both are trained on an un-augmented dataset. However, using engineered features allows for simple data augmentation methods to be used. These same methods are not as effective at augmentation for the feature representation learned from the raw data, likely due to an inability to match the temporal dynamics of the data.
Tasks Data Augmentation, Eye Tracking
Published 2019-06-19
URL https://arxiv.org/abs/1906.11211v1
PDF https://arxiv.org/pdf/1906.11211v1.pdf
PWC https://paperswithcode.com/paper/predicting-confusion-from-eye-tracking-data
Repo
Framework

Explorable Super Resolution

Title Explorable Super Resolution
Authors Yuval Bahat, Tomer Michaeli
Abstract Single image super resolution (SR) has seen major performance leaps in recent years. However, existing methods do not allow exploring the infinitely many plausible reconstructions that might have given rise to the observed low-resolution (LR) image. These different explanations to the LR image may dramatically vary in their textures and fine details, and may often encode completely different semantic information. In this paper, we introduce the task of explorable super resolution. We propose a framework comprising a graphical user interface with a neural network backend, allowing editing the SR output so as to explore the abundance of plausible HR explanations to the LR input. At the heart of our method is a novel module that can wrap any existing SR network, analytically guaranteeing that its SR outputs would precisely match the LR input, when downsampled. Besides its importance in our setting, this module is guaranteed to decrease the reconstruction error of any SR network it wraps, and can be used to cope with blur kernels that are different from the one the network was trained for. We illustrate our approach in a variety of use cases, ranging from medical imaging and forensics, to graphics.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-12-04
URL https://arxiv.org/abs/1912.01839v2
PDF https://arxiv.org/pdf/1912.01839v2.pdf
PWC https://paperswithcode.com/paper/explorable-super-resolution
Repo
Framework
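
The consistency module the abstract describes can be sketched in one dimension: add back the upsampled low-resolution residual so that the corrected output reproduces the LR input exactly when downsampled. Average-pool downsampling is an assumed degradation model here, not necessarily the paper's:

```python
def downsample(signal, factor=2):
    """Average-pool downsampling (the assumed degradation model)."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal), factor)]

def upsample(signal, factor=2):
    """Nearest-neighbour upsampling."""
    return [v for v in signal for _ in range(factor)]

def enforce_consistency(sr, lr, factor=2):
    """Wrap any SR output so it reproduces `lr` when downsampled,
    by adding back the upsampled low-resolution residual."""
    residual = [l - d for l, d in zip(lr, downsample(sr, factor))]
    return [s + r for s, r in zip(sr, upsample(residual, factor))]

lr = [1.0, 3.0]                        # observed low-resolution signal
sr = [0.5, 1.1, 2.2, 4.4]              # some network's (inconsistent) SR output
sr_fixed = enforce_consistency(sr, lr)
print(downsample(sr_fixed))            # matches lr, up to float rounding
```

Because each block's mean shifts by exactly its residual, the downsampled output matches the LR input by construction, which is the analytic guarantee that lets the module wrap any existing SR network.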

Eye Gaze Metrics and Analysis of AOI for Indexing Working Memory towards Predicting ADHD

Title Eye Gaze Metrics and Analysis of AOI for Indexing Working Memory towards Predicting ADHD
Authors Gavindya Jayawardena, Anne Michalek, Sampath Jayarathna
Abstract ADHD is being recognized as a diagnosis which persists into adulthood impacting economic, occupational, and educational outcomes. There is an increased need to accurately diagnose and recommend interventions for this population. One consideration is the development and implementation of reliable and valid outcome measures which reflect core diagnostic criteria. For example, adults with ADHD have reduced working memory capacity when compared to their peers (Michalek et al., 2014). A reduction in working memory capacity indicates attentional control deficits which align with many symptoms outlined on behavioral checklists used to diagnose ADHD. Using computational methods, such as eye tracking technology, to generate a relationship between ADHD and measures of working memory capacity would be useful to advancing our understanding and treatment of the diagnosis in adults. This chapter will outline a feasibility study in which eye tracking was used to measure eye gaze metrics during a working memory capacity task for adults with and without ADHD and machine learning algorithms were applied to generate a feature set unique to the ADHD diagnosis. The chapter will summarize the purpose, methods, results, and impact of this study.
Tasks Eye Tracking
Published 2019-06-17
URL https://arxiv.org/abs/1906.07183v1
PDF https://arxiv.org/pdf/1906.07183v1.pdf
PWC https://paperswithcode.com/paper/eye-gaze-metrics-and-analysis-of-aoi-for
Repo
Framework

An Interactive Machine Translation Framework for Modernizing Historical Documents

Title An Interactive Machine Translation Framework for Modernizing Historical Documents
Authors Miguel Domingo, Francisco Casacuberta
Abstract Due to the nature of human language, historical documents are hard to comprehend by contemporary people. This limits their accessibility to scholars specialized in the time period in which the documents were written. Modernization aims at breaking this language barrier by generating a new version of a historical document, written in the modern version of the document’s original language. However, while it is able to increase the document’s comprehension, modernization is still far from producing an error-free version. In this work, we propose a collaborative framework in which a scholar can work together with the machine to generate the new version. We tested our approach on a simulated environment, achieving significant reductions of the human effort needed to produce the modernized version of the document.
Tasks Machine Translation
Published 2019-10-08
URL https://arxiv.org/abs/1910.03355v1
PDF https://arxiv.org/pdf/1910.03355v1.pdf
PWC https://paperswithcode.com/paper/an-interactive-machine-translation-framework
Repo
Framework
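
The prefix-based interaction loop common in interactive MT can be sketched as follows: the machine proposes its best hypothesis, the scholar validates or corrects a prefix, and the machine completes with its best hypothesis consistent with that prefix. The n-best list and scores below are invented for illustration:

```python
def best_completion(hypotheses, validated_prefix):
    """Return the highest-scoring hypothesis consistent with the prefix the
    scholar has validated so far (a minimal prefix-based interaction)."""
    compatible = [(score, sent) for score, sent in hypotheses
                  if sent.startswith(validated_prefix)]
    return max(compatible)[1] if compatible else validated_prefix

# Hypothetical n-best list from a modernization model: (score, sentence)
hypotheses = [
    (0.6, "thou art very welcome here"),
    (0.3, "you are very welcome here"),
    (0.1, "you are most welcome here"),
]

print(best_completion(hypotheses, ""))             # machine's initial proposal
# The scholar corrects the beginning, and the machine completes the rest:
print(best_completion(hypotheses, "you are very"))
```

In a real interactive system the completion would be re-decoded by the model under the prefix constraint rather than looked up in a fixed n-best list; the point of the sketch is only the validate-prefix/complete cycle that reduces human effort.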

The History of Digital Spam

Title The History of Digital Spam
Authors Emilio Ferrara
Abstract Spam!: that's what Lorrie Faith Cranor and Brian LaMacchia exclaimed in the title of a popular call-to-action article that appeared twenty years ago in Communications of the ACM. And yet, despite the tremendous efforts of the research community over the last two decades to mitigate this problem, the sense of urgency remains unchanged, as emerging technologies have brought new dangerous forms of digital spam under the spotlight. Furthermore, when spam is carried out with the intent to deceive or influence at scale, it can alter the very fabric of society and our behavior. In this article, I will briefly review the history of digital spam: starting from its quintessential incarnation, spam emails, to modern-day forms of spam affecting the Web and social media, the survey will close by depicting future risks associated with spam and abuse of new technologies, including Artificial Intelligence (e.g., Digital Humans). After providing a taxonomy of spam and its most popular applications that emerged throughout the last two decades, I will review technological and regulatory approaches proposed in the literature, and suggest some possible solutions to tackle this ubiquitous digital epidemic moving forward.
Tasks
Published 2019-08-14
URL https://arxiv.org/abs/1908.06173v1
PDF https://arxiv.org/pdf/1908.06173v1.pdf
PWC https://paperswithcode.com/paper/the-history-of-digital-spam
Repo
Framework