Paper Group ANR 1473
Advanced Deep Convolutional Neural Network Approaches for Digital Pathology Image Analysis: a comprehensive evaluation with different use cases
Title | Advanced Deep Convolutional Neural Network Approaches for Digital Pathology Image Analysis: a comprehensive evaluation with different use cases |
Authors | Md Zahangir Alom, Theus Aspiras, Tarek M. Taha, Vijayan K. Asari, TJ Bowen, Dave Billiter, Simon Arkell |
Abstract | Deep Learning (DL) approaches have been providing state-of-the-art performance in different modalities in the field of medical imaging, including Digital Pathology Image Analysis (DPIA). Among the many DL approaches, the Deep Convolutional Neural Network (DCNN) technique provides superior performance for classification, segmentation, and detection tasks. Most DPIA problems can be addressed with classification, segmentation, or detection approaches; in addition, pre- and post-processing methods are sometimes applied for specific types of problems. Recently, several advanced DCNN models, including the Inception Residual Recurrent CNN (IRRCNN), Densely Connected Recurrent Convolutional Network (DCRCN), Recurrent Residual U-Net (R2U-Net), and an R2U-Net-based regression model (UD-Net), have been proposed and provide state-of-the-art performance for different computer vision and medical image analysis tasks. However, these advanced DCNN models have not been explored for solving problems related to DPIA. In this study, we apply these DCNN techniques to different DPIA problems and evaluate them on publicly available benchmark datasets for seven tasks in digital pathology: lymphoma classification, Invasive Ductal Carcinoma (IDC) detection, nuclei segmentation, epithelium segmentation, tubule segmentation, lymphocyte detection, and mitosis detection. The experimental results are evaluated with different performance metrics, including sensitivity, specificity, accuracy, F1-score, Receiver Operating Characteristic (ROC) curves, the Dice coefficient (DC), and Mean Squared Error (MSE). The results demonstrate superior performance for classification, segmentation, and detection tasks compared to existing machine learning and DCNN-based approaches. |
Tasks | Mitosis Detection |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09075v1 |
http://arxiv.org/pdf/1904.09075v1.pdf | |
PWC | https://paperswithcode.com/paper/advanced-deep-convolutional-neural-network |
Repo | |
Framework | |
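A minimal sketch of the evaluation metrics listed in this abstract (sensitivity, specificity, F1, Dice coefficient, MSE) computed on a pair of binary masks; the toy arrays and the implementation are illustrative and not taken from the paper's code.

```python
import numpy as np

def binary_metrics(pred, target, eps=1e-8):
    """Illustrative binary-mask metrics: sensitivity, specificity, F1/Dice, MSE."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.sum(pred & target)
    tn = np.sum(~pred & ~target)
    fp = np.sum(pred & ~target)
    fn = np.sum(~pred & target)
    sensitivity = tp / (tp + fn + eps)           # recall / true positive rate
    specificity = tn / (tn + fp + eps)
    precision = tp / (tp + fp + eps)
    f1 = 2 * precision * sensitivity / (precision + sensitivity + eps)
    dice = 2 * tp / (2 * tp + fp + fn + eps)     # equals F1 for binary masks
    mse = np.mean((pred.astype(float) - target.astype(float)) ** 2)
    return dict(sensitivity=sensitivity, specificity=specificity,
                f1=f1, dice=dice, mse=mse)

# Toy example on a 4x4 mask pair
pred = np.array([[1, 1, 0, 0]] * 4)
target = np.array([[1, 0, 0, 0]] * 4)
print(binary_metrics(pred, target))
```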
Multi-Agent Actor-Critic with Hierarchical Graph Attention Network
Title | Multi-Agent Actor-Critic with Hierarchical Graph Attention Network |
Authors | Heechang Ryu, Hayong Shin, Jinkyoo Park |
Abstract | Most previous studies on multi-agent reinforcement learning focus on deriving decentralized and cooperative policies to maximize a common reward and rarely consider the transferability of trained policies to new tasks. This prevents such policies from being applied to more complex multi-agent tasks. To resolve these limitations, we propose a model that conducts both representation learning for multiple agents using hierarchical graph attention network and policy learning using multi-agent actor-critic. The hierarchical graph attention network is specially designed to model the hierarchical relationships among multiple agents that either cooperate or compete with each other to derive more advanced strategic policies. Two attention networks, the inter-agent and inter-group attention layers, are used to effectively model individual and group level interactions, respectively. The two attention networks have been proven to facilitate the transfer of learned policies to new tasks with different agent compositions and allow one to interpret the learned strategies. Empirically, we demonstrate that the proposed model outperforms existing methods in several mixed cooperative and competitive tasks. |
Tasks | Multi-agent Reinforcement Learning, Representation Learning |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12557v2 |
https://arxiv.org/pdf/1909.12557v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-actor-critic-with-hierarchical |
Repo | |
Framework | |
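The following sketch illustrates the two-level attention idea described in this abstract: inter-agent attention inside each group, then inter-group attention over the group summaries. The agent embeddings, group assignment, and plain scaled dot-product attention are assumptions for illustration, not the paper's architecture or parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scores = keys @ query / np.sqrt(query.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights

rng = np.random.default_rng(0)
d = 8
# Hypothetical setup: 6 agents split into 2 groups of 3, each with an embedding.
agent_emb = rng.normal(size=(6, d))
groups = [[0, 1, 2], [3, 4, 5]]
ego = agent_emb[0]                       # the agent whose policy we condition on

# Level 1: inter-agent attention inside each group, from the ego agent's view.
group_summaries = []
for members in groups:
    keys = values = agent_emb[members]
    summary, _ = attention(ego, keys, values)
    group_summaries.append(summary)
group_summaries = np.stack(group_summaries)

# Level 2: inter-group attention over the group-level summaries.
state_emb, group_weights = attention(ego, group_summaries, group_summaries)
print(group_weights, state_emb.shape)
```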
Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction
Title | Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction |
Authors | Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, Andrei Popescu-Belis |
Abstract | We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English-French and English-German language pairs, in both directions. Eleven teams participated in the shared task; nine for the English-French subtask, five for French-English, nine for English-German, and six for German-English. Most of the submissions outperformed two strong language-model based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs. |
Tasks | Language Modelling |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12091v1 |
https://arxiv.org/pdf/1911.12091v1.pdf | |
PWC | https://paperswithcode.com/paper/findings-of-the-2016-wmt-shared-task-on-cross-1 |
Repo | |
Framework | |
An Auto-ML Framework Based on GBDT for Lifelong Learning
Title | An Auto-ML Framework Based on GBDT for Lifelong Learning |
Authors | Jinlong Chai, Jiangeng Chang, Yakun Zhao, Honggang Liu |
Abstract | Automatic Machine Learning (Auto-ML) has attracted more and more attention in recent years. Our work addresses the problem of data drift, in which the distribution of the data gradually changes during the acquisition process, degrading the performance of the Auto-ML model. We construct our model based on GBDT, and use both incremental learning and full learning to handle the drift problem. Experiments show that our method performs well on the five datasets, indicating that it can effectively handle data drift and has robust performance. |
Tasks | |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11033v1 |
https://arxiv.org/pdf/1908.11033v1.pdf | |
PWC | https://paperswithcode.com/paper/an-auto-ml-framework-based-on-gbdt-for |
Repo | |
Framework | |
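A hedged sketch of the incremental-plus-full learning idea under drift, using scikit-learn's warm-start gradient boosting rather than the paper's GBDT framework; the synthetic drifting batches and tree counts are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_batch(n, shift):
    """Synthetic binary data whose feature distribution drifts via `shift`."""
    X = rng.normal(loc=shift, size=(n, 5))
    y = (X[:, 0] + X[:, 1] > 2 * shift).astype(int)
    return X, y

# Initial "full learning" on the first batch.
X0, y0 = make_batch(500, shift=0.0)
model = GradientBoostingClassifier(n_estimators=50, warm_start=True, random_state=0)
model.fit(X0, y0)

# As drifted batches arrive, incrementally add trees fitted on the new data
# (a stand-in for the paper's incremental-plus-full retraining strategy).
for step, shift in enumerate([0.5, 1.0, 1.5], start=1):
    Xt, yt = make_batch(500, shift=shift)
    before = accuracy_score(yt, model.predict(Xt))
    model.set_params(n_estimators=model.n_estimators + 25)
    model.fit(Xt, yt)                     # warm start: keeps earlier trees
    after = accuracy_score(yt, model.predict(Xt))
    print(f"batch {step}: accuracy {before:.2f} -> {after:.2f}")
```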
Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection
Title | Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection |
Authors | Maria Corkery, Yevgen Matusevych, Sharon Goldwater |
Abstract | The cognitive mechanisms needed to account for the English past tense have long been a subject of debate in linguistics and cognitive science. Neural network models were proposed early on, but were shown to have clear flaws. Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstrate humanlike performance in a nonce-word task. Here, we look more closely at the behaviour of their model in this task. We find that (1) the model exhibits instability across multiple simulations in terms of its correlation with human data, and (2) even when results are aggregated across simulations (treating each simulation as an individual human participant), the fit to the human data is not strong—worse than an older rule-based model. These findings hold up through several alternative training regimes and evaluation measures. Although other neural architectures might do better, we conclude that there is still insufficient evidence to claim that neural nets are a good cognitive model for this task. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01280v1 |
https://arxiv.org/pdf/1906.01280v1.pdf | |
PWC | https://paperswithcode.com/paper/are-we-there-yet-encoder-decoder-neural |
Repo | |
Framework | |
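A small sketch of the kind of analysis this abstract describes: correlating each simulation with human data (to expose instability across seeds) and then correlating the aggregate. The ratings, number of simulations, and noise level are all invented for illustration.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical data: human "past-tense acceptability" ratings for 20 nonce verbs,
# and model production scores from 10 independent simulations (random seeds).
human_ratings = rng.uniform(size=20)
simulations = human_ratings + rng.normal(scale=0.5, size=(10, 20))

# Per-simulation fit: correlations can vary widely across seeds (instability).
per_run = [spearmanr(sim, human_ratings)[0] for sim in simulations]
print("per-simulation rho:", np.round(per_run, 2))

# Aggregated fit: average over simulations as if each were a participant.
rho_agg, _ = spearmanr(simulations.mean(axis=0), human_ratings)
print("aggregate rho:", round(rho_agg, 2))
```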
DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning
Title | DeepOtsu: Document Enhancement and Binarization using Iterative Deep Learning |
Authors | Sheng He, Lambert Schomaker |
Abstract | This paper presents a novel iterative deep learning framework and applies it to document enhancement and binarization. Unlike traditional methods, which predict the binary label of each pixel of the input image, we train the neural network to learn the degradations in document images and produce uniform versions of the degraded inputs, which allows the network to refine its output iteratively. Two different iterative methods are studied in this paper: recurrent refinement (RR), which uses the same trained neural network in each iteration for document enhancement, and stacked refinement (SR), which uses a stack of different neural networks for iterative output refinement. Given the learned uniform and enhanced image, the binarization map can easily be obtained with a global or local threshold. Experimental results on several public benchmark datasets show that our proposed methods provide a new, clean version of the degraded image suitable for visualization, and promising binarization results using the global Otsu threshold on the enhanced images learned iteratively by the neural network. |
Tasks | |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06081v1 |
http://arxiv.org/pdf/1901.06081v1.pdf | |
PWC | https://paperswithcode.com/paper/deepotsu-document-enhancement-and |
Repo | |
Framework | |
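A rough sketch of the recurrent-refinement (RR) loop followed by a global Otsu threshold. The `enhance` function is a simple box-filter stand-in for the trained enhancement network, and the degraded page is synthetic; only the overall refine-then-threshold structure mirrors the paper.

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Global Otsu threshold on a grayscale image in [0, 1]."""
    hist, edges = np.histogram(img, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                                   # class-0 probability
    w1 = 1.0 - w0
    mu_cum = np.cumsum(hist * centers)
    mu0 = mu_cum / np.clip(w0, 1e-12, None)
    mu1 = (mu_cum[-1] - mu_cum) / np.clip(w1, 1e-12, None)
    between = w0 * w1 * (mu0 - mu1) ** 2                   # between-class variance
    return centers[np.argmax(between)]

def enhance(img):
    """Placeholder for the learned enhancement network: a 3x3 box filter that
    pulls pixel values toward a smoother, more uniform version of the page."""
    pad = np.pad(img, 1, mode="edge")
    neigh = np.stack([pad[i:i + img.shape[0], j:j + img.shape[1]]
                      for i in range(3) for j in range(3)])
    return 0.5 * img + 0.5 * neigh.mean(axis=0)

rng = np.random.default_rng(0)
degraded = np.clip(0.2 + 0.6 * (rng.random((64, 64)) > 0.7)
                   + rng.normal(scale=0.1, size=(64, 64)), 0, 1)

# Recurrent refinement (RR): reuse the same "network" for several iterations.
x = degraded
for _ in range(3):
    x = enhance(x)

binary = (x > otsu_threshold(x)).astype(np.uint8)
print(binary.shape, binary.mean())
```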
Fast Scenario Reduction for Power Systems by Deep Learning
Title | Fast Scenario Reduction for Power Systems by Deep Learning |
Authors | Qiao Li, David Wenzhong Gao |
Abstract | Scenario reduction is an important topic in stochastic programming problems. Due to the random behavior of load and renewable energy, stochastic programming has become a useful technique for optimizing power systems, so scenario reduction has received more attention in recent years. Many scenario reduction methods have been proposed to shrink the scenario set quickly. However, scenario reduction is still slow: it takes at least several seconds to several minutes to finish, which prevents stochastic programming from being used in real-time optimal control problems. In this paper, a fast scenario reduction method based on deep learning is proposed to solve this problem. Inspired by deep-learning-based image processing, recognition, and generation methods, the scenario data are transformed into 2D image-like data and fed into a deep convolutional neural network (DCNN). The output of the DCNN is an "image" of the reduced scenario set. Since images can be processed very quickly by neural networks, scenario reduction with a neural network can also be very fast. Simulation results show that scenario reduction with the proposed DCNN method can be completed at very high speed. |
Tasks | |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11486v1 |
https://arxiv.org/pdf/1908.11486v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-scenario-reduction-for-power-systems-by |
Repo | |
Framework | |
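A minimal sketch of treating a scenario set as a 2D image and mapping it to a smaller "image" with a small convolutional network. The scenario counts, profile length, and layer sizes are assumptions; the paper's actual DCNN architecture and training procedure are not reproduced here.

```python
import torch
import torch.nn as nn

# Illustrative shapes (assumptions, not the paper's): 100 load/renewable scenarios,
# each a 24-step profile, reduced to 10 representative scenarios.
N_IN, N_OUT, T = 100, 10, 24

class ScenarioReducer(nn.Module):
    """Tiny DCNN that maps a scenario-set 'image' to a reduced-set 'image'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((N_OUT, T)),      # collapse the scenario axis
            nn.Conv2d(16, 1, kernel_size=1),
        )

    def forward(self, x):                          # x: (batch, 1, N_IN, T)
        return self.net(x)                         # (batch, 1, N_OUT, T)

model = ScenarioReducer()
scenarios = torch.rand(1, 1, N_IN, T)              # rows = scenarios, cols = time
reduced = model(scenarios)
print(reduced.shape)                               # torch.Size([1, 1, 10, 24])
```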
Generative Adversarial Classifier for Handwriting Characters Super-Resolution
Title | Generative Adversarial Classifier for Handwriting Characters Super-Resolution |
Authors | Zhuang Qian, Kaizhu Huang, Qiufeng Wang, Jimin Xiao, Rui Zhang |
Abstract | Generative Adversarial Networks (GANs) have recently received great attention due to their excellent performance in image generation, transformation, and super-resolution. However, GANs have rarely been studied and trained for classification, so the generated images may not be appropriate for classification. In this paper, we propose a novel Generative Adversarial Classifier (GAC) for low-resolution handwritten character recognition. Specifically, by additionally involving a classifier in the training process of a normal GAN, GAC is calibrated to learn suitable structures and restored character images that benefit classification. Experimental results show that our proposed method achieves remarkable performance in 8x super-resolution of handwritten characters, approximately 10% and 20% higher than the current state-of-the-art methods on the benchmark datasets CASIA-HWDB1.1 and MNIST, respectively. |
Tasks | Image Generation, Super-Resolution |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06199v1 |
http://arxiv.org/pdf/1901.06199v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-classifier-for |
Repo | |
Framework | |
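A compact sketch of the loss composition suggested by the abstract: a generator trained with an adversarial term, a reconstruction term, and an additional classification term so that restored characters stay classifiable. The tiny fully-connected networks, flattened image sizes, and unit loss weights are placeholders, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-ins for the three networks (illustrative sizes, not the paper's).
LR_DIM, HR_DIM, N_CLASSES = 64, 256, 10
G = nn.Sequential(nn.Linear(LR_DIM, 128), nn.ReLU(), nn.Linear(128, HR_DIM))  # super-resolver
D = nn.Sequential(nn.Linear(HR_DIM, 64), nn.ReLU(), nn.Linear(64, 1))         # real/fake critic
C = nn.Sequential(nn.Linear(HR_DIM, 64), nn.ReLU(), nn.Linear(64, N_CLASSES)) # character classifier

lr_img = torch.rand(8, LR_DIM)          # low-resolution characters (flattened)
hr_img = torch.rand(8, HR_DIM)          # matching high-resolution characters
labels = torch.randint(0, N_CLASSES, (8,))

sr_img = G(lr_img)

# Generator objective: fool the discriminator, reconstruct the HR image, and
# keep the restored character classifiable -- the extra term that distinguishes
# a "generative adversarial classifier" from a plain SR GAN.
adv_loss = F.binary_cross_entropy_with_logits(D(sr_img), torch.ones(8, 1))
rec_loss = F.mse_loss(sr_img, hr_img)
cls_loss = F.cross_entropy(C(sr_img), labels)
g_loss = adv_loss + rec_loss + cls_loss   # weighting coefficients omitted
print(float(g_loss))
```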
Reservoir Topology in Deep Echo State Networks
Title | Reservoir Topology in Deep Echo State Networks |
Authors | Claudio Gallicchio, Alessio Micheli |
Abstract | Deep Echo State Networks (DeepESNs) recently extended the applicability of Reservoir Computing (RC) methods towards the field of deep learning. In this paper we study the impact of constrained reservoir topologies in the architectural design of deep reservoirs, through numerical experiments on several RC benchmarks. The major outcome of our investigation is to show the remarkable effect, in terms of predictive performance gain, achieved by the synergy between a deep reservoir construction and a structured organization of the recurrent units in each layer. Our results also indicate that a particularly advantageous architectural setting is obtained with DeepESNs whose reservoir units are structured according to a permutation recurrent matrix. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.11022v1 |
https://arxiv.org/pdf/1909.11022v1.pdf | |
PWC | https://paperswithcode.com/paper/reservoir-topology-in-deep-echo-state |
Repo | |
Framework | |
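A small sketch of a deep reservoir whose recurrent matrices are (signed) permutation matrices, as in the permutation-structured setting highlighted above; layer sizes, leak rate, and input scaling are illustrative choices, and the readout training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
N_UNITS, N_LAYERS, T, SPECTRAL_RADIUS, LEAK = 50, 3, 100, 0.9, 0.5

def permutation_reservoir(n, radius):
    """Recurrent matrix structured as a scaled (signed) permutation of the units."""
    W = np.zeros((n, n))
    W[np.arange(n), rng.permutation(n)] = rng.choice([-1.0, 1.0], size=n)
    return radius * W       # signed permutation matrices are orthogonal: spectral radius 1

# One input matrix per layer; layer l>0 is driven by the state of layer l-1.
W_res = [permutation_reservoir(N_UNITS, SPECTRAL_RADIUS) for _ in range(N_LAYERS)]
W_in = [rng.uniform(-0.1, 0.1, size=(N_UNITS, 1 if l == 0 else N_UNITS))
        for l in range(N_LAYERS)]

u = np.sin(np.linspace(0, 8 * np.pi, T))           # toy 1-D input signal
states = np.zeros((N_LAYERS, N_UNITS))
history = []
for t in range(T):
    drive = np.array([u[t]])
    for l in range(N_LAYERS):
        pre = W_in[l] @ drive + W_res[l] @ states[l]
        states[l] = (1 - LEAK) * states[l] + LEAK * np.tanh(pre)  # leaky-integrator units
        drive = states[l]                          # feed this layer's state upward
    history.append(np.concatenate(states))
history = np.array(history)                        # (T, N_LAYERS * N_UNITS), for a linear readout
print(history.shape)
```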
Danish Stance Classification and Rumour Resolution
Title | Danish Stance Classification and Rumour Resolution |
Authors | Anders Edelbo Lillie, Emil Refsgaard Middelboe |
Abstract | The Internet is rife with flourishing rumours that spread through microblogs and social media. Recent work has shown that analysing the stance of the crowd towards a rumour is a good indicator of its veracity. One state-of-the-art system uses an LSTM neural network to automatically classify stance for posts on Twitter by considering the context of a whole branch, while another, simpler Decision Tree classifier performs at least as well through careful feature engineering. One approach to predicting the veracity of a rumour is to use stance as the only feature for a Hidden Markov Model (HMM). This thesis generates a stance-annotated Reddit dataset for the Danish language and implements various models for stance classification. Of these, a linear Support Vector Machine provides the best results, with an accuracy of 0.76 and a macro F1 score of 0.42. Furthermore, experiments show that stance labels can be used across languages and platforms with an HMM to predict the veracity of rumours, achieving an accuracy of 0.82 and an F1 score of 0.67. Even higher scores are achieved by relying only on the Danish dataset: in this case, veracity prediction reaches an accuracy of 0.83 and an F1 of 0.68. Finally, when automatic stance labels are used for the HMM, only a small drop in performance is observed, showing that the implemented system can have practical applications. |
Tasks | Feature Engineering, Rumour Detection |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01304v1 |
https://arxiv.org/pdf/1907.01304v1.pdf | |
PWC | https://paperswithcode.com/paper/danish-stance-classification-and-rumour |
Repo | |
Framework | |
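To illustrate the "stance sequence as the only feature for veracity" idea, here is a simplified sketch that fits one first-order Markov chain per veracity class over stance labels and classifies a new thread by likelihood; the thesis uses an HMM with hidden states, and the tiny training sequences below are invented.

```python
import numpy as np

STANCES = ["support", "deny", "query", "comment"]
IDX = {s: i for i, s in enumerate(STANCES)}

def fit_markov(sequences, n_states=4, alpha=1.0):
    """First-order Markov chain (initial + transition probs) with add-alpha smoothing."""
    init = np.full(n_states, alpha)
    trans = np.full((n_states, n_states), alpha)
    for seq in sequences:
        init[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a, b] += 1
    return np.log(init / init.sum()), np.log(trans / trans.sum(axis=1, keepdims=True))

def log_likelihood(seq, model):
    init, trans = model
    return init[seq[0]] + sum(trans[a, b] for a, b in zip(seq, seq[1:]))

# Hypothetical training data: stance-label sequences for true and false rumours.
true_rumours = [[0, 3, 3, 0], [0, 0, 3, 2]]            # mostly support/comment
false_rumours = [[1, 1, 3, 2], [3, 1, 1, 1]]           # mostly deny
models = {"true": fit_markov(true_rumours), "false": fit_markov(false_rumours)}

new_thread = [IDX["deny"], IDX["deny"], IDX["comment"]]
prediction = max(models, key=lambda k: log_likelihood(new_thread, models[k]))
print(prediction)   # expected: "false"
```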
Log-linear models independence structure comparison
Title | Log-linear models independence structure comparison |
Authors | Jan Strappa, Facundo Bromberg |
Abstract | Log-linear models are a family of probability distributions which capture a variety of relationships between variables, including context-specific independencies. There are a number of approaches for automatic learning of their independence structures from data, although to date, no efficient method exists for evaluating these approaches directly in terms of the structures of the models. The only known methods evaluate these approaches indirectly through the complete model produced, that includes not only the structure but also the model parameters, introducing potential distortions in the comparison. This work presents such a method, that is, a measure for the direct comparison of the independence structures of log-linear models, inspired by the Hamming distance comparison method used in undirected graphical models. The measure presented can be efficiently computed in terms of the number of variables of the domain, and is proven to be a distance metric. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.08892v1 |
https://arxiv.org/pdf/1907.08892v1.pdf | |
PWC | https://paperswithcode.com/paper/log-linear-models-independence-structure |
Repo | |
Framework | |
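A hedged illustration of a Hamming-style comparison of independence structures: each structure is represented as a set of interaction features over variables, and the distance is the size of their symmetric difference. This is not the paper's exact measure, only the flavour of metric it generalizes from undirected graphical models.

```python
from itertools import combinations

def structure_distance(features_a, features_b):
    """Symmetric-difference count between two structures, each given as a set of
    interaction features (frozensets of variable indices). Hamming-style, illustrative."""
    a, b = set(map(frozenset, features_a)), set(map(frozenset, features_b))
    return len(a ^ b)

# Hypothetical example over variables {0, 1, 2, 3}:
# structure A has interactions (0,1), (1,2); structure B has (0,1), (2,3).
A = [{0, 1}, {1, 2}]
B = [{0, 1}, {2, 3}]
print(structure_distance(A, B))   # 2: one feature missing from each side

# For undirected graphical models, the same idea reduces to the classic
# Hamming distance over all candidate edges of the adjacency matrices:
def edge_hamming(adj_a, adj_b, n):
    return sum(adj_a[i][j] != adj_b[i][j] for i, j in combinations(range(n), 2))

adj_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
adj_b = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
print(edge_hamming(adj_a, adj_b, 3))   # 2 differing edges
```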
DISCERN: Diversity-based Selection of Centroids for k-Estimation and Rapid Non-stochastic Clustering
Title | DISCERN: Diversity-based Selection of Centroids for k-Estimation and Rapid Non-stochastic Clustering |
Authors | Ali Hassani, Amir Iranmanesh, Mahdi Eftekhari, Abbas Salemi |
Abstract | Clustering algorithms are an important subset of unsupervised learning methods. These algorithms require parameters such as the number of clusters or the neighborhood size and radius, which are usually unknown and may even be hard to estimate. Moreover, some of the most efficient clustering algorithms, such as K-Means, are stochastic and at times not robust. To address these issues, we propose DISCERN, which can serve as an initialization algorithm for K-Means, finding suitable centroids that increase the performance of K-Means. The algorithm is also designed to estimate the number of clusters, which is its only parameter, and does not require stochastic initialization. We ran experiments on multiple types of datasets, and the results show its superiority in terms of results and robustness when compared to other methods. Its advantage in estimating the number of clusters, along with the lower computational complexity of this estimation, is also discussed. |
Tasks | |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.05933v3 |
https://arxiv.org/pdf/1910.05933v3.pdf | |
PWC | https://paperswithcode.com/paper/discern-diversity-based-selection-of |
Repo | |
Framework | |
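An illustrative K-Means initialization in the spirit described above: pick diverse, far-apart points as starting centroids and hand them to K-Means. The farthest-point rule and the fixed k below are simplifying assumptions; DISCERN's actual similarity-based criterion and its estimation of k are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans

def diversity_init(X, k):
    """Farthest-point-style centroid selection: start from the point closest to the
    data mean, then repeatedly add the point with the largest minimum distance to
    the centroids chosen so far. A stand-in for DISCERN's similarity-based rule."""
    centroids = [int(np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1)))]
    for _ in range(1, k):
        d = np.min(np.linalg.norm(X[:, None, :] - X[centroids], axis=2), axis=1)
        centroids.append(int(np.argmax(d)))
    return X[centroids]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2))
               for c in ([0, 0], [4, 0], [0, 4])])

k = 3   # DISCERN also estimates k itself; here it is fixed for brevity
km = KMeans(n_clusters=k, init=diversity_init(X, k), n_init=1).fit(X)
print(np.round(km.cluster_centers_, 2))
```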
Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasks
Title | Distill-2MD-MTL: Data Distillation based on Multi-Dataset Multi-Domain Multi-Task Frame Work to Solve Face Related Tasks |
Authors | Sepidehsadat Hosseini, Mohammad Amin Shabani, Nam Ik Cho |
Abstract | We propose a new semi-supervised learning method for face-related tasks based on Multi-Task Learning (MTL) and data distillation. The proposed method exploits multiple datasets with different labels for different-but-related tasks, such as simultaneous age, gender, race, and facial expression estimation. Specifically, when there are only a few well-labeled data for a specific task among the multiple related ones, we exploit the labels of other related tasks in different domains. Our approach is composed of (1) a new MTL method which can deal with weakly labeled datasets and perform several tasks simultaneously, and (2) an MTL-based data distillation framework which enables network generalization for training and test data from different domains. Experiments show that the proposed multi-task system performs each task better than the baseline single-task systems. It is also demonstrated that using datasets from different domains along with the main dataset can enhance network generalization and overcome the domain differences between datasets. Also, comparing data distillation on both the baseline and the MTL framework, the latter shows more accurate predictions on unlabeled data from different domains. Furthermore, by proposing a new learning-rate optimization method, our proposed network is able to dynamically tune its learning rate. |
Tasks | Multi-Task Learning, Semi-Supervised Learning |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03402v2 |
https://arxiv.org/pdf/1907.03402v2.pdf | |
PWC | https://paperswithcode.com/paper/data-distillation-face-related-tasks-multi |
Repo | |
Framework | |
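A minimal sketch of the multi-task part of the description: a shared backbone with per-task heads and per-task losses masked where labels are missing, which is how datasets with different label sets can be combined. Layer sizes, the task list, and the missing-label convention are assumptions; the data distillation stage is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceMTL(nn.Module):
    """Shared backbone with per-task heads; illustrative sizes, not the paper's."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.heads = nn.ModuleDict({
            "age": nn.Linear(feat_dim, 1),          # regression head
            "gender": nn.Linear(feat_dim, 2),
            "expression": nn.Linear(feat_dim, 7),
        })

    def forward(self, x):
        z = self.backbone(x)
        return {task: head(z) for task, head in self.heads.items()}

model = FaceMTL()
out = model(torch.rand(4, 3, 64, 64))

# Weakly-labeled batches: compute each task's loss only where its label exists,
# so datasets with different label sets can be trained jointly.
gender_labels = torch.tensor([0, 1, -1, 1])          # -1 marks a missing label
mask = gender_labels >= 0
loss = F.cross_entropy(out["gender"][mask], gender_labels[mask])
print({k: v.shape for k, v in out.items()}, float(loss))
```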
Sparse and redundant signal representations for x-ray computed tomography
Title | Sparse and redundant signal representations for x-ray computed tomography |
Authors | Davood Karimi |
Abstract | Image models are central to all image processing tasks. The great advancements in digital image processing would not have been made possible without powerful models which, themselves, have evolved over time. In the past decade, patch-based models have emerged as one of the most effective models for natural images. Patch-based methods have outperformed other competing methods in many image processing tasks. These developments have come at a time when greater availability of powerful computational resources and growing concerns over the health risks of the ionizing radiation encourage research on image processing algorithms for computed tomography (CT). The goal of this paper is to explain the principles of patch-based methods and to review some of their recent applications in CT. We review the central concepts in patch-based image processing and explain some of the state-of-the-art algorithms, with a focus on aspects that are more relevant to CT. Then, we review some of the recent application of patch-based methods in CT. |
Tasks | Computed Tomography (CT) |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03379v1 |
https://arxiv.org/pdf/1912.03379v1.pdf | |
PWC | https://paperswithcode.com/paper/sparse-and-redundant-signal-representations |
Repo | |
Framework | |
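A toy sketch of the patch-based processing pattern the review is about: extract overlapping patches, process each patch, and average them back into an image. The shrink-toward-the-mean "denoiser" is a deliberately crude stand-in for sparse coding over a learned dictionary.

```python
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

rng = np.random.default_rng(0)
clean = np.zeros((64, 64)); clean[20:44, 20:44] = 1.0          # toy "CT slice"
noisy = clean + rng.normal(scale=0.3, size=clean.shape)

# Extract overlapping 8x8 patches, process each one (here: shrink deviations
# toward the patch mean), then average the overlapping patches back together.
patches = extract_patches_2d(noisy, patch_size=(8, 8))
means = patches.mean(axis=(1, 2), keepdims=True)
processed = means + 0.5 * (patches - means)
denoised = reconstruct_from_patches_2d(processed, noisy.shape)

print("MAE noisy:   ", round(np.abs(noisy - clean).mean(), 3))
print("MAE denoised:", round(np.abs(denoised - clean).mean(), 3))
```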
Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces
Title | Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces |
Authors | David Alvarez-Melis, Youssef Mroueh, Tommi S. Jaakkola |
Abstract | This paper focuses on the problem of unsupervised alignment of hierarchical data such as ontologies or lexical databases. This is a problem that appears across areas, from natural language processing to bioinformatics, and is typically solved by appeal to outside knowledge bases and label-textual similarity. In contrast, we approach the problem from a purely geometric perspective: given only a vector-space representation of the items in the two hierarchies, we seek to infer correspondences across them. Our work derives from and interweaves hyperbolic-space representations for hierarchical data, on one hand, and unsupervised word-alignment methods, on the other. We first provide a set of negative results showing how and why Euclidean methods fail in this hyperbolic setting. We then propose a novel approach based on optimal transport over hyperbolic spaces, and show that it outperforms standard embedding alignment techniques in various experiments on cross-lingual WordNet alignment and ontology matching tasks. |
Tasks | Word Alignment |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02536v1 |
https://arxiv.org/pdf/1911.02536v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-hierarchy-matching-with-optimal |
Repo | |
Framework | |
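A rough sketch of the core computation suggested above: pairwise Poincare (hyperbolic) distances between two embedded hierarchies used as the cost for entropic optimal transport (Sinkhorn), with correspondences read off the transport plan. The random 2-D embeddings and the plain Sinkhorn solver are simplifying assumptions that omit the paper's additional machinery.

```python
import numpy as np

def poincare_dist(u, v, eps=1e-9):
    """Geodesic distance between points of the Poincare ball (norms < 1)."""
    uu, vv = np.sum(u * u, -1), np.sum(v * v, -1)
    duv = np.sum((u - v) ** 2, -1)
    return np.arccosh(1 + 2 * duv / ((1 - uu) * (1 - vv) + eps))

def sinkhorn(cost, reg=0.1, n_iter=200):
    """Entropic-regularized OT between two uniform distributions (Sinkhorn iterations)."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / reg)
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]              # transport plan

rng = np.random.default_rng(0)
# Hypothetical hyperbolic embeddings of two hierarchies (10 and 12 nodes, 2-D ball).
X = rng.normal(size=(10, 2)); X *= 0.7 * rng.random((10, 1)) / np.linalg.norm(X, axis=1, keepdims=True)
Y = rng.normal(size=(12, 2)); Y *= 0.7 * rng.random((12, 1)) / np.linalg.norm(Y, axis=1, keepdims=True)

cost = poincare_dist(X[:, None, :], Y[None, :, :])  # pairwise hyperbolic distances
plan = sinkhorn(cost)
matches = plan.argmax(axis=1)                       # greedy correspondence per source node
print(cost.shape, matches)
```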