Paper Group AWR 179
Cell Detection with Star-convex Polygons
Title | Cell Detection with Star-convex Polygons |
Authors | Uwe Schmidt, Martin Weigert, Coleman Broaddus, Gene Myers |
Abstract | Automatic detection and segmentation of cells and nuclei in microscopy images is important for many biological applications. Recent successful learning-based approaches include per-pixel cell segmentation with subsequent pixel grouping, or localization of bounding boxes with subsequent shape refinement. In situations of crowded cells, these can be prone to segmentation errors, such as falsely merging bordering cells or suppressing valid cell instances due to the poor approximation with bounding boxes. To overcome these issues, we propose to localize cell nuclei via star-convex polygons, which are a much better shape representation as compared to bounding boxes and thus do not need shape refinement. To that end, we train a convolutional neural network that predicts for every pixel a polygon for the cell instance at that position. We demonstrate the merits of our approach on two synthetic datasets and one challenging dataset of diverse fluorescence microscopy images. |
Tasks | Cell Segmentation |
Published | 2018-06-09 |
URL | http://arxiv.org/abs/1806.03535v2 |
http://arxiv.org/pdf/1806.03535v2.pdf | |
PWC | https://paperswithcode.com/paper/cell-detection-with-star-convex-polygons |
Repo | https://github.com/mpicbg-csbd/stardist |
Framework | tf |
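The abstract above describes predicting a star-convex polygon for each pixel. One common way to parameterize such a polygon is by distances from a center point along a fixed set of rays. The sketch below illustrates that idea only; the function names, the equal angular spacing, and the ray count are assumptions for illustration, not the authors' implementation.

```python
import math


def star_convex_polygon(center, radii):
    """Return polygon vertices from radial distances around `center`.

    radii[k] is the distance from the center to the boundary along the
    ray at angle 2*pi*k/len(radii). Equal angular spacing is an
    assumption made for this sketch.
    """
    cx, cy = center
    n = len(radii)
    vertices = []
    for k, r in enumerate(radii):
        angle = 2.0 * math.pi * k / n
        vertices.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return vertices


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Four unit-length rays around the origin give a diamond-shaped polygon.
diamond = star_convex_polygon((0.0, 0.0), [1.0, 1.0, 1.0, 1.0])
```

With more rays the polygon can follow a roundish nucleus outline much more closely than an axis-aligned bounding box, which is the motivation the abstract gives.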
Adaptive Scaling for Sparse Detection in Information Extraction
Title | Adaptive Scaling for Sparse Detection in Information Extraction |
Authors | Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun |
Abstract | This paper focuses on detection tasks in information extraction, where positive instances are sparsely distributed and models are usually evaluated using F-measure on positive classes. These characteristics often result in deficient performance of neural network based detection models. In this paper, we propose adaptive scaling, an algorithm which can handle the positive sparsity problem and directly optimize over F-measure via dynamic cost-sensitive learning. To this end, we borrow the idea of marginal utility from economics and propose a theoretical framework for instance importance measuring without introducing any additional hyper-parameters. Experiments show that our algorithm leads to a more effective and stable training of neural network based detection models. |
Tasks | |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00250v2 |
http://arxiv.org/pdf/1805.00250v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-scaling-for-sparse-detection-in |
Repo | https://github.com/zjjhuihui/bert-adaptive-scaling |
Framework | tf |
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
Title | Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies |
Authors | Max Grusky, Mor Naaman, Yoav Artzi |
Abstract | We present NEWSROOM, a summarization dataset of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications. Extracted from search and social media metadata between 1998 and 2017, these high-quality summaries demonstrate high diversity of summarization styles. In particular, the summaries combine abstractive and extractive strategies, borrowing words and phrases from articles at varying rates. We analyze the extraction strategies used in NEWSROOM summaries against other datasets to quantify the diversity and difficulty of our new data, and train existing methods on the data to evaluate its utility and challenges. The dataset is available online at summari.es. |
Tasks | |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11283v1 |
http://arxiv.org/pdf/1804.11283v1.pdf | |
PWC | https://paperswithcode.com/paper/newsroom-a-dataset-of-13-million-summaries |
Repo | https://github.com/SumUpAnalytics/goldsum |
Framework | none |
Regularization by architecture: A deep prior approach for inverse problems
Title | Regularization by architecture: A deep prior approach for inverse problems |
Authors | Sören Dittmer, Tobias Kluth, Peter Maass, Daniel Otero Baguer |
Abstract | The present paper studies so-called deep image prior (DIP) techniques in the context of ill-posed inverse problems. DIP networks have recently been introduced for applications in image processing, and first experimental results for applying DIP to inverse problems have also been reported. This paper aims to discuss different interpretations of DIP and to obtain analytic results for specific network designs and linear operators. The main contribution is to introduce the idea of viewing these approaches as the optimization of Tikhonov functionals rather than optimizing networks. Besides theoretical results, we present numerical verifications. |
Tasks | |
Published | 2018-12-10 |
URL | https://arxiv.org/abs/1812.03889v2 |
https://arxiv.org/pdf/1812.03889v2.pdf | |
PWC | https://paperswithcode.com/paper/regularization-by-architecture-a-deep-prior |
Repo | https://github.com/otero-baguer/analytic-deep-prior |
Framework | tf |
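For orientation, the classical Tikhonov functional for a linear inverse problem has the form below. The notation (operator $A$, noisy data $y^\delta$, unknown $x$, regularization parameter $\alpha > 0$) is assumed here and does not come from the abstract; the functionals studied in the paper may differ.

```latex
\min_{x} \; \frac{1}{2} \left\| A x - y^{\delta} \right\|^{2} + \alpha \left\| x \right\|^{2}
```

The first term measures data fidelity and the second penalizes large solutions, which stabilizes the reconstruction when $A$ is ill-conditioned.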
Multimodal Sensor Fusion In Single Thermal image Super-Resolution
Title | Multimodal Sensor Fusion In Single Thermal image Super-Resolution |
Authors | Feras Almasri, Olivier Debeir |
Abstract | With the fast growth in the visual surveillance and security sectors, thermal infrared images have become increasingly necessary in a large variety of industrial applications. This is true even though IR sensors are still more expensive than their RGB counterparts having the same resolution. In this paper, we propose a deep learning solution to enhance the thermal image resolution. The following results are given: (I) Introduction of a multimodal, visual-thermal fusion model that addresses thermal image super-resolution, via integrating high-frequency information from the visual image. (II) Investigation of different network architecture schemes in the literature, their up-sampling methods, learning procedures, and their optimization functions by showing their beneficial contribution to the super-resolution problem. (III) A benchmark ULB17-VT dataset that contains thermal images and their visual image counterparts is presented. (IV) Presentation of a qualitative evaluation of a large test set with 58 samples and 22 raters which shows that our proposed model outperforms state-of-the-art methods. |
Tasks | Image Super-Resolution, Sensor Fusion, Super-Resolution |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1812.09276v1 |
http://arxiv.org/pdf/1812.09276v1.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-sensor-fusion-in-single-thermal |
Repo | https://github.com/fsalmasri/MSF-STI-SR |
Framework | none |
CHALET: Cornell House Agent Learning Environment
Title | CHALET: Cornell House Agent Learning Environment |
Authors | Claudia Yan, Dipendra Misra, Andrew Bennett, Aaron Walsman, Yonatan Bisk, Yoav Artzi |
Abstract | We present CHALET, a 3D house simulator with support for navigation and manipulation. CHALET includes 58 rooms and 10 house configurations, and allows users to easily create new house and room layouts. CHALET supports a range of common household activities, including moving objects, toggling appliances, and placing objects inside closeable containers. The environment and actions available are designed to create a challenging domain to train and evaluate autonomous agents, including for tasks that combine language, vision, and planning in a dynamic environment. |
Tasks | |
Published | 2018-01-23 |
URL | https://arxiv.org/abs/1801.07357v2 |
https://arxiv.org/pdf/1801.07357v2.pdf | |
PWC | https://paperswithcode.com/paper/chalet-cornell-house-agent-learning |
Repo | https://github.com/lil-lab/chalet |
Framework | none |
Comparatives, Quantifiers, Proportions: A Multi-Task Model for the Learning of Quantities from Vision
Title | Comparatives, Quantifiers, Proportions: A Multi-Task Model for the Learning of Quantities from Vision |
Authors | Sandro Pezzelle, Ionut-Teodor Sorodoc, Raffaella Bernardi |
Abstract | The present work investigates whether different quantification mechanisms (set comparison, vague quantification, and proportional estimation) can be jointly learned from visual scenes by a multi-task computational model. The motivation is that, in humans, these processes underlie the same cognitive, non-symbolic ability, which allows an automatic estimation and comparison of set magnitudes. We show that when information about lower-complexity tasks is available, the higher-level proportional task becomes more accurate than when performed in isolation. Moreover, the multi-task model is able to generalize to unseen combinations of target/non-target objects. Consistently with behavioral evidence showing the interference of absolute number in the proportional task, the multi-task model no longer works when asked to provide the number of target objects in the scene. |
Tasks | |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.05018v1 |
http://arxiv.org/pdf/1804.05018v1.pdf | |
PWC | https://paperswithcode.com/paper/comparatives-quantifiers-proportions-a-multi |
Repo | https://github.com/sandropezzelle/multitask-quant |
Framework | tf |
Churn Intent Detection in Multilingual Chatbot Conversations and Social Media
Title | Churn Intent Detection in Multilingual Chatbot Conversations and Social Media |
Authors | Christian Abbet, Meryem M’hamdi, Athanasios Giannakopoulos, Robert West, Andreea Hossmann, Michael Baeriswyl, Claudiu Musat |
Abstract | We propose a new method to detect when users express the intent to leave a service, also known as churn. While previous work focuses solely on social media, we show that this intent can be detected in chatbot conversations. As companies increasingly rely on chatbots they need an overview of potentially churny users. To this end, we crowdsource and publish a dataset of churn intent expressions in chatbot interactions in German and English. We show that classifiers trained on social media data can detect the same intent in the context of chatbots. We introduce a classification architecture that outperforms existing work on churn intent detection in social media. Moreover, we show that, using bilingual word embeddings, a system trained on combined English and German data outperforms monolingual approaches. As the only existing dataset is in English, we crowdsource and publish a novel dataset of German tweets. We thus underline the universal aspect of the problem, as examples of churn intent in English help us identify churn in German tweets and chatbot conversations. |
Tasks | Chatbot, Intent Detection, Word Embeddings |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.08432v1 |
http://arxiv.org/pdf/1808.08432v1.pdf | |
PWC | https://paperswithcode.com/paper/churn-intent-detection-in-multilingual |
Repo | https://github.com/swisscom/churn-intent-DE |
Framework | none |
Deep Generative Model for Joint Alignment and Word Representation
Title | Deep Generative Model for Joint Alignment and Word Representation |
Authors | Miguel Rios, Wilker Aziz, Khalil Sima’an |
Abstract | This work exploits translation data as a source of semantically relevant learning signal for models of word representation. In particular, we exploit equivalence through translation as a form of distributed context and jointly learn how to embed and align with a deep generative model. Our EmbedAlign model embeds words in their complete observed context and learns by marginalisation of latent lexical alignments. Besides, it embeds words as posterior probability densities, rather than point estimates, which allows us to compare words in context using a measure of overlap between distributions (e.g. KL divergence). We investigate our model’s performance on a range of lexical semantics tasks achieving competitive results on several standard benchmarks including natural language inference, paraphrasing, and text similarity. |
Tasks | Natural Language Inference |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05883v3 |
http://arxiv.org/pdf/1802.05883v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-generative-model-for-joint-alignment-and |
Repo | https://github.com/Asa-Nisi-Masa/Embed-Align-NLP |
Framework | none |
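The abstract above mentions comparing words represented as probability densities using a measure such as KL divergence. As a minimal sketch, assuming each word is embedded as a one-dimensional Gaussian (a simplification chosen here for illustration; the function name and parameters are hypothetical and not taken from the EmbedAlign code), the closed-form KL divergence looks like this:

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for univariate Gaussians p = N(mu_p, sigma_p^2)
    and q = N(mu_q, sigma_q^2), using the standard closed form."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


# Identical densities have zero divergence; shifting the mean increases it.
same = kl_gaussian(0.0, 1.0, 0.0, 1.0)
shifted = kl_gaussian(0.0, 1.0, 1.0, 1.0)
```

Note that KL divergence is asymmetric, so `kl_gaussian(a, b, c, d)` generally differs from `kl_gaussian(c, d, a, b)`; that asymmetry is one reason a density-based representation can express more than a distance between point estimates.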
Content Selection in Deep Learning Models of Summarization
Title | Content Selection in Deep Learning Models of Summarization |
Authors | Chris Kedzie, Kathleen McKeown, Hal Daume III |
Abstract | We carry out experiments with deep learning models of summarization across the domains of news, personal stories, meetings, and medical articles in order to understand how content selection is performed. We find that many sophisticated features of state of the art extractive summarizers do not improve performance over simpler models. These results suggest that it is easier to create a summarizer for a new domain than previous work suggests and bring into question the benefit of deep learning models for summarization for those domains that do have massive datasets (i.e., news). At the same time, they suggest important questions for new research in summarization; namely, new forms of sentence representations or external knowledge sources are needed that are better suited to the summarization task. |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12343v2 |
http://arxiv.org/pdf/1810.12343v2.pdf | |
PWC | https://paperswithcode.com/paper/content-selection-in-deep-learning-models-of |
Repo | https://github.com/haoshuai999/News-summarization |
Framework | none |
Temporal Graph Offset Reconstruction: Towards Temporally Robust Graph Representation Learning
Title | Temporal Graph Offset Reconstruction: Towards Temporally Robust Graph Representation Learning |
Authors | Stephen Bonner, John Brennan, Ibad Kureshi, Georgios Theodoropoulos, Andrew Stephen McGough, Boguslaw Obara |
Abstract | Graphs are a commonly used construct for representing relationships between elements in complex high dimensional datasets. Many real-world phenomena are dynamic in nature, meaning that any graph used to represent them is inherently temporal. However, many of the machine learning models designed to capture knowledge about the structure of these graphs ignore this rich temporal information when creating representations of the graph. This results in models which do not perform well when used to make predictions about the future state of the graph – especially when the delta between time stamps is not small. In this work, we explore a novel training procedure and an associated unsupervised model which creates graph representations optimised to predict the future state of the graph. We make use of graph convolutional neural networks to encode the graph into a latent representation, which we then use to train our temporal offset reconstruction method, inspired by auto-encoders, to predict a later time point – multiple time steps into the future. Using our method, we demonstrate superior performance for the task of future link prediction compared with non-temporal state-of-the-art baselines. We show our approach to be capable of outperforming non-temporal baselines by 38% on a real-world dataset. |
Tasks | Graph Representation Learning, Link Prediction, Representation Learning |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08366v1 |
http://arxiv.org/pdf/1811.08366v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-graph-offset-reconstruction-towards |
Repo | https://github.com/sbonner0/temporal-offset-reconstruction |
Framework | pytorch |
The Variational Homoencoder: Learning to learn high capacity generative models from few examples
Title | The Variational Homoencoder: Learning to learn high capacity generative models from few examples |
Authors | Luke B. Hewitt, Maxwell I. Nye, Andreea Gane, Tommi Jaakkola, Joshua B. Tenenbaum |
Abstract | Hierarchical Bayesian methods can unify many related tasks (e.g. k-shot classification, conditional and unconditional generation) as inference within a single generative model. However, when this generative model is expressed as a powerful neural network such as a PixelCNN, we show that existing learning techniques typically fail to effectively use latent variables. To address this, we develop a modification of the Variational Autoencoder in which encoded observations are decoded to new elements from the same class. This technique, which we call a Variational Homoencoder (VHE), produces a hierarchical latent variable model which better utilises latent variables. We use the VHE framework to learn a hierarchical PixelCNN on the Omniglot dataset, which outperforms all existing models on test set likelihood and achieves strong performance on one-shot generation and classification tasks. We additionally validate the VHE on natural images from the YouTube Faces database. Finally, we develop extensions of the model that apply to richer dataset structures such as factorial and hierarchical categories. |
Tasks | Omniglot |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.08919v1 |
http://arxiv.org/pdf/1807.08919v1.pdf | |
PWC | https://paperswithcode.com/paper/the-variational-homoencoder-learning-to-learn |
Repo | https://github.com/insperatum/vhe |
Framework | pytorch |
Auto-Classification of Retinal Diseases in the Limit of Sparse Data Using a Two-Streams Machine Learning Model
Title | Auto-Classification of Retinal Diseases in the Limit of Sparse Data Using a Two-Streams Machine Learning Model |
Authors | C. -H. Huck Yang, Fangyu Liu, Jia-Hong Huang, Meng Tian, Hiromasa Morikawa, I-Hung Lin, Yi-Chieh Liu, Hao-Hsiang Yang, Jesper Tegner |
Abstract | Automatic clinical diagnosis of retinal diseases has emerged as a promising approach to facilitate discovery in areas with limited access to specialists. Based on the fact that fundus structure and vascular disorders are the main characteristics of retinal diseases, we propose a novel visual-assisted diagnosis hybrid model mixing the support vector machine (SVM) and deep neural networks (DNNs). Furthermore, we present a new clinical retina dataset, called EyeNet2, for ophthalmology incorporating 52 retinal disease classes. Using EyeNet2, our model achieves 90.43% diagnosis accuracy, and the model performance is comparable to that of professional ophthalmologists. |
Tasks | |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05754v4 |
http://arxiv.org/pdf/1808.05754v4.pdf | |
PWC | https://paperswithcode.com/paper/auto-classification-of-retinal-diseases-in |
Repo | https://github.com/huckiyang/EyeNet2 |
Framework | none |
RuleMatrix: Visualizing and Understanding Classifiers with Rules
Title | RuleMatrix: Visualizing and Understanding Classifiers with Rules |
Authors | Yao Ming, Huamin Qu, Enrico Bertini |
Abstract | With the growing adoption of machine learning techniques, there is a surge of research interest towards making machine learning systems more transparent and interpretable. Various visualizations have been developed to help model developers understand, diagnose, and refine machine learning models. However, a large number of potential but neglected users are domain experts who have little knowledge of machine learning but are expected to work with machine learning systems. In this paper, we present an interactive visualization technique to help users with little expertise in machine learning to understand, explore and validate predictive models. By viewing the model as a black box, we extract a standardized rule-based knowledge representation from its input-output behavior. We design RuleMatrix, a matrix-based visualization of rules to help users navigate and verify the rules and the black-box model. We evaluate the effectiveness of RuleMatrix via two use cases and a usability study. |
Tasks | |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06228v1 |
http://arxiv.org/pdf/1807.06228v1.pdf | |
PWC | https://paperswithcode.com/paper/rulematrix-visualizing-and-understanding |
Repo | https://github.com/rulematrix/rule-matrix-py |
Framework | none |
BAGAN: Data Augmentation with Balancing GAN
Title | BAGAN: Data Augmentation with Balancing GAN |
Authors | Giovanni Mariani, Florian Scheidegger, Roxana Istrate, Costas Bekas, Cristiano Malossi |
Abstract | Image classification datasets are often imbalanced, a characteristic that negatively affects the accuracy of deep-learning classifiers. In this work we propose balancing GAN (BAGAN) as an augmentation tool to restore balance in imbalanced datasets. This is challenging because the few minority-class images may not be enough to train a GAN. We overcome this issue by including during the adversarial training all available images of majority and minority classes. The generative model learns useful features from majority classes and uses these to generate images for minority classes. We apply class conditioning in the latent space to drive the generation process towards a target class. The generator in the GAN is initialized with the encoder module of an autoencoder that enables us to learn an accurate class-conditioning in the latent space. We compare the proposed methodology with state-of-the-art GANs and demonstrate that BAGAN generates images of superior quality when trained with an imbalanced dataset. |
Tasks | Data Augmentation, Image Classification |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09655v2 |
http://arxiv.org/pdf/1803.09655v2.pdf | |
PWC | https://paperswithcode.com/paper/bagan-data-augmentation-with-balancing-gan |
Repo | https://github.com/IBM/BAGAN |
Framework | tf |
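The abstract above frames augmentation as restoring balance in an imbalanced dataset. As a small illustration of what that target means in practice, the sketch below counts how many synthetic images each class would need to match the largest class. The helper name and the "match the largest class" target are assumptions for illustration; the sketch says nothing about how BAGAN itself generates the images.

```python
from collections import Counter


def images_needed_per_class(labels):
    """Return, per class, how many extra samples would bring it up to
    the size of the largest class (a hypothetical balancing target)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Toy label list: five majority-class samples, two minority-class samples.
labels = ["cat"] * 5 + ["dog"] * 2
deficit = images_needed_per_class(labels)
```

A class-conditional generator would then be asked for `deficit[cls]` images of each minority class `cls`.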