February 1, 2020

Paper Group AWR 357

Learn to Segment Retinal Lesions and Beyond

Title Learn to Segment Retinal Lesions and Beyond
Authors Qijie Wei, Xirong Li, Weihong Yu, Xiao Zhang, Yongpeng Zhang, Bojie Hu, Bin Mo, Di Gong, Ning Chen, Dayong Ding, Youxin Chen
Abstract Towards automated retinal screening, this paper makes an endeavor to simultaneously achieve pixel-level retinal lesion segmentation and image-level disease classification. Such a multi-task approach is crucial for accurate and clinically interpretable disease diagnosis. Prior art is insufficient due to three challenges, that is, lesions lacking objective boundaries, clinical importance of lesions irrelevant to their size, and the lack of one-to-one correspondence between lesion and disease classes. This paper attacks the three challenges in the context of diabetic retinopathy (DR) grading. We propose L-Net, a new variant of fully convolutional networks, with its expansive path re-designed to tackle the first challenge. A dual loss that leverages both semantic segmentation and image classification losses is devised to resolve the second challenge. We propose Side-Attention Net (SiAN) as our multi-task framework. Harnessing L-Net as a side-attention branch, SiAN simultaneously improves DR grading and interprets the decision with lesion maps. A set of 12K fundus images is manually segmented by 45 ophthalmologists for 8 DR-related lesions, resulting in 290K manual segments in total. Extensive experiments on this large-scale dataset show that our proposed approach surpasses the prior art for multiple tasks including lesion segmentation, lesion classification and DR grading.
Tasks Image Classification, Lesion Segmentation, Semantic Segmentation
Published 2019-12-25
URL https://arxiv.org/abs/1912.11619v1
PDF https://arxiv.org/pdf/1912.11619v1.pdf
PWC https://paperswithcode.com/paper/learn-to-segment-retinal-lesions-and-beyond
Repo https://github.com/WeiQijie/retinal-lesions
Framework none

WoodScape: A multi-task, multi-camera fisheye dataset for autonomous driving

Title WoodScape: A multi-task, multi-camera fisheye dataset for autonomous driving
Authors Senthil Yogamani, Ciaran Hughes, Jonathan Horgan, Ganesh Sistu, Padraig Varley, Derek O’Dea, Michal Uricar, Stefan Milz, Martin Simon, Karl Amende, Christian Witt, Hazem Rashed, Sumanth Chennupati, Sanjaya Nayak, Saquib Mansoor, Xavier Perroton, Patrick Perez
Abstract Fisheye cameras are commonly employed for obtaining a large field of view in surveillance, augmented reality and in particular automotive applications. In spite of their prevalence, there are few public datasets for detailed evaluation of computer vision algorithms on fisheye images. We release the first extensive fisheye automotive dataset, WoodScape, named after Robert Wood who invented the fisheye camera in 1906. WoodScape comprises four surround-view cameras and nine tasks including segmentation, depth estimation, 3D bounding box detection and soiling detection. Semantic annotation of 40 classes at the instance level is provided for over 10,000 images, and annotations for other tasks are provided for over 100,000 images. With WoodScape, we would like to encourage the community to adapt computer vision models for fisheye cameras instead of using naive rectification.
Tasks Autonomous Driving, Depth Estimation
Published 2019-05-04
URL https://arxiv.org/abs/1905.01489v2
PDF https://arxiv.org/pdf/1905.01489v2.pdf
PWC https://paperswithcode.com/paper/woodscape-a-multi-task-multi-camera-fisheye
Repo https://github.com/valeoai/WoodScape
Framework none

Faster Boosting with Smaller Memory

Title Faster Boosting with Smaller Memory
Authors Julaiti Alafate, Yoav Freund
Abstract State-of-the-art implementations of boosting, such as XGBoost and LightGBM, can process large training sets extremely fast. However, this performance requires that the memory be large enough to hold two to three times the training set size. This paper presents an alternative approach to implementing boosted trees, which achieves a significant speedup over XGBoost and LightGBM, especially when the memory size is small. This is achieved using a combination of three techniques: early stopping, effective sample size, and stratified sampling. Our experiments demonstrate a 10- to 100-fold speedup over XGBoost when the training data is too large to fit in memory.
Tasks
Published 2019-01-25
URL https://arxiv.org/abs/1901.09047v3
PDF https://arxiv.org/pdf/1901.09047v3.pdf
PWC https://paperswithcode.com/paper/faster-boosting-with-smaller-memory
Repo https://github.com/GUEEN/Sparrow
Framework none
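
The "effective sample size" named in the abstract above can be illustrated with the standard (Kish) definition for a weighted sample. This is a minimal sketch under the assumption that the paper uses this common definition; the function name `effective_sample_size` is hypothetical and is not taken from the Sparrow repository.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2).

    Assumption: this is the textbook definition, used here only to
    illustrate the idea; the paper's exact formula may differ.
    """
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return (total * total) / total_sq


# Uniform weights: every example counts, so n_eff equals the sample size.
uniform = effective_sample_size([1.0, 1.0, 1.0, 1.0])

# One dominant weight: the sample behaves like a single example.
skewed = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

In boosting, example weights become skewed as training proceeds, so a falling effective sample size is a natural signal that the in-memory sample should be refreshed.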

Data-Free Adversarial Distillation

Title Data-Free Adversarial Distillation
Authors Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song
Abstract Knowledge Distillation (KD) has made remarkable progress in the last few years and become a popular paradigm for model compression and knowledge transfer. However, almost all existing KD algorithms are data-driven, i.e., relying on a large amount of original training data or alternative data, which is usually unavailable in real-world scenarios. In this paper, we devote ourselves to this challenging problem and propose a novel adversarial distillation mechanism to craft a compact student model without any real-world data. We introduce a model discrepancy to quantitatively measure the difference between student and teacher models and construct an optimizable upper bound. In our work, the student and the teacher jointly play the role of the discriminator to reduce this discrepancy, while a generator adversarially produces some “hard samples” to enlarge it. Extensive experiments demonstrate that the proposed data-free method yields comparable performance to existing data-driven methods. More strikingly, our approach can be directly extended to semantic segmentation, which is more complicated than classification, and our approach achieves state-of-the-art results. Code and pretrained models are available at https://github.com/VainF/Data-Free-Adversarial-Distillation.
Tasks Model Compression, Semantic Segmentation, Transfer Learning
Published 2019-12-23
URL https://arxiv.org/abs/1912.11006v3
PDF https://arxiv.org/pdf/1912.11006v3.pdf
PWC https://paperswithcode.com/paper/data-free-adversarial-distillation
Repo https://github.com/VainF/Data-Free-Adversarial-Distillation
Framework pytorch
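
The abstract above describes a "model discrepancy" that the student tries to shrink and the generator tries to enlarge. The sketch below shows one plausible way to measure such a discrepancy, as the mean absolute difference between teacher and student outputs on the same inputs. The choice of mean absolute error and the name `model_discrepancy` are assumptions made for illustration; the abstract does not specify the exact measure.

```python
def model_discrepancy(teacher_outputs, student_outputs):
    """Mean absolute difference between two batches of model outputs.

    Both arguments are lists of equal-length rows (one row per sample).
    Assumption: mean absolute error is an illustrative stand-in for the
    discrepancy measure; it is not confirmed by the abstract.
    """
    total = 0.0
    count = 0
    for teacher_row, student_row in zip(teacher_outputs, student_outputs):
        for t, s in zip(teacher_row, student_row):
            total += abs(t - s)
            count += 1
    if count == 0:
        raise ValueError("outputs must not be empty")
    return total / count


# One sample with two output values: |1 - 0| and |2 - 4| average to 1.5.
gap = model_discrepancy([[1.0, 2.0]], [[0.0, 4.0]])
```

In an adversarial setup of this kind, the student would be updated to decrease this value on generated samples, while the generator would be updated to increase it.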

Better, Faster, Stronger Sequence Tagging Constituent Parsers

Title Better, Faster, Stronger Sequence Tagging Constituent Parsers
Authors David Vilares, Mostafa Abdou, Anders Søgaard
Abstract Sequence tagging models for constituent parsing are faster, but less accurate than other types of parsers. In this work, we address the following weaknesses of such constituent parsers: (a) high error rates around closing brackets of long constituents, (b) large label sets, leading to sparsity, and (c) error propagation arising from greedy decoding. To effectively close brackets, we train a model that learns to switch between tagging schemes. To reduce sparsity, we decompose the label set and use multi-task learning to jointly learn to predict sublabels. Finally, we mitigate issues from greedy decoding through auxiliary losses and sentence-level fine-tuning with policy gradient. Combining these techniques, we clearly surpass the performance of sequence tagging constituent parsers on the English and Chinese Penn Treebanks, and reduce their parsing time even further. On the SPMRL datasets, we observe even greater improvements across the board, including a new state of the art on Basque, Hebrew, Polish and Swedish.
Tasks Multi-Task Learning
Published 2019-02-28
URL https://arxiv.org/abs/1902.10985v3
PDF https://arxiv.org/pdf/1902.10985v3.pdf
PWC https://paperswithcode.com/paper/better-faster-stronger-sequence-tagging
Repo https://github.com/aghie/tree2labels
Framework tf

Detecting Toxicity in News Articles: Application to Bulgarian

Title Detecting Toxicity in News Articles: Application to Bulgarian
Authors Yoan Dinkov, Ivan Koychev, Preslav Nakov
Abstract Online media aim to reach ever bigger audiences and to attract ever longer attention spans. This competition creates an environment that rewards sensational, fake, and toxic news. To help limit their spread and impact, we propose and develop a news toxicity detector that can recognize various types of toxic content. While previous research primarily focused on English, here we target Bulgarian. We created a new dataset by crawling a website that for five years has been collecting Bulgarian news articles that were manually categorized into eight toxicity groups. Then we trained a multi-class classifier with nine categories: eight toxic and one non-toxic. We experimented with different representations based on ELMo, BERT, and XLM, as well as with a variety of domain-specific features. Due to the small size of our dataset, we created a separate model for each feature type, and we ultimately combined these models into a meta-classifier. The evaluation results show an accuracy of 59.0% and a macro-F1 score of 39.7%, which represent sizable improvements over the majority-class baseline (Acc=30.3%, macro-F1=5.2%).
Tasks
Published 2019-08-26
URL https://arxiv.org/abs/1908.09785v1
PDF https://arxiv.org/pdf/1908.09785v1.pdf
PWC https://paperswithcode.com/paper/detecting-toxicity-in-news-articles
Repo https://github.com/yoandinkov/ranlp-2019
Framework none
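
The majority-class baseline figures quoted in the abstract above (Acc=30.3%, macro-F1=5.2%) follow from simple arithmetic, which the sketch below reproduces. It assumes the usual convention that a class which is never predicted contributes an F1 of 0 to the macro average; the function name is hypothetical.

```python
def majority_baseline_macro_f1(majority_share, num_classes):
    """Macro-F1 of a classifier that always predicts the majority class.

    majority_share: fraction of examples in the majority class, which is
    also the baseline accuracy.
    Assumption: never-predicted classes score F1 = 0.
    """
    precision = majority_share  # every prediction is the majority class
    recall = 1.0                # all majority examples are recovered
    f1_majority = 2 * precision * recall / (precision + recall)
    # The remaining classes contribute 0, so average over all classes.
    return f1_majority / num_classes


# Nine categories, with 30.3% of articles in the largest one.
baseline = majority_baseline_macro_f1(0.303, 9)
baseline_percent = round(100 * baseline, 1)
```

With nine classes the single non-zero F1 (about 0.465) is divided by nine, which is why the baseline looks reasonable on accuracy yet very weak on macro-F1.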

Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

Title Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing
Authors Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie Su
Abstract SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted l1 penalty: the larger the rank of the fitted coefficient, the larger the penalty. This non-separable penalty renders many existing techniques invalid or inconclusive in analyzing the SLOPE solution. In this paper, we develop an asymptotically exact characterization of the SLOPE solution under Gaussian random designs through solving the SLOPE problem using approximate message passing (AMP). This algorithmic approach allows us to approximate the SLOPE solution via the much more amenable AMP iterates. Explicitly, we characterize the asymptotic dynamics of the AMP iterates relying on a recently developed state evolution analysis for non-separable penalties, thereby overcoming the difficulty caused by the sorted l1 penalty. Moreover, we prove that the AMP iterates converge to the SLOPE solution in an asymptotic sense, and numerical simulations show that the convergence is surprisingly fast. Our proof rests on a novel technique that specifically leverages the SLOPE problem. In contrast to prior literature, our work not only yields an asymptotically sharp analysis but also offers an algorithmic, flexible, and constructive approach to understanding the SLOPE problem.
Tasks
Published 2019-07-17
URL https://arxiv.org/abs/1907.07502v1
PDF https://arxiv.org/pdf/1907.07502v1.pdf
PWC https://paperswithcode.com/paper/algorithmic-analysis-and-statistical
Repo https://github.com/woodyx218/SLOPE_AMP
Framework none
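
The sorted l1 penalty described in the abstract above pairs the largest regularization weight with the largest coefficient magnitude, the second largest with the second largest, and so on. A minimal sketch of that penalty follows; the function name `sorted_l1_penalty` is hypothetical, and the code computes only the penalty term, not the SLOPE solution or the AMP iterates.

```python
def sorted_l1_penalty(coefficients, lambdas):
    """Sorted l1 penalty: sum_i lambda_i * |b|_(i).

    |b|_(1) >= |b|_(2) >= ... are the coefficient magnitudes in
    decreasing order, and lambdas must be non-negative and non-increasing.
    """
    if len(coefficients) != len(lambdas):
        raise ValueError("coefficients and lambdas must have equal length")
    if any(lam < 0 for lam in lambdas):
        raise ValueError("lambdas must be non-negative")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")
    magnitudes = sorted((abs(b) for b in coefficients), reverse=True)
    return sum(lam * mag for lam, mag in zip(lambdas, magnitudes))


# Magnitudes sorted: 3, 2, 1 -> 3*3 + 2*2 + 1*1 = 14.
slope_value = sorted_l1_penalty([1.0, -3.0, 2.0], [3.0, 2.0, 1.0])

# With equal lambdas the penalty reduces to the ordinary l1 (lasso) norm.
lasso_value = sorted_l1_penalty([1.0, -3.0, 2.0], [1.0, 1.0, 1.0])
```

Because the weight attached to each coefficient depends on its rank among all coefficients, the penalty cannot be written as a sum of independent per-coordinate terms, which is the non-separability the abstract refers to.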

Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight

Title Learning to Map Natural Language Instructions to Physical Quadcopter Control using Simulated Flight
Authors Valts Blukis, Yannick Terme, Eyvind Niklasson, Ross A. Knepper, Yoav Artzi
Abstract We propose a joint simulation and real-world learning framework for mapping navigation instructions and raw first-person observations to continuous control. Our model estimates the need for environment exploration, predicts the likelihood of visiting environment positions during execution, and controls the agent to both explore and visit high-likelihood positions. We introduce Supervised Reinforcement Asynchronous Learning (SuReAL). Learning uses both simulation and real environments without requiring autonomous flight in the physical environment during training, and combines supervised learning for predicting positions to visit and reinforcement learning for continuous control. We evaluate our approach on a natural language instruction-following task with a physical quadcopter, and demonstrate effective execution and exploration behavior.
Tasks Continuous Control
Published 2019-10-21
URL https://arxiv.org/abs/1910.09664v1
PDF https://arxiv.org/pdf/1910.09664v1.pdf
PWC https://paperswithcode.com/paper/learning-to-map-natural-language-instructions
Repo https://github.com/lil-lab/drif
Framework pytorch

Discriminator optimal transport

Title Discriminator optimal transport
Authors Akinori Tanaka
Abstract Within a broad class of generative adversarial networks, we show that the discriminator optimization process increases a lower bound of the dual cost function for the Wasserstein distance between the target distribution $p$ and the generator distribution $p_G$. It implies that the trained discriminator can approximate optimal transport (OT) from $p_G$ to $p$. Based on some experiments and a bit of OT theory, we propose a discriminator optimal transport (DOT) scheme to improve generated images. We show that it improves the inception score and FID for unconditional GANs trained on CIFAR-10 and STL-10, and for a public pre-trained conditional GAN model trained on ImageNet.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06832v2
PDF https://arxiv.org/pdf/1910.06832v2.pdf
PWC https://paperswithcode.com/paper/discriminator-optimal-transport
Repo https://github.com/AkinoriTanaka-phys/DOT
Framework tf
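
For readers unfamiliar with the "dual cost function" mentioned in the abstract above, the best-known instance is the Kantorovich-Rubinstein duality for the Wasserstein-1 distance, written out below. The test function $f$ and the Lipschitz constraint are standard notation introduced here for illustration; the paper's own cost function and bound may take a different form.

```latex
W_1(p, p_G) \;=\; \sup_{\|f\|_{L} \le 1}
  \Big( \mathbb{E}_{x \sim p}\big[f(x)\big] - \mathbb{E}_{y \sim p_G}\big[f(y)\big] \Big)
```

Any single 1-Lipschitz function $f$ plugged into the right-hand side gives a lower bound on $W_1(p, p_G)$, which is the sense in which improving a discriminator-like function can raise a lower bound on the distance.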

Deep Comprehensive Correlation Mining for Image Clustering

Title Deep Comprehensive Correlation Mining for Image Clustering
Authors Jianlong Wu, Keyu Long, Fei Wang, Chen Qian, Cheng Li, Zhouchen Lin, Hongbin Zha
Abstract Recently developed deep unsupervised methods allow us to jointly learn representation and cluster unlabelled data. These deep clustering methods mainly focus on the correlation among samples, e.g., selecting high precision pairs to gradually tune the feature representation, which neglects other useful correlations. In this paper, we propose a novel clustering framework, named deep comprehensive correlation mining (DCCM), for exploring and taking full advantage of various kinds of correlations behind the unlabeled data from three aspects: 1) Instead of only using pair-wise information, pseudo-label supervision is proposed to investigate category information and learn discriminative features. 2) The features’ robustness to image transformation of input space is fully explored, which benefits the network learning and significantly improves the performance. 3) The triplet mutual information among features is presented for the clustering problem to lift the recently discovered instance-level deep mutual information to a triplet-level formation, which further helps to learn more discriminative features. Extensive experiments on several challenging datasets show that our method achieves good performance, e.g., attaining 62.3% clustering accuracy on CIFAR-10, which is 10.1% higher than the state-of-the-art results.
Tasks Image Clustering
Published 2019-04-15
URL https://arxiv.org/abs/1904.06925v3
PDF https://arxiv.org/pdf/1904.06925v3.pdf
PWC https://paperswithcode.com/paper/deep-comprehensive-correlation-mining-for
Repo https://github.com/Cory-M/DCCM
Framework pytorch
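
The "clustering accuracy" reported in the abstract above is conventionally computed by finding the best one-to-one mapping from cluster ids to class labels and then measuring ordinary accuracy. The sketch below does this by brute force over permutations, which is only practical for a handful of clusters; evaluations at CIFAR-10 scale typically use the Hungarian algorithm instead. The function name is hypothetical, and that the paper uses exactly this metric is an assumption.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Best accuracy over one-to-one mappings from clusters to classes.

    Brute force: tries every assignment of distinct classes to clusters,
    so it is meant for small illustrative inputs only.
    """
    if not true_labels or len(true_labels) != len(cluster_ids):
        raise ValueError("inputs must be non-empty and of equal length")
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("this sketch needs at most as many clusters as classes")
    best_correct = 0
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        correct = sum(
            1 for label, cluster in zip(true_labels, cluster_ids)
            if mapping[cluster] == label
        )
        best_correct = max(best_correct, correct)
    return best_correct / len(true_labels)


# Cluster ids are arbitrary names, so a pure relabeling scores 1.0.
perfect = clustering_accuracy([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1])

# One of four points lands in the wrong cluster.
partial = clustering_accuracy([0, 0, 1, 1], [1, 1, 1, 0])
```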

Scalable Modeling of Spatiotemporal Data using the Variational Autoencoder: an Application in Glaucoma

Title Scalable Modeling of Spatiotemporal Data using the Variational Autoencoder: an Application in Glaucoma
Authors Samuel I. Berchuck, Felipe A. Medeiros, Sayan Mukherjee
Abstract As big spatial data becomes increasingly prevalent, classical spatiotemporal (ST) methods often do not scale well. While methods have been developed to account for high-dimensional spatial objects, the setting where there are exceedingly large samples of spatial observations has had less attention. The variational autoencoder (VAE), an unsupervised generative model based on deep learning and approximate Bayesian inference, fills this void using a latent variable specification that is inferred jointly across the large number of samples. In this manuscript, we compare the performance of the VAE with a more classical ST method when analyzing longitudinal visual fields from a large cohort of patients in a prospective glaucoma study. Through simulation and a case study, we demonstrate that the VAE is a scalable method for analyzing ST data, when the goal is to obtain accurate predictions. R code to implement the VAE can be found on GitHub: https://github.com/berchuck/vaeST.
Tasks Bayesian Inference
Published 2019-08-24
URL https://arxiv.org/abs/1908.09195v1
PDF https://arxiv.org/pdf/1908.09195v1.pdf
PWC https://paperswithcode.com/paper/scalable-modeling-of-spatiotemporal-data
Repo https://github.com/berchuck/vaeST
Framework tf

iSplit LBI: Individualized Partial Ranking with Ties via Split LBI

Title iSplit LBI: Individualized Partial Ranking with Ties via Split LBI
Authors Qianqian Xu, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan Yao
Abstract Due to the inherent uncertainty of data, the problem of predicting partial ranking from pairwise comparison data with ties has attracted increasing interest in recent years. However, in real-world scenarios, different individuals often hold distinct preferences. It might be misleading to merely look at a global partial ranking while ignoring personal diversity. In this paper, instead of learning a global ranking that agrees with the consensus, we pursue the tie-aware partial ranking from an individualized perspective. Particularly, we formulate a unified framework which not only can be used for individualized partial ranking prediction, but is also helpful for abnormal user selection. This is realized by a variable splitting-based algorithm called iSplit LBI. Specifically, our algorithm generates a sequence of estimations with a regularization path, where both the hyperparameters and model parameters are updated. At each step of the path, the parameters can be decomposed into three orthogonal parts, namely, abnormal signals, personalized signals and random noise. The abnormal signals can serve the purpose of abnormal user selection, while the abnormal signals and personalized signals together are mainly responsible for individual partial ranking prediction. Extensive experiments on simulated and real-world datasets demonstrate that our new approach significantly outperforms state-of-the-art alternatives. The code is now available at https://github.com/qianqianxu010/NeurIPS2019-iSplitLBI.
Tasks
Published 2019-10-14
URL https://arxiv.org/abs/1910.05905v1
PDF https://arxiv.org/pdf/1910.05905v1.pdf
PWC https://paperswithcode.com/paper/isplit-lbi-individualized-partial-ranking
Repo https://github.com/qianqianxu010/NeurIPS2019-iSplitLBI
Framework none

Variation Network: Learning High-level Attributes for Controlled Input Manipulation

Title Variation Network: Learning High-level Attributes for Controlled Input Manipulation
Authors Gaëtan Hadjeres, Frank Nielsen
Abstract This paper presents the Variation Network (VarNet), a generative model providing means to manipulate the high-level attributes of a given input. The originality of our approach is that VarNet is not only capable of handling pre-defined attributes but can also learn the relevant attributes of the dataset by itself. These two settings can also be easily considered at the same time, which makes this model applicable to a wide variety of tasks. Further, VarNet has a sound information-theoretic interpretation which gives us interpretable means to control how these high-level attributes are learned. We demonstrate experimentally that this model is capable of performing interesting input manipulation and that the learned attributes are relevant and meaningful.
Tasks
Published 2019-01-11
URL https://arxiv.org/abs/1901.03634v2
PDF https://arxiv.org/pdf/1901.03634v2.pdf
PWC https://paperswithcode.com/paper/variation-network-learning-high-level
Repo https://github.com/Ghadjeres/VarNet
Framework pytorch

Peeking into the Future: Predicting Future Person Activities and Locations in Videos

Title Peeking into the Future: Predicting Future Person Activities and Locations in Videos
Authors Junwei Liang, Lu Jiang, Juan Carlos Niebles, Alexander Hauptmann, Li Fei-Fei
Abstract Deciphering human behaviors to predict their future paths/trajectories and what they would do from videos is important in many applications. Motivated by this idea, this paper studies predicting a pedestrian’s future path jointly with future activities. We propose an end-to-end, multi-task learning system utilizing rich visual features about human behavioral information and interaction with their surroundings. To facilitate the training, the network is learned with an auxiliary task of predicting future location in which the activity will happen. Experimental results demonstrate our state-of-the-art performance over two public benchmarks on future trajectory prediction. Moreover, our method is able to produce meaningful future activity prediction in addition to the path. The result provides the first empirical evidence that joint modeling of paths and activities benefits future path prediction.
Tasks Activity Prediction, Future prediction, Motion Forecasting, Multi-Task Learning, Trajectory Prediction
Published 2019-02-11
URL https://arxiv.org/abs/1902.03748v3
PDF https://arxiv.org/pdf/1902.03748v3.pdf
PWC https://paperswithcode.com/paper/peeking-into-the-future-predicting-future
Repo https://github.com/google/next-prediction
Framework tf

Multilingual Question Answering from Formatted Text applied to Conversational Agents

Title Multilingual Question Answering from Formatted Text applied to Conversational Agents
Authors Wissam Siblini, Charlotte Pasqual, Axel Lavielle, Cyril Cauchois
Abstract Recent advances in NLP with language models such as BERT, GPT-2, XLNet or XLM have allowed surpassing human performance on Reading Comprehension tasks on large-scale datasets (e.g. SQuAD), and this opens up many perspectives for Conversational AI. However, task-specific datasets are mostly in English, which makes it difficult to measure progress in other languages. Fortunately, state-of-the-art models are now being pre-trained on multiple languages (e.g. BERT was released in a multilingual version managing a hundred languages) and are exhibiting ability for zero-shot transfer from English to other languages on XNLI. In this paper, we run experiments that show that multilingual BERT, trained to solve the complex Question Answering task defined in the English SQuAD dataset, is able to achieve the same task in Japanese and French. It even outperforms the best published results of a baseline which explicitly combines an English model for Reading Comprehension and a Machine Translation Model for transfer. We run further tests on crafted cross-lingual QA datasets (context in one language and question in another) to provide intuition on the mechanisms that allow BERT to transfer the task from one language to another. Finally, we introduce our application Kate. Kate is a conversational agent dedicated to HR support for employees that exploits multilingual models to accurately answer questions, in several languages, directly from information web pages.
Tasks Machine Translation, Question Answering, Reading Comprehension
Published 2019-10-10
URL https://arxiv.org/abs/1910.04659v1
PDF https://arxiv.org/pdf/1910.04659v1.pdf
PWC https://paperswithcode.com/paper/multilingual-question-answering-from
Repo https://github.com/wissam-sib/multilingualQA
Framework none