February 1, 2020

Paper Group AWR 299

DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation through SwaftNet for Surrounding Sensing

Title DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation through SwaftNet for Surrounding Sensing
Authors Kailun Yang, Xinxin Hu, Hao Chen, Kaite Xiang, Kaiwei Wang, Rainer Stiefelhagen
Abstract Semantically interpreting the traffic scene is crucial for autonomous transportation and robotics systems. However, state-of-the-art semantic segmentation pipelines are dominantly designed to work with pinhole cameras and train with narrow Field-of-View (FoV) images. In this sense, the perception capacity is severely limited to offer higher-level confidence for upstream navigation tasks. In this paper, we propose a network adaptation framework to achieve Panoramic Annular Semantic Segmentation (PASS), which allows conventional pinhole-view image datasets to be re-used, enabling modern segmentation networks to comfortably adapt to panoramic images. Specifically, we adapt our proposed SwaftNet to enhance the sensitivity to details by implementing attention-based lateral connections between the detail-critical encoder layers and the context-critical decoder layers. We benchmark the performance of efficient segmenters on panoramic segmentation with our extended PASS dataset, demonstrating that the proposed real-time SwaftNet outperforms state-of-the-art efficient networks. Furthermore, we assess real-world performance when deploying the Detail-Sensitive PASS (DS-PASS) system on a mobile robot and an instrumented vehicle, as well as the benefit of panoramic semantics for visual odometry, showing the robustness and potential to support diverse navigational applications.
Tasks Semantic Segmentation, Visual Odometry
Published 2019-09-17
URL https://arxiv.org/abs/1909.07721v2
PDF https://arxiv.org/pdf/1909.07721v2.pdf
PWC https://paperswithcode.com/paper/ds-pass-detail-sensitive-panoramic-annular
Repo https://github.com/elnino9ykl/DS-PASS
Framework pytorch

A variance modeling framework based on variational autoencoders for speech enhancement

Title A variance modeling framework based on variational autoencoders for speech enhancement
Authors Simon Leglaive, Laurent Girin, Radu Horaud
Abstract In this paper we address the problem of enhancing speech signals in noisy mixtures using a source separation approach. We explore the use of neural networks as an alternative to a popular speech variance model based on supervised non-negative matrix factorization (NMF). More precisely, we use a variational autoencoder as a speaker-independent supervised generative speech model, highlighting the conceptual similarities that this approach shares with its NMF-based counterpart. In order to be free of generalization issues regarding the noisy recording environments, we follow the approach of having a supervised model only for the target speech signal, the noise model being based on unsupervised NMF. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the variational autoencoder and estimating the unsupervised model parameters. Experiments show that the proposed method outperforms a semi-supervised NMF baseline and a state-of-the-art fully supervised deep learning approach.
Tasks Speech Enhancement
Published 2019-02-05
URL http://arxiv.org/abs/1902.01605v1
PDF http://arxiv.org/pdf/1902.01605v1.pdf
PWC https://paperswithcode.com/paper/a-variance-modeling-framework-based-on
Repo https://github.com/sleglaive/MLSP-2018
Framework none

Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation

Title Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
Authors Kexin Huang, Abhishek Singh, Sitong Chen, Edward T. Moseley, Chih-ying Deng, Naomi George, Charlotta Lindvall
Abstract Clinical notes contain rich data, which is under-exploited in predictive modeling compared to structured data. In this work, we developed a new text representation, Clinical XLNet, for clinical notes, which also leverages the temporal information of the sequence of the notes. We evaluated our models on the prolonged mechanical ventilation prediction problem, and our experiments demonstrated that Clinical XLNet consistently outperforms the best baselines.
Tasks
Published 2019-12-27
URL https://arxiv.org/abs/1912.11975v1
PDF https://arxiv.org/pdf/1912.11975v1.pdf
PWC https://paperswithcode.com/paper/clinical-xlnet-modeling-sequential-clinical
Repo https://github.com/kexinhuang12345/clinicalXLNet
Framework pytorch

Fast Single Image Reflection Suppression via Convex Optimization

Title Fast Single Image Reflection Suppression via Convex Optimization
Authors Yang Yang, Wenye Ma, Yin Zheng, Jian-Feng Cai, Weiyu Xu
Abstract Removing undesired reflections from images taken through the glass is of great importance in computer vision. It serves as a means to enhance the image quality for aesthetic purposes as well as to preprocess images in machine learning and pattern recognition applications. We propose a convex model to suppress the reflection from a single input image. Our model implies a partial differential equation with gradient thresholding, which is solved efficiently using Discrete Cosine Transform. Extensive experiments on synthetic and real-world images demonstrate that our approach achieves desirable reflection suppression results and dramatically reduces the execution time.
Tasks
Published 2019-03-10
URL https://arxiv.org/abs/1903.03889v3
PDF https://arxiv.org/pdf/1903.03889v3.pdf
PWC https://paperswithcode.com/paper/fast-single-image-reflection-suppression-via
Repo https://github.com/yyhz76/reflectSuppress
Framework none

Riemannian Normalizing Flow on Variational Wasserstein Autoencoder for Text Modeling

Title Riemannian Normalizing Flow on Variational Wasserstein Autoencoder for Text Modeling
Authors Prince Zizhuang Wang, William Yang Wang
Abstract Recurrent Variational Autoencoders have been widely used for language modeling and text generation tasks. These models often face a difficult optimization problem, also known as the Kullback-Leibler (KL) term vanishing issue, where the posterior easily collapses to the prior, and the model will ignore latent codes in generative tasks. To address this problem, we introduce an improved Wasserstein Variational Autoencoder (WAE) with Riemannian Normalizing Flow (RNF) for text modeling. The RNF transforms a latent variable into a space that respects the geometric characteristics of input space, which makes it impossible for the posterior to collapse to the non-informative prior. The Wasserstein objective minimizes the distance between the marginal distribution and the prior directly and therefore does not force the posterior to match the prior. Empirical experiments show that our model avoids KL vanishing over a range of datasets and performs better in tasks such as language modeling, likelihood approximation, and text generation. Through a series of experiments and analysis over latent space, we show that our model learns latent distributions that respect latent space geometry and is able to generate sentences that are more diverse.
Tasks Language Modelling, Text Generation
Published 2019-04-04
URL http://arxiv.org/abs/1904.02399v4
PDF http://arxiv.org/pdf/1904.02399v4.pdf
PWC https://paperswithcode.com/paper/riemannian-normalizing-flow-on-variational
Repo https://github.com/kingofspace0wzz/wae-rnf-lm
Framework pytorch
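
The KL vanishing issue mentioned in the abstract above can be made concrete with the closed-form KL term of a standard Gaussian VAE. The following is a minimal sketch, not code from the paper or its repository: it assumes a diagonal Gaussian posterior and a standard normal prior, and the function name kl_to_standard_normal is hypothetical.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    Assumption: diagonal Gaussian posterior, standard normal prior.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


# A posterior identical to the prior: the KL term is zero, so the latent
# code carries no information about the input (the "collapsed" case).
collapsed = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# A posterior that moved away from the prior has a strictly positive KL term.
informative = kl_to_standard_normal([1.0, -0.5], [-1.0, 0.2])
```

When training drives this term to zero for every input, the decoder can only rely on its own autoregressive context, which is the failure mode the abstract describes.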

A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning

Title A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning
Authors Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li
Abstract Rapid progress has been made in the field of reading comprehension and question answering, where several systems have achieved human parity in some simplified settings. However, the performance of these models degrades significantly when they are applied to more realistic scenarios, such as when answers involve various types, multiple text strings are correct answers, or discrete reasoning abilities are required. In this paper, we introduce the Multi-Type Multi-Span Network (MTMSN), a neural reading comprehension model that combines a multi-type answer predictor designed to support various answer types (e.g., span, count, negation, and arithmetic expression) with a multi-span extraction method for dynamically producing one or multiple text spans. In addition, an arithmetic expression reranking mechanism is proposed to rank expression candidates for further confirming the prediction. Experiments show that our model achieves 79.9 F1 on the DROP hidden test set, creating new state-of-the-art results. Source code (https://github.com/huminghao16/MTMSN) is released to facilitate future work.
Tasks Question Answering, Reading Comprehension
Published 2019-08-15
URL https://arxiv.org/abs/1908.05514v2
PDF https://arxiv.org/pdf/1908.05514v2.pdf
PWC https://paperswithcode.com/paper/a-multi-type-multi-span-network-for-reading
Repo https://github.com/huminghao16/MTMSN
Framework pytorch

Gated Task Interaction Framework for Multi-task Sequence Tagging

Title Gated Task Interaction Framework for Multi-task Sequence Tagging
Authors Isaac K. E. Ampomah, Sally McClean, Zhiwei Lin, Glenn Hawe
Abstract Recent studies have shown that neural models can achieve high performance on several sequence labelling/tagging problems without the explicit use of linguistic features such as part-of-speech (POS) tags. These models are trained only using the character-level and the word embedding vectors as inputs. Others have shown that linguistic features can improve the performance of neural models on tasks such as chunking and named entity recognition (NER). However, the change in performance depends on the degree of semantic relatedness between the linguistic features and the target task; in some instances, linguistic features can have a negative impact on performance. This paper presents an approach to jointly learn these linguistic features along with the target sequence labelling tasks with a new multi-task learning (MTL) framework called Gated Tasks Interaction (GTI) network for solving multiple sequence tagging tasks. The GTI network exploits the relations between the multiple tasks via neural gate modules. These gate modules control the flow of information between the different tasks. Experiments on benchmark datasets for chunking and NER show that our framework outperforms other competitive baselines trained with and without external training resources.
Tasks Chunking, Multi-Task Learning, Named Entity Recognition
Published 2019-09-29
URL https://arxiv.org/abs/1909.13193v1
PDF https://arxiv.org/pdf/1909.13193v1.pdf
PWC https://paperswithcode.com/paper/gated-task-interaction-framework-for-multi
Repo https://github.com/kaeflint/GTI
Framework none

Reducing Exploration of Dying Arms in Mortal Bandits

Title Reducing Exploration of Dying Arms in Mortal Bandits
Authors Stefano Tracà, Cynthia Rudin, Weiyu Yan
Abstract Mortal bandits have proven to be extremely useful for providing news article recommendations, running automated online advertising campaigns, and for other applications where the set of available options changes over time. Previous work on this problem showed how to regulate exploration of new arms when they have recently appeared, but it does not adapt when the arms are about to disappear. Since in most applications we can determine either exactly or approximately when arms will disappear, we can leverage this information to improve performance: we should not be exploring arms that are about to disappear. We provide adaptations of algorithms, regret bounds, and experiments for this study, showing a clear benefit from regulating greed (exploration/exploitation) for arms that will soon disappear. We illustrate numerical performance on the Yahoo! Front Page Today Module User Click Log Dataset.
Tasks
Published 2019-07-04
URL https://arxiv.org/abs/1907.02571v1
PDF https://arxiv.org/pdf/1907.02571v1.pdf
PWC https://paperswithcode.com/paper/reducing-exploration-of-dying-arms-in-mortal
Repo https://github.com/5tefan0/Supplement-to-Reducing-Exploration-of-Dying-Arms-in-Mortal-Bandits
Framework none
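
The idea in the abstract above, not exploring arms that are about to disappear, can be sketched with a small epsilon-greedy rule. This is an illustrative simplification, not the authors' algorithms or their regret-bounded variants; the function choose_arm, the arm dictionary layout, and the min_remaining threshold are all assumptions made for this example.

```python
import random


def choose_arm(arms, now, epsilon, min_remaining, rng):
    """Epsilon-greedy choice that never explores arms close to expiry.

    arms: dict mapping arm name -> {"mean": estimated reward,
                                    "dies_at": time the arm disappears}
    All names here are hypothetical and only serve the illustration.
    """
    # Only arms that have not yet disappeared can be played at all.
    alive = {k: a for k, a in arms.items() if a["dies_at"] > now}
    if not alive:
        raise ValueError("no arm is available at this time")

    # Exploitation candidate: best estimated mean among living arms.
    best = max(alive, key=lambda k: alive[k]["mean"])

    # Exploration candidates: arms with enough lifetime left to repay
    # what exploration would teach us about them.
    explorable = [
        k for k, a in alive.items() if a["dies_at"] - now >= min_remaining
    ]
    if explorable and rng.random() < epsilon:
        return rng.choice(explorable)
    return best


arms = {
    "A": {"mean": 0.1, "dies_at": 100},
    "B": {"mean": 0.9, "dies_at": 12},
}
rng = random.Random(0)
explore_pick = choose_arm(arms, now=10, epsilon=1.0, min_remaining=5, rng=rng)
exploit_pick = choose_arm(arms, now=10, epsilon=0.0, min_remaining=5, rng=rng)
```

In this toy setup arm B is still exploited while it lives, because its estimate is the best, but it is never chosen for exploration since it has only two time steps left.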

Automated Segmentation of Pulmonary Lobes using Coordination-Guided Deep Neural Networks

Title Automated Segmentation of Pulmonary Lobes using Coordination-Guided Deep Neural Networks
Authors Wenjia Wang, Junxuan Chen, Jie Zhao, Ying Chi, Xuansong Xie, Li Zhang, Xiansheng Hua
Abstract The identification of pulmonary lobes is of great importance in disease diagnosis and treatment. A few lung diseases have regional disorders at the lobar level. Thus, an accurate segmentation of pulmonary lobes is necessary. In this work, we propose an automated segmentation of pulmonary lobes using coordination-guided deep neural networks from chest CT images. We first employ an automated lung segmentation to extract the lung area from the CT image, then exploit a volumetric convolutional neural network (V-net) for segmenting the pulmonary lobes. To reduce the misclassification of different lobes, we adopt coordination-guided convolutional layers (CoordConvs) that generate additional feature maps of the positional information of pulmonary lobes. The proposed model is trained and evaluated on a few publicly available datasets and has achieved state-of-the-art accuracy with a mean Dice coefficient index of 0.947 ± 0.044.
Tasks
Published 2019-04-19
URL http://arxiv.org/abs/1904.09106v1
PDF http://arxiv.org/pdf/1904.09106v1.pdf
PWC https://paperswithcode.com/paper/automated-segmentation-of-pulmonary-lobes
Repo https://github.com/woans0104/sk_project
Framework none
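
Two ingredients named in the abstract above are easy to illustrate: the extra positional feature maps that a CoordConv-style layer appends to its input, and the Dice coefficient used to report accuracy. The sketch below is a 2D, pure-Python simplification under stated assumptions: the paper works on 3D CT volumes, the scaling of coordinates to [-1, 1] is an assumption about the normalization, and both function names are hypothetical.

```python
def coord_channels(height, width):
    """Return (rows, cols) coordinate maps scaled to [-1, 1].

    A CoordConv-style layer would concatenate maps like these to the
    image channels before convolving (2D simplification).
    """
    def scale(i, n):
        # Map index i in [0, n-1] to [-1, 1]; a single-element axis maps to 0.
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0

    rows = [[scale(r, height) for _ in range(width)] for r in range(height)]
    cols = [[scale(c, width) for c in range(width)] for _ in range(height)]
    return rows, cols


def dice(pred, target):
    """Dice coefficient 2|A & B| / (|A| + |B|) for two sets of voxel indices."""
    if not pred and not target:
        return 1.0  # two empty masks agree perfectly by convention
    return 2.0 * len(pred & target) / (len(pred) + len(target))


rows, cols = coord_channels(3, 3)
overlap = dice({1, 2, 3}, {2, 3, 4})
```

The coordinate maps give every voxel an explicit notion of where it sits in the volume, which is the positional cue the abstract credits with reducing confusion between lobes.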

Tensor-based computation of metastable and coherent sets

Title Tensor-based computation of metastable and coherent sets
Authors Feliks Nüske, Patrick Gelß, Stefan Klus, Cecilia Clementi
Abstract Recent years have seen rapid advances in the data-driven analysis of dynamical systems based on Koopman operator theory – with extended dynamic mode decomposition (EDMD) being a cornerstone of the field. On the other hand, low-rank tensor product approximations – in particular the tensor train (TT) format – have become a valuable tool for the solution of large-scale problems in a number of fields. In this work, we combine EDMD and the TT format, enabling the application of EDMD to high-dimensional problems in conjunction with a large set of features. We derive efficient algorithms to solve the EDMD eigenvalue problem based on tensor representations of the data, and to project the data into a low-dimensional representation defined by the eigenvectors. We extend this method to perform canonical correlation analysis (CCA) of non-reversible or time-dependent systems. We prove that there is a physical interpretation of the procedure and demonstrate its capabilities by applying the method to several benchmark data sets.
Tasks
Published 2019-08-12
URL https://arxiv.org/abs/1908.04741v2
PDF https://arxiv.org/pdf/1908.04741v2.pdf
PWC https://paperswithcode.com/paper/tensor-based-edmd-for-the-koopman-analysis-of
Repo https://github.com/PGelss/scikit_tt
Framework none

From Independent Prediction to Re-ordered Prediction: Integrating Relative Position and Global Label Information to Emotion Cause Identification

Title From Independent Prediction to Re-ordered Prediction: Integrating Relative Position and Global Label Information to Emotion Cause Identification
Authors Zixiang Ding, Huihui He, Mengran Zhang, Rui Xia
Abstract Emotion cause identification aims at identifying the potential causes that lead to a certain emotion expression in text. Several techniques including rule-based methods and traditional machine learning methods have been proposed to address this problem based on manually designed rules and features. More recently, some deep learning methods have also been applied to this task, with the attempt to automatically capture the causal relationship of emotion and its causes embodied in the text. In this work, we find that in addition to the content of the text, there are two further kinds of information, namely relative position and global labels, that are also very important for emotion cause identification. To integrate such information, we propose a model based on the neural network architecture to encode the three elements (i.e., text content, relative position and global label) in a unified and end-to-end fashion. We introduce a relative position augmented embedding learning algorithm, and transform the task from an independent prediction problem to a reordered prediction problem, where the dynamic global label information is incorporated. Experimental results on a benchmark emotion cause dataset show that our model achieves new state-of-the-art performance and performs significantly better than a number of competitive baselines. Further analysis shows the effectiveness of the relative position augmented embedding learning algorithm and the reordered prediction mechanism with dynamic global labels.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.01230v1
PDF https://arxiv.org/pdf/1906.01230v1.pdf
PWC https://paperswithcode.com/paper/from-independent-prediction-to-re-ordered
Repo https://github.com/NUSTM/PAEDGL
Framework tf

EvalAI: Towards Better Evaluation Systems for AI Agents

Title EvalAI: Towards Better Evaluation Systems for AI Agents
Authors Deshraj Yadav, Rishabh Jain, Harsh Agrawal, Prithvijit Chattopadhyay, Taranjeet Singh, Akash Jain, Shiv Baran Singh, Stefan Lee, Dhruv Batra
Abstract We introduce EvalAI, an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence algorithms (AI) at scale. EvalAI is built to provide a scalable solution to the research community to fulfill the critical need of evaluating machine learning models and agents acting in an environment against annotations or with a human-in-the-loop. This will help researchers, students, and data scientists to create, collaborate, and participate in AI challenges organized around the globe. By simplifying and standardizing the process of benchmarking these models, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of machine learning and artificial intelligence, thereby increasing the rate of measurable progress in this domain.
Tasks
Published 2019-02-10
URL http://arxiv.org/abs/1902.03570v1
PDF http://arxiv.org/pdf/1902.03570v1.pdf
PWC https://paperswithcode.com/paper/evalai-towards-better-evaluation-systems-for
Repo https://github.com/beyretb/AnimalAI-Olympics
Framework tf

Rinascimento: Optimising Statistical Forward Planning Agents for Playing Splendor

Title Rinascimento: Optimising Statistical Forward Planning Agents for Playing Splendor
Authors Ivan Bravi, Simon Lucas, Diego Perez-Liebana, Jialin Liu
Abstract Game-based benchmarks have been playing an essential role in the development of Artificial Intelligence (AI) techniques. Providing diverse challenges is crucial to push research toward innovation and understanding in modern techniques. Rinascimento provides a parameterised, partially-observable, multiplayer card-based board game whose parameters can easily modify the rules, objectives and items in the game. We describe the framework in all its features and the game-playing challenge, providing baseline game-playing AIs and an analysis of their skills. We give agents’ hyper-parameter tuning a central role in the experiments, highlighting how heavily it can influence performance. The baseline agents contain several additional contributions to Statistical Forward Planning algorithms.
Tasks
Published 2019-04-03
URL http://arxiv.org/abs/1904.01883v1
PDF http://arxiv.org/pdf/1904.01883v1.pdf
PWC https://paperswithcode.com/paper/rinascimento-optimising-statistical-forward
Repo https://github.com/ivanbravi/RinascimentoFramework
Framework none

The Evolved Transformer

Title The Evolved Transformer
Authors David R. So, Chen Liang, Quoc V. Le
Abstract Recent works have highlighted the strength of the Transformer architecture on sequence tasks while, at the same time, neural architecture search (NAS) has begun to outperform human-designed models. Our goal is to apply NAS to search for a better alternative to the Transformer. We first construct a large search space inspired by the recent advances in feed-forward sequence models and then run evolutionary architecture search with warm starting by seeding our initial population with the Transformer. To directly search on the computationally expensive WMT 2014 English-German translation task, we develop the Progressive Dynamic Hurdles method, which allows us to dynamically allocate more resources to more promising candidate models. The architecture found in our experiments – the Evolved Transformer – demonstrates consistent improvement over the Transformer on four well-established language tasks: WMT 2014 English-German, WMT 2014 English-French, WMT 2014 English-Czech and LM1B. At a big model size, the Evolved Transformer establishes a new state-of-the-art BLEU score of 29.8 on WMT’14 English-German; at smaller sizes, it achieves the same quality as the original “big” Transformer with 37.6% fewer parameters and outperforms the Transformer by 0.7 BLEU at a mobile-friendly model size of 7M parameters.
Tasks Machine Translation, Neural Architecture Search
Published 2019-01-30
URL https://arxiv.org/abs/1901.11117v4
PDF https://arxiv.org/pdf/1901.11117v4.pdf
PWC https://paperswithcode.com/paper/the-evolved-transformer
Repo https://github.com/tensorflow/tensor2tensor
Framework tf
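
The abstract above says Progressive Dynamic Hurdles allocates more resources to more promising candidates. A heavily simplified sketch of that idea follows. It is not the tensor2tensor implementation and not the exact procedure from the paper: here every stage recomputes the hurdle as the mean fitness of the candidates that reached it, and fitness_after is a hypothetical evaluation hook standing in for real training.

```python
def progressive_hurdles(candidates, fitness_after, step_schedule):
    """Grant extra training steps only to candidates that clear each hurdle.

    fitness_after(candidate, steps) -> float is a hypothetical hook that
    would train a model for `steps` steps and return its fitness.
    Returns a dict mapping each candidate to the total steps it was granted.
    """
    granted = {c: 0 for c in candidates}
    survivors = list(candidates)
    total = 0
    for extra in step_schedule:
        total += extra
        scores = {}
        for c in survivors:
            granted[c] = total
            scores[c] = fitness_after(c, total)
        # Simplifying assumption: the hurdle is the mean fitness at this stage.
        hurdle = sum(scores.values()) / len(scores)
        survivors = [c for c in survivors if scores[c] >= hurdle]
    return granted


# Toy fitness that ignores the step count, so the outcome is easy to follow.
quality = {"a": 0.1, "b": 0.2, "c": 0.8, "d": 0.9}
granted = progressive_hurdles(
    ["a", "b", "c", "d"],
    fitness_after=lambda c, steps: quality[c],
    step_schedule=[10, 20, 40],
)
```

In this toy run the two weakest candidates stop after 10 steps, the third after 30, and only the strongest receives the full 70, so most of the budget goes to the candidate that keeps clearing the hurdle.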

Beyond Statistical Relations: Integrating Knowledge Relations into Style Correlations for Multi-Label Music Style Classification

Title Beyond Statistical Relations: Integrating Knowledge Relations into Style Correlations for Multi-Label Music Style Classification
Authors Qianwen Ma, Chunyuan Yuan, Wei Zhou, Jizhong Han, Songlin Hu
Abstract Automatically labeling multiple styles for every song is a comprehensive application in all kinds of music websites. Recently, some studies have explored review-driven multi-label music style classification and exploited style correlations for this task. However, their methods focus on mining the statistical relations between different music styles and only consider shallow style relations. Moreover, these statistical relations suffer from the underfitting problem because some music styles have little training data. To tackle these problems, we propose a novel knowledge relations integrated framework (KRF) to capture the complete style correlations, which jointly exploits the inherent relations between music styles according to external knowledge and their statistical relations. Based on the two types of relations, we use a graph convolutional network to learn the deep correlations between styles automatically. Experimental results show that our framework significantly outperforms state-of-the-art methods. Further studies demonstrate that our framework can effectively alleviate the underfitting problem and learn meaningful style correlations.
Tasks
Published 2019-11-09
URL https://arxiv.org/abs/1911.03626v1
PDF https://arxiv.org/pdf/1911.03626v1.pdf
PWC https://paperswithcode.com/paper/beyond-statistical-relations-integrating
Repo https://github.com/chunyuanY/MusicGenre
Framework none