January 25, 2020


Paper Group ANR 1712

Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees. Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization. Unsupervised adulterated red-chili pepper content transformation for hyperspectral classification. Helping IT and OT Defenders Collaborate. Integrating Dict …

Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees


Title	Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees
Authors	Jan Chorowski, Adrian Lancucki, Bartosz Kostka, Michal Zapotoczny
Abstract	Deep neural acoustic models benefit from context-dependent (CD) modeling of output symbols. We consider direct training of CTC networks with CD outputs, and identify two issues. The first one is frame-level normalization of probabilities in CTC, which induces strong language modeling behavior that leads to overfitting and interference with external language models. The second one is poor generalization in the presence of numerous lexical units like triphones or tri-chars. We mitigate the former with utterance-level normalization of probabilities. The latter typically requires reducing the CD symbol inventory with state-tying decision trees, which have to be transferred from classical GMM-HMM systems. We replace the trees with a CD symbol embedding network, which saves parameters and ensures generalization to unseen and undersampled CD symbols. The embedding network is trained together with the rest of the acoustic model and removes one of the last cases in which neural systems have to be bootstrapped from GMM-HMM ones.
Tasks	Language Modelling
Published	2019-01-14
URL	http://arxiv.org/abs/1901.04379v2
PDF	http://arxiv.org/pdf/1901.04379v2.pdf
PWC	https://paperswithcode.com/paper/towards-using-context-dependent-symbols-in
Repo
Framework

Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization


Title	Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization
Authors	Sangwoo Cho, Logan Lebanoff, Hassan Foroosh, Fei Liu
Abstract	The most important obstacles facing multi-document summarization include excessive redundancy in source descriptions and the looming shortage of training data. These obstacles prevent encoder-decoder models from being used directly, but optimization-based methods such as determinantal point processes (DPPs) are known to handle them well. In this paper we seek to strengthen a DPP-based method for extractive multi-document summarization by presenting a novel similarity measure inspired by capsule networks. The approach measures redundancy between a pair of sentences based on surface form and semantic information. We show that our DPP system with improved similarity measure performs competitively, outperforming strong summarization baselines on benchmark datasets. Our findings are particularly meaningful for summarizing documents created by multiple authors containing redundant yet lexically diverse expressions.
Tasks	Document Summarization, Multi-Document Summarization, Point Processes
Published	2019-05-31
URL	https://arxiv.org/abs/1906.00072v1
PDF	https://arxiv.org/pdf/1906.00072v1.pdf
PWC	https://paperswithcode.com/paper/190600072
Repo
Framework
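
As a rough illustration of how a DPP trades off sentence quality against redundancy, the sketch below builds an L-ensemble kernel `L[i][j] = q_i * S[i][j] * q_j` and selects sentences greedily by determinant. The quality scores, the similarity matrix, and the greedy selection loop are all assumptions made for this example; the paper's capsule-network-inspired similarity measure is not reproduced here.

```python
def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def greedy_dpp(quality, similarity, k):
    """Greedily pick up to k items that maximize det(L_Y).

    quality: per-sentence importance scores (hypothetical values).
    similarity: symmetric positive semi-definite matrix, 1.0 on the diagonal.
    """
    n = len(quality)
    L = [[quality[i] * similarity[i][j] * quality[j] for j in range(n)]
         for i in range(n)]
    selected = []
    while len(selected) < k:
        best, best_det = None, 0.0
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[L[r][c] for c in idx] for r in idx]
            d = det(sub)
            if d > best_det:
                best, best_det = i, d
        if best is None:  # no candidate adds volume; stop early
            break
        selected.append(best)
    return selected


# Sentences 0 and 1 are near-duplicates; sentence 2 is distinct.
quality = [1.0, 0.9, 0.8]
similarity = [
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.10],
    [0.10, 0.10, 1.00],
]
print(greedy_dpp(quality, similarity, k=2))  # [0, 2]
```

Although sentence 1 scores higher on quality than sentence 2, its high similarity to the already selected sentence 0 shrinks the determinant, so the less redundant sentence 2 is chosen instead.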

Unsupervised adulterated red-chili pepper content transformation for hyperspectral classification


Title	Unsupervised adulterated red-chili pepper content transformation for hyperspectral classification
Authors	Muhammad Hussain Khan, Zainab Saleem, Muhammad Ahmad, Ahmed Sohaib, Hamail Ayaz
Abstract	Preserving red-chili quality is of utmost importance, and the authorities demand quality-control techniques to detect, classify, and prevent impurities. For example, contamination of ground red chili, which is typically a food, with salt, wheat flour, wheat bran, or rice bran is a serious threat to people who are allergic to such items. This work presents the feasibility of utilizing visible and near-infrared (VNIR) hyperspectral imaging (HSI) to detect and classify the aforementioned adulterants in red chili. However, annotating adulterated red chili data is a major challenge for classification because acquiring labeled data for real-time supervised learning is expensive in terms of cost and time. Therefore, this study proposes, for the first time, a novel approach to annotate the red chili samples using a clustering mechanism on the spectral response at the 500 nm wavelength, owing to its dark appearance at that wavelength. The spectral samples are then classified as pure or adulterated using a one-class SVM. The classification performance reaches 99% for pure adulterants or pure red chili and 85% for adulterated samples. We further find that a single classification model is enough to detect any foreign substance in red chili pepper, rather than cascading multiple PLS regression models.
Tasks
Published	2019-11-09
URL	https://arxiv.org/abs/1911.03711v1
PDF	https://arxiv.org/pdf/1911.03711v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-adulterated-red-chili-pepper
Repo
Framework

Helping IT and OT Defenders Collaborate


Title	Helping IT and OT Defenders Collaborate
Authors	Glenn A. Fink, Penny McKenzie
Abstract	Cyber-physical systems, especially in critical infrastructures, have become primary hacking targets in international conflicts and diplomacy. However, cyber-physical systems present unique challenges to defenders, starting with an inability to communicate. This paper outlines the results of our interviews with information technology (IT) defenders and operational technology (OT) operators and seeks to address lessons learned from them in the structure of our notional solutions. We present two problems in this paper: (1) the difficulty of coordinating detection and response between defenders who work on the cyber/IT and physical/OT sides of cyber-physical infrastructures, and (2) the difficulty of estimating the safety state of a cyber-physical system while an intrusion is underway but before damage can be effected by the attacker. To meet these challenges, we propose two solutions: (1) a visualization that will enable communication between IT defenders and OT operators, and (2) a machine-learning approach that will estimate how far from normal the physical system is operating and send that information to the visualization.
Tasks
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07374v1
PDF	http://arxiv.org/pdf/1904.07374v1.pdf
PWC	https://paperswithcode.com/paper/helping-it-and-ot-defenders-collaborate
Repo
Framework

Integrating Dictionary Feature into A Deep Learning Model for Disease Named Entity Recognition


Title	Integrating Dictionary Feature into A Deep Learning Model for Disease Named Entity Recognition
Authors	Hamada A. Nayel, Shashrekha H. L
Abstract	In recent years, Deep Learning (DL) models have become important due to their demonstrated success at overcoming complex learning problems. DL models have been applied effectively to different Natural Language Processing (NLP) tasks such as Part-of-Speech (PoS) tagging and Machine Translation (MT). Disease Named Entity Recognition (Disease-NER) is a crucial task which aims at extracting disease Named Entities (NEs) from text. In this paper, a DL model for Disease-NER using dictionary information is proposed and evaluated on the National Center for Biotechnology Information (NCBI) disease corpus and the BC5CDR dataset. Word embeddings trained over general-domain texts as well as biomedical texts have been used to represent the input to the proposed model. This study also compares two different Segment Representation (SR) schemes, namely IOB2 and IOBES, for Disease-NER. The results illustrate that using dictionary information, pre-trained word embeddings, character embeddings, and a CRF with global score improves the performance of the Disease-NER system.
Tasks	Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01600v1
PDF	https://arxiv.org/pdf/1911.01600v1.pdf
PWC	https://paperswithcode.com/paper/integrating-dictionary-feature-into-a-deep
Repo
Framework
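
The two segment representation schemes compared in the abstract differ only in how they mark entity boundaries: IOBES adds an explicit end tag (`E-`) and a single-token tag (`S-`) to the begin/inside/outside tags of IOB2. The following is a minimal sketch of the conversion, assuming well-formed IOB2 input; the `Disease` label and the function name are illustrative and not taken from the paper.

```python
def iob2_to_iobes(tags):
    """Convert a list of IOB2 tags (e.g. 'B-Disease', 'I-Disease', 'O') to IOBES."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is an inside tag of the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            out.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            out.append(("I-" if continues else "E-") + label)
    return out


iob2 = ["O", "B-Disease", "I-Disease", "I-Disease", "O", "B-Disease", "O"]
print(iob2_to_iobes(iob2))
# ['O', 'B-Disease', 'I-Disease', 'E-Disease', 'O', 'S-Disease', 'O']
```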

Hierarchical Transformers for Multi-Document Summarization


Title	Hierarchical Transformers for Multi-Document Summarization
Authors	Yang Liu, Mirella Lapata
Abstract	In this paper, we develop a neural summarization model which can effectively process multiple input documents and distill abstractive summaries. Our model augments a previously proposed Transformer architecture with the ability to encode documents in a hierarchical manner. We represent cross-document relationships via an attention mechanism which allows information to be shared, as opposed to simply concatenating text spans and processing them as a flat sequence. Our model learns latent dependencies among textual units, but can also take advantage of explicit graph representations focusing on similarity or discourse relations. Empirical results on the WikiSum dataset demonstrate that the proposed architecture brings substantial improvements over several strong baselines.
Tasks	Document Summarization, Multi-Document Summarization
Published	2019-05-30
URL	https://arxiv.org/abs/1905.13164v1
PDF	https://arxiv.org/pdf/1905.13164v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-transformers-for-multi-document
Repo
Framework

Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning


Title	Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning
Authors	Georgios Papoudakis, Filippos Christianos, Arrasy Rahman, Stefano V. Albrecht
Abstract	Recent developments in deep reinforcement learning are concerned with creating decision-making agents which can perform well in various complex domains. A particular approach which has received increasing attention is multi-agent reinforcement learning, in which multiple agents learn concurrently to coordinate their actions. In such multi-agent environments, additional learning problems arise due to the continually changing decision-making policies of agents. This paper surveys recent works that address the non-stationarity problem in multi-agent deep reinforcement learning. The surveyed methods range from modifications in the training procedure, such as centralized training, to learning representations of the opponent’s policy, meta-learning, communication, and decentralized learning. The survey concludes with a list of open problems and possible lines of future research.
Tasks	Decision Making, Meta-Learning, Multi-agent Reinforcement Learning
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04737v1
PDF	https://arxiv.org/pdf/1906.04737v1.pdf
PWC	https://paperswithcode.com/paper/dealing-with-non-stationarity-in-multi-agent
Repo
Framework

Macross: Urban Dynamics Modeling based on Metapath Guided Cross-Modal Embedding


Title	Macross: Urban Dynamics Modeling based on Metapath Guided Cross-Modal Embedding
Authors	Yunan Zhang, Heting Gao, Tarek Abdelzaher
Abstract	As rapid urbanization proceeds at an ever-increasing speed, fully modeling urban dynamics becomes more and more challenging, but also a necessity for socioeconomic development. It is challenging because human activities and constructions are ubiquitous; urban landscape and life content change anywhere and anytime. It is crucial because only up-to-date urban dynamics can enable governors to optimize their city planning strategy and help individuals organize their daily lives more efficiently. Previous methods based on geographic topic models attempt to solve this problem but suffer from high computational cost and memory consumption, limiting their scalability to city-level applications. Also, strong prior assumptions make such models inherently unable to capture certain patterns. To bridge the gap, we propose Macross, a metapath-guided embedding approach to jointly model location, time, and text information. Given a dataset of geo-tagged social media posts, we extract and aggregate location and time and construct a heterogeneous information network using the aggregated space and time. A metapath2vec-based approach is used to construct vector representations for times, locations, and frequent words such that co-occurring pairs of nodes are closer in latent space. The vector representations are then used to infer related times, locations, or keywords for a user query. Experiments on enormous datasets show that our model generates query results of comparable, if not better, quality than state-of-the-art models and outperforms some cutting-edge models for activity recovery and classification.
Tasks
Published	2019-11-28
URL	https://arxiv.org/abs/1911.12866v1
PDF	https://arxiv.org/pdf/1911.12866v1.pdf
PWC	https://paperswithcode.com/paper/macross-urban-dynamics-modeling-based-on
Repo
Framework

Prediction of Soil Moisture Content Based On Satellite Data and Sequence-to-Sequence Networks


Title	Prediction of Soil Moisture Content Based On Satellite Data and Sequence-to-Sequence Networks
Authors	Natalia Efremova, Dmitry Zausaev, Gleb Antipov
Abstract	The main objective of this study is to combine remote sensing and machine learning to detect soil moisture content. Growing population and food consumption have led to the need to improve agricultural yield and to reduce wastage of natural resources. In this paper, we propose a neural network architecture, based on recent work by the research community, that can make a strong social impact and aid the United Nations Sustainable Development Goal of Zero Hunger. The main aims here are to improve efficiency of water usage, reduce dependence on irrigation, increase overall crop yield, and minimise the risk of crop loss due to drought and extreme weather conditions. We achieve this by applying satellite imagery, crop segmentation, soil classification, and NDVI and soil moisture prediction to satellite data, ground truth, and climate data records. By applying machine learning to sensor data and ground data, farm management systems can evolve into a real-time, AI-enabled platform that can provide actionable recommendations and decision support tools to farmers.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1907.03697v1
PDF	https://arxiv.org/pdf/1907.03697v1.pdf
PWC	https://paperswithcode.com/paper/prediction-of-soil-moisture-content-based-on
Repo
Framework
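
For readers unfamiliar with the NDVI mentioned in the abstract, it is the standard normalized difference of near-infrared and red reflectance, `(NIR - Red) / (NIR + Red)`. The sketch below computes it per pixel; the function names, the nested-list band layout, and the zero-signal convention are assumptions for illustration, and nothing here reflects the paper's prediction network.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: report 0.0 for a pixel with no signal instead of dividing by zero.
        return 0.0
    return (nir - red) / total


def ndvi_map(nir_band, red_band):
    """Apply ndvi() element-wise to two equally shaped 2-D bands (lists of rows)."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Dense vegetation reflects strongly in NIR and weakly in red, giving values near 1.
print(ndvi_map([[0.5, 0.2]], [[0.1, 0.2]]))
```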

Self-adaption grey DBSCAN clustering


Title	Self-adaption grey DBSCAN clustering
Authors	Shizhan Lu
Abstract	Clustering analysis, a classical issue in data mining, is widely used in various research areas. This article proposes a self-adaption grey DBSCAN clustering (SAG-DBSCAN) algorithm. First, the grey relational matrix is used to obtain the grey local density indicator. This indicator is then applied to perform self-adapting noise identification, which yields a dense subset of the clustering dataset. Finally, a DBSCAN that automatically selects its parameters is used to cluster the dense subset. Several frequently used datasets were used to demonstrate the performance and effectiveness of the proposed clustering algorithm and to compare the results with those of other state-of-the-art algorithms. The comprehensive comparisons indicate that our method has advantages over the other compared methods.
Tasks
Published	2019-12-24
URL	https://arxiv.org/abs/1912.11477v1
PDF	https://arxiv.org/pdf/1912.11477v1.pdf
PWC	https://paperswithcode.com/paper/self-adaption-grey-dbscan-clustering
Repo
Framework
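
For context, the plain DBSCAN procedure that SAG-DBSCAN builds on can be sketched as follows. This is ordinary DBSCAN with hand-picked `eps` and `min_pts`; the grey relational density indicator and the automatic parameter selection described in the abstract are not reproduced, and the sample points are made up.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

The result depends strongly on `eps` and `min_pts`, which is the sensitivity that a self-adapting variant aims to remove.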

Fine-grained Information Status Classification Using Discourse Context-Aware Self-Attention


Title	Fine-grained Information Status Classification Using Discourse Context-Aware Self-Attention
Authors	Yufang Hou
Abstract	Previous work on bridging anaphora recognition (Hou et al., 2013a) casts the problem as a subtask of learning fine-grained information status (IS). However, these systems heavily depend on many hand-crafted linguistic features. In this paper, we propose a discourse context-aware self-attention neural network model for fine-grained IS classification. On the ISNotes corpus (Markert et al., 2012), our model with the contextually-encoded word representations (BERT) (Devlin et al., 2018) achieves new state-of-the-art performances on fine-grained IS classification, obtaining a 4.1% absolute overall accuracy improvement compared to Hou et al. (2013a). More importantly, we also show an improvement of 3.9% F1 for bridging anaphora recognition without using any complex hand-crafted semantic features designed for capturing the bridging phenomenon.
Tasks
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04755v1
PDF	https://arxiv.org/pdf/1908.04755v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-information-status
Repo
Framework

Structure-Attentioned Memory Network for Monocular Depth Estimation


Title	Structure-Attentioned Memory Network for Monocular Depth Estimation
Authors	Jing Zhu, Yunxiao Shi, Mengwei Ren, Yi Fang, Kuo-Chin Lien, Junli Gu
Abstract	Monocular depth estimation is a challenging task that aims to predict a corresponding depth map from a given single RGB image. Recent deep learning models have been proposed to predict the depth from the image by learning the alignment of deep features between the RGB image and the depth domains. In this paper, we present a novel approach, named Structure-Attentioned Memory Network, to more effectively transfer domain features for monocular depth estimation by taking into account the common structure regularities (e.g., repetitive structure patterns, planar surfaces, symmetries) in domain adaptation. To this end, we introduce a new Structure-Oriented Memory (SOM) module to learn and memorize the structure-specific information between the RGB image domain and the depth domain. More specifically, in the SOM module, we develop a Memorable Bank of Filters (MBF) unit to learn a set of filters that memorize the structure-aware image-depth residual pattern, and also an Attention Guided Controller (AGC) unit to control the filter selection in the MBF given image feature queries. Given the query image feature, the trained SOM module is able to adaptively select the best customized filters for cross-domain feature transfer with an optimal structural disparity between image and depth. In summary, we focus on addressing this structure-specific domain adaptation challenge by proposing a novel end-to-end multi-scale memorable network for monocular depth estimation. The experiments show that our proposed model demonstrates superior performance compared to the existing supervised monocular depth estimation approaches on the challenging KITTI and NYU Depth V2 benchmarks.
Tasks	Depth Estimation, Domain Adaptation, Monocular Depth Estimation
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04594v1
PDF	https://arxiv.org/pdf/1909.04594v1.pdf
PWC	https://paperswithcode.com/paper/structure-attentioned-memory-network-for
Repo
Framework

Propagation Channel Modeling by Deep learning Techniques


Title	Propagation Channel Modeling by Deep learning Techniques
Authors	Shirin Seyedsalehi, Vahid Pourahmadi, Hamid Sheikhzadeh, Ali Hossein Gharari Foumani
Abstract	The channel, as the medium for the propagation of electromagnetic waves, is one of the most important parts of a communication system. Understanding how the channel affects the propagating waves is essential for the design, optimization, and performance analysis of a communication system. For this purpose, a proper channel model is needed. This paper presents a novel propagation channel model which considers the time-frequency response of the channel as an image. It models the distribution of these channel images using Deep Convolutional Generative Adversarial Networks. Moreover, for measurements with different user speeds, the user speed is considered as an auxiliary parameter for the model. StarGAN, an image-to-image translation technique, is used to change the generated channel images with respect to the desired user speed. The performance of the proposed model is evaluated using existing metrics. Furthermore, to capture 2D similarity in both time and frequency, a new metric is introduced. Using this metric, the generated channels show significant statistical similarity to the measurement data.
Tasks	Image-to-Image Translation
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06767v1
PDF	https://arxiv.org/pdf/1908.06767v1.pdf
PWC	https://paperswithcode.com/paper/propagation-channel-modeling-by-deep-learning
Repo
Framework

Approximation Capabilities of Neural ODEs and Invertible Residual Networks


Title	Approximation Capabilities of Neural ODEs and Invertible Residual Networks
Authors	Han Zhang, Xi Gao, Jacob Unterman, Tom Arodz
Abstract	Neural ODEs and i-ResNets are recently proposed methods for enforcing invertibility of residual neural models. Having a generic technique for constructing invertible models can open new avenues for advances in learning systems, but so far the question of whether Neural ODEs and i-ResNets can model any continuous invertible function has remained unresolved. Here, we show that both of these models are limited in their approximation capabilities. We then prove that any homeomorphism on a $p$-dimensional Euclidean space can be approximated by a Neural ODE operating on a $2p$-dimensional Euclidean space, and we prove a similar result for i-ResNets. We conclude by showing that capping a Neural ODE or an i-ResNet with a single linear layer is sufficient to turn the model into a universal approximator for non-invertible continuous functions.
Tasks
Published	2019-07-30
URL	https://arxiv.org/abs/1907.12998v2
PDF	https://arxiv.org/pdf/1907.12998v2.pdf
PWC	https://paperswithcode.com/paper/approximation-capabilities-of-neural-ordinary
Repo
Framework
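
A standard toy case gives some intuition for the dimension-doubling result with $p = 1$. The map $x \mapsto -x$ on the real line cannot be the flow of a one-dimensional ODE, because trajectories of such a flow cannot cross and so the order of points is preserved. Once the input is lifted to $2p = 2$ dimensions, a rotation by $\pi$ achieves it. The sketch below integrates a fixed rotation field with a hand-written RK4 solver; the field, the solver, and all names are assumptions for illustration and are not the paper's construction or a learned model.

```python
import math


def rotation_field(state):
    """Vector field dz/dt = (-y, x), whose flow rotates the plane counter-clockwise."""
    x, y = state
    return (-y, x)


def rk4_flow(field, state, t_end, steps=1000):
    """Integrate dz/dt = field(z) from t = 0 to t_end with classical RK4."""
    h = t_end / steps
    x, y = state
    for _ in range(steps):
        k1 = field((x, y))
        k2 = field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = field((x + h * k3[0], y + h * k3[1]))
        x += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return (x, y)


def negate_via_lifted_flow(x):
    """Lift x to (x, 0), flow for time pi, and read off the first coordinate."""
    x_out, _ = rk4_flow(rotation_field, (x, 0.0), math.pi)
    return x_out


for x in (-1.0, 0.5, 2.0):
    print(x, round(negate_via_lifted_flow(x), 6))
```

The extra coordinate gives trajectories room to pass around each other, which is impossible on the line.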

Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning


Title	Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning
Authors	Xuefeng Peng, Yi Ding, David Wihl, Omer Gottesman, Matthieu Komorowski, Li-wei H. Lehman, Andrew Ross, Aldo Faisal, Finale Doshi-Velez
Abstract	Sepsis is the leading cause of mortality in the ICU. It is challenging to manage because individual patients respond differently to treatment. Thus, tailoring treatment to the individual patient is essential for the best outcomes. In this paper, we take steps toward this goal by applying a mixture-of-experts framework to personalize sepsis treatment. The mixture model selectively alternates between neighbor-based (kernel) and deep reinforcement learning (DRL) experts depending on the patient's current history. On a large retrospective cohort, this mixture-based approach outperforms physician, kernel-only, and DRL-only experts.
Tasks
Published	2019-01-15
URL	http://arxiv.org/abs/1901.04670v1
PDF	http://arxiv.org/pdf/1901.04670v1.pdf
PWC	https://paperswithcode.com/paper/improving-sepsis-treatment-strategies-by
Repo
Framework