October 16, 2019

3052 words 15 mins read

Paper Group NAWR 33



DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding

Title DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding
Authors Min Li, Marina Danilevsky, Sara Noeman, Yunyao Li
Abstract Phonetic similarity algorithms identify words and phrases with similar pronunciation and are used in many natural language processing tasks. However, existing approaches are designed mainly for Indo-European languages and fail to capture the unique properties of Chinese pronunciation. In this paper, we propose a high-dimensional encoded phonetic similarity algorithm for Chinese, DIMSIM. The encodings are learned from annotated data to separately map initial and final phonemes into n-dimensional coordinates. Pinyin phonetic similarities are then calculated by aggregating the similarities of initial, final and tone. DIMSIM demonstrates a 7.5x improvement in mean reciprocal rank over state-of-the-art phonetic similarity approaches.
Tasks Spelling Correction
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-1043/
PDF https://www.aclweb.org/anthology/K18-1043
PWC https://paperswithcode.com/paper/dimsim-an-accurate-chinese-phonetic
Repo https://github.com/System-T/DimSim
Framework none
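
A minimal sketch of the DIMSIM idea: initials and finals live in learned n-dimensional coordinate spaces, and pinyin similarity aggregates initial, final, and tone distances. The coordinates and weights below are illustrative placeholders, not the encodings learned in the paper.

```python
# Toy pinyin distance in the DIMSIM style; coordinates are made up.
import math

INITIAL_COORDS = {"b": (0.0, 0.1), "p": (0.0, 0.3), "zh": (0.9, 0.5), "z": (0.8, 0.5)}
FINAL_COORDS = {"ang": (0.2, 0.7), "an": (0.2, 0.5), "eng": (0.4, 0.7)}

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pinyin_distance(p1, p2, w_initial=1.0, w_final=1.0, w_tone=0.2):
    """Distance between two (initial, final, tone) pinyin triples."""
    (i1, f1, t1), (i2, f2, t2) = p1, p2
    d_initial = euclidean(INITIAL_COORDS[i1], INITIAL_COORDS[i2])
    d_final = euclidean(FINAL_COORDS[f1], FINAL_COORDS[f2])
    d_tone = 0.0 if t1 == t2 else 1.0
    return w_initial * d_initial + w_final * d_final + w_tone * d_tone

# e.g. "zhang1" vs "zang1": close initials, identical final and tone
print(pinyin_distance(("zh", "ang", 1), ("z", "ang", 1)))
```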

Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement

Title Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement
Authors Nina Poerner, Hinrich Schütze, Benjamin Roth
Abstract The behavior of deep neural networks (DNNs) is hard to understand. This makes it necessary to explore post hoc explanation methods. We conduct the first comprehensive evaluation of explanation methods for NLP. To this end, we design two novel evaluation paradigms that cover two important classes of NLP problems: small context and large context problems. Both paradigms require no manual annotation and are therefore broadly applicable. We also introduce LIMSSE, an explanation method inspired by LIME that is designed for NLP. We show empirically that LIMSSE, LRP and DeepLIFT are the most effective explanation methods and recommend them for explaining DNNs in NLP.
Tasks Sentiment Analysis
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1032/
PDF https://www.aclweb.org/anthology/P18-1032
PWC https://paperswithcode.com/paper/evaluating-neural-network-explanation-methods-1
Repo https://github.com/ArrasL/LRP_for_LSTM
Framework none
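
A rough sketch of a LIMSSE-style explanation: sample random substrings of the input, query the model on each, and fit a linear surrogate whose weights serve as word relevances. The toy model is a stand-in for a real DNN, and the sampling details are simplified relative to the paper.

```python
# LIMSSE-flavored relevance estimation on a toy model (not the paper's code).
import random
import numpy as np

def toy_model(tokens):
    # placeholder "sentiment" score; a real DNN would go here
    return sum(1.0 for t in tokens if t == "good") - sum(1.0 for t in tokens if t == "bad")

def limsse_relevances(tokens, model, n_samples=500, max_len=5, seed=0):
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        length = rng.randint(1, max_len)
        start = rng.randint(0, max(0, len(tokens) - length))
        mask = np.zeros(len(tokens))
        mask[start:start + length] = 1.0
        X.append(mask)
        y.append(model(tokens[start:start + length]))
    # least-squares linear surrogate; its coefficients are the relevances
    coef, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return dict(zip(tokens, coef))

print(limsse_relevances("the movie was good not bad".split(), toy_model))
```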

A Reassessment of Reference-Based Grammatical Error Correction Metrics

Title A Reassessment of Reference-Based Grammatical Error Correction Metrics
Authors Shamil Chollampatt, Hwee Tou Ng
Abstract Several metrics have been proposed for evaluating grammatical error correction (GEC) systems based on grammaticality, fluency, and adequacy of the output sentences. Previous studies of the correlation of these metrics with human quality judgments were inconclusive, due to the lack of appropriate significance tests, discrepancies in the methods, and choice of datasets used. In this paper, we re-evaluate reference-based GEC metrics by measuring the system-level correlations with humans on a large dataset of human judgments of GEC outputs, and by properly conducting statistical significance tests. Our results show no significant advantage of GLEU over MaxMatch (M2), contradicting previous studies that claim GLEU to be superior. For a finer-grained analysis, we additionally evaluate these metrics for their agreement with human judgments at the sentence level. Our sentence-level analysis indicates that comparing GLEU and M2, one metric may be more useful than the other depending on the scenario. We further qualitatively analyze these metrics and our findings show that apart from being less interpretable and non-deterministic, GLEU also produces counter-intuitive scores in commonly occurring test examples.
Tasks Grammatical Error Correction, Machine Translation
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1231/
PDF https://www.aclweb.org/anthology/C18-1231
PWC https://paperswithcode.com/paper/a-reassessment-of-reference-based-grammatical
Repo https://github.com/nusnlp/gecmetrics
Framework none
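
A small sketch of the system-level methodology the paper advocates: correlate metric scores with human scores across GEC systems and attach a significance estimate, here via a simple bootstrap. The scores are made-up placeholders, and the bootstrap stands in for the paper's exact significance tests.

```python
# System-level correlation of a GEC metric with human judgments (toy data).
import numpy as np

human = np.array([0.62, 0.55, 0.71, 0.48, 0.66])   # hypothetical human scores
metric = np.array([0.33, 0.29, 0.41, 0.22, 0.38])  # hypothetical M2/GLEU scores

def pearson(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def bootstrap_ci(a, b, n_boot=10000, seed=0):
    rng = np.random.default_rng(seed)
    rs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(a), len(a))
        if np.std(a[idx]) > 0 and np.std(b[idx]) > 0:
            rs.append(pearson(a[idx], b[idx]))
    return np.percentile(rs, [2.5, 97.5])

print("r =", pearson(human, metric), "95% CI =", bootstrap_ci(human, metric))
```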

Facts That Matter

Title Facts That Matter
Authors Marco Ponza, Luciano Del Corro, Gerhard Weikum
Abstract This work introduces fact salience: the task of generating a machine-readable representation of the most prominent information in a text document as a set of facts. We also present SalIE, the first fact salience system. SalIE is unsupervised and knowledge-agnostic, based on open information extraction to detect facts in natural language text, PageRank to determine their relevance, and clustering to promote diversity. We compare SalIE with several baselines (including a positional baseline, standard for saliency tasks) and, in an extrinsic evaluation, with state-of-the-art automatic text summarizers. SalIE outperforms the baselines and text summarizers, showing that facts are an effective way to compress information.
Tasks Entity Linking, Open Information Extraction, Question Answering, Relation Extraction, Text Summarization
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1129/
PDF https://www.aclweb.org/anthology/D18-1129
PWC https://paperswithcode.com/paper/facts-that-matter
Repo https://github.com/mponza/SalIE
Framework none
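
A compact sketch of the scoring stage described above: facts extracted by an OpenIE system become graph nodes, edges connect facts that share arguments, and PageRank scores their salience. The facts and the similarity rule are illustrative; SalIE's actual graph construction is richer, and networkx is used here purely for convenience.

```python
# Fact salience via PageRank over a toy fact graph (illustrative only).
import networkx as nx

facts = [
    ("Marie Curie", "won", "the Nobel Prize"),
    ("Marie Curie", "discovered", "radium"),
    ("radium", "is", "a chemical element"),
    ("the ceremony", "took place in", "Stockholm"),
]

g = nx.Graph()
g.add_nodes_from(range(len(facts)))
for i, fi in enumerate(facts):
    for j in range(i + 1, len(facts)):
        # connect facts that share a subject or object string
        if {fi[0], fi[2]} & {facts[j][0], facts[j][2]}:
            g.add_edge(i, j)

scores = nx.pagerank(g)
for i in sorted(scores, key=scores.get, reverse=True):
    print(round(scores[i], 3), facts[i])
```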

Stochastic Capsule Endoscopy Image Enhancement

Title Stochastic Capsule Endoscopy Image Enhancement
Authors Ahmed Mohammed, Ivar Farup, Marius Pedersen, Øistein Hovde, and Sule Yildirim Yayilgan
Abstract Capsule endoscopy, which uses a wireless camera to take images of the digestive tract, is emerging as an alternative to traditional colonoscopy. The diagnostic value of these images depends on the quality of revealed underlying tissue surfaces. In this paper, we consider the problem of enhancing the visibility of detail and shadowed tissue surfaces for capsule endoscopy images. Using concentric circles at each pixel for random walks combined with stochastic sampling, the proposed method enhances the details of vessel and tissue surfaces. The framework decomposes the image into two detail layers that contain shadowed tissue surfaces and detail features. The target pixel value is recalculated for the smooth layer using the similarity of the target pixel to neighboring pixels, weighting against the total gradient variation and intensity differences. To evaluate the diagnostic image quality of the proposed method, we used a clinical subjective evaluation with rank ordering on selected images from the KID database and compared it to state-of-the-art enhancement methods. The results showed that the proposed method performs better in terms of diagnostic image quality, objective contrast metrics, and the structural similarity index.
Tasks Image Enhancement
Published 2018-06-06
URL https://www.mdpi.com/2313-433X/4/6/75
PDF https://res.mdpi.com/d_attachment/jimaging/jimaging-04-00075/article_deploy/jimaging-04-00075.pdf
PWC https://paperswithcode.com/paper/stochastic-capsule-endoscopy-image
Repo https://github.com/ahme0307/CCE
Framework none
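
A heavily simplified sketch of the sampling idea: each pixel is recomputed as a weighted average of neighbors drawn stochastically from concentric circles, weighted by intensity similarity. The paper's full decomposition into detail layers and its exact gradient-based weighting are not reproduced here.

```python
# Stochastic concentric-circle smoothing, a simplified stand-in for the paper's method.
import numpy as np

def stochastic_smooth(img, radii=(1, 2, 4), samples_per_circle=8,
                      sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    weight_sum = np.zeros_like(img, dtype=float)
    ys, xs = np.mgrid[0:h, 0:w]
    for r in radii:
        for _ in range(samples_per_circle):
            # sample a random point on the circle of radius r around every pixel
            theta = rng.uniform(0, 2 * np.pi)
            dy, dx = int(round(r * np.sin(theta))), int(round(r * np.cos(theta)))
            ny = np.clip(ys + dy, 0, h - 1)
            nx_ = np.clip(xs + dx, 0, w - 1)
            neighbor = img[ny, nx_]
            w_ = np.exp(-((neighbor - img) ** 2) / (2 * sigma ** 2))
            out += w_ * neighbor
            weight_sum += w_
    return out / np.maximum(weight_sum, 1e-8)

smoothed = stochastic_smooth(np.random.default_rng(1).random((64, 64)))
print(smoothed.shape)
```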

Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks

Title Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks
Authors Sambaran Bandyopadhyay, Harsh Kara, Aswin Kannan, M N Murty
Abstract The analysis and visualization of an information network are easier with an appropriate embedding of the network. Network embedding learns a compact, low-dimensional vector representation for each node of the network and uses this representation for different network analysis tasks. A majority of current embedding algorithms consider only the structure of the network. However, in most practical applications each node is associated with content that can help to understand the underlying semantics of the network, and it is not straightforward to integrate this content into current state-of-the-art network embedding methods. In this paper, we propose a non-negative matrix factorization based optimization framework, FSCNMF, which considers both the network structure and the content of the nodes while learning a lower-dimensional representation of each node in the network. Our approach regularizes structure based on content and vice versa, systematically exploiting the consistency between structure and content to the best possible extent. We further extend the basic FSCNMF to an advanced method, FSCNMF++, to capture higher-order proximities in the network. We conduct experiments on real-world information networks for different types of machine learning applications such as node clustering, visualization, and multi-class classification. The results show that our method represents the network significantly better than state-of-the-art algorithms and improves performance across all the applications we consider.
Tasks Network Embedding
Published 2018-04-15
URL https://arxiv.org/abs/1804.05313
PDF https://arxiv.org/pdf/1804.05313.pdf
PWC https://paperswithcode.com/paper/fusing-structure-and-content-via-non-negative
Repo https://github.com/benedekrozemberczki/FSCNMF
Framework none
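
A bare-bones sketch of the FSCNMF objective: factorize the adjacency matrix A ≈ B1·B2 and the content matrix C ≈ U·V, with a coupling term that pulls the structure embedding B1 toward the content embedding U. The updates here are simple projected least squares, not the paper's optimization scheme.

```python
# Coupled non-negative factorization of structure and content (sketch).
import numpy as np

def fscnmf(A, C, k=4, alpha=1.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n, f = A.shape[0], C.shape[1]
    B1, B2 = rng.random((n, k)), rng.random((k, n))
    U, V = rng.random((n, k)), rng.random((k, f))
    for _ in range(iters):
        # solve each factor in turn, with the B1 <-> U coupling, then project to >= 0
        B1 = np.clip(np.linalg.solve(B2 @ B2.T + alpha * np.eye(k),
                                     B2 @ A.T + alpha * U.T).T, 0, None)
        B2 = np.clip(np.linalg.lstsq(B1, A, rcond=None)[0], 0, None)
        U = np.clip(np.linalg.solve(V @ V.T + alpha * np.eye(k),
                                    V @ C.T + alpha * B1.T).T, 0, None)
        V = np.clip(np.linalg.lstsq(U, C, rcond=None)[0], 0, None)
    return B1  # node embeddings fusing structure and content

A = np.random.default_rng(1).integers(0, 2, (20, 20)).astype(float)
C = np.random.default_rng(2).random((20, 10))
print(fscnmf(A, C).shape)
```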

Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection

Title Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection
Authors Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam
Abstract This paper proposes a fast video salient object detection model based on a novel recurrent network architecture, named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM). A Pyramid Dilated Convolution (PDC) module is first designed for simultaneously extracting spatial features at multiple scales. These spatial features are then concatenated and fed into an extended Deeper Bidirectional ConvLSTM (DB-ConvLSTM) to learn spatiotemporal information. Forward and backward ConvLSTM units are placed in two layers and connected in a cascaded way, encouraging information flow between the bi-directional streams and leading to deeper feature extraction. We further augment DB-ConvLSTM with a PDC-like structure by adopting several dilated DB-ConvLSTMs to extract multi-scale spatiotemporal information. Extensive experimental results show that our method outperforms previous video saliency models by a large margin, with a real-time speed of 20 fps on a single GPU. With unsupervised video object segmentation as an example application, the proposed model (with a CRF-based post-process) achieves state-of-the-art results on two popular benchmarks, demonstrating its superior performance and high applicability.
Tasks Object Detection, Salient Object Detection, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Salient Object Detection, Video Semantic Segmentation
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Hongmei_Song_Pseudo_Pyramid_Deeper_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Hongmei_Song_Pseudo_Pyramid_Deeper_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/pyramid-dilated-deeper-convlstm-for-video
Repo https://github.com/shenjianbing/PDB-ConvLSTM
Framework caffe2
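
A sketch of the Pyramid Dilated Convolution (PDC) idea in PyTorch: parallel convolutions with increasing dilation rates extract multi-scale spatial features that are then concatenated. Channel sizes and dilation rates are illustrative; in the paper, this module feeds a bidirectional ConvLSTM.

```python
# Parallel dilated convolutions, concatenated along channels (PDC sketch).
import torch
import torch.nn as nn

class PDC(nn.Module):
    def __init__(self, in_ch=512, branch_ch=128, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):
        # padding == dilation keeps the spatial size identical in every branch
        return torch.cat([b(x) for b in self.branches], dim=1)

features = torch.randn(1, 512, 30, 30)   # e.g. a backbone feature map
print(PDC()(features).shape)             # -> torch.Size([1, 512, 30, 30])
```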

Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal?

Title Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal?
Authors Dingquan Li, Tingting Jiang, Weisi Lin, Ming Jiang
Abstract Image content variation is a typical and challenging problem in no-reference image quality assessment (NR-IQA). This work pays special attention to the impact of image content variation on NR-IQA methods. To better analyze this impact, we focus on blur-dominated distortions to exclude the impacts of distortion-type variations. We empirically show that current NR-IQA methods are inconsistent with human visual perception when predicting the relative quality of image pairs with different image contents. Since the deep semantic features of pretrained image classification neural networks always contain discriminative image content information, we put forward a new NR-IQA method based on semantic feature aggregation (SFA) to alleviate the impact of image content variation. Specifically, instead of resizing the image, we first crop multiple overlapping patches over the entire distorted image to avoid introducing geometric deformations. Then, according to an adaptive layer selection procedure, we extract deep semantic features by leveraging the power of a pretrained image classification model for its inherent content-aware property. After that, the local patch features are aggregated using several statistical structures. Finally, a linear regression model is trained for mapping the aggregated global features to image-quality scores. The proposed method, SFA, is compared with nine representative blur-specific NR-IQA methods, two general-purpose NR-IQA methods, and two extra full-reference IQA methods on Gaussian blur images (with and without Gaussian noise/JPEG compression) and realistic blur images from multiple databases, including LIVE, TID2008, TID2013, MLIVE1, MLIVE2, BID, and CLIVE. Experimental results show that SFA is superior to the state-of-the-art NR methods on all seven databases. It is also verified that deep semantic features play a crucial role in addressing image content variation, and this provides a new perspective for NR-IQA.
Tasks Blind Image Quality Assessment, Image Classification, Image Quality Assessment, Image Quality Estimation, No-Reference Image Quality Assessment
Published 2018-10-11
URL https://ieeexplore.ieee.org/document/8489929
PDF https://www.researchgate.net/publication/328240901_Which_Has_Better_Visual_Quality_The_Clear_Blue_Sky_or_a_Blurry_Animal
PWC https://paperswithcode.com/paper/which-has-better-visual-quality-the-clear
Repo https://github.com/lidq92/SFA
Framework pytorch
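
A condensed sketch of the SFA pipeline: crop overlapping patches (no resizing), extract deep semantic features from a pretrained classifier, aggregate patch features with simple statistics, and regress to quality scores. The backbone choice, layer, and statistics below are simplified stand-ins for the paper's adaptive layer selection and aggregation.

```python
# Patch features from a pretrained classifier -> statistics -> linear regression.
import torch
import torchvision.models as models
from sklearn.linear_model import LinearRegression

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the pooled semantic features
backbone.eval()

def image_feature(img, patch=224, stride=128):
    # img: (3, H, W) tensor; crop overlapping patches instead of resizing
    patches = [img[:, y:y + patch, x:x + patch]
               for y in range(0, img.shape[1] - patch + 1, stride)
               for x in range(0, img.shape[2] - patch + 1, stride)]
    with torch.no_grad():
        feats = backbone(torch.stack(patches))
    # aggregate local patch features with mean and std statistics
    return torch.cat([feats.mean(0), feats.std(0)]).numpy()

# hypothetical training data: images paired with subjective quality scores
imgs = [torch.rand(3, 384, 512) for _ in range(4)]
scores = [3.1, 4.2, 2.5, 3.8]
reg = LinearRegression().fit([image_feature(im) for im in imgs], scores)
print(reg.predict([image_feature(torch.rand(3, 384, 512))]))
```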

Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval

Title Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
Authors Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury
Abstract Constructing a joint representation invariant across different modalities (e.g., video, language) is of significant importance in many multimedia applications. While there are a number of recent successes in developing effective image-text retrieval methods by learning joint representations, the video-text retrieval task, in contrast, has not been explored to its fullest extent. In this paper, we study how to effectively utilize available multi-modal cues from videos for the cross-modal video-text retrieval task. Based on our analysis, we propose a novel framework that simultaneously utilizes multimodal features (different visual characteristics, audio inputs, and text) by a fusion strategy for efficient retrieval. Furthermore, we explore several loss functions in training the joint embedding and propose a modified pairwise ranking loss for the retrieval task. Experiments on MSVD and MSR-VTT datasets demonstrate that our method achieves significant performance gain compared to the state-of-the-art approaches.
Tasks Video Retrieval
Published 2018-06-11
URL https://dl.acm.org/citation.cfm?id=3206064
PDF http://www.cs.cmu.edu/~fmetze/interACT/Publications_files/publications/ICMR2018_Camera_Ready.pdf
PWC https://paperswithcode.com/paper/learning-joint-embedding-with-multimodal-cues
Repo https://github.com/niluthpol/multimodal_vtt
Framework pytorch
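
An illustrative bidirectional max-margin ranking loss for joint video-text embeddings. The paper proposes a modified pairwise ranking loss; the hardest-negative variant below is shown as one common such modification, not necessarily the paper's exact form.

```python
# Bidirectional max-margin ranking loss with hardest negatives (sketch).
import torch

def ranking_loss(video_emb, text_emb, margin=0.2):
    # video_emb, text_emb: (batch, dim), L2-normalized
    sim = video_emb @ text_emb.t()                # cosine similarities
    pos = sim.diag().unsqueeze(1)                 # matched pairs on the diagonal
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    # hardest negative per row (text retrieval) and per column (video retrieval)
    cost_t = (margin + sim - pos).masked_fill(mask, -1e9).max(1).values
    cost_v = (margin + sim - pos.t()).masked_fill(mask, -1e9).max(0).values
    return (cost_t.clamp(min=0) + cost_v.clamp(min=0)).mean()

v = torch.nn.functional.normalize(torch.randn(8, 256), dim=1)
t = torch.nn.functional.normalize(torch.randn(8, 256), dim=1)
print(ranking_loss(v, t))
```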

Question Condensing Networks for Answer Selection in Community Question Answering

Title Question Condensing Networks for Answer Selection in Community Question Answering
Authors Wei Wu, Xu Sun, Houfeng Wang
Abstract Answer selection is an important subtask of community question answering (CQA). In a real-world CQA forum, a question is often represented as two parts: a subject that summarizes the main points of the question, and a body that elaborates on the subject in detail. Previous research on answer selection usually ignored the difference between these two parts and concatenated them as the question representation. In this paper, we propose Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions. In our model, the question subject is the primary part of the question representation, and the question body information is aggregated based on similarity and disparity with the question subject. Experimental results show that QCN outperforms all existing models on two CQA datasets.
Tasks Answer Selection, Community Question Answering, Question Answering
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1162/
PDF https://www.aclweb.org/anthology/P18-1162
PWC https://paperswithcode.com/paper/question-condensing-networks-for-answer
Repo https://github.com/pku-wuwei/QCN
Framework none
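
A toy sketch of the question-condensing intuition: treat the subject as the primary representation and aggregate body words by their affinity to it. The real QCN learns attention over both similarity and disparity features; the random embeddings and softmax pooling here are stand-ins.

```python
# Condensing a question body against its subject (toy version).
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.standard_normal(50) for w in
       ["how", "install", "driver", "my", "printer",
        "fails", "after", "windows", "update"]}

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

subject = ["how", "install", "driver"]
body = ["my", "printer", "fails", "after", "windows", "update"]

subj_vec = np.mean([emb[w] for w in subject], axis=0)
body_mat = np.stack([emb[w] for w in body])
# attend to body words by similarity with the subject representation
att = softmax(body_mat @ subj_vec)
condensed = np.concatenate([subj_vec, att @ body_mat])  # subject + condensed body
print(condensed.shape, dict(zip(body, att.round(2))))
```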

EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection

Title EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection
Authors Yaqing Wang, Fenglong Ma, Zhiwei Jin, Ye Yuan, Guangxu Xun, Kishlay Jha, Lu Su, Jing Gao
Abstract As news reading on social media becomes more and more popular, fake news has become a major issue of concern to the public and government. Fake news can take advantage of multimedia content to mislead readers and spread widely, which can cause negative effects or even manipulate public events. One of the unique challenges for fake news detection on social media is identifying fake news on newly emerged events. Unfortunately, most existing approaches can hardly handle this challenge, since they tend to learn event-specific features that cannot be transferred to unseen events. To address this issue, we propose an end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events. It consists of three main components: the multi-modal feature extractor, the fake news detector, and the event discriminator. The multi-modal feature extractor is responsible for extracting the textual and visual features from posts. It cooperates with the fake news detector to learn a discriminable representation for the detection of fake news. The role of the event discriminator is to remove event-specific features and keep features shared among events. Extensive experiments are conducted on multimedia datasets collected from Weibo and Twitter. The experimental results show that our proposed EANN model outperforms state-of-the-art methods and learns transferable feature representations.
Tasks Fake News Detection, Sentence Classification
Published 2018-08-19
URL https://dl.acm.org/citation.cfm?id=3219819.3219903
PDF https://dl.acm.org/ft_gateway.cfm?id=3219903&ftid=1988763&dwn=1&CFID=96862880&CFTOKEN=3e28747d4422e5ed-9058E945-9FB8-637C-70D2E207619AE1AF
PWC https://paperswithcode.com/paper/eann-event-adversarial-neural-networks-for
Repo https://github.com/yaqingwang/EANN-KDD18
Framework pytorch
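
A minimal PyTorch sketch of the adversarial component: a gradient reversal layer lets the event discriminator train normally while pushing the feature extractor toward event-invariant features. The network sizes are placeholders; the real model has multi-modal (text plus image) extractors.

```python
# Gradient reversal for event-adversarial training (sketch).
import torch
from torch.autograd import Function

class GradReverse(Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # flip the gradient flowing back into the feature extractor
        return -ctx.lam * grad_output, None

features = torch.randn(16, 64, requires_grad=True)   # from the extractor
event_clf = torch.nn.Linear(64, 5)                   # event discriminator
fake_clf = torch.nn.Linear(64, 2)                    # fake news detector

event_loss = torch.nn.functional.cross_entropy(
    event_clf(GradReverse.apply(features, 1.0)), torch.randint(0, 5, (16,)))
fake_loss = torch.nn.functional.cross_entropy(
    fake_clf(features), torch.randint(0, 2, (16,)))
(event_loss + fake_loss).backward()
print(features.grad.shape)   # extractor gradients mix both objectives
```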

Time Expressions in Mental Health Records for Symptom Onset Extraction

Title Time Expressions in Mental Health Records for Symptom Onset Extraction
Authors Natalia Viani, Lucia Yin, Joyce Kam, Ayunni Alawi, André Bittar, Rina Dutta, Rashmi Patel, Robert Stewart, Sumithra Velupillai
Abstract For psychiatric disorders such as schizophrenia, longer durations of untreated psychosis are associated with worse intervention outcomes. Data included in electronic health records (EHRs) can be useful for retrospective clinical studies, but much of this is stored as unstructured text which cannot be directly used in computation. Natural Language Processing (NLP) methods can be used to extract this data, in order to identify symptoms and treatments from mental health records, and temporally anchor the first emergence of these. We are developing an EHR corpus annotated with time expressions, clinical entities and their relations, to be used for NLP development. In this study, we focus on the first step, identifying time expressions in EHRs for patients with schizophrenia. We developed a gold standard corpus, compared this corpus to other related corpora in terms of content and time expression prevalence, and adapted two NLP systems for extracting time expressions. To the best of our knowledge, this is the first resource annotated for temporal entities in the mental health domain.
Tasks Temporal Information Extraction
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-5621/
PDF https://www.aclweb.org/anthology/W18-5621
PWC https://paperswithcode.com/paper/time-expressions-in-mental-health-records-for
Repo https://github.com/medesto/systems-adaptation
Framework none
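
A deliberately simple, rule-based sketch of the task: surfacing time expressions in clinical narrative. The paper adapts full NLP systems; these regexes only hint at the pattern classes such systems cover, and overlapping matches are not merged.

```python
# Naive regex-based time expression spotting (illustrative only).
import re

PATTERNS = [
    r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",                          # 12/03/2014
    r"\b(?:19|20)\d{2}\b",                                   # bare years
    r"\b(?:last|next|this)\s+(?:week|month|year)\b",         # relative expressions
    r"\b(?:\d+|a|two|three|few)\s+(?:days?|weeks?|months?|years?)\s+ago\b",
]

def find_timex(text):
    spans = []
    for pat in PATTERNS:
        spans += [(m.start(), m.end(), m.group()) for m in
                  re.finditer(pat, text, flags=re.IGNORECASE)]
    return sorted(spans)

note = "First heard voices two years ago; admitted 12/03/2014, worse since last month."
print(find_timex(note))
```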

SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology

Title SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology
Authors Md Iftekhar Tanveer, Ferhan Ture
Abstract This paper describes SyntaViz, a visualization interface specifically designed for analyzing natural-language queries that were created by users of a voice-enabled product. SyntaViz provides a platform for browsing the ontology of user queries from a syntax-driven perspective, providing quick access to high-impact failure points of the existing intent understanding system and evidence for data-driven decisions in the development cycle. A case study on Xfinity X1 (a voice-enabled entertainment platform from Comcast) reveals that SyntaViz helps developers identify multiple action items in a short amount of time without any special training. SyntaViz has been open-sourced for the benefit of the community.
Tasks Sentiment Analysis, Topic Models
Published 2018-11-01
URL https://www.aclweb.org/anthology/D18-2001/
PDF https://www.aclweb.org/anthology/D18-2001
PWC https://paperswithcode.com/paper/syntaviz-visualizing-voice-queries-through-a
Repo https://github.com/Comcast/SyntaViz
Framework tf
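
A tiny sketch of SyntaViz's organizing principle: group voice queries under a syntax-driven hierarchy, here approximated by the dependency root and its direct object via spaCy (the en_core_web_sm model is assumed to be installed). The production system builds a much deeper ontology.

```python
# Grouping queries by (root verb, direct object) as a shallow syntax ontology.
from collections import defaultdict
import spacy

nlp = spacy.load("en_core_web_sm")
queries = ["watch free movies", "watch the news", "record the game tonight",
           "record my show", "turn on subtitles"]

hierarchy = defaultdict(list)
for q in queries:
    doc = nlp(q)
    root = next(t for t in doc if t.dep_ == "ROOT")
    dobj = next((t.text for t in root.children if t.dep_ == "dobj"), "-")
    hierarchy[(root.lemma_, dobj)].append(q)

for key, qs in sorted(hierarchy.items()):
    print(key, "->", qs)
```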

Semantic Supersenses for English Possessives

Title Semantic Supersenses for English Possessives
Authors Austin Blodgett, Nathan Schneider
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1242/
PDF https://www.aclweb.org/anthology/L18-1242
PWC https://paperswithcode.com/paper/semantic-supersenses-for-english-possessives
Repo https://github.com/nert-gu/streusle
Framework none

Multi-Cue Correlation Filters for Robust Visual Tracking

Title Multi-Cue Correlation Filters for Robust Visual Tracking
Authors Ning Wang, Wengang Zhou, Qi Tian, Richang Hong, Meng Wang, Houqiang Li
Abstract In recent years, many tracking algorithms have achieved impressive performance by fusing multiple types of features; however, most of them fail to fully explore the context among the adopted features and their individual strengths. In this paper, we propose an efficient multi-cue analysis framework for robust visual tracking. By combining different types of features, our approach constructs multiple experts through Discriminative Correlation Filters (DCF), each of which tracks the target independently. With the proposed robustness evaluation strategy, the most suitable expert is selected for tracking in each frame. Furthermore, the divergence of the multiple experts reveals the reliability of the current tracking, which is quantified to update the experts adaptively and keep them from corruption. Through the proposed multi-cue analysis, our tracker with a standard DCF and deep features achieves outstanding results on several challenging benchmarks: OTB-2013, OTB-2015, Temple-Color and VOT 2016. On the other hand, when evaluated with only simple hand-crafted features, our method demonstrates performance comparable to complex non-realtime trackers but with much better efficiency, running at 45 FPS on a CPU.
Tasks Visual Tracking
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Multi-Cue_Correlation_Filters_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Multi-Cue_Correlation_Filters_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/multi-cue-correlation-filters-for-robust
Repo https://github.com/594422814/MCCT
Framework none
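
A schematic sketch of the expert-selection step: several trackers (each a DCF over a different feature set) propose boxes, and the expert whose box agrees most with the others is trusted for the frame. IoU-based agreement stands in for the paper's full robustness evaluation.

```python
# Selecting the most reliable tracking expert by pairwise box agreement.
def iou(a, b):
    # boxes as (x, y, w, h)
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def select_expert(boxes):
    # reliability of each expert = mean overlap with all other experts
    scores = [sum(iou(b, o) for o in boxes if o is not b) / (len(boxes) - 1)
              for b in boxes]
    best = max(range(len(boxes)), key=scores.__getitem__)
    return best, scores

boxes = [(100, 80, 40, 60), (104, 82, 40, 58), (160, 90, 40, 60)]
print(select_expert(boxes))   # experts 0 and 1 agree; expert 2 drifted
```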