Paper Group NAWR 33
DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding
Title | DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding |
Authors | Min Li, Marina Danilevsky, Sara Noeman, Yunyao Li |
Abstract | Phonetic similarity algorithms identify words and phrases with similar pronunciation and are used in many natural language processing tasks. However, existing approaches are designed mainly for Indo-European languages and fail to capture the unique properties of Chinese pronunciation. In this paper, we propose a high dimensional encoded phonetic similarity algorithm for Chinese, DIMSIM. The encodings are learned from annotated data to separately map initial and final phonemes into n-dimensional coordinates. Pinyin phonetic similarities are then calculated by aggregating the similarities of initial, final and tone. DIMSIM demonstrates a 7.5X improvement in mean reciprocal rank over state-of-the-art phonetic similarity approaches. |
Tasks | Spelling Correction |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-1043/ |
PWC | https://paperswithcode.com/paper/dimsim-an-accurate-chinese-phonetic |
Repo | https://github.com/System-T/DimSim |
Framework | none |
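The aggregation described in the abstract lends itself to a compact illustration. Below is a toy Python sketch of pinyin distance as a weighted sum of initial, final, and tone distances; the 2-D coordinates and weights are invented placeholders, not the learned high-dimensional encodings from the paper.

```python
# Toy sketch of the DIMSIM aggregation idea. The 2-D coordinates and the
# weights are invented placeholders, NOT the paper's learned encodings.
import math

INITIALS = {"zh": (0.0, 0.0), "z": (0.2, 0.0), "ch": (0.0, 1.0), "c": (0.2, 1.0)}
FINALS = {"ang": (0.0, 0.0), "an": (0.0, 0.5), "eng": (1.0, 0.0)}

def pinyin_distance(a, b, w_initial=1.0, w_final=1.0, w_tone=0.5):
    """a, b are (initial, final, tone) triples, e.g. ("zh", "ang", 1)."""
    d_initial = math.dist(INITIALS[a[0]], INITIALS[b[0]])
    d_final = math.dist(FINALS[a[1]], FINALS[b[1]])
    d_tone = abs(a[2] - b[2])
    return w_initial * d_initial + w_final * d_final + w_tone * d_tone

print(pinyin_distance(("zh", "ang", 1), ("z", "ang", 1)))   # similar sounds
print(pinyin_distance(("zh", "ang", 1), ("ch", "eng", 4)))  # dissimilar
```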
Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement
Title | Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement |
Authors | Nina Poerner, Hinrich Schütze, Benjamin Roth |
Abstract | The behavior of deep neural networks (DNNs) is hard to understand. This makes it necessary to explore post hoc explanation methods. We conduct the first comprehensive evaluation of explanation methods for NLP. To this end, we design two novel evaluation paradigms that cover two important classes of NLP problems: small context and large context problems. Both paradigms require no manual annotation and are therefore broadly applicable. We also introduce LIMSSE, an explanation method inspired by LIME that is designed for NLP. We show empirically that LIMSSE, LRP and DeepLIFT are the most effective explanation methods and recommend them for explaining DNNs in NLP. |
Tasks | Sentiment Analysis |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1032/ |
PWC | https://paperswithcode.com/paper/evaluating-neural-network-explanation-methods-1 |
Repo | https://github.com/ArrasL/LRP_for_LSTM |
Framework | none |
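LIMSSE's core move, sampling contiguous substrings rather than arbitrary word subsets, is easy to prototype. A minimal sketch under stated assumptions: `f` is any black-box scorer over word lists, and a plain least-squares fit stands in for LIME's weighted surrogate regression.

```python
# LIME-style substring sampling in the spirit of LIMSSE: query the model on
# random contiguous substrings and fit a linear surrogate whose weights act
# as word relevance scores. The toy scorer below is a stand-in.
import random
import numpy as np

def explain(words, f, n_samples=500, max_len=5, seed=0):
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        start = rng.randrange(len(words))
        length = rng.randint(1, max_len)
        span = range(start, min(start + length, len(words)))
        X.append([1.0 if i in span else 0.0 for i in range(len(words))])
        y.append(f(words[span.start:span.stop]))
    # Least-squares surrogate; coefficients approximate word relevance.
    coef, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return list(zip(words, coef.round(3)))

toy_model = lambda ws: float(sum(w in {"great", "good"} for w in ws))
print(explain("the movie was great but too long".split(), toy_model))
```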
A Reassessment of Reference-Based Grammatical Error Correction Metrics
Title | A Reassessment of Reference-Based Grammatical Error Correction Metrics |
Authors | Shamil Chollampatt, Hwee Tou Ng |
Abstract | Several metrics have been proposed for evaluating grammatical error correction (GEC) systems based on grammaticality, fluency, and adequacy of the output sentences. Previous studies of the correlation of these metrics with human quality judgments were inconclusive, due to the lack of appropriate significance tests, discrepancies in the methods, and choice of datasets used. In this paper, we re-evaluate reference-based GEC metrics by measuring the system-level correlations with humans on a large dataset of human judgments of GEC outputs, and by properly conducting statistical significance tests. Our results show no significant advantage of GLEU over MaxMatch (M2), contradicting previous studies that claim GLEU to be superior. For a finer-grained analysis, we additionally evaluate these metrics for their agreement with human judgments at the sentence level. Our sentence-level analysis indicates that, between GLEU and M2, one metric may be more useful than the other depending on the scenario. We further qualitatively analyze these metrics, and our findings show that apart from being less interpretable and non-deterministic, GLEU also produces counter-intuitive scores on commonly occurring test examples. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1231/ |
PWC | https://paperswithcode.com/paper/a-reassessment-of-reference-based-grammatical |
Repo | https://github.com/nusnlp/gecmetrics |
Framework | none |
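The evaluation setup, system-level correlation between metric scores and human judgments plus a significance test, can be sketched generically. This is an illustrative bootstrap over systems, not necessarily the exact test the authors conducted, and all per-system scores below are made up.

```python
# Compare two metrics by the difference of their system-level Pearson
# correlations with human judgments, with bootstrap resampling over systems
# as a generic significance check (hypothetical numbers throughout).
import numpy as np

def pearson(a, b):
    return np.corrcoef(a, b)[0, 1]

def bootstrap_corr_diff(human, metric1, metric2, n_boot=5000, seed=0):
    human, m1, m2 = map(np.asarray, (human, metric1, metric2))
    rng = np.random.default_rng(seed)
    diffs = []
    while len(diffs) < n_boot:
        idx = rng.integers(0, len(human), len(human))
        if np.ptp(human[idx]) == 0:        # degenerate resample; redraw
            continue
        diffs.append(pearson(human[idx], m1[idx]) - pearson(human[idx], m2[idx]))
    diffs = np.array(diffs)
    p = min(1.0, 2 * min((diffs <= 0).mean(), (diffs >= 0).mean()))
    return diffs.mean(), p

human = [0.62, 0.55, 0.71, 0.48, 0.66]   # hypothetical human scores
gleu  = [0.40, 0.35, 0.52, 0.30, 0.45]   # hypothetical GLEU per system
m2    = [0.41, 0.38, 0.49, 0.33, 0.47]   # hypothetical M2 per system
print(bootstrap_corr_diff(human, gleu, m2))
```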
Facts That Matter
Title | Facts That Matter |
Authors | Marco Ponza, Luciano Del Corro, Gerhard Weikum |
Abstract | This work introduces fact salience: the task of generating a machine-readable representation of the most prominent information in a text document as a set of facts. We also present SalIE, the first fact salience system. SalIE is unsupervised and knowledge agnostic, based on open information extraction to detect facts in natural language text, PageRank to determine their relevance, and clustering to promote diversity. We compare SalIE with several baselines (including the positional baseline, which is standard for salience tasks) and, in an extrinsic evaluation, with state-of-the-art automatic text summarizers. SalIE outperforms the baselines and the text summarizers, showing that facts are an effective way to compress information. |
Tasks | Entity Linking, Open Information Extraction, Question Answering, Relation Extraction, Text Summarization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1129/ |
PWC | https://paperswithcode.com/paper/facts-that-matter |
Repo | https://github.com/mponza/SalIE |
Framework | none |
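The relevance stage, PageRank over a graph of facts, is straightforward to sketch. The snippet below assumes the facts were already extracted by an upstream Open IE system, uses Jaccard word overlap as a stand-in edge weight, and omits the clustering step that promotes diversity.

```python
# Minimal sketch of SalIE's ranking stage: similarity-weighted fact graph
# plus PageRank for relevance (clustering for diversity omitted).
import networkx as nx

facts = [
    ("Marie Curie", "won", "the Nobel Prize"),
    ("Curie", "discovered", "polonium"),
    ("Curie", "discovered", "radium"),
    ("The prize", "was awarded in", "1903"),
]

def jaccard(f1, f2):
    w1 = set(" ".join(f1).lower().split())
    w2 = set(" ".join(f2).lower().split())
    return len(w1 & w2) / len(w1 | w2)

G = nx.Graph()
G.add_nodes_from(range(len(facts)))
for i in range(len(facts)):
    for j in range(i + 1, len(facts)):
        sim = jaccard(facts[i], facts[j])
        if sim > 0:
            G.add_edge(i, j, weight=sim)

scores = nx.pagerank(G, weight="weight")   # fact relevance
for i in sorted(scores, key=scores.get, reverse=True):
    print(f"{scores[i]:.3f}  {' '.join(facts[i])}")
```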
Stochastic Capsule Endoscopy Image Enhancement
Title | Stochastic Capsule Endoscopy Image Enhancement |
Authors | Ahmed Mohammed, Ivar Farup, Marius Pedersen, Øistein Hovde, Sule Yildirim Yayilgan |
Abstract | Capsule endoscopy, which uses a wireless camera to take images of the digestive tract, is emerging as an alternative to traditional colonoscopy. The diagnostic value of these images depends on the quality of the revealed underlying tissue surfaces. In this paper, we consider the problem of enhancing the visibility of detail and shadowed tissue surfaces in capsule endoscopy images. Using concentric circles at each pixel for random walks combined with stochastic sampling, the proposed method enhances the details of vessel and tissue surfaces. The framework decomposes the image into two detail layers that contain shadowed tissue surfaces and detail features. The target pixel value is recalculated for the smooth layer using the similarity of the target pixel to neighboring pixels, weighting against the total gradient variation and intensity differences. To evaluate the diagnostic image quality of the proposed method, we used a clinical subjective evaluation with rank ordering on a selected KID image database and compared the method to state-of-the-art enhancement methods. The results showed that the proposed method performs better in terms of diagnostic image quality, objective contrast metrics, and the structural similarity index. |
Tasks | Image Enhancement |
Published | 2018-06-06 |
URL | https://www.mdpi.com/2313-433X/4/6/75 |
PDF | https://res.mdpi.com/d_attachment/jimaging/jimaging-04-00075/article_deploy/jimaging-04-00075.pdf |
PWC | https://paperswithcode.com/paper/stochastic-capsule-endoscopy-image |
Repo | https://github.com/ahme0307/CCE |
Framework | none |
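The base/detail decomposition at the heart of the method can be illustrated with a deliberately simplified stand-in: a Gaussian filter replaces the paper's stochastic random-walk smoothing, purely to show the split-and-boost structure.

```python
# Simplified stand-in for the paper's layer decomposition: split the image
# into a smooth base layer and a detail layer, then amplify the detail.
# (The paper's random-walk smoothing is replaced by a Gaussian filter.)
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance(img, sigma=3.0, boost=1.8):
    img = img.astype(np.float64)
    base = gaussian_filter(img, sigma=sigma)   # smooth layer
    detail = img - base                        # shadowed-detail layer
    out = base + boost * detail                # amplify fine structure
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy usage on a synthetic image.
img = (np.random.rand(64, 64) * 255).astype(np.uint8)
print(enhance(img).shape)
```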
Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks
Title | Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks |
Authors | Sambaran Bandyopadhyay, Harsh Kara, Aswin Kannan, M N Murty |
Abstract | Analysis and visualization of an information network can be facilitated better using an appropriate embedding of the network. Network embedding learns a compact low-dimensional vector representation for each node of the network and uses this lower-dimensional representation for different network analysis tasks. A majority of current embedding algorithms consider only the structure of the network. However, in most practical applications some content is associated with each node, and this content can help in understanding the underlying semantics of the network. It is not straightforward to integrate the content of each node into current state-of-the-art network embedding methods. In this paper, we propose a non-negative matrix factorization based optimization framework, namely FSCNMF, which considers both the network structure and the content of the nodes while learning a lower-dimensional representation of each node in the network. Our approach systematically regularizes structure based on content and vice versa, to exploit the consistency between structure and content to the best possible extent. We further extend the basic FSCNMF to an advanced method, namely FSCNMF++, to capture higher-order proximities in the network. We conduct experiments on real-world information networks for different types of machine learning applications such as node clustering, visualization, and multi-class classification. The results show that our method can represent the network significantly better than the state-of-the-art algorithms and improve performance across all the applications that we consider. |
Tasks | Network Embedding |
Published | 2018-04-15 |
URL | https://arxiv.org/abs/1804.05313 |
PDF | https://arxiv.org/pdf/1804.05313.pdf |
PWC | https://paperswithcode.com/paper/fusing-structure-and-content-via-non-negative |
Repo | https://github.com/benedekrozemberczki/FSCNMF |
Framework | none |
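The coupled factorization can be written down compactly. The sketch below uses plain projected gradient descent on random toy data; the authors' actual update rules and hyperparameters may well differ.

```python
# Bare-bones FSCNMF-style coupled factorization (not the authors' exact
# updates): factor the adjacency matrix A ~ B1 @ B2 and the content matrix
# C ~ U1 @ U2 while pulling B1 and U1 toward each other, so structure
# regularizes content and vice versa.
import numpy as np

def fscnmf_sketch(A, C, d=8, alpha=1.0, lr=1e-3, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    n, m = A.shape[0], C.shape[1]
    B1, B2 = rng.random((n, d)), rng.random((d, n))
    U1, U2 = rng.random((n, d)), rng.random((d, m))
    for _ in range(iters):
        B1 -= lr * (-2 * (A - B1 @ B2) @ B2.T + 2 * alpha * (B1 - U1))
        B2 -= lr * (-2 * B1.T @ (A - B1 @ B2))
        U1 -= lr * (-2 * (C - U1 @ U2) @ U2.T + 2 * alpha * (U1 - B1))
        U2 -= lr * (-2 * U1.T @ (C - U1 @ U2))
        for M in (B1, B2, U1, U2):
            np.maximum(M, 0.0, out=M)    # non-negativity projection
    return B1                            # node embeddings

rng = np.random.default_rng(1)
A = rng.random((20, 20)); A = (A + A.T) / 2   # toy symmetric adjacency
C = rng.random((20, 30))                      # toy node-content matrix
print(fscnmf_sketch(A, C).shape)              # (20, 8)
```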
Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection
Title | Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection |
Authors | Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam |
Abstract | This paper proposes a fast video salient object detection model based on a novel recurrent network architecture named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM). A Pyramid Dilated Convolution (PDC) module is first designed to simultaneously extract spatial features at multiple scales. These spatial features are then concatenated and fed into an extended Deeper Bidirectional ConvLSTM (DB-ConvLSTM) to learn spatiotemporal information. Forward and backward ConvLSTM units are placed in two layers and connected in a cascaded way, encouraging information flow between the bi-directional streams and leading to deeper feature extraction. We further augment DB-ConvLSTM with a PDC-like structure by adopting several dilated DB-ConvLSTMs to extract multi-scale spatiotemporal information. Extensive experimental results show that our method outperforms previous video saliency models by a large margin, with a real-time speed of 20 fps on a single GPU. With unsupervised video object segmentation as an example application, the proposed model (with a CRF-based post-process) achieves state-of-the-art results on two popular benchmarks, demonstrating its superior performance and high applicability. |
Tasks | Object Detection, Salient Object Detection, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Salient Object Detection, Video Semantic Segmentation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Hongmei_Song_Pseudo_Pyramid_Deeper_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Hongmei_Song_Pseudo_Pyramid_Deeper_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/pyramid-dilated-deeper-convlstm-for-video |
Repo | https://github.com/shenjianbing/PDB-ConvLSTM |
Framework | caffe2 |
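The PDC module, parallel dilated convolutions whose outputs are concatenated, is simple to express in PyTorch. The channel sizes and dilation rates below are illustrative, not the paper's exact configuration.

```python
# Minimal PyTorch sketch of a Pyramid Dilated Convolution (PDC) module:
# parallel 3x3 convolutions at several dilation rates, concatenated to
# capture multi-scale spatial context.
import torch
import torch.nn as nn

class PDC(nn.Module):
    def __init__(self, in_ch=512, branch_ch=128, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):
        # padding == dilation keeps every branch at the same spatial size,
        # so the outputs can be concatenated along the channel axis.
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(1, 512, 32, 32)   # e.g. a backbone feature map
print(PDC()(x).shape)             # torch.Size([1, 512, 32, 32])
```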
Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal?
Title | Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal? |
Authors | Dingquan Li, Tingting Jiang, Weisi Lin, Ming Jiang |
Abstract | Image content variation is a typical and challenging problem in no-reference image quality assessment (NR-IQA). This work pays special attention to the impact of image content variation on NR-IQA methods. To better analyze this impact, we focus on blur-dominated distortions to exclude the impact of distortion-type variations. We empirically show that current NR-IQA methods are inconsistent with human visual perception when predicting the relative quality of image pairs with different image contents. Since the deep semantic features of pretrained image classification neural networks always contain discriminative image content information, we put forward a new NR-IQA method based on semantic feature aggregation (SFA) to alleviate the impact of image content variation. Specifically, instead of resizing the image, we first crop multiple overlapping patches over the entire distorted image to avoid introducing geometric deformations. Then, according to an adaptive layer selection procedure, we extract deep semantic features by leveraging the power of a pretrained image classification model for its inherent content-aware property. After that, the local patch features are aggregated using several statistical structures. Finally, a linear regression model is trained to map the aggregated global features to image-quality scores. The proposed method, SFA, is compared with nine representative blur-specific NR-IQA methods, two general-purpose NR-IQA methods, and two extra full-reference IQA methods on Gaussian blur images (with and without Gaussian noise/JPEG compression) and realistic blur images from multiple databases, including LIVE, TID2008, TID2013, MLIVE1, MLIVE2, BID, and CLIVE. Experimental results show that SFA is superior to the state-of-the-art NR methods on all seven databases. It is also verified that deep semantic features play a crucial role in addressing image content variation, and this provides a new perspective for NR-IQA. |
Tasks | Blind Image Quality Assessment, Image Classification, Image Quality Assessment, Image Quality Estimation, No-Reference Image Quality Assessment |
Published | 2018-10-11 |
URL | https://ieeexplore.ieee.org/document/8489929 |
Alt | https://www.researchgate.net/publication/328240901_Which_Has_Better_Visual_Quality_The_Clear_Blue_Sky_or_a_Blurry_Animal |
PWC | https://paperswithcode.com/paper/which-has-better-visual-quality-the-clear |
Repo | https://github.com/lidq92/SFA |
Framework | pytorch |
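The SFA recipe (overlapping crops → pretrained-CNN features → statistical aggregation → linear regression) maps onto a short script. The ResNet-50 backbone, patch/stride sizes, and mean/std aggregation below are stand-ins; the paper selects layers adaptively and uses several statistical structures.

```python
# Condensed SFA-style pipeline sketch (placeholder backbone and sizes).
import numpy as np
import torch
from torchvision import models
from sklearn.linear_model import LinearRegression

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # expose the 2048-d penultimate features
backbone.eval()

def image_features(img, patch=224, stride=128):
    # img: normalized float tensor of shape (3, H, W), with H, W >= patch.
    feats = []
    _, H, W = img.shape
    with torch.no_grad():
        for top in range(0, H - patch + 1, stride):
            for left in range(0, W - patch + 1, stride):
                crop = img[:, top:top + patch, left:left + patch]
                feats.append(backbone(crop.unsqueeze(0)).squeeze(0))
    feats = torch.stack(feats).numpy()
    return np.concatenate([feats.mean(0), feats.std(0)])   # 4096-d vector

# Given features X_train and mean opinion scores y_train, the last step is
# plain regression: reg = LinearRegression().fit(X_train, y_train)
```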
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
Title | Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval |
Authors | Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury |
Abstract | Constructing a joint representation invariant across different modalities (e.g., video, language) is of significant importance in many multimedia applications. While there are a number of recent successes in developing effective image-text retrieval methods by learning joint representations, the video-text retrieval task, in contrast, has not been explored to its fullest extent. In this paper, we study how to effectively utilize available multi-modal cues from videos for the cross-modal video-text retrieval task. Based on our analysis, we propose a novel framework that simultaneously utilizes multimodal features (different visual characteristics, audio inputs, and text) by a fusion strategy for efficient retrieval. Furthermore, we explore several loss functions in training the joint embedding and propose a modified pairwise ranking loss for the retrieval task. Experiments on MSVD and MSR-VTT datasets demonstrate that our method achieves significant performance gain compared to the state-of-the-art approaches. |
Tasks | Video Retrieval |
Published | 2018-06-11 |
URL | https://dl.acm.org/citation.cfm?id=3206064 |
PDF | http://www.cs.cmu.edu/~fmetze/interACT/Publications_files/publications/ICMR2018_Camera_Ready.pdf |
PWC | https://paperswithcode.com/paper/learning-joint-embedding-with-multimodal-cues |
Repo | https://github.com/niluthpol/multimodal_vtt |
Framework | pytorch |
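The modified pairwise ranking loss can be sketched as a max-margin loss over a batch of matched video-text pairs. The hard-negative variant shown here follows common practice in retrieval work and is not necessarily the paper's exact formulation.

```python
# Max-margin pairwise ranking loss over a batch of matched video-text pairs;
# the diagonal of the similarity matrix holds the true pairs.
import torch
import torch.nn.functional as F

def ranking_loss(video_emb, text_emb, margin=0.2):
    v = F.normalize(video_emb, dim=1)
    t = F.normalize(text_emb, dim=1)
    sims = v @ t.T                      # (B, B) cosine similarities
    pos = sims.diag().unsqueeze(1)      # similarity of matched pairs
    cost_t = (margin + sims - pos).clamp(min=0)    # wrong captions per video
    cost_v = (margin + sims - pos.T).clamp(min=0)  # wrong videos per caption
    mask = torch.eye(sims.size(0), dtype=torch.bool)
    cost_t = cost_t.masked_fill(mask, 0)
    cost_v = cost_v.masked_fill(mask, 0)
    # Hard-negative variant: penalize only the worst violator per sample.
    return cost_t.max(1).values.mean() + cost_v.max(0).values.mean()

print(ranking_loss(torch.randn(8, 256), torch.randn(8, 256)))
```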
Question Condensing Networks for Answer Selection in Community Question Answering
Title | Question Condensing Networks for Answer Selection in Community Question Answering |
Authors | Wei Wu, Xu Sun, Houfeng Wang |
Abstract | Answer selection is an important subtask of community question answering (CQA). In a real-world CQA forum, a question is often represented as two parts: a subject that summarizes the main points of the question, and a body that elaborates on the subject in detail. Previous research on answer selection has usually ignored the difference between these two parts and concatenated them as the question representation. In this paper, we propose the Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions. In our model, the question subject is the primary part of the question representation, and the question body information is aggregated based on similarity and disparity with the question subject. Experimental results show that QCN outperforms all existing models on two CQA datasets. |
Tasks | Answer Selection, Community Question Answering, Question Answering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1162/ |
PWC | https://paperswithcode.com/paper/question-condensing-networks-for-answer |
Repo | https://github.com/pku-wuwei/QCN |
Framework | none |
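The similarity/disparity idea has a clean geometric reading: split each body-word vector into its projection onto a subject summary vector (similarity) and the orthogonal remainder (disparity). A toy sketch, with mean pooling standing in for the model's learned aggregation:

```python
# Toy subject-body decomposition in the spirit of QCN: parallel component
# captures similarity to the subject, orthogonal component the disparity.
import torch
import torch.nn.functional as F

def condense(subject_vecs, body_vecs):
    # subject_vecs: (Ls, d) word vectors; body_vecs: (Lb, d) word vectors.
    s = F.normalize(subject_vecs.mean(0), dim=0)    # subject summary vector
    proj = (body_vecs @ s).unsqueeze(1) * s         # parallel (similarity)
    orth = body_vecs - proj                         # orthogonal (disparity)
    return torch.cat([subject_vecs.mean(0), proj.mean(0), orth.mean(0)])

q = condense(torch.randn(6, 100), torch.randn(40, 100))
print(q.shape)   # torch.Size([300])
```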
EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection
Title | EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection |
Authors | Yaqing Wang, Fenglong Ma, Zhiwei Jin, Ye Yuan, Guangxu Xun, Kishlay Jha, Lu Su, Jing Gao |
Abstract | As news reading on social media becomes more and more popular, fake news has become a major issue for the public and government. Fake news can take advantage of multimedia content to mislead readers and gain wide dissemination, which can cause negative effects or even manipulate public events. One of the unique challenges for fake news detection on social media is how to identify fake news on newly emerged events. Unfortunately, most existing approaches can hardly handle this challenge, since they tend to learn event-specific features that cannot be transferred to unseen events. In order to address this issue, we propose an end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events. It consists of three main components: the multi-modal feature extractor, the fake news detector, and the event discriminator. The multi-modal feature extractor is responsible for extracting the textual and visual features from posts. It cooperates with the fake news detector to learn a discriminable representation for the detection of fake news. The role of the event discriminator is to remove event-specific features and keep shared features among events. Extensive experiments are conducted on multimedia datasets collected from Weibo and Twitter. The experimental results show that our proposed EANN model outperforms the state-of-the-art methods and learns transferable feature representations. |
Tasks | Fake News Detection, Sentence Classification |
Published | 2018-08-19 |
URL | https://dl.acm.org/citation.cfm?id=3219819.3219903 |
PDF | https://dl.acm.org/ft_gateway.cfm?id=3219903&ftid=1988763&dwn=1 |
PWC | https://paperswithcode.com/paper/eann-event-adversarial-neural-networks-for |
Repo | https://github.com/yaqingwang/EANN-KDD18 |
Framework | pytorch |
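Adversarial objectives like EANN's minimax game between the feature extractor and the event discriminator are commonly wired up with a gradient reversal layer, as in domain-adversarial training. A minimal PyTorch sketch, not necessarily the authors' implementation:

```python
# Gradient reversal layer: identity on the forward pass, negated (scaled)
# gradient on the backward pass, so the feature extractor is updated to
# *increase* the event discriminator's loss.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)

x = torch.ones(3, requires_grad=True)
grad_reverse(x).sum().backward()
print(x.grad)   # tensor([-1., -1., -1.]) -- the gradient sign is flipped
```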
Time Expressions in Mental Health Records for Symptom Onset Extraction
Title | Time Expressions in Mental Health Records for Symptom Onset Extraction |
Authors | Natalia Viani, Lucia Yin, Joyce Kam, Ayunni Alawi, André Bittar, Rina Dutta, Rashmi Patel, Robert Stewart, Sumithra Velupillai |
Abstract | For psychiatric disorders such as schizophrenia, longer durations of untreated psychosis are associated with worse intervention outcomes. Data included in electronic health records (EHRs) can be useful for retrospective clinical studies, but much of this is stored as unstructured text which cannot be directly used in computation. Natural Language Processing (NLP) methods can be used to extract this data, in order to identify symptoms and treatments from mental health records, and temporally anchor the first emergence of these. We are developing an EHR corpus annotated with time expressions, clinical entities and their relations, to be used for NLP development. In this study, we focus on the first step, identifying time expressions in EHRs for patients with schizophrenia. We developed a gold standard corpus, compared this corpus to other related corpora in terms of content and time expression prevalence, and adapted two NLP systems for extracting time expressions. To the best of our knowledge, this is the first resource annotated for temporal entities in the mental health domain. |
Tasks | Temporal Information Extraction |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5621/ |
PWC | https://paperswithcode.com/paper/time-expressions-in-mental-health-records-for |
Repo | https://github.com/medesto/systems-adaptation |
Framework | none |
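For flavor, rule-based time-expression spotting of the kind such systems perform can be miniaturized to a few regexes. Real taggers (e.g. HeidelTime, SUTime) use far richer grammars and also normalize what they find; the note text below is invented.

```python
# Deliberately tiny regex-based time-expression spotter, illustrating the
# rule-based flavor of the task (not any of the paper's adapted systems).
import re

PATTERNS = [
    r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",                       # 12/03/2017
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? \d{4}\b",
    r"\b(?:yesterday|today|last (?:week|month|year))\b",
    r"\b\d+ (?:days?|weeks?|months?|years?) ago\b",
]
TIMEX = re.compile("|".join(PATTERNS), re.IGNORECASE)

note = "Symptoms first emerged 3 years ago; seen in clinic on 12/03/2017."
print([m.group(0) for m in TIMEX.finditer(note)])
```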
SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology
Title | SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology |
Authors | Md Iftekhar Tanveer, Ferhan Ture |
Abstract | This paper describes SyntaViz, a visualization interface specifically designed for analyzing natural-language queries that were created by users of a voice-enabled product. SyntaViz provides a platform for browsing the ontology of user queries from a syntax-driven perspective, providing quick access to high-impact failure points of the existing intent understanding system and evidence for data-driven decisions in the development cycle. A case study on Xfinity X1 (a voice-enabled entertainment platform from Comcast) reveals that SyntaViz helps developers identify multiple action items in a short amount of time without any special training. SyntaViz has been open-sourced for the benefit of the community. |
Tasks | Sentiment Analysis, Topic Models |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/D18-2001/ |
PWC | https://paperswithcode.com/paper/syntaviz-visualizing-voice-queries-through-a |
Repo | https://github.com/Comcast/SyntaViz |
Framework | tf |
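A syntax-driven grouping in SyntaViz's spirit can be prototyped by bucketing queries under their dependency root and the root's direct object. The sketch below uses spaCy purely as a stand-in parser (it requires the en_core_web_sm model to be installed), and the queries are invented.

```python
# Two-level syntax-driven hierarchy over voice queries: root verb, then the
# root's direct object. A stand-in for SyntaViz's richer ontology.
import spacy
from collections import defaultdict

nlp = spacy.load("en_core_web_sm")
queries = ["watch the big bang theory", "watch free movies",
           "record the game tonight", "play jazz music"]

hierarchy = defaultdict(lambda: defaultdict(list))
for q in queries:
    doc = nlp(q)
    root = next(tok for tok in doc if tok.dep_ == "ROOT")
    dobj = next((t.lemma_ for t in root.children if t.dep_ == "dobj"), "-")
    hierarchy[root.lemma_][dobj].append(q)

for verb, subtree in hierarchy.items():
    for obj, qs in subtree.items():
        print(verb, "->", obj, "->", qs)
```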
Semantic Supersenses for English Possessives
Title | Semantic Supersenses for English Possessives |
Authors | Austin Blodgett, Nathan Schneider |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1242/ |
PWC | https://paperswithcode.com/paper/semantic-supersenses-for-english-possessives |
Repo | https://github.com/nert-gu/streusle |
Framework | none |
Multi-Cue Correlation Filters for Robust Visual Tracking
Title | Multi-Cue Correlation Filters for Robust Visual Tracking |
Authors | Ning Wang, Wengang Zhou, Qi Tian, Richang Hong, Meng Wang, Houqiang Li |
Abstract | In recent years, many tracking algorithms have achieved impressive performance by fusing multiple types of features; however, most of them fail to fully explore the context among the adopted features and their individual strengths. In this paper, we propose an efficient multi-cue analysis framework for robust visual tracking. By combining different types of features, our approach constructs multiple experts through Discriminative Correlation Filters (DCF), each of which tracks the target independently. With the proposed robustness evaluation strategy, a suitable expert is selected for tracking in each frame. Furthermore, the divergence of the multiple experts reveals the reliability of the current tracking, which is quantified to update the experts adaptively and keep them from corruption. Through the proposed multi-cue analysis, our tracker with standard DCF and deep features achieves outstanding results on several challenging benchmarks: OTB-2013, OTB-2015, Temple-Color and VOT 2016. On the other hand, when evaluated with only simple hand-crafted features, our method demonstrates performance comparable to complex non-realtime trackers, but with much better efficiency, running at 45 FPS on a CPU. |
Tasks | Visual Tracking |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Multi-Cue_Correlation_Filters_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Multi-Cue_Correlation_Filters_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/multi-cue-correlation-filters-for-robust |
Repo | https://github.com/594422814/MCCT |
Framework | none |
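Each "expert" is built on a discriminative correlation filter. A minimal single-channel, MOSSE-style sketch of the train/detect cycle follows; the paper's DCF variant, deep features, and expert-selection machinery are considerably richer.

```python
# MOSSE-style correlation filter: solve for the filter in the Fourier
# domain, then locate the target at the peak of the correlation response.
import numpy as np

def train_filter(patch, target_response, lam=1e-2):
    # H* = (G . conj(F)) / (F . conj(F) + lambda), elementwise in frequency.
    F_ = np.fft.fft2(patch)
    G = np.fft.fft2(target_response)
    return (G * np.conj(F_)) / (F_ * np.conj(F_) + lam)

def detect(H, patch):
    # Correlation response; its peak is the predicted target location.
    return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))

size = 64
yy, xx = np.mgrid[:size, :size]
gauss = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 3.0 ** 2))
patch = np.random.rand(size, size)
H = train_filter(patch, gauss)
resp = detect(H, patch)
print(np.unravel_index(resp.argmax(), resp.shape))   # ~ (32, 32)
```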