Paper Group NAWR 33
DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding
Title | DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding |
Authors | Min Li, Marina Danilevsky, Sara Noeman, Yunyao Li |
Abstract | Phonetic similarity algorithms identify words and phrases with similar pronunciation and are used in many natural language processing tasks. However, existing approaches are designed mainly for Indo-European languages and fail to capture the unique properties of Chinese pronunciation. In this paper, we propose a high dimensional encoded phonetic similarity algorithm for Chinese, DIMSIM. The encodings are learned from annotated data to separately map initial and final phonemes into n-dimensional coordinates. Pinyin phonetic similarities are then calculated by aggregating the similarities of initial, final and tone. DIMSIM demonstrates a 7.5X improvement in mean reciprocal rank over state-of-the-art phonetic similarity approaches. |
Tasks | Spelling Correction |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-1043/ |
PWC | https://paperswithcode.com/paper/dimsim-an-accurate-chinese-phonetic |
Repo | https://github.com/System-T/DimSim |
Framework | none |
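The aggregation described in the abstract lends itself to a compact illustration. Below is a toy Python sketch of pinyin distance as a weighted sum of initial, final, and tone distances; the 2-D coordinates and weights are invented placeholders, not the learned high-dimensional encodings from the paper.

```python
# Toy sketch of the DIMSIM aggregation idea. The 2-D coordinates and the
# weights are invented placeholders, NOT the paper's learned encodings.
import math

INITIALS = {"zh": (0.0, 0.0), "z": (0.2, 0.0), "ch": (0.0, 1.0), "c": (0.2, 1.0)}
FINALS = {"ang": (0.0, 0.0), "an": (0.0, 0.5), "eng": (1.0, 0.0)}

def pinyin_distance(a, b, w_initial=1.0, w_final=1.0, w_tone=0.5):
    """a, b are (initial, final, tone) triples, e.g. ("zh", "ang", 1)."""
    d_initial = math.dist(INITIALS[a[0]], INITIALS[b[0]])
    d_final = math.dist(FINALS[a[1]], FINALS[b[1]])
    d_tone = abs(a[2] - b[2])
    return w_initial * d_initial + w_final * d_final + w_tone * d_tone

print(pinyin_distance(("zh", "ang", 1), ("z", "ang", 1)))   # similar sounds
print(pinyin_distance(("zh", "ang", 1), ("ch", "eng", 4)))  # dissimilar
```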
Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement
Title | Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement |
Authors | Nina Poerner, Hinrich Schütze, Benjamin Roth |
Abstract | The behavior of deep neural networks (DNNs) is hard to understand. This makes it necessary to explore post hoc explanation methods. We conduct the first comprehensive evaluation of explanation methods for NLP. To this end, we design two novel evaluation paradigms that cover two important classes of NLP problems: small context and large context problems. Both paradigms require no manual annotation and are therefore broadly applicable. We also introduce LIMSSE, an explanation method inspired by LIME that is designed for NLP. We show empirically that LIMSSE, LRP and DeepLIFT are the most effective explanation methods and recommend them for explaining DNNs in NLP. |
Tasks | Sentiment Analysis |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1032/ |
PWC | https://paperswithcode.com/paper/evaluating-neural-network-explanation-methods-1 |
Repo | https://github.com/ArrasL/LRP_for_LSTM |
Framework | none |
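LIMSSE's core move, sampling contiguous substrings rather than arbitrary word subsets, is easy to prototype. A minimal sketch under stated assumptions: `f` is any black-box scorer over word lists, and a plain least-squares fit stands in for LIME's weighted surrogate regression.

```python
# LIME-style substring sampling in the spirit of LIMSSE: query the model on
# random contiguous substrings and fit a linear surrogate whose weights act
# as word relevance scores. The toy scorer below is a stand-in.
import random
import numpy as np

def explain(words, f, n_samples=500, max_len=5, seed=0):
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        start = rng.randrange(len(words))
        length = rng.randint(1, max_len)
        span = range(start, min(start + length, len(words)))
        X.append([1.0 if i in span else 0.0 for i in range(len(words))])
        y.append(f(words[span.start:span.stop]))
    # Least-squares surrogate; coefficients approximate word relevance.
    coef, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return list(zip(words, coef.round(3)))

toy_model = lambda ws: float(sum(w in {"great", "good"} for w in ws))
print(explain("the movie was great but too long".split(), toy_model))
```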
A Reassessment of Reference-Based Grammatical Error Correction Metrics
Title | A Reassessment of Reference-Based Grammatical Error Correction Metrics |
Authors | Shamil Chollampatt, Hwee Tou Ng |
Abstract | Several metrics have been proposed for evaluating grammatical error correction (GEC) systems based on grammaticality, fluency, and adequacy of the output sentences. Previous studies of the correlation of these metrics with human quality judgments were inconclusive, due to the lack of appropriate significance tests, discrepancies in the methods, and choice of datasets used. In this paper, we re-evaluate reference-based GEC metrics by measuring the system-level correlations with humans on a large dataset of human judgments of GEC outputs, and by properly conducting statistical significance tests. Our results show no significant advantage of GLEU over MaxMatch (M2), contradicting previous studies that claim GLEU to be superior. For a finer-grained analysis, we additionally evaluate these metrics for their agreement with human judgments at the sentence level. Our sentence-level analysis indicates that, between GLEU and M2, one metric may be more useful than the other depending on the scenario. We further qualitatively analyze these metrics, and our findings show that apart from being less interpretable and non-deterministic, GLEU also produces counter-intuitive scores on commonly occurring test examples. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1231/ |
PWC | https://paperswithcode.com/paper/a-reassessment-of-reference-based-grammatical |
Repo | https://github.com/nusnlp/gecmetrics |
Framework | none |
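The evaluation setup, system-level correlation between metric scores and human judgments plus a significance test, can be sketched generically. This is an illustrative bootstrap over systems, not necessarily the exact test the authors conducted, and all per-system scores below are made up.

```python
# Compare two metrics by the difference of their system-level Pearson
# correlations with human judgments, with bootstrap resampling over systems
# as a generic significance check (hypothetical numbers throughout).
import numpy as np

def pearson(a, b):
    return np.corrcoef(a, b)[0, 1]

def bootstrap_corr_diff(human, metric1, metric2, n_boot=5000, seed=0):
    human, m1, m2 = map(np.asarray, (human, metric1, metric2))
    rng = np.random.default_rng(seed)
    diffs = []
    while len(diffs) < n_boot:
        idx = rng.integers(0, len(human), len(human))
        if np.ptp(human[idx]) == 0:        # degenerate resample; redraw
            continue
        diffs.append(pearson(human[idx], m1[idx]) - pearson(human[idx], m2[idx]))
    diffs = np.array(diffs)
    p = min(1.0, 2 * min((diffs <= 0).mean(), (diffs >= 0).mean()))
    return diffs.mean(), p

human = [0.62, 0.55, 0.71, 0.48, 0.66]   # hypothetical human scores
gleu  = [0.40, 0.35, 0.52, 0.30, 0.45]   # hypothetical GLEU per system
m2    = [0.41, 0.38, 0.49, 0.33, 0.47]   # hypothetical M2 per system
print(bootstrap_corr_diff(human, gleu, m2))
```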
Facts That Matter
Title | Facts That Matter |
Authors | Marco Ponza, Luciano Del Corro, Gerhard Weikum |
Abstract | This work introduces fact salience: the task of generating a machine-readable representation of the most prominent information in a text document as a set of facts. We also present SalIE, the first fact salience system. SalIE is unsupervised and knowledge agnostic, based on open information extraction to detect facts in natural language text, PageRank to determine their relevance, and clustering to promote diversity. We compare SalIE with several baselines (including the positional baseline, which is standard for salience tasks) and, in an extrinsic evaluation, with state-of-the-art automatic text summarizers. SalIE outperforms the baselines and the text summarizers, showing that facts are an effective way to compress information. |
Tasks | Entity Linking, Open Information Extraction, Question Answering, Relation Extraction, Text Summarization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1129/ |
PWC | https://paperswithcode.com/paper/facts-that-matter |
Repo | https://github.com/mponza/SalIE |
Framework | none |
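The relevance stage, PageRank over a graph of facts, is straightforward to sketch. The snippet below assumes the facts were already extracted by an upstream Open IE system, uses Jaccard word overlap as a stand-in edge weight, and omits the clustering step that promotes diversity.

```python
# Minimal sketch of SalIE's ranking stage: similarity-weighted fact graph
# plus PageRank for relevance (clustering for diversity omitted).
import networkx as nx

facts = [
    ("Marie Curie", "won", "the Nobel Prize"),
    ("Curie", "discovered", "polonium"),
    ("Curie", "discovered", "radium"),
    ("The prize", "was awarded in", "1903"),
]

def jaccard(f1, f2):
    w1 = set(" ".join(f1).lower().split())
    w2 = set(" ".join(f2).lower().split())
    return len(w1 & w2) / len(w1 | w2)

G = nx.Graph()
G.add_nodes_from(range(len(facts)))
for i in range(len(facts)):
    for j in range(i + 1, len(facts)):
        sim = jaccard(facts[i], facts[j])
        if sim > 0:
            G.add_edge(i, j, weight=sim)

scores = nx.pagerank(G, weight="weight")   # fact relevance
for i in sorted(scores, key=scores.get, reverse=True):
    print(f"{scores[i]:.3f}  {' '.join(facts[i])}")
```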
Stochastic Capsule Endoscopy Image Enhancement
Title | Stochastic Capsule Endoscopy Image Enhancement |
Authors | Ahmed Mohammed, Ivar Farup, Marius Pedersen, Øistein Hovde, Sule Yildirim Yayilgan |
Abstract | Capsule endoscopy, which uses a wireless camera to take images of the digestive tract, is emerging as an alternative to traditional colonoscopy. The diagnostic value of these images depends on the quality of the revealed underlying tissue surfaces. In this paper, we consider the problem of enhancing the visibility of detail and shadowed tissue surfaces in capsule endoscopy images. Using concentric circles at each pixel for random walks combined with stochastic sampling, the proposed method enhances the details of vessel and tissue surfaces. The framework decomposes the image into two detail layers that contain shadowed tissue surfaces and detail features. The target pixel value is recalculated for the smooth layer using the similarity of the target pixel to neighboring pixels, weighting against the total gradient variation and intensity differences. To evaluate the diagnostic image quality of the proposed method, we used a clinical subjective evaluation with rank ordering on a selected KID image database and compared the method to state-of-the-art enhancement methods. The results showed that the proposed method performs better in terms of diagnostic image quality, objective contrast metrics, and the structural similarity index. |
Tasks | Image Enhancement |
Published | 2018-06-06 |
URL | https://www.mdpi.com/2313-433X/4/6/75 |
PDF | https://res.mdpi.com/d_attachment/jimaging/jimaging-04-00075/article_deploy/jimaging-04-00075.pdf |
PWC | https://paperswithcode.com/paper/stochastic-capsule-endoscopy-image |
Repo | https://github.com/ahme0307/CCE |
Framework | none |
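The base/detail decomposition at the heart of the method can be illustrated with a deliberately simplified stand-in: a Gaussian filter replaces the paper's stochastic random-walk smoothing, purely to show the split-and-boost structure.

```python
# Simplified stand-in for the paper's layer decomposition: split the image
# into a smooth base layer and a detail layer, then amplify the detail.
# (The paper's random-walk smoothing is replaced by a Gaussian filter.)
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance(img, sigma=3.0, boost=1.8):
    img = img.astype(np.float64)
    base = gaussian_filter(img, sigma=sigma)   # smooth layer
    detail = img - base                        # shadowed-detail layer
    out = base + boost * detail                # amplify fine structure
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy usage on a synthetic image.
img = (np.random.rand(64, 64) * 255).astype(np.uint8)
print(enhance(img).shape)
```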
Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks
Title | Fusing Structure and Content via Non-negative Matrix Factorization for Embedding Information Networks |
Authors | Sambaran Bandyopadhyay, Harsh Kara, Aswin Kannan, M N Murty |
Abstract | Analysis and visualization of an information network can be facilitated better using an appropriate embedding of the network. Network embedding learns a compact low-dimensional vector representation for each node of the network and uses this lower-dimensional representation for different network analysis tasks. A majority of current embedding algorithms consider only the structure of the network. However, in most practical applications some content is associated with each node, and this content can help in understanding the underlying semantics of the network. It is not straightforward to integrate the content of each node into current state-of-the-art network embedding methods. In this paper, we propose a non-negative matrix factorization based optimization framework, namely FSCNMF, which considers both the network structure and the content of the nodes while learning a lower-dimensional representation of each node in the network. Our approach systematically regularizes structure based on content and vice versa, to exploit the consistency between structure and content to the best possible extent. We further extend the basic FSCNMF to an advanced method, namely FSCNMF++, to capture higher-order proximities in the network. We conduct experiments on real-world information networks for different types of machine learning applications such as node clustering, visualization, and multi-class classification. The results show that our method can represent the network significantly better than the state-of-the-art algorithms and improve performance across all the applications that we consider. |
Tasks | Network Embedding |
Published | 2018-04-15 |
URL | https://arxiv.org/abs/1804.05313 |
PDF | https://arxiv.org/pdf/1804.05313.pdf |
PWC | https://paperswithcode.com/paper/fusing-structure-and-content-via-non-negative |
Repo | https://github.com/benedekrozemberczki/FSCNMF |
Framework | none |
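The coupled factorization can be written down compactly. The sketch below uses plain projected gradient descent on random toy data; the authors' actual update rules and hyperparameters may well differ.

```python
# Bare-bones FSCNMF-style coupled factorization (not the authors' exact
# updates): factor the adjacency matrix A ~ B1 @ B2 and the content matrix
# C ~ U1 @ U2 while pulling B1 and U1 toward each other, so structure
# regularizes content and vice versa.
import numpy as np

def fscnmf_sketch(A, C, d=8, alpha=1.0, lr=1e-3, iters=300, seed=0):
    rng = np.random.default_rng(seed)
    n, m = A.shape[0], C.shape[1]
    B1, B2 = rng.random((n, d)), rng.random((d, n))
    U1, U2 = rng.random((n, d)), rng.random((d, m))
    for _ in range(iters):
        B1 -= lr * (-2 * (A - B1 @ B2) @ B2.T + 2 * alpha * (B1 - U1))
        B2 -= lr * (-2 * B1.T @ (A - B1 @ B2))
        U1 -= lr * (-2 * (C - U1 @ U2) @ U2.T + 2 * alpha * (U1 - B1))
        U2 -= lr * (-2 * U1.T @ (C - U1 @ U2))
        for M in (B1, B2, U1, U2):
            np.maximum(M, 0.0, out=M)    # non-negativity projection
    return B1                            # node embeddings

rng = np.random.default_rng(1)
A = rng.random((20, 20)); A = (A + A.T) / 2   # toy symmetric adjacency
C = rng.random((20, 30))                      # toy node-content matrix
print(fscnmf_sketch(A, C).shape)              # (20, 8)
```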
Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection
Title | Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection |
Authors | Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam |
Abstract | This paper proposes a fast video salient object detection model based on a novel recurrent network architecture named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM). A Pyramid Dilated Convolution (PDC) module is first designed to simultaneously extract spatial features at multiple scales. These spatial features are then concatenated and fed into an extended Deeper Bidirectional ConvLSTM (DB-ConvLSTM) to learn spatiotemporal information. Forward and backward ConvLSTM units are placed in two layers and connected in a cascaded way, encouraging information flow between the bi-directional streams and leading to deeper feature extraction. We further augment DB-ConvLSTM with a PDC-like structure by adopting several dilated DB-ConvLSTMs to extract multi-scale spatiotemporal information. Extensive experimental results show that our method outperforms previous video saliency models by a large margin, with a real-time speed of 20 fps on a single GPU. With unsupervised video object segmentation as an example application, the proposed model (with a CRF-based post-process) achieves state-of-the-art results on two popular benchmarks, demonstrating its superior performance and high applicability. |
Tasks | Object Detection, Salient Object Detection, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Salient Object Detection, Video Semantic Segmentation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Hongmei_Song_Pseudo_Pyramid_Deeper_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Hongmei_Song_Pseudo_Pyramid_Deeper_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/pyramid-dilated-deeper-convlstm-for-video |
Repo | https://github.com/shenjianbing/PDB-ConvLSTM |
Framework | caffe2 |
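The PDC module, parallel dilated convolutions whose outputs are concatenated, is simple to express in PyTorch. The channel sizes and dilation rates below are illustrative, not the paper's exact configuration.

```python
# Minimal PyTorch sketch of a Pyramid Dilated Convolution (PDC) module:
# parallel 3x3 convolutions at several dilation rates, concatenated to
# capture multi-scale spatial context.
import torch
import torch.nn as nn

class PDC(nn.Module):
    def __init__(self, in_ch=512, branch_ch=128, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):
        # padding == dilation keeps every branch at the same spatial size,
        # so the outputs can be concatenated along the channel axis.
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(1, 512, 32, 32)   # e.g. a backbone feature map
print(PDC()(x).shape)             # torch.Size([1, 512, 32, 32])
```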
Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal?
Title | Which Has Better Visual Quality: The Clear Blue Sky or a Blurry Animal? |
Authors | Dingquan Li, Tingting Jiang, Weisi Lin, Ming Jiang |
Abstract | Image content variation is a typical and challenging problem in no-reference image quality assessment (NR-IQA). This work pays special attention to the impact of image content variation on NR-IQA methods. To better analyze this impact, we focus on blur-dominated distortions to exclude the impact of distortion-type variations. We empirically show that current NR-IQA methods are inconsistent with human visual perception when predicting the relative quality of image pairs with different image contents. Since the deep semantic features of pretrained image classification neural networks always contain discriminative image content information, we put forward a new NR-IQA method based on semantic feature aggregation (SFA) to alleviate the impact of image content variation. Specifically, instead of resizing the image, we first crop multiple overlapping patches over the entire distorted image to avoid introducing geometric deformations. Then, according to an adaptive layer selection procedure, we extract deep semantic features by leveraging the power of a pretrained image classification model for its inherent content-aware property. After that, the local patch features are aggregated using several statistical structures. Finally, a linear regression model is trained to map the aggregated global features to image-quality scores. The proposed method, SFA, is compared with nine representative blur-specific NR-IQA methods, two general-purpose NR-IQA methods, and two extra full-reference IQA methods on Gaussian blur images (with and without Gaussian noise/JPEG compression) and realistic blur images from multiple databases, including LIVE, TID2008, TID2013, MLIVE1, MLIVE2, BID, and CLIVE. Experimental results show that SFA is superior to the state-of-the-art NR methods on all seven databases. It is also verified that deep semantic features play a crucial role in addressing image content variation, and this provides a new perspective for NR-IQA. |
Tasks | Blind Image Quality Assessment, Image Classification, Image Quality Assessment, Image Quality Estimation, No-Reference Image Quality Assessment |
Published | 2018-10-11 |
URL | https://ieeexplore.ieee.org/document/8489929 |
Alt | https://www.researchgate.net/publication/328240901_Which_Has_Better_Visual_Quality_The_Clear_Blue_Sky_or_a_Blurry_Animal |
PWC | https://paperswithcode.com/paper/which-has-better-visual-quality-the-clear |
Repo | https://github.com/lidq92/SFA |
Framework | pytorch |
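The SFA recipe (overlapping crops → pretrained-CNN features → statistical aggregation → linear regression) maps onto a short script. The ResNet-50 backbone, patch/stride sizes, and mean/std aggregation below are stand-ins; the paper selects layers adaptively and uses several statistical structures.

```python
# Condensed SFA-style pipeline sketch (placeholder backbone and sizes).
import numpy as np
import torch
from torchvision import models
from sklearn.linear_model import LinearRegression

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # expose the 2048-d penultimate features
backbone.eval()

def image_features(img, patch=224, stride=128):
    # img: normalized float tensor of shape (3, H, W), with H, W >= patch.
    feats = []
    _, H, W = img.shape
    with torch.no_grad():
        for top in range(0, H - patch + 1, stride):
            for left in range(0, W - patch + 1, stride):
                crop = img[:, top:top + patch, left:left + patch]
                feats.append(backbone(crop.unsqueeze(0)).squeeze(0))
    feats = torch.stack(feats).numpy()
    return np.concatenate([feats.mean(0), feats.std(0)])   # 4096-d vector

# Given features X_train and mean opinion scores y_train, the last step is
# plain regression: reg = LinearRegression().fit(X_train, y_train)
```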
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
Title | Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval |
Authors | Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury |
Abstract | Constructing a joint representation invariant across different modalities (e.g., video, language) is of significant importance in many multimedia applications. While there are a number of recent successes in developing effective image-text retrieval methods by learning joint representations, the video-text retrieval task, in contrast, has not been explored to its fullest extent. In this paper, we study how to effectively utilize available multi-modal cues from videos for the cross-modal video-text retrieval task. Based on our analysis, we propose a novel framework that simultaneously utilizes multimodal features (different visual characteristics, audio inputs, and text) by a fusion strategy for efficient retrieval. Furthermore, we explore several loss functions in training the joint embedding and propose a modified pairwise ranking loss for the retrieval task. Experiments on MSVD and MSR-VTT datasets demonstrate that our method achieves significant performance gain compared to the state-of-the-art approaches. |
Tasks | Video Retrieval |
Published | 2018-06-11 |
URL | https://dl.acm.org/citation.cfm?id=3206064 |
PDF | http://www.cs.cmu.edu/~fmetze/interACT/Publications_files/publications/ICMR2018_Camera_Ready.pdf |
PWC | https://paperswithcode.com/paper/learning-joint-embedding-with-multimodal-cues |
Repo | https://github.com/niluthpol/multimodal_vtt |
Framework | pytorch |
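The modified pairwise ranking loss can be sketched as a max-margin loss over a batch of matched video-text pairs. The hard-negative variant shown here follows common practice in retrieval work and is not necessarily the paper's exact formulation.

```python
# Max-margin pairwise ranking loss over a batch of matched video-text pairs;
# the diagonal of the similarity matrix holds the true pairs.
import torch
import torch.nn.functional as F

def ranking_loss(video_emb, text_emb, margin=0.2):
    v = F.normalize(video_emb, dim=1)
    t = F.normalize(text_emb, dim=1)
    sims = v @ t.T                      # (B, B) cosine similarities
    pos = sims.diag().unsqueeze(1)      # similarity of matched pairs
    cost_t = (margin + sims - pos).clamp(min=0)    # wrong captions per video
    cost_v = (margin + sims - pos.T).clamp(min=0)  # wrong videos per caption
    mask = torch.eye(sims.size(0), dtype=torch.bool)
    cost_t = cost_t.masked_fill(mask, 0)
    cost_v = cost_v.masked_fill(mask, 0)
    # Hard-negative variant: penalize only the worst violator per sample.
    return cost_t.max(1).values.mean() + cost_v.max(0).values.mean()

print(ranking_loss(torch.randn(8, 256), torch.randn(8, 256)))
```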
Question Condensing Networks for Answer Selection in Community Question Answering
Title | Question Condensing Networks for Answer Selection in Community Question Answering |
Authors | Wei Wu, Xu Sun, Houfeng Wang |
Abstract | Answer selection is an important subtask of community question answering (CQA). In a real-world CQA forum, a question is often represented as two parts: a subject that summarizes the main points of the question, and a body that elaborates on the subject in detail. Previous research on answer selection has usually ignored the difference between these two parts and concatenated them as the question representation. In this paper, we propose the Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions. In our model, the question subject is the primary part of the question representation, and the question body information is aggregated based on similarity and disparity with the question subject. Experimental results show that QCN outperforms all existing models on two CQA datasets. |
Tasks | Answer Selection, Community Question Answering, Question Answering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1162/ |
PWC | https://paperswithcode.com/paper/question-condensing-networks-for-answer |
Repo | https://github.com/pku-wuwei/QCN |
Framework | none |
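The similarity/disparity idea has a clean geometric reading: split each body-word vector into its projection onto a subject summary vector (similarity) and the orthogonal remainder (disparity). A toy sketch, with mean pooling standing in for the model's learned aggregation:

```python
# Toy subject-body decomposition in the spirit of QCN: parallel component
# captures similarity to the subject, orthogonal component the disparity.
import torch
import torch.nn.functional as F

def condense(subject_vecs, body_vecs):
    # subject_vecs: (Ls, d) word vectors; body_vecs: (Lb, d) word vectors.
    s = F.normalize(subject_vecs.mean(0), dim=0)    # subject summary vector
    proj = (body_vecs @ s).unsqueeze(1) * s         # parallel (similarity)
    orth = body_vecs - proj                         # orthogonal (disparity)
    return torch.cat([subject_vecs.mean(0), proj.mean(0), orth.mean(0)])

q = condense(torch.randn(6, 100), torch.randn(40, 100))
print(q.shape)   # torch.Size([300])
```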
EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection
Title | EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection |
Authors | Yaqing Wang, Fenglong Ma, Zhiwei Jin, Ye Yuan, Guangxu Xun, Kishlay Jha, Lu Su, Jing Gao |
Abstract | As news reading on social media becomes more and more popular, fake news has become a major issue for the public and government. Fake news can take advantage of multimedia content to mislead readers and gain wide dissemination, which can cause negative effects or even manipulate public events. One of the unique challenges for fake news detection on social media is how to identify fake news on newly emerged events. Unfortunately, most existing approaches can hardly handle this challenge, since they tend to learn event-specific features that cannot be transferred to unseen events. In order to address this issue, we propose an end-to-end framework named Event Adversarial Neural Network (EANN), which can derive event-invariant features and thus benefit the detection of fake news on newly arrived events. It consists of three main components: the multi-modal feature extractor, the fake news detector, and the event discriminator. The multi-modal feature extractor is responsible for extracting the textual and visual features from posts. It cooperates with the fake news detector to learn a discriminable representation for the detection of fake news. The role of the event discriminator is to remove event-specific features and keep shared features among events. Extensive experiments are conducted on multimedia datasets collected from Weibo and Twitter. The experimental results show that our proposed EANN model outperforms the state-of-the-art methods and learns transferable feature representations. |
Tasks | Fake News Detection, Sentence Classification |
Published | 2018-08-19 |
URL | https://dl.acm.org/citation.cfm?id=3219819.3219903 |
PDF | https://dl.acm.org/ft_gateway.cfm?id=3219903&ftid=1988763&dwn=1 |
PWC | https://paperswithcode.com/paper/eann-event-adversarial-neural-networks-for |
Repo | https://github.com/yaqingwang/EANN-KDD18 |
Framework | pytorch |
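Adversarial objectives like EANN's minimax game between the feature extractor and the event discriminator are commonly wired up with a gradient reversal layer, as in domain-adversarial training. A minimal PyTorch sketch, not necessarily the authors' implementation:

```python
# Gradient reversal layer: identity on the forward pass, negated (scaled)
# gradient on the backward pass, so the feature extractor is updated to
# *increase* the event discriminator's loss.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)

x = torch.ones(3, requires_grad=True)
grad_reverse(x).sum().backward()
print(x.grad)   # tensor([-1., -1., -1.]) -- the gradient sign is flipped
```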
Time Expressions in Mental Health Records for Symptom Onset Extraction
Title | Time Expressions in Mental Health Records for Symptom Onset Extraction |
Authors | Natalia Viani, Lucia Yin, Joyce Kam, Ayunni Alawi, André Bittar, Rina Dutta, Rashmi Patel, Robert Stewart, Sumithra Velupillai |
Abstract | For psychiatric disorders such as schizophrenia, longer durations of untreated psychosis are associated with worse intervention outcomes. Data included in electronic health records (EHRs) can be useful for retrospective clinical studies, but much of this is stored as unstructured text which cannot be directly used in computation. Natural Language Processing (NLP) methods can be used to extract this data, in order to identify symptoms and treatments from mental health records, and temporally anchor the first emergence of these. We are developing an EHR corpus annotated with time expressions, clinical entities and their relations, to be used for NLP development. In this study, we focus on the first step, identifying time expressions in EHRs for patients with schizophrenia. We developed a gold standard corpus, compared this corpus to other related corpora in terms of content and time expression prevalence, and adapted two NLP systems for extracting time expressions. To the best of our knowledge, this is the first resource annotated for temporal entities in the mental health domain. |
Tasks | Temporal Information Extraction |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5621/ |
PWC | https://paperswithcode.com/paper/time-expressions-in-mental-health-records-for |
Repo | https://github.com/medesto/systems-adaptation |
Framework | none |
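For flavor, rule-based time-expression spotting of the kind such systems perform can be miniaturized to a few regexes. Real taggers (e.g. HeidelTime, SUTime) use far richer grammars and also normalize what they find; the note text below is invented.

```python
# Deliberately tiny regex-based time-expression spotter, illustrating the
# rule-based flavor of the task (not any of the paper's adapted systems).
import re

PATTERNS = [
    r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",                       # 12/03/2017
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? \d{4}\b",
    r"\b(?:yesterday|today|last (?:week|month|year))\b",
    r"\b\d+ (?:days?|weeks?|months?|years?) ago\b",
]
TIMEX = re.compile("|".join(PATTERNS), re.IGNORECASE)

note = "Symptoms first emerged 3 years ago; seen in clinic on 12/03/2017."
print([m.group(0) for m in TIMEX.finditer(note)])
```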
SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology
Title | SyntaViz: Visualizing Voice Queries through a Syntax-Driven Hierarchical Ontology |
Authors | Md Iftekhar Tanveer, Ferhan Ture |
Abstract | This paper describes SyntaViz, a visualization interface specifically designed for analyzing natural-language queries that were created by users of a voice-enabled product. SyntaViz provides a platform for browsing the ontology of user queries from a syntax-driven perspective, providing quick access to high-impact failure points of the existing intent understanding system and evidence for data-driven decisions in the development cycle. A case study on Xfinity X1 (a voice-enabled entertainment platform from Comcast) reveals that SyntaViz helps developers identify multiple action items in a short amount of time without any special training. SyntaViz has been open-sourced for the benefit of the community. |
Tasks | Sentiment Analysis, Topic Models |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/D18-2001/ |
PWC | https://paperswithcode.com/paper/syntaviz-visualizing-voice-queries-through-a |
Repo | https://github.com/Comcast/SyntaViz |
Framework | tf |
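A syntax-driven grouping in SyntaViz's spirit can be prototyped by bucketing queries under their dependency root and the root's direct object. The sketch below uses spaCy purely as a stand-in parser (it requires the en_core_web_sm model to be installed), and the queries are invented.

```python
# Two-level syntax-driven hierarchy over voice queries: root verb, then the
# root's direct object. A stand-in for SyntaViz's richer ontology.
import spacy
from collections import defaultdict

nlp = spacy.load("en_core_web_sm")
queries = ["watch the big bang theory", "watch free movies",
           "record the game tonight", "play jazz music"]

hierarchy = defaultdict(lambda: defaultdict(list))
for q in queries:
    doc = nlp(q)
    root = next(tok for tok in doc if tok.dep_ == "ROOT")
    dobj = next((t.lemma_ for t in root.children if t.dep_ == "dobj"), "-")
    hierarchy[root.lemma_][dobj].append(q)

for verb, subtree in hierarchy.items():
    for obj, qs in subtree.items():
        print(verb, "->", obj, "->", qs)
```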
Semantic Supersenses for English Possessives
Title | Semantic Supersenses for English Possessives |
Authors | Austin Blodgett, Nathan Schneider |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1242/ |
PWC | https://paperswithcode.com/paper/semantic-supersenses-for-english-possessives |
Repo | https://github.com/nert-gu/streusle |
Framework | none |
Multi-Cue Correlation Filters for Robust Visual Tracking
Title | Multi-Cue Correlation Filters for Robust Visual Tracking |
Authors | Ning Wang, Wengang Zhou, Qi Tian, Richang Hong, Meng Wang, Houqiang Li |
Abstract | In recent years, many tracking algorithms have achieved impressive performance by fusing multiple types of features; however, most of them fail to fully explore the context among the adopted features and their individual strengths. In this paper, we propose an efficient multi-cue analysis framework for robust visual tracking. By combining different types of features, our approach constructs multiple experts through Discriminative Correlation Filters (DCF), each of which tracks the target independently. With the proposed robustness evaluation strategy, a suitable expert is selected for tracking in each frame. Furthermore, the divergence of the multiple experts reveals the reliability of the current tracking, which is quantified to update the experts adaptively and keep them from corruption. Through the proposed multi-cue analysis, our tracker with standard DCF and deep features achieves outstanding results on several challenging benchmarks: OTB-2013, OTB-2015, Temple-Color and VOT 2016. On the other hand, when evaluated with only simple hand-crafted features, our method demonstrates performance comparable to complex non-realtime trackers, but with much better efficiency, running at 45 FPS on a CPU. |
Tasks | Visual Tracking |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Multi-Cue_Correlation_Filters_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Multi-Cue_Correlation_Filters_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/multi-cue-correlation-filters-for-robust |
Repo | https://github.com/594422814/MCCT |
Framework | none |
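Each "expert" is built on a discriminative correlation filter. A minimal single-channel, MOSSE-style sketch of the train/detect cycle follows; the paper's DCF variant, deep features, and expert-selection machinery are considerably richer.

```python
# MOSSE-style correlation filter: solve for the filter in the Fourier
# domain, then locate the target at the peak of the correlation response.
import numpy as np

def train_filter(patch, target_response, lam=1e-2):
    # H* = (G . conj(F)) / (F . conj(F) + lambda), elementwise in frequency.
    F_ = np.fft.fft2(patch)
    G = np.fft.fft2(target_response)
    return (G * np.conj(F_)) / (F_ * np.conj(F_) + lam)

def detect(H, patch):
    # Correlation response; its peak is the predicted target location.
    return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))

size = 64
yy, xx = np.mgrid[:size, :size]
gauss = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 3.0 ** 2))
patch = np.random.rand(size, size)
H = train_filter(patch, gauss)
resp = detect(H, patch)
print(np.unravel_index(resp.argmax(), resp.shape))   # ~ (32, 32)
```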