February 2, 2020


Paper Group AWR 31


tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification


Title	tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification
Authors	Blaž Škrlj, Matej Martinc, Jan Kralj, Nada Lavrač, Senja Pollak
Abstract	The use of background knowledge remains largely unexploited in many text classification tasks. In this work, we explore word taxonomies as a means for constructing new semantic features, which may improve the performance and robustness of the learned classifiers. We propose tax2vec, a parallel algorithm for constructing taxonomy-based features, and demonstrate its use on six short-text classification problems, including gender, age and personality type prediction, drug effectiveness and side effect prediction, and news topic prediction. The experimental results indicate that the interpretable features constructed using tax2vec can notably improve the performance of classifiers; the constructed features, in combination with fast, linear classifiers tested against strong baselines, such as hierarchical attention neural networks, achieved comparable or better classification results on short documents. Further, tax2vec can also serve for extraction of corpus-specific keywords. Finally, we investigated the semantic space of potential features, where we observe a similarity with the well-known Zipf’s law.
Tasks	Text Classification
Published	2019-02-01
URL	http://arxiv.org/abs/1902.00438v2
PDF	http://arxiv.org/pdf/1902.00438v2.pdf
PWC	https://paperswithcode.com/paper/tax2vec-constructing-interpretable-features
Repo	https://github.com/SkBlaz/tax2vec
Framework	tf
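
The idea of taxonomy-based features can be illustrated with a minimal sketch. It is not the tax2vec implementation: the toy taxonomy, the function names, and the document-frequency selection rule are all assumptions made for illustration. Each token is mapped to its chain of hypernyms, and the most widely shared hypernyms become interpretable count features.

```python
from collections import Counter

# Hypothetical child -> parent taxonomy; a real system would use a resource such as WordNet.
TOY_TAXONOMY = {
    "aspirin": "drug", "ibuprofen": "drug", "drug": "substance",
    "headache": "symptom", "nausea": "symptom", "symptom": "condition",
}

def hypernyms(word, taxonomy):
    """Return every ancestor of `word`, nearest first."""
    chain = []
    while word in taxonomy:
        word = taxonomy[word]
        chain.append(word)
    return chain

def taxonomy_features(documents, taxonomy, top_k):
    """Count hypernyms per document and keep the top_k by document frequency."""
    doc_counts = [
        Counter(h for token in doc.lower().split() for h in hypernyms(token, taxonomy))
        for doc in documents
    ]
    doc_freq = Counter()
    for counts in doc_counts:
        doc_freq.update(counts.keys())
    # Assumed selection rule: highest document frequency, ties broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    vocab = [term for term, _ in ranked[:top_k]]
    matrix = [[counts[term] for term in vocab] for counts in doc_counts]
    return vocab, matrix

docs = [
    "aspirin and ibuprofen helped",
    "headache and nausea all day",
    "aspirin cured my headache",
]
vocab, matrix = taxonomy_features(docs, TOY_TAXONOMY, top_k=4)
print(vocab)
print(matrix)
```

The resulting columns can be appended to an ordinary bag-of-words matrix before training a linear classifier; because every column is a named taxonomy concept, the learned weights stay readable.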

CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking


Title	CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking
Authors	Anubrata Das, Kunjan Mehta, Matthew Lease
Abstract	The effect of user bias in fact-checking has not been explored extensively from a user-experience perspective. We estimate the user bias as a function of the user’s perceived reputation of the news sources (e.g., a user with liberal beliefs may tend to trust liberal sources). We build an interface to communicate the role of estimated user bias in the context of a fact-checking task. We also explore the utility of helping users visualize their detected level of bias. 80% of the users of our system find that the presence of an indicator for user bias is useful in judging the veracity of a political claim.
Tasks
Published	2019-07-08
URL	https://arxiv.org/abs/1907.03718v1
PDF	https://arxiv.org/pdf/1907.03718v1.pdf
PWC	https://paperswithcode.com/paper/cobweb-a-research-prototype-for-exploring
Repo	https://github.com/anubrata/anubrata.github.io
Framework	none

Mixture Content Selection for Diverse Sequence Generation


Title	Mixture Content Selection for Diverse Sequence Generation
Authors	Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
Abstract	Generating diverse sequences is important in many NLP applications such as question generation or summarization that exhibit semantically one-to-many relationships between source and the target sequences. We present a method to explicitly separate diversification from generation using a general plug-and-play module (called SELECTOR) that wraps around and guides an existing encoder-decoder model. The diversification stage uses a mixture of experts to sample different binary masks on the source sequence for diverse content selection. The generation stage uses a standard encoder-decoder model given each selected content from the source sequence. Due to the non-differentiable nature of discrete sampling and the lack of ground truth labels for binary mask, we leverage a proxy for ground truth mask and adopt stochastic hard-EM for training. In question generation (SQuAD) and abstractive summarization (CNN-DM), our method demonstrates significant improvements in accuracy, diversity and training efficiency, including state-of-the-art top-1 accuracy in both datasets, 6% gain in top-5 accuracy, and 3.7 times faster training over a state of the art model. Our code is publicly available at https://github.com/clovaai/FocusSeq2Seq.
Tasks	Abstractive Text Summarization, Document Summarization, Question Generation
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01953v1
PDF	https://arxiv.org/pdf/1909.01953v1.pdf
PWC	https://paperswithcode.com/paper/mixture-content-selection-for-diverse
Repo	https://github.com/clovaai/FocusSeq2Seq
Framework	pytorch

Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots


Title	Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots
Authors	Jia-Chen Gu, Zhen-Hua Ling, Xiaodan Zhu, Quan Liu
Abstract	This paper proposes a dually interactive matching network (DIM) for presenting the personalities of dialogue agents in retrieval-based chatbots. This model develops from the interactive matching network (IMN) which models the matching degree between a context composed of multiple utterances and a response candidate. Compared with previous persona fusion approaches which enhance the representation of a context by calculating its similarity with a given persona, the DIM model adopts a dual matching architecture, which performs interactive matching between responses and contexts and between responses and personas respectively for ranking response candidates. Experimental results on PERSONA-CHAT dataset show that the DIM model outperforms its baseline model, i.e., IMN with persona fusion, by a margin of 14.5% and outperforms the current state-of-the-art model by a margin of 27.7% in terms of top-1 accuracy hits@1.
Tasks
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05859v3
PDF	https://arxiv.org/pdf/1908.05859v3.pdf
PWC	https://paperswithcode.com/paper/dually-interactive-matching-network-for
Repo	https://github.com/JasonForJoy/DIM
Framework	tf

A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains


Title	A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains
Authors	Dominik Schlechtweg, Anna Hätty, Marco del Tredici, Sabine Schulte im Walde
Abstract	We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains. Our work addresses the superficialness and lack of comparison in assessing models of diachronic lexical change, by bringing together and extending benchmark models on a common state-of-the-art evaluation task. In addition, we demonstrate that the same evaluation task and modelling approaches can successfully be utilised for the synchronic detection of domain-specific sense divergences in the field of term extraction.
Tasks
Published	2019-06-07
URL	https://arxiv.org/abs/1906.02979v1
PDF	https://arxiv.org/pdf/1906.02979v1.pdf
PWC	https://paperswithcode.com/paper/a-wind-of-change-detecting-and-evaluating
Repo	https://github.com/Garrafao/LSCDetection
Framework	none

Large-Scale Long-Tailed Recognition in an Open World


Title	Large-Scale Long-Tailed Recognition in an Open World
Authors	Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
Abstract	Real-world data often have a long-tailed and open-ended distribution. A practical recognition system must classify among majority and minority classes, generalize from a few known instances, and acknowledge novelty upon a never-seen instance. We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set that includes head, tail, and open classes. OLTR must handle imbalanced classification, few-shot learning, and open-set recognition in one integrated algorithm, whereas existing classification approaches focus only on one aspect and perform poorly over the entire class spectrum. The key challenges are how to share visual knowledge between head and tail classes and how to reduce confusion between tail and open classes. We develop an integrated OLTR algorithm that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world. Our so-called dynamic meta-embedding combines a direct image feature and an associated memory feature, with the feature norm indicating the familiarity to known classes. On three large-scale OLTR datasets we curate from object-centric ImageNet, scene-centric Places, and face-centric MS1M data, our method consistently outperforms the state-of-the-art. Our code, datasets, and models enable future OLTR research and are publicly available at https://liuziwei7.github.io/projects/LongTail.html.
Tasks	Few-Shot Learning, Open Set Learning
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05160v2
PDF	http://arxiv.org/pdf/1904.05160v2.pdf
PWC	https://paperswithcode.com/paper/large-scale-long-tailed-recognition-in-an
Repo	https://github.com/zhmiao/OpenLongTailRecognition-OLTR
Framework	pytorch

Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network


Title	Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network
Authors	Wenxiang Jiao, Michael R. Lyu, Irwin King
Abstract	Real-time emotion recognition (RTER) in conversations is significant for developing emotionally intelligent chatting machines. Without the future context in RTER, it becomes critical to build the memory bank carefully for capturing historical context and summarize the memories appropriately to retrieve relevant information. We propose an Attention Gated Hierarchical Memory Network (AGHMN) to address the problems of prior work: (1) Commonly used convolutional neural networks (CNNs) for utterance feature extraction are less compatible in the memory modules; (2) Unidirectional gated recurrent units (GRUs) only allow each historical utterance to have context before it, preventing information propagation in the opposite direction; (3) The Soft Attention for summarizing loses the positional and ordering information of memories, regardless of how the memory bank is built. Particularly, we propose a Hierarchical Memory Network (HMN) with a bidirectional GRU (BiGRU) as the utterance reader and a BiGRU fusion layer for the interaction between historical utterances. For memory summarizing, we propose an Attention GRU (AGRU) where we utilize the attention weights to update the internal state of GRU. We further promote the AGRU to a bidirectional variant (BiAGRU) to balance the contextual information from recent memories and that from distant memories. We conduct experiments on two emotion conversation datasets with extensive analysis, demonstrating the efficacy of our AGHMN models.
Tasks	Emotion Recognition
Published	2019-11-20
URL	https://arxiv.org/abs/1911.09075v1
PDF	https://arxiv.org/pdf/1911.09075v1.pdf
PWC	https://paperswithcode.com/paper/real-time-emotion-recognition-via-attention
Repo	https://github.com/wxjiao/AGHMN
Framework	pytorch


A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism


Title	A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism
Authors	Zhiri Tang, Ruohua Zhu, Peng Lin, Jin He, Hao Wang, Qijun Huang, Sheng Chang, Qiming Ma
Abstract	Memristive neural networks (MNNs), which use memristors as neurons or synapses, have recently become a hot research topic. However, most memristors are not compatible with mainstream integrated circuit technology, and their stability at large scale is still poor. In this paper, a hardware-friendly MNN circuit is introduced, in which the memristive characteristics are implemented by a digital integrated circuit. Through this method, spike-timing-dependent plasticity (STDP) and unsupervised learning are realized. A weight sharing mechanism is proposed to bridge the gap between network scale and hardware resources. Experimental results show that it significantly reduces hardware resource usage while maintaining good recognition accuracy and high speed. Moreover, resource usage grows more slowly than the network scale, which suggests the method’s potential for realizing large-scale neuromorphic networks.
Tasks
Published	2019-01-01
URL	http://arxiv.org/abs/1901.00100v1
PDF	http://arxiv.org/pdf/1901.00100v1.pdf
PWC	https://paperswithcode.com/paper/a-hardware-friendly-unsupervised-memristive
Repo	https://github.com/GerinTang/InnovateFPGA2018_PR039
Framework	none

Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction


Title	Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
Authors	Kan Ren, Jiarui Qin, Yuchen Fang, Weinan Zhang, Lei Zheng, Weijie Bian, Guorui Zhou, Jian Xu, Yong Yu, Xiaoqiang Zhu, Kun Gai
Abstract	User response prediction, which models the user preference w.r.t. the presented items, plays a key role in online services. After two decades of rapid development, the accumulated user behavior sequences on mature Internet service platforms have become extremely long, dating back to each user’s first registration. Each user not only has intrinsic tastes, but also keeps changing her personal interests over her lifetime. Hence, it is challenging to handle such lifelong sequential modeling for each individual user. Existing methodologies for sequential modeling are only capable of dealing with relatively recent user behaviors, which leaves considerable room for modeling long-term, and especially lifelong, sequential patterns to facilitate user modeling. Moreover, one user’s behavior may be accounted for by various previous behaviors within her whole online activity history, i.e., long-term dependency with multi-scale sequential patterns. To tackle these challenges, we propose a Hierarchical Periodic Memory Network for lifelong sequential modeling with personalized memorization of sequential patterns for each user. The model also adopts a hierarchical and periodical updating mechanism to capture multi-scale sequential patterns of user interests while supporting the evolving user behavior logs. Experimental results on three large-scale real-world datasets demonstrate the advantages of the proposed model, with significant improvement in user response prediction performance over state-of-the-art methods.
Tasks
Published	2019-05-02
URL	https://arxiv.org/abs/1905.00758v2
PDF	https://arxiv.org/pdf/1905.00758v2.pdf
PWC	https://paperswithcode.com/paper/lifelong-sequential-modeling-with
Repo	https://github.com/alimamarankgroup/HPMN
Framework	tf

DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence


Title	DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence
Authors	Edvinas Byla, Wei Pang
Abstract	In this paper we propose DeepSwarm, a novel neural architecture search (NAS) method based on Swarm Intelligence principles. At its core DeepSwarm uses Ant Colony Optimization (ACO) to generate ant population which uses the pheromone information to collectively search for the best neural architecture. Furthermore, by using local and global pheromone update rules our method ensures the balance between exploitation and exploration. On top of this, to make our method more efficient we combine progressive neural architecture search with weight reusability. Furthermore, due to the nature of ACO our method can incorporate heuristic information which can further speed up the search process. After systematic and extensive evaluation, we discover that on three different datasets (MNIST, Fashion-MNIST, and CIFAR-10) when compared to existing systems our proposed method demonstrates competitive performance. Finally, we open source DeepSwarm as a NAS library and hope it can be used by more deep learning researchers and practitioners.
Tasks	Neural Architecture Search
Published	2019-05-17
URL	https://arxiv.org/abs/1905.07350v1
PDF	https://arxiv.org/pdf/1905.07350v1.pdf
PWC	https://paperswithcode.com/paper/deepswarm-optimising-convolutional-neural
Repo	https://github.com/Pattio/DeepSwarm
Framework	tf
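
The interplay of local and global pheromone updates mentioned in the abstract can be sketched on a toy search space. This is a generic Ant Colony System style loop, not the DeepSwarm library's code: the layer names, `toy_fitness` (a stand-in for validation accuracy), and all parameter values are hypothetical.

```python
import random

LAYER_CHOICES = ["conv3", "conv5", "pool"]  # hypothetical layer types
DEPTH = 3                                   # layers per candidate architecture
TAU0 = 0.1                                  # initial pheromone level

def choose(pheromone, greediness, rng):
    """Exploit the strongest trail with probability `greediness`, else sample by pheromone."""
    if rng.random() < greediness:
        return max(pheromone, key=pheromone.get)
    options = list(pheromone)
    return rng.choices(options, weights=[pheromone[o] for o in options], k=1)[0]

def local_update(pheromone, choice, decay=0.1, tau0=TAU0):
    """Pull a just-visited option back toward tau0 so later ants explore alternatives."""
    pheromone[choice] = (1 - decay) * pheromone[choice] + decay * tau0

def global_update(pheromone, choice, reward, evaporation=0.1):
    """Reinforce an option on the best path in proportion to its reward."""
    pheromone[choice] = (1 - evaporation) * pheromone[choice] + evaporation * reward

def toy_fitness(path):
    """Stand-in for training and evaluating a network: fraction of conv3 layers."""
    return path.count("conv3") / len(path)

def search(n_ants=10, n_generations=20, greediness=0.5, seed=0):
    rng = random.Random(seed)
    trails = [{c: TAU0 for c in LAYER_CHOICES} for _ in range(DEPTH)]
    best_path, best_score = None, -1.0
    for _ in range(n_generations):
        for _ in range(n_ants):
            path = []
            for level in trails:
                pick = choose(level, greediness, rng)
                local_update(level, pick)
                path.append(pick)
            score = toy_fitness(path)
            if score > best_score:
                best_path, best_score = path, score
        # Only the best path found so far receives the global reinforcement.
        for level, pick in zip(trails, best_path):
            global_update(level, pick, reward=best_score)
    return best_path, best_score, trails

best_path, best_score, trails = search()
print(best_path, best_score)
```

Raising `greediness` shifts the balance toward exploitation of strong trails, while a larger local `decay` pushes ants away from recently used choices.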

On the Evaluation of Conditional GANs


Title	On the Evaluation of Conditional GANs
Authors	Terrance DeVries, Adriana Romero, Luis Pineda, Graham W. Taylor, Michal Drozdzal
Abstract	Conditional Generative Adversarial Networks (cGANs) are finding increasingly widespread use in many application domains. Despite outstanding progress, quantitative evaluation of such models often involves multiple distinct metrics to assess different desirable properties, such as image quality, conditional consistency, and intra-conditioning diversity. In this setting, model benchmarking becomes a challenge, as each metric may indicate a different “best” model. In this paper, we propose the Frechet Joint Distance (FJD), which is defined as the Frechet distance between joint distributions of images and conditioning, allowing it to implicitly capture the aforementioned properties in a single metric. We conduct proof-of-concept experiments on a controllable synthetic dataset, which consistently highlight the benefits of FJD when compared to currently established metrics. Moreover, we use the newly introduced metric to compare existing cGAN-based models for a variety of conditioning modalities (e.g. class labels, object masks, bounding boxes, images, and text captions). We show that FJD can be used as a promising single metric for cGAN benchmarking and model selection. Code can be found at https://github.com/facebookresearch/fjd.
Tasks	Model Selection
Published	2019-07-11
URL	https://arxiv.org/abs/1907.08175v3
PDF	https://arxiv.org/pdf/1907.08175v3.pdf
PWC	https://paperswithcode.com/paper/on-the-evaluation-of-conditional-gans
Repo	https://github.com/facebookresearch/fjd
Framework	pytorch
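
A Fréchet distance over joint image/conditioning distributions can be sketched as follows. The sketch assumes, as is common for Fréchet-style metrics, that each joint distribution is approximated by a Gaussian, and that a joint sample is simply the concatenation of an image embedding and a conditioning embedding. The toy data, `joint_embedding`, and `toy_images` are illustrative assumptions and do not come from the official repository.

```python
import numpy as np

def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # Tr((cov_x cov_y)^(1/2)) is the sum of square roots of the eigenvalues of cov_x @ cov_y.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * trace_sqrt)

def joint_embedding(image_feats, cond_feats):
    """Assumed joint representation: concatenate image and conditioning embeddings."""
    return np.concatenate([image_feats, cond_feats], axis=1)

rng = np.random.default_rng(0)
n = 2000
W = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]])  # toy link from conditioning to image features

def toy_images(cond):
    """Toy 'image embeddings' that depend on the conditioning plus small noise."""
    return cond @ W + 0.1 * rng.standard_normal((len(cond), 3))

cond_real = rng.standard_normal((n, 2))
cond_gen = rng.standard_normal((n, 2))
real = joint_embedding(toy_images(cond_real), cond_real)
consistent = joint_embedding(toy_images(cond_gen), cond_gen)
# Same images, but paired with shuffled conditioning: the marginals match, the pairing does not.
inconsistent = joint_embedding(toy_images(cond_gen), rng.permutation(cond_gen))

fjd_consistent = frechet_distance(real, consistent)
fjd_inconsistent = frechet_distance(real, inconsistent)
print(fjd_consistent, fjd_inconsistent)
```

In this toy setup the shuffled pairing leaves the image and conditioning marginals unchanged, so a metric computed on images alone would not separate the two generators, whereas the joint distance penalizes the broken conditional consistency.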

Learning Priors in High-frequency Domain for Inverse Imaging Reconstruction


Title	Learning Priors in High-frequency Domain for Inverse Imaging Reconstruction
Authors	Zhuonan He, Jinjie Zhou, Dong Liang, Yuhao Wang, Qiegen Liu
Abstract	Ill-posed inverse problems in imaging have remained an active research topic for several decades, with new approaches constantly emerging. Recognizing that the popular dictionary learning and convolutional sparse coding both essentially model the high-frequency component of an image, which conveys most of the semantic information such as texture details, in this work we propose a novel multi-profile high-frequency transform-guided denoising autoencoder as prior (HF-DAEP). To achieve this goal, we first extract a set of multi-profile high-frequency components via a specific transformation and add artificial Gaussian noise to these high-frequency components to form training samples. Then, once the high-frequency prior information is learned, we incorporate it into the classical iterative reconstruction process using the proximal gradient descent technique. Preliminary results on highly under-sampled magnetic resonance imaging and sparse-view computed tomography reconstruction demonstrate that the proposed method can efficiently reconstruct feature details and presents advantages over state-of-the-art methods.
Tasks	Denoising, Dictionary Learning
Published	2019-10-23
URL	https://arxiv.org/abs/1910.11148v1
PDF	https://arxiv.org/pdf/1910.11148v1.pdf
PWC	https://paperswithcode.com/paper/learning-priors-in-high-frequency-domain-for
Repo	https://github.com/yqx7150/HFDAEP
Framework	none

Neural Machine Translating from Natural Language to SPARQL


Title	Neural Machine Translating from Natural Language to SPARQL
Authors	Xiaoyu Yin, Dagmar Gromann, Sebastian Rudolph
Abstract	SPARQL is a highly powerful query language for an ever-growing number of Linked Data resources and Knowledge Graphs. Using it requires a certain familiarity with the entities in the domain to be queried as well as expertise in the language’s syntax and semantics, none of which average human web users can be assumed to possess. To overcome this limitation, automatically translating natural language questions to SPARQL queries has been a vibrant field of research. However, to this date, the vast success of deep learning methods has not yet been fully propagated to this research problem. This paper contributes to filling this gap by evaluating the utilization of eight different Neural Machine Translation (NMT) models for the task of translating from natural language to the structured query language SPARQL. While highlighting the importance of high-quantity and high-quality datasets, the results show a dominance of a CNN-based architecture with a BLEU score of up to 98 and accuracy of up to 94%.
Tasks	Knowledge Graphs, Machine Translation
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09302v1
PDF	https://arxiv.org/pdf/1906.09302v1.pdf
PWC	https://paperswithcode.com/paper/neural-machine-translating-from-natural
Repo	https://github.com/AKSW/DBNQA
Framework	none

Integrals over Gaussians under Linear Domain Constraints


Title	Integrals over Gaussians under Linear Domain Constraints
Authors	Alexandra Gessner, Oindrila Kanjilal, Philipp Hennig
Abstract	Integrals of linearly constrained multivariate Gaussian densities are a frequent problem in machine learning and statistics, arising in tasks like generalized linear models and Bayesian optimization. Yet they are notoriously hard to compute, and to further complicate matters, the numerical values of such integrals may be very small. We present an efficient black-box algorithm that exploits geometry for the estimation of integrals over a small, truncated Gaussian volume, and to simulate therefrom. Our algorithm uses the Holmes-Diaconis-Ross (HDR) method combined with an analytic version of elliptical slice sampling (ESS). Adapted to the linear setting, ESS allows for rejection-free sampling, because intersections of ellipses and domain boundaries have closed-form solutions. The key idea of HDR is to decompose the integral into easier-to-compute conditional probabilities by using a sequence of nested domains. Remarkably, it allows for direct computation of the logarithm of the integral value and thus enables the computation of extremely small probability masses. We demonstrate the effectiveness of our tailored combination of HDR and ESS on high-dimensional integrals and on entropy search for Bayesian optimization.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09328v2
PDF	https://arxiv.org/pdf/1910.09328v2.pdf
PWC	https://paperswithcode.com/paper/integrals-over-gaussians-under-linear-domain
Repo	https://github.com/alpiges/LinConGauss
Framework	none
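
The nested-domain decomposition behind HDR can be illustrated in one dimension. The sketch below estimates the log of a tiny tail probability P(x > 5) under a standard normal as a sum of log conditional probabilities over nested domains x > t_k. It is a simplification under stated assumptions: the thresholds are fixed by hand, and exact inverse-CDF sampling of the truncated Gaussian stands in for the paper's analytic elliptical slice sampling; none of the names come from the LinConGauss repository.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_tail(lower, rng):
    """Draw x ~ N(0, 1) conditioned on x > lower, using exact inverse-CDF sampling."""
    if lower == -math.inf:
        return rng.gauss(0.0, 1.0)
    # By symmetry, sample -x from the lower tail, where floating-point precision is better.
    p = STD_NORMAL.cdf(-lower)
    u = p * (1.0 - rng.random())  # u lies in (0, p]
    return -STD_NORMAL.inv_cdf(u)

def log_tail_probability(thresholds, n_samples=20000, seed=0):
    """Estimate log P(x > thresholds[-1]) as a sum of log conditional probabilities."""
    rng = random.Random(seed)
    log_p = 0.0
    previous = -math.inf
    for level in thresholds:
        # Fraction of samples from the current domain that also fall in the next, smaller one.
        hits = sum(sample_tail(previous, rng) > level for _ in range(n_samples))
        if hits == 0:
            raise ValueError("no samples reached the next domain; use finer nesting")
        log_p += math.log(hits / n_samples)
        previous = level
    return log_p

estimate = log_tail_probability([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
exact = math.log(STD_NORMAL.cdf(-5.0))
print(estimate, exact)
```

Because each factor is a moderate conditional probability, the accumulated logarithm stays well behaved even when the probability itself is far too small to estimate by direct Monte Carlo with the same sample budget.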

Learning Conditional Deformable Templates with Convolutional Networks


Title	Learning Conditional Deformable Templates with Convolutional Networks
Authors	Adrian V. Dalca, Marianne Rakic, John Guttag, Mert R. Sabuncu
Abstract	We develop a learning framework for building deformable templates, which play a fundamental role in many image analysis and computational anatomy tasks. Conventional methods for template creation and image alignment to the template have undergone decades of rich technical development. In these frameworks, templates are constructed using an iterative process of template estimation and alignment, which is often computationally very expensive. Due in part to this shortcoming, most methods compute a single template for the entire population of images, or a few templates for specific sub-groups of the data. In this work, we present a probabilistic model and efficient learning strategy that yields either universal or conditional templates, jointly with a neural network that provides efficient alignment of the images to these templates. We demonstrate the usefulness of this method on a variety of domains, with a special focus on neuroimaging. This is particularly useful for clinical applications where a pre-existing template does not exist, or creating a new one with traditional methods can be prohibitively expensive. Our code and atlases are available online as part of the VoxelMorph library at http://voxelmorph.csail.mit.edu.
Tasks	Deformable Medical Image Registration, Medical Image Registration
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02738v2
PDF	https://arxiv.org/pdf/1908.02738v2.pdf
PWC	https://paperswithcode.com/paper/learning-conditional-deformable-templates
Repo	https://github.com/voxelmorph/voxelmorph
Framework	tf