February 2, 2020

Paper Group AWR 31

tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification. CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking. Mixture Content Selection for Diverse Sequence Generation. Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots. A Wind of Change: …

tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification

Title tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification
Authors Blaž Škrlj, Matej Martinc, Jan Kralj, Nada Lavrač, Senja Pollak
Abstract The use of background knowledge remains largely unexploited in many text classification tasks. In this work, we explore word taxonomies as means for constructing new semantic features, which may improve the performance and robustness of the learned classifiers. We propose tax2vec, a parallel algorithm for constructing taxonomy based features, and demonstrate its use on six short-text classification problems, including gender, age and personality type prediction, drug effectiveness and side effect prediction, and news topic prediction. The experimental results indicate that the interpretable features constructed using tax2vec can notably improve the performance of classifiers; the constructed features, in combination with fast, linear classifiers tested against strong baselines, such as hierarchical attention neural networks, achieved comparable or better classification results on short documents. Further, tax2vec can also serve for extraction of corpus-specific keywords. Finally, we investigated the semantic space of potential features where we observe a similarity with the well known Zipf’s law.
Tasks Text Classification
Published 2019-02-01
URL http://arxiv.org/abs/1902.00438v2
PDF http://arxiv.org/pdf/1902.00438v2.pdf
PWC https://paperswithcode.com/paper/tax2vec-constructing-interpretable-features
Repo https://github.com/SkBlaz/tax2vec
Framework tf
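
The idea of turning a word taxonomy into document features can be illustrated with a minimal sketch. It is not the tax2vec implementation: the toy taxonomy, the function names, and the plain counting scheme are assumptions made for illustration, and the weighting and feature selection steps of the actual method are omitted.

```python
from collections import Counter

# Hypothetical toy taxonomy: each word maps to its direct hypernym.
TOY_TAXONOMY = {
    "aspirin": "drug",
    "ibuprofen": "drug",
    "drug": "substance",
    "headache": "symptom",
    "nausea": "symptom",
    "symptom": "condition",
}


def hypernyms(word, taxonomy):
    """Return all ancestors of `word` in the taxonomy, nearest first."""
    chain = []
    seen = set()
    # The `seen` set guards against accidental cycles in the taxonomy.
    while word in taxonomy and word not in seen:
        seen.add(word)
        word = taxonomy[word]
        chain.append(word)
    return chain


def taxonomy_features(document, taxonomy):
    """Count how often each hypernym is reached from the document's tokens."""
    counts = Counter()
    for token in document.lower().split():
        counts.update(hypernyms(token, taxonomy))
    return counts


features = taxonomy_features("Aspirin eased my headache", TOY_TAXONOMY)
```

Each count is a feature that is readable on its own ("this document mentions a drug"), which is the sense in which such features are interpretable.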

CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking

Title CobWeb: A Research Prototype for Exploring User Bias in Political Fact-Checking
Authors Anubrata Das, Kunjan Mehta, Matthew Lease
Abstract The effect of user bias in fact-checking has not been explored extensively from a user-experience perspective. We estimate the user bias as a function of the user’s perceived reputation of the news sources (e.g., a user with liberal beliefs may tend to trust liberal sources). We build an interface to communicate the role of estimated user bias in the context of a fact-checking task. We also explore the utility of helping users visualize their detected level of bias. 80% of the users of our system find that the presence of an indicator for user bias is useful in judging the veracity of a political claim.
Tasks
Published 2019-07-08
URL https://arxiv.org/abs/1907.03718v1
PDF https://arxiv.org/pdf/1907.03718v1.pdf
PWC https://paperswithcode.com/paper/cobweb-a-research-prototype-for-exploring
Repo https://github.com/anubrata/anubrata.github.io
Framework none

Mixture Content Selection for Diverse Sequence Generation

Title Mixture Content Selection for Diverse Sequence Generation
Authors Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
Abstract Generating diverse sequences is important in many NLP applications such as question generation or summarization that exhibit semantically one-to-many relationships between source and the target sequences. We present a method to explicitly separate diversification from generation using a general plug-and-play module (called SELECTOR) that wraps around and guides an existing encoder-decoder model. The diversification stage uses a mixture of experts to sample different binary masks on the source sequence for diverse content selection. The generation stage uses a standard encoder-decoder model given each selected content from the source sequence. Due to the non-differentiable nature of discrete sampling and the lack of ground truth labels for binary mask, we leverage a proxy for ground truth mask and adopt stochastic hard-EM for training. In question generation (SQuAD) and abstractive summarization (CNN-DM), our method demonstrates significant improvements in accuracy, diversity and training efficiency, including state-of-the-art top-1 accuracy in both datasets, 6% gain in top-5 accuracy, and 3.7 times faster training over a state of the art model. Our code is publicly available at https://github.com/clovaai/FocusSeq2Seq.
Tasks Abstractive Text Summarization, Document Summarization, Question Generation
Published 2019-09-04
URL https://arxiv.org/abs/1909.01953v1
PDF https://arxiv.org/pdf/1909.01953v1.pdf
PWC https://paperswithcode.com/paper/mixture-content-selection-for-diverse
Repo https://github.com/clovaai/FocusSeq2Seq
Framework pytorch

Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots

Title Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots
Authors Jia-Chen Gu, Zhen-Hua Ling, Xiaodan Zhu, Quan Liu
Abstract This paper proposes a dually interactive matching network (DIM) for presenting the personalities of dialogue agents in retrieval-based chatbots. This model develops from the interactive matching network (IMN) which models the matching degree between a context composed of multiple utterances and a response candidate. Compared with previous persona fusion approaches which enhance the representation of a context by calculating its similarity with a given persona, the DIM model adopts a dual matching architecture, which performs interactive matching between responses and contexts and between responses and personas respectively for ranking response candidates. Experimental results on PERSONA-CHAT dataset show that the DIM model outperforms its baseline model, i.e., IMN with persona fusion, by a margin of 14.5% and outperforms the current state-of-the-art model by a margin of 27.7% in terms of top-1 accuracy hits@1.
Tasks
Published 2019-08-16
URL https://arxiv.org/abs/1908.05859v3
PDF https://arxiv.org/pdf/1908.05859v3.pdf
PWC https://paperswithcode.com/paper/dually-interactive-matching-network-for
Repo https://github.com/JasonForJoy/DIM
Framework tf
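
The hits@1 metric quoted in the abstract can be sketched as follows. The function name and the input layout (one ranked candidate list per dialogue context) are assumptions for illustration, not the evaluation code of the DIM repository.

```python
def hits_at_k(ranked_candidates, gold_responses, k=1):
    """Fraction of contexts whose gold response is among the top-k ranked candidates."""
    if len(ranked_candidates) != len(gold_responses):
        raise ValueError("one gold response is required per ranking")
    hits = sum(
        1
        for ranking, gold in zip(ranked_candidates, gold_responses)
        if gold in ranking[:k]
    )
    return hits / len(gold_responses)


# Two hypothetical contexts, each with candidates sorted by model score.
rankings = [["r2", "r1", "r3"], ["r5", "r4", "r6"]]
gold = ["r2", "r4"]
score_top1 = hits_at_k(rankings, gold, k=1)
score_top2 = hits_at_k(rankings, gold, k=2)
```

With k=1 only the first context counts as a hit, while with k=2 both do.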

A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains

Title A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains
Authors Dominik Schlechtweg, Anna Hätty, Marco del Tredici, Sabine Schulte im Walde
Abstract We perform an interdisciplinary large-scale evaluation for detecting lexical semantic divergences in a diachronic and in a synchronic task: semantic sense changes across time, and semantic sense changes across domains. Our work addresses the superficialness and lack of comparison in assessing models of diachronic lexical change, by bringing together and extending benchmark models on a common state-of-the-art evaluation task. In addition, we demonstrate that the same evaluation task and modelling approaches can successfully be utilised for the synchronic detection of domain-specific sense divergences in the field of term extraction.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.02979v1
PDF https://arxiv.org/pdf/1906.02979v1.pdf
PWC https://paperswithcode.com/paper/a-wind-of-change-detecting-and-evaluating
Repo https://github.com/Garrafao/LSCDetection
Framework none

Large-Scale Long-Tailed Recognition in an Open World

Title Large-Scale Long-Tailed Recognition in an Open World
Authors Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
Abstract Real world data often have a long-tailed and open-ended distribution. A practical recognition system must classify among majority and minority classes, generalize from a few known instances, and acknowledge novelty upon a never seen instance. We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set which include head, tail, and open classes. OLTR must handle imbalanced classification, few-shot learning, and open-set recognition in one integrated algorithm, whereas existing classification approaches focus only on one aspect and deliver poorly over the entire class spectrum. The key challenges are how to share visual knowledge between head and tail classes and how to reduce confusion between tail and open classes. We develop an integrated OLTR algorithm that maps an image to a feature space such that visual concepts can easily relate to each other based on a learned metric that respects the closed-world classification while acknowledging the novelty of the open world. Our so-called dynamic meta-embedding combines a direct image feature and an associated memory feature, with the feature norm indicating the familiarity to known classes. On three large-scale OLTR datasets we curate from object-centric ImageNet, scene-centric Places, and face-centric MS1M data, our method consistently outperforms the state-of-the-art. Our code, datasets, and models enable future OLTR research and are publicly available at https://liuziwei7.github.io/projects/LongTail.html.
Tasks Few-Shot Learning, Open Set Learning
Published 2019-04-10
URL http://arxiv.org/abs/1904.05160v2
PDF http://arxiv.org/pdf/1904.05160v2.pdf
PWC https://paperswithcode.com/paper/large-scale-long-tailed-recognition-in-an
Repo https://github.com/zhmiao/OpenLongTailRecognition-OLTR
Framework pytorch

Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network

Title Real-Time Emotion Recognition via Attention Gated Hierarchical Memory Network
Authors Wenxiang Jiao, Michael R. Lyu, Irwin King
Abstract Real-time emotion recognition (RTER) in conversations is significant for developing emotionally intelligent chatting machines. Without the future context in RTER, it becomes critical to build the memory bank carefully for capturing historical context and summarize the memories appropriately to retrieve relevant information. We propose an Attention Gated Hierarchical Memory Network (AGHMN) to address the problems of prior work: (1) Commonly used convolutional neural networks (CNNs) for utterance feature extraction are less compatible in the memory modules; (2) Unidirectional gated recurrent units (GRUs) only allow each historical utterance to have context before it, preventing information propagation in the opposite direction; (3) The Soft Attention for summarizing loses the positional and ordering information of memories, regardless of how the memory bank is built. Particularly, we propose a Hierarchical Memory Network (HMN) with a bidirectional GRU (BiGRU) as the utterance reader and a BiGRU fusion layer for the interaction between historical utterances. For memory summarizing, we propose an Attention GRU (AGRU) where we utilize the attention weights to update the internal state of GRU. We further promote the AGRU to a bidirectional variant (BiAGRU) to balance the contextual information from recent memories and that from distant memories. We conduct experiments on two emotion conversation datasets with extensive analysis, demonstrating the efficacy of our AGHMN models.
Tasks Emotion Recognition
Published 2019-11-20
URL https://arxiv.org/abs/1911.09075v1
PDF https://arxiv.org/pdf/1911.09075v1.pdf
PWC https://paperswithcode.com/paper/real-time-emotion-recognition-via-attention
Repo https://github.com/wxjiao/AGHMN
Framework pytorch

A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism

Title A Hardware Friendly Unsupervised Memristive Neural Network with Weight Sharing Mechanism
Authors Zhiri Tang, Ruohua Zhu, Peng Lin, Jin He, Hao Wang, Qijun Huang, Sheng Chang, Qiming Ma
Abstract Memristive neural networks (MNNs), which use memristors as neurons or synapses, have become a hot research topic recently. However, most memristors are not compatible with mainstream integrated circuit technology, and their stability at large scale remains poor. In this paper, a hardware-friendly MNN circuit is introduced, in which the memristive characteristics are implemented by a digital integrated circuit. Through this method, spike-timing-dependent plasticity (STDP) and unsupervised learning are realized. A weight sharing mechanism is proposed to bridge the gap between network scale and hardware resources. Experimental results show that it significantly reduces hardware resource usage while maintaining good recognition accuracy and high speed. Moreover, resource usage grows more slowly than network scale, which suggests the potential of the method for realizing large-scale neuromorphic networks.
Tasks
Published 2019-01-01
URL http://arxiv.org/abs/1901.00100v1
PDF http://arxiv.org/pdf/1901.00100v1.pdf
PWC https://paperswithcode.com/paper/a-hardware-friendly-unsupervised-memristive
Repo https://github.com/GerinTang/InnovateFPGA2018_PR039
Framework none

Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction

Title Lifelong Sequential Modeling with Personalized Memorization for User Response Prediction
Authors Kan Ren, Jiarui Qin, Yuchen Fang, Weinan Zhang, Lei Zheng, Weijie Bian, Guorui Zhou, Jian Xu, Yong Yu, Xiaoqiang Zhu, Kun Gai
Abstract User response prediction, which models the user preference w.r.t. the presented items, plays a key role in online services. With two-decade rapid development, nowadays the cumulated user behavior sequences on mature Internet service platforms have become extremely long since the user’s first registration. Each user not only has intrinsic tastes, but also keeps changing her personal interests during lifetime. Hence, it is challenging to handle such lifelong sequential modeling for each individual user. Existing methodologies for sequential modeling are only capable of dealing with relatively recent user behaviors, which leaves huge space for modeling long-term especially lifelong sequential patterns to facilitate user modeling. Moreover, one user’s behavior may be accounted for various previous behaviors within her whole online activity history, i.e., long-term dependency with multi-scale sequential patterns. In order to tackle these challenges, in this paper, we propose a Hierarchical Periodic Memory Network for lifelong sequential modeling with personalized memorization of sequential patterns for each user. The model also adopts a hierarchical and periodical updating mechanism to capture multi-scale sequential patterns of user interests while supporting the evolving user behavior logs. The experimental results over three large-scale real-world datasets have demonstrated the advantages of our proposed model with significant improvement in user response prediction performance against the state-of-the-arts.
Tasks
Published 2019-05-02
URL https://arxiv.org/abs/1905.00758v2
PDF https://arxiv.org/pdf/1905.00758v2.pdf
PWC https://paperswithcode.com/paper/lifelong-sequential-modeling-with
Repo https://github.com/alimamarankgroup/HPMN
Framework tf

DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence

Title DeepSwarm: Optimising Convolutional Neural Networks using Swarm Intelligence
Authors Edvinas Byla, Wei Pang
Abstract In this paper we propose DeepSwarm, a novel neural architecture search (NAS) method based on Swarm Intelligence principles. At its core DeepSwarm uses Ant Colony Optimization (ACO) to generate ant population which uses the pheromone information to collectively search for the best neural architecture. Furthermore, by using local and global pheromone update rules our method ensures the balance between exploitation and exploration. On top of this, to make our method more efficient we combine progressive neural architecture search with weight reusability. Furthermore, due to the nature of ACO our method can incorporate heuristic information which can further speed up the search process. After systematic and extensive evaluation, we discover that on three different datasets (MNIST, Fashion-MNIST, and CIFAR-10) when compared to existing systems our proposed method demonstrates competitive performance. Finally, we open source DeepSwarm as a NAS library and hope it can be used by more deep learning researchers and practitioners.
Tasks Neural Architecture Search
Published 2019-05-17
URL https://arxiv.org/abs/1905.07350v1
PDF https://arxiv.org/pdf/1905.07350v1.pdf
PWC https://paperswithcode.com/paper/deepswarm-optimising-convolutional-neural
Repo https://github.com/Pattio/DeepSwarm
Framework tf
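
The local and global pheromone updates mentioned in the abstract can be sketched with the standard Ant Colony System rules. This is a minimal illustration, not DeepSwarm's code: the search graph, the parameter values, and the reward of 0.9 (standing in for a validation accuracy) are all hypothetical.

```python
import random


def choose_next(pheromone, node, rng):
    """Pick a successor of `node` with probability proportional to its pheromone."""
    successors = list(pheromone[node])
    weights = [pheromone[node][s] for s in successors]
    return rng.choices(successors, weights=weights, k=1)[0]


def local_update(tau, rho, tau0):
    """Pull a traversed edge back towards the initial level tau0 (exploration)."""
    return (1.0 - rho) * tau + rho * tau0


def global_update(tau, rho, reward):
    """Reinforce an edge of the best path with its reward (exploitation)."""
    return (1.0 - rho) * tau + rho * reward


# Hypothetical search graph: every edge starts at pheromone level 0.1.
pheromone = {
    "input": {"conv3x3": 0.1, "conv5x5": 0.1},
    "conv3x3": {"output": 0.1},
    "conv5x5": {"output": 0.1},
}

rng = random.Random(0)
node, path = "input", ["input"]
while node != "output":
    nxt = choose_next(pheromone, node, rng)
    pheromone[node][nxt] = local_update(pheromone[node][nxt], rho=0.1, tau0=0.1)
    path.append(nxt)
    node = nxt

# Treat this single ant's path as the best one and reinforce it globally.
for a, b in zip(path, path[1:]):
    pheromone[a][b] = global_update(pheromone[a][b], rho=0.1, reward=0.9)
```

The local rule makes an edge slightly less attractive to the ants that follow, while the global rule concentrates pheromone on the best architecture found so far.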

On the Evaluation of Conditional GANs

Title On the Evaluation of Conditional GANs
Authors Terrance DeVries, Adriana Romero, Luis Pineda, Graham W. Taylor, Michal Drozdzal
Abstract Conditional Generative Adversarial Networks (cGANs) are finding increasingly widespread use in many application domains. Despite outstanding progress, quantitative evaluation of such models often involves multiple distinct metrics to assess different desirable properties, such as image quality, conditional consistency, and intra-conditioning diversity. In this setting, model benchmarking becomes a challenge, as each metric may indicate a different “best” model. In this paper, we propose the Frechet Joint Distance (FJD), which is defined as the Frechet distance between joint distributions of images and conditioning, allowing it to implicitly capture the aforementioned properties in a single metric. We conduct proof-of-concept experiments on a controllable synthetic dataset, which consistently highlight the benefits of FJD when compared to currently established metrics. Moreover, we use the newly introduced metric to compare existing cGAN-based models for a variety of conditioning modalities (e.g. class labels, object masks, bounding boxes, images, and text captions). We show that FJD can be used as a promising single metric for cGAN benchmarking and model selection. Code can be found at https://github.com/facebookresearch/fjd.
Tasks Model Selection
Published 2019-07-11
URL https://arxiv.org/abs/1907.08175v3
PDF https://arxiv.org/pdf/1907.08175v3.pdf
PWC https://paperswithcode.com/paper/on-the-evaluation-of-conditional-gans
Repo https://github.com/facebookresearch/fjd
Framework pytorch

Learning Priors in High-frequency Domain for Inverse Imaging Reconstruction

Title Learning Priors in High-frequency Domain for Inverse Imaging Reconstruction
Authors Zhuonan He, Jinjie Zhou, Dong Liang, Yuhao Wang, Qiegen Liu
Abstract Ill-posed inverse problems in imaging have remained an active research topic for several decades, with new approaches constantly emerging. Recognizing that the popular dictionary learning and convolutional sparse coding are both essentially modeling the high-frequency component of an image, which conveys most of the semantic information such as texture details, in this work we propose a novel multi-profile high-frequency transform-guided denoising autoencoder as prior (HF-DAEP). To achieve this goal, we first extract a set of multi-profile high-frequency components via a specific transformation and add artificial Gaussian noise to these high-frequency components as training samples. Then, as the high-frequency prior information is learned, we incorporate it into the classical iterative reconstruction process via the proximal gradient descent technique. Preliminary results on highly under-sampled magnetic resonance imaging and sparse-view computed tomography reconstruction demonstrate that the proposed method can efficiently reconstruct feature details and present advantages over state-of-the-art methods.
Tasks Denoising, Dictionary Learning
Published 2019-10-23
URL https://arxiv.org/abs/1910.11148v1
PDF https://arxiv.org/pdf/1910.11148v1.pdf
PWC https://paperswithcode.com/paper/learning-priors-in-high-frequency-domain-for
Repo https://github.com/yqx7150/HFDAEP
Framework none

Neural Machine Translating from Natural Language to SPARQL

Title Neural Machine Translating from Natural Language to SPARQL
Authors Xiaoyu Yin, Dagmar Gromann, Sebastian Rudolph
Abstract SPARQL is a highly powerful query language for an ever-growing number of Linked Data resources and Knowledge Graphs. Using it requires a certain familiarity with the entities in the domain to be queried as well as expertise in the language’s syntax and semantics, none of which average human web users can be assumed to possess. To overcome this limitation, automatically translating natural language questions to SPARQL queries has been a vibrant field of research. However, to this date, the vast success of deep learning methods has not yet been fully propagated to this research problem. This paper contributes to filling this gap by evaluating the utilization of eight different Neural Machine Translation (NMT) models for the task of translating from natural language to the structured query language SPARQL. While highlighting the importance of high-quantity and high-quality datasets, the results show a dominance of a CNN-based architecture with a BLEU score of up to 98 and accuracy of up to 94%.
Tasks Knowledge Graphs, Machine Translation
Published 2019-06-21
URL https://arxiv.org/abs/1906.09302v1
PDF https://arxiv.org/pdf/1906.09302v1.pdf
PWC https://paperswithcode.com/paper/neural-machine-translating-from-natural
Repo https://github.com/AKSW/DBNQA
Framework none

Integrals over Gaussians under Linear Domain Constraints

Title Integrals over Gaussians under Linear Domain Constraints
Authors Alexandra Gessner, Oindrila Kanjilal, Philipp Hennig
Abstract Integrals of linearly constrained multivariate Gaussian densities are a frequent problem in machine learning and statistics, arising in tasks like generalized linear models and Bayesian optimization. Yet they are notoriously hard to compute, and to further complicate matters, the numerical values of such integrals may be very small. We present an efficient black-box algorithm that exploits geometry for the estimation of integrals over a small, truncated Gaussian volume, and to simulate therefrom. Our algorithm uses the Holmes-Diaconis-Ross (HDR) method combined with an analytic version of elliptical slice sampling (ESS). Adapted to the linear setting, ESS allows for rejection-free sampling, because intersections of ellipses and domain boundaries have closed-form solutions. The key idea of HDR is to decompose the integral into easier-to-compute conditional probabilities by using a sequence of nested domains. Remarkably, it allows for direct computation of the logarithm of the integral value and thus enables the computation of extremely small probability masses. We demonstrate the effectiveness of our tailored combination of HDR and ESS on high-dimensional integrals and on entropy search for Bayesian optimization.
Tasks
Published 2019-10-21
URL https://arxiv.org/abs/1910.09328v2
PDF https://arxiv.org/pdf/1910.09328v2.pdf
PWC https://paperswithcode.com/paper/integrals-over-gaussians-under-linear-domain
Repo https://github.com/alpiges/LinConGauss
Framework none
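
The nested-domain decomposition behind HDR can be illustrated on a one-dimensional toy problem: estimating log P(X > 4) for a standard normal X as a sum of log conditional probabilities. This sketch is an assumption-laden simplification, not the LinConGauss code: it replaces elliptical slice sampling with inverse-CDF sampling (which only works in one dimension), and the thresholds and sample size are chosen by hand.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_tail(lower, rng):
    """Draw a standard-normal sample conditioned on x > lower (inverse-CDF method)."""
    p_lo = STD_NORMAL.cdf(lower)
    u = p_lo + (1.0 - p_lo) * rng.random()
    # Keep u strictly inside (0, 1), as required by inv_cdf.
    u = min(max(u, 2.0 ** -53), 1.0 - 2.0 ** -53)
    return STD_NORMAL.inv_cdf(u)


def log_tail_probability(thresholds, n_samples=20000, seed=0):
    """Estimate log P(X > thresholds[-1]) as a sum of log conditional probabilities."""
    rng = random.Random(seed)
    log_p = 0.0
    previous = None
    for t in thresholds:
        if previous is None:
            # First level: sample from the unconstrained Gaussian.
            samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        else:
            # Later levels: sample from the previous, larger domain.
            samples = [sample_tail(previous, rng) for _ in range(n_samples)]
        hits = sum(1 for x in samples if x > t)
        if hits == 0:
            raise ValueError("no sample reached the next domain; add thresholds")
        log_p += math.log(hits / n_samples)
        previous = t
    return log_p


estimate = log_tail_probability([0.0, 1.0, 2.0, 3.0, 4.0])
exact = math.log(1.0 - STD_NORMAL.cdf(4.0))
```

Because each conditional probability is moderate, every level is cheap to estimate, and summing the logarithms avoids ever representing the tiny final probability directly.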

Learning Conditional Deformable Templates with Convolutional Networks

Title Learning Conditional Deformable Templates with Convolutional Networks
Authors Adrian V. Dalca, Marianne Rakic, John Guttag, Mert R. Sabuncu
Abstract We develop a learning framework for building deformable templates, which play a fundamental role in many image analysis and computational anatomy tasks. Conventional methods for template creation and image alignment to the template have undergone decades of rich technical development. In these frameworks, templates are constructed using an iterative process of template estimation and alignment, which is often computationally very expensive. Due in part to this shortcoming, most methods compute a single template for the entire population of images, or a few templates for specific sub-groups of the data. In this work, we present a probabilistic model and efficient learning strategy that yields either universal or conditional templates, jointly with a neural network that provides efficient alignment of the images to these templates. We demonstrate the usefulness of this method on a variety of domains, with a special focus on neuroimaging. This is particularly useful for clinical applications where a pre-existing template does not exist, or creating a new one with traditional methods can be prohibitively expensive. Our code and atlases are available online as part of the VoxelMorph library at http://voxelmorph.csail.mit.edu.
Tasks Deformable Medical Image Registration, Medical Image Registration
Published 2019-08-07
URL https://arxiv.org/abs/1908.02738v2
PDF https://arxiv.org/pdf/1908.02738v2.pdf
PWC https://paperswithcode.com/paper/learning-conditional-deformable-templates
Repo https://github.com/voxelmorph/voxelmorph
Framework tf