January 28, 2020

Paper Group ANR 1057

Discovering topics with neural topic models built from PLSA assumptions. Frustratingly Poor Performance of Reading Comprehension Models on Non-adversarial Examples. Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models. The FRENK Datasets of Socially Unacceptable Discourse in Slovene and English. DeepCenterline: a Multi-task Fully Convolu …

Discovering topics with neural topic models built from PLSA assumptions


Title	Discovering topics with neural topic models built from PLSA assumptions
Authors	Sileye O. Ba
Abstract	In this paper we present a model for unsupervised topic discovery in text corpora. The proposed model uses documents, words, and topics lookup table embedding as neural network model parameters to build probabilities of words given topics, and probabilities of topics given documents. These probabilities are used to recover by marginalization probabilities of words given documents. For very large corpora where the number of documents can be in the order of billions, using a neural auto-encoder based document embedding is more scalable than using a lookup table embedding as classically done. We thus extended the lookup based document embedding model to a continuous auto-encoder based model. Our models are trained using probabilistic latent semantic analysis (PLSA) assumptions. We evaluated our models on six datasets with a rich variety of contents. Conducted experiments demonstrate that the proposed neural topic models are very effective in capturing relevant topics. Furthermore, considering the perplexity metric, conducted evaluation benchmarks show that our topic models outperform the latent Dirichlet allocation (LDA) model which is classically used to address topic discovery tasks.
Tasks	Document Embedding, Topic Models
Published	2019-11-25
URL	https://arxiv.org/abs/1911.10924v1
PDF	https://arxiv.org/pdf/1911.10924v1.pdf
PWC	https://paperswithcode.com/paper/discovering-topics-with-neural-topic-models-1
Repo
Framework
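
The marginalization described in this abstract, p(w|d) = sum over topics t of p(w|t) p(t|d), can be sketched in a few lines. This is only an illustration, not the author's implementation: the softmax-over-dot-products parameterization, the function names, and the toy embeddings are all assumptions made for the example.

```python
import math

def softmax(scores):
    """Turn a list of real scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def word_given_doc(doc_vec, topic_vecs, word_vecs):
    """p(w|d) = sum_t p(w|t) * p(t|d), with both factors built from embeddings."""
    # p(t|d): softmax over topic/document similarities (assumed parameterization).
    p_t_given_d = softmax([dot(t, doc_vec) for t in topic_vecs])
    # p(w|t): one softmax over the vocabulary per topic (assumed parameterization).
    p_w_given_t = [softmax([dot(w, t) for w in word_vecs]) for t in topic_vecs]
    return [
        sum(p_w_given_t[k][w] * p_t_given_d[k] for k in range(len(topic_vecs)))
        for w in range(len(word_vecs))
    ]

# Toy, made-up embeddings: 2 topics, 3 words, 2 dimensions.
topics = [[1.0, 0.0], [0.0, 1.0]]
words = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
doc = [0.5, -0.5]
p_w_d = word_given_doc(doc, topics, words)
```

Because each factor is a proper distribution, the mixture `p_w_d` also sums to one, and a document that leans toward the first topic gives more mass to the word aligned with that topic.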

Frustratingly Poor Performance of Reading Comprehension Models on Non-adversarial Examples


Title	Frustratingly Poor Performance of Reading Comprehension Models on Non-adversarial Examples
Authors	Soham Parikh, Ananya B. Sai, Preksha Nema, Mitesh M. Khapra
Abstract	When humans learn to perform a difficult task (say, reading comprehension (RC) over longer passages), it is typically the case that their performance improves significantly on an easier version of this task (say, RC over shorter passages). Ideally, we would want an intelligent agent to also exhibit such a behavior. However, on experimenting with state of the art RC models using the standard RACE dataset, we observe that this is not true. Specifically, we see counter-intuitive results wherein even when we show frustratingly easy examples to the model at test time, there is hardly any improvement in its performance. We refer to this as non-adversarial evaluation as opposed to adversarial evaluation. Such non-adversarial examples allow us to assess the utility of specialized neural components. For example, we show that even for easy examples where the answer is clearly embedded in the passage, the neural components designed for paying attention to relevant portions of the passage fail to serve their intended purpose. We believe that the non-adversarial dataset created as a part of this work would complement the research on adversarial evaluation and give a more realistic assessment of the ability of RC models. All the datasets and codes developed as a part of this work will be made publicly available.
Tasks	Reading Comprehension
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02665v1
PDF	http://arxiv.org/pdf/1904.02665v1.pdf
PWC	https://paperswithcode.com/paper/frustratingly-poor-performance-of-reading
Repo
Framework

Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models


Title	Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models
Authors	Guangyong Chen, Pengfei Chen, Chang-Yu Hsieh, Chee-Kong Lee, Benben Liao, Renjie Liao, Weiwen Liu, Jiezhong Qiu, Qiming Sun, Jie Tang, Richard Zemel, Shengyu Zhang
Abstract	We introduce a new molecular dataset, named Alchemy, for developing machine learning models useful in chemistry and material science. As of June 20th 2019, the dataset comprises 12 quantum mechanical properties of 119,487 organic molecules with up to 14 heavy atoms, sampled from the GDB MedChem database. The Alchemy dataset expands the volume and diversity of existing molecular datasets. Our extensive benchmarks of the state-of-the-art graph neural network models on Alchemy clearly manifest the usefulness of new data in validating and developing machine learning models for chemistry and material science. We further launch a contest to attract attention from researchers in the related fields. More details can be found on the contest website (https://alchemy.tencent.com). At the time of the benchmarking experiment, we had generated 119,487 molecules in our Alchemy dataset. More molecular samples have been generated since then. Hence, we provide a list of molecules used in the reported benchmarks.
Tasks
Published	2019-06-22
URL	https://arxiv.org/abs/1906.09427v1
PDF	https://arxiv.org/pdf/1906.09427v1.pdf
PWC	https://paperswithcode.com/paper/alchemy-a-quantum-chemistry-dataset-for
Repo
Framework

The FRENK Datasets of Socially Unacceptable Discourse in Slovene and English


Title	The FRENK Datasets of Socially Unacceptable Discourse in Slovene and English
Authors	Nikola Ljubešić, Darja Fišer, Tomaž Erjavec
Abstract	In this paper we present datasets of Facebook comment threads to mainstream media posts in Slovene and English developed inside the Slovene national project FRENK which cover two topics, migrants and LGBT, and are manually annotated for different types of socially unacceptable discourse (SUD). The main advantages of these datasets compared to the existing ones are identical sampling procedures, producing comparable data across languages and an annotation schema that takes into account six types of SUD and five targets at which SUD is directed. We describe the sampling and annotation procedures, and analyze the annotation distributions and inter-annotator agreements. We consider this dataset to be an important milestone in understanding and combating SUD for both languages.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02045v2
PDF	https://arxiv.org/pdf/1906.02045v2.pdf
PWC	https://paperswithcode.com/paper/the-frenk-datasets-of-socially-unacceptable
Repo
Framework

DeepCenterline: a Multi-task Fully Convolutional Network for Centerline Extraction


Title	DeepCenterline: a Multi-task Fully Convolutional Network for Centerline Extraction
Authors	Zhihui Guo, Junjie Bai, Yi Lu, Xin Wang, Kunlin Cao, Qi Song, Milan Sonka, Youbing Yin
Abstract	A novel centerline extraction framework is reported which combines an end-to-end trainable multi-task fully convolutional network (FCN) with a minimal path extractor. The FCN simultaneously computes centerline distance maps and detects branch endpoints. The method generates single-pixel-wide centerlines with no spurious branches. It handles arbitrary tree-structured objects with no prior assumption regarding depth of the tree or its bifurcation pattern. It is also robust to substantial scale changes across different parts of the target object and minor imperfections of the object’s segmentation mask. To the best of our knowledge, this is the first deep-learning based centerline extraction method that guarantees single-pixel-wide centerlines for a complex tree-structured object. The proposed method is validated in coronary artery centerline extraction on a dataset of 620 patients (400 of which were used as the test set). This application is challenging due to the large number of coronary branches, branch tortuosity, and large variations in length, thickness, shape, etc. The proposed method generates well-positioned centerlines, exhibits fewer missing branches, and is more robust in the presence of minor imperfections of the object segmentation mask. Compared to a state-of-the-art traditional minimal path approach, our method improves the patient-level success rate of centerline extraction from 54.3% to 88.8% according to independent human expert review.
Tasks	Semantic Segmentation
Published	2019-03-25
URL	http://arxiv.org/abs/1903.10481v1
PDF	http://arxiv.org/pdf/1903.10481v1.pdf
PWC	https://paperswithcode.com/paper/deepcenterline-a-multi-task-fully
Repo
Framework
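
The abstract pairs the network with a minimal path extractor. As a rough sketch of what a minimal path search looks like, the snippet below runs Dijkstra's algorithm on a small 2D cost grid. It is not the paper's extractor: the 4-connected grid, the idea that costs are low near the centerline, and the hand-written `cost_map` are assumptions for illustration only.

```python
import heapq

def minimal_path(cost, start, goal):
    """Dijkstra on a 4-connected grid; returns the cheapest start-to-goal path."""
    rows, cols = len(cost), len(cost[0])
    best = {start: cost[start[0]][start[1]]}
    parent = {}
    heap = [(best[start], start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == goal:
            break
        if dist > best[node]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    parent[(nr, nc)] = node
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]

# Made-up cost map: low values mark pixels assumed to lie near the centerline.
cost_map = [
    [1, 9, 9, 9],
    [1, 1, 1, 9],
    [9, 9, 1, 1],
]
path = minimal_path(cost_map, (0, 0), (2, 3))
```

Each grid cell appears at most once on the returned path, which is one way to see why a minimal path step yields a single-pixel-wide curve between two endpoints.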

W-Net: Reinforced U-Net for Density Map Estimation


Title	W-Net: Reinforced U-Net for Density Map Estimation
Authors	Varun Kannadi Valloli, Kinal Mehta
Abstract	Crowd management is of paramount importance when it comes to preventing stampedes and saving lives, especially in countries like China and India where the combined population is a third of the global population. Millions of people convene annually all around the nation to celebrate a myriad of events and crowd count estimation is the linchpin of the crowd management system that could prevent stampedes and save lives. We present a network for crowd counting which reports state of the art results on crowd counting benchmarks. Our contributions are, first, a U-Net inspired model which allows us to report state of the art results. Second, we propose an independent decoding Reinforcement branch which helps the network converge much earlier and also enables the network to estimate density maps with high Structural Similarity Index (SSIM). Third, we discuss the drawbacks of the contemporary architectures and empirically show that even though our architecture achieves state of the art results, the merit may be due to the encoder-decoder pipeline instead. Finally, we report the error analysis which shows that the contemporary line of work is at saturation and leaves certain prominent problems unsolved.
Tasks	Crowd Counting
Published	2019-03-27
URL	http://arxiv.org/abs/1903.11249v2
PDF	http://arxiv.org/pdf/1903.11249v2.pdf
PWC	https://paperswithcode.com/paper/w-net-reinforced-u-net-for-density-map
Repo
Framework

GLMNet: Graph Learning-Matching Networks for Feature Matching


Title	GLMNet: Graph Learning-Matching Networks for Feature Matching
Authors	Bo Jiang, Pengfei Sun, Jin Tang, Bin Luo
Abstract	Recently, graph convolutional networks (GCNs) have shown great potential for the task of graph matching. They can integrate graph node feature embedding, node-wise affinity learning and matching optimization together in a unified end-to-end model. One important aspect of graph matching is the construction of two matching graphs. However, the matching graphs we feed to existing graph convolutional matching networks are generally fixed and independent of graph matching, and thus are not guaranteed to be optimal for the graph matching task. Also, existing GCN matching methods employ several general smoothing-based graph convolutional layers to generate graph node embeddings, in which extensive smoothing convolution operations may dilute the desired discriminatory information of graph nodes. To overcome these issues, we propose a novel Graph Learning-Matching Network (GLMNet) for the graph matching problem. GLMNet has three main aspects. (1) It integrates graph learning into graph matching and thus adaptively learns a pair of optimal graphs that best serve the graph matching task. (2) It further employs a Laplacian sharpening convolutional module to generate more discriminative node embeddings for graph matching. (3) A new constraint regularized loss is designed for GLMNet training which can encode the desired one-to-one matching constraints in matching optimization. Experiments on two benchmarks demonstrate the effectiveness of GLMNet and advantages of its main modules.
Tasks	Graph Matching
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07681v1
PDF	https://arxiv.org/pdf/1911.07681v1.pdf
PWC	https://paperswithcode.com/paper/glmnet-graph-learning-matching-networks-for
Repo
Framework

Crowd Counting with Decomposed Uncertainty


Title	Crowd Counting with Decomposed Uncertainty
Authors	Min-hwan Oh, Peder A. Olsen, Karthikeyan Natesan Ramamurthy
Abstract	Research in neural networks in the field of computer vision has achieved remarkable accuracy for point estimation. However, the uncertainty in the estimation is rarely addressed. Uncertainty quantification accompanied by point estimation can lead to a more informed decision, and even improve the prediction quality. In this work, we focus on uncertainty estimation in the domain of crowd counting. We propose a scalable neural network framework with quantification of decomposed uncertainty using a bootstrap ensemble. We demonstrate that the proposed uncertainty quantification method provides additional insight to the crowd counting problem and is simple to implement. We also show that our proposed method outperforms the current state of the art method in many benchmark data sets. To the best of our knowledge, we have the best system for ShanghaiTech part A and B, UCF CC 50, UCSD, and UCF-QNRF datasets.
Tasks	Crowd Counting
Published	2019-03-15
URL	https://arxiv.org/abs/1903.07427v2
PDF	https://arxiv.org/pdf/1903.07427v2.pdf
PWC	https://paperswithcode.com/paper/crowd-counting-with-decomposed-uncertainty
Repo
Framework
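
One common way to decompose the predictive uncertainty of an ensemble uses the law of total variance: the average of the noise variances predicted by the members (often called aleatoric) plus the variance of the member means (often called epistemic). The sketch below assumes that each bootstrap member outputs a mean and a variance for the crowd count; that output format, the function name, and the numbers are assumptions, and the paper's exact formulation may differ.

```python
from statistics import mean, pvariance

def decompose_uncertainty(member_means, member_variances):
    """Split the predictive variance of an ensemble into two parts.

    aleatoric: average of the noise variances each member predicts
    epistemic: spread of the member means (disagreement between members)
    """
    aleatoric = mean(member_variances)
    epistemic = pvariance(member_means)
    return aleatoric, epistemic, aleatoric + epistemic

# Made-up outputs of four bootstrap members for one image (crowd counts).
means = [100.0, 104.0, 96.0, 100.0]
variances = [4.0, 6.0, 5.0, 5.0]
aleatoric, epistemic, total = decompose_uncertainty(means, variances)
```

With these toy numbers the members disagree more (variance 8.0 across means) than they individually expect from noise (average variance 5.0), which would point to model uncertainty as the larger contributor.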

DASPS: A Database for Anxious States based on a Psychological Stimulation


Title	DASPS: A Database for Anxious States based on a Psychological Stimulation
Authors	Asma Baghdadi, Yassine Aribi, Rahma Fourati, Najla Halouani, Patrick Siarry, Adel M. Alimi
Abstract	Anxiety affects human capabilities and behavior as much as it affects productivity and quality of life. It can be considered as the main cause of depression and suicide. Anxious states are easily detectable by humans due to their acquired cognition: humans interpret the interlocutor’s tone of speech, gestures, and facial expressions and recognize their mental state. There is a need for non-invasive, reliable techniques that perform the complex task of anxiety detection. In this paper, we present the DASPS database containing recorded Electroencephalogram (EEG) signals of 23 participants during anxiety elicitation by means of face-to-face psychological stimuli. EEG signals were captured with an Emotiv Epoc headset, as it is wireless, wearable, low-cost equipment. In our study, we investigate the impact of different parameters, notably: trial duration, feature type, feature combination and number of anxiety levels. Our findings showed that anxiety is well elicited in 1 second. For instance, a stacked sparse autoencoder with different types of features achieves 83.50% and 74.60% for 2 and 4 anxiety levels detection, respectively. The presented results prove the benefits of the use of a low-cost EEG headset instead of medical non-wireless devices and create a starting point for new research in the field of anxiety detection.
Tasks	EEG
Published	2019-01-09
URL	https://arxiv.org/abs/1901.02942v2
PDF	https://arxiv.org/pdf/1901.02942v2.pdf
PWC	https://paperswithcode.com/paper/dasps-a-database-for-anxious-states-based-on
Repo
Framework

Deep Image Feature Learning with Fuzzy Rules


Title	Deep Image Feature Learning with Fuzzy Rules
Authors	Xiang Ma, Zhaohong Deng, Peng Xu, Kup-Sze Choi, Dongrui Wu, Shitong Wang
Abstract	The methods of extracting image features are the key to many image processing tasks. At present, the most popular method is the deep neural network which can automatically extract robust features through end-to-end training instead of hand-crafted feature extraction. However, the deep neural network currently faces many challenges: 1) its effectiveness is heavily dependent on large datasets, so the computational complexity is very high; 2) it is usually regarded as a black box model with poor interpretability. To meet the above challenges, a more interpretable and scalable feature learning method, i.e., deep image feature learning with fuzzy rules (DIFL-FR), is proposed in the paper, which combines the rule-based fuzzy modeling technique and the deep stacked learning strategy. The method progressively learns image features in a layer-by-layer manner based on fuzzy rules, so the feature learning process can be better explained by the generated rules. More importantly, the learning process of the method is only based on forward propagation without back propagation and iterative learning, which results in high learning efficiency. In addition, the method is under the settings of unsupervised learning and can be easily extended to scenes of supervised and semi-supervised learning. Extensive experiments are conducted on image datasets of different scales. The results clearly show the effectiveness of the proposed method.
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10575v2
PDF	https://arxiv.org/pdf/1905.10575v2.pdf
PWC	https://paperswithcode.com/paper/deep-image-feature-learning-with-fuzzy-rules
Repo
Framework

Canonical Surface Mapping via Geometric Cycle Consistency


Title	Canonical Surface Mapping via Geometric Cycle Consistency
Authors	Nilesh Kulkarni, Abhinav Gupta, Shubham Tulsiani
Abstract	We explore the task of Canonical Surface Mapping (CSM). Specifically, given an image, we learn to map pixels on the object to their corresponding locations on an abstract 3D model of the category. But how do we learn such a mapping? A supervised approach would require extensive manual labeling which is not scalable beyond a few hand-picked categories. Our key insight is that the CSM task (pixel to 3D), when combined with 3D projection (3D to pixel), completes a cycle. Hence, we can exploit a geometric cycle consistency loss, thereby allowing us to forgo the dense manual supervision. Our approach allows us to train a CSM model for a diverse set of classes, without sparse or dense keypoint annotation, by leveraging only foreground mask labels for training. We show that our predictions also allow us to infer dense correspondence between two images, and compare the performance of our approach against several methods that predict correspondence by leveraging varying amounts of supervision.
Tasks
Published	2019-07-23
URL	https://arxiv.org/abs/1907.10043v2
PDF	https://arxiv.org/pdf/1907.10043v2.pdf
PWC	https://paperswithcode.com/paper/canonical-surface-mapping-via-geometric-cycle
Repo
Framework
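
The cycle in this abstract (pixel to 3D, then 3D back to pixel) suggests a loss that penalizes the gap between a pixel and the reprojection of its predicted 3D location. The following is a minimal sketch of that idea under strong assumptions: an orthographic camera that simply drops the depth coordinate, hypothetical function names, and hand-picked points. The paper's actual camera model and loss are not reproduced here.

```python
def project(point_3d, scale=1.0, shift=(0.0, 0.0)):
    """Orthographic camera (an assumption): drop z, then scale and shift."""
    x, y, _ = point_3d
    return (scale * x + shift[0], scale * y + shift[1])

def cycle_loss(pixels, predicted_points, scale=1.0, shift=(0.0, 0.0)):
    """Mean squared distance between each pixel and its reprojected 3D match."""
    total = 0.0
    for (u, v), p in zip(pixels, predicted_points):
        ru, rv = project(p, scale, shift)
        total += (ru - u) ** 2 + (rv - v) ** 2
    return total / len(pixels)

# Made-up data: the first prediction reprojects exactly, the second is off by 1 in v.
pixels = [(1.0, 2.0), (3.0, 4.0)]
points = [(1.0, 2.0, 5.0), (3.0, 5.0, 0.0)]
loss = cycle_loss(pixels, points)
```

The loss needs no keypoint labels: it only compares a pixel with where its own predicted 3D point lands after projection.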

A Comparison of Hybrid and End-to-End Models for Syllable Recognition


Title	A Comparison of Hybrid and End-to-End Models for Syllable Recognition
Authors	Sebastian P. Bayerl, Korbinian Riedhammer
Abstract	This paper presents a comparison of a traditional hybrid speech recognition system (Kaldi using WFST and TDNN with lattice-free MMI) and a lexicon-free end-to-end model (TensorFlow implementation of multi-layer LSTM with CTC training) for German syllable recognition on the Verbmobil corpus. The results show that explicitly modeling prior knowledge is still valuable in building recognition systems. With a strong language model (LM) based on syllables, the structured approach significantly outperforms the end-to-end model. The best word error rate (WER) regarding syllables was achieved using Kaldi with a 4-gram LM, modeling all syllables observed in the training set. It achieved 10.0% WER w.r.t. the syllables, compared to the end-to-end approach where the best WER was 27.53%. The work presented here has implications for building future recognition systems that operate independently of a large vocabulary, as typically used in tasks such as recognition of syllabic or agglutinative languages, out-of-vocabulary techniques, keyword search indexing and medical speech processing.
Tasks	Language Modelling, Speech Recognition
Published	2019-09-19
URL	https://arxiv.org/abs/1909.12232v1
PDF	https://arxiv.org/pdf/1909.12232v1.pdf
PWC	https://paperswithcode.com/paper/a-comparison-of-hybrid-and-end-to-end-models
Repo
Framework
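
The error rates quoted in this abstract are word error rates computed over syllable tokens. As a reminder of how the metric works, the sketch below computes WER as the Levenshtein distance between reference and hypothesis token sequences divided by the reference length. The function name and the example syllables are made up for illustration and do not come from the Verbmobil corpus; the function assumes a non-empty reference.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / len(reference)."""
    ref, hyp = list(reference), list(hypothesis)
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)

# Made-up syllable sequences; tokens are syllables rather than words.
ref = ["gu", "ten", "mor", "gen"]
hyp = ["gu", "ten", "mo", "gen", "da"]
wer = word_error_rate(ref, hyp)
```

Here one substitution and one insertion against four reference syllables give a WER of 0.5. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.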

Automated Word Stress Detection in Russian


Title	Automated Word Stress Detection in Russian
Authors	Maria Ponomareva, Kirill Milintsevich, Ekaterina Chernyak, Anatoly Starostin
Abstract	In this study we address the problem of automated word stress detection in Russian using character level models and no part-of-speech taggers. We use a simple bidirectional RNN with LSTM nodes and achieve an accuracy of 90% or higher. We experiment with two training datasets and show that using the data from an annotated corpus is much more efficient than using a dictionary, since it allows us to take into account word frequencies and the morphological context of the word.
Tasks
Published	2019-07-12
URL	https://arxiv.org/abs/1907.05757v1
PDF	https://arxiv.org/pdf/1907.05757v1.pdf
PWC	https://paperswithcode.com/paper/automated-word-stress-detection-in-russian-1
Repo
Framework

DeepCount: Crowd Counting with WiFi via Deep Learning


Title	DeepCount: Crowd Counting with WiFi via Deep Learning
Authors	Shangqing Liu, Yanchao Zhao, Fanggang Xue, Bing Chen, Xiang Chen
Abstract	Recently, research on wireless sensing has achieved more intelligent results, and intelligent sensing of human location and activity can be realized by means of WiFi devices. However, most current work on human environment perception is limited to single-person environments, because an environment in which multiple people are present is more complicated than one with a single person. In order to solve the problem of human behavior perception in a multi-human environment, we first propose a solution that achieves crowd counting (inferred population) using deep learning in a closed environment with WiFi signals, DeepCount, which is the first step in a multi-human environment. Since using WiFi to directly count the crowd is too complicated, we use deep learning to solve this problem: we use a Convolutional Neural Network (CNN) to automatically extract the relationship between the number of people and the channel, and a Long Short Term Memory (LSTM) network to resolve the dependencies between the number of people and Channel State Information (CSI). To overcome the massive labelled data required by deep learning methods, we add an online learning mechanism that uses an activity recognition model to determine whether someone is entering or leaving the room, so as to correct the deep learning model in the fine-tuning stage, which, in turn, reduces the required training data and makes our method evolve over time. The DeepCount system is implemented and evaluated on commercial WiFi devices. With massive training samples, our end-to-end learning approach can achieve an average of 86.4% prediction accuracy in an environment of up to 5 people. Meanwhile, by using the amendment mechanism of the activity recognition model to judge door switches, obtain the variation in the crowd, and amend the deep learning predictions, the accuracy reaches up to 90%.
Tasks	Activity Recognition, Crowd Counting
Published	2019-03-13
URL	http://arxiv.org/abs/1903.05316v1
PDF	http://arxiv.org/pdf/1903.05316v1.pdf
PWC	https://paperswithcode.com/paper/deepcount-crowd-counting-with-wifi-via-deep
Repo
Framework

On the Universality of Invariant Networks


Title	On the Universality of Invariant Networks
Authors	Haggai Maron, Ethan Fetaya, Nimrod Segol, Yaron Lipman
Abstract	Constraining linear layers in neural networks to respect symmetry transformations from a group $G$ is a common design principle for invariant networks that has found many applications in machine learning. In this paper, we consider a fundamental question that has received little attention to date: Can these networks approximate any (continuous) invariant function? We tackle the rather general case where $G\leq S_n$ (an arbitrary subgroup of the symmetric group) that acts on $\mathbb{R}^n$ by permuting coordinates. This setting includes several recent popular invariant networks. We present two main results: First, $G$-invariant networks are universal if high-order tensors are allowed. Second, there are groups $G$ for which higher-order tensors are unavoidable for obtaining universality. $G$-invariant networks consisting of only first-order tensors are of special interest due to their practical value. We conclude the paper by proving a necessary condition for the universality of $G$-invariant networks that incorporate only first-order tensors.
Tasks
Published	2019-01-27
URL	https://arxiv.org/abs/1901.09342v4
PDF	https://arxiv.org/pdf/1901.09342v4.pdf
PWC	https://paperswithcode.com/paper/on-the-universality-of-invariant-networks
Repo
Framework
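
To make the notion of a $G$-invariant function concrete, the sketch below builds one by averaging an arbitrary function over a finite permutation group acting on coordinates. This group averaging is a standard construction and is not the network architecture studied in the paper; the cyclic group of order 3, the function `f`, and the helper names are choices made only for this example.

```python
def act(perm, x):
    """Apply a permutation to the coordinates of x: result[i] = x[perm[i]]."""
    return tuple(x[p] for p in perm)

def symmetrize(f, group):
    """Average f over a finite permutation group, giving a G-invariant function."""
    def f_inv(x):
        return sum(f(act(g, x)) for g in group) / len(group)
    return f_inv

# G = cyclic group of order 3, a subgroup of S_3, listed explicitly.
cyclic_3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

def f(x):
    # An arbitrary non-invariant function, chosen only for illustration.
    return x[0] * x[1] ** 2

f_inv = symmetrize(f, cyclic_3)
x = (1.0, 2.0, 3.0)
shifted = act((1, 2, 0), x)  # a group element applied to x
swapped = (2.0, 1.0, 3.0)    # a transposition, which is not in the cyclic group
```

`f_inv` takes the same value on `x` and `shifted`, as invariance under $G$ requires, but a different value on `swapped`. Invariance under a subgroup $G \leq S_n$ is therefore a weaker requirement than invariance under all of $S_n$, which is why the choice of $G$ matters in the universality question.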