February 1, 2020


Paper Group AWR 249

OpenVSLAM: A Versatile Visual SLAM Framework. Time-warping invariants of multidimensional time series. A new evaluation framework for topic modeling algorithms based on synthetic corpora. Supervised online diarization with sample mean loss for multi-domain data. Learning Perspective Undistortion of Portraits. Scalable Kernel Learning via the Discri …

OpenVSLAM: A Versatile Visual SLAM Framework


Title	OpenVSLAM: A Versatile Visual SLAM Framework
Authors	Shinya Sumikura, Mikiya Shibuya, Ken Sakurada
Abstract	In this paper, we introduce OpenVSLAM, a visual SLAM framework with high usability and extensibility. Visual SLAM systems are essential for AR devices, autonomous control of robots and drones, etc. However, conventional open-source visual SLAM frameworks are not appropriately designed as libraries called from third-party programs. To overcome this situation, we have developed a novel visual SLAM framework. This software is designed to be easily used and extended. It incorporates several useful features and functions for research and development. OpenVSLAM is released at https://github.com/xdspacelab/openvslam under the 2-clause BSD license.
Tasks
Published	2019-10-02
URL	https://arxiv.org/abs/1910.01122v2
PDF	https://arxiv.org/pdf/1910.01122v2.pdf
PWC	https://paperswithcode.com/paper/openvslam-a-versatile-visual-slam-framework
Repo	https://github.com/xdspacelab/openvslam
Framework	none

Time-warping invariants of multidimensional time series


Title	Time-warping invariants of multidimensional time series
Authors	Joscha Diehl, Kurusch Ebrahimi-Fard, Nikolas Tapia
Abstract	In data science, one is often confronted with a time series representing measurements of some quantity of interest. Usually, as a first step, features of the time series need to be extracted. These are numerical quantities that aim to succinctly describe the data and to dampen the influence of noise. In some applications, these features are also required to satisfy some invariance properties. In this paper, we concentrate on time-warping invariants. We show that these correspond to a certain family of iterated sums of the increments of the time series, known as quasisymmetric functions in the mathematics literature. We present these invariant features in an algebraic framework, and we develop some of their basic properties.
Tasks	Time Series
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05823v2
PDF	https://arxiv.org/pdf/1906.05823v2.pdf
PWC	https://paperswithcode.com/paper/time-warping-invariants-of-multidimensional
Repo	https://github.com/diehlj/iterated-sums-signature-py
Framework	none
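
The invariance claim in the abstract above can be checked on a toy example. This is a minimal sketch, not the authors' code: it assumes a one-dimensional series, models time warping as repeating samples, and uses hypothetical helper names (`increments`, `iterated_sum`, `warp`). Repeating a sample only inserts zero increments, which contribute nothing to an iterated sum as long as every power is at least 1.

```python
from itertools import combinations


def increments(series):
    """Differences between consecutive samples."""
    return [b - a for a, b in zip(series, series[1:])]


def iterated_sum(series, powers):
    """Sum over i1 < ... < ik of prod_j d[i_j] ** powers[j].

    Assumption: every entry of `powers` is >= 1, so that zero
    increments (created by warping) contribute nothing.
    """
    d = increments(series)
    total = 0
    for idx in combinations(range(len(d)), len(powers)):
        term = 1
        for i, p in zip(idx, powers):
            term *= d[i] ** p
        total += term
    return total


def warp(series, repeats):
    """Repeat the i-th sample repeats[i] times (a discrete time warp)."""
    out = []
    for value, r in zip(series, repeats):
        out.extend([value] * r)
    return out


series = [1, 3, 2, 5]
warped = warp(series, [2, 1, 3, 1])  # [1, 1, 3, 2, 2, 2, 5]
```

Here `iterated_sum(series, (1, 1))` and `iterated_sum(warped, (1, 1))` agree, and the same holds for other power tuples such as `(2, 1)`.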

A new evaluation framework for topic modeling algorithms based on synthetic corpora


Title	A new evaluation framework for topic modeling algorithms based on synthetic corpora
Authors	Hanyu Shi, Martin Gerlach, Isabel Diersen, Doug Downey, Luis A. N. Amaral
Abstract	Topic models are in widespread use in natural language processing and beyond. Here, we propose a new framework for the evaluation of probabilistic topic modeling algorithms based on synthetic corpora containing an unambiguously defined ground truth topic structure. The major innovation of our approach is the ability to quantify the agreement between the planted and inferred topic structures by comparing the assigned topic labels at the level of the tokens. In experiments, our approach yields novel insights about the relative strengths of topic models as corpus characteristics vary, and the first evidence of an “undetectable phase” for topic models when the planted structure is weak. We also establish the practical relevance of the insights gained for synthetic corpora by predicting the performance of topic modeling algorithms in classification tasks in real-world corpora.
Tasks	Topic Models
Published	2019-01-28
URL	http://arxiv.org/abs/1901.09848v1
PDF	http://arxiv.org/pdf/1901.09848v1.pdf
PWC	https://paperswithcode.com/paper/a-new-evaluation-framework-for-topic-modeling
Repo	https://github.com/amarallab/synthetic_benchmark_topic_model
Framework	none
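
To make the idea of token-level agreement concrete, the sketch below compares planted and inferred topic labels token by token. The abstract does not name the agreement measure, so normalized mutual information is used here purely as an assumed, illustrative choice; the function names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two aligned label sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_mutual_information(planted, inferred):
    """NMI in [0, 1]; it ignores how the topics happen to be numbered."""
    h_p, h_i = entropy(planted), entropy(inferred)
    if h_p == 0 and h_i == 0:
        return 1.0  # both labelings are trivial and therefore identical
    return 2 * mutual_information(planted, inferred) / (h_p + h_i)


planted = [0, 0, 1, 1, 2, 2]       # one planted topic label per token
relabeled = [5, 5, 7, 7, 9, 9]     # same partition, different topic ids
collapsed = [0, 0, 0, 0, 0, 0]     # the model found no structure
```

A relabeling of the planted topics scores 1, while a model that assigns every token to one topic scores 0.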

Supervised online diarization with sample mean loss for multi-domain data


Title	Supervised online diarization with sample mean loss for multi-domain data
Authors	Enrico Fini, Alessio Brutti
Abstract	Recently, a fully supervised speaker diarization approach was proposed (UIS-RNN) which models speakers using multiple instances of a parameter-sharing recurrent neural network. In this paper we propose qualitative modifications to the model that significantly improve the learning efficiency and the overall diarization performance. In particular, we introduce a novel loss function, which we call Sample Mean Loss, and we present a better modelling of the speaker turn behaviour by devising an analytical expression to compute the probability of a new speaker joining the conversation. In addition, we demonstrate that our model can be trained on fixed-length speech segments, removing the need for speaker change information in inference. Using x-vectors as input features, we evaluate our proposed approach on the multi-domain dataset employed in the DIHARD II challenge: our online method improves with respect to the original UIS-RNN and achieves similar performance to an offline agglomerative clustering baseline using PLDA scoring.
Tasks	Speaker Diarization
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01266v3
PDF	https://arxiv.org/pdf/1911.01266v3.pdf
PWC	https://paperswithcode.com/paper/supervised-online-diarization-with-sample
Repo	https://github.com/DonkeyShot21/uis-rnn-sml
Framework	pytorch

Learning Perspective Undistortion of Portraits


Title	Learning Perspective Undistortion of Portraits
Authors	Yajie Zhao, Zeng Huang, Tianye Li, Weikai Chen, Chloe LeGendre, Xinglei Ren, Jun Xing, Ari Shapiro, Hao Li
Abstract	Near-range portrait photographs often contain perspective distortion artifacts that bias human perception and challenge both facial recognition and reconstruction techniques. We present the first deep learning based approach to remove such artifacts from unconstrained portraits. In contrast to the previous state-of-the-art approach, our method handles even portraits with extreme perspective distortion, as we avoid the inaccurate and error-prone step of first fitting a 3D face model. Instead, we predict a distortion correction flow map that encodes a per-pixel displacement that removes distortion artifacts when applied to the input image. Our method also automatically infers missing facial features, i.e. occluded ears caused by strong perspective distortion, with coherent details. We demonstrate that our approach significantly outperforms the previous state-of-the-art both qualitatively and quantitatively, particularly for portraits with extreme perspective distortion or facial expressions. We further show that our technique benefits a number of fundamental tasks, significantly improving the accuracy of both face recognition and 3D reconstruction and enables a novel camera calibration technique from a single portrait. Moreover, we also build the first perspective portrait database with a large diversity in identities, expression and poses, which will benefit the related research in this area.
Tasks	3D Reconstruction, Calibration, Face Recognition
Published	2019-05-18
URL	https://arxiv.org/abs/1905.07515v1
PDF	https://arxiv.org/pdf/1905.07515v1.pdf
PWC	https://paperswithcode.com/paper/learning-perspective-undistortion-of
Repo	https://github.com/bearjoy730/Learning-Perspective-Undistortion-of-Portraits
Framework	none
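
The abstract above describes a flow map that stores a per-pixel displacement. The sketch below shows one way such a map could be applied to an image. It is an assumption-laden illustration, not the authors' pipeline: it assumes backward warping (each output pixel looks up a source pixel), nearest-neighbour sampling, and clamping at the borders, and `apply_flow` is a hypothetical name.

```python
def apply_flow(image, flow):
    """Backward-warp `image` with a per-pixel (dx, dy) displacement map.

    image: 2D list of pixel values, indexed as image[y][x].
    flow:  2D list of (dx, dy) tuples with the same shape as `image`.
    Assumption: output[y][x] is read from image[y + dy][x + dx],
    rounded to the nearest pixel and clamped to the image borders.
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = flow[y][x]
            src_x = min(max(x + round(dx), 0), width - 1)
            src_y = min(max(y + round(dy), 0), height - 1)
            row.append(image[src_y][src_x])
        out.append(row)
    return out


image = [[1, 2, 3],
         [4, 5, 6]]
zero_flow = [[(0, 0)] * 3 for _ in range(2)]
shift_flow = [[(1, 0)] * 3 for _ in range(2)]  # read one pixel to the right
```

A zero flow leaves the image unchanged, and a constant flow shifts it; a learned flow map would vary from pixel to pixel.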

Scalable Kernel Learning via the Discriminant Information


Title	Scalable Kernel Learning via the Discriminant Information
Authors	Mert Al, Zejiang Hou, Sun-Yuan Kung
Abstract	Kernel approximation methods create explicit, low-dimensional kernel feature maps to deal with the high computational and memory complexity of standard techniques. This work studies a supervised kernel learning methodology to optimize such mappings. We utilize the Discriminant Information criterion, a measure of class separability with a strong connection to Discriminant Analysis. By generalizing this measure to cover a wider range of kernel maps and learning settings, we develop scalable methods to learn kernel features with high discriminant power. Experimental results on several datasets showcase that our techniques can improve optimization and generalization performances over state of the art kernel learning methods.
Tasks
Published	2019-09-23
URL	https://arxiv.org/abs/1909.10432v2
PDF	https://arxiv.org/pdf/1909.10432v2.pdf
PWC	https://paperswithcode.com/paper/190910432
Repo	https://github.com/almert/di-kernel-optimization
Framework	tf
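
For readers unfamiliar with explicit kernel feature maps, the sketch below builds random Fourier features for the Gaussian kernel, a standard kernel approximation technique. It only illustrates the kind of mapping the abstract refers to; it does not implement the Discriminant Information criterion or the paper's learning procedure, and all names are hypothetical.

```python
import math
import random


def sample_frequencies(dim, num_features, seed=0):
    """Draw frequency vectors w ~ N(0, I) for a unit-bandwidth Gaussian kernel."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]


def feature_map(x, frequencies):
    """Explicit map z(x) with z(x) . z(y) ~= exp(-||x - y||^2 / 2)."""
    scale = 1.0 / math.sqrt(len(frequencies))
    projections = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in frequencies]
    return ([scale * math.cos(p) for p in projections]
            + [scale * math.sin(p) for p in projections])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(x, y):
    """Exact kernel value that the feature map approximates."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


frequencies = sample_frequencies(dim=2, num_features=2000)
x, y = [0.3, -0.2], [0.1, 0.4]
approx = dot(feature_map(x, frequencies), feature_map(y, frequencies))
exact = gaussian_kernel(x, y)
```

With a few thousand random features, `approx` is typically close to `exact`. A supervised approach such as the one described above would tune the map rather than leave it random.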

Normalizing Flows for Probabilistic Modeling and Inference


Title	Normalizing Flows for Probabilistic Modeling and Inference
Authors	George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan
Abstract	Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations. There has been much recent work on normalizing flows, ranging from improving their expressive power to expanding their application. We believe the field has now matured and is in need of a unified perspective. In this review, we attempt to provide such a perspective by describing flows through the lens of probabilistic modeling and inference. We place special emphasis on the fundamental principles of flow design, and discuss foundational topics such as expressive power and computational trade-offs. We also broaden the conceptual framing of flows by relating them to more general probability transformations. Lastly, we summarize the use of flows for tasks such as generative modeling, approximate inference, and supervised learning.
Tasks
Published	2019-12-05
URL	https://arxiv.org/abs/1912.02762v1
PDF	https://arxiv.org/pdf/1912.02762v1.pdf
PWC	https://paperswithcode.com/paper/normalizing-flows-for-probabilistic-modeling
Repo	https://github.com/mackelab/nflows
Framework	pytorch
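
The mechanism described in the abstract above (a simple base distribution pushed through bijective transformations) rests on the change-of-variables formula. The sketch below works it out for the simplest case: one-dimensional affine maps over a standard normal base. The class and function names are illustrative and not taken from any particular library.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2 * math.pi))


class AffineFlow:
    """Bijection x = scale * z + shift, with scale != 0."""

    def __init__(self, scale, shift):
        if scale == 0:
            raise ValueError("scale must be non-zero for the map to be invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, z):
        return self.scale * z + self.shift

    def inverse(self, x):
        return (x - self.shift) / self.scale

    def log_abs_det_jacobian(self):
        # d(forward)/dz is constant for an affine map.
        return math.log(abs(self.scale))


def log_prob(x, flows):
    """log p(x) = log p_base(z) - sum_k log|det J_k|, where z inverts all flows."""
    log_det = 0.0
    for flow in reversed(flows):  # undo the last transformation first
        x = flow.inverse(x)
        log_det += flow.log_abs_det_jacobian()
    return standard_normal_logpdf(x) - log_det


# z -> 3z + 1 turns N(0, 1) into N(1, 9).
single = [AffineFlow(3.0, 1.0)]
# z -> 2z + 1 -> 0.5 * (2z + 1) - 3 = z - 2.5 turns N(0, 1) into N(-2.5, 1).
composed = [AffineFlow(2.0, 1.0), AffineFlow(0.5, -3.0)]
```

Practical flows replace the affine map with richer bijections, but the density computation keeps this shape: invert, then subtract the accumulated log-determinants.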

CFEA: Collaborative Feature Ensembling Adaptation for Domain Adaptation in Unsupervised Optic Disc and Cup Segmentation


Title	CFEA: Collaborative Feature Ensembling Adaptation for Domain Adaptation in Unsupervised Optic Disc and Cup Segmentation
Authors	Peng Liu, Bin Kong, Zhongyu Li, Shaoting Zhang, Ruogu Fang
Abstract	Recently, deep neural networks have demonstrated performance comparable to, and even better than, that of board-certified ophthalmologists on well-annotated datasets. However, the diversity of retinal imaging devices poses a significant challenge: domain shift, which leads to performance degradation when applying the deep learning models to new testing domains. In this paper, we propose a novel unsupervised domain adaptation framework, called Collaborative Feature Ensembling Adaptation (CFEA), to effectively overcome this challenge. Our proposed CFEA is an interactive paradigm which presents an exquisite of collaborative adaptation through both adversarial learning and ensembling weights. In particular, we simultaneously achieve domain-invariance and maintain an exponential moving average of the historical predictions, which achieves a better prediction for the unlabeled data, via ensembling weights during training. Without annotating any sample from the target domain, multiple adversarial losses in encoder and decoder layers guide the extraction of domain-invariant features to confuse the domain classifier and meanwhile benefit the ensembling of smoothing weights. Comprehensive experimental results demonstrate that our CFEA model can overcome performance degradation and outperform the state-of-the-art methods in segmenting retinal optic disc and cup from fundus images. Code is available at https://github.com/cswin/AWC.
Tasks	Domain Adaptation, Unsupervised Domain Adaptation
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07638v1
PDF	https://arxiv.org/pdf/1910.07638v1.pdf
PWC	https://paperswithcode.com/paper/cfea-collaborative-feature-ensembling
Repo	https://github.com/cswin/AWC
Framework	pytorch
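
The weight ensembling mentioned in the abstract above relies on an exponential moving average. The sketch below shows the generic update rule on a flat list of weights. The decay value, the function name, and the flat-list representation are assumptions made for illustration; the authors' repository should be consulted for the actual training loop.

```python
def ema_update(ema_weights, current_weights, decay=0.99):
    """Return decay * ema + (1 - decay) * current, element by element."""
    return [
        decay * e + (1.0 - decay) * w
        for e, w in zip(ema_weights, current_weights)
    ]


# Hypothetical two-parameter model: the averaged copy trails the trained copy.
ema = [0.0, 2.0]
trained = [1.0, 2.0]
ema = ema_update(ema, trained, decay=0.5)   # first step moves halfway
ema = ema_update(ema, trained, decay=0.5)   # second step halves the gap again
```

A decay close to 1 makes the averaged weights change slowly, which smooths out step-to-step noise in the trained weights.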

Efficient Decoupled Neural Architecture Search by Structure and Operation Sampling


Title	Efficient Decoupled Neural Architecture Search by Structure and Operation Sampling
Authors	Heung-Chang Lee, Do-Guk Kim, Bohyung Han
Abstract	We propose a novel neural architecture search algorithm via reinforcement learning by decoupling structure and operation search processes. Our approach samples candidate models from the multinomial distribution on the policy vectors defined on the two search spaces independently. The proposed technique improves the efficiency of architecture search process significantly compared to the conventional methods based on reinforcement learning with the RNN controllers while achieving competitive accuracy and model size in target tasks. Our policy vectors are easily interpretable throughout the training procedure, which allows one to analyze the search progress and the discovered architectures; the black-box characteristics of the RNN controllers hamper understanding training progress in terms of policy parameter updates. Our experiments demonstrate outstanding performance compared to the state-of-the-art methods with a fraction of search cost.
Tasks	Neural Architecture Search
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10397v1
PDF	https://arxiv.org/pdf/1910.10397v1.pdf
PWC	https://paperswithcode.com/paper/efficient-decoupled-neural-architecture
Repo	https://github.com/logue311/EDNAS
Framework	pytorch
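
The decoupled sampling step described in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions: each policy vector is treated as a list of logits turned into probabilities with a softmax (the abstract does not say how the vectors are normalized), and the search-space layout and all names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(policy_vector, rng):
    """Draw one choice from the categorical distribution of a policy vector."""
    probs = softmax(policy_vector)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_architecture(structure_policies, operation_policies, rng):
    """Sample structure and operations independently of each other."""
    structure = [sample_index(p, rng) for p in structure_policies]
    operations = [sample_index(p, rng) for p in operation_policies]
    return structure, operations


rng = random.Random(0)
# Hypothetical search space: two nodes choosing an input, two choosing an op.
structure_policies = [[0.0, 0.0], [0.0, 0.0, 0.0]]
operation_policies = [[50.0, -50.0, -50.0], [50.0, -50.0, -50.0]]
structure, operations = sample_architecture(
    structure_policies, operation_policies, rng
)
```

Because each policy vector is an explicit distribution over choices, its entries can be read off directly during training, which is the interpretability advantage the abstract contrasts with RNN controllers.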

Semi-Supervised Semantic Segmentation with High- and Low-level Consistency


Title	Semi-Supervised Semantic Segmentation with High- and Low-level Consistency
Authors	Sudhanshu Mittal, Maxim Tatarchenko, Thomas Brox
Abstract	The ability to understand visual information from limited labeled data is an important aspect of machine learning. While image-level classification has been extensively studied in a semi-supervised setting, dense pixel-level classification with limited data has only drawn attention recently. In this work, we propose an approach for semi-supervised semantic segmentation that learns from limited pixel-wise annotated samples while exploiting additional annotation-free images. It uses two network branches that link semi-supervised classification with semi-supervised segmentation including self-training. The dual-branch approach reduces both the low-level and the high-level artifacts typical when training with few labels. The approach attains significant improvement over existing methods, especially when trained with very few labeled samples. On several standard benchmarks - PASCAL VOC 2012, PASCAL-Context, and Cityscapes - the approach achieves new state-of-the-art in semi-supervised learning.
Tasks	Semantic Segmentation, Semi-Supervised Semantic Segmentation
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05724v1
PDF	https://arxiv.org/pdf/1908.05724v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-semantic-segmentation-with
Repo	https://github.com/sud0301/semisup-semseg
Framework	pytorch

Unsupervised Domain Adaptation through Self-Supervision


Title	Unsupervised Domain Adaptation through Self-Supervision
Authors	Yu Sun, Eric Tzeng, Trevor Darrell, Alexei A. Efros
Abstract	This paper addresses unsupervised domain adaptation, the setting where labeled training data is available on a source domain, but the goal is to have good performance on a target domain with only unlabeled data. Like much of previous work, we seek to align the learned representations of the source and target domains while preserving discriminability. The way we accomplish alignment is by learning to perform auxiliary self-supervised task(s) on both domains simultaneously. Each self-supervised task brings the two domains closer together along the direction relevant to that task. Training this jointly with the main task classifier on the source domain is shown to successfully generalize to the unlabeled target domain. The presented objective is straightforward to implement and easy to optimize. We achieve state-of-the-art results on four out of seven standard benchmarks, and competitive results on segmentation adaptation. We also demonstrate that our method composes well with another popular pixel-level adaptation method.
Tasks	Domain Adaptation, Unsupervised Domain Adaptation
Published	2019-09-26
URL	https://arxiv.org/abs/1909.11825v2
PDF	https://arxiv.org/pdf/1909.11825v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-domain-adaptation-through-self-1
Repo	https://github.com/yueatsprograms/uda_release
Framework	pytorch

Computer-aided diagnosis in histopathological images of the endometrium using a convolutional neural network and attention mechanisms


Title	Computer-aided diagnosis in histopathological images of the endometrium using a convolutional neural network and attention mechanisms
Authors	Hao Sun, Xianxu Zeng, Tao Xu, Gang Peng, Yutao Ma
Abstract	Uterine cancer, also known as endometrial cancer, can seriously affect the female reproductive organs, and histopathological image analysis is the gold standard for diagnosing endometrial cancer. However, due to the limited capability of modeling the complicated relationships between histopathological images and their interpretations, these computer-aided diagnosis (CADx) approaches based on traditional machine learning algorithms often failed to achieve satisfying results. In this study, we developed a CADx approach using a convolutional neural network (CNN) and attention mechanisms, called HIENet. Because HIENet used the attention mechanisms and feature map visualization techniques, it can provide pathologists better interpretability of diagnoses by highlighting the histopathological correlations of local (pixel-level) image features to morphological characteristics of endometrial tissue. In the ten-fold cross-validation process, the CADx approach, HIENet, achieved a 76.91 ± 1.17% (mean ± s.d.) classification accuracy for four classes of endometrial tissue, namely normal endometrium, endometrial polyp, endometrial hyperplasia, and endometrial adenocarcinoma. Also, HIENet achieved an area-under-the-curve (AUC) of 0.9579 ± 0.0103 with an 81.04 ± 3.87% sensitivity and 94.78 ± 0.87% specificity in a binary classification task that detected endometrioid adenocarcinoma (Malignant). Besides, in the external validation process, HIENet achieved an 84.50% accuracy in the four-class classification task, and it achieved an AUC of 0.9829 with a 77.97% (95% CI, 65.27%-87.71%) sensitivity and 100% (95% CI, 97.42%-100.00%) specificity. In summary, the proposed CADx approach, HIENet, outperformed three human experts and four end-to-end CNN-based classifiers on this small-scale dataset composed of 3,500 hematoxylin and eosin (H&E) images regarding overall classification performance.
Tasks
Published	2019-04-24
URL	http://arxiv.org/abs/1904.10626v1
PDF	http://arxiv.org/pdf/1904.10626v1.pdf
PWC	https://paperswithcode.com/paper/computer-aided-diagnosis-in-histopathological
Repo	https://github.com/ssea-lab/DL4ETI
Framework	tf
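
The binary-task figures quoted in the abstract above (sensitivity, specificity, accuracy) follow standard confusion-matrix definitions. The sketch below computes them from label lists. The labels are made up for illustration and are not the paper's data; `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),   # share of malignant cases detected
        "specificity": tn / (tn + fp),   # share of benign cases cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

A specificity of 100%, as reported for the external validation set, corresponds to zero false positives in this computation.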

Evaluating Rewards for Question Generation Models


Title	Evaluating Rewards for Question Generation Models
Authors	Tom Hosking, Sebastian Riedel
Abstract	Recent approaches to question generation have used modifications to a Seq2Seq architecture inspired by advances in machine translation. Models are trained using teacher forcing to optimise only the one-step-ahead prediction. However, at test time, the model is asked to generate a whole sequence, causing errors to propagate through the generation process (exposure bias). A number of authors have proposed countering this bias by optimising for a reward that is less tightly coupled to the training data, using reinforcement learning. We optimise directly for quality metrics, including a novel approach using a discriminator learned directly from the training data. We confirm that policy gradient methods can be used to decouple training from the ground truth, leading to increases in the metrics used as rewards. We perform a human evaluation, and show that although these metrics have previously been assumed to be good proxies for question quality, they are poorly aligned with human judgement and the model simply learns to exploit the weaknesses of the reward source.
Tasks	Machine Translation, Policy Gradient Methods, Question Generation
Published	2019-02-28
URL	https://arxiv.org/abs/1902.11049v2
PDF	https://arxiv.org/pdf/1902.11049v2.pdf
PWC	https://paperswithcode.com/paper/evaluating-rewards-for-question-generation
Repo	https://github.com/bloomsburyai/question-generation
Framework	tf

Compositionality decomposed: how do neural networks generalise?


Title	Compositionality decomposed: how do neural networks generalise?
Authors	Dieuwke Hupkes, Verna Dankers, Mathijs Mul, Elia Bruni
Abstract	Despite a multitude of empirical studies, little consensus exists on whether neural networks are able to generalise compositionally, a controversy that, in part, stems from a lack of agreement about what it means for a neural model to be compositional. As a response to this controversy, we present a set of tests that provide a bridge between, on the one hand, the vast amount of linguistic and philosophical theory about compositionality of language and, on the other, the successful neural models of language. We collect different interpretations of compositionality and translate them into five theoretically grounded tests for models that are formulated on a task-independent level. In particular, we provide tests to investigate (i) if models systematically recombine known parts and rules (ii) if models can extend their predictions beyond the length they have seen in the training data (iii) if models’ composition operations are local or global (iv) if models’ predictions are robust to synonym substitutions and (v) if models favour rules or exceptions during training. To demonstrate the usefulness of this evaluation paradigm, we instantiate these five tests on a highly compositional data set which we dub PCFG SET and apply the resulting tests to three popular sequence-to-sequence models: a recurrent, a convolution-based and a transformer model. We provide an in-depth analysis of the results, which uncover the strengths and weaknesses of these three architectures and point to potential areas of improvement.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08351v2
PDF	https://arxiv.org/pdf/1908.08351v2.pdf
PWC	https://paperswithcode.com/paper/the-compositionality-of-neural-networks
Repo	https://github.com/i-machine-think/am-i-compositional
Framework	pytorch

Correlating Twitter Language with Community-Level Health Outcomes


Title	Correlating Twitter Language with Community-Level Health Outcomes
Authors	Arno Schneuwly, Ralf Grubenmann, Séverine Rion Logean, Mark Cieliebak, Martin Jaggi
Abstract	We study how language on social media is linked to diseases such as atherosclerotic heart disease (AHD), diabetes and various types of cancer. Our proposed model leverages state-of-the-art sentence embeddings, followed by a regression model and clustering, without the need for additional labelled data. It allows predicting community-level medical outcomes from language, and thereby potentially translating these to the individual level. The method is applicable to a wide range of target variables and allows us to discover known and potentially novel correlations of medical outcomes with life-style aspects and other socioeconomic risk factors.
Tasks	Sentence Embeddings
Published	2019-06-13
URL	https://arxiv.org/abs/1906.06465v2
PDF	https://arxiv.org/pdf/1906.06465v2.pdf
PWC	https://paperswithcode.com/paper/correlating-twitter-language-with-community
Repo	https://github.com/epfml/correlating-tweets
Framework	none