Paper Group AWR 317
Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Title | Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data |
Authors | Andrew L. Beam, Benjamin Kompa, Allen Schmaltz, Inbar Fried, Griffin Weber, Nathan P. Palmer, Xu Shi, Tianxi Cai, Isaac S. Kohane |
Abstract | Word embeddings are a popular approach to unsupervised learning of word relationships that are widely used in natural language processing. In this article, we present a new set of embeddings for medical concepts learned using an extremely large collection of multimodal medical data. Leaning on recent theoretical insights, we demonstrate how an insurance claims database of 60 million members, a collection of 20 million clinical notes, and 1.7 million full text biomedical journal articles can be combined to embed concepts into a common space, resulting in the largest ever set of embeddings for 108,477 medical concepts. To evaluate our approach, we present a new benchmark methodology based on statistical power specifically designed to test embeddings of medical concepts. Our approach, called cui2vec, attains state-of-the-art performance relative to previous methods in most instances. Finally, we provide a downloadable set of pre-trained embeddings for other researchers to use, as well as an online tool for interactive exploration of the cui2vec embeddings. |
Tasks | Word Embeddings |
Published | 2018-04-04 |
URL | https://arxiv.org/abs/1804.01486v3 |
PDF | https://arxiv.org/pdf/1804.01486v3.pdf |
PWC | https://paperswithcode.com/paper/clinical-concept-embeddings-learned-from |
Repo | https://github.com/hscells/cui2vec |
Framework | none |
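The released cui2vec embeddings are, in essence, a table mapping UMLS concept identifiers (CUIs) to dense vectors. A minimal sketch of how one might load such a table and rank concepts by cosine similarity; the file name and whitespace-separated format here are assumptions, not the paper's exact release format:

```python
import numpy as np

def load_embeddings(path):
    """Load 'CUI dim1 dim2 ...' rows into a dict of unit-normalized vectors."""
    vecs = {}
    with open(path) as f:
        for line in f:
            cui, *vals = line.split()
            v = np.array(vals, dtype=float)
            vecs[cui] = v / np.linalg.norm(v)
    return vecs

def nearest(vecs, query_cui, k=5):
    """Rank all other concepts by cosine similarity to the query concept."""
    q = vecs[query_cui]
    scores = {c: float(q @ v) for c, v in vecs.items() if c != query_cui}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

# Usage (hypothetical file and CUI): nearest(load_embeddings("cui2vec.txt"), some_cui)
```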
Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation
Title | Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation |
Authors | John Benhardt, Peter Hase, Liuyi Zhu, Cynthia Rudin |
Abstract | We provide an approach for generating beautiful poetry. Our sonnet-generation algorithm includes several novel elements that improve over the state of the art, leading to metrical, rhyming poetry with many human-like qualities. These novel elements include in-line punctuation, part of speech restrictions, and more appropriate training corpora. Our work is the winner of the 2018 PoetiX Literary Turing Test Award for computer-generated poetry. |
Tasks | Sonnet Generation |
Published | 2018-11-13 |
URL | https://arxiv.org/abs/1811.05067v2 |
PDF | https://arxiv.org/pdf/1811.05067v2.pdf |
PWC | https://paperswithcode.com/paper/shall-i-compare-thee-to-a-machine-written |
Repo | https://github.com/peterbhase/poetry-generation |
Framework | tf |
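Meter and rhyme constraints of the kind the abstract mentions are typically enforced as filters over generated candidate lines. A toy sketch, assuming a tiny hand-coded pronunciation table; a real system would use a full pronouncing dictionary such as CMUdict, and the paper's actual filtering details are not reproduced here:

```python
# Toy phonetic table: word -> (syllable count, rhyme key).
PHON = {
    "day":       (1, "AY"),
    "may":       (1, "AY"),
    "temperate": (3, "ATE"),
    "date":      (1, "ATE"),
}

def line_ok(words, target_syllables=10):
    """Accept a line only if its syllable count matches the meter target."""
    try:
        return sum(PHON[w][0] for w in words) == target_syllables
    except KeyError:
        return False  # reject lines containing out-of-vocabulary words

def rhymes(w1, w2):
    """Two distinct words rhyme (crudely) if their rhyme keys match."""
    return PHON[w1][1] == PHON[w2][1] and w1 != w2

print(line_ok(["day"] * 10))   # True: ten one-syllable words
print(rhymes("day", "may"))    # True
print(rhymes("day", "date"))   # False
```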
Towards real-time unsupervised monocular depth estimation on CPU
Title | Towards real-time unsupervised monocular depth estimation on CPU |
Authors | Matteo Poggi, Filippo Aleotti, Fabio Tosi, Stefano Mattoccia |
Abstract | Unsupervised depth estimation from a single image is a very attractive technique with several implications in robotics, autonomous navigation, augmented reality and so on. This topic represents a very challenging task, and the advent of deep learning made it possible to tackle this problem with excellent results. However, these architectures are extremely deep and complex. Thus, real-time performance can be achieved only by leveraging power-hungry GPUs that do not allow depth maps to be inferred in application fields characterized by low-power constraints. To tackle this issue, in this paper we propose a novel architecture capable of quickly inferring an accurate depth map on a CPU, even of an embedded system, using a pyramid of features extracted from a single input image. Similarly to state-of-the-art methods, we train our network in an unsupervised manner, casting depth estimation as an image reconstruction problem. Extensive experimental results on the KITTI dataset show that, compared to the top performing approach, our network has similar accuracy but a much lower complexity (about 6% of the parameters), enabling it to infer a depth map for a KITTI image in about 1.7 s on the Raspberry Pi 3 and at more than 8 Hz on a standard CPU. Moreover, by trading accuracy for efficiency, our network allows maps to be inferred at about 2 Hz and 40 Hz respectively, while still being more accurate than most state-of-the-art slower methods. To the best of our knowledge, this is the first method enabling such performance on CPUs, paving the way for effective deployment of unsupervised monocular depth estimation even on embedded systems. |
Tasks | Autonomous Navigation, Depth Estimation, Image Reconstruction, Monocular Depth Estimation |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11430v3 |
PDF | http://arxiv.org/pdf/1806.11430v3.pdf |
PWC | https://paperswithcode.com/paper/towards-real-time-unsupervised-monocular |
Repo | https://github.com/00marco/pydnet-duplicate |
Framework | tf |
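The unsupervised objective treats depth as image reconstruction: predicted disparity warps one stereo view toward the other, and the photometric error supervises the network. A minimal numpy sketch of that idea with nearest-neighbor warping and a plain L1 loss; the published loss also includes SSIM and smoothness terms, omitted here:

```python
import numpy as np

def warp_horizontal(img, disparity):
    """Shift each pixel of the right image left by its predicted disparity
    (nearest-neighbor sampling; real implementations use bilinear sampling)."""
    h, w = img.shape
    cols = np.arange(w)[None, :].repeat(h, axis=0)
    src = np.clip(cols - np.round(disparity).astype(int), 0, w - 1)
    return img[np.arange(h)[:, None], src]

def photometric_loss(left, right, disparity):
    """L1 reconstruction error between the left image and the warped right image."""
    return float(np.mean(np.abs(left - warp_horizontal(right, disparity))))

left = np.random.rand(4, 8)
right = np.roll(left, -1, axis=1)  # toy right view: shifted by 1 pixel
print(photometric_loss(left, right, np.ones((4, 8))))  # near zero except the border column
```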
Integrating Transformer and Paraphrase Rules for Sentence Simplification
Title | Integrating Transformer and Paraphrase Rules for Sentence Simplification |
Authors | Sanqiang Zhao, Rui Meng, Daqing He, Saptono Andi, Parmanto Bambang |
Abstract | Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopt ideas from machine translation studies and implicitly learn simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture, and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature, and (2) through analysis of the rule utilization, the model seeks to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification. |
Tasks | Sentence Simplification |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11193v1 |
PDF | http://arxiv.org/pdf/1810.11193v1.pdf |
PWC | https://paperswithcode.com/paper/integrating-transformer-and-paraphrase-rules |
Repo | https://github.com/Sanqiang/text_simplification |
Framework | none |
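At a high level, integrating Simple PPDB means letting known complex-to-simple rules compete with the model's own preferences. A hedged sketch of that interplay with an illustrative two-rule table and a stand-in scorer; the paper's two actual integration mechanisms operate inside the Transformer and are not reproduced here:

```python
# Illustrative rule table in the spirit of Simple PPDB: complex -> (simple, confidence).
RULES = {"utilize": ("use", 0.95), "commence": ("start", 0.90)}

def simplify(tokens, score_fn):
    """Greedily apply a rule wherever the scorer prefers the simpler word.
    score_fn(tokens) -> float stands in for the trained model's likelihood."""
    out = list(tokens)
    for i, tok in enumerate(out):
        if tok in RULES:
            simple, conf = RULES[tok]
            cand = out[:i] + [simple] + out[i + 1:]
            if conf * score_fn(cand) >= score_fn(out):
                out = cand
    return out

# Toy scorer: prefer a shorter average word length.
score = lambda ts: 1.0 / (sum(map(len, ts)) / len(ts))
print(simplify("we utilize rules to commence work".split(), score))
# ['we', 'use', 'rules', 'to', 'start', 'work']
```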
Practical Text Classification With Large Pre-Trained Language Models
Title | Practical Text Classification With Large Pre-Trained Language Models |
Authors | Neel Kant, Raul Puri, Nikolai Yakovenko, Bryan Catanzaro |
Abstract | Multi-emotion sentiment classification is a natural language processing (NLP) problem with valuable use cases on real-world data. We demonstrate that large-scale unsupervised language modeling combined with finetuning offers a practical solution to this task on difficult datasets, including those with label class imbalance and domain-specific context. By training an attention-based Transformer network (Vaswani et al. 2017) on 40GB of text (Amazon reviews) (McAuley et al. 2015) and fine-tuning on the training set, our model achieves a 0.69 F1 score on the SemEval Task 1:E-c multi-dimensional emotion classification problem (Mohammad et al. 2018), based on the Plutchik wheel of emotions (Plutchik 1979). These results are competitive with state of the art models, including strong F1 scores on difficult (emotion) categories such as Fear (0.73), Disgust (0.77) and Anger (0.78), as well as competitive results on rare categories such as Anticipation (0.42) and Surprise (0.37). Furthermore, we demonstrate our application on a real world text classification task. We create a narrowly collected text dataset of real tweets on several topics, and show that our finetuned model outperforms general purpose commercially available APIs for sentiment and multidimensional emotion classification on this dataset by a significant margin. We also perform a variety of additional studies, investigating properties of deep learning architectures, datasets and algorithms for achieving practical multidimensional sentiment classification. Overall, we find that unsupervised language modeling and finetuning is a simple framework for achieving high quality results on real-world sentiment classification. |
Tasks | Emotion Classification, Language Modelling, Sentiment Analysis, Text Classification |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01207v1 |
PDF | http://arxiv.org/pdf/1812.01207v1.pdf |
PWC | https://paperswithcode.com/paper/practical-text-classification-with-large-pre |
Repo | https://github.com/NVIDIA/sentiment-discovery |
Framework | pytorch |
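The multi-label setup described here (one decision per Plutchik emotion) is conventionally trained with independent sigmoid outputs and binary cross-entropy. A minimal PyTorch sketch of such a classification head on top of encoder features; dimensions are placeholders, not the paper's configuration:

```python
import torch
import torch.nn as nn

NUM_EMOTIONS, FEAT_DIM = 11, 768  # SemEval Task 1:E-c has 11 emotion labels

head = nn.Linear(FEAT_DIM, NUM_EMOTIONS)
loss_fn = nn.BCEWithLogitsLoss()  # one independent binary decision per emotion

features = torch.randn(4, FEAT_DIM)                       # stand-in for encoder output
targets = torch.randint(0, 2, (4, NUM_EMOTIONS)).float()  # multi-hot emotion labels
loss = loss_fn(head(features), targets)
loss.backward()
print(loss.item())
```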
Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
Title | Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data |
Authors | Guokun Lai, Bohan Li, Guoqing Zheng, Yiming Yang |
Abstract | How to model the distribution of sequential data, including but not limited to speech and human motions, is an important ongoing research problem. It has been demonstrated that model capacity can be significantly enhanced by introducing stochastic latent variables in the hidden states of recurrent neural networks. Simultaneously, WaveNet, equipped with dilated convolutions, achieves astonishing empirical performance on the natural speech generation task. In this paper, we combine the ideas of stochastic latent variables and dilated convolutions, and propose a new architecture to model sequential data, termed Stochastic WaveNet, in which stochastic latent variables are injected into the WaveNet structure. We argue that Stochastic WaveNet enjoys powerful distribution modeling capacity and the advantage of parallel training from dilated convolutions. In order to efficiently infer the posterior distribution of the latent variables, a novel inference network structure is designed based on the characteristics of the WaveNet architecture. State-of-the-art performance on benchmark datasets is obtained by Stochastic WaveNet on natural speech modeling, and high quality human handwriting samples can be generated as well. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.06116v1 |
PDF | http://arxiv.org/pdf/1806.06116v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-wavenet-a-generative-latent |
Repo | https://github.com/laiguokun/SWaveNet |
Framework | pytorch |
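The two ingredients combined here are dilated causal convolutions (WaveNet) and stochastic latent variables injected into the hidden states. A simplified PyTorch sketch of one such block; the real model samples latents from a learned approximate posterior rather than a fixed Gaussian:

```python
import torch
import torch.nn as nn

class DilatedCausalBlock(nn.Module):
    """One WaveNet-style block: left-padded (causal) dilated convolution,
    with a sampled latent added to the hidden state, loosely illustrating
    the paper's injection of stochastic latent variables."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.dilation = dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)

    def forward(self, x):
        x_pad = nn.functional.pad(x, (self.dilation, 0))  # causal left padding
        h = torch.tanh(self.conv(x_pad))
        z = torch.randn_like(h)  # latent; a real model uses an inference network
        return h + 0.1 * z

seq = torch.randn(1, 16, 100)      # (batch, channels, time)
for d in (1, 2, 4, 8):             # stacked dilations widen the receptive field
    seq = DilatedCausalBlock(16, d)(seq)
print(seq.shape)                   # torch.Size([1, 16, 100])
```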
SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images
Title | SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images |
Authors | Yeonkun Lee, Jaeseok Jeong, Jongseob Yun, Wonjune Cho, Kuk-Jin Yoon |
Abstract | Omni-directional cameras have many advantages over conventional cameras in that they have a much wider field-of-view (FOV). Accordingly, several approaches have been proposed recently to apply convolutional neural networks (CNNs) to omni-directional images for various visual tasks. However, most of them use image representations defined in the Euclidean space after transforming the omni-directional views originally formed in the non-Euclidean space. This transformation leads to shape distortion due to nonuniform spatial resolving power and the loss of continuity. These effects make existing convolution kernels experience difficulties in extracting meaningful information. This paper presents a novel method to resolve such problems of applying CNNs to omni-directional images. The proposed method utilizes a spherical polyhedron to represent omni-directional views. This method minimizes the variance of the spatial resolving power on the sphere surface, and includes new convolution and pooling methods for the proposed representation. The proposed method can also be adopted by any existing CNN-based methods. The feasibility of the proposed method is demonstrated through classification, detection, and semantic segmentation tasks with synthetic and real datasets. |
Tasks | Semantic Segmentation |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08196v2 |
PDF | http://arxiv.org/pdf/1811.08196v2.pdf |
PWC | https://paperswithcode.com/paper/spherephd-applying-cnns-on-a-spherical |
Repo | https://github.com/keevin60907/SpherePHD |
Framework | tf |
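The spherical polyhedron in question is a geodesic refinement of an icosahedron: each triangular face is repeatedly split into four, with new vertices projected back onto the sphere, which keeps cell sizes nearly uniform. A numpy sketch of that subdivision step (the paper's convolution and pooling operators on this grid are not shown):

```python
import numpy as np

def subdivide(tri):
    """Split one spherical triangle into four by edge midpoints projected
    back onto the unit sphere (geodesic-polyhedron construction)."""
    a, b, c = tri
    norm = lambda v: v / np.linalg.norm(v)
    ab, bc, ca = norm(a + b), norm(b + c), norm(c + a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

# Start from one icosahedron face and refine it twice: 1 -> 4 -> 16 cells.
phi = (1 + 5 ** 0.5) / 2
v = [np.array(p, dtype=float) / np.linalg.norm(p)
     for p in [(-1, phi, 0), (1, phi, 0), (0, 1, phi)]]
faces = [tuple(v)]
for _ in range(2):
    faces = [t for f in faces for t in subdivide(f)]
print(len(faces))  # 16
```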
Dank Learning: Generating Memes Using Deep Neural Networks
Title | Dank Learning: Generating Memes Using Deep Neural Networks |
Authors | Abel L Peirson V, E Meltem Tolunay |
Abstract | We introduce a novel meme generation system, which given any image can produce a humorous and relevant caption. Furthermore, the system can be conditioned on not only an image but also a user-defined label relating to the meme template, giving a handle to the user on meme content. The system uses a pretrained Inception-v3 network to return an image embedding which is passed to an attention-based deep-layer LSTM model producing the caption - inspired by the widely recognised Show and Tell Model. We implement a modified beam search to encourage diversity in the captions. We evaluate the quality of our model using perplexity and human assessment on both the quality of memes generated and whether they can be differentiated from real ones. Our model produces original memes that cannot on the whole be differentiated from real ones. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.04510v1 |
PDF | http://arxiv.org/pdf/1806.04510v1.pdf |
PWC | https://paperswithcode.com/paper/dank-learning-generating-memes-using-deep |
Repo | https://github.com/alpv95/MemeProject |
Framework | tf |
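A "modified beam search to encourage diversity" can take several forms; one simple variant penalizes tokens a hypothesis has already used. The sketch below is an illustration of that idea, not the paper's exact modification, with a toy next-token distribution standing in for the LSTM decoder:

```python
import math

def diverse_beam_search(step_fn, start, width=3, length=5, penalty=0.5):
    """Beam search where a token's log-probability is reduced each time it
    already appears in the hypothesis, discouraging repetitive captions.
    step_fn(seq) -> {token: log_prob} stands in for the decoder."""
    beams = [(0.0, [start])]
    for _ in range(length):
        cands = []
        for score, seq in beams:
            for tok, lp in step_fn(seq).items():
                lp -= penalty * seq.count(tok)  # diversity penalty
                cands.append((score + lp, seq + [tok]))
        beams = sorted(cands, reverse=True)[:width]
    return beams[0][1]

# Toy distribution that always prefers "lol"; the penalty forces variety.
step = lambda seq: {"lol": math.log(0.6), "cat": math.log(0.3), "wow": math.log(0.1)}
print(diverse_beam_search(step, "<s>"))
```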
Matrix optimization on universal unitary photonic devices
Title | Matrix optimization on universal unitary photonic devices |
Authors | Sunil Pai, Ben Bartlett, Olav Solgaard, David A. B. Miller |
Abstract | Universal unitary photonic devices can apply arbitrary unitary transformations to a vector of input modes and provide a promising hardware platform for fast and energy-efficient machine learning using light. We simulate the gradient-based optimization of random unitary matrices on universal photonic devices composed of imperfect tunable interferometers. If device components are initialized uniform-randomly, the locally-interacting nature of the mesh components biases the optimization search space towards banded unitary matrices, limiting convergence to random unitary matrices. We detail a procedure for initializing the device by sampling from the distribution of random unitary matrices and show that this greatly improves convergence speed. We also explore mesh architecture improvements such as adding extra tunable beamsplitters or permuting waveguide layers to further improve the training speed and scalability of these devices. |
Tasks | |
Published | 2018-08-02 |
URL | https://arxiv.org/abs/1808.00458v3 |
PDF | https://arxiv.org/pdf/1808.00458v3.pdf |
PWC | https://paperswithcode.com/paper/matrix-optimization-on-universal-unitary |
Repo | https://github.com/solgaardlab/neurophox-notebooks |
Framework | tf |
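The building block of these meshes is the tunable Mach-Zehnder interferometer, whose 2x2 transfer matrix composes into arbitrary NxN unitaries. A numpy sketch under one common parameterization; sign and phase conventions vary between papers, so treat this as illustrative:

```python
import numpy as np

def mzi(theta, phi):
    """2x2 transfer matrix of one tunable Mach-Zehnder interferometer:
    two 50:50 beamsplitters around an internal phase shift, plus an
    external phase shifter. Meshes of these compose NxN unitaries."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)  # 50:50 beamsplitter
    inner = np.diag([np.exp(1j * theta), 1])        # internal phase shift
    outer = np.diag([np.exp(1j * phi), 1])          # external phase shift
    return outer @ bs @ inner @ bs

U = mzi(0.3, 1.1)
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True: the block is unitary
```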
Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
Title | Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces |
Authors | Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut Lavril, Maël Primet, Joseph Dureau |
Abstract | This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. The embedded inference is fast and accurate while enforcing privacy by design, as no personal user data is ever collected. Focusing on Automatic Speech Recognition and Natural Language Understanding, we detail our approach to training high-performance Machine Learning models that are small enough to run in real-time on small devices. Additionally, we describe a data generation procedure that provides sufficient, high-quality training data without compromising user privacy. |
Tasks | Speech Recognition, Spoken Language Understanding |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10190v3 |
PDF | http://arxiv.org/pdf/1805.10190v3.pdf |
PWC | https://paperswithcode.com/paper/snips-voice-platform-an-embedded-spoken |
Repo | https://github.com/snipsco/snips-nlu |
Framework | none |
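The accompanying snips-nlu library exposes a small training-and-parsing API. A sketch based on the repository's documented usage; the dataset JSON schema and exact API surface may differ across library versions:

```python
import io
import json

from snips_nlu import SnipsNLUEngine

# A dataset file in the Snips format: intents with example utterances and slots.
with io.open("dataset.json") as f:
    dataset = json.load(f)

engine = SnipsNLUEngine()
engine.fit(dataset)  # trains the intent classifier and slot filler locally

result = engine.parse("Turn the lights on in the kitchen")
print(json.dumps(result, indent=2))  # intent name plus extracted slots
```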
Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
Title | Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer |
Authors | Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu |
Abstract | Human body part parsing, or human semantic part segmentation, is fundamental to many computer vision tasks. In conventional semantic segmentation methods, the ground truth segmentations are provided, and fully convolutional networks (FCN) are trained in an end-to-end scheme. Although these methods have demonstrated impressive results, their performance highly depends on the quantity and quality of training data. In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations. Our key idea is to exploit the anatomical similarity among humans to transfer the parsing results of one person to another person with a similar pose. Using these estimated results as additional training data, our semi-supervised model outperforms its strongly-supervised counterpart by 6 mIOU on the PASCAL-Person-Part dataset, and we achieve state-of-the-art human parsing results. Our approach is general and can be readily extended to other object/animal parsing tasks, assuming that their anatomical similarity can be annotated by keypoints. The proposed model and accompanying source code are available at https://github.com/MVIG-SJTU/WSHP |
Tasks | Human Parsing, Human Part Segmentation, Semantic Segmentation, Transfer Learning |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04310v1 |
PDF | http://arxiv.org/pdf/1805.04310v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-and-semi-supervised-human-body-part |
Repo | https://github.com/MVIG-SJTU/WSHP |
Framework | pytorch |
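The transfer step needs a notion of pose similarity to decide which annotated person's parsing to morph onto an unlabeled one. A crude numpy stand-in that compares keypoint sets after removing translation and scale; the paper's actual matching and morphing procedure is richer than this:

```python
import numpy as np

def pose_distance(kp_a, kp_b):
    """Distance between two keypoint sets after removing translation and scale."""
    def normalize(kp):
        kp = kp - kp.mean(axis=0)
        return kp / (np.linalg.norm(kp) + 1e-8)
    return float(np.linalg.norm(normalize(kp_a) - normalize(kp_b)))

def nearest_pose(query, annotated):
    """Index of the annotated pose most similar to the query pose."""
    return int(np.argmin([pose_distance(query, kp) for kp in annotated]))

poses = [np.random.rand(16, 2) for _ in range(5)]  # 16 keypoints per person
print(nearest_pose(poses[0] * 2.0 + 1.0, poses))   # 0: invariant to scale/shift
```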
Structural Agnostic Modeling: Adversarial Learning of Causal Graphs
Title | Structural Agnostic Modeling: Adversarial Learning of Causal Graphs |
Authors | Diviyan Kalainathan, Olivier Goudet, Isabelle Guyon, David Lopez-Paz, Michèle Sebag |
Abstract | A new causal discovery method, Structural Agnostic Modeling (SAM), is presented in this paper. Leveraging both conditional independencies and distributional asymmetries in the data, SAM aims at recovering full causal models from continuous observational data in a multivariate non-parametric setting. The approach is based on a game between $d$ players, each estimating one variable's distribution conditionally on the others as a neural net, and an adversary aimed at discriminating between the overall joint conditional distribution and that of the original data. An original learning criterion combining distribution estimation, sparsity and acyclicity constraints is used to enforce the end-to-end optimization of the graph structure and parameters through stochastic gradient descent. Besides the theoretical analysis of the approach in the large sample limit, SAM is extensively validated experimentally on synthetic and real data. |
Tasks | Causal Discovery |
Published | 2018-03-13 |
URL | https://arxiv.org/abs/1803.04929v2 |
PDF | https://arxiv.org/pdf/1803.04929v2.pdf |
PWC | https://paperswithcode.com/paper/sam-structural-agnostic-model-causal |
Repo | https://github.com/Diviyan-Kalainathan/SAMv1 |
Framework | pytorch |
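The acyclicity constraint mentioned in the abstract can be made differentiable. The sketch below uses the NOTEARS-style trace-of-matrix-exponential penalty as a representative example; this is a related technique, not necessarily SAM's exact formulation, and it requires scipy:

```python
import numpy as np
from scipy.linalg import expm

def acyclicity(adj):
    """NOTEARS-style acyclicity measure: trace(exp(A * A)) - d is zero iff
    the weighted graph encoded by adjacency matrix A has no cycles."""
    d = adj.shape[0]
    return float(np.trace(expm(adj * adj)) - d)

dag = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)  # chain: acyclic
cyc = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)  # 3-cycle
print(acyclicity(dag))  # ~0.0
print(acyclicity(cyc))  # > 0: the penalty fires on the cycle
```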
BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images
Title | BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images |
Authors | Yucheng Fu, Yang Liu |
Abstract | Bubble segmentation and size detection algorithms have been developed in recent years for their high efficiency and accuracy in measuring bubbly two-phase flows. In this work, we propose an architecture called bubble generative adversarial networks (BubGAN) for the generation of realistic synthetic images, which can be further used as training or benchmarking data for the development of advanced image processing algorithms. BubGAN is trained initially on a labeled bubble dataset consisting of ten thousand images. By learning the distribution of these bubbles, BubGAN can generate more realistic bubbles than the conventional models used in the literature. The trained BubGAN is conditioned on bubble feature parameters and has full control of bubble properties in terms of aspect ratio, rotation angle, circularity and edge ratio. A dataset of one million bubbles is pre-generated using the trained BubGAN. One can then assemble realistic bubbly flow images using this dataset and the associated image processing tool. These images contain detailed bubble information and therefore do not require additional manual labeling. This is more useful than a conventional GAN, which generates images without labeling information. The tool can be used to provide benchmarking and training data for existing image processing algorithms and to guide the future development of bubble detection algorithms. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02266v1 |
PDF | http://arxiv.org/pdf/1809.02266v1.pdf |
PWC | https://paperswithcode.com/paper/bubgan-bubble-generative-adversarial-networks |
Repo | https://github.com/ycfu/BubGAN |
Framework | none |
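Conditioning the generator on bubble feature parameters amounts to concatenating those parameters with the noise vector at the generator's input. A minimal PyTorch sketch with placeholder layer sizes, not the paper's architecture:

```python
import torch
import torch.nn as nn

NOISE_DIM, FEAT_DIM = 64, 4  # 4 conditions: aspect ratio, rotation, circularity, edge ratio

gen = nn.Sequential(
    nn.Linear(NOISE_DIM + FEAT_DIM, 128), nn.ReLU(),
    nn.Linear(128, 32 * 32), nn.Tanh(),  # one 32x32 grayscale bubble patch
)

z = torch.randn(8, NOISE_DIM)
feats = torch.rand(8, FEAT_DIM)          # desired bubble properties per sample
patches = gen(torch.cat([z, feats], dim=1)).view(8, 1, 32, 32)
print(patches.shape)                     # conditioned, hence self-labeled, patches
```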
Open Vocabulary Learning for Neural Chinese Pinyin IME
Title | Open Vocabulary Learning for Neural Chinese Pinyin IME |
Authors | Zhuosheng Zhang, Yafang Huang, Hai Zhao |
Abstract | Pinyin-to-character (P2C) conversion is the core component of a pinyin-based Chinese input method engine (IME). However, the conversion is seriously compromised by the ambiguity of Chinese characters corresponding to pinyin, as well as by predefined fixed vocabularies. To alleviate these inconveniences, we propose a neural P2C conversion model augmented by an online-updated vocabulary with a sampling mechanism to support open vocabulary learning while the IME is in use. Our experiments show that the proposed method outperforms commercial IMEs and state-of-the-art traditional models on a standard corpus and a real input history dataset in terms of multiple metrics, showing that the online-updated vocabulary indeed helps our IME follow user input behavior effectively. |
Tasks | |
Published | 2018-11-11 |
URL | https://arxiv.org/abs/1811.04352v4 |
PDF | https://arxiv.org/pdf/1811.04352v4.pdf |
PWC | https://paperswithcode.com/paper/neural-based-pinyin-to-character-conversion |
Repo | https://github.com/cooelf/OpenIME |
Framework | none |
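The online-updated vocabulary can be pictured as promoting words the user repeatedly confirms into the working lexicon. A toy Python sketch of that mechanism; the paper's sampling-based update is more involved:

```python
from collections import Counter

class OnlineVocab:
    """Toy open-vocabulary mechanism: words confirmed by the user during IME
    use are counted, and frequent ones are promoted into the vocabulary."""
    def __init__(self, base, promote_at=3):
        self.vocab = set(base)
        self.counts = Counter()
        self.promote_at = promote_at

    def observe(self, word):
        if word in self.vocab:
            return
        self.counts[word] += 1
        if self.counts[word] >= self.promote_at:
            self.vocab.add(word)  # now available to P2C conversion

v = OnlineVocab({"你好", "北京"})
for _ in range(3):
    v.observe("狗粮")        # a new word the user keeps confirming
print("狗粮" in v.vocab)     # True
```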
An AMR Aligner Tuned by Transition-based Parser
Title | An AMR Aligner Tuned by Transition-based Parser |
Authors | Yijia Liu, Wanxiang Che, Bo Zheng, Bing Qin, Ting Liu |
Abstract | In this paper, we propose a new rich-resource-enhanced AMR aligner that produces multiple alignments, and a new transition system for AMR parsing along with its oracle parser. Our aligner is further tuned by our oracle parser by picking the alignment that leads to the highest-scored achievable AMR graph. Experimental results show that our aligner outperforms the rule-based aligner of previous work, achieving a higher alignment F1 score and consistently improving two open-source AMR parsers. Based on our aligner and transition system, we develop a transition-based AMR parser that parses a sentence into its AMR graph directly. An ensemble of our parsers with only words and POS tags as input achieves a 68.4 Smatch F1 score. |
Tasks | AMR Parsing |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03541v1 |
PDF | http://arxiv.org/pdf/1810.03541v1.pdf |
PWC | https://paperswithcode.com/paper/an-amr-aligner-tuned-by-transition-based |
Repo | https://github.com/Oneplus/tamr |
Framework | none |
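The tuning loop described in the abstract reduces to a selection problem: among the aligner's candidate alignments, keep the one whose oracle parse yields the best achievable AMR graph. A skeletal sketch with a stand-in scoring function; running the oracle parser and Smatch scoring are abstracted away here:

```python
def pick_alignment(alignments, oracle_score):
    """Keep the candidate alignment whose oracle parse reaches the
    highest-scored achievable AMR graph. oracle_score(alignment) -> float
    stands in for parsing with the oracle and scoring (e.g., Smatch)."""
    return max(alignments, key=oracle_score)

# Toy candidates: achievable-graph quality summarized as a single number.
candidates = [{"id": 1, "quality": 0.62}, {"id": 2, "quality": 0.71}]
best = pick_alignment(candidates, lambda a: a["quality"])
print(best["id"])  # 2
```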