October 20, 2019

3036 words 15 mins read

Paper Group AWR 317

Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data

Title Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Authors Andrew L. Beam, Benjamin Kompa, Allen Schmaltz, Inbar Fried, Griffin Weber, Nathan P. Palmer, Xu Shi, Tianxi Cai, Isaac S. Kohane
Abstract Word embeddings are a popular approach to unsupervised learning of word relationships that are widely used in natural language processing. In this article, we present a new set of embeddings for medical concepts learned using an extremely large collection of multimodal medical data. Leaning on recent theoretical insights, we demonstrate how an insurance claims database of 60 million members, a collection of 20 million clinical notes, and 1.7 million full text biomedical journal articles can be combined to embed concepts into a common space, resulting in the largest ever set of embeddings for 108,477 medical concepts. To evaluate our approach, we present a new benchmark methodology based on statistical power specifically designed to test embeddings of medical concepts. Our approach, called cui2vec, attains state-of-the-art performance relative to previous methods in most instances. Finally, we provide a downloadable set of pre-trained embeddings for other researchers to use, as well as an online tool for interactive exploration of the cui2vec embeddings.
Tasks Word Embeddings
Published 2018-04-04
URL https://arxiv.org/abs/1804.01486v3
PDF https://arxiv.org/pdf/1804.01486v3.pdf
PWC https://paperswithcode.com/paper/clinical-concept-embeddings-learned-from
Repo https://github.com/hscells/cui2vec
Framework none
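
Since the authors release pre-trained embeddings, the most common use is nearest-neighbour lookup between concepts. A minimal sketch, assuming the download is a CSV with a header row and one CUI per row followed by its vector (the file name and exact layout here are illustrative):

```python
import csv
import numpy as np

def load_embeddings(path="cui2vec_pretrained.csv"):
    """Assumed layout: header row, then one CUI per row followed by its vector."""
    cuis, vecs = [], []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)                                   # skip header
        for row in reader:
            cuis.append(row[0])
            vecs.append([float(x) for x in row[1:]])
    mat = np.asarray(vecs)
    mat /= np.linalg.norm(mat, axis=1, keepdims=True)  # unit-normalise once
    return cuis, mat

def nearest(query_cui, cuis, mat, k=5):
    q = mat[cuis.index(query_cui)]
    sims = mat @ q                                     # cosine similarity via dot product
    order = np.argsort(-sims)
    return [(cuis[i], float(sims[i])) for i in order[1:k + 1]]
```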

Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation

Title Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation
Authors John Benhardt, Peter Hase, Liuyi Zhu, Cynthia Rudin
Abstract We provide an approach for generating beautiful poetry. Our sonnet-generation algorithm includes several novel elements that improve over the state of the art, leading to metrical, rhyming poetry with many human-like qualities. These novel elements include in-line punctuation, part of speech restrictions, and more appropriate training corpora. Our work is the winner of the 2018 PoetiX Literary Turing Test Award for computer-generated poetry.
Tasks Sonnet Generation
Published 2018-11-13
URL https://arxiv.org/abs/1811.05067v2
PDF https://arxiv.org/pdf/1811.05067v2.pdf
PWC https://paperswithcode.com/paper/shall-i-compare-thee-to-a-machine-written
Repo https://github.com/peterbhase/poetry-generation
Framework tf
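
The abstract's "novel elements" are hard constraints layered onto generation. A hedged sketch of two of them, meter and rhyme checks built on the CMU pronouncing dictionary via the `pronouncing` package; the helper names and the monosyllable heuristic are mine, not the authors':

```python
import pronouncing  # pip install pronouncing

def stress_pattern(word):
    """CMU-dictionary stress string, e.g. 'compare' -> '01'."""
    phones = pronouncing.phones_for_word(word)
    return pronouncing.stresses(phones[0]) if phones else None

def syllables(word):
    s = stress_pattern(word)
    return len(s) if s else None

def fits_meter(line_so_far, word, meter="0101010101"):
    """Can `word` extend the line without breaking iambic pentameter?"""
    counts = [syllables(w) for w in line_so_far]
    s = stress_pattern(word)
    if s is None or None in counts:
        return False
    pos = sum(counts)
    if pos + len(s) > len(meter):
        return False
    if len(s) == 1:
        return True  # treat monosyllables as metrically flexible
    return meter[pos:pos + len(s)] == s.replace("2", "1")

def rhymes_with(a, b):
    return b in pronouncing.rhymes(a)

print(fits_meter(["shall", "i"], "compare"))   # True
print(rhymes_with("day", "may"))               # True
```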

Towards real-time unsupervised monocular depth estimation on CPU

Title Towards real-time unsupervised monocular depth estimation on CPU
Authors Matteo Poggi, Filippo Aleotti, Fabio Tosi, Stefano Mattoccia
Abstract Unsupervised depth estimation from a single image is a very attractive technique with several implications in robotics, autonomous navigation, augmented reality and so on. This topic represents a very challenging task, and the advent of deep learning has made it possible to tackle this problem with excellent results. However, these architectures are extremely deep and complex. Thus, real-time performance can be achieved only by leveraging power-hungry GPUs that do not allow depth maps to be inferred in application fields characterized by low-power constraints. To tackle this issue, in this paper we propose a novel architecture capable of quickly inferring an accurate depth map on a CPU, even of an embedded system, using a pyramid of features extracted from a single input image. Similarly to state-of-the-art methods, we train our network in an unsupervised manner, casting depth estimation as an image reconstruction problem. Extensive experimental results on the KITTI dataset show that, compared to the top performing approach, our network has similar accuracy but much lower complexity (about 6% of the parameters), enabling it to infer a depth map for a KITTI image in about 1.7 s on the Raspberry Pi 3 and at more than 8 Hz on a standard CPU. Moreover, by trading accuracy for efficiency, our network can infer maps at about 2 Hz and 40 Hz respectively, while still being more accurate than most slower state-of-the-art methods. To the best of our knowledge, this is the first method enabling such performance on CPUs, paving the way for effective deployment of unsupervised monocular depth estimation even on embedded systems.
Tasks Autonomous Navigation, Depth Estimation, Image Reconstruction, Monocular Depth Estimation
Published 2018-06-29
URL http://arxiv.org/abs/1806.11430v3
PDF http://arxiv.org/pdf/1806.11430v3.pdf
PWC https://paperswithcode.com/paper/towards-real-time-unsupervised-monocular
Repo https://github.com/00marco/pydnet-duplicate
Framework tf
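
The unsupervised training signal the abstract mentions, depth as image reconstruction, can be sketched in a few lines: the predicted disparity warps the right stereo image into the left view, and the photometric error is minimised. This is a simplified Monodepth-style loss written in PyTorch for brevity (the linked repo is TensorFlow), not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def warp_right_to_left(right, disp):
    """right: (N, C, H, W) image; disp: (N, 1, H, W) disparity as a fraction of width."""
    n, _, h, w = right.shape
    xs = torch.linspace(-1, 1, w, device=right.device).view(1, 1, w).expand(n, h, w)
    ys = torch.linspace(-1, 1, h, device=right.device).view(1, h, 1).expand(n, h, w)
    # shift the horizontal sampling coordinate by the predicted disparity
    grid = torch.stack((xs - 2 * disp.squeeze(1), ys), dim=3)
    return F.grid_sample(right, grid, align_corners=True)

def photometric_loss(left, right, disp):
    recon = warp_right_to_left(right, disp)
    return (left - recon).abs().mean()  # the full loss adds SSIM and smoothness terms
```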

Integrating Transformer and Paraphrase Rules for Sentence Simplification

Title Integrating Transformer and Paraphrase Rules for Sentence Simplification
Authors Sanqiang Zhao, Rui Meng, Daqing He, Saptono Andi, Parmanto Bambang
Abstract Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopted ideas from machine translation studies and implicitly learned simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature; (2) through analysis of the rule utilization, the model seeks to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification.
Tasks
Published 2018-10-26
URL http://arxiv.org/abs/1810.11193v1
PDF http://arxiv.org/pdf/1810.11193v1.pdf
PWC https://paperswithcode.com/paper/integrating-transformer-and-paraphrase-rules
Repo https://github.com/Sanqiang/text_simplification
Framework none
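
To make the external knowledge base concrete: Simple PPDB maps complex phrases to simpler paraphrases with confidence scores. A toy sketch of rule lookup and greedy application, with a hand-made rule table standing in for the real database (the paper integrates the rules into the decoder rather than post-editing like this):

```python
RULES = {  # complex phrase -> (simple phrase, confidence); illustrative entries
    "utilize": ("use", 0.95),
    "in order to": ("to", 0.90),
    "approximately": ("about", 0.88),
}

def apply_rules(sentence, rules=RULES, threshold=0.8):
    out = sentence
    for complex_p, (simple_p, score) in sorted(
            rules.items(), key=lambda kv: -len(kv[0])):  # longest match first
        if score >= threshold:
            out = out.replace(complex_p, simple_p)
    return out

print(apply_rules("we utilize heuristics in order to simplify text"))
# -> "we use heuristics to simplify text"
```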

Practical Text Classification With Large Pre-Trained Language Models

Title Practical Text Classification With Large Pre-Trained Language Models
Authors Neel Kant, Raul Puri, Nikolai Yakovenko, Bryan Catanzaro
Abstract Multi-emotion sentiment classification is a natural language processing (NLP) problem with valuable use cases on real-world data. We demonstrate that large-scale unsupervised language modeling combined with finetuning offers a practical solution to this task on difficult datasets, including those with label class imbalance and domain-specific context. By training an attention-based Transformer network (Vaswani et al. 2017) on 40GB of text (Amazon reviews) (McAuley et al. 2015) and fine-tuning on the training set, our model achieves a 0.69 F1 score on the SemEval Task 1:E-c multi-dimensional emotion classification problem (Mohammad et al. 2018), based on the Plutchik wheel of emotions (Plutchik 1979). These results are competitive with state-of-the-art models, including strong F1 scores on difficult (emotion) categories such as Fear (0.73), Disgust (0.77) and Anger (0.78), as well as competitive results on rare categories such as Anticipation (0.42) and Surprise (0.37). Furthermore, we demonstrate our application on a real world text classification task. We create a narrowly collected text dataset of real tweets on several topics, and show that our finetuned model outperforms general purpose commercially available APIs for sentiment and multidimensional emotion classification on this dataset by a significant margin. We also perform a variety of additional studies, investigating properties of deep learning architectures, datasets and algorithms for achieving practical multidimensional sentiment classification. Overall, we find that unsupervised language modeling and finetuning is a simple framework for achieving high quality results on real-world sentiment classification.
Tasks Emotion Classification, Language Modelling, Sentiment Analysis, Text Classification
Published 2018-12-04
URL http://arxiv.org/abs/1812.01207v1
PDF http://arxiv.org/pdf/1812.01207v1.pdf
PWC https://paperswithcode.com/paper/practical-text-classification-with-large-pre
Repo https://github.com/NVIDIA/sentiment-discovery
Framework pytorch
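
The recipe itself is simple to sketch: a pretrained language-model body topped with a small multi-label head (one sigmoid per Plutchik emotion) and finetuned end to end. `lm_body` below is a placeholder for the paper's Transformer trained on Amazon reviews, and the last-token pooling is an assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmotionClassifier(nn.Module):
    def __init__(self, lm_body, hidden_size, n_labels=11):
        super().__init__()
        self.body = lm_body                       # pretrained LM, finetuned here
        self.head = nn.Linear(hidden_size, n_labels)

    def forward(self, token_ids):
        features = self.body(token_ids)           # (batch, seq_len, hidden_size)
        pooled = features[:, -1]                  # last-token pooling (an assumption)
        return self.head(pooled)                  # one logit per emotion

def multilabel_loss(logits, targets, pos_weight=None):
    # pos_weight can up-weight rare labels such as Anticipation or Surprise
    return F.binary_cross_entropy_with_logits(logits, targets, pos_weight=pos_weight)
```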

Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data

Title Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
Authors Guokun Lai, Bohan Li, Guoqing Zheng, Yiming Yang
Abstract How to model the distribution of sequential data, including but not limited to speech and human motion, is an important ongoing research problem. It has been demonstrated that model capacity can be significantly enhanced by introducing stochastic latent variables into the hidden states of recurrent neural networks. Meanwhile, WaveNet, equipped with dilated convolutions, achieves astonishing empirical performance on the natural speech generation task. In this paper, we combine the ideas of stochastic latent variables and dilated convolutions, and propose a new architecture to model sequential data, termed Stochastic WaveNet, in which stochastic latent variables are injected into the WaveNet structure. We argue that Stochastic WaveNet enjoys both powerful distribution-modeling capacity and the advantage of parallel training from dilated convolutions. In order to efficiently infer the posterior distribution of the latent variables, a novel inference network structure is designed based on the characteristics of the WaveNet architecture. Stochastic WaveNet obtains state-of-the-art performance on benchmark natural speech modeling datasets, and it can also generate high-quality human handwriting samples.
Tasks
Published 2018-06-15
URL http://arxiv.org/abs/1806.06116v1
PDF http://arxiv.org/pdf/1806.06116v1.pdf
PWC https://paperswithcode.com/paper/stochastic-wavenet-a-generative-latent
Repo https://github.com/laiguokun/SWaveNet
Framework pytorch
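
The core idea, stochastic latents injected into a dilated convolution stack, can be sketched as follows. Layer sizes, the residual wiring and the activation are illustrative, not the authors' exact architecture:

```python
import torch
import torch.nn as nn

class StochasticDilatedBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size=2,
                              dilation=dilation, padding=dilation)
        self.to_mu = nn.Conv1d(channels, channels, 1)
        self.to_logvar = nn.Conv1d(channels, channels, 1)

    def forward(self, x):
        h = torch.tanh(self.conv(x))[..., :x.size(-1)]        # crop keeps causality
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation trick
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).mean()
        return x + z, kl   # residual output; the KL term feeds the ELBO

x = torch.randn(4, 16, 100)
for d in (1, 2, 4, 8):                    # WaveNet-style dilation doubling
    x, kl = StochasticDilatedBlock(16, d)(x)
print(x.shape)                            # torch.Size([4, 16, 100])
```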

SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images

Title SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images
Authors Yeonkun Lee, Jaeseok Jeong, Jongseob Yun, Wonjune Cho, Kuk-Jin Yoon
Abstract Omni-directional cameras have many advantages over conventional cameras in that they have a much wider field-of-view (FOV). Accordingly, several approaches have been proposed recently to apply convolutional neural networks (CNNs) to omni-directional images for various visual tasks. However, most of them use image representations defined in the Euclidean space after transforming the omni-directional views originally formed in the non-Euclidean space. This transformation leads to shape distortion due to nonuniform spatial resolving power and the loss of continuity. These effects make existing convolution kernels experience difficulties in extracting meaningful information. This paper presents a novel method to resolve such problems of applying CNNs to omni-directional images. The proposed method utilizes a spherical polyhedron to represent omni-directional views. This method minimizes the variance of the spatial resolving power on the sphere surface, and includes new convolution and pooling methods for the proposed representation. The proposed method can also be adopted by any existing CNN-based methods. The feasibility of the proposed method is demonstrated through classification, detection, and semantic segmentation tasks with synthetic and real datasets.
Tasks Semantic Segmentation
Published 2018-11-20
URL http://arxiv.org/abs/1811.08196v2
PDF http://arxiv.org/pdf/1811.08196v2.pdf
PWC https://paperswithcode.com/paper/spherephd-applying-cnns-on-a-spherical
Repo https://github.com/keevin60907/SpherePHD
Framework tf
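
Convolution on a polyhedron reduces to gathering each triangular face's neighbours and applying a shared kernel. A minimal numpy sketch with a placeholder adjacency table (the real one comes from subdividing an icosahedron):

```python
import numpy as np

def face_conv(feats, neighbors, kernel):
    """feats: (F, C) per-face features; neighbors: (F, 3) edge-adjacent face ids;
    kernel: (4, C, C_out) shared weights for [self, n0, n1, n2]."""
    gathered = np.stack([feats] + [feats[neighbors[:, i]] for i in range(3)],
                        axis=1)                        # (F, 4, C)
    return np.einsum("fkc,kco->fo", gathered, kernel)  # shared 4-tap kernel

F_faces, C, C_out = 20, 8, 16                          # 20 faces: icosahedron level 0
feats = np.random.randn(F_faces, C)
neighbors = np.random.randint(0, F_faces, (F_faces, 3))  # placeholder adjacency
kernel = np.random.randn(4, C, C_out)
print(face_conv(feats, neighbors, kernel).shape)       # (20, 16)
```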

Dank Learning: Generating Memes Using Deep Neural Networks

Title Dank Learning: Generating Memes Using Deep Neural Networks
Authors Abel L Peirson V, E Meltem Tolunay
Abstract We introduce a novel meme generation system which, given any image, can produce a humorous and relevant caption. Furthermore, the system can be conditioned on not only an image but also a user-defined label relating to the meme template, giving the user a handle on meme content. The system uses a pretrained Inception-v3 network to return an image embedding, which is passed to an attention-based deep-layer LSTM model producing the caption, inspired by the widely recognised Show and Tell model. We implement a modified beam search to encourage diversity in the captions. We evaluate the quality of our model using perplexity and human assessment, both of the quality of the memes generated and of whether they can be differentiated from real ones. Our model produces original memes that cannot on the whole be differentiated from real ones.
Tasks
Published 2018-06-08
URL http://arxiv.org/abs/1806.04510v1
PDF http://arxiv.org/pdf/1806.04510v1.pdf
PWC https://paperswithcode.com/paper/dank-learning-generating-memes-using-deep
Repo https://github.com/alpv95/MemeProject
Framework tf
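
The "modified beam search to encourage diversity" can be illustrated with a repeat-token penalty added to the usual beam expansion. `step_logprobs` stands in for the LSTM decoder conditioned on the Inception-v3 embedding, and the penalty scheme is an assumption, not the paper's exact modification:

```python
def beam_step(beams, step_logprobs, beam_size=5, repeat_penalty=1.0):
    """beams: list of (tokens, score); step_logprobs(tokens) -> {token: logprob}."""
    candidates = []
    for tokens, score in beams:
        for tok, logp in step_logprobs(tokens).items():
            penalty = repeat_penalty if tok in tokens else 0.0  # discourage repeats
            candidates.append((tokens + [tok], score + logp - penalty))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:beam_size]

# toy decoder: always offers the same three tokens
fake_lm = lambda tokens: {"top": -0.1, "text": -0.5, "kek": -1.0}
beams = [(["<s>"], 0.0)]
for _ in range(3):
    beams = beam_step(beams, fake_lm)
print(beams[0][0])  # repeats are penalised, so the best path varies its tokens
```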

Matrix optimization on universal unitary photonic devices

Title Matrix optimization on universal unitary photonic devices
Authors Sunil Pai, Ben Bartlett, Olav Solgaard, David A. B. Miller
Abstract Universal unitary photonic devices can apply arbitrary unitary transformations to a vector of input modes and provide a promising hardware platform for fast and energy-efficient machine learning using light. We simulate the gradient-based optimization of random unitary matrices on universal photonic devices composed of imperfect tunable interferometers. If device components are initialized uniform-randomly, the locally-interacting nature of the mesh components biases the optimization search space towards banded unitary matrices, limiting convergence to random unitary matrices. We detail a procedure for initializing the device by sampling from the distribution of random unitary matrices and show that this greatly improves convergence speed. We also explore mesh architecture improvements such as adding extra tunable beamsplitters or permuting waveguide layers to further improve the training speed and scalability of these devices.
Tasks
Published 2018-08-02
URL https://arxiv.org/abs/1808.00458v3
PDF https://arxiv.org/pdf/1808.00458v3.pdf
PWC https://paperswithcode.com/paper/matrix-optimization-on-universal-unitary
Repo https://github.com/solgaardlab/neurophox-notebooks
Framework tf
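
The initialisation fix in miniature: sample a Haar-random unitary (QR decomposition of a complex Gaussian matrix, with a phase correction) instead of setting interferometer phases uniformly at random. Decomposing the sampled unitary into mesh phases is a further step not shown here:

```python
import numpy as np

def haar_random_unitary(n, rng=None):
    rng = rng or np.random.default_rng()
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diagonal(r)
    return q * (d / np.abs(d))   # phase fix makes the distribution exactly Haar

u = haar_random_unitary(8)
print(np.allclose(u.conj().T @ u, np.eye(8)))  # True: u is unitary
```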

Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces

Title Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
Authors Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut Lavril, Maël Primet, Joseph Dureau
Abstract This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. The embedded inference is fast and accurate while enforcing privacy by design, as no personal user data is ever collected. Focusing on Automatic Speech Recognition and Natural Language Understanding, we detail our approach to training high-performance Machine Learning models that are small enough to run in real-time on small devices. Additionally, we describe a data generation procedure that provides sufficient, high-quality training data without compromising user privacy.
Tasks Speech Recognition, Spoken Language Understanding
Published 2018-05-25
URL http://arxiv.org/abs/1805.10190v3
PDF http://arxiv.org/pdf/1805.10190v3.pdf
PWC https://paperswithcode.com/paper/snips-voice-platform-an-embedded-spoken
Repo https://github.com/snipsco/snips-nlu
Framework none
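
The NLU component is open source and can be tried directly; the snippet below follows the snips-nlu README (`pip install snips-nlu`), with `dataset.json` as a placeholder for a dataset defining your intents and slots:

```python
import io
import json
from snips_nlu import SnipsNLUEngine

with io.open("dataset.json") as f:    # placeholder: intents/slots in Snips format
    dataset = json.load(f)

engine = SnipsNLUEngine()
engine.fit(dataset)                   # trains intent classifier + slot fillers locally
parsing = engine.parse("Turn the living room lights on")
print(parsing["intent"]["intentName"], parsing["slots"])
```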

Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer

Title Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
Authors Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu
Abstract Human body part parsing, or human semantic part segmentation, is fundamental to many computer vision tasks. In conventional semantic segmentation methods, the ground truth segmentations are provided, and fully convolutional networks (FCN) are trained in an end-to-end scheme. Although these methods have demonstrated impressive results, their performance highly depends on the quantity and quality of training data. In this paper, we present a novel method to generate synthetic human part segmentation data using easily obtained human keypoint annotations. Our key idea is to exploit the anatomical similarity among humans to transfer the parsing results of a person to another person with a similar pose. Using these estimated results as additional training data, our semi-supervised model outperforms its strong-supervised counterpart by 6 mIOU on the PASCAL-Person-Part dataset, and we achieve state-of-the-art human parsing results. Our approach is general and can be readily extended to other object/animal parsing tasks, assuming that their anatomical similarity can be annotated by keypoints. The proposed model and accompanying source code are available at https://github.com/MVIG-SJTU/WSHP
Tasks Human Parsing, Human Part Segmentation, Semantic Segmentation, Transfer Learning
Published 2018-05-11
URL http://arxiv.org/abs/1805.04310v1
PDF http://arxiv.org/pdf/1805.04310v1.pdf
PWC https://paperswithcode.com/paper/weakly-and-semi-supervised-human-body-part
Repo https://github.com/MVIG-SJTU/WSHP
Framework pytorch
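
The transfer idea in miniature: for an unlabelled person, find the annotated person with the most similar normalised keypoint layout and reuse its part masks. Only the nearest-pose search is sketched here; the paper additionally warps the masks keypoint-to-keypoint and refines them with a network:

```python
import numpy as np

def normalise_pose(kpts):
    kpts = kpts - kpts.mean(axis=0)              # translation invariance
    return kpts / (np.linalg.norm(kpts) + 1e-8)  # scale invariance

def nearest_pose(query_kpts, annotated_kpts):
    """query_kpts: (K, 2); annotated_kpts: (N, K, 2). Returns index of best match."""
    q = normalise_pose(query_kpts)
    dists = [np.linalg.norm(q - normalise_pose(a)) for a in annotated_kpts]
    return int(np.argmin(dists))

query = np.random.rand(16, 2)                    # 16 keypoints per person (illustrative)
annotated = np.random.rand(100, 16, 2)
idx = nearest_pose(query, annotated)             # whose part masks get transferred
```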

Structural Agnostic Modeling: Adversarial Learning of Causal Graphs

Title Structural Agnostic Modeling: Adversarial Learning of Causal Graphs
Authors Diviyan Kalainathan, Olivier Goudet, Isabelle Guyon, David Lopez-Paz, Michèle Sebag
Abstract A new causal discovery method, Structural Agnostic Modeling (SAM), is presented in this paper. Leveraging both conditional independencies and distributional asymmetries in the data, SAM aims at recovering full causal models from continuous observational data in a multivariate non-parametric setting. The approach is based on a game between $d$ players, each estimating the distribution of one variable conditionally on the others as a neural net, and an adversary trained to discriminate between the overall joint conditional distribution and that of the original data. An original learning criterion combining distribution estimation, sparsity and acyclicity constraints is used to enforce the end-to-end optimization of the graph structure and parameters through stochastic gradient descent. Besides the theoretical analysis of the approach in the large sample limit, SAM is extensively experimentally validated on synthetic and real data.
Tasks Causal Discovery
Published 2018-03-13
URL https://arxiv.org/abs/1803.04929v2
PDF https://arxiv.org/pdf/1803.04929v2.pdf
PWC https://paperswithcode.com/paper/sam-structural-agnostic-model-causal
Repo https://github.com/Diviyan-Kalainathan/SAMv1
Framework pytorch
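
A structure-only sketch of the generator side: one neural net per variable, each seeing the others through a learnable soft adjacency mask, plus a differentiable acyclicity penalty. The NOTEARS-style matrix-exponential penalty below stands in for the paper's own constraint, and the adversary is omitted entirely:

```python
import torch
import torch.nn as nn

class SAMGenerators(nn.Module):
    def __init__(self, d, hidden=32):
        super().__init__()
        self.mask = nn.Parameter(0.1 * torch.rand(d, d))   # soft adjacency logits
        self.nets = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(d))

    def forward(self, x):
        outs = []
        for j, net in enumerate(self.nets):
            m = torch.sigmoid(self.mask[j]).clone()
            m[j] = 0.0                       # a variable never causes itself
            outs.append(net(x * m))          # generator j sees only masked parents
        return torch.cat(outs, dim=1)        # reconstructed variables, fed to adversary

def acyclicity_penalty(mask):
    a = torch.sigmoid(mask) ** 2
    # approaches zero as the weighted graph becomes acyclic
    return torch.trace(torch.matrix_exp(a)) - a.size(0)
```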

BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images

Title BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images
Authors Yucheng Fu, Yang Liu
Abstract Bubble segmentation and size detection algorithms have been developed in recent years for their high efficiency and accuracy in measuring bubbly two-phase flows. In this work, we propose an architecture called bubble generative adversarial networks (BubGAN) for the generation of realistic synthetic images, which can be further used as training or benchmarking data for the development of advanced image processing algorithms. The BubGAN is trained initially on a labeled bubble dataset consisting of ten thousand images. By learning the distribution of these bubbles, the BubGAN can generate more realistic bubbles than the conventional models used in the literature. The trained BubGAN is conditioned on bubble feature parameters and has full control of bubble properties in terms of aspect ratio, rotation angle, circularity and edge ratio. A dataset of one million bubbles is pre-generated using the trained BubGAN. One can then assemble realistic bubbly flow images using this dataset and the associated image processing tool. These images contain detailed bubble information and therefore do not require additional manual labeling; this makes them more useful than images from a conventional GAN, which generates images without labeling information. The tool can be used to provide benchmarking and training data for existing image processing algorithms and to guide the future development of bubble detection algorithms.
Tasks
Published 2018-09-07
URL http://arxiv.org/abs/1809.02266v1
PDF http://arxiv.org/pdf/1809.02266v1.pdf
PWC https://paperswithcode.com/paper/bubgan-bubble-generative-adversarial-networks
Repo https://github.com/ycfu/BubGAN
Framework none
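
How the conditioning works in miniature: the generator concatenates a noise vector with the four bubble descriptors (aspect ratio, rotation angle, circularity, edge ratio), so every synthetic bubble comes with known labels. Layer sizes and the MLP form are illustrative, written in PyTorch for concreteness:

```python
import torch
import torch.nn as nn

class BubbleGenerator(nn.Module):
    def __init__(self, noise_dim=64, cond_dim=4, img_size=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, img_size * img_size), nn.Tanh())
        self.img_size = img_size

    def forward(self, z, cond):
        out = self.net(torch.cat([z, cond], dim=1))   # condition by concatenation
        return out.view(-1, 1, self.img_size, self.img_size)

g = BubbleGenerator()
z = torch.randn(8, 64)
cond = torch.rand(8, 4)              # desired bubble properties per sample
print(g(z, cond).shape)              # torch.Size([8, 1, 32, 32])
```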

Open Vocabulary Learning for Neural Chinese Pinyin IME

Title Open Vocabulary Learning for Neural Chinese Pinyin IME
Authors Zhuosheng Zhang, Yafang Huang, Hai Zhao
Abstract Pinyin-to-character (P2C) conversion is the core component of a pinyin-based Chinese input method engine (IME). However, the conversion is seriously compromised by the ambiguities of Chinese characters corresponding to pinyin, as well as by predefined fixed vocabularies. To alleviate these inconveniences, we propose a neural P2C conversion model augmented by an online-updated vocabulary with a sampling mechanism to support open vocabulary learning while the IME is in use. Our experiments show that the proposed method outperforms commercial IMEs and state-of-the-art traditional models on a standard corpus and a real input history dataset in terms of multiple metrics, and thus that the online-updated vocabulary indeed helps our IME follow user input behavior effectively.
Tasks
Published 2018-11-11
URL https://arxiv.org/abs/1811.04352v4
PDF https://arxiv.org/pdf/1811.04352v4.pdf
PWC https://paperswithcode.com/paper/neural-based-pinyin-to-character-conversion
Repo https://github.com/cooelf/OpenIME
Framework none
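
A toy of the open-vocabulary mechanism: the pinyin-to-character candidate table is updated online as the user commits conversions, so out-of-vocabulary targets become available on later keystrokes. The neural scoring model is elided; only the data structure and update flow are shown, and both are my simplification rather than the paper's design:

```python
from collections import defaultdict

class OnlineVocab:
    def __init__(self):
        self.table = defaultdict(dict)          # pinyin -> {word: commit count}

    def commit(self, pinyin, word):
        """Called whenever the user confirms a conversion."""
        self.table[pinyin][word] = self.table[pinyin].get(word, 0) + 1

    def candidates(self, pinyin, k=5):
        ranked = sorted(self.table[pinyin].items(), key=lambda kv: -kv[1])
        return [w for w, _ in ranked[:k]]

vocab = OnlineVocab()
vocab.commit("nihao", "你好")
print(vocab.candidates("nihao"))                # ['你好']
```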

An AMR Aligner Tuned by Transition-based Parser

Title An AMR Aligner Tuned by Transition-based Parser
Authors Yijia Liu, Wanxiang Che, Bo Zheng, Bing Qin, Ting Liu
Abstract In this paper, we propose a new rich-resource-enhanced AMR aligner that produces multiple alignments, along with a new transition system for AMR parsing and its oracle parser. Our aligner is further tuned by our oracle parser by picking the alignment that leads to the highest-scored achievable AMR graph. Experimental results show that our aligner outperforms the rule-based aligner of previous work, achieving a higher alignment F1 score and consistently improving two open-sourced AMR parsers. Based on our aligner and transition system, we develop a transition-based AMR parser that parses a sentence into its AMR graph directly. An ensemble of our parsers with only words and POS tags as input achieves a 68.4 Smatch F1 score.
Tasks AMR Parsing
Published 2018-10-08
URL http://arxiv.org/abs/1810.03541v1
PDF http://arxiv.org/pdf/1810.03541v1.pdf
PWC https://paperswithcode.com/paper/an-amr-aligner-tuned-by-transition-based
Repo https://github.com/Oneplus/tamr
Framework none
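
The tuning loop in outline: every candidate alignment is run through the oracle parser, and the one whose achievable AMR graph scores highest against gold is kept. `oracle_parse` and `smatch_f1` are placeholders for the paper's components, not real APIs:

```python
def tune_alignments(sentences, candidate_alignments, gold, oracle_parse, smatch_f1):
    """For each sentence, keep the alignment whose oracle-derived AMR scores best."""
    chosen = []
    for sent, cands, g in zip(sentences, candidate_alignments, gold):
        best = max(cands, key=lambda a: smatch_f1(oracle_parse(sent, a), g))
        chosen.append(best)
    return chosen
```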