Paper Group AWR 317
Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Title | Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data |
Authors | Andrew L. Beam, Benjamin Kompa, Allen Schmaltz, Inbar Fried, Griffin Weber, Nathan P. Palmer, Xu Shi, Tianxi Cai, Isaac S. Kohane |
Abstract | Word embeddings are a popular approach to unsupervised learning of word relationships that are widely used in natural language processing. In this article, we present a new set of embeddings for medical concepts learned using an extremely large collection of multimodal medical data. Leaning on recent theoretical insights, we demonstrate how an insurance claims database of 60 million members, a collection of 20 million clinical notes, and 1.7 million full text biomedical journal articles can be combined to embed concepts into a common space, resulting in the largest ever set of embeddings for 108,477 medical concepts. To evaluate our approach, we present a new benchmark methodology based on statistical power specifically designed to test embeddings of medical concepts. Our approach, called cui2vec, attains state-of-the-art performance relative to previous methods in most instances. Finally, we provide a downloadable set of pre-trained embeddings for other researchers to use, as well as an online tool for interactive exploration of the cui2vec embeddings. |
Tasks | Word Embeddings |
Published | 2018-04-04 |
URL | https://arxiv.org/abs/1804.01486v3 |
PDF | https://arxiv.org/pdf/1804.01486v3.pdf |
PWC | https://paperswithcode.com/paper/clinical-concept-embeddings-learned-from |
Repo | https://github.com/hscells/cui2vec |
Framework | none |
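The released cui2vec embeddings are, in essence, a table mapping UMLS concept identifiers (CUIs) to dense vectors. A minimal sketch of how one might load such a table and rank concepts by cosine similarity; the file name and whitespace-separated format here are assumptions, not the paper's exact release format:

```python
import numpy as np

def load_embeddings(path):
    """Load 'CUI dim1 dim2 ...' rows into a dict of unit-normalized vectors."""
    vecs = {}
    with open(path) as f:
        for line in f:
            cui, *vals = line.split()
            v = np.array(vals, dtype=float)
            vecs[cui] = v / np.linalg.norm(v)
    return vecs

def nearest(vecs, query_cui, k=5):
    """Rank all other concepts by cosine similarity to the query concept."""
    q = vecs[query_cui]
    scores = {c: float(q @ v) for c, v in vecs.items() if c != query_cui}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

# Usage (hypothetical file and CUI): nearest(load_embeddings("cui2vec.txt"), some_cui)
```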
Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation
Title | Shall I Compare Thee to a Machine-Written Sonnet? An Approach to Algorithmic Sonnet Generation |
Authors | John Benhardt, Peter Hase, Liuyi Zhu, Cynthia Rudin |
Abstract | We provide an approach for generating beautiful poetry. Our sonnet-generation algorithm includes several novel elements that improve over the state of the art, leading to metrical, rhyming poetry with many human-like qualities. These novel elements include in-line punctuation, part of speech restrictions, and more appropriate training corpora. Our work is the winner of the 2018 PoetiX Literary Turing Test Award for computer-generated poetry. |
Tasks | Sonnet Generation |
Published | 2018-11-13 |
URL | https://arxiv.org/abs/1811.05067v2 |
PDF | https://arxiv.org/pdf/1811.05067v2.pdf |
PWC | https://paperswithcode.com/paper/shall-i-compare-thee-to-a-machine-written |
Repo | https://github.com/peterbhase/poetry-generation |
Framework | tf |
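Meter and rhyme constraints of the kind the abstract mentions are typically enforced as filters over generated candidate lines. A toy sketch, assuming a tiny hand-coded pronunciation table; a real system would use a full pronouncing dictionary such as CMUdict, and the paper's actual filtering details are not reproduced here:

```python
# Toy phonetic table: word -> (syllable count, rhyme key).
PHON = {
    "day":       (1, "AY"),
    "may":       (1, "AY"),
    "temperate": (3, "ATE"),
    "date":      (1, "ATE"),
}

def line_ok(words, target_syllables=10):
    """Accept a line only if its syllable count matches the meter target."""
    try:
        return sum(PHON[w][0] for w in words) == target_syllables
    except KeyError:
        return False  # reject lines containing out-of-vocabulary words

def rhymes(w1, w2):
    """Two distinct words rhyme (crudely) if their rhyme keys match."""
    return PHON[w1][1] == PHON[w2][1] and w1 != w2

print(line_ok(["day"] * 10))   # True: ten one-syllable words
print(rhymes("day", "may"))    # True
print(rhymes("day", "date"))   # False
```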
Towards real-time unsupervised monocular depth estimation on CPU
Title | Towards real-time unsupervised monocular depth estimation on CPU |
Authors | Matteo Poggi, Filippo Aleotti, Fabio Tosi, Stefano Mattoccia |
Abstract | Unsupervised depth estimation from a single image is a very attractive technique with several implications in robotics, autonomous navigation, augmented reality and so on. This topic represents a very challenging task, and the advent of deep learning made it possible to tackle this problem with excellent results. However, these architectures are extremely deep and complex. Thus, real-time performance can be achieved only by leveraging power-hungry GPUs that do not allow depth maps to be inferred in application fields characterized by low-power constraints. To tackle this issue, in this paper we propose a novel architecture capable of quickly inferring an accurate depth map on a CPU, even of an embedded system, using a pyramid of features extracted from a single input image. Similarly to state-of-the-art methods, we train our network in an unsupervised manner, casting depth estimation as an image reconstruction problem. Extensive experimental results on the KITTI dataset show that, compared to the top performing approach, our network has similar accuracy but a much lower complexity (about 6% of the parameters), enabling it to infer a depth map for a KITTI image in about 1.7 s on the Raspberry Pi 3 and at more than 8 Hz on a standard CPU. Moreover, by trading accuracy for efficiency, our network allows maps to be inferred at about 2 Hz and 40 Hz respectively, while still being more accurate than most state-of-the-art slower methods. To the best of our knowledge, this is the first method enabling such performance on CPUs, paving the way for effective deployment of unsupervised monocular depth estimation even on embedded systems. |
Tasks | Autonomous Navigation, Depth Estimation, Image Reconstruction, Monocular Depth Estimation |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11430v3 |
PDF | http://arxiv.org/pdf/1806.11430v3.pdf |
PWC | https://paperswithcode.com/paper/towards-real-time-unsupervised-monocular |
Repo | https://github.com/00marco/pydnet-duplicate |
Framework | tf |
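The unsupervised objective treats depth as image reconstruction: predicted disparity warps one stereo view toward the other, and the photometric error supervises the network. A minimal numpy sketch of that idea with nearest-neighbor warping and a plain L1 loss; the published loss also includes SSIM and smoothness terms, omitted here:

```python
import numpy as np

def warp_horizontal(img, disparity):
    """Shift each pixel of the right image left by its predicted disparity
    (nearest-neighbor sampling; real implementations use bilinear sampling)."""
    h, w = img.shape
    cols = np.arange(w)[None, :].repeat(h, axis=0)
    src = np.clip(cols - np.round(disparity).astype(int), 0, w - 1)
    return img[np.arange(h)[:, None], src]

def photometric_loss(left, right, disparity):
    """L1 reconstruction error between the left image and the warped right image."""
    return float(np.mean(np.abs(left - warp_horizontal(right, disparity))))

left = np.random.rand(4, 8)
right = np.roll(left, -1, axis=1)  # toy right view: shifted by 1 pixel
print(photometric_loss(left, right, np.ones((4, 8))))  # near zero except the border column
```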
Integrating Transformer and Paraphrase Rules for Sentence Simplification
Title | Integrating Transformer and Paraphrase Rules for Sentence Simplification |
Authors | Sanqiang Zhao, Rui Meng, Daqing He, Saptono Andi, Parmanto Bambang |
Abstract | Sentence simplification aims to reduce the complexity of a sentence while retaining its original meaning. Current models for sentence simplification adopt ideas from machine translation studies and implicitly learn simplification mapping rules from normal-simple sentence pairs. In this paper, we explore a novel model based on a multi-layer and multi-head attention architecture, and we propose two innovative approaches to integrate the Simple PPDB (A Paraphrase Database for Simplification), an external paraphrase knowledge base for simplification that covers a wide range of real-world simplification rules. The experiments show that the integration provides two major benefits: (1) the integrated model outperforms multiple state-of-the-art baseline models for sentence simplification in the literature, and (2) through analysis of the rule utilization, the model seeks to select more accurate simplification rules. The code and models used in the paper are available at https://github.com/Sanqiang/text_simplification. |
Tasks | Sentence Simplification |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11193v1 |
PDF | http://arxiv.org/pdf/1810.11193v1.pdf |
PWC | https://paperswithcode.com/paper/integrating-transformer-and-paraphrase-rules |
Repo | https://github.com/Sanqiang/text_simplification |
Framework | none |
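At a high level, integrating Simple PPDB means letting known complex-to-simple rules compete with the model's own preferences. A hedged sketch of that interplay with an illustrative two-rule table and a stand-in scorer; the paper's two actual integration mechanisms operate inside the Transformer and are not reproduced here:

```python
# Illustrative rule table in the spirit of Simple PPDB: complex -> (simple, confidence).
RULES = {"utilize": ("use", 0.95), "commence": ("start", 0.90)}

def simplify(tokens, score_fn):
    """Greedily apply a rule wherever the scorer prefers the simpler word.
    score_fn(tokens) -> float stands in for the trained model's likelihood."""
    out = list(tokens)
    for i, tok in enumerate(out):
        if tok in RULES:
            simple, conf = RULES[tok]
            cand = out[:i] + [simple] + out[i + 1:]
            if conf * score_fn(cand) >= score_fn(out):
                out = cand
    return out

# Toy scorer: prefer a shorter average word length.
score = lambda ts: 1.0 / (sum(map(len, ts)) / len(ts))
print(simplify("we utilize rules to commence work".split(), score))
# ['we', 'use', 'rules', 'to', 'start', 'work']
```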
Practical Text Classification With Large Pre-Trained Language Models
Title | Practical Text Classification With Large Pre-Trained Language Models |
Authors | Neel Kant, Raul Puri, Nikolai Yakovenko, Bryan Catanzaro |
Abstract | Multi-emotion sentiment classification is a natural language processing (NLP) problem with valuable use cases on real-world data. We demonstrate that large-scale unsupervised language modeling combined with finetuning offers a practical solution to this task on difficult datasets, including those with label class imbalance and domain-specific context. By training an attention-based Transformer network (Vaswani et al. 2017) on 40GB of text (Amazon reviews) (McAuley et al. 2015) and fine-tuning on the training set, our model achieves a 0.69 F1 score on the SemEval Task 1:E-c multi-dimensional emotion classification problem (Mohammad et al. 2018), based on the Plutchik wheel of emotions (Plutchik 1979). These results are competitive with state of the art models, including strong F1 scores on difficult (emotion) categories such as Fear (0.73), Disgust (0.77) and Anger (0.78), as well as competitive results on rare categories such as Anticipation (0.42) and Surprise (0.37). Furthermore, we demonstrate our application on a real world text classification task. We create a narrowly collected text dataset of real tweets on several topics, and show that our finetuned model outperforms general purpose commercially available APIs for sentiment and multidimensional emotion classification on this dataset by a significant margin. We also perform a variety of additional studies, investigating properties of deep learning architectures, datasets and algorithms for achieving practical multidimensional sentiment classification. Overall, we find that unsupervised language modeling and finetuning is a simple framework for achieving high quality results on real-world sentiment classification. |
Tasks | Emotion Classification, Language Modelling, Sentiment Analysis, Text Classification |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01207v1 |
PDF | http://arxiv.org/pdf/1812.01207v1.pdf |
PWC | https://paperswithcode.com/paper/practical-text-classification-with-large-pre |
Repo | https://github.com/NVIDIA/sentiment-discovery |
Framework | pytorch |
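The multi-label setup described here (one decision per Plutchik emotion) is conventionally trained with independent sigmoid outputs and binary cross-entropy. A minimal PyTorch sketch of such a classification head on top of encoder features; dimensions are placeholders, not the paper's configuration:

```python
import torch
import torch.nn as nn

NUM_EMOTIONS, FEAT_DIM = 11, 768  # SemEval Task 1:E-c has 11 emotion labels

head = nn.Linear(FEAT_DIM, NUM_EMOTIONS)
loss_fn = nn.BCEWithLogitsLoss()  # one independent binary decision per emotion

features = torch.randn(4, FEAT_DIM)                       # stand-in for encoder output
targets = torch.randint(0, 2, (4, NUM_EMOTIONS)).float()  # multi-hot emotion labels
loss = loss_fn(head(features), targets)
loss.backward()
print(loss.item())
```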
Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
Title | Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data |
Authors | Guokun Lai, Bohan Li, Guoqing Zheng, Yiming Yang |
Abstract | How to model the distribution of sequential data, including but not limited to speech and human motions, is an important ongoing research problem. It has been demonstrated that model capacity can be significantly enhanced by introducing stochastic latent variables in the hidden states of recurrent neural networks. Simultaneously, WaveNet, equipped with dilated convolutions, achieves astonishing empirical performance on the natural speech generation task. In this paper, we combine the ideas of stochastic latent variables and dilated convolutions, and propose a new architecture to model sequential data, termed Stochastic WaveNet, in which stochastic latent variables are injected into the WaveNet structure. We argue that Stochastic WaveNet enjoys powerful distribution modeling capacity and the advantage of parallel training from dilated convolutions. In order to efficiently infer the posterior distribution of the latent variables, a novel inference network structure is designed based on the characteristics of the WaveNet architecture. State-of-the-art performance on benchmark datasets is obtained by Stochastic WaveNet on natural speech modeling, and high quality human handwriting samples can be generated as well. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.06116v1 |
PDF | http://arxiv.org/pdf/1806.06116v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-wavenet-a-generative-latent |
Repo | https://github.com/laiguokun/SWaveNet |
Framework | pytorch |
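The two ingredients combined here are dilated causal convolutions (WaveNet) and stochastic latent variables injected into the hidden states. A simplified PyTorch sketch of one such block; the real model samples latents from a learned approximate posterior rather than a fixed Gaussian:

```python
import torch
import torch.nn as nn

class DilatedCausalBlock(nn.Module):
    """One WaveNet-style block: left-padded (causal) dilated convolution,
    with a sampled latent added to the hidden state, loosely illustrating
    the paper's injection of stochastic latent variables."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.dilation = dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)

    def forward(self, x):
        x_pad = nn.functional.pad(x, (self.dilation, 0))  # causal left padding
        h = torch.tanh(self.conv(x_pad))
        z = torch.randn_like(h)  # latent; a real model uses an inference network
        return h + 0.1 * z

seq = torch.randn(1, 16, 100)      # (batch, channels, time)
for d in (1, 2, 4, 8):             # stacked dilations widen the receptive field
    seq = DilatedCausalBlock(16, d)(seq)
print(seq.shape)                   # torch.Size([1, 16, 100])
```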
SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images
Title | SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images |
Authors | Yeonkun Lee, Jaeseok Jeong, Jongseob Yun, Wonjune Cho, Kuk-Jin Yoon |
Abstract | Omni-directional cameras have many advantages over conventional cameras in that they have a much wider field-of-view (FOV). Accordingly, several approaches have been proposed recently to apply convolutional neural networks (CNNs) to omni-directional images for various visual tasks. However, most of them use image representations defined in the Euclidean space after transforming the omni-directional views originally formed in the non-Euclidean space. This transformation leads to shape distortion due to nonuniform spatial resolving power and the loss of continuity. These effects make existing convolution kernels experience difficulties in extracting meaningful information. This paper presents a novel method to resolve such problems of applying CNNs to omni-directional images. The proposed method utilizes a spherical polyhedron to represent omni-directional views. This method minimizes the variance of the spatial resolving power on the sphere surface, and includes new convolution and pooling methods for the proposed representation. The proposed method can also be adopted by any existing CNN-based methods. The feasibility of the proposed method is demonstrated through classification, detection, and semantic segmentation tasks with synthetic and real datasets. |
Tasks | Semantic Segmentation |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08196v2 |
PDF | http://arxiv.org/pdf/1811.08196v2.pdf |
PWC | https://paperswithcode.com/paper/spherephd-applying-cnns-on-a-spherical |
Repo | https://github.com/keevin60907/SpherePHD |
Framework | tf |
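The spherical polyhedron in question is a geodesic refinement of an icosahedron: each triangular face is repeatedly split into four, with new vertices projected back onto the sphere, which keeps cell sizes nearly uniform. A numpy sketch of that subdivision step (the paper's convolution and pooling operators on this grid are not shown):

```python
import numpy as np

def subdivide(tri):
    """Split one spherical triangle into four by edge midpoints projected
    back onto the unit sphere (geodesic-polyhedron construction)."""
    a, b, c = tri
    norm = lambda v: v / np.linalg.norm(v)
    ab, bc, ca = norm(a + b), norm(b + c), norm(c + a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

# Start from one icosahedron face and refine it twice: 1 -> 4 -> 16 cells.
phi = (1 + 5 ** 0.5) / 2
v = [np.array(p, dtype=float) / np.linalg.norm(p)
     for p in [(-1, phi, 0), (1, phi, 0), (0, 1, phi)]]
faces = [tuple(v)]
for _ in range(2):
    faces = [t for f in faces for t in subdivide(f)]
print(len(faces))  # 16
```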
Dank Learning: Generating Memes Using Deep Neural Networks
Title | Dank Learning: Generating Memes Using Deep Neural Networks |
Authors | Abel L Peirson V, E Meltem Tolunay |
Abstract | We introduce a novel meme generation system, which given any image can produce a humorous and relevant caption. Furthermore, the system can be conditioned on not only an image but also a user-defined label relating to the meme template, giving a handle to the user on meme content. The system uses a pretrained Inception-v3 network to return an image embedding which is passed to an attention-based deep-layer LSTM model producing the caption - inspired by the widely recognised Show and Tell Model. We implement a modified beam search to encourage diversity in the captions. We evaluate the quality of our model using perplexity and human assessment on both the quality of memes generated and whether they can be differentiated from real ones. Our model produces original memes that cannot on the whole be differentiated from real ones. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.04510v1 |
PDF | http://arxiv.org/pdf/1806.04510v1.pdf |
PWC | https://paperswithcode.com/paper/dank-learning-generating-memes-using-deep |
Repo | https://github.com/alpv95/MemeProject |
Framework | tf |
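A "modified beam search to encourage diversity" can take several forms; one simple variant penalizes tokens a hypothesis has already used. The sketch below is an illustration of that idea, not the paper's exact modification, with a toy next-token distribution standing in for the LSTM decoder:

```python
import math

def diverse_beam_search(step_fn, start, width=3, length=5, penalty=0.5):
    """Beam search where a token's log-probability is reduced each time it
    already appears in the hypothesis, discouraging repetitive captions.
    step_fn(seq) -> {token: log_prob} stands in for the decoder."""
    beams = [(0.0, [start])]
    for _ in range(length):
        cands = []
        for score, seq in beams:
            for tok, lp in step_fn(seq).items():
                lp -= penalty * seq.count(tok)  # diversity penalty
                cands.append((score + lp, seq + [tok]))
        beams = sorted(cands, reverse=True)[:width]
    return beams[0][1]

# Toy distribution that always prefers "lol"; the penalty forces variety.
step = lambda seq: {"lol": math.log(0.6), "cat": math.log(0.3), "wow": math.log(0.1)}
print(diverse_beam_search(step, "<s>"))
```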
Matrix optimization on universal unitary photonic devices
Title | Matrix optimization on universal unitary photonic devices |
Authors | Sunil Pai, Ben Bartlett, Olav Solgaard, David A. B. Miller |
Abstract | Universal unitary photonic devices can apply arbitrary unitary transformations to a vector of input modes and provide a promising hardware platform for fast and energy-efficient machine learning using light. We simulate the gradient-based optimization of random unitary matrices on universal photonic devices composed of imperfect tunable interferometers. If device components are initialized uniform-randomly, the locally-interacting nature of the mesh components biases the optimization search space towards banded unitary matrices, limiting convergence to random unitary matrices. We detail a procedure for initializing the device by sampling from the distribution of random unitary matrices and show that this greatly improves convergence speed. We also explore mesh architecture improvements such as adding extra tunable beamsplitters or permuting waveguide layers to further improve the training speed and scalability of these devices. |
Tasks | |
Published | 2018-08-02 |
URL | https://arxiv.org/abs/1808.00458v3 |
PDF | https://arxiv.org/pdf/1808.00458v3.pdf |
PWC | https://paperswithcode.com/paper/matrix-optimization-on-universal-unitary |
Repo | https://github.com/solgaardlab/neurophox-notebooks |
Framework | tf |
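The building block of these meshes is the tunable Mach-Zehnder interferometer, whose 2x2 transfer matrix composes into arbitrary NxN unitaries. A numpy sketch under one common parameterization; sign and phase conventions vary between papers, so treat this as illustrative:

```python
import numpy as np

def mzi(theta, phi):
    """2x2 transfer matrix of one tunable Mach-Zehnder interferometer:
    two 50:50 beamsplitters around an internal phase shift, plus an
    external phase shifter. Meshes of these compose NxN unitaries."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)  # 50:50 beamsplitter
    inner = np.diag([np.exp(1j * theta), 1])        # internal phase shift
    outer = np.diag([np.exp(1j * phi), 1])          # external phase shift
    return outer @ bs @ inner @ bs

U = mzi(0.3, 1.1)
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True: the block is unitary
```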
Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
Title | Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces |
Authors | Alice Coucke, Alaa Saade, Adrien Ball, Théodore Bluche, Alexandre Caulier, David Leroy, Clément Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut Lavril, Maël Primet, Joseph Dureau |
Abstract | This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. The embedded inference is fast and accurate while enforcing privacy by design, as no personal user data is ever collected. Focusing on Automatic Speech Recognition and Natural Language Understanding, we detail our approach to training high-performance Machine Learning models that are small enough to run in real-time on small devices. Additionally, we describe a data generation procedure that provides sufficient, high-quality training data without compromising user privacy. |
Tasks | Speech Recognition, Spoken Language Understanding |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10190v3 |
PDF | http://arxiv.org/pdf/1805.10190v3.pdf |
PWC | https://paperswithcode.com/paper/snips-voice-platform-an-embedded-spoken |
Repo | https://github.com/snipsco/snips-nlu |
Framework | none |
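The accompanying snips-nlu library exposes a small training-and-parsing API. A sketch based on the repository's documented usage; the dataset JSON schema and exact API surface may differ across library versions:

```python
import io
import json

from snips_nlu import SnipsNLUEngine

# A dataset file in the Snips format: intents with example utterances and slots.
with io.open("dataset.json") as f:
    dataset = json.load(f)

engine = SnipsNLUEngine()
engine.fit(dataset)  # trains the intent classifier and slot filler locally

result = engine.parse("Turn the lights on in the kitchen")
print(json.dumps(result, indent=2))  # intent name plus extracted slots
```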
Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
Title | Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer |
Authors | Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, Cewu Lu |
Abstract | Human body part parsing, or human semantic part segmentation, is fundamental to many computer vision tasks. In conventional semantic segmentation methods, the ground truth segmentations are provided, and fully convolutional networks (FCN) are trained in an end-to-end scheme. Although these methods have demonstrated impressive results, their performance highly depends on the quantity and quality of training data. In this paper, we present a novel method to generate synthetic human part segmentation data using easily-obtained human keypoint annotations. Our key idea is to exploit the anatomical similarity among humans to transfer the parsing results of one person to another person with a similar pose. Using these estimated results as additional training data, our semi-supervised model outperforms its strongly-supervised counterpart by 6 mIOU on the PASCAL-Person-Part dataset, and we achieve state-of-the-art human parsing results. Our approach is general and can be readily extended to other object/animal parsing tasks, assuming that their anatomical similarity can be annotated by keypoints. The proposed model and accompanying source code are available at https://github.com/MVIG-SJTU/WSHP |
Tasks | Human Parsing, Human Part Segmentation, Semantic Segmentation, Transfer Learning |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04310v1 |
PDF | http://arxiv.org/pdf/1805.04310v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-and-semi-supervised-human-body-part |
Repo | https://github.com/MVIG-SJTU/WSHP |
Framework | pytorch |
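The transfer step needs a notion of pose similarity to decide which annotated person's parsing to morph onto an unlabeled one. A crude numpy stand-in that compares keypoint sets after removing translation and scale; the paper's actual matching and morphing procedure is richer than this:

```python
import numpy as np

def pose_distance(kp_a, kp_b):
    """Distance between two keypoint sets after removing translation and scale."""
    def normalize(kp):
        kp = kp - kp.mean(axis=0)
        return kp / (np.linalg.norm(kp) + 1e-8)
    return float(np.linalg.norm(normalize(kp_a) - normalize(kp_b)))

def nearest_pose(query, annotated):
    """Index of the annotated pose most similar to the query pose."""
    return int(np.argmin([pose_distance(query, kp) for kp in annotated]))

poses = [np.random.rand(16, 2) for _ in range(5)]  # 16 keypoints per person
print(nearest_pose(poses[0] * 2.0 + 1.0, poses))   # 0: invariant to scale/shift
```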
Structural Agnostic Modeling: Adversarial Learning of Causal Graphs
Title | Structural Agnostic Modeling: Adversarial Learning of Causal Graphs |
Authors | Diviyan Kalainathan, Olivier Goudet, Isabelle Guyon, David Lopez-Paz, Michèle Sebag |
Abstract | A new causal discovery method, Structural Agnostic Modeling (SAM), is presented in this paper. Leveraging both conditional independencies and distributional asymmetries in the data, SAM aims at recovering full causal models from continuous observational data in a multivariate non-parametric setting. The approach is based on a game between $d$ players, each estimating one variable's distribution conditionally on the others as a neural net, and an adversary aimed at discriminating between the overall joint conditional distribution and that of the original data. An original learning criterion combining distribution estimation, sparsity and acyclicity constraints is used to enforce the end-to-end optimization of the graph structure and parameters through stochastic gradient descent. Besides the theoretical analysis of the approach in the large sample limit, SAM is extensively validated experimentally on synthetic and real data. |
Tasks | Causal Discovery |
Published | 2018-03-13 |
URL | https://arxiv.org/abs/1803.04929v2 |
PDF | https://arxiv.org/pdf/1803.04929v2.pdf |
PWC | https://paperswithcode.com/paper/sam-structural-agnostic-model-causal |
Repo | https://github.com/Diviyan-Kalainathan/SAMv1 |
Framework | pytorch |
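The acyclicity constraint mentioned in the abstract can be made differentiable. The sketch below uses the NOTEARS-style trace-of-matrix-exponential penalty as a representative example; this is a related technique, not necessarily SAM's exact formulation, and it requires scipy:

```python
import numpy as np
from scipy.linalg import expm

def acyclicity(adj):
    """NOTEARS-style acyclicity measure: trace(exp(A * A)) - d is zero iff
    the weighted graph encoded by adjacency matrix A has no cycles."""
    d = adj.shape[0]
    return float(np.trace(expm(adj * adj)) - d)

dag = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)  # chain: acyclic
cyc = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)  # 3-cycle
print(acyclicity(dag))  # ~0.0
print(acyclicity(cyc))  # > 0: the penalty fires on the cycle
```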
BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images
Title | BubGAN: Bubble Generative Adversarial Networks for Synthesizing Realistic Bubbly Flow Images |
Authors | Yucheng Fu, Yang Liu |
Abstract | Bubble segmentation and size detection algorithms have been developed in recent years for their high efficiency and accuracy in measuring bubbly two-phase flows. In this work, we propose an architecture called bubble generative adversarial networks (BubGAN) for the generation of realistic synthetic images, which can be further used as training or benchmarking data for the development of advanced image processing algorithms. BubGAN is trained initially on a labeled bubble dataset consisting of ten thousand images. By learning the distribution of these bubbles, BubGAN can generate more realistic bubbles than the conventional models used in the literature. The trained BubGAN is conditioned on bubble feature parameters and has full control of bubble properties in terms of aspect ratio, rotation angle, circularity and edge ratio. A dataset of one million bubbles is pre-generated using the trained BubGAN. One can then assemble realistic bubbly flow images using this dataset and the associated image processing tool. These images contain detailed bubble information and therefore do not require additional manual labeling. This is more useful than a conventional GAN, which generates images without labeling information. The tool can be used to provide benchmarking and training data for existing image processing algorithms and to guide the future development of bubble detection algorithms. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02266v1 |
PDF | http://arxiv.org/pdf/1809.02266v1.pdf |
PWC | https://paperswithcode.com/paper/bubgan-bubble-generative-adversarial-networks |
Repo | https://github.com/ycfu/BubGAN |
Framework | none |
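Conditioning the generator on bubble feature parameters amounts to concatenating those parameters with the noise vector at the generator's input. A minimal PyTorch sketch with placeholder layer sizes, not the paper's architecture:

```python
import torch
import torch.nn as nn

NOISE_DIM, FEAT_DIM = 64, 4  # 4 conditions: aspect ratio, rotation, circularity, edge ratio

gen = nn.Sequential(
    nn.Linear(NOISE_DIM + FEAT_DIM, 128), nn.ReLU(),
    nn.Linear(128, 32 * 32), nn.Tanh(),  # one 32x32 grayscale bubble patch
)

z = torch.randn(8, NOISE_DIM)
feats = torch.rand(8, FEAT_DIM)          # desired bubble properties per sample
patches = gen(torch.cat([z, feats], dim=1)).view(8, 1, 32, 32)
print(patches.shape)                     # conditioned, hence self-labeled, patches
```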
Open Vocabulary Learning for Neural Chinese Pinyin IME
Title | Open Vocabulary Learning for Neural Chinese Pinyin IME |
Authors | Zhuosheng Zhang, Yafang Huang, Hai Zhao |
Abstract | Pinyin-to-character (P2C) conversion is the core component of a pinyin-based Chinese input method engine (IME). However, the conversion is seriously compromised by the ambiguity of Chinese characters corresponding to pinyin, as well as by predefined fixed vocabularies. To alleviate these inconveniences, we propose a neural P2C conversion model augmented by an online-updated vocabulary with a sampling mechanism to support open vocabulary learning while the IME is in use. Our experiments show that the proposed method outperforms commercial IMEs and state-of-the-art traditional models on a standard corpus and a real input history dataset in terms of multiple metrics, showing that the online-updated vocabulary indeed helps our IME follow user input behavior effectively. |
Tasks | |
Published | 2018-11-11 |
URL | https://arxiv.org/abs/1811.04352v4 |
PDF | https://arxiv.org/pdf/1811.04352v4.pdf |
PWC | https://paperswithcode.com/paper/neural-based-pinyin-to-character-conversion |
Repo | https://github.com/cooelf/OpenIME |
Framework | none |
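The online-updated vocabulary can be pictured as promoting words the user repeatedly confirms into the working lexicon. A toy Python sketch of that mechanism; the paper's sampling-based update is more involved:

```python
from collections import Counter

class OnlineVocab:
    """Toy open-vocabulary mechanism: words confirmed by the user during IME
    use are counted, and frequent ones are promoted into the vocabulary."""
    def __init__(self, base, promote_at=3):
        self.vocab = set(base)
        self.counts = Counter()
        self.promote_at = promote_at

    def observe(self, word):
        if word in self.vocab:
            return
        self.counts[word] += 1
        if self.counts[word] >= self.promote_at:
            self.vocab.add(word)  # now available to P2C conversion

v = OnlineVocab({"你好", "北京"})
for _ in range(3):
    v.observe("狗粮")        # a new word the user keeps confirming
print("狗粮" in v.vocab)     # True
```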
An AMR Aligner Tuned by Transition-based Parser
Title | An AMR Aligner Tuned by Transition-based Parser |
Authors | Yijia Liu, Wanxiang Che, Bo Zheng, Bing Qin, Ting Liu |
Abstract | In this paper, we propose a new rich-resource-enhanced AMR aligner that produces multiple alignments, and a new transition system for AMR parsing along with its oracle parser. Our aligner is further tuned by our oracle parser by picking the alignment that leads to the highest-scored achievable AMR graph. Experimental results show that our aligner outperforms the rule-based aligner of previous work, achieving a higher alignment F1 score and consistently improving two open-source AMR parsers. Based on our aligner and transition system, we develop a transition-based AMR parser that parses a sentence into its AMR graph directly. An ensemble of our parsers with only words and POS tags as input achieves a 68.4 Smatch F1 score. |
Tasks | AMR Parsing |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03541v1 |
PDF | http://arxiv.org/pdf/1810.03541v1.pdf |
PWC | https://paperswithcode.com/paper/an-amr-aligner-tuned-by-transition-based |
Repo | https://github.com/Oneplus/tamr |
Framework | none |
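The tuning loop described in the abstract reduces to a selection problem: among the aligner's candidate alignments, keep the one whose oracle parse yields the best achievable AMR graph. A skeletal sketch with a stand-in scoring function; running the oracle parser and Smatch scoring are abstracted away here:

```python
def pick_alignment(alignments, oracle_score):
    """Keep the candidate alignment whose oracle parse reaches the
    highest-scored achievable AMR graph. oracle_score(alignment) -> float
    stands in for parsing with the oracle and scoring (e.g., Smatch)."""
    return max(alignments, key=oracle_score)

# Toy candidates: achievable-graph quality summarized as a single number.
candidates = [{"id": 1, "quality": 0.62}, {"id": 2, "quality": 0.71}]
best = pick_alignment(candidates, lambda a: a["quality"])
print(best["id"])  # 2
```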