January 29, 2020

Paper Group ANR 538

Quantum Speedup in Adaptive Boosting of Binary Classification. Dying ReLU and Initialization: Theory and Numerical Examples. Graph Embedding VAE: A Permutation Invariant Model of Graph Structure. Learning event representations in image sequences by dynamic graph embedding. Representation Learning of EHR Data via Graph-Based Medical Entity Embedding …

Quantum Speedup in Adaptive Boosting of Binary Classification


Title	Quantum Speedup in Adaptive Boosting of Binary Classification
Authors	Ximing Wang, Yuechi Ma, Min-Hsiu Hsieh, Manhong Yung
Abstract	In classical machine learning, a set of weak classifiers can be adaptively combined to form a strong classifier for improving the overall performance, a technique called adaptive boosting (or AdaBoost). However, constructing the strong classifier for a large data set is typically resource consuming. Here we propose a quantum extension of AdaBoost, demonstrating a quantum algorithm that can output the optimal strong classifier with a quadratic speedup in the number of queries of the weak classifiers. Our results also include a generalization of the standard AdaBoost to the cases where the output of each classifier may be probabilistic even for the same input. We prove that the update rules and the query complexity of the non-deterministic classifiers are the same as those of deterministic classifiers, which may be of independent interest to the classical machine-learning community. Furthermore, the AdaBoost algorithm can also be applied to data encoded in the form of quantum states; we show how the training set can be simplified by using the tools of t-design. Our approach describes a model of quantum machine learning where quantum speedup is achieved in finding the optimal classifier, which can then be applied for classical machine-learning applications.
Tasks	Quantum Machine Learning
Published	2019-02-03
URL	http://arxiv.org/abs/1902.00869v1
PDF	http://arxiv.org/pdf/1902.00869v1.pdf
PWC	https://paperswithcode.com/paper/quantum-speedup-in-adaptive-boosting-of
Repo
Framework
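
For readers unfamiliar with the classical procedure that the abstract extends, the following is a minimal sketch of standard (classical, deterministic) AdaBoost on one-dimensional data with threshold stumps as weak classifiers. It is not the paper's quantum algorithm. The function names, the stump family, and the toy data are assumptions made only for illustration.

```python
import math

def stump(x, threshold, polarity):
    """Weak classifier: predicts `polarity` if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity

def train_adaboost(xs, ys, rounds):
    """Classical AdaBoost; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    values = sorted(set(xs))
    # Candidate thresholds: one below the minimum, plus midpoints between neighbours.
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    candidates = [(t, p) for t in thresholds for p in (1, -1)]
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error under the current weights.
        best = None
        for thr, pol in candidates:
            err = sum(w for x, y, w in zip(xs, ys, weights) if stump(x, thr, pol) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
        err, thr, pol = best
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Re-weight: misclassified points gain weight, correct ones lose it.
        weights = [w * math.exp(-alpha * y * stump(x, thr, pol))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of the stumps."""
    score = sum(alpha * stump(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score > 0 else -1

# Toy data that no single stump can fit.
XS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
```

Each round scans every candidate weak classifier, which is the kind of query cost that the abstract says the quantum version reduces quadratically.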

Dying ReLU and Initialization: Theory and Numerical Examples


Title	Dying ReLU and Initialization: Theory and Numerical Examples
Authors	Lu Lu, Yeonjong Shin, Yanhui Su, George Em Karniadakis
Abstract	The dying ReLU refers to the problem when ReLU neurons become inactive and only output 0 for any input. There are many empirical and heuristic explanations of why ReLU neurons die. However, little is known about its theoretical analysis. In this paper, we rigorously prove that a deep ReLU network will eventually die in probability as the depth goes to infinity. Several methods have been proposed to alleviate the dying ReLU. Perhaps one of the simplest treatments is to modify the initialization procedure. One common way of initializing weights and biases uses symmetric probability distributions, which suffers from the dying ReLU. We thus propose a new initialization procedure, namely, a randomized asymmetric initialization. We prove that the new initialization can effectively prevent the dying ReLU. All parameters required for the new initialization are theoretically designed. Numerical examples are provided to demonstrate the effectiveness of the new initialization procedure.
Tasks
Published	2019-03-15
URL	https://arxiv.org/abs/1903.06733v2
PDF	https://arxiv.org/pdf/1903.06733v2.pdf
PWC	https://paperswithcode.com/paper/dying-relu-and-initialization-theory-and
Repo
Framework
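
The depth effect described in the abstract can be observed in a small Monte Carlo experiment. The sketch below is an illustration under simplifying assumptions, not the paper's setup: weights are drawn from a standard normal distribution (a symmetric distribution), biases are fixed at zero, the network is narrow (width 2), and a network counts as "dead" when its last hidden layer outputs zero on every probe input. All names and parameter values are hypothetical.

```python
import random

def relu_layer(h, weights):
    """Apply one bias-free fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, h))) for row in weights]

def is_dead(depth, width, probes, rng):
    """Sample one network and check whether it outputs zero on all probes."""
    layers = []
    fan_in = 1  # scalar input
    for _ in range(depth):
        layers.append([[rng.gauss(0.0, 1.0) for _ in range(fan_in)]
                       for _ in range(width)])
        fan_in = width
    for x in probes:
        h = [x]
        for weights in layers:
            h = relu_layer(h, weights)
        if any(v > 0.0 for v in h):
            return False  # at least one neuron is still active for this input
    return True

def dead_fraction(depth, width=2, trials=200, seed=0):
    """Estimate the probability that a randomly initialized network is dead."""
    rng = random.Random(seed)
    probes = [-1.0, -0.5, 0.5, 1.0]
    return sum(is_dead(depth, width, probes, rng) for _ in range(trials)) / trials
```

Comparing `dead_fraction(2)` with `dead_fraction(30)` shows the estimated fraction of dead networks rising with depth, in line with the abstract's claim for symmetric initialization.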

Graph Embedding VAE: A Permutation Invariant Model of Graph Structure


Title	Graph Embedding VAE: A Permutation Invariant Model of Graph Structure
Authors	Tony Duan, Juho Lee
Abstract	Generative models of graph structure have applications in biology and social sciences. The state of the art is GraphRNN, which decomposes the graph generation process into a series of sequential steps. While effective for modest sizes, it loses its permutation invariance for larger graphs. Instead, we present a permutation invariant latent-variable generative model relying on graph embeddings to encode structure. Using tools from the random graph literature, our model is highly scalable to large graphs with likelihood evaluation and generation in $O(V + E)$.
Tasks	Graph Embedding, Graph Generation
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08057v1
PDF	https://arxiv.org/pdf/1910.08057v1.pdf
PWC	https://paperswithcode.com/paper/graph-embedding-vae-a-permutation-invariant
Repo
Framework

Learning event representations in image sequences by dynamic graph embedding


Title	Learning event representations in image sequences by dynamic graph embedding
Authors	Mariella Dimiccoli, Herwig Wendt
Abstract	Recently, self-supervised learning has proved to be effective for learning representations of events in image sequences, where events are understood as sets of temporally adjacent images that are semantically perceived as a whole. However, although this approach does not require expensive manual annotations, it is data hungry and suffers from domain adaptation problems. As an alternative, in this work, we propose a novel approach for learning event representations named Dynamic Graph Embedding (DGE). The assumption underlying our model is that a sequence of images can be represented by a graph that encodes both semantic and temporal similarity. The key novelty of DGE is to jointly learn the graph and its graph embedding. At its core, DGE works by iterating over two steps: 1) updating the graph representing the semantic and temporal structure of the data based on the current data representation, and 2) updating the data representation to take into account the current data graph structure. The main advantage of DGE over state-of-the-art self-supervised approaches is that it does not require any training set, but instead learns iteratively from the data itself a low-dimensional embedding that reflects their temporal and semantic structure. Experimental results on two benchmark datasets of real image sequences captured at regular intervals demonstrate that the proposed DGE leads to effective event representations. In particular, it achieves robust temporal segmentation on the EDUBSeg and EDUBSeg-Desc benchmark datasets, outperforming the state of the art.
Tasks	Domain Adaptation, Graph Embedding
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03483v1
PDF	https://arxiv.org/pdf/1910.03483v1.pdf
PWC	https://paperswithcode.com/paper/learning-event-representations-in-image
Repo
Framework

Representation Learning of EHR Data via Graph-Based Medical Entity Embedding


Title	Representation Learning of EHR Data via Graph-Based Medical Entity Embedding
Authors	Tong Wu, Yunlong Wang, Yue Wang, Emily Zhao, Yilian Yuan, Zhi Yang
Abstract	Automatic representation learning of key entities in electronic health record (EHR) data is a critical step for healthcare informatics that turns heterogeneous medical records into structured and actionable information. Here we propose ME2Vec, an algorithmic framework for learning low-dimensional vectors of the most common entities in EHR: medical services, doctors, and patients. ME2Vec leverages diverse graph embedding techniques to cater for the unique characteristic of each medical entity. Using real-world clinical data, we demonstrate the efficacy of ME2Vec over competitive baselines on disease diagnosis prediction.
Tasks	Graph Embedding, Representation Learning
Published	2019-10-07
URL	https://arxiv.org/abs/1910.02574v1
PDF	https://arxiv.org/pdf/1910.02574v1.pdf
PWC	https://paperswithcode.com/paper/representation-learning-of-ehr-data-via-graph
Repo
Framework

SFSegNet: Parse Freehand Sketches using Deep Fully Convolutional Networks


Title	SFSegNet: Parse Freehand Sketches using Deep Fully Convolutional Networks
Authors	Junkun Jiang, Ruomei Wang, Shujin Lin, Fei Wang
Abstract	Parsing sketches via semantic segmentation is attractive but challenging, because (i) free-hand drawings are abstract, with large variances in depicting objects due to different drawing styles and skills; (ii) distorting lines drawn on the touchpad make sketches more difficult to recognize; (iii) high-performance image segmentation via deep learning technologies needs enormous annotated sketch datasets during the training stage. In this paper, we propose a Sketch-target deep FCN Segmentation Network (SFSegNet) for automatic free-hand sketch segmentation, labeling each sketch in a single object with multiple parts. SFSegNet has an end-to-end network process between the input sketches and the segmentation results, composed of 2 parts: (i) a modified deep Fully Convolutional Network (FCN) using a reweighting strategy to ignore background pixels and classify which part each pixel belongs to; (ii) affine transform encoders that attempt to canonicalize the shaking strokes. We train our network with a dataset that consists of 10,000 annotated sketches, to find an extensively applicable model to segment strokes semantically in one ground truth. Extensive experiments are carried out and segmentation results show that our method outperforms other state-of-the-art networks.
Tasks	Semantic Segmentation
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05389v1
PDF	https://arxiv.org/pdf/1908.05389v1.pdf
PWC	https://paperswithcode.com/paper/sfsegnet-parse-freehand-sketches-using-deep
Repo
Framework

Mean-field Analysis of Batch Normalization


Title	Mean-field Analysis of Batch Normalization
Authors	Mingwei Wei, James Stokes, David J Schwab
Abstract	Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence. In this paper, we use mean-field theory to analytically quantify the impact of BatchNorm on the geometry of the loss landscape for multi-layer networks consisting of fully-connected and convolutional layers. We show that it has a flattening effect on the loss landscape, as quantified by the maximum eigenvalue of the Fisher Information Matrix. These findings are then used to justify the use of larger learning rates for networks that use BatchNorm, and we provide quantitative characterization of the maximal allowable learning rate to ensure convergence. Experiments support our theoretically predicted maximum learning rate, and furthermore suggest that networks with smaller values of the BatchNorm parameter achieve lower loss after the same number of epochs of training.
Tasks
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02606v1
PDF	http://arxiv.org/pdf/1903.02606v1.pdf
PWC	https://paperswithcode.com/paper/mean-field-analysis-of-batch-normalization
Repo
Framework
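
The link between curvature and the maximal learning rate can be illustrated on a quadratic model. The sketch below is a textbook fact, not the paper's mean-field derivation: for a loss of the form 0.5 * sum(lam_i * x_i^2), plain gradient descent converges only if the learning rate is below 2 / lam_max. Here the eigenvalues `lam_i` stand in, as an assumption, for the spectrum of the curvature matrix (the abstract uses the Fisher Information Matrix); all names and values are illustrative.

```python
def run_gd(eigenvalues, lr, steps=200):
    """Run gradient descent on 0.5 * sum(lam_i * x_i**2) and return the final loss."""
    x = [1.0] * len(eigenvalues)
    for _ in range(steps):
        # The gradient of the quadratic along coordinate i is lam_i * x_i.
        x = [xi - lr * lam * xi for xi, lam in zip(x, eigenvalues)]
    return 0.5 * sum(lam * xi * xi for lam, xi in zip(eigenvalues, x))

EIGENVALUES = [1.0, 10.0]                # lam_max = 10
CRITICAL_LR = 2.0 / max(EIGENVALUES)     # 0.2 for this toy spectrum

loss_stable = run_gd(EIGENVALUES, lr=0.19)    # just below the threshold
loss_unstable = run_gd(EIGENVALUES, lr=0.21)  # just above the threshold
```

In this toy model, a flatter landscape (smaller `lam_max`) raises `CRITICAL_LR`, which is the intuition behind the abstract's argument that the flattening effect permits larger learning rates.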

Hybrid Neural Models For Sequence Modelling: The Best Of Three Worlds


Title	Hybrid Neural Models For Sequence Modelling: The Best Of Three Worlds
Authors	Marco Dinarelli, Loïc Grobol
Abstract	We propose a neural architecture with the main characteristics of the most successful neural models of recent years: bidirectional RNNs, encoder-decoder, and the Transformer model. Evaluation on three sequence labelling tasks yields results that are close to the state-of-the-art for all tasks and better than it for some of them, showing the pertinence of this hybrid architecture for this kind of task.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07102v1
PDF	https://arxiv.org/pdf/1909.07102v1.pdf
PWC	https://paperswithcode.com/paper/hybrid-neural-models-for-sequence-modelling
Repo
Framework

RoPAD: Robust Presentation Attack Detection through Unsupervised Adversarial Invariance


Title	RoPAD: Robust Presentation Attack Detection through Unsupervised Adversarial Invariance
Authors	Ayush Jaiswal, Shuai Xia, Iacopo Masi, Wael AbdAlmageed
Abstract	For enterprise, personal and societal applications, there is now an increasing demand for automated authentication of identity from images using computer vision. However, current authentication technologies are still vulnerable to presentation attacks. We present RoPAD, an end-to-end deep learning model for presentation attack detection that employs unsupervised adversarial invariance to ignore visual distractors in images for increased robustness and reduced overfitting. Experiments show that the proposed framework exhibits state-of-the-art performance on presentation attack detection on several benchmark datasets.
Tasks
Published	2019-03-08
URL	http://arxiv.org/abs/1903.03691v2
PDF	http://arxiv.org/pdf/1903.03691v2.pdf
PWC	https://paperswithcode.com/paper/ropad-robust-presentation-attack-detection
Repo
Framework

Informative Image Captioning with External Sources of Information


Title	Informative Image Captioning with External Sources of Information
Authors	Sanqiang Zhao, Piyush Sharma, Tomer Levinboim, Radu Soricut
Abstract	An image caption should fluently present the essential information in a given image, including informative, fine-grained entity mentions and the manner in which these entities interact. However, current captioning models are usually trained to generate captions that only contain common object names, thus falling short on an important “informativeness” dimension. We present a mechanism for integrating image information together with fine-grained labels (assumed to be generated by some upstream models) into a caption that describes the image in a fluent and informative manner. We introduce a multimodal, multi-encoder model based on Transformer that ingests both image features and multiple sources of entity labels. We demonstrate that we can learn to control the appearance of these entity labels in the output, resulting in captions that are both fluent and informative.
Tasks	Image Captioning
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08876v1
PDF	https://arxiv.org/pdf/1906.08876v1.pdf
PWC	https://paperswithcode.com/paper/informative-image-captioning-with-external
Repo
Framework

A Neural Rendering Framework for Free-Viewpoint Relighting


Title	A Neural Rendering Framework for Free-Viewpoint Relighting
Authors	Zhang Chen, Anpei Chen, Guli Zhang, Chengyuan Wang, Yu Ji, Kiriakos N. Kutulakos, Jingyi Yu
Abstract	We present a novel Relightable Neural Renderer (RNR) for simultaneous view synthesis and relighting using multi-view image inputs. Existing neural rendering (NR) does not explicitly model the physical rendering process and hence has limited capabilities on relighting. RNR instead models image formation in terms of environment lighting, object intrinsic attributes, and the light transport function (LTF), each corresponding to a learnable component. In particular, the incorporation of a physically based rendering process not only enables relighting but also improves the quality of novel view synthesis. Comprehensive experiments on synthetic and real data show that RNR provides a practical and effective solution for conducting free-viewpoint relighting.
Tasks	Novel View Synthesis
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11530v1
PDF	https://arxiv.org/pdf/1911.11530v1.pdf
PWC	https://paperswithcode.com/paper/a-neural-rendering-framework-for-free
Repo
Framework

Enumerative Data Compression with Non-Uniquely Decodable Codes


Title	Enumerative Data Compression with Non-Uniquely Decodable Codes
Authors	M. Oğuzhan Külekci, Yasin Öztürk, Elif Altunok, Can Altıniğne
Abstract	Non-uniquely decodable codes can be defined as the codes that cannot be uniquely decoded without additional disambiguation information. These are mainly the class of non-prefix-free codes, where a codeword can be a prefix of other(s), and thus, the codeword boundary information is essential for correct decoding. Although the codeword bit stream consumes significantly less space when compared to prefix-free codes, the additional disambiguation information makes it difficult to match the overall performance of prefix-free codes. Previous studies considered compression with non-prefix-free codes by integrating rank/select dictionaries or wavelet trees to mark the codeword boundaries. In this study we focus on another dimension with a block-wise enumeration scheme that improves the compression ratios of the previous studies significantly. Experiments conducted on a known corpus showed that the proposed scheme successfully represents a source within its entropy, even performing better than the Huffman and arithmetic coding in some cases. The non-uniquely decodable codes also provide an intrinsic security feature due to lack of unique-decodability. We investigate this dimension as an opportunity to provide compressed data security without (or with less) encryption, and discuss various possible practical advantages supported by such codes.
Tasks
Published	2019-11-13
URL	https://arxiv.org/abs/1911.05676v1
PDF	https://arxiv.org/pdf/1911.05676v1.pdf
PWC	https://paperswithcode.com/paper/enumerative-data-compression-with-non
Repo
Framework
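
The ambiguity that the abstract attributes to non-prefix-free codes can be shown with a toy example. The sketch below uses a hypothetical four-symbol code in which "0" is a prefix of "00" and "01"; it enumerates every valid parse of a bit string and then decodes it unambiguously once codeword lengths are supplied as boundary information. It does not implement the paper's block-wise enumeration scheme.

```python
def all_decodings(bits, code):
    """Return every way to split `bits` into codewords of `code` (symbol -> bits)."""
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            tails = all_decodings(bits[len(word):], code)
            results.extend(symbol + tail for tail in tails)
    return results

def decode_with_lengths(bits, lengths, code):
    """Decode `bits` using explicit codeword lengths as boundary information."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, pos = [], 0
    for n in lengths:
        out.append(inverse[bits[pos:pos + n]])
        pos += n
    return "".join(out)

# Hypothetical non-prefix-free code, chosen only for illustration.
CODE = {"a": "0", "b": "1", "c": "00", "d": "01"}

ambiguous = all_decodings("001", CODE)            # several candidate messages
resolved = decode_with_lengths("001", [2, 1], CODE)
```

Without the length list `[2, 1]`, the three bits admit more than one message, which is also the property the abstract proposes to exploit as an intrinsic security feature.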


Predicting the Role of Political Trolls in Social Media


Title	Predicting the Role of Political Trolls in Social Media
Authors	Atanas Atanasov, Gianmarco De Francisci Morales, Preslav Nakov
Abstract	We investigate the political roles of “Internet trolls” in social media. Political trolls, such as the ones linked to the Russian Internet Research Agency (IRA), have recently gained enormous attention for their ability to sway public opinion and even influence elections. Analysis of the online traces of trolls has shown different behavioral patterns, which target different slices of the population. However, this analysis is manual and labor-intensive, thus making it impractical as a first-response tool for newly-discovered troll farms. In this paper, we show how to automate this analysis by using machine learning in a realistic setting. In particular, we show how to classify trolls according to their political role (left, news feed, right) by using features extracted from social media, i.e., Twitter, in two scenarios: (i) in a traditional supervised learning scenario, where labels for trolls are available, and (ii) in a distant supervision scenario, where labels for trolls are not available, and we rely on more-commonly-available labels for news outlets mentioned by the trolls. Technically, we leverage the community structure and the text of the messages in the online social network of trolls represented as a graph, from which we extract several types of learned representations, i.e., embeddings, for the trolls. Experiments on the “IRA Russian Troll” dataset show that our methodology improves over the state-of-the-art in the first scenario, while providing a compelling case for the second scenario, which has not been explored in the literature thus far.
Tasks
Published	2019-10-04
URL	https://arxiv.org/abs/1910.02001v1
PDF	https://arxiv.org/pdf/1910.02001v1.pdf
PWC	https://paperswithcode.com/paper/predicting-the-role-of-political-trolls-in
Repo
Framework

Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards


Title	Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards
Authors	Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin
Abstract	Generating image descriptions in different languages is essential to satisfy users worldwide. However, it is prohibitively expensive to collect a large-scale paired image-caption dataset for every target language, which is critical for training decent image captioning models. Previous works tackle the unpaired cross-lingual image captioning problem through a pivot language, that is, with the help of paired image-caption data in the pivot language and pivot-to-target machine translation models. However, such a language-pivoted approach suffers from inaccuracy brought by the pivot-to-target translation, including disfluency and visual irrelevancy errors. In this paper, we propose to generate cross-lingual image captions with self-supervised rewards in the reinforcement learning framework to alleviate these two types of errors. We employ self-supervision from a mono-lingual corpus in the target language to provide a fluency reward, and propose a multi-level visual semantic matching model to provide both sentence-level and concept-level visual relevancy rewards. We conduct extensive experiments for unpaired cross-lingual image captioning in both English and Chinese respectively on two widely used image caption corpora. The proposed approach achieves significant performance improvement over state-of-the-art methods.
Tasks	Image Captioning, Machine Translation
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05407v1
PDF	https://arxiv.org/pdf/1908.05407v1.pdf
PWC	https://paperswithcode.com/paper/unpaired-cross-lingual-image-caption
Repo
Framework

Face Image Reflection Removal


Title	Face Image Reflection Removal
Authors	Renjie Wan, Boxin Shi, Haoliang Li, Ling-Yu Duan, Alex C. Kot
Abstract	Face images captured through glass are usually contaminated by reflections. The non-transmitted reflections make the reflection removal more challenging than for general scenes, because important facial features are completely occluded. In this paper, we propose and solve the face image reflection removal problem. We remove non-transmitted reflections by incorporating inpainting ideas into a guided reflection removal framework and recover facial features by considering various face-specific priors. We use a newly collected face reflection image dataset to train our model and compare with state-of-the-art methods. The proposed method shows advantages in estimating reflection-free face images for improving face recognition.
Tasks	Face Recognition
Published	2019-03-03
URL	http://arxiv.org/abs/1903.00865v1
PDF	http://arxiv.org/pdf/1903.00865v1.pdf
PWC	https://paperswithcode.com/paper/face-image-reflection-removal
Repo
Framework