October 21, 2019

Paper Group AWR 28

Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages


Title	Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages
Authors	Shyam Upadhyay, Jordan Kodner, Dan Roth
Abstract	Generating the English transliteration of a name written in a foreign script is an important and challenging step in multilingual knowledge acquisition and information extraction. Existing approaches to transliteration generation require a large (>5000) number of training examples. This difficulty contrasts with transliteration discovery, a somewhat easier task that involves picking a plausible transliteration from a given list. In this work, we present a bootstrapping algorithm that uses constrained discovery to improve generation, and can be used with as few as 500 training examples, which we show can be sourced from annotators in a matter of hours. This opens the task to languages for which a large number of training examples are unavailable. We evaluate transliteration generation performance itself, as well as the improvement it brings to cross-lingual candidate generation for entity linking, a typical downstream task. We present a comprehensive evaluation of our approach on nine languages, each written in a unique script.
Tasks	Entity Linking, Transliteration
Published	2018-09-20
URL	http://arxiv.org/abs/1809.07807v1
PDF	http://arxiv.org/pdf/1809.07807v1.pdf
PWC	https://paperswithcode.com/paper/bootstrapping-transliteration-with
Repo	https://github.com/shyamupa/hma-translit
Framework	pytorch

Path Aggregation Network for Instance Segmentation


Title	Path Aggregation Network for Instance Segmentation
Authors	Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, Jiaya Jia
Abstract	The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet) aiming at boosting information flow in proposal-based instance segmentation framework. Specifically, we enhance the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path between lower layers and topmost feature. We present adaptive feature pooling, which links feature grid and all feature levels to make useful information in each feature level propagate directly to following proposal subnetworks. A complementary branch capturing different views for each proposal is created to further improve mask prediction. These improvements are simple to implement, with subtle extra computational overhead. Our PANet reaches the 1st place in the COCO 2017 Challenge Instance Segmentation task and the 2nd place in Object Detection task without large-batch training. It is also state-of-the-art on MVD and Cityscapes. Code is available at https://github.com/ShuLiu1993/PANet
Tasks	Instance Segmentation, Object Detection, Semantic Segmentation
Published	2018-03-05
URL	http://arxiv.org/abs/1803.01534v4
PDF	http://arxiv.org/pdf/1803.01534v4.pdf
PWC	https://paperswithcode.com/paper/path-aggregation-network-for-instance
Repo	https://github.com/YuefeiZ/PANet
Framework	tf

Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN


Title	Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
Authors	Shiyi Lan, Ruichi Yu, Gang Yu, Larry S. Davis
Abstract	Recent advances in deep convolutional neural networks (CNNs) have motivated researchers to adapt CNNs to directly model points in 3D point clouds. Modeling local structure has been proven to be important for the success of convolutional architectures, and researchers exploited the modeling of local point sets in the feature extraction hierarchy. However, limited attention has been paid to explicitly model the geometric structure amongst points in a local region. To address this problem, we propose Geo-CNN, which applies a generic convolution-like operation dubbed as GeoConv to each point and its local neighborhood. Local geometric relationships among points are captured when extracting edge features between the center and its neighboring points. We first decompose the edge feature extraction process onto three orthogonal bases, and then aggregate the extracted features based on the angles between the edge vector and the bases. This encourages the network to preserve the geometric structure in Euclidean space throughout the feature extraction hierarchy. GeoConv is a generic and efficient operation that can be easily integrated into 3D point cloud analysis pipelines for multiple applications. We evaluate Geo-CNN on ModelNet40 and KITTI and achieve state-of-the-art performance.
Tasks	Modeling Local Geometric Structure
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07782v1
PDF	http://arxiv.org/pdf/1811.07782v1.pdf
PWC	https://paperswithcode.com/paper/modeling-local-geometric-structure-of-3d
Repo	https://github.com/voidrank/Geo-CNN
Framework	tf
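
The aggregation step described in the Geo-CNN abstract above (decompose an edge feature onto three orthogonal bases, then combine the parts according to the angles between the edge vector and the bases) can be illustrated with a minimal sketch. The squared-cosine weighting, the choice of the coordinate axes as bases, and the function names are assumptions made for illustration; the abstract does not specify them, and the authors' repository should be consulted for the actual operation.

```python
def angle_weights(edge):
    """Squared cosines of the angles between `edge` and the x, y, z axes.

    Assumption: cos^2 weighting; the three weights always sum to 1.
    """
    norm_sq = sum(c * c for c in edge)
    if norm_sq == 0.0:
        raise ValueError("edge vector must be non-zero")
    return [c * c / norm_sq for c in edge]


def aggregate_edge_feature(edge, per_axis_features):
    """Blend three per-axis feature vectors using the angle weights.

    `per_axis_features` holds one feature vector per basis direction
    (hypothetical outputs of three direction-specific transforms).
    """
    weights = angle_weights(edge)
    dim = len(per_axis_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, per_axis_features))
        for i in range(dim)
    ]
```

Because the weights depend only on the direction of the edge, an edge lying along one axis uses that axis's feature alone, and an edge halfway between two axes averages their features.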

ScratchDet: Training Single-Shot Object Detectors from Scratch


Title	ScratchDet: Training Single-Shot Object Detectors from Scratch
Authors	Rui Zhu, Shifeng Zhang, Xiaobo Wang, Longyin Wen, Hailin Shi, Liefeng Bo, Tao Mei
Abstract	Current state-of-the-art object detectors are fine-tuned from off-the-shelf networks pretrained on the large-scale classification dataset ImageNet, which incurs some additional problems: 1) classification and detection have different degrees of sensitivity to translation, resulting in a learning objective bias; 2) the architecture is limited by the classification network, which makes modification inconvenient. To cope with these problems, training detectors from scratch is a feasible solution. However, detectors trained from scratch generally perform worse than pretrained ones, and can even suffer from convergence issues in training. In this paper, we explore how to train object detectors from scratch robustly. By analysing previous work on the optimization landscape, we find that one of the overlooked points in current trained-from-scratch detectors is BatchNorm. Resorting to the stable and predictable gradient brought by BatchNorm, detectors can be trained from scratch stably while keeping favourable performance independent of the network architecture. Taking advantage of this, we are able to explore various types of networks for object detection without suffering from poor convergence. Through extensive experiments and analyses of the downsampling factor, we propose the Root-ResNet backbone network, which makes full use of the information from original images. Our ScratchDet achieves state-of-the-art accuracy on PASCAL VOC 2007, 2012 and MS COCO among all train-from-scratch detectors and even performs better than several one-stage pretrained methods. Code will be made publicly available at https://github.com/KimSoybean/ScratchDet.
Tasks	Object Detection
Published	2018-10-19
URL	https://arxiv.org/abs/1810.08425v4
PDF	https://arxiv.org/pdf/1810.08425v4.pdf
PWC	https://paperswithcode.com/paper/scratchdetexploring-to-train-single-shot
Repo	https://github.com/KimSoybean/ScratchDet
Framework	pytorch

TSViz: Demystification of Deep Learning Models for Time-Series Analysis


Title	TSViz: Demystification of Deep Learning Models for Time-Series Analysis
Authors	Shoaib Ahmed Siddiqui, Dominik Mercier, Mohsin Munir, Andreas Dengel, Sheraz Ahmed
Abstract	This paper presents a novel framework for demystification of convolutional deep learning models for time-series analysis. This is a step towards making informed/explainable decisions in the domain of time-series, powered by deep learning. There have been numerous efforts to increase the interpretability of image-centric deep neural network models, where the learned features are more intuitive to visualize. Visualization in the time-series domain is much more complicated, as there is no direct interpretation of the filters and inputs as compared to the image modality. In addition, little or no attention has been devoted to the development of such tools in the domain of time-series in the past. TSViz provides possibilities to explore and analyze a network from different dimensions at different levels of abstraction, which includes identification of parts of the input that were responsible for a prediction (including per-filter saliency), importance of different filters present in the network for a particular prediction, the notion of diversity present in the network through filter clustering, understanding of the main sources of variation learnt by the network through inverse optimization, and analysis of the network’s robustness against adversarial noise. As a sanity check for the computed influence values, we demonstrate results regarding pruning of neural networks based on the computed influence information. These representations make it possible to understand the network features so that the acceptability of deep networks for time-series data can be enhanced. This is extremely important in domains like finance, industry 4.0, self-driving cars, health-care, counter-terrorism etc., where the reasons for reaching a particular prediction are as important as the prediction itself. We assess the proposed framework for interpretability with a set of desirable properties essential for any method.
Tasks	Self-Driving Cars, Time Series, Time Series Analysis
Published	2018-02-08
URL	http://arxiv.org/abs/1802.02952v2
PDF	http://arxiv.org/pdf/1802.02952v2.pdf
PWC	https://paperswithcode.com/paper/tsviz-demystification-of-deep-learning-models
Repo	https://github.com/shoaibahmed/TSViz-Core
Framework	tf

Latent-space Physics: Towards Learning the Temporal Evolution of Fluid Flow


Title	Latent-space Physics: Towards Learning the Temporal Evolution of Fluid Flow
Authors	Steffen Wiewel, Moritz Becher, Nils Thuerey
Abstract	We propose a method for the data-driven inference of temporal evolutions of physical functions with deep learning. More specifically, we target fluid flows, i.e. Navier-Stokes problems, and we propose a novel LSTM-based approach to predict the changes of pressure fields over time. The central challenge in this context is the high dimensionality of Eulerian space-time data sets. We demonstrate for the first time that dense 3D+time functions of physics systems can be predicted within the latent spaces of neural networks, and we arrive at a neural-network based simulation algorithm with significant practical speed-ups. We highlight the capabilities of our method with a series of complex liquid simulations, and with a set of single-phase buoyancy simulations. With a set of trained networks, our method is more than two orders of magnitude faster than a traditional pressure solver. Additionally, we present and discuss a series of detailed evaluations for the different components of our algorithm.
Tasks	Dimensionality Reduction
Published	2018-02-27
URL	http://arxiv.org/abs/1802.10123v3
PDF	http://arxiv.org/pdf/1802.10123v3.pdf
PWC	https://paperswithcode.com/paper/latent-space-physics-towards-learning-the
Repo	https://github.com/wiewel/LatentSpacePhysics
Framework	tf

Information-Theoretic Active Learning for Content-Based Image Retrieval


Title	Information-Theoretic Active Learning for Content-Based Image Retrieval
Authors	Björn Barz, Christoph Käding, Joachim Denzler
Abstract	We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it for acquiring meaningful user feedback in the context of content-based image retrieval. Instead of combining different heuristics such as uncertainty, diversity, or density, our method is based on maximizing the mutual information between the predicted relevance of the images and the expected user feedback regarding the selected batch. We propose suitable approximations to this computationally demanding problem and also integrate an explicit model of user behavior that accounts for possible incorrect labels and unnameable instances. Furthermore, our approach does not only take the structure of the data but also the expected model output change caused by the user feedback into account. In contrast to other methods, ITAL turns out to be highly flexible and provides state-of-the-art performance across various datasets, such as MIRFLICKR and ImageNet.
Tasks	Active Learning, Content-Based Image Retrieval, Image Retrieval
Published	2018-09-07
URL	http://arxiv.org/abs/1809.02337v2
PDF	http://arxiv.org/pdf/1809.02337v2.pdf
PWC	https://paperswithcode.com/paper/information-theoretic-active-learning-for
Repo	https://github.com/cvjena/ITAL
Framework	none

Learning Deep Generative Models of Graphs


Title	Learning Deep Generative Models of Graphs
Authors	Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia
Abstract	Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry. Here we introduce a powerful new approach for learning generative models over graphs, which can capture both their structure and attributes. Our approach uses graph neural networks to express probabilistic dependencies among a graph’s nodes and edges, and can, in principle, learn distributions over any arbitrary graph. In a series of experiments our results show that once trained, our models can generate good quality samples of both synthetic graphs as well as real molecular graphs, both unconditionally and conditioned on data. Compared to baselines that do not use graph-structured representations, our models often perform far better. We also explore key challenges of learning generative models of graphs, such as how to handle symmetries and ordering of elements during the graph generation process, and offer possible solutions. Our work is the first and most general approach for learning generative models over arbitrary graphs, and opens new directions for moving away from restrictions of vector- and sequence-like knowledge representations, toward more expressive and flexible relational data structures.
Tasks	Graph Generation, Knowledge Graphs
Published	2018-03-08
URL	http://arxiv.org/abs/1803.03324v1
PDF	http://arxiv.org/pdf/1803.03324v1.pdf
PWC	https://paperswithcode.com/paper/learning-deep-generative-models-of-graphs
Repo	https://github.com/snap-stanford/GraphRNN
Framework	pytorch

Differentially Private Generative Adversarial Network


Title	Differentially Private Generative Adversarial Network
Authors	Liyang Xie, Kaixiang Lin, Shu Wang, Fei Wang, Jiayu Zhou
Abstract	Generative Adversarial Network (GAN) and its variants have recently attracted intensive research interests due to their elegant theoretical foundation and excellent empirical performance as generative models. These tools provide a promising direction in the studies where data availability is limited. One common issue in GANs is that the density of the learned generative distribution could concentrate on the training data points, meaning that they can easily remember training samples due to the high model complexity of deep networks. This becomes a major concern when GANs are applied to private or sensitive data such as patient medical records, and the concentration of distribution may divulge critical patient information. To address this issue, in this paper we propose a differentially private GAN (DPGAN) model, in which we achieve differential privacy in GANs by adding carefully designed noise to gradients during the learning procedure. We provide rigorous proof for the privacy guarantee, as well as comprehensive empirical evidence to support our analysis, where we demonstrate that our method can generate high quality data points at a reasonable privacy level.
Tasks
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06739v1
PDF	http://arxiv.org/pdf/1802.06739v1.pdf
PWC	https://paperswithcode.com/paper/differentially-private-generative-adversarial-1
Repo	https://github.com/illidanlab/dpgan
Framework	tf
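
The DPGAN abstract above describes achieving differential privacy by adding noise to gradients during training. A generic version of that pattern (bound the gradient's norm, then add Gaussian noise) can be sketched as follows. This is an illustrative sketch, not the authors' procedure: the function name, the L2 clipping step, and the parameters `clip_norm` and `noise_std` are assumptions, and the abstract does not say how the noise scale is calibrated to a privacy budget.

```python
import math
import random


def privatize_gradient(grad, clip_norm, noise_std, rng):
    """Clip `grad` to L2 norm `clip_norm`, then add Gaussian noise.

    Hypothetical helper: a real training loop would apply this to the
    discriminator gradient before each optimizer step.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    # Scale down only when the gradient exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    clipped = [g * scale for g in grad]
    # Independent zero-mean Gaussian noise on every coordinate.
    return [g + rng.gauss(0.0, noise_std) for g in clipped]


rng = random.Random(0)
noisy = privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_std=0.5, rng=rng)
```

Clipping bounds how much any single example can move the parameters, which is what lets a fixed noise level translate into a privacy guarantee; a larger `noise_std` gives stronger privacy at the cost of sample quality.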

Three hypergraph eigenvector centralities


Title	Three hypergraph eigenvector centralities
Authors	Austin R. Benson
Abstract	Eigenvector centrality is a standard network analysis tool for determining the importance of (or ranking of) entities in a connected system that is represented by a graph. However, many complex systems and datasets have natural multi-way interactions that are more faithfully modeled by a hypergraph. Here we extend the notion of graph eigenvector centrality to uniform hypergraphs. Traditional graph eigenvector centralities are given by a positive eigenvector of the adjacency matrix, which is guaranteed to exist by the Perron-Frobenius theorem under some mild conditions. The natural representation of a hypergraph is a hypermatrix (colloquially, a tensor). Using recently established Perron-Frobenius theory for tensors, we develop three tensor eigenvector centralities for hypergraphs, each with different interpretations. We show that these centralities can reveal different information on real-world data by analyzing hypergraphs constructed from n-gram frequencies, co-tagging on stack exchange, and drug combinations observed in patient emergency room visits.
Tasks
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09644v3
PDF	http://arxiv.org/pdf/1807.09644v3.pdf
PWC	https://paperswithcode.com/paper/three-hypergraph-eigenvector-centralities
Repo	https://github.com/arbenson/Hyper-Evec-Centrality
Framework	none
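
For readers unfamiliar with the graph case that the abstract above generalizes, the positive eigenvector of an adjacency matrix can be approximated by power iteration. The sketch below covers only ordinary graphs, not the three hypergraph centralities of the paper. The function name is hypothetical, and iterating with A + I instead of A is a choice made here so that the iteration also converges on bipartite graphs; the shift leaves the eigenvectors unchanged.

```python
def eigenvector_centrality(adj, iters=200, tol=1e-12):
    """Approximate the Perron eigenvector of a connected undirected graph.

    `adj` is a symmetric adjacency matrix given as a list of lists.
    The result is normalized so that its entries sum to 1.
    """
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iters):
        # One step of power iteration with the shifted matrix A + I.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [v / total for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Path graph 0 - 1 - 2: the middle node should be the most central.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
centrality = eigenvector_centrality(path)
```

On the path graph, the two endpoints receive equal scores and the middle node scores higher, matching the intuition that a node is important when its neighbors are important.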

Using Syntax to Ground Referring Expressions in Natural Images


Title	Using Syntax to Ground Referring Expressions in Natural Images
Authors	Volkan Cirik, Taylor Berg-Kirkpatrick, Louis-Philippe Morency
Abstract	We introduce GroundNet, a neural network for referring expression recognition – the task of localizing (or grounding) in an image the object referred to by a natural language expression. Our approach to this task is the first to rely on a syntactic analysis of the input referring expression in order to inform the structure of the computation graph. Given a parse tree for an input expression, we explicitly map the syntactic constituents and relationships present in the tree to a composed graph of neural modules that defines our architecture for performing localization. This syntax-based approach aids localization of both the target object and auxiliary supporting objects mentioned in the expression. As a result, GroundNet is more interpretable than previous methods: we can (1) determine which phrase of the referring expression points to which object in the image and (2) track how the localization of the target object is determined by the network. We study this property empirically by introducing a new set of annotations on the GoogleRef dataset to evaluate localization of supporting objects. Our experiments show that GroundNet achieves state-of-the-art accuracy in identifying supporting objects, while maintaining comparable performance in the localization of target objects.
Tasks
Published	2018-05-26
URL	http://arxiv.org/abs/1805.10547v1
PDF	http://arxiv.org/pdf/1805.10547v1.pdf
PWC	https://paperswithcode.com/paper/using-syntax-to-ground-referring-expressions
Repo	https://github.com/volkancirik/groundnet
Framework	pytorch

Unifying Probabilistic Models for Time-Frequency Analysis


Title	Unifying Probabilistic Models for Time-Frequency Analysis
Authors	William J. Wilkinson, Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, Arno Solin
Abstract	In audio signal processing, probabilistic time-frequency models have many benefits over their non-probabilistic counterparts. They adapt to the incoming signal, quantify uncertainty, and measure correlation between the signal’s amplitude and phase information, making time domain resynthesis straightforward. However, these models are still not widely used since they come at a high computational cost, and because they are formulated in such a way that it can be difficult to interpret all the modelling assumptions. By showing their equivalence to Spectral Mixture Gaussian processes, we illuminate the underlying model assumptions and provide a general framework for constructing more complex models that better approximate real-world signals. Our interpretation makes it intuitive to inspect, compare, and alter the models since all prior knowledge is encoded in the Gaussian process kernel functions. We utilise a state space representation to perform efficient inference via Kalman smoothing, and we demonstrate how our interpretation allows for efficient parameter learning in the frequency domain.
Tasks	Gaussian Processes
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02489v6
PDF	http://arxiv.org/pdf/1811.02489v6.pdf
PWC	https://paperswithcode.com/paper/unifying-probabilistic-models-for-time
Repo	https://github.com/wil-j-wil/unifying-prob-time-freq
Framework	none
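
The abstract above relates probabilistic time-frequency models to Spectral Mixture Gaussian processes. As background, the standard one-dimensional spectral mixture kernel is a weighted sum of Gaussian-windowed cosines, k(tau) = sum_q w_q exp(-2 pi^2 tau^2 v_q) cos(2 pi mu_q tau). The sketch below evaluates that textbook form; the function and parameter names are illustrative, and the paper's own parameterization may differ.

```python
import math


def spectral_mixture_kernel(tau, weights, means, variances):
    """Evaluate a 1-D spectral mixture kernel at time lag `tau`.

    Each component q has a weight w_q, a centre frequency mu_q and a
    spectral variance v_q (which sets how fast the correlation decays).
    """
    return sum(
        w
        * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v)
        * math.cos(2.0 * math.pi * mu * tau)
        for w, mu, v in zip(weights, means, variances)
    )


# Two components centred at 1 Hz and 3 Hz (illustrative values).
k0 = spectral_mixture_kernel(0.0, [0.5, 1.5], [1.0, 3.0], [0.1, 0.2])
```

At zero lag every component contributes exactly its weight, so the kernel value equals the total signal variance; all prior assumptions about frequency content live in the three parameter lists.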

Multi-Mapping Image-to-Image Translation with Central Biasing Normalization


Title	Multi-Mapping Image-to-Image Translation with Central Biasing Normalization
Authors	Xiaoming Yu, Zhenqiang Ying, Thomas Li, Shan Liu, Ge Li
Abstract	Image-to-image translation is a class of image processing and vision problems that translates an image to a different style or domain. To improve the capacity and performance of one-to-one translation models, multi-mapping image translation methods attempt to extend them to multiple mappings by injecting latent code. Through the analysis of the existing latent code injection models, we find that latent code can determine the target mapping of a generator by controlling the output statistical properties, especially the mean value. However, we find that in some cases the normalization will reduce the consistency of the same mapping or the diversity of different mappings. After mathematical analysis, we find the reason is that the distributions of the same mapping become inconsistent after batch normalization, and that the effects of latent code are eliminated after instance normalization. To solve these problems, we propose consistency within diversity design criteria for multi-mapping networks. Based on the design criteria, we propose central biasing normalization (CBN) to replace existing latent code injection. CBN can be easily integrated into existing multi-mapping models, significantly reducing model parameters. Experiments show that the results of our method are more stable and diverse than those of existing models. https://github.com/Xiaoming-Yu/cbn
Tasks	Image-to-Image Translation
Published	2018-06-26
URL	http://arxiv.org/abs/1806.10050v4
PDF	http://arxiv.org/pdf/1806.10050v4.pdf
PWC	https://paperswithcode.com/paper/multi-mapping-image-to-image-translation-with
Repo	https://github.com/Xiaoming-Yu/cbn
Framework	pytorch

Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods


Title	Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods
Authors	Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang
Abstract	We introduce a new benchmark, WinoBias, for coreference resolution focused on gender bias. Our corpus contains Winograd-schema style sentences with entities corresponding to people referred by their occupation (e.g. the nurse, the doctor, the carpenter). We demonstrate that a rule-based, a feature-rich, and a neural coreference system all link gendered pronouns to pro-stereotypical entities with higher accuracy than anti-stereotypical entities, by an average difference of 21.1 in F1 score. Finally, we demonstrate a data-augmentation approach that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by these systems in WinoBias without significantly affecting their performance on existing coreference benchmark datasets. Our dataset and code are available at http://winobias.org.
Tasks	Coreference Resolution, Data Augmentation
Published	2018-04-18
URL	http://arxiv.org/abs/1804.06876v1
PDF	http://arxiv.org/pdf/1804.06876v1.pdf
PWC	https://paperswithcode.com/paper/gender-bias-in-coreference-resolution-1
Repo	https://github.com/uclanlp/corefBias
Framework	none

Building an Ellipsis-aware Chinese Dependency Treebank for Web Text


Title	Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Authors	Xuancheng Ren, Xu Sun, Ji Wen, Bingzhen Wei, Weidong Zhan, Zhiyuan Zhang
Abstract	Web 2.0 has brought with it a large amount of user-produced data revealing people's thoughts, experiences, and knowledge, which is a great source for many tasks, such as information extraction and knowledge base construction. However, the colloquial nature of the texts poses new challenges for current natural language processing techniques, which are better adapted to the formal form of the language. Ellipsis is a common linguistic phenomenon in which some words are left out because they are understood from the context, especially in oral utterances. It hinders the improvement of dependency parsing, which is of great importance for tasks that rely on the meaning of the sentence. In order to promote research in this area, we are releasing a Chinese dependency treebank of 319 weibos, containing 572 sentences with omissions restored and contexts preserved.
Tasks	Dependency Parsing
Published	2018-01-20
URL	http://arxiv.org/abs/1801.06613v2
PDF	http://arxiv.org/pdf/1801.06613v2.pdf
PWC	https://paperswithcode.com/paper/building-an-ellipsis-aware-chinese-dependency
Repo	https://github.com/lancopku/Chinese-Dependency-Treebank-with-Ellipsis
Framework	none