Paper Group AWR 28
Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages
Title | Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages |
Authors | Shyam Upadhyay, Jordan Kodner, Dan Roth |
Abstract | Generating the English transliteration of a name written in a foreign script is an important and challenging step in multilingual knowledge acquisition and information extraction. Existing approaches to transliteration generation require a large (>5000) number of training examples. This difficulty contrasts with transliteration discovery, a somewhat easier task that involves picking a plausible transliteration from a given list. In this work, we present a bootstrapping algorithm that uses constrained discovery to improve generation, and can be used with as few as 500 training examples, which we show can be sourced from annotators in a matter of hours. This opens the task to languages for which large numbers of training examples are unavailable. We evaluate transliteration generation performance itself, as well as the improvement it brings to cross-lingual candidate generation for entity linking, a typical downstream task. We present a comprehensive evaluation of our approach on nine languages, each written in a unique script. |
Tasks | Entity Linking, Transliteration |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07807v1 |
PDF | http://arxiv.org/pdf/1809.07807v1.pdf |
PWC | https://paperswithcode.com/paper/bootstrapping-transliteration-with |
Repo | https://github.com/shyamupa/hma-translit |
Framework | pytorch |
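The abstract above describes discovery as picking a plausible transliteration from a given list. A minimal sketch of that selection step follows. It is not the authors' algorithm: the `similarity` and `discover` helpers are hypothetical, and the score from Python's `difflib.SequenceMatcher` is only a stand-in for a learned transliteration model.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in for a learned score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover(hypothesis, candidates):
    """Return the candidate from the given list that best matches the hypothesis."""
    if not candidates:
        raise ValueError("candidate list must not be empty")
    return max(candidates, key=lambda name: similarity(hypothesis, name))


# A noisy generated transliteration is snapped to the closest listed name.
name_list = ["Muhammad", "Monica", "Mahmoud"]
best = discover("mohamad", name_list)
print(best)
```

Restricting the output to a known list is what makes discovery easier than open-ended generation: the model only has to rank candidates, not spell a name from scratch.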
Path Aggregation Network for Instance Segmentation
Title | Path Aggregation Network for Instance Segmentation |
Authors | Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, Jiaya Jia |
Abstract | The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet) aiming at boosting information flow in the proposal-based instance segmentation framework. Specifically, we enhance the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path between lower layers and the topmost feature. We present adaptive feature pooling, which links the feature grid and all feature levels to make useful information in each feature level propagate directly to the following proposal subnetworks. A complementary branch capturing different views for each proposal is created to further improve mask prediction. These improvements are simple to implement, with little extra computational overhead. Our PANet reaches the 1st place in the COCO 2017 Challenge Instance Segmentation task and the 2nd place in Object Detection task without large-batch training. It is also state-of-the-art on MVD and Cityscapes. Code is available at https://github.com/ShuLiu1993/PANet |
Tasks | Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01534v4 |
PDF | http://arxiv.org/pdf/1803.01534v4.pdf |
PWC | https://paperswithcode.com/paper/path-aggregation-network-for-instance |
Repo | https://github.com/YuefeiZ/PANet |
Framework | tf |
Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
Title | Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN |
Authors | Shiyi Lan, Ruichi Yu, Gang Yu, Larry S. Davis |
Abstract | Recent advances in deep convolutional neural networks (CNNs) have motivated researchers to adapt CNNs to directly model points in 3D point clouds. Modeling local structure has been proven to be important for the success of convolutional architectures, and researchers exploited the modeling of local point sets in the feature extraction hierarchy. However, limited attention has been paid to explicitly modeling the geometric structure amongst points in a local region. To address this problem, we propose Geo-CNN, which applies a generic convolution-like operation dubbed GeoConv to each point and its local neighborhood. Local geometric relationships among points are captured when extracting edge features between the center and its neighboring points. We first decompose the edge feature extraction process onto three orthogonal bases, and then aggregate the extracted features based on the angles between the edge vector and the bases. This encourages the network to preserve the geometric structure in Euclidean space throughout the feature extraction hierarchy. GeoConv is a generic and efficient operation that can be easily integrated into 3D point cloud analysis pipelines for multiple applications. We evaluate Geo-CNN on ModelNet40 and KITTI and achieve state-of-the-art performance. |
Tasks | Modeling Local Geometric Structure |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07782v1 |
PDF | http://arxiv.org/pdf/1811.07782v1.pdf |
PWC | https://paperswithcode.com/paper/modeling-local-geometric-structure-of-3d |
Repo | https://github.com/voidrank/Geo-CNN |
Framework | tf |
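The angle-based aggregation mentioned in the abstract above can be illustrated with a small sketch. It assumes the three bases are the coordinate axes and that the aggregation weights are the squared cosines of the angles between the edge vector and each axis. The abstract does not state the weighting, so both choices, and the function names `axis_weights` and `aggregate`, are assumptions made for illustration.

```python
def axis_weights(center, neighbor):
    """Squared cosines of the angles between the edge vector and the x, y, z axes."""
    edge = [n - c for c, n in zip(center, neighbor)]
    norm_sq = sum(e * e for e in edge)
    if norm_sq == 0.0:
        raise ValueError("center and neighbor must be distinct points")
    # cos(angle to axis k) = edge[k] / |edge|, so the squares sum to 1.
    return [e * e / norm_sq for e in edge]


def aggregate(per_axis_features, weights):
    """Weighted sum of the three per-axis feature vectors."""
    dim = len(per_axis_features[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, per_axis_features))
        for i in range(dim)
    ]


# Hypothetical features extracted along each axis for one neighbor.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
weights = axis_weights(center=(0.0, 0.0, 0.0), neighbor=(1.0, 1.0, 0.0))
edge_feature = aggregate(features, weights)
```

Because the weights depend only on the direction of the edge, a neighbor lying along one axis draws its feature entirely from that axis, which is one way to keep the local geometry visible to later layers.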
ScratchDet: Training Single-Shot Object Detectors from Scratch
Title | ScratchDet: Training Single-Shot Object Detectors from Scratch |
Authors | Rui Zhu, Shifeng Zhang, Xiaobo Wang, Longyin Wen, Hailin Shi, Liefeng Bo, Tao Mei |
Abstract | Current state-of-the-art object detectors are fine-tuned from off-the-shelf networks pretrained on the large-scale classification dataset ImageNet, which incurs some additional problems: 1) Classification and detection have different degrees of sensitivity to translation, resulting in a learning objective bias; 2) The architecture is limited by the classification network, which makes modification inconvenient. To cope with these problems, training detectors from scratch is a feasible solution. However, detectors trained from scratch generally perform worse than pretrained ones, and even suffer from convergence issues in training. In this paper, we explore how to train object detectors from scratch robustly. By analysing previous work on the optimization landscape, we find that one of the overlooked points in current trained-from-scratch detectors is BatchNorm. Resorting to the stable and predictable gradient brought by BatchNorm, detectors can be trained from scratch stably while keeping favourable performance independent of the network architecture. Taking advantage of this, we are able to explore various types of networks for object detection without suffering from poor convergence. Through extensive experiments and analyses on the downsampling factor, we propose the Root-ResNet backbone network, which makes full use of the information from original images. Our ScratchDet achieves state-of-the-art accuracy on PASCAL VOC 2007, 2012 and MS COCO among all train-from-scratch detectors and even performs better than several one-stage pretrained methods. Code will be made publicly available at https://github.com/KimSoybean/ScratchDet. |
Tasks | Object Detection |
Published | 2018-10-19 |
URL | https://arxiv.org/abs/1810.08425v4 |
PDF | https://arxiv.org/pdf/1810.08425v4.pdf |
PWC | https://paperswithcode.com/paper/scratchdetexploring-to-train-single-shot |
Repo | https://github.com/KimSoybean/ScratchDet |
Framework | pytorch |
TSViz: Demystification of Deep Learning Models for Time-Series Analysis
Title | TSViz: Demystification of Deep Learning Models for Time-Series Analysis |
Authors | Shoaib Ahmed Siddiqui, Dominik Mercier, Mohsin Munir, Andreas Dengel, Sheraz Ahmed |
Abstract | This paper presents a novel framework for demystification of convolutional deep learning models for time-series analysis. This is a step towards making informed/explainable decisions in the domain of time-series, powered by deep learning. There have been numerous efforts to increase the interpretability of image-centric deep neural network models, where the learned features are more intuitive to visualize. Visualization in the time-series domain is much more complicated as there is no direct interpretation of the filters and inputs as compared to the image modality. In addition, little or no attention has been devoted to the development of such tools in the domain of time-series in the past. TSViz provides possibilities to explore and analyze a network from different dimensions at different levels of abstraction, which includes identification of parts of the input that were responsible for a prediction (including per filter saliency), importance of different filters present in the network for a particular prediction, notion of diversity present in the network through filter clustering, understanding of the main sources of variation learnt by the network through inverse optimization, and analysis of the network’s robustness against adversarial noise. As a sanity check for the computed influence values, we demonstrate results regarding pruning of neural networks based on the computed influence information. These representations make it possible to understand the network features so that the acceptability of deep networks for time-series data can be enhanced. This is extremely important in domains like finance, industry 4.0, self-driving cars, health-care, counter-terrorism etc., where the reasons for reaching a particular prediction are as important as the prediction itself. We assess the proposed framework for interpretability with a set of desirable properties essential for any method. |
Tasks | Self-Driving Cars, Time Series, Time Series Analysis |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02952v2 |
PDF | http://arxiv.org/pdf/1802.02952v2.pdf |
PWC | https://paperswithcode.com/paper/tsviz-demystification-of-deep-learning-models |
Repo | https://github.com/shoaibahmed/TSViz-Core |
Framework | tf |
Latent-space Physics: Towards Learning the Temporal Evolution of Fluid Flow
Title | Latent-space Physics: Towards Learning the Temporal Evolution of Fluid Flow |
Authors | Steffen Wiewel, Moritz Becher, Nils Thuerey |
Abstract | We propose a method for the data-driven inference of temporal evolutions of physical functions with deep learning. More specifically, we target fluid flows, i.e. Navier-Stokes problems, and we propose a novel LSTM-based approach to predict the changes of pressure fields over time. The central challenge in this context is the high dimensionality of Eulerian space-time data sets. We demonstrate for the first time that dense 3D+time functions of physical systems can be predicted within the latent spaces of neural networks, and we arrive at a neural-network based simulation algorithm with significant practical speed-ups. We highlight the capabilities of our method with a series of complex liquid simulations, and with a set of single-phase buoyancy simulations. With a set of trained networks, our method is more than two orders of magnitude faster than a traditional pressure solver. Additionally, we present and discuss a series of detailed evaluations for the different components of our algorithm. |
Tasks | Dimensionality Reduction |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.10123v3 |
PDF | http://arxiv.org/pdf/1802.10123v3.pdf |
PWC | https://paperswithcode.com/paper/latent-space-physics-towards-learning-the |
Repo | https://github.com/wiewel/LatentSpacePhysics |
Framework | tf |
Information-Theoretic Active Learning for Content-Based Image Retrieval
Title | Information-Theoretic Active Learning for Content-Based Image Retrieval |
Authors | Björn Barz, Christoph Käding, Joachim Denzler |
Abstract | We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it for acquiring meaningful user feedback in the context of content-based image retrieval. Instead of combining different heuristics such as uncertainty, diversity, or density, our method is based on maximizing the mutual information between the predicted relevance of the images and the expected user feedback regarding the selected batch. We propose suitable approximations to this computationally demanding problem and also integrate an explicit model of user behavior that accounts for possible incorrect labels and unnameable instances. Furthermore, our approach does not only take the structure of the data but also the expected model output change caused by the user feedback into account. In contrast to other methods, ITAL turns out to be highly flexible and provides state-of-the-art performance across various datasets, such as MIRFLICKR and ImageNet. |
Tasks | Active Learning, Content-Based Image Retrieval, Image Retrieval |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02337v2 |
PDF | http://arxiv.org/pdf/1809.02337v2.pdf |
PWC | https://paperswithcode.com/paper/information-theoretic-active-learning-for |
Repo | https://github.com/cvjena/ITAL |
Framework | none |
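The quantity that the abstract above says ITAL maximizes, the mutual information between predicted relevance and expected user feedback, can be worked through for a single image. The sketch below is a toy model, not the paper's batch-mode approximation: it assumes binary relevance, a user who gives the wrong label with probability `flip_prob`, and no unnameable instances. The function names are illustrative.

```python
import math


def binary_entropy(q):
    """Entropy in bits of a Bernoulli(q) variable."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def mutual_information(p_relevant, flip_prob):
    """I(R; F) = H(F) - H(F | R) for relevance R and noisy feedback F."""
    # Probability that the user answers "relevant", including label mistakes.
    p_positive_feedback = (
        p_relevant * (1.0 - flip_prob) + (1.0 - p_relevant) * flip_prob
    )
    # Given R, the only remaining uncertainty in F is the label flip.
    return binary_entropy(p_positive_feedback) - binary_entropy(flip_prob)


uncertain_image = mutual_information(p_relevant=0.5, flip_prob=0.1)
confident_image = mutual_information(p_relevant=0.9, flip_prob=0.1)
```

In this toy setting, asking about the image the model is unsure of yields more information than asking about one it is already confident in, and a user who answers at random (`flip_prob=0.5`) yields none.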
Learning Deep Generative Models of Graphs
Title | Learning Deep Generative Models of Graphs |
Authors | Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia |
Abstract | Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry. Here we introduce a powerful new approach for learning generative models over graphs, which can capture both their structure and attributes. Our approach uses graph neural networks to express probabilistic dependencies among a graph’s nodes and edges, and can, in principle, learn distributions over any arbitrary graph. In a series of experiments our results show that once trained, our models can generate good quality samples of both synthetic graphs as well as real molecular graphs, both unconditionally and conditioned on data. Compared to baselines that do not use graph-structured representations, our models often perform far better. We also explore key challenges of learning generative models of graphs, such as how to handle symmetries and ordering of elements during the graph generation process, and offer possible solutions. Our work is the first and most general approach for learning generative models over arbitrary graphs, and opens new directions for moving away from restrictions of vector- and sequence-like knowledge representations, toward more expressive and flexible relational data structures. |
Tasks | Graph Generation, Knowledge Graphs |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.03324v1 |
PDF | http://arxiv.org/pdf/1803.03324v1.pdf |
PWC | https://paperswithcode.com/paper/learning-deep-generative-models-of-graphs |
Repo | https://github.com/snap-stanford/GraphRNN |
Framework | pytorch |
Differentially Private Generative Adversarial Network
Title | Differentially Private Generative Adversarial Network |
Authors | Liyang Xie, Kaixiang Lin, Shu Wang, Fei Wang, Jiayu Zhou |
Abstract | Generative Adversarial Network (GAN) and its variants have recently attracted intensive research interest due to their elegant theoretical foundation and excellent empirical performance as generative models. These tools provide a promising direction in studies where data availability is limited. One common issue in GANs is that the density of the learned generative distribution could concentrate on the training data points, meaning that they can easily remember training samples due to the high model complexity of deep networks. This becomes a major concern when GANs are applied to private or sensitive data such as patient medical records, and the concentration of distribution may divulge critical patient information. To address this issue, in this paper we propose a differentially private GAN (DPGAN) model, in which we achieve differential privacy in GANs by adding carefully designed noise to gradients during the learning procedure. We provide rigorous proof for the privacy guarantee, as well as comprehensive empirical evidence to support our analysis, where we demonstrate that our method can generate high quality data points at a reasonable privacy level. |
Tasks | |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.06739v1 |
PDF | http://arxiv.org/pdf/1802.06739v1.pdf |
PWC | https://paperswithcode.com/paper/differentially-private-generative-adversarial-1 |
Repo | https://github.com/illidanlab/dpgan |
Framework | tf |
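The idea of adding noise to gradients, as described in the abstract above, can be sketched generically. The abstract does not give the noise design, so this example assumes the common recipe of bounding the gradient norm and then adding Gaussian noise. It should not be read as the DPGAN procedure itself, and `privatize_gradient` and its parameters are hypothetical names.

```python
import math
import random


def privatize_gradient(grad, clip_norm, noise_std, rng):
    """Clip a gradient to a maximum L2 norm, then add Gaussian noise per entry."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Scale down only when the gradient exceeds the allowed norm.
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    return [g * scale + rng.gauss(0.0, noise_std) for g in grad]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy = privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_std=0.5, rng=rng)
clipped_only = privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_std=0.0, rng=rng)
```

Clipping bounds how much any single training example can move the parameters, and the noise scale then sets the trade-off between privacy and sample quality.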
Three hypergraph eigenvector centralities
Title | Three hypergraph eigenvector centralities |
Authors | Austin R. Benson |
Abstract | Eigenvector centrality is a standard network analysis tool for determining the importance of (or ranking of) entities in a connected system that is represented by a graph. However, many complex systems and datasets have natural multi-way interactions that are more faithfully modeled by a hypergraph. Here we extend the notion of graph eigenvector centrality to uniform hypergraphs. Traditional graph eigenvector centralities are given by a positive eigenvector of the adjacency matrix, which is guaranteed to exist by the Perron-Frobenius theorem under some mild conditions. The natural representation of a hypergraph is a hypermatrix (colloquially, a tensor). Using recently established Perron-Frobenius theory for tensors, we develop three tensor eigenvector centralities for hypergraphs, each with different interpretations. We show that these centralities can reveal different information on real-world data by analyzing hypergraphs constructed from n-gram frequencies, co-tagging on Stack Exchange, and drug combinations observed in patient emergency room visits. |
Tasks | |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09644v3 |
PDF | http://arxiv.org/pdf/1807.09644v3.pdf |
PWC | https://paperswithcode.com/paper/three-hypergraph-eigenvector-centralities |
Repo | https://github.com/arbenson/Hyper-Evec-Centrality |
Framework | none |
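The traditional graph case that the abstract above starts from, a positive eigenvector of the adjacency matrix, can be computed by power iteration. The sketch below covers only ordinary graphs, not the hypergraph tensor centralities the paper develops. It iterates with A + I rather than A, which has the same eigenvectors but also converges on connected bipartite graphs. That shift is a choice made for this example.

```python
def eigenvector_centrality(adj, iterations=200):
    """Power iteration for the positive eigenvector of a connected graph.

    `adj` is a symmetric 0/1 adjacency matrix given as a list of lists.
    The result is normalized so that the scores sum to 1.
    """
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # One multiplication by (A + I): each node keeps its own score
        # and adds the scores of its neighbors.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return x


# Path graph 0 - 1 - 2: the middle node should be the most central.
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = eigenvector_centrality(path)
```

For this path graph the exact eigenvector is proportional to (1, sqrt(2), 1), so the middle node scores sqrt(2) times each endpoint.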
Using Syntax to Ground Referring Expressions in Natural Images
Title | Using Syntax to Ground Referring Expressions in Natural Images |
Authors | Volkan Cirik, Taylor Berg-Kirkpatrick, Louis-Philippe Morency |
Abstract | We introduce GroundNet, a neural network for referring expression recognition – the task of localizing (or grounding) in an image the object referred to by a natural language expression. Our approach to this task is the first to rely on a syntactic analysis of the input referring expression in order to inform the structure of the computation graph. Given a parse tree for an input expression, we explicitly map the syntactic constituents and relationships present in the tree to a composed graph of neural modules that defines our architecture for performing localization. This syntax-based approach aids localization of both the target object and auxiliary supporting objects mentioned in the expression. As a result, GroundNet is more interpretable than previous methods: we can (1) determine which phrase of the referring expression points to which object in the image and (2) track how the localization of the target object is determined by the network. We study this property empirically by introducing a new set of annotations on the GoogleRef dataset to evaluate localization of supporting objects. Our experiments show that GroundNet achieves state-of-the-art accuracy in identifying supporting objects, while maintaining comparable performance in the localization of target objects. |
Tasks | |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.10547v1 |
PDF | http://arxiv.org/pdf/1805.10547v1.pdf |
PWC | https://paperswithcode.com/paper/using-syntax-to-ground-referring-expressions |
Repo | https://github.com/volkancirik/groundnet |
Framework | pytorch |
Unifying Probabilistic Models for Time-Frequency Analysis
Title | Unifying Probabilistic Models for Time-Frequency Analysis |
Authors | William J. Wilkinson, Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, Arno Solin |
Abstract | In audio signal processing, probabilistic time-frequency models have many benefits over their non-probabilistic counterparts. They adapt to the incoming signal, quantify uncertainty, and measure correlation between the signal’s amplitude and phase information, making time domain resynthesis straightforward. However, these models are still not widely used since they come at a high computational cost, and because they are formulated in such a way that it can be difficult to interpret all the modelling assumptions. By showing their equivalence to Spectral Mixture Gaussian processes, we illuminate the underlying model assumptions and provide a general framework for constructing more complex models that better approximate real-world signals. Our interpretation makes it intuitive to inspect, compare, and alter the models since all prior knowledge is encoded in the Gaussian process kernel functions. We utilise a state space representation to perform efficient inference via Kalman smoothing, and we demonstrate how our interpretation allows for efficient parameter learning in the frequency domain. |
Tasks | Gaussian Processes |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02489v6 |
PDF | http://arxiv.org/pdf/1811.02489v6.pdf |
PWC | https://paperswithcode.com/paper/unifying-probabilistic-models-for-time |
Repo | https://github.com/wil-j-wil/unifying-prob-time-freq |
Framework | none |
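The abstract above mentions inference via Kalman smoothing on a state space representation. A minimal scalar sketch of that general technique follows: a forward Kalman filter and a backward Rauch-Tung-Striebel pass. The linear-Gaussian model, its parameters, and the function name `kalman_smooth` are assumptions for illustration and are far simpler than the spectral mixture models in the paper.

```python
def kalman_smooth(observations, a=1.0, q=0.1, r=1.0, m0=0.0, p0=1.0):
    """Scalar Kalman filter followed by a Rauch-Tung-Striebel smoother.

    Assumed model: x_t = a * x_{t-1} + N(0, q),  y_t = x_t + N(0, r).
    Returns (smoothed_means, smoothed_variances, filtered_variances).
    """
    pred_m, pred_p, filt_m, filt_p = [], [], [], []
    m, p = m0, p0
    for y in observations:
        # Predict one step ahead, then correct with the new observation.
        mp = a * m
        pp = a * a * p + q
        gain = pp / (pp + r)
        m = mp + gain * (y - mp)
        p = (1.0 - gain) * pp
        pred_m.append(mp)
        pred_p.append(pp)
        filt_m.append(m)
        filt_p.append(p)

    # Backward pass: refine each estimate using all later observations.
    smooth_m, smooth_p = filt_m[:], filt_p[:]
    for t in range(len(observations) - 2, -1, -1):
        g = filt_p[t] * a / pred_p[t + 1]
        smooth_m[t] = filt_m[t] + g * (smooth_m[t + 1] - pred_m[t + 1])
        smooth_p[t] = filt_p[t] + g * g * (smooth_p[t + 1] - pred_p[t + 1])
    return smooth_m, smooth_p, filt_p


means, variances, filtered = kalman_smooth([1.0, 1.2, 0.9, 1.1, 1.0])
```

Both passes are linear in the number of time steps, which is why a state space formulation is attractive for long audio signals.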
Multi-Mapping Image-to-Image Translation with Central Biasing Normalization
Title | Multi-Mapping Image-to-Image Translation with Central Biasing Normalization |
Authors | Xiaoming Yu, Zhenqiang Ying, Thomas Li, Shan Liu, Ge Li |
Abstract | Image-to-image translation is a class of image processing and vision problems that translates an image to a different style or domain. To improve the capacity and performance of one-to-one translation models, multi-mapping image translation methods attempt to extend them to multiple mappings by injecting a latent code. Through the analysis of the existing latent code injection models, we find that the latent code can determine the target mapping of a generator by controlling the output statistical properties, especially the mean value. However, we find that in some cases the normalization will reduce the consistency of the same mapping or the diversity of different mappings. After mathematical analysis, we find that the reason is that the distributions of the same mapping become inconsistent after batch normalization, and that the effects of the latent code are eliminated after instance normalization. To solve these problems, we propose consistency within diversity design criteria for multi-mapping networks. Based on the design criteria, we propose central biasing normalization (CBN) to replace existing latent code injection. CBN can be easily integrated into existing multi-mapping models, significantly reducing model parameters. Experiments show that the results of our method are more stable and diverse than those of existing models. https://github.com/Xiaoming-Yu/cbn |
Tasks | Image-to-Image Translation |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.10050v4 |
PDF | http://arxiv.org/pdf/1806.10050v4.pdf |
PWC | https://paperswithcode.com/paper/multi-mapping-image-to-image-translation-with |
Repo | https://github.com/Xiaoming-Yu/cbn |
Framework | pytorch |
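The abstract's observation that instance normalization eliminates the effect of a latent code on the mean can be checked numerically. The sketch below works on one feature channel stored as a flat list. The `central_biasing_norm` function follows the general idea of adding a bias after normalization, but the scalar `bias` is a hypothetical stand-in for whatever mapping of the latent code the method actually uses.

```python
import math


def instance_norm(x, eps=1e-5):
    """Normalize one feature channel to zero mean and (nearly) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def central_biasing_norm(x, bias):
    """Normalize first, then add a bias so the shift survives normalization."""
    return [v + bias for v in instance_norm(x)]


channel = [0.2, 1.5, -0.7, 3.0]

# Injecting the code as an offset BEFORE normalization has no effect.
shifted_then_normed = instance_norm([v + 5.0 for v in channel])
plain_normed = instance_norm(channel)

# Injecting it AFTER normalization controls the output mean directly.
biased = central_biasing_norm(channel, bias=0.8)
output_mean = sum(biased) / len(biased)
```

Here `shifted_then_normed` matches `plain_normed` up to rounding, while `output_mean` equals the injected bias, which is the behavior the design criteria in the abstract ask for.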
Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods
Title | Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods |
Authors | Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang |
Abstract | We introduce a new benchmark, WinoBias, for coreference resolution focused on gender bias. Our corpus contains Winograd-schema style sentences with entities corresponding to people referred to by their occupation (e.g. the nurse, the doctor, the carpenter). We demonstrate that a rule-based, a feature-rich, and a neural coreference system all link gendered pronouns to pro-stereotypical entities with higher accuracy than anti-stereotypical entities, by an average difference of 21.1 in F1 score. Finally, we demonstrate a data-augmentation approach that, in combination with existing word-embedding debiasing techniques, removes the bias demonstrated by these systems in WinoBias without significantly affecting their performance on existing coreference benchmark datasets. Our dataset and code are available at http://winobias.org. |
Tasks | Coreference Resolution, Data Augmentation |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06876v1 |
PDF | http://arxiv.org/pdf/1804.06876v1.pdf |
PWC | https://paperswithcode.com/paper/gender-bias-in-coreference-resolution-1 |
Repo | https://github.com/uclanlp/corefBias |
Framework | none |
Building an Ellipsis-aware Chinese Dependency Treebank for Web Text
Title | Building an Ellipsis-aware Chinese Dependency Treebank for Web Text |
Authors | Xuancheng Ren, Xu Sun, Ji Wen, Bingzhen Wei, Weidong Zhan, Zhiyuan Zhang |
Abstract | Web 2.0 has brought with it large amounts of user-produced data revealing people's thoughts, experiences, and knowledge, which are a great source for many tasks, such as information extraction and knowledge base construction. However, the colloquial nature of the texts poses new challenges for current natural language processing techniques, which are better adapted to the formal form of the language. Ellipsis is a common linguistic phenomenon in which some words are left out because they are understood from the context, especially in oral utterances. It hinders the improvement of dependency parsing, which is of great importance for tasks that rely on the meaning of the sentence. In order to promote research in this area, we are releasing a Chinese dependency treebank of 319 weibos, containing 572 sentences with omissions restored and contexts preserved. |
Tasks | Dependency Parsing |
Published | 2018-01-20 |
URL | http://arxiv.org/abs/1801.06613v2 |
PDF | http://arxiv.org/pdf/1801.06613v2.pdf |
PWC | https://paperswithcode.com/paper/building-an-ellipsis-aware-chinese-dependency |
Repo | https://github.com/lancopku/Chinese-Dependency-Treebank-with-Ellipsis |
Framework | none |