October 20, 2019

Paper Group AWR 303

Y-Net: Joint Segmentation and Classification for Diagnosis of Breast Biopsy Images. RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information. Challenges of language technologies for the indigenous languages of the Americas. Towards a Better Understanding and Regularization of GAN Training Dynamics. Convolutional CRFs …

Y-Net: Joint Segmentation and Classification for Diagnosis of Breast Biopsy Images

Title Y-Net: Joint Segmentation and Classification for Diagnosis of Breast Biopsy Images
Authors Sachin Mehta, Ezgi Mercan, Jamen Bartlett, Donald Weaver, Joann G. Elmore, Linda Shapiro
Abstract In this paper, we introduce a conceptually simple network for generating discriminative tissue-level segmentation masks for the purpose of breast cancer diagnosis. Our method efficiently segments different types of tissues in breast biopsy images while simultaneously predicting a discriminative map for identifying important areas in an image. Our network, Y-Net, extends and generalizes U-Net by adding a parallel branch for discriminative map generation and by supporting convolutional block modularity, which allows the user to adjust network efficiency without altering the network topology. Y-Net delivers state-of-the-art segmentation accuracy while learning 6.6x fewer parameters than its closest competitors. The addition of descriptive power from Y-Net’s discriminative segmentation masks improves diagnostic classification accuracy by 7% over state-of-the-art methods for diagnostic classification. Source code is available at: https://sacmehta.github.io/YNet.
Tasks Medical Image Segmentation
Published 2018-06-04
URL http://arxiv.org/abs/1806.01313v1
PDF http://arxiv.org/pdf/1806.01313v1.pdf
PWC https://paperswithcode.com/paper/y-net-joint-segmentation-and-classification
Repo https://github.com/sacmehta/YNet
Framework pytorch

RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information

Title RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information
Authors Shikhar Vashishth, Rishabh Joshi, Sai Suman Prayaga, Chiranjib Bhattacharyya, Partha Talukdar
Abstract Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically aligning relation instances in a Knowledge Base (KB) with unstructured text. In addition to relation instances, KBs often contain other relevant side information, such as aliases of relations (e.g., founded and co-founded are aliases for the relation founderOfCompany). RE models usually ignore such readily available side information. In this paper, we propose RESIDE, a distantly-supervised neural relation extraction method which utilizes additional side information from KBs for improved relation extraction. It uses entity type and relation alias information for imposing soft constraints while predicting relations. RESIDE employs Graph Convolution Networks (GCN) to encode syntactic information from text and improves performance even when limited side information is available. Through extensive experiments on benchmark datasets, we demonstrate RESIDE’s effectiveness. We have made RESIDE’s source code available to encourage reproducible research.
Tasks Relation Extraction, Relationship Extraction (Distant Supervised)
Published 2018-12-11
URL http://arxiv.org/abs/1812.04361v2
PDF http://arxiv.org/pdf/1812.04361v2.pdf
PWC https://paperswithcode.com/paper/reside-improving-distantly-supervised-neural
Repo https://github.com/malllabiisc/RESIDE
Framework tf

Challenges of language technologies for the indigenous languages of the Americas

Title Challenges of language technologies for the indigenous languages of the Americas
Authors Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra, Ivan Meza
Abstract Indigenous languages of the American continent are highly diverse. However, they have received little attention from the technological perspective. In this paper, we review the research, the digital resources and the available NLP systems that focus on these languages. We present the main challenges and research questions that arise when distant languages and low-resource scenarios are faced. We would like to encourage NLP research in linguistically rich and diverse areas like the Americas.
Tasks
Published 2018-06-12
URL http://arxiv.org/abs/1806.04291v1
PDF http://arxiv.org/pdf/1806.04291v1.pdf
PWC https://paperswithcode.com/paper/challenges-of-language-technologies-for-the
Repo https://github.com/pywirrarika/naki
Framework none

Towards a Better Understanding and Regularization of GAN Training Dynamics

Title Towards a Better Understanding and Regularization of GAN Training Dynamics
Authors Weili Nie, Ankit Patel
Abstract Generative adversarial networks (GANs) are notoriously difficult to train and the reasons underlying their (non-)convergence behaviors are still not completely understood. By first considering a simple yet representative GAN example, we mathematically analyze its local convergence behavior in a non-asymptotic way. Furthermore, the analysis is extended to general GANs under certain assumptions. We find that in order to ensure a good convergence rate, two factors of the Jacobian in the GAN training dynamics should be simultaneously avoided, which are (i) the Phase Factor, i.e., the Jacobian has complex eigenvalues with a large imaginary-to-real ratio, and (ii) the Conditioning Factor, i.e., the Jacobian is ill-conditioned. Previous methods of regularizing the Jacobian can only alleviate one of these two factors, while making the other more severe. Thus we propose a new JAcobian REgularization (JARE) for GANs, which simultaneously addresses both factors by construction. Finally, we conduct experiments that confirm our theoretical analysis and demonstrate the advantages of JARE over previous methods in stabilizing GANs.
Tasks
Published 2018-06-24
URL https://arxiv.org/abs/1806.09235v2
PDF https://arxiv.org/pdf/1806.09235v2.pdf
PWC https://paperswithcode.com/paper/jr-gan-jacobian-regularization-for-generative
Repo https://github.com/weilinie/JARE
Framework tf
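
The two factors named in the abstract can be illustrated on a 2x2 Jacobian. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and measuring the Conditioning Factor as the ratio of the largest to the smallest eigenvalue magnitude is an assumption made for illustration.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

def phase_factor(eigs):
    """Largest |imag| / |real| ratio; infinite for purely imaginary eigenvalues."""
    ratios = []
    for lam in eigs:
        if abs(lam.real) < 1e-12:
            ratios.append(math.inf if abs(lam.imag) > 1e-12 else 0.0)
        else:
            ratios.append(abs(lam.imag) / abs(lam.real))
    return max(ratios)

def conditioning_factor(eigs):
    """Ratio of largest to smallest eigenvalue magnitude (illustrative proxy)."""
    mags = [abs(lam) for lam in eigs]
    return max(mags) / min(mags) if min(mags) > 0 else math.inf

# Bilinear game f(x, y) = x * y under simultaneous gradient descent-ascent:
# the vector field is (-y, x), so the Jacobian is [[0, -1], [1, 0]].
rotation = eigenvalues_2x2(0, -1, 1, 0)

# A purely real but badly scaled Jacobian, diag(-1, -100).
stiff = eigenvalues_2x2(-1, 0, 0, -100)
```

In this toy setting the bilinear game has purely imaginary eigenvalues (a severe phase problem with perfect conditioning), while the diagonal example has no rotation at all but a large spread in eigenvalue magnitudes.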

Convolutional CRFs for Semantic Segmentation

Title Convolutional CRFs for Semantic Segmentation
Authors Marvin T. T. Teichmann, Roberto Cipolla
Abstract For the challenging semantic image segmentation task the most efficient models have traditionally combined the structured modelling capabilities of Conditional Random Fields (CRFs) with the feature extraction power of CNNs. In more recent works however, CRF post-processing has fallen out of favour. We argue that this is mainly due to the slow training and inference speeds of CRFs, as well as the difficulty of learning the internal CRF parameters. To overcome both issues we propose to add the assumption of conditional independence to the framework of fully-connected CRFs. This allows us to reformulate the inference in terms of convolutions, which can be implemented highly efficiently on GPUs. Doing so speeds up inference and training by a factor of more than 100. All parameters of the convolutional CRFs can easily be optimized using backpropagation. To facilitate further CRF research we make our implementation publicly available. Please visit: https://github.com/MarvinTeichmann/ConvCRF
Tasks Semantic Segmentation
Published 2018-05-12
URL http://arxiv.org/abs/1805.04777v2
PDF http://arxiv.org/pdf/1805.04777v2.pdf
PWC https://paperswithcode.com/paper/convolutional-crfs-for-semantic-segmentation
Repo https://github.com/MarvinTeichmann/ConvCRF
Framework pytorch

Latent Alignment and Variational Attention

Title Latent Alignment and Variational Attention
Authors Yuntian Deng, Yoon Kim, Justin Chiu, Demi Guo, Alexander M. Rush
Abstract Neural attention has become central to many state-of-the-art models in natural language processing and related domains. Attention networks are an easy-to-train and effective method for softly simulating alignment; however, the approach does not marginalize over latent alignments in a probabilistic sense. This property makes it difficult to compare attention to other alignment approaches, to compose it with probabilistic models, and to perform posterior inference conditioned on observed data. A related latent approach, hard attention, fixes these issues, but is generally harder to train and less accurate. This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference. We further propose methods for reducing the variance of gradients to make these approaches computationally feasible. Experiments show that for machine translation and visual question answering, inefficient exact latent variable models outperform standard neural attention, but these gains go away when using hard attention based training. On the other hand, variational attention retains most of the performance gain but with training speed comparable to neural attention.
Tasks Latent Variable Models, Machine Translation, Question Answering, Visual Question Answering
Published 2018-07-10
URL http://arxiv.org/abs/1807.03756v2
PDF http://arxiv.org/pdf/1807.03756v2.pdf
PWC https://paperswithcode.com/paper/latent-alignment-and-variational-attention
Repo https://github.com/harvardnlp/var-attn
Framework pytorch

Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings

Title Predicting Concreteness and Imageability of Words Within and Across Languages via Word Embeddings
Authors Nikola Ljubešić, Darja Fišer, Anita Peti-Stantić
Abstract The notions of concreteness and imageability, traditionally important in psycholinguistics, are gaining significance in semantic-oriented natural language processing tasks. In this paper we investigate the predictability of these two concepts via supervised learning, using word embeddings as explanatory variables. We perform predictions both within and across languages by exploiting collections of cross-lingual embeddings aligned to a single vector space. We show that the notions of concreteness and imageability are highly predictable both within and across languages, with a moderate loss of up to 20% in correlation when predicting across languages. We further show that the cross-lingual transfer via word embeddings is more efficient than the simple transfer via bilingual dictionaries.
Tasks Cross-Lingual Transfer, Word Embeddings
Published 2018-07-09
URL http://arxiv.org/abs/1807.02903v1
PDF http://arxiv.org/pdf/1807.02903v1.pdf
PWC https://paperswithcode.com/paper/predicting-concreteness-and-imageability-of
Repo https://github.com/clarinsi/megahr-crossling
Framework none

FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans

Title FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans
Authors Chen Liu, Jiaye Wu, Yasutaka Furukawa
Abstract The ultimate goal of this indoor mapping research is to automatically reconstruct a floorplan simply by walking through a house with a smartphone in a pocket. This paper tackles this problem by proposing FloorNet, a novel deep neural architecture. The challenge lies in the processing of RGBD streams spanning a large 3D space. FloorNet effectively processes the data through three neural network branches: 1) PointNet with 3D points, exploiting the 3D information; 2) CNN with a 2D point density image in a top-down view, enhancing the local spatial reasoning; and 3) CNN with RGB images, utilizing the full image information. FloorNet exchanges intermediate features across the branches to exploit the best of all the architectures. We have created a benchmark for floorplan reconstruction by acquiring RGBD video streams for 155 residential houses or apartments with Google Tango phones and annotating complete floorplan information. Our qualitative and quantitative evaluations demonstrate that the fusion of three branches effectively improves the reconstruction quality. We hope that the paper together with the benchmark will be an important step towards solving a challenging vector-graphics reconstruction problem. Code and data are available at https://github.com/art-programmer/FloorNet.
Tasks
Published 2018-03-31
URL http://arxiv.org/abs/1804.00090v1
PDF http://arxiv.org/pdf/1804.00090v1.pdf
PWC https://paperswithcode.com/paper/floornet-a-unified-framework-for-floorplan
Repo https://github.com/vohoaiviet/FloorNet
Framework tf

Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry

Title Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry
Authors Maximilian Nickel, Douwe Kiela
Abstract We are concerned with the discovery of hierarchical relationships from large-scale unstructured similarity scores. For this purpose, we study different models of hyperbolic space and find that learning embeddings in the Lorentz model is substantially more efficient than in the Poincaré-ball model. We show that the proposed approach allows us to learn high-quality embeddings of large taxonomies which yield improvements over Poincaré embeddings, especially in low dimensions. Lastly, we apply our model to discover hierarchies in two real-world datasets: we show that an embedding in hyperbolic space can reveal important aspects of a company’s organizational structure as well as reveal historical relationships between language families.
Tasks
Published 2018-06-09
URL http://arxiv.org/abs/1806.03417v2
PDF http://arxiv.org/pdf/1806.03417v2.pdf
PWC https://paperswithcode.com/paper/learning-continuous-hierarchies-in-the
Repo https://github.com/mtbarta/hyperbolic
Framework tf
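
For readers unfamiliar with the Lorentz (hyperboloid) model, the sketch below shows its distance function in plain Python. It is only an illustration of the geometry, not the paper's training code; the helper names are hypothetical, and clamping the arcosh argument at 1 is an added numerical safeguard.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian scalar product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift_to_hyperboloid(v):
    """Map a Euclidean vector v to the point (sqrt(1 + |v|^2), v) on the hyperboloid."""
    return [math.sqrt(1.0 + sum(a * a for a in v))] + list(v)

def lorentz_distance(x, y):
    """Geodesic distance arcosh(-<x, y>_L), with the argument clamped at 1."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

origin = lift_to_hyperboloid([0.0, 0.0])
# sqrt(1 + sinh(2)^2) = cosh(2), so this point lies at distance 2 from the origin.
point = lift_to_hyperboloid([math.sinh(2.0), 0.0])
dist = lorentz_distance(origin, point)
```

Every lifted point satisfies the constraint that its Lorentzian product with itself equals -1, which is what keeps it on the hyperboloid.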

Joint Embedding of Words and Labels for Text Classification

Title Joint Embedding of Words and Labels for Text Classification
Authors Guoyin Wang, Chunyuan Li, Wenlin Wang, Yizhe Zhang, Dinghan Shen, Xinyuan Zhang, Ricardo Henao, Lawrence Carin
Abstract Word embeddings are effective intermediate representations for capturing semantic regularities between words, when learning the representations of text sequences. We propose to view text classification as a label-word joint embedding problem: each label is embedded in the same space with the word vectors. We introduce an attention framework that measures the compatibility of embeddings between text sequences and labels. The attention is learned on a training set of labeled samples to ensure that, given a text sequence, the relevant words are weighted higher than the irrelevant ones. Our method maintains the interpretability of word embeddings, and enjoys a built-in ability to leverage alternative sources of information, in addition to input text sequences. Extensive results on several large text datasets show that the proposed framework outperforms the state-of-the-art methods by a large margin, in terms of both accuracy and speed.
Tasks Sentiment Analysis, Text Classification
Published 2018-05-10
URL http://arxiv.org/abs/1805.04174v1
PDF http://arxiv.org/pdf/1805.04174v1.pdf
PWC https://paperswithcode.com/paper/joint-embedding-of-words-and-labels-for-text
Repo https://github.com/ShuanDeMorian/Project_AT_News_Classification
Framework none
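
The idea of scoring words by their compatibility with label embeddings can be sketched as follows. This is a simplified illustration, not the paper's model: it assumes cosine similarity as the compatibility measure, takes the best-matching label per word, and omits any learned layers. All function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def label_attention(word_vecs, label_vecs):
    """One softmax weight per word, based on its best label compatibility."""
    scores = [max(cosine(w, c) for c in label_vecs) for w in word_vecs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def text_embedding(word_vecs, label_vecs):
    """Attention-weighted average of the word vectors."""
    weights = label_attention(word_vecs, label_vecs)
    dim = len(word_vecs[0])
    return [sum(w * vec[i] for w, vec in zip(weights, word_vecs)) for i in range(dim)]

words = [[1.0, 0.0], [0.0, 1.0]]   # toy word vectors
labels = [[1.0, 0.0]]              # one toy label vector in the same space
weights = label_attention(words, labels)
sentence = text_embedding(words, labels)
```

Because words and labels share one space, the word aligned with the label receives the larger weight and dominates the resulting text representation.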

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation

Title Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
Authors Dan Xu, Wei Wang, Hao Tang, Hong Liu, Nicu Sebe, Elisa Ricci
Abstract Recent works have shown the benefit of integrating Conditional Random Fields (CRFs) models into deep architectures for improving pixel-level prediction tasks. Following this line of research, in this paper we introduce a novel approach for monocular depth estimation. Similarly to previous works, our method employs a continuous CRF to fuse multi-scale information derived from different layers of a front-end Convolutional Neural Network (CNN). Differently from past works, our approach benefits from a structured attention model which automatically regulates the amount of information transferred between corresponding features at different scales. Importantly, the proposed attention model is seamlessly integrated into the CRF, allowing end-to-end training of the entire architecture. Our extensive experimental evaluation demonstrates the effectiveness of the proposed method which is competitive with previous methods on the KITTI benchmark and outperforms the state of the art on the NYU Depth V2 dataset.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2018-03-29
URL http://arxiv.org/abs/1803.11029v1
PDF http://arxiv.org/pdf/1803.11029v1.pdf
PWC https://paperswithcode.com/paper/structured-attention-guided-convolutional
Repo https://github.com/danxuhk/StructuredAttentionDepthEstimation
Framework pytorch

Latent RANSAC

Title Latent RANSAC
Authors Simon Korman, Roee Litman
Abstract We present a method that can evaluate a RANSAC hypothesis in constant time, i.e. independent of the size of the data. A key observation here is that correct hypotheses are tightly clustered together in the latent parameter domain. In a manner similar to the generalized Hough transform we seek to find this cluster, only that we need as few as two votes for a successful detection. Rapidly locating such pairs of similar hypotheses is made possible by adapting the recent “Random Grids” range-search technique. We only perform the usual (costly) hypothesis verification stage upon the discovery of a close pair of hypotheses. We show that this event rarely happens for incorrect hypotheses, enabling a significant speedup of the RANSAC pipeline. The suggested approach is applied and tested on three robust estimation problems: camera localization, 3D rigid alignment and 2D-homography estimation. We perform rigorous testing on both synthetic and real datasets, demonstrating an improvement in efficiency without a compromise in accuracy. Furthermore, we achieve state-of-the-art 3D alignment results on the challenging “Redwood” loop-closure challenge.
Tasks Camera Localization, Homography Estimation
Published 2018-02-20
URL http://arxiv.org/abs/1802.07045v2
PDF http://arxiv.org/pdf/1802.07045v2.pdf
PWC https://paperswithcode.com/paper/latent-ransac
Repo https://github.com/rlit/LatentRANSAC
Framework none
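
The "two votes" idea from the abstract can be illustrated with a simple grid hash over hypothesis parameters: verification is triggered only when two hypotheses land in the same cell. This is a rough sketch under stated assumptions, not the authors' implementation. It uses a single grid with a caller-supplied offset instead of the Random Grids scheme, and the function names are hypothetical.

```python
import math

def cell_of(hypothesis, cell_size, offset):
    """Index of the grid cell containing a parameter vector, for a shifted grid."""
    return tuple(math.floor((p - o) / cell_size) for p, o in zip(hypothesis, offset))

def first_colliding_pair(hypotheses, cell_size, offset):
    """Return indices (i, j) of the first two hypotheses sharing a cell, else None."""
    seen = {}
    for j, hyp in enumerate(hypotheses):
        key = cell_of(hyp, cell_size, offset)
        if key in seen:
            return seen[key], j   # a close pair: only now run costly verification
        seen[key] = j
    return None

# Toy 2-parameter hypotheses; the first and third are nearly identical.
hypotheses = [(0.1, 5.0), (3.2, -1.0), (0.15, 5.05), (7.0, 7.0)]
pair = first_colliding_pair(hypotheses, cell_size=0.5, offset=(0.0, 0.0))
```

Each hypothesis costs one dictionary lookup regardless of the number of data points, which is the sense in which per-hypothesis evaluation is independent of data size in this sketch. A single fixed grid can miss close pairs that straddle a cell boundary, which is one reason to use randomly shifted grids.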

Visual Localization Under Appearance Change: A Filtering Approach

Title Visual Localization Under Appearance Change: A Filtering Approach
Authors Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Shin-Fang Ch’ng, Thanh-Toan Do, Ian Reid
Abstract A major focus of current research on place recognition is visual localization for autonomous driving. In this scenario, as cameras will be operating continuously, it is realistic to expect videos as an input to visual localization algorithms, as opposed to the single-image querying approach used in other place recognition works. In this paper, we show that exploiting temporal continuity in the testing sequence significantly improves visual localization, both qualitatively and quantitatively. Although intuitive, this idea has not been fully explored in recent works. Our main contribution is a novel Monte Carlo-based visual localization technique that can efficiently reason over the image sequence. Also, we propose an image retrieval pipeline which relies on local features and an encoding technique to represent an image as a single vector. The experimental results show that our proposed method achieves better results than state-of-the-art approaches for the task of visual localization under significant appearance change. Our synthetic dataset and source code are made publicly available.
Tasks 3D Pose Estimation, Camera Localization, Visual Localization, Visual Odometry
Published 2018-11-20
URL https://arxiv.org/abs/1811.08063v3
PDF https://arxiv.org/pdf/1811.08063v3.pdf
PWC https://paperswithcode.com/paper/visual-localization-under-appearance-change-a
Repo https://github.com/Adelaide-AI-Group/MCVL
Framework none

Top-K Influential Nodes in Social Networks: A Game Perspective

Title Top-K Influential Nodes in Social Networks: A Game Perspective
Authors Yu Zhang, Yan Zhang
Abstract Influence maximization, a fundamental problem in viral marketing, aims to find the top-$K$ seed nodes that maximize influence spread under a given spreading model. In this paper, we study influence maximization from a game perspective. We propose a Coordination Game model, in which individuals make their decisions based on the benefit of coordinating with their network neighbors, to study information propagation. Our model generalizes some existing models, such as the Majority Vote model and the Linear Threshold model. Under the generalized model, we study the hardness of influence maximization and the approximation guarantee of the greedy algorithm. We also combine several strategies to accelerate the algorithm. Experimental results show that after the acceleration, our algorithm significantly outperforms other heuristics, and it is three orders of magnitude faster than the original greedy method.
Tasks Community Detection
Published 2018-10-14
URL https://arxiv.org/abs/1810.05959v9
PDF https://arxiv.org/pdf/1810.05959v9.pdf
PWC https://paperswithcode.com/paper/balancing-authority-and-diversity-in
Repo https://github.com/yuzhimanhua/Influence-Maximization
Framework none
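
The plain greedy method mentioned in the abstract can be sketched as below. This is an illustration only: the spread function is a deterministic threshold process chosen as a stand-in for the paper's Coordination Game model, none of the paper's acceleration strategies are included, and the function names and toy graph are hypothetical.

```python
def spread(graph, seeds, threshold=0.5):
    """Nodes active at the fixed point of a deterministic threshold process."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, neighbours in graph.items():
            if node in active or not neighbours:
                continue
            frac = sum(n in active for n in neighbours) / len(neighbours)
            if frac >= threshold:   # enough neighbours are active: adopt
                active.add(node)
                changed = True
    return active

def greedy_seeds(graph, k, threshold=0.5):
    """Pick k seeds, each time adding the node with the largest total spread."""
    seeds = []
    for _ in range(k):
        best, best_size = None, -1
        for cand in sorted(graph):   # sorted for deterministic tie-breaking
            if cand in seeds:
                continue
            size = len(spread(graph, seeds + [cand], threshold))
            if size > best_size:
                best, best_size = cand, size
        if best is None:
            break
        seeds.append(best)
    return seeds

# Toy undirected graph as adjacency lists: a star around "a" plus an isolated node "e".
graph = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"], "e": []}
top2 = greedy_seeds(graph, 2)
```

Each greedy round re-evaluates the spread for every remaining candidate, which is the cost that acceleration strategies aim to reduce.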

Learning to Make Predictions on Graphs with Autoencoders

Title Learning to Make Predictions on Graphs with Autoencoders
Authors Phi Vu Tran
Abstract We examine two fundamental tasks associated with graph representation learning: link prediction and semi-supervised node classification. We present a novel autoencoder architecture capable of learning a joint representation of both local graph structure and available node features for the multi-task learning of link prediction and node classification. Our autoencoder architecture is efficiently trained end-to-end in a single learning stage to simultaneously perform link prediction and node classification, whereas previous related methods require multiple training steps that are difficult to optimize. We provide a comprehensive empirical evaluation of our models on nine benchmark graph-structured datasets and demonstrate significant improvement over related methods for graph representation learning. Reference code and data are available at https://github.com/vuptran/graph-representation-learning
Tasks Graph Representation Learning, Link Prediction, Multi-Task Learning, Node Classification, Representation Learning
Published 2018-02-23
URL http://arxiv.org/abs/1802.08352v2
PDF http://arxiv.org/pdf/1802.08352v2.pdf
PWC https://paperswithcode.com/paper/learning-to-make-predictions-on-graphs-with
Repo https://github.com/vuptran/graph-representation-learning
Framework tf