October 20, 2019

Paper Group AWR 221

On the Solvability of Viewing Graphs

Title On the Solvability of Viewing Graphs
Authors Matthew Trager, Brian Osserman, Jean Ponce
Abstract A set of fundamental matrices relating pairs of cameras in some configuration can be represented as edges of a “viewing graph”. Whether or not these fundamental matrices are generically sufficient to recover the global camera configuration depends on the structure of this graph. We study characterizations of “solvable” viewing graphs and present several new results that can be applied to determine which pairs of views may be used to recover all camera parameters. We also discuss strategies for verifying the solvability of a graph computationally.
Tasks
Published 2018-08-08
URL http://arxiv.org/abs/1808.02856v2
PDF http://arxiv.org/pdf/1808.02856v2.pdf
PWC https://paperswithcode.com/paper/on-the-solvability-of-viewing-graphs
Repo https://github.com/mtrager/viewing-graphs
Framework none

f-VAEs: Improve VAEs with Conditional Flows

Title f-VAEs: Improve VAEs with Conditional Flows
Authors Jianlin Su, Guang Wu
Abstract In this paper, we integrate VAEs and flow-based generative models successfully and get f-VAEs. Compared with VAEs, f-VAEs generate more vivid images, solving the blurred-image problem of VAEs. Compared with flow-based models such as Glow, f-VAEs are more lightweight and converge faster, achieving the same performance with a smaller architecture.
Tasks
Published 2018-09-16
URL http://arxiv.org/abs/1809.05861v1
PDF http://arxiv.org/pdf/1809.05861v1.pdf
PWC https://paperswithcode.com/paper/f-vaes-improve-vaes-with-conditional-flows
Repo https://github.com/bojone/flow
Framework tf

Egocentric Spatial Memory

Title Egocentric Spatial Memory
Authors Mengmi Zhang, Keng Teck Ma, Shih-Cheng Yen, Joo Hwee Lim, Qi Zhao, Jiashi Feng
Abstract Egocentric spatial memory (ESM) defines a memory system with encoding, storing, recognizing and recalling the spatial information about the environment from an egocentric perspective. We introduce an integrated deep neural network architecture for modeling ESM. It learns to estimate the occupancy state of the world and progressively construct top-down 2D global maps from egocentric views in a spatially extended environment. During the exploration, our proposed ESM model updates its belief of the global map based on local observations using a recurrent neural network. It also augments the local mapping with a novel external memory to encode and store latent representations of the visited places over long-term exploration in large environments, which enables agents to perform place recognition and hence loop closure. Our proposed ESM network contributes in the following aspects: (1) without feature engineering, our model predicts free space based on egocentric views efficiently in an end-to-end manner; (2) unlike other deep learning-based mapping systems, ESMN deals with continuous actions and states, which is vitally important for robotic control in real applications. In the experiments, we demonstrate its accurate and robust global mapping capacities in 3D virtual mazes and realistic indoor environments by comparing with several competitive baselines.
Tasks Feature Engineering
Published 2018-07-31
URL http://arxiv.org/abs/1807.11929v1
PDF http://arxiv.org/pdf/1807.11929v1.pdf
PWC https://paperswithcode.com/paper/egocentric-spatial-memory
Repo https://github.com/Mengmi/Egocentric-Spatial-Memory
Framework pytorch

DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama

Title DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama
Authors Shang-Ta Yang, Fu-En Wang, Chi-Han Peng, Peter Wonka, Min Sun, Hung-Kuo Chu
Abstract We present a deep learning framework, called DuLa-Net, to predict Manhattan-world 3D room layouts from a single RGB panorama. To achieve better prediction accuracy, our method leverages two projections of the panorama at once, namely the equirectangular panorama-view and the perspective ceiling-view, that each contains different clues about the room layouts. Our network architecture consists of two encoder-decoder branches for analyzing each of the two views. In addition, a novel feature fusion structure is proposed to connect the two branches, which are then jointly trained to predict the 2D floor plans and layout heights. To learn more complex room layouts, we introduce the Realtor360 dataset that contains panoramas of Manhattan-world room layouts with different numbers of corners. Experimental results show that our work outperforms recent state-of-the-art in prediction accuracy and performance, especially in the rooms with non-cuboid layouts.
Tasks 3D Room Layouts From A Single Rgb Panorama
Published 2018-11-29
URL http://arxiv.org/abs/1811.11977v2
PDF http://arxiv.org/pdf/1811.11977v2.pdf
PWC https://paperswithcode.com/paper/dula-net-a-dual-projection-network-for
Repo https://github.com/SunDaDenny/DuLa-Net
Framework pytorch

Forging new worlds: high-resolution synthetic galaxies with chained generative adversarial networks

Title Forging new worlds: high-resolution synthetic galaxies with chained generative adversarial networks
Authors Levi Fussell, Ben Moews
Abstract Astronomy of the 21st century increasingly finds itself with extreme quantities of data. This growth in data is ripe for modern technologies such as deep image processing, which has the potential to allow astronomers to automatically identify, classify, segment and deblend various astronomical objects. In this paper, we explore the use of chained generative adversarial networks (GANs), a class of generative models that learn mappings from latent spaces to data distributions by modelling the joint distribution of the data, to produce physically realistic galaxy images as one use case of such models. In cosmology, such datasets can aid in the calibration of shape measurements for weak lensing by augmenting data with synthetic images. By measuring the distributions of multiple physical properties, we show that images generated with our approach closely follow the distributions of real galaxies, further establishing state-of-the-art GAN architectures as a valuable tool for modern-day astronomy.
Tasks Calibration
Published 2018-11-07
URL http://arxiv.org/abs/1811.03081v3
PDF http://arxiv.org/pdf/1811.03081v3.pdf
PWC https://paperswithcode.com/paper/forging-new-worlds-high-resolution-synthetic
Repo https://github.com/levifussell/forging_new_worlds
Framework pytorch

Outlier Aware Network Embedding for Attributed Networks

Title Outlier Aware Network Embedding for Attributed Networks
Authors Sambaran Bandyopadhyay, Lokesh N, M. N. Murty
Abstract Attributed network embedding has received much interest from the research community as most of the networks come with some content in each node, which is also known as node attributes. Existing attributed network approaches work well when the network is consistent in structure and attributes, and nodes behave as expected. But real world networks often have anomalous nodes. Typically these outliers, being relatively unexplainable, affect the embeddings of other nodes in the network. Thus all the downstream network mining tasks fail miserably in the presence of such outliers. Hence an integrated approach to detect anomalies and reduce their overall effect on the network embedding is required. Towards this end, we propose an unsupervised outlier aware network embedding algorithm (ONE) for attributed networks, which minimizes the effect of the outlier nodes, and hence generates robust network embeddings. We align and jointly optimize the loss functions coming from structure and attributes of the network. To the best of our knowledge, this is the first generic network embedding approach which incorporates the effect of outliers for an attributed network without any supervision. We experimented on publicly available real networks and manually planted different types of outliers to check the performance of the proposed algorithm. Results demonstrate the superiority of our approach to detect the network outliers compared to the state-of-the-art approaches. We also consider different downstream machine learning applications on networks to show the efficiency of ONE as a generic network embedding technique. The source code is made available at https://github.com/sambaranban/ONE.
Tasks Network Embedding
Published 2018-11-19
URL http://arxiv.org/abs/1811.07609v1
PDF http://arxiv.org/pdf/1811.07609v1.pdf
PWC https://paperswithcode.com/paper/outlier-aware-network-embedding-for
Repo https://github.com/sambaranban/ONE
Framework none

Deep Learning Based Speed Estimation for Constraining Strapdown Inertial Navigation on Smartphones

Title Deep Learning Based Speed Estimation for Constraining Strapdown Inertial Navigation on Smartphones
Authors Santiago Cortés, Arno Solin, Juho Kannala
Abstract Strapdown inertial navigation systems are sensitive to the quality of the data provided by the accelerometer and gyroscope. Low-grade IMUs in handheld smart-devices pose a problem for inertial odometry on these devices. We propose a scheme for constraining the inertial odometry problem by complementing non-linear state estimation by a CNN-based deep-learning model for inferring the momentary speed based on a window of IMU samples. We show the feasibility of the model using a wide range of data from an iPhone, and present proof-of-concept results for how the model can be combined with an inertial navigation system for three-dimensional inertial navigation.
Tasks
Published 2018-08-10
URL http://arxiv.org/abs/1808.03485v1
PDF http://arxiv.org/pdf/1808.03485v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-speed-estimation-for
Repo https://github.com/AaltoVision/deep-speed-constrained-ins
Framework pytorch

Ranking Sentences for Extractive Summarization with Reinforcement Learning

Title Ranking Sentences for Extractive Summarization with Reinforcement Learning
Authors Shashi Narayan, Shay B. Cohen, Mirella Lapata
Abstract Single document summarization is the task of producing a shorter version of a document while preserving its principal information content. In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective. We use our algorithm to train a neural summarization model on the CNN and DailyMail datasets and demonstrate experimentally that it outperforms state-of-the-art extractive and abstractive systems when evaluated automatically and by humans.
Tasks Document Summarization
Published 2018-02-23
URL http://arxiv.org/abs/1802.08636v2
PDF http://arxiv.org/pdf/1802.08636v2.pdf
PWC https://paperswithcode.com/paper/ranking-sentences-for-extractive
Repo https://github.com/shashiongithub/Refresh
Framework tf
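
The abstract above frames extractive summarization as ranking sentences and scoring the resulting extract with ROUGE. A minimal sketch of that idea follows, assuming a simplified unigram-overlap ROUGE-1 F1 as the reward. The function names, the whitespace tokenization, and the top-k selection rule are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def extract_summary(sentences, scores, k):
    """Pick the k highest-scoring sentences and keep them in document order."""
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))


# Hypothetical document, ranking scores, and gold summary.
sentences = ["the cat sat", "dogs bark loudly", "on the mat"]
scores = [0.9, 0.1, 0.8]
summary = extract_summary(sentences, scores, k=2)
reward = rouge1_f1(summary, "the cat sat on the mat")
```

In a reinforcement learning setup, a reward like this would be computed on a sampled extract and used to weight the gradient of the ranking model; that training loop is omitted here.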

ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation

Title ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation
Authors Jiankai Sun, Bortik Bandyopadhyay, Armin Bashizade, Jiongqian Liang, P. Sadayappan, Srinivasan Parthasarathy
Abstract Directed graphs have been widely used in Community Question Answering services (CQAs) to model asymmetric relationships among different types of nodes in CQA graphs, e.g., question, answer, user. Asymmetric transitivity is an essential property of directed graphs, since it can play an important role in downstream graph inference and analysis. Question difficulty and user expertise follow the characteristic of asymmetric transitivity. Maintaining such properties, while reducing the graph to a lower dimensional vector embedding space, has been the focus of much recent research. In this paper, we tackle the challenge of directed graph embedding with asymmetric transitivity preservation and then leverage the proposed embedding method to solve a fundamental task in CQAs: how to appropriately route and assign newly posted questions to users with suitable expertise and interest in CQAs. The technique incorporates graph hierarchy and reachability information naturally by relying on a non-linear transformation that operates on the core reachability and implicit hierarchy within such graphs. Subsequently, the methodology leverages a factorization-based approach to generate two embedding vectors for each node within the graph, to capture the asymmetric transitivity. Extensive experiments show that our framework consistently and significantly outperforms the state-of-the-art baselines on two diverse real-world tasks: link prediction, and question difficulty estimation and expert finding in online forums like Stack Exchange. In particular, our framework can support inductive embedding learning for newly posted questions (unseen nodes during training), and therefore can properly route and assign these kinds of questions to experts in CQAs.
Tasks Community Question Answering, Graph Embedding, Link Prediction, Question Answering
Published 2018-11-02
URL http://arxiv.org/abs/1811.00839v2
PDF http://arxiv.org/pdf/1811.00839v2.pdf
PWC https://paperswithcode.com/paper/atp-directed-graph-embedding-with-asymmetric
Repo https://github.com/zhenv5/atp
Framework none

Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability

Title Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability
Authors Akifumi Okuno, Geewook Kim, Hidetoshi Shimodaira
Abstract We propose shifted inner-product similarity (SIPS), which is a novel yet very simple extension of the ordinary inner-product similarity (IPS) for neural-network based graph embedding (GE). In contrast to IPS, which is limited to approximating positive-definite (PD) similarities, SIPS goes beyond this limitation by introducing bias terms in IPS; we theoretically prove that SIPS is capable of approximating not only PD but also conditionally PD (CPD) similarities, with many examples such as cosine similarity, negative Poincare distance and negative Wasserstein distance. Since SIPS with sufficiently large neural networks learns a variety of similarities, SIPS alleviates the need for configuring the similarity function of GE. The approximation error rate is also evaluated, and experiments on two real-world datasets demonstrate that graph embedding using SIPS indeed outperforms existing methods.
Tasks Graph Embedding
Published 2018-10-04
URL http://arxiv.org/abs/1810.03463v2
PDF http://arxiv.org/pdf/1810.03463v2.pdf
PWC https://paperswithcode.com/paper/graph-embedding-with-shifted-inner-product
Repo https://github.com/kdrl/SIPS
Framework pytorch
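
To make the "bias terms" in the abstract above concrete, here is a minimal sketch that assumes the biases enter additively, one scalar per node, so the similarity is an inner product of feature vectors plus two scalar offsets. The feature map and bias used in the example are hand-picked for illustration rather than learned by a neural network, and all function names are hypothetical.

```python
import math


def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def ips(fx, fy):
    """Inner-product similarity between two feature vectors."""
    return dot(fx, fy)


def sips(fx, fy, ux, uy):
    """Shifted inner-product similarity: inner product plus per-node biases
    (additive form assumed for this sketch)."""
    return dot(fx, fy) + ux + uy


# Worked example: the negative squared Euclidean distance
#   -||x - y||^2 = 2<x, y> - ||x||^2 - ||y||^2
# has exactly this shifted form with f(x) = sqrt(2) * x and u(x) = -||x||^2.
def f(x):
    return [math.sqrt(2.0) * v for v in x]


def u(x):
    return -dot(x, x)


x, y = [1.0, 2.0], [3.0, 0.0]
shifted = sips(f(x), f(y), u(x), u(y))
neg_sq_dist = -sum((a - b) ** 2 for a, b in zip(x, y))
```

Here `shifted` and `neg_sq_dist` agree up to floating-point rounding. A plain inner product of a vector with itself can never be negative, which is why the extra bias terms are needed to express similarities like this one.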

Multi-Task Learning for Left Atrial Segmentation on GE-MRI

Title Multi-Task Learning for Left Atrial Segmentation on GE-MRI
Authors Chen Chen, Wenjia Bai, Daniel Rueckert
Abstract Segmentation of the left atrium (LA) is crucial for assessing its anatomy in both pre-operative atrial fibrillation (AF) ablation planning and post-operative follow-up studies. In this paper, we present a fully automated framework for left atrial segmentation in gadolinium-enhanced magnetic resonance images (GE-MRI) based on deep learning. We propose a fully convolutional neural network and explore the benefits of multi-task learning for performing both atrial segmentation and pre/post ablation classification. Our results show that, by sharing features between related tasks, the network can gain additional anatomical information and achieve more accurate atrial segmentation, leading to a mean Dice score of 0.901 on a test set of 20 3D MRI images. Code of our proposed algorithm is available at https://github.com/cherise215/atria_segmentation_2018/.
Tasks Multi-Task Learning
Published 2018-10-31
URL http://arxiv.org/abs/1810.13205v1
PDF http://arxiv.org/pdf/1810.13205v1.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-for-left-atrial
Repo https://github.com/cherise215/atria_segmentation_2018
Framework pytorch
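
The abstract above reports segmentation quality as a mean Dice score. For readers unfamiliar with the metric, the following minimal sketch computes the Dice coefficient for a pair of flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not taken from the paper's code.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 4-voxel masks: one voxel overlaps, prediction has one extra.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A mean Dice score over a test set would then be the average of this value across all predicted and reference volumes.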

E-swish: Adjusting Activations to Different Network Depths

Title E-swish: Adjusting Activations to Different Network Depths
Authors Eric Alcaide
Abstract Activation functions have a notable impact on neural networks, in both training and testing the models against the desired problem. Currently, the most used activation function is the Rectified Linear Unit (ReLU). This paper introduces a new activation function that generalizes the closely related activation $\mathrm{Swish}(x) = x \cdot \mathrm{sigmoid}(x)$ (Ramachandran et al., 2017). We call the new activation $\text{E-swish}(x) = \beta x \cdot \mathrm{sigmoid}(x)$. We show that E-swish outperforms many other well-known activations including both ReLU and Swish. For example, using E-swish provided 1.5% and 4.6% accuracy improvements on Cifar10 and Cifar100 respectively for the WRN 10-2 when compared to ReLU and 0.35% and 0.6% respectively when compared to Swish. The code to reproduce all our experiments can be found at https://github.com/EricAlcaide/E-swish
Tasks
Published 2018-01-22
URL http://arxiv.org/abs/1801.07145v1
PDF http://arxiv.org/pdf/1801.07145v1.pdf
PWC https://paperswithcode.com/paper/e-swish-adjusting-activations-to-different
Repo https://github.com/EricAlcaide/E-swish
Framework none
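
The two activation formulas in the abstract above translate directly into code. The sketch below is a scalar, framework-free illustration; the default value `beta=1.5` is an arbitrary placeholder chosen for this example and is not a recommendation from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x: float) -> float:
    """Swish(x) = x * sigmoid(x)."""
    return x * sigmoid(x)


def e_swish(x: float, beta: float = 1.5) -> float:
    """E-swish(x) = beta * x * sigmoid(x); beta=1.5 is a placeholder default."""
    return beta * x * sigmoid(x)


# With beta = 1, E-swish reduces to Swish.
same_as_swish = e_swish(2.0, beta=1.0) == swish(2.0)
```

In a deep learning framework the same expression would be applied elementwise to a tensor, with `beta` fixed as a hyperparameter.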

Multi-Task Neural Models for Translating Between Styles Within and Across Languages

Title Multi-Task Neural Models for Translating Between Styles Within and Across Languages
Authors Xing Niu, Sudha Rao, Marine Carpuat
Abstract Generating natural language requires conveying content in an appropriate style. We explore two related tasks on generating text of varying formality: monolingual formality transfer and formality-sensitive machine translation. We propose to solve these tasks jointly using multi-task learning, and show that our models achieve state-of-the-art performance for formality transfer and are able to perform formality-sensitive translation without being explicitly trained on style-annotated translation examples.
Tasks Machine Translation, Multi-Task Learning
Published 2018-06-12
URL http://arxiv.org/abs/1806.04357v1
PDF http://arxiv.org/pdf/1806.04357v1.pdf
PWC https://paperswithcode.com/paper/multi-task-neural-models-for-translating
Repo https://github.com/xingniu/multitask-ft-fsmt
Framework mxnet

Gated Hierarchical Attention for Image Captioning

Title Gated Hierarchical Attention for Image Captioning
Authors Qingzhong Wang, Antoni B. Chan
Abstract Attention modules connecting encoders and decoders have been widely applied in the fields of object recognition, image captioning, visual question answering and neural machine translation, and have significantly improved performance. In this paper, we propose a bottom-up gated hierarchical attention (GHA) mechanism for image captioning. Our proposed model employs a CNN as the decoder, which is able to learn different concepts at different layers, and different concepts correspond to different areas of an image. Therefore, we develop the GHA, in which low-level concepts are merged into high-level concepts and, simultaneously, low-level attended features pass to the top to make predictions. Our GHA significantly improves the performance of the model that only applies one-level attention; for example, the CIDEr score increases from 0.923 to 0.999, which is comparable to the state-of-the-art models that employ attributes boosting and reinforcement learning (RL). We also conduct extensive experiments to analyze the CNN decoder and our proposed GHA, and we find that deeper decoders cannot obtain better performance, and that when the convolutional decoder becomes deeper the model is likely to collapse during training.
Tasks Image Captioning
Published 2018-10-30
URL http://arxiv.org/abs/1810.12535v2
PDF http://arxiv.org/pdf/1810.12535v2.pdf
PWC https://paperswithcode.com/paper/gated-hierarchical-attention-for-image
Repo https://github.com/qingzwang/GHA-ImageCaptioning
Framework pytorch

Unsupervised Adversarial Depth Estimation using Cycled Generative Networks

Title Unsupervised Adversarial Depth Estimation using Cycled Generative Networks
Authors Andrea Pilzer, Dan Xu, Mihai Marian Puscas, Elisa Ricci, Nicu Sebe
Abstract While recent deep monocular depth estimation approaches based on supervised regression have achieved remarkable performance, costly ground truth annotations are required during training. To cope with this issue, in this paper we present a novel unsupervised deep learning approach for predicting depth maps and show that the depth estimation task can be effectively tackled within an adversarial learning framework. Specifically, we propose a deep generative network that learns to predict the correspondence field, i.e., the disparity map between two image views in a calibrated stereo camera setting. The proposed architecture consists of two generative sub-networks jointly trained with adversarial learning for reconstructing the disparity map and organized in a cycle so as to provide mutual constraints and supervision to each other. Extensive experiments on the publicly available datasets KITTI and Cityscapes demonstrate the effectiveness of the proposed model and results competitive with state-of-the-art methods. The code and trained model are available on https://github.com/andrea-pilzer/unsup-stereo-depthGAN.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2018-07-28
URL http://arxiv.org/abs/1807.10915v1
PDF http://arxiv.org/pdf/1807.10915v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-adversarial-depth-estimation
Repo https://github.com/rickgroen/depthgan
Framework pytorch
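
The abstract above predicts a disparity map in a calibrated stereo setting. For a rectified stereo pair, disparity converts to depth through the standard relation depth = focal length × baseline / disparity. The sketch below illustrates only that conversion, not the paper's network. The function name, the `eps` threshold, and the camera values in the example are hypothetical.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m, eps=1e-6):
    """Convert a 2D disparity map (pixels) to depth (metres).

    Assumes a rectified stereo pair. Pixels with disparity at or below
    eps are treated as infinitely far away.
    """
    return [
        [focal_px * baseline_m / d if d > eps else float("inf") for d in row]
        for row in disparity_px
    ]


# Hypothetical camera: 700 px focal length, 0.5 m baseline.
depth = disparity_to_depth([[10.0, 0.0]], focal_px=700.0, baseline_m=0.5)
```

Because depth is inversely proportional to disparity, small disparity errors on distant objects translate into large depth errors, which is one reason such methods are usually evaluated up to a maximum depth.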