October 20, 2019

Paper Group AWR 221

On the Solvability of Viewing Graphs. f-VAEs: Improve VAEs with Conditional Flows. Egocentric Spatial Memory. DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama. Forging new worlds: high-resolution synthetic galaxies with chained generative adversarial networks. Outlier Aware Network Embedding for Attributed …

On the Solvability of Viewing Graphs


Title	On the Solvability of Viewing Graphs
Authors	Matthew Trager, Brian Osserman, Jean Ponce
Abstract	A set of fundamental matrices relating pairs of cameras in some configuration can be represented as edges of a “viewing graph”. Whether or not these fundamental matrices are generically sufficient to recover the global camera configuration depends on the structure of this graph. We study characterizations of “solvable” viewing graphs and present several new results that can be applied to determine which pairs of views may be used to recover all camera parameters. We also discuss strategies for verifying the solvability of a graph computationally.
Tasks
Published	2018-08-08
URL	http://arxiv.org/abs/1808.02856v2
PDF	http://arxiv.org/pdf/1808.02856v2.pdf
PWC	https://paperswithcode.com/paper/on-the-solvability-of-viewing-graphs
Repo	https://github.com/mtrager/viewing-graphs
Framework	none
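
A cheap necessary check on a candidate viewing graph comes from counting degrees of freedom. The sketch below assumes the standard counts from projective geometry (11 per projective camera, 15 for the global projective ambiguity, 7 per fundamental matrix); these numbers are not stated in the abstract, and the function name is hypothetical. Passing the check does not imply that a graph is solvable.

```python
import math


def min_edges_for_solvability(num_cameras: int) -> int:
    """Lower bound on the number of edges (fundamental matrices) required.

    Assumed counts: 11 degrees of freedom per projective camera, 15 for the
    global projective ambiguity, and 7 constraints per fundamental matrix.
    """
    if num_cameras < 2:
        raise ValueError("need at least two cameras")
    unknowns = 11 * num_cameras - 15
    return math.ceil(unknowns / 7)


for n in (2, 3, 4, 10):
    print(n, min_edges_for_solvability(n))
```

A graph with fewer edges than this bound cannot determine the camera configuration, so it can be rejected before running any heavier solvability test.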

f-VAEs: Improve VAEs with Conditional Flows


Title	f-VAEs: Improve VAEs with Conditional Flows
Authors	Jianlin Su, Guang Wu
Abstract	In this paper, we integrate VAEs and flow-based generative models and obtain f-VAEs. Compared with VAEs, f-VAEs generate more vivid images, solving the blurred-image problem of VAEs. Compared with flow-based models such as Glow, f-VAEs are more lightweight and converge faster, achieving the same performance with a smaller architecture.
Tasks
Published	2018-09-16
URL	http://arxiv.org/abs/1809.05861v1
PDF	http://arxiv.org/pdf/1809.05861v1.pdf
PWC	https://paperswithcode.com/paper/f-vaes-improve-vaes-with-conditional-flows
Repo	https://github.com/bojone/flow
Framework	tf

Egocentric Spatial Memory


Title	Egocentric Spatial Memory
Authors	Mengmi Zhang, Keng Teck Ma, Shih-Cheng Yen, Joo Hwee Lim, Qi Zhao, Jiashi Feng
Abstract	Egocentric spatial memory (ESM) defines a memory system with encoding, storing, recognizing and recalling the spatial information about the environment from an egocentric perspective. We introduce an integrated deep neural network architecture for modeling ESM. It learns to estimate the occupancy state of the world and progressively construct top-down 2D global maps from egocentric views in a spatially extended environment. During exploration, our proposed ESM model updates its belief of the global map based on local observations using a recurrent neural network. It also augments the local mapping with a novel external memory to encode and store latent representations of the visited places over long-term exploration in large environments, which enables agents to perform place recognition and hence loop closure. Our proposed ESM network (ESMN) contributes in the following aspects: (1) without feature engineering, our model predicts free space based on egocentric views efficiently in an end-to-end manner; (2) unlike other deep learning-based mapping systems, ESMN deals with continuous actions and states, which is vitally important for robotic control in real applications. In the experiments, we demonstrate its accurate and robust global mapping capacities in 3D virtual mazes and realistic indoor environments by comparing with several competitive baselines.
Tasks	Feature Engineering
Published	2018-07-31
URL	http://arxiv.org/abs/1807.11929v1
PDF	http://arxiv.org/pdf/1807.11929v1.pdf
PWC	https://paperswithcode.com/paper/egocentric-spatial-memory
Repo	https://github.com/Mengmi/Egocentric-Spatial-Memory
Framework	pytorch

DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama


Title	DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama
Authors	Shang-Ta Yang, Fu-En Wang, Chi-Han Peng, Peter Wonka, Min Sun, Hung-Kuo Chu
Abstract	We present a deep learning framework, called DuLa-Net, to predict Manhattan-world 3D room layouts from a single RGB panorama. To achieve better prediction accuracy, our method leverages two projections of the panorama at once, namely the equirectangular panorama-view and the perspective ceiling-view, that each contains different clues about the room layouts. Our network architecture consists of two encoder-decoder branches for analyzing each of the two views. In addition, a novel feature fusion structure is proposed to connect the two branches, which are then jointly trained to predict the 2D floor plans and layout heights. To learn more complex room layouts, we introduce the Realtor360 dataset that contains panoramas of Manhattan-world room layouts with different numbers of corners. Experimental results show that our work outperforms recent state-of-the-art in prediction accuracy and performance, especially in the rooms with non-cuboid layouts.
Tasks	3D Room Layouts From A Single RGB Panorama
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11977v2
PDF	http://arxiv.org/pdf/1811.11977v2.pdf
PWC	https://paperswithcode.com/paper/dula-net-a-dual-projection-network-for
Repo	https://github.com/SunDaDenny/DuLa-Net
Framework	pytorch

Forging new worlds: high-resolution synthetic galaxies with chained generative adversarial networks


Title	Forging new worlds: high-resolution synthetic galaxies with chained generative adversarial networks
Authors	Levi Fussell, Ben Moews
Abstract	Astronomy of the 21st century increasingly finds itself with extreme quantities of data. This growth in data is ripe for modern technologies such as deep image processing, which has the potential to allow astronomers to automatically identify, classify, segment and deblend various astronomical objects. In this paper, we explore the use of chained generative adversarial networks (GANs), a class of generative models that learn mappings from latent spaces to data distributions by modelling the joint distribution of the data, to produce physically realistic galaxy images as one use case of such models. In cosmology, such datasets can aid in the calibration of shape measurements for weak lensing by augmenting data with synthetic images. By measuring the distributions of multiple physical properties, we show that images generated with our approach closely follow the distributions of real galaxies, further establishing state-of-the-art GAN architectures as a valuable tool for modern-day astronomy.
Tasks	Calibration
Published	2018-11-07
URL	http://arxiv.org/abs/1811.03081v3
PDF	http://arxiv.org/pdf/1811.03081v3.pdf
PWC	https://paperswithcode.com/paper/forging-new-worlds-high-resolution-synthetic
Repo	https://github.com/levifussell/forging_new_worlds
Framework	pytorch

Outlier Aware Network Embedding for Attributed Networks


Title	Outlier Aware Network Embedding for Attributed Networks
Authors	Sambaran Bandyopadhyay, Lokesh N, M. N. Murty
Abstract	Attributed network embedding has received much interest from the research community as most of the networks come with some content in each node, which is also known as node attributes. Existing attributed network approaches work well when the network is consistent in structure and attributes, and nodes behave as expected. But real world networks often have anomalous nodes. Typically these outliers, being relatively unexplainable, affect the embeddings of other nodes in the network. Thus all the downstream network mining tasks fail miserably in the presence of such outliers. Hence an integrated approach to detect anomalies and reduce their overall effect on the network embedding is required. Towards this end, we propose an unsupervised outlier aware network embedding algorithm (ONE) for attributed networks, which minimizes the effect of the outlier nodes, and hence generates robust network embeddings. We align and jointly optimize the loss functions coming from structure and attributes of the network. To the best of our knowledge, this is the first generic network embedding approach which incorporates the effect of outliers for an attributed network without any supervision. We experimented on publicly available real networks and manually planted different types of outliers to check the performance of the proposed algorithm. Results demonstrate the superiority of our approach to detect the network outliers compared to the state-of-the-art approaches. We also consider different downstream machine learning applications on networks to show the efficiency of ONE as a generic network embedding technique. The source code is made available at https://github.com/sambaranban/ONE.
Tasks	Network Embedding
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07609v1
PDF	http://arxiv.org/pdf/1811.07609v1.pdf
PWC	https://paperswithcode.com/paper/outlier-aware-network-embedding-for
Repo	https://github.com/sambaranban/ONE
Framework	none

Deep Learning Based Speed Estimation for Constraining Strapdown Inertial Navigation on Smartphones


Title	Deep Learning Based Speed Estimation for Constraining Strapdown Inertial Navigation on Smartphones
Authors	Santiago Cortés, Arno Solin, Juho Kannala
Abstract	Strapdown inertial navigation systems are sensitive to the quality of the data provided by the accelerometer and gyroscope. Low-grade IMUs in handheld smart devices pose a problem for inertial odometry on these devices. We propose a scheme for constraining the inertial odometry problem by complementing non-linear state estimation with a CNN-based deep-learning model that infers the momentary speed from a window of IMU samples. We show the feasibility of the model using a wide range of data from an iPhone, and present proof-of-concept results for how the model can be combined with an inertial navigation system for three-dimensional inertial navigation.
Tasks
Published	2018-08-10
URL	http://arxiv.org/abs/1808.03485v1
PDF	http://arxiv.org/pdf/1808.03485v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-speed-estimation-for
Repo	https://github.com/AaltoVision/deep-speed-constrained-ins
Framework	pytorch

Ranking Sentences for Extractive Summarization with Reinforcement Learning


Title	Ranking Sentences for Extractive Summarization with Reinforcement Learning
Authors	Shashi Narayan, Shay B. Cohen, Mirella Lapata
Abstract	Single document summarization is the task of producing a shorter version of a document while preserving its principal information content. In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective. We use our algorithm to train a neural summarization model on the CNN and DailyMail datasets and demonstrate experimentally that it outperforms state-of-the-art extractive and abstractive systems when evaluated automatically and by humans.
Tasks	Document Summarization
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08636v2
PDF	http://arxiv.org/pdf/1802.08636v2.pdf
PWC	https://paperswithcode.com/paper/ranking-sentences-for-extractive
Repo	https://github.com/shashiongithub/Refresh
Framework	tf
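
To make the idea of using ROUGE as a reinforcement learning reward concrete, here is a minimal sketch that scores a set of selected sentences by unigram recall against a reference summary. This is a simplified stand-in for ROUGE (whitespace tokenization, no stemming), not the authors' implementation, and both function names are hypothetical.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also appear in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Clip each token's overlap at its count in the candidate.
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / sum(ref.values())


def extract_reward(sentences, selected, reference):
    """Reward for an extractive summary built from the selected indices."""
    summary = " ".join(sentences[i] for i in sorted(selected))
    return rouge1_recall(summary, reference)


doc = ["the cat sat", "dogs bark loudly"]
print(extract_reward(doc, {0}, "the cat sat down"))
print(extract_reward(doc, {1}, "the cat sat down"))
```

In a policy-gradient setup, a scalar like this would weight the log-probability of the sampled sentence selection, so that selections with higher overlap are reinforced.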

ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation


Title	ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation
Authors	Jiankai Sun, Bortik Bandyopadhyay, Armin Bashizade, Jiongqian Liang, P. Sadayappan, Srinivasan Parthasarathy
Abstract	Directed graphs have been widely used in Community Question Answering services (CQAs) to model asymmetric relationships among different types of nodes in CQA graphs, e.g., question, answer, user. Asymmetric transitivity is an essential property of directed graphs, since it can play an important role in downstream graph inference and analysis. Question difficulty and user expertise follow the characteristic of asymmetric transitivity. Maintaining such properties, while reducing the graph to a lower dimensional vector embedding space, has been the focus of much recent research. In this paper, we tackle the challenge of directed graph embedding with asymmetric transitivity preservation and then leverage the proposed embedding method to solve a fundamental task in CQAs: how to appropriately route and assign newly posted questions to users with suitable expertise and interest. The technique incorporates graph hierarchy and reachability information naturally by relying on a non-linear transformation that operates on the core reachability and implicit hierarchy within such graphs. Subsequently, the methodology leverages a factorization-based approach to generate two embedding vectors for each node within the graph, to capture the asymmetric transitivity. Extensive experiments show that our framework consistently and significantly outperforms the state-of-the-art baselines on two diverse real-world tasks: link prediction, and question difficulty estimation and expert finding in online forums like Stack Exchange. In particular, our framework can support inductive embedding learning for newly posted questions (unseen nodes during training), and therefore can properly route and assign these kinds of questions to experts in CQAs.
Tasks	Community Question Answering, Graph Embedding, Link Prediction, Question Answering
Published	2018-11-02
URL	http://arxiv.org/abs/1811.00839v2
PDF	http://arxiv.org/pdf/1811.00839v2.pdf
PWC	https://paperswithcode.com/paper/atp-directed-graph-embedding-with-asymmetric
Repo	https://github.com/zhenv5/atp
Framework	none
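
The reason for keeping two embedding vectors per node can be shown with a toy example: scoring an edge u -> v with the source vector of u and the target vector of v makes the score direction-dependent. The vectors and names below are hypothetical and hand-picked for illustration; they are not produced by the paper's factorization.

```python
def directed_score(source_vecs, target_vecs, u, v):
    """Score for the directed edge u -> v: dot(source[u], target[v])."""
    return sum(a * b for a, b in zip(source_vecs[u], target_vecs[v]))


# Hypothetical 2-D embeddings for a question node "q" and an answer node "a".
source = {"q": [1.0, 0.0], "a": [0.0, 0.2]}
target = {"q": [0.1, 0.0], "a": [0.9, 0.0]}

print(directed_score(source, target, "q", "a"))  # q -> a
print(directed_score(source, target, "a", "q"))  # a -> q
```

With a single vector per node the dot product would be symmetric, so the two directions could not receive different scores.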

Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability


Title	Graph Embedding with Shifted Inner Product Similarity and Its Improved Approximation Capability
Authors	Akifumi Okuno, Geewook Kim, Hidetoshi Shimodaira
Abstract	We propose shifted inner-product similarity (SIPS), which is a novel yet very simple extension of the ordinary inner-product similarity (IPS) for neural-network based graph embedding (GE). In contrast to IPS, which is limited to approximating positive-definite (PD) similarities, SIPS goes beyond this limitation by introducing bias terms into IPS; we theoretically prove that SIPS is capable of approximating not only PD but also conditionally PD (CPD) similarities, with many examples such as cosine similarity, negative Poincaré distance and negative Wasserstein distance. Since SIPS with sufficiently large neural networks learns a variety of similarities, SIPS alleviates the need for configuring the similarity function of GE. The approximation error rate is also evaluated, and experiments on two real-world datasets demonstrate that graph embedding using SIPS indeed outperforms existing methods.
Tasks	Graph Embedding
Published	2018-10-04
URL	http://arxiv.org/abs/1810.03463v2
PDF	http://arxiv.org/pdf/1810.03463v2.pdf
PWC	https://paperswithcode.com/paper/graph-embedding-with-shifted-inner-product
Repo	https://github.com/kdrl/SIPS
Framework	pytorch
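
A minimal sketch of the difference between IPS and SIPS, assuming the bias terms enter additively as one scalar per node (the abstract only says that bias terms are introduced, so this form is an assumption). In the paper the vectors and biases come from neural networks; here they are plain hypothetical numbers.

```python
def ips(fx, fy):
    """Ordinary inner-product similarity between two feature vectors."""
    return sum(a * b for a, b in zip(fx, fy))


def sips(fx, ux, fy, uy):
    """Shifted inner-product similarity: IPS plus one scalar bias per node.

    Assumed form: <f(x), f(y)> + u(x) + u(y).
    """
    return ips(fx, fy) + ux + uy


f = [1.0, 2.0]
print(ips(f, f))              # self-similarity under IPS is never negative
print(sips(f, -3.0, f, -3.0))  # the biases can push it below zero
```

The self-similarity case illustrates the extra flexibility: an inner product of a vector with itself is never negative, whereas similarities such as negative distances are never positive, and the bias terms make that range reachable.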

Multi-Task Learning for Left Atrial Segmentation on GE-MRI


Title	Multi-Task Learning for Left Atrial Segmentation on GE-MRI
Authors	Chen Chen, Wenjia Bai, Daniel Rueckert
Abstract	Segmentation of the left atrium (LA) is crucial for assessing its anatomy in both pre-operative atrial fibrillation (AF) ablation planning and post-operative follow-up studies. In this paper, we present a fully automated framework for left atrial segmentation in gadolinium-enhanced magnetic resonance images (GE-MRI) based on deep learning. We propose a fully convolutional neural network and explore the benefits of multi-task learning for performing both atrial segmentation and pre/post ablation classification. Our results show that, by sharing features between related tasks, the network can gain additional anatomical information and achieve more accurate atrial segmentation, leading to a mean Dice score of 0.901 on a test set of 20 3D MRI images. Code of our proposed algorithm is available at https://github.com/cherise215/atria_segmentation_2018/.
Tasks	Multi-Task Learning
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13205v1
PDF	http://arxiv.org/pdf/1810.13205v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-learning-for-left-atrial
Repo	https://github.com/cherise215/atria_segmentation_2018
Framework	pytorch
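
For readers unfamiliar with the Dice score reported above, the following sketch computes it for two binary masks given as flat sequences. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
print(dice_score([1, 1, 0, 0], [1, 1, 0, 0]))
```

For a 3D volume, the masks would be flattened first; a mean Dice score is then the average of this value over the test cases.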

E-swish: Adjusting Activations to Different Network Depths


Title	E-swish: Adjusting Activations to Different Network Depths
Authors	Eric Alcaide
Abstract	Activation functions have a notable impact on neural networks, both when training and when testing the models on the desired problem. Currently, the most used activation function is the Rectified Linear Unit (ReLU). This paper introduces a new activation function that generalizes the closely related Swish activation, $\text{Swish}(x) = x \cdot \text{sigmoid}(x)$ (Ramachandran et al., 2017). We call the new activation $\text{E-swish}(x) = \beta x \cdot \text{sigmoid}(x)$. We show that E-swish outperforms many other well-known activations including both ReLU and Swish. For example, using E-swish provided 1.5% and 4.6% accuracy improvements on CIFAR-10 and CIFAR-100 respectively for the WRN 10-2 when compared to ReLU and 0.35% and 0.6% respectively when compared to Swish. The code to reproduce all our experiments can be found at https://github.com/EricAlcaide/E-swish
Tasks
Published	2018-01-22
URL	http://arxiv.org/abs/1801.07145v1
PDF	http://arxiv.org/pdf/1801.07145v1.pdf
PWC	https://paperswithcode.com/paper/e-swish-adjusting-activations-to-different
Repo	https://github.com/EricAlcaide/E-swish
Framework	none
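
The E-swish formula from the abstract is simple enough to write out directly. This is a scalar sketch using only the standard library; the default value of `beta` is an arbitrary choice for the example, not a recommendation from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def e_swish(x: float, beta: float = 1.5) -> float:
    """E-swish(x) = beta * x * sigmoid(x); beta = 1 recovers Swish."""
    return beta * x * sigmoid(x)


for x in (-2.0, 0.0, 2.0):
    print(x, e_swish(x, beta=1.0), e_swish(x))
```

Because `beta` only rescales the output, any framework that provides Swish (or a sigmoid) can express E-swish as a one-line wrapper.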

Multi-Task Neural Models for Translating Between Styles Within and Across Languages


Title	Multi-Task Neural Models for Translating Between Styles Within and Across Languages
Authors	Xing Niu, Sudha Rao, Marine Carpuat
Abstract	Generating natural language requires conveying content in an appropriate style. We explore two related tasks on generating text of varying formality: monolingual formality transfer and formality-sensitive machine translation. We propose to solve these tasks jointly using multi-task learning, and show that our models achieve state-of-the-art performance for formality transfer and are able to perform formality-sensitive translation without being explicitly trained on style-annotated translation examples.
Tasks	Machine Translation, Multi-Task Learning
Published	2018-06-12
URL	http://arxiv.org/abs/1806.04357v1
PDF	http://arxiv.org/pdf/1806.04357v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-neural-models-for-translating
Repo	https://github.com/xingniu/multitask-ft-fsmt
Framework	mxnet

Gated Hierarchical Attention for Image Captioning


Title	Gated Hierarchical Attention for Image Captioning
Authors	Qingzhong Wang, Antoni B. Chan
Abstract	Attention modules connecting encoders and decoders have been widely applied in the fields of object recognition, image captioning, visual question answering and neural machine translation, and significantly improve performance. In this paper, we propose a bottom-up gated hierarchical attention (GHA) mechanism for image captioning. Our proposed model employs a CNN as the decoder, which is able to learn different concepts at different layers, and different concepts correspond to different areas of an image. Therefore, we develop the GHA, in which low-level concepts are merged into high-level concepts and, simultaneously, low-level attended features pass to the top to make predictions. Our GHA significantly improves the performance of a model that applies only one-level attention; for example, the CIDEr score increases from 0.923 to 0.999, which is comparable to state-of-the-art models that employ attribute boosting and reinforcement learning (RL). We also conduct extensive experiments to analyze the CNN decoder and our proposed GHA, and we find that deeper decoders do not obtain better performance, and that when the convolutional decoder becomes deeper the model is likely to collapse during training.
Tasks	Image Captioning
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12535v2
PDF	http://arxiv.org/pdf/1810.12535v2.pdf
PWC	https://paperswithcode.com/paper/gated-hierarchical-attention-for-image
Repo	https://github.com/qingzwang/GHA-ImageCaptioning
Framework	pytorch

Unsupervised Adversarial Depth Estimation using Cycled Generative Networks


Title	Unsupervised Adversarial Depth Estimation using Cycled Generative Networks
Authors	Andrea Pilzer, Dan Xu, Mihai Marian Puscas, Elisa Ricci, Nicu Sebe
Abstract	While recent deep monocular depth estimation approaches based on supervised regression have achieved remarkable performance, costly ground truth annotations are required during training. To cope with this issue, in this paper we present a novel unsupervised deep learning approach for predicting depth maps and show that the depth estimation task can be effectively tackled within an adversarial learning framework. Specifically, we propose a deep generative network that learns to predict the correspondence field, i.e., the disparity map, between two image views in a calibrated stereo camera setting. The proposed architecture consists of two generative sub-networks jointly trained with adversarial learning for reconstructing the disparity map and organized in a cycle so as to provide mutual constraints and supervision to each other. Extensive experiments on the publicly available datasets KITTI and Cityscapes demonstrate the effectiveness of the proposed model and results competitive with state-of-the-art methods. The code and trained model are available at https://github.com/andrea-pilzer/unsup-stereo-depthGAN.
Tasks	Depth Estimation, Monocular Depth Estimation
Published	2018-07-28
URL	http://arxiv.org/abs/1807.10915v1
PDF	http://arxiv.org/pdf/1807.10915v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-adversarial-depth-estimation
Repo	https://github.com/rickgroen/depthgan
Framework	pytorch
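
Predicting a disparity map is enough to recover depth in a calibrated, rectified stereo setup, via the standard relation depth = focal length x baseline / disparity. The sketch below applies it to a single pixel; the function name, parameter names, and the example numbers are hypothetical and do not come from the paper or its datasets.

```python
def disparity_to_depth(disparity_px: float,
                       focal_length_px: float,
                       baseline_m: float) -> float:
    """Depth in metres from disparity in pixels for a rectified stereo pair."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


print(disparity_to_depth(36.0, focal_length_px=720.0, baseline_m=0.5))
print(disparity_to_depth(0.0, focal_length_px=720.0, baseline_m=0.5))
```

Depth is inversely proportional to disparity, so small disparity errors on distant objects translate into large depth errors.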