October 20, 2019

Paper Group AWR 340

The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight. Cross-Domain Adversarial Auto-Encoder. Probabilistic Binary Neural Networks. Recurrent Skipping Networks for Entity Alignment. Laplacian Smoothing Gradient Descent. Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularit …

The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight


Title	The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight
Authors	Amado Antonini, Winter Guerra, Varun Murali, Thomas Sayre-McCord, Sertac Karaman
Abstract	The Blackbird unmanned aerial vehicle (UAV) dataset is a large-scale, aggressive indoor flight dataset collected using a custom-built quadrotor platform for use in evaluation of agile perception. Inspired by the potential of future high-speed fully-autonomous drone racing, the Blackbird dataset contains over 10 hours of flight data from 168 flights over 17 flight trajectories and 5 environments at velocities up to $7.0\,\mathrm{m\,s^{-1}}$. Each flight includes sensor data from 120 Hz stereo and downward-facing photorealistic virtual cameras, 100 Hz IMU, $\sim 190$ Hz motor speed sensors, and 360 Hz millimeter-accurate motion capture ground truth. Camera images for each flight were photorealistically rendered using FlightGoggles across a variety of environments to facilitate easy experimentation of high performance perception algorithms. The dataset is available for download at http://blackbird-dataset.mit.edu/.
Tasks	Motion Capture
Published	2018-10-03
URL	http://arxiv.org/abs/1810.01987v1
PDF	http://arxiv.org/pdf/1810.01987v1.pdf
PWC	https://paperswithcode.com/paper/the-blackbird-dataset-a-large-scale-dataset
Repo	https://github.com/mit-fast/FlightGoggles
Framework	none

Cross-Domain Adversarial Auto-Encoder


Title	Cross-Domain Adversarial Auto-Encoder
Authors	Haodi Hou, Jing Huo, Yang Gao
Abstract	In this paper, we propose the Cross-Domain Adversarial Auto-Encoder (CDAAE) to address the problem of cross-domain image inference, generation and transformation. We make the assumption that images from different domains share the same latent code space for content, while having separate latent code spaces for style. The proposed framework can map cross-domain data to a latent code vector consisting of a content part and a style part. The latent code vector is matched with a prior distribution so that we can generate meaningful samples from any part of the prior space. Consequently, given a sample of one domain, our framework can generate various samples of the other domain with the same content as the input. This makes the proposed framework different from the current work on cross-domain transformation. Besides, the proposed framework can be trained with both labeled and unlabeled data, which makes it also suitable for domain adaptation. Experimental results on the SVHN, MNIST and CASIA data sets show that the proposed framework achieves visually appealing performance on the image generation task. We also demonstrate that the proposed method achieves superior results for domain adaptation. Code for our experiments is available at https://github.com/luckycallor/CDAAE.
Tasks	Domain Adaptation, Image Generation
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06078v1
PDF	http://arxiv.org/pdf/1804.06078v1.pdf
PWC	https://paperswithcode.com/paper/cross-domain-adversarial-auto-encoder
Repo	https://github.com/luckycallor/CDAAE
Framework	tf

Probabilistic Binary Neural Networks


Title	Probabilistic Binary Neural Networks
Authors	Jorn W. T. Peters, Max Welling
Abstract	Low bit-width weights and activations are an effective way of combating the increasing need for both memory and compute power of Deep Neural Networks. In this work, we present a probabilistic training method for Neural Networks with both binary weights and activations, called BLRNet. By embracing stochasticity during training, we circumvent the need to approximate the gradient of non-differentiable functions such as sign(), while still obtaining a fully Binary Neural Network at test time. Moreover, it allows for anytime ensemble predictions for improved performance and uncertainty estimates by sampling from the weight distribution. Since all operations in a layer of the BLRNet operate on random variables, we introduce stochastic versions of Batch Normalization and max pooling, which transfer well to a deterministic network at test time. We evaluate the BLRNet on multiple standardized benchmarks.
Tasks
Published	2018-09-10
URL	http://arxiv.org/abs/1809.03368v1
PDF	http://arxiv.org/pdf/1809.03368v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-binary-neural-networks
Repo	https://github.com/COMP6248-Reproducability-Challenge/Reproduction-of-Probabilistic-binary-neural-networks
Framework	pytorch
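
As a rough illustration of the sampling idea in the abstract, the sketch below draws binary weights in {-1, +1} from independent per-weight Bernoulli distributions and averages the sign outputs of several sampled neurons into an ensemble prediction. It is not BLRNet's training procedure. The function names, the parameterization by `p_plus` (the probability that a weight equals +1), and the choice of the most probable value for the test-time weights are all assumptions made for this example.

```python
import random


def sample_binary_weights(p_plus, rng):
    """Draw one weight in {-1, +1} per entry; p_plus[i] is P(w_i = +1)."""
    return [1 if rng.random() < p else -1 for p in p_plus]


def binary_neuron(x, w):
    """Binary activation: sign of the dot product (ties map to +1)."""
    s = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if s >= 0 else -1


def ensemble_predict(x, p_plus, n_samples=100, seed=0):
    """Average the outputs of n_samples independently sampled binary neurons."""
    rng = random.Random(seed)
    outputs = [
        binary_neuron(x, sample_binary_weights(p_plus, rng))
        for _ in range(n_samples)
    ]
    return sum(outputs) / n_samples


def deterministic_weights(p_plus):
    """Assumed test-time choice: the most probable value of each weight."""
    return [1 if p >= 0.5 else -1 for p in p_plus]
```

An ensemble output close to +1 or -1 means the sampled networks agree; values near 0 indicate disagreement, which is one simple way to read off an uncertainty estimate from the weight distribution.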

Recurrent Skipping Networks for Entity Alignment


Title	Recurrent Skipping Networks for Entity Alignment
Authors	Lingbing Guo, Zequn Sun, Ermei Cao, Wei Hu
Abstract	We consider the problem of learning knowledge graph (KG) embeddings for entity alignment (EA). Current methods use embedding models that mainly focus on triple-level learning, which lacks the ability to capture long-term dependencies in KGs. Consequently, the embedding-based EA methods rely heavily on the amount of prior (known) alignment, because the identity information in the prior alignment cannot be efficiently propagated from one KG to another. In this paper, we propose RSN4EA (recurrent skipping networks for EA), which leverages biased random walk sampling for generating long paths across KGs and models the paths with a novel recurrent skipping network (RSN). RSN integrates the conventional recurrent neural network (RNN) with residual learning and can largely improve the convergence speed and performance with only a few more parameters. We evaluated RSN4EA on a series of datasets constructed from real-world KGs. Our experimental results showed that it outperformed a number of state-of-the-art embedding-based EA methods and also achieved comparable performance for KG completion.
Tasks	Entity Alignment
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02318v1
PDF	http://arxiv.org/pdf/1811.02318v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-skipping-networks-for-entity
Repo	https://github.com/THU-KEG/Entity_Alignment_Papers
Framework	tf

Laplacian Smoothing Gradient Descent


Title	Laplacian Smoothing Gradient Descent
Authors	Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, Alex Lin
Abstract	We propose a class of very simple modifications of gradient descent and stochastic gradient descent. We show that when applied to a large variety of machine learning problems, ranging from logistic regression to deep neural nets, the proposed surrogates can dramatically reduce the variance, allow a larger step size, and improve the generalization accuracy. The methods only involve multiplying the usual (stochastic) gradient by the inverse of a positive definite matrix (which can be computed efficiently by FFT) with a low condition number coming from a one-dimensional discrete Laplacian or its high order generalizations. It also preserves the mean and increases the smallest component and decreases the largest component. The theory of Hamilton-Jacobi partial differential equations demonstrates that the implicit version of the new algorithm is almost the same as doing gradient descent on a new function which (i) has the same global minima as the original function and (ii) is “more convex”. Moreover, we show that optimization algorithms with these surrogates converge uniformly in the discrete Sobolev $H_\sigma^p$ sense and reduce the optimality gap for convex optimization problems. The code is available at: https://github.com/BaoWangMath/LaplacianSmoothing-GradientDescent
Tasks
Published	2018-06-17
URL	http://arxiv.org/abs/1806.06317v5
PDF	http://arxiv.org/pdf/1806.06317v5.pdf
PWC	https://paperswithcode.com/paper/laplacian-smoothing-gradient-descent
Repo	https://github.com/BaoWangMath/LaplacianSmoothing-GradientDescent
Framework	pytorch
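
A minimal sketch of the smoothing step described in the abstract, assuming the first-order case with periodic boundary conditions: the gradient $g$ is replaced by $A_\sigma^{-1} g$, where $A_\sigma = I - \sigma L$ and $L$ is the one-dimensional discrete Laplacian. Under that assumption $A_\sigma$ is circulant, so the discrete Fourier transform diagonalizes it, with eigenvalues $1 + 2\sigma(1 - \cos(2\pi k / n))$. The function names, the default `sigma`, and the naive quadratic-time DFT (used so that the example needs only the standard library; a practical version would call an FFT routine) are choices made for this illustration and are not taken from the authors' repository.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(n^2); stands in for an FFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def idft(x_hat):
    """Naive inverse discrete Fourier transform."""
    n = len(x_hat)
    return [
        sum(x_hat[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]


def laplacian_smooth(grad, sigma=1.0):
    """Solve (I - sigma * L) s = grad for s, with periodic boundaries."""
    n = len(grad)
    # Eigenvalues of the circulant matrix with first row [1 + 2s, -s, 0, ..., 0, -s].
    eig = [1.0 + 2.0 * sigma * (1.0 - math.cos(2.0 * math.pi * k / n))
           for k in range(n)]
    g_hat = dft(grad)
    smoothed = idft([g / e for g, e in zip(g_hat, eig)])
    return [v.real for v in smoothed]  # imaginary parts are rounding noise


def ls_gd_step(w, grad, lr=0.1, sigma=1.0):
    """One gradient-descent step using the smoothed gradient."""
    s = laplacian_smooth(grad, sigma)
    return [wi - lr * si for wi, si in zip(w, s)]
```

Because the zero-frequency eigenvalue is exactly 1, the smoothed gradient keeps the mean of the original one, while its largest component shrinks and its smallest component grows, matching the properties listed in the abstract.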

Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity


Title	Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity
Authors	Christian Koch, Moritz Lode, Denny Stohr, Amr Rizk, Ralf Steinmetz
Abstract	YouTube is one of the most popular platforms for streaming of user-generated video. Nowadays, professional YouTubers are organized in so-called multi-channel networks (MCNs). These networks offer services such as brand deals, equipment, and strategic advice in exchange for a share of the YouTubers’ revenue. A major strategy to gain more subscribers and, hence, revenue is collaborating with other YouTubers. Yet, collaborations on YouTube have not been studied in a detailed quantitative manner. This paper aims to close this gap with the following contributions. First, we collect a YouTube dataset covering video statistics over three months for 7,942 channels. Second, we design a framework for collaboration detection given a previously unknown number of persons featuring in YouTube videos. We denote this framework for the analysis of collaborations in YouTube videos using a Deep Neural Network (DNN) based approach as CATANA. Third, we analyze about 2.4 years of video content and use CATANA to answer research questions providing guidance for YouTubers and MCNs for efficient collaboration strategies. Thereby, we focus on (i) collaboration frequency and partner selectivity, (ii) the influence of MCNs on channel collaborations, (iii) collaborating channel types, and (iv) the impact of collaborations on video and channel popularity. Our results show that collaborations are in many cases significantly beneficial in terms of viewers and newly attracted subscribers for both collaborating channels, often showing more than 100% popularity growth compared with non-collaboration videos.
Tasks
Published	2018-05-01
URL	http://arxiv.org/abs/1805.01887v1
PDF	http://arxiv.org/pdf/1805.01887v1.pdf
PWC	https://paperswithcode.com/paper/collaborations-on-youtube-from-unsupervised
Repo	https://github.com/christiannkoch/CATANA
Framework	none

Dynamic Meta-Embeddings for Improved Sentence Representations


Title	Dynamic Meta-Embeddings for Improved Sentence Representations
Authors	Douwe Kiela, Changhan Wang, Kyunghyun Cho
Abstract	While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. We subsequently show how the technique can be used to shed new light on the usage of word embeddings in NLP systems.
Tasks	Word Embeddings
Published	2018-04-21
URL	http://arxiv.org/abs/1804.07983v2
PDF	http://arxiv.org/pdf/1804.07983v2.pdf
PWC	https://paperswithcode.com/paper/dynamic-meta-embeddings-for-improved-sentence
Repo	https://github.com/kushalchauhan98/dynamic-meta-embeddings
Framework	pytorch

Stochastic Block Models are a Discrete Surface Tension


Title	Stochastic Block Models are a Discrete Surface Tension
Authors	Zachary M. Boyd, Mason A. Porter, Andrea L. Bertozzi
Abstract	Networks, which represent agents and interactions between them, arise in myriad applications throughout the sciences, engineering, and even the humanities. To understand large-scale structure in a network, a common task is to cluster a network’s nodes into sets called “communities”, such that there are dense connections within communities but sparse connections between them. A popular and statistically principled method to perform such clustering is to use a family of generative models known as stochastic block models (SBMs). In this paper, we show that maximum likelihood estimation in an SBM is a network analog of a well-known continuum surface-tension problem that arises from an application in metallurgy. To illustrate the utility of this relationship, we implement network analogs of three surface-tension algorithms, with which we successfully recover planted community structure in synthetic networks and which yield fascinating insights on empirical networks that we construct from hyperspectral videos.
Tasks	Video Semantic Segmentation
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02485v2
PDF	http://arxiv.org/pdf/1806.02485v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-block-models-are-a-discrete
Repo	https://github.com/zboyd2/SBM-surface-tension
Framework	none

Investigating Human Priors for Playing Video Games


Title	Investigating Human Priors for Playing Video Games
Authors	Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Thomas L. Griffiths, Alexei A. Efros
Abstract	What makes humans so good at solving seemingly complex video games? Unlike computers, humans bring in a great deal of prior knowledge about the world, enabling efficient decision making. This paper investigates the role of human priors for solving video games. Given a sample game, we conduct a series of ablation studies to quantify the importance of various priors on human performance. We do this by modifying the video game environment to systematically mask different types of visual information that could be used by humans as priors. We find that removal of some prior knowledge causes a drastic degradation in the speed with which human players solve the game, e.g. from 2 minutes to over 20 minutes. Furthermore, our results indicate that general priors, such as the importance of objects and visual consistency, are critical for efficient game-play. Videos and the game manipulations are available at https://rach0012.github.io/humanRL_website/
Tasks	Decision Making
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10217v3
PDF	http://arxiv.org/pdf/1802.10217v3.pdf
PWC	https://paperswithcode.com/paper/investigating-human-priors-for-playing-video
Repo	https://github.com/rach0012/humanRL_prior_games
Framework	none

Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting


Title	Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting
Authors	Xialei Liu, Marc Masana, Luis Herranz, Joost Van de Weijer, Antonio M. Lopez, Andrew D. Bagdanov
Abstract	In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to other state-of-the-art in lifelong learning without forgetting.
Tasks
Published	2018-02-08
URL	http://arxiv.org/abs/1802.02950v4
PDF	http://arxiv.org/pdf/1802.02950v4.pdf
PWC	https://paperswithcode.com/paper/rotate-your-networks-better-weight
Repo	https://github.com/xialeiliu/RotateNetworks
Framework	tf
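
For context, the Elastic Weight Consolidation penalty that the abstract builds on can be sketched as follows. With a diagonal Fisher approximation, the loss on a new task is augmented by $\frac{\lambda}{2}\sum_i F_i(\theta_i - \theta_i^*)^2$, where $\theta^*$ are the parameters learned on the previous task and $F_i$ are the diagonal Fisher entries. The code covers only this standard penalty term, not the rotation proposed in the paper; the function names and parameter names are hypothetical.

```python
def ewc_penalty(theta, theta_star, fisher_diag, lam=1.0):
    """Quadratic EWC penalty under a diagonal Fisher approximation."""
    return 0.5 * lam * sum(
        f * (t - ts) ** 2 for t, ts, f in zip(theta, theta_star, fisher_diag)
    )


def ewc_penalty_grad(theta, theta_star, fisher_diag, lam=1.0):
    """Gradient of the penalty with respect to theta."""
    return [lam * f * (t - ts) for t, ts, f in zip(theta, theta_star, fisher_diag)]
```

Parameters with large Fisher values are pulled strongly toward their previous-task values, while parameters with small Fisher values remain free to change. The penalty ignores off-diagonal Fisher terms, which is the assumption that the paper's reparameterization tries to make more accurate by approximately diagonalizing the Fisher Information Matrix.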

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks


Title	The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Authors	Jonathan Frankle, Michael Carbin
Abstract	Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy. However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance. We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the “lottery ticket hypothesis:” dense, randomly-initialized, feed-forward networks contain subnetworks (“winning tickets”) that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective. We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis and the importance of these fortuitous initializations. We consistently find winning tickets that are less than 10-20% of the size of several fully-connected and convolutional feed-forward architectures for MNIST and CIFAR10. Above this size, the winning tickets that we find learn faster than the original network and reach higher test accuracy.
Tasks	Network Pruning
Published	2018-03-09
URL	http://arxiv.org/abs/1803.03635v5
PDF	http://arxiv.org/pdf/1803.03635v5.pdf
PWC	https://paperswithcode.com/paper/the-lottery-ticket-hypothesis-finding-sparse
Repo	https://github.com/gcastex/PruNet
Framework	pytorch
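
The procedure for finding a winning ticket can be sketched in a few lines, assuming magnitude-based pruning as the standard pruning technique: train the network, remove the weights with the smallest trained magnitude, and reset the surviving weights to their original initial values before retraining. The sketch below operates on flat lists of weights and leaves out training entirely; `magnitude_mask` and `winning_ticket` are hypothetical helper names, not functions from the linked repository.

```python
def magnitude_mask(trained, prune_fraction):
    """Return a 0/1 mask that drops the smallest-magnitude trained weights."""
    n_prune = int(len(trained) * prune_fraction)
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained))]


def winning_ticket(initial, trained, prune_fraction):
    """Apply the mask derived from trained weights to the initial weights."""
    mask = magnitude_mask(trained, prune_fraction)
    ticket = [w0 * m for w0, m in zip(initial, mask)]
    return ticket, mask
```

The key design point is that the mask is computed from the trained weights but applied to the initial ones, so the resulting subnetwork starts from the same initialization that the dense network had.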

Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction


Title	Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction
Authors	Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid
Abstract	Despite learning based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner. Recent approaches to single view depth estimation explore the possibility of learning without full supervision via minimizing photometric error. In this paper, we explore the use of stereo sequences for learning depth and visual odometry. The use of stereo sequences enables the use of both spatial (between left-right pairs) and temporal (forward backward) photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale. At test time our framework is able to estimate single view depth and two-view odometry from a monocular sequence. We also show how we can improve on a standard photometric warp loss by considering a warp of deep features. We show through extensive experiments that: (i) jointly training for single view depth and visual odometry improves depth prediction because of the additional constraint imposed on depths and achieves competitive results for visual odometry; (ii) deep feature-based warping loss improves upon simple photometric warp loss for both single view depth estimation and visual odometry. Our method outperforms existing learning based methods on the KITTI driving dataset in both tasks. The source code is available at https://github.com/Huangying-Zhan/Depth-VO-Feat
Tasks	Depth And Camera Motion, Depth Estimation, Monocular Depth Estimation, Visual Odometry
Published	2018-03-11
URL	http://arxiv.org/abs/1803.03893v3
PDF	http://arxiv.org/pdf/1803.03893v3.pdf
PWC	https://paperswithcode.com/paper/unsupervised-learning-of-monocular-depth-1
Repo	https://github.com/Huangying-Zhan/Depth-VO-Feat
Framework	caffe2
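
The spatial (left-right) photometric warp error mentioned in the abstract can be illustrated on a single image row of a rectified stereo pair. Assuming a pinhole camera model, a depth $Z$ corresponds to a disparity $d = fB/Z$ (focal length $f$ in pixels, baseline $B$ in meters), and the left row is reconstructed by sampling the right row at position $x - d$. This is a simplified one-dimensional sketch, not the authors' implementation; all function names and parameters are illustrative assumptions.

```python
def depth_to_disparity(depth, focal_px, baseline_m):
    """Convert per-pixel depth (meters) to disparity (pixels): d = f * B / Z."""
    return [focal_px * baseline_m / z for z in depth]


def sample_linear(row, x):
    """Linearly interpolate row at real-valued position x; None if out of bounds."""
    if x < 0 or x > len(row) - 1:
        return None
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    a = x - x0
    return (1 - a) * row[x0] + a * row[x1]


def photometric_warp_error(left_row, right_row, disparity):
    """Mean absolute error between the left row and the right row warped to it."""
    errors = []
    for x, d in enumerate(disparity):
        value = sample_linear(right_row, x - d)
        if value is not None:  # skip pixels that warp outside the image
            errors.append(abs(left_row[x] - value))
    return sum(errors) / len(errors) if errors else 0.0
```

When the predicted depth is correct, the warped right row matches the left row and the error goes to zero, which is what makes this quantity usable as a training signal without ground-truth depth. Because the baseline is known in meters, the recovered depth is in real-world scale, as the abstract notes.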

xView: Objects in Context in Overhead Imagery


Title	xView: Objects in Context in Overhead Imagery
Authors	Darius Lam, Richard Kuzma, Kevin McGee, Samuel Dooley, Michael Laielli, Matthew Klaric, Yaroslav Bulatov, Brendan McCord
Abstract	We introduce a new large-scale dataset for the advancement of object detection techniques and overhead object detection research. This satellite imagery dataset enables research progress pertaining to four key computer vision frontiers. We utilize a novel process for geospatial category detection and bounding box annotation with three stages of quality control. Our data is collected from WorldView-3 satellites at 0.3m ground sample distance, providing higher resolution imagery than most public satellite imagery datasets. We compare xView to other object detection datasets in both natural and overhead imagery domains and then provide a baseline analysis using the Single Shot MultiBox Detector. xView is one of the largest and most diverse publicly available object-detection datasets to date, with over 1 million objects across 60 classes in over 1,400 km^2 of imagery.
Tasks	Object Detection
Published	2018-02-22
URL	http://arxiv.org/abs/1802.07856v1
PDF	http://arxiv.org/pdf/1802.07856v1.pdf
PWC	https://paperswithcode.com/paper/xview-objects-in-context-in-overhead-imagery
Repo	https://github.com/dellemc-hpc-ai/satellite_imagery_demo
Framework	none

Patch-based Progressive 3D Point Set Upsampling


Title	Patch-based Progressive 3D Point Set Upsampling
Authors	Wang Yifan, Shihao Wu, Hui Huang, Daniel Cohen-Or, Olga Sorkine-Hornung
Abstract	We present a detail-driven deep neural network for point set upsampling. A high-resolution point set is essential for point-based rendering and surface reconstruction. Inspired by the recent success of neural image super-resolution techniques, we progressively train a cascade of patch-based upsampling networks on different levels of detail end-to-end. We propose a series of architectural design contributions that lead to a substantial performance boost. The effect of each technical contribution is demonstrated in an ablation study. Qualitative and quantitative experiments show that our method significantly outperforms the state-of-the-art learning-based and optimization-based approaches, both in terms of handling low-resolution inputs and revealing high-fidelity details.
Tasks	Point Cloud Super Resolution, Point Set Upsampling, Super-Resolution
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11286v3
PDF	http://arxiv.org/pdf/1811.11286v3.pdf
PWC	https://paperswithcode.com/paper/patch-based-progressive-3d-point-set
Repo	https://github.com/yifita/3PU
Framework	tf

A Quantitative Analysis of Multi-Winner Rules


Title	A Quantitative Analysis of Multi-Winner Rules
Authors	Martin Lackner, Piotr Skowron
Abstract	To choose a suitable multi-winner voting rule is a hard and ambiguous task. Depending on the context, it varies widely what constitutes the choice of an “optimal” subset of alternatives. In this paper, we offer a new perspective on measuring the quality of such subsets and—consequently—of multi-winner rules. We provide a quantitative analysis using methods from the theory of approximation algorithms and estimate how well multi-winner rules approximate two extreme objectives: a representation criterion defined via the Approval Chamberlin–Courant rule and a utilitarian criterion defined via Multi-winner Approval Voting. With both theoretical and experimental methods we classify multi-winner rules in terms of their quantitative alignment with these two opposing objectives. Our results provide fundamental information about the nature of multi-winner rules, and in particular about the necessary tradeoffs when choosing such a rule.
Tasks
Published	2018-01-04
URL	https://arxiv.org/abs/1801.01527v3
PDF	https://arxiv.org/pdf/1801.01527v3.pdf
PWC	https://paperswithcode.com/paper/a-quantitative-analysis-of-multi-winner-rules
Repo	https://github.com/martinlackner/abcvoting
Framework	none
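
The two extreme objectives named in the abstract can be made concrete with a small brute-force sketch. Given approval ballots, Multi-winner Approval Voting scores a committee by the total number of approvals its members receive, while the Approval Chamberlin–Courant rule scores it by the number of voters who approve at least one member. The code below is an illustration written for this summary, not the API of the linked `abcvoting` package; the function names are hypothetical, and the exhaustive search is only practical for tiny instances.

```python
from itertools import combinations


def av_score(profile, committee):
    """Utilitarian objective: total approvals given to committee members."""
    return sum(len(ballot & committee) for ballot in profile)


def cc_score(profile, committee):
    """Representation objective: voters with at least one approved member."""
    return sum(1 for ballot in profile if ballot & committee)


def best_committees(profile, candidates, k, score):
    """Exhaustively find all size-k committees that maximize the given score."""
    best, winners = -1, []
    for combo in combinations(sorted(candidates), k):
        committee = set(combo)
        s = score(profile, committee)
        if s > best:
            best, winners = s, [committee]
        elif s == best:
            winners.append(committee)
    return best, winners


# Three voters approve {a, b}; one voter approves only {c}.
profile = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"c"}]
candidates = {"a", "b", "c"}
print(best_committees(profile, candidates, 2, av_score))
print(best_committees(profile, candidates, 2, cc_score))
```

In this toy profile the utilitarian objective selects {a, b}, leaving the fourth voter unrepresented, whereas the representation objective selects a committee containing c. This is the kind of tradeoff between the two criteria that the paper quantifies.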