Paper Group AWR 340
The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight. Cross-Domain Adversarial Auto-Encoder. Probabilistic Binary Neural Networks. Recurrent Skipping Networks for Entity Alignment. Laplacian Smoothing Gradient Descent. Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity …
The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight
Title | The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight |
Authors | Amado Antonini, Winter Guerra, Varun Murali, Thomas Sayre-McCord, Sertac Karaman |
Abstract | The Blackbird unmanned aerial vehicle (UAV) dataset is a large-scale, aggressive indoor flight dataset collected using a custom-built quadrotor platform for use in evaluation of agile perception. Inspired by the potential of future high-speed fully-autonomous drone racing, the Blackbird dataset contains over 10 hours of flight data from 168 flights over 17 flight trajectories and 5 environments at velocities up to $7.0\,\mathrm{m\,s^{-1}}$. Each flight includes sensor data from 120 Hz stereo and downward-facing photorealistic virtual cameras, a 100 Hz IMU, $\sim$190 Hz motor speed sensors, and 360 Hz millimeter-accurate motion capture ground truth. Camera images for each flight were photorealistically rendered using FlightGoggles across a variety of environments to facilitate easy experimentation with high-performance perception algorithms. The dataset is available for download at http://blackbird-dataset.mit.edu/. |
Tasks | Motion Capture |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01987v1 |
http://arxiv.org/pdf/1810.01987v1.pdf | |
PWC | https://paperswithcode.com/paper/the-blackbird-dataset-a-large-scale-dataset |
Repo | https://github.com/mit-fast/FlightGoggles |
Framework | none |
Cross-Domain Adversarial Auto-Encoder
Title | Cross-Domain Adversarial Auto-Encoder |
Authors | Haodi Hou, Jing Huo, Yang Gao |
Abstract | In this paper, we propose the Cross-Domain Adversarial Auto-Encoder (CDAAE) to address the problem of cross-domain image inference, generation and transformation. We make the assumption that images from different domains share the same latent code space for content, while having separate latent code spaces for style. The proposed framework can map cross-domain data to a latent code vector consisting of a content part and a style part. The latent code vector is matched with a prior distribution so that we can generate meaningful samples from any part of the prior space. Consequently, given a sample of one domain, our framework can generate various samples of the other domain with the same content as the input. This distinguishes the proposed framework from current work on cross-domain transformation. Moreover, the proposed framework can be trained with both labeled and unlabeled data, which also makes it suitable for domain adaptation. Experimental results on the SVHN, MNIST and CASIA datasets show that the proposed framework achieves visually appealing performance on the image generation task. We also demonstrate that the proposed method achieves superior results for domain adaptation. Code for our experiments is available at https://github.com/luckycallor/CDAAE. |
Tasks | Domain Adaptation, Image Generation |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06078v1 |
http://arxiv.org/pdf/1804.06078v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-adversarial-auto-encoder |
Repo | https://github.com/luckycallor/CDAAE |
Framework | tf |
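A minimal PyTorch sketch of the latent split the abstract describes: a shared content code plus per-domain style heads and decoders. This is not the authors' implementation; layer sizes and names are illustrative assumptions, and the adversarial discriminator that matches the latent codes to the prior is omitted.

```python
import torch
import torch.nn as nn

class CDAAESketch(nn.Module):
    """Hypothetical sketch: one shared content code, per-domain style codes."""
    def __init__(self, x_dim=784, content_dim=16, style_dim=8, n_domains=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU())
        self.to_content = nn.Linear(256, content_dim)  # shared across domains
        self.to_style = nn.ModuleList(
            [nn.Linear(256, style_dim) for _ in range(n_domains)])  # one head per domain
        self.decoders = nn.ModuleList(
            [nn.Sequential(nn.Linear(content_dim + style_dim, 256), nn.ReLU(),
                           nn.Linear(256, x_dim), nn.Sigmoid())
             for _ in range(n_domains)])

    def forward(self, x, src, tgt):
        h = self.encoder(x)
        c, s = self.to_content(h), self.to_style[src](h)
        recon = self.decoders[src](torch.cat([c, s], dim=1))
        # cross-domain transfer: keep the content code, draw a style code
        # from the prior, and decode with the target domain's decoder
        s_tgt = torch.randn_like(s)
        transfer = self.decoders[tgt](torch.cat([c, s_tgt], dim=1))
        return recon, transfer
```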
Probabilistic Binary Neural Networks
Title | Probabilistic Binary Neural Networks |
Authors | Jorn W. T. Peters, Max Welling |
Abstract | Low bit-width weights and activations are an effective way of combating the increasing memory and compute demands of Deep Neural Networks. In this work, we present a probabilistic training method for Neural Networks with both binary weights and activations, called BLRNet. By embracing stochasticity during training, we circumvent the need to approximate the gradient of non-differentiable functions such as sign(), while still obtaining a fully Binary Neural Network at test time. Moreover, the method allows for anytime ensemble predictions for improved performance and uncertainty estimates, obtained by sampling from the weight distribution. Since all operations in a layer of the BLRNet operate on random variables, we introduce stochastic versions of Batch Normalization and max pooling, which transfer well to a deterministic network at test time. We evaluate the BLRNet on multiple standardized benchmarks. |
Tasks | |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03368v1 |
http://arxiv.org/pdf/1809.03368v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-binary-neural-networks |
Repo | https://github.com/COMP6248-Reproducability-Challenge/Reproduction-of-Probabilistic-binary-neural-networks |
Framework | pytorch |
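A rough sketch of one probabilistic binary layer in this spirit, assuming ±1 Bernoulli weights and CLT-based local reparameterization; the paper's stochastic Batch Normalization and pooling are omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn

class StochasticBinaryLinear(nn.Module):
    """Hypothetical sketch: each weight is a ±1 Bernoulli variable with
    P(w=+1) = sigmoid(theta). By the central limit theorem the pre-activation
    is approximately Gaussian, so we propagate its mean and variance and
    sample once per unit (local reparameterization) rather than sampling
    every weight."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.theta = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x):                          # x assumed binary, in {-1, +1}
        p = torch.sigmoid(self.theta)              # P(w = +1)
        m = 2.0 * p - 1.0                          # E[w]
        v = 1.0 - m ** 2                           # Var[w] for a ±1 variable
        mu = x @ m.t()                             # pre-activation mean
        var = (x ** 2) @ v.t()                     # pre-activation variance
        eps = torch.randn_like(mu)
        # at test time, w = sign(m) yields a fully binary deterministic layer
        return mu + var.clamp_min(1e-8).sqrt() * eps  # reparameterized sample
```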
Recurrent Skipping Networks for Entity Alignment
Title | Recurrent Skipping Networks for Entity Alignment |
Authors | Lingbing Guo, Zequn Sun, Ermei Cao, Wei Hu |
Abstract | We consider the problem of learning knowledge graph (KG) embeddings for entity alignment (EA). Current embedding models mainly focus on triple-level learning, which lacks the ability to capture the long-term dependencies that exist in KGs. Consequently, embedding-based EA methods rely heavily on the amount of prior (known) alignment, because the identity information in the prior alignment cannot be efficiently propagated from one KG to another. In this paper, we propose RSN4EA (recurrent skipping networks for EA), which leverages biased random walk sampling to generate long paths across KGs and models the paths with a novel recurrent skipping network (RSN). RSN integrates the conventional recurrent neural network (RNN) with residual learning and can largely improve convergence speed and performance with only a few more parameters. We evaluated RSN4EA on a series of datasets constructed from real-world KGs. Our experimental results showed that it outperformed a number of state-of-the-art embedding-based EA methods and also achieved comparable performance for KG completion. |
Tasks | Entity Alignment |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02318v1 |
http://arxiv.org/pdf/1811.02318v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-skipping-networks-for-entity |
Repo | https://github.com/THU-KEG/Entity_Alignment_Papers |
Framework | tf |
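The skipping mechanism can be sketched compactly: a standard recurrent cell consumes an alternating entity/relation path, and a residual "skip" feeds the subject entity directly into the prediction of its object. A hedged PyTorch sketch; the GRU cell and weight names are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class RSNSketch(nn.Module):
    """Hypothetical sketch of a recurrent skipping network over paths of
    alternating (entity, relation, entity, ...) embeddings."""
    def __init__(self, emb_dim, hidden_dim):
        super().__init__()
        self.rnn_cell = nn.GRUCell(emb_dim, hidden_dim)
        self.s1 = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.s2 = nn.Linear(emb_dim, hidden_dim, bias=False)

    def forward(self, path):              # path: (batch, seq_len, emb_dim)
        h = path.new_zeros(path.size(0), self.s1.in_features)
        outputs = []
        for t in range(path.size(1)):
            h = self.rnn_cell(path[:, t], h)
            if t % 2 == 1:                # odd position = relation: the subject
                                          # entity skips ahead via a residual term
                out = self.s1(h) + self.s2(path[:, t - 1])
            else:                         # even position = entity: plain RNN output
                out = h
            outputs.append(out)
        return torch.stack(outputs, dim=1)
```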
Laplacian Smoothing Gradient Descent
Title | Laplacian Smoothing Gradient Descent |
Authors | Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, Alex Lin |
Abstract | We propose a class of very simple modifications of gradient descent and stochastic gradient descent. We show that when applied to a large variety of machine learning problems, ranging from logistic regression to deep neural nets, the proposed surrogates can dramatically reduce the variance, allow a larger step size, and improve the generalization accuracy. The methods only involve multiplying the usual (stochastic) gradient by the inverse of a positive definite matrix (which can be computed efficiently by FFT) with a low condition number coming from a one-dimensional discrete Laplacian or its higher-order generalizations. The surrogate also preserves the mean of the gradient while increasing its smallest component and decreasing its largest component. The theory of Hamilton-Jacobi partial differential equations demonstrates that the implicit version of the new algorithm is almost the same as doing gradient descent on a new function which (i) has the same global minima as the original function and (ii) is “more convex”. Moreover, we show that optimization algorithms with these surrogates converge uniformly in the discrete Sobolev $H_\sigma^p$ sense and reduce the optimality gap for convex optimization problems. The code is available at: \url{https://github.com/BaoWangMath/LaplacianSmoothing-GradientDescent} |
Tasks | |
Published | 2018-06-17 |
URL | http://arxiv.org/abs/1806.06317v5 |
http://arxiv.org/pdf/1806.06317v5.pdf | |
PWC | https://paperswithcode.com/paper/laplacian-smoothing-gradient-descent |
Repo | https://github.com/BaoWangMath/LaplacianSmoothing-GradientDescent |
Framework | pytorch |
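The core operation is concrete enough to sketch: smooth the gradient by solving $(I - \sigma L)v = g$, where $L$ is the 1-D periodic discrete Laplacian. The matrix is circulant, so the solve reduces to two FFTs. A minimal NumPy sketch (the authors' actual code is in the linked repo):

```python
import numpy as np

def laplacian_smooth(grad, sigma=1.0):
    """Return (I - sigma * L)^{-1} grad for the 1-D periodic discrete
    Laplacian L. The system matrix is circulant, so its eigenvalues are the
    FFT of its first column, 1 + 2*sigma*(1 - cos(2*pi*k/n)) >= 1, and the
    solve is a pointwise division in Fourier space."""
    g = grad.ravel().astype(float)
    c = np.zeros_like(g)
    c[0], c[1], c[-1] = 1.0 + 2.0 * sigma, -sigma, -sigma  # first column of I - sigma*L
    v = np.fft.ifft(np.fft.fft(g) / np.fft.fft(c)).real
    return v.reshape(grad.shape)

# a plain SGD step with the smoothed surrogate gradient:
# w -= lr * laplacian_smooth(grad_w, sigma=1.0)
```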
Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity
Title | Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity |
Authors | Christian Koch, Moritz Lode, Denny Stohr, Amr Rizk, Ralf Steinmetz |
Abstract | YouTube is one of the most popular platforms for streaming of user-generated video. Nowadays, professional YouTubers are organized in so-called multi-channel networks (MCNs). These networks offer services such as brand deals, equipment, and strategic advice in exchange for a share of the YouTubers’ revenue. A major strategy to gain more subscribers and, hence, revenue is collaborating with other YouTubers. Yet, collaborations on YouTube have not been studied in a detailed quantitative manner. This paper aims to close this gap with the following contributions. First, we collect a YouTube dataset covering video statistics over three months for 7,942 channels. Second, we design a framework for collaboration detection given a previously unknown number of persons featuring in YouTube videos. We denote this framework for the analysis of collaborations in YouTube videos using a Deep Neural Network (DNN) based approach as CATANA. Third, we analyze about 2.4 years of video content and use CATANA to answer research questions providing guidance for YouTubers and MCNs for efficient collaboration strategies. Thereby, we focus on (i) collaboration frequency and partner selectivity, (ii) the influence of MCNs on channel collaborations, (iii) collaborating channel types, and (iv) the impact of collaborations on video and channel popularity. Our results show that collaborations are in many cases significantly beneficial in terms of viewers and newly attracted subscribers for both collaborating channels, often showing more than 100% popularity growth compared with non-collaboration videos. |
Tasks | |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.01887v1 |
http://arxiv.org/pdf/1805.01887v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborations-on-youtube-from-unsupervised |
Repo | https://github.com/christiannkoch/CATANA |
Framework | none |
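The detection step can be sketched under stated assumptions: embed detected faces with a pretrained DNN, cluster the embeddings with a density-based method so the number of persons need not be known in advance, and flag persons whose clusters span multiple channels. The input layout and the eps/min_samples values below are illustrative, not CATANA's actual configuration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def detect_collaborations(face_embeddings, video_channel):
    """Hypothetical sketch. face_embeddings: (n_faces, d) array of DNN face
    descriptors; video_channel: channel id for the video each face came from.
    Density-based clustering groups faces by person without a preset cluster
    count; a person appearing on several channels suggests a collaboration."""
    labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(face_embeddings)
    collaborations = set()
    for person in set(labels) - {-1}:                 # -1 = noise points
        channels = {video_channel[i] for i in np.where(labels == person)[0]}
        if len(channels) > 1:
            collaborations.add(frozenset(channels))
    return collaborations
```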
Dynamic Meta-Embeddings for Improved Sentence Representations
Title | Dynamic Meta-Embeddings for Improved Sentence Representations |
Authors | Douwe Kiela, Changhan Wang, Kyunghyun Cho |
Abstract | While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. We subsequently show how the technique can be used to shed new light on the usage of word embeddings in NLP systems. |
Tasks | Word Embeddings |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07983v2 |
http://arxiv.org/pdf/1804.07983v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-meta-embeddings-for-improved-sentence |
Repo | https://github.com/kushalchauhan98/dynamic-meta-embeddings |
Framework | pytorch |
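The mechanism is simple enough to sketch in a few lines: project each pretrained embedding set to a common dimension and combine them with per-token attention weights. A minimal PyTorch sketch; the dimensions and the scalar-attention form are assumptions consistent with the abstract.

```python
import torch
import torch.nn as nn

class DynamicMetaEmbedding(nn.Module):
    """Sketch of dynamic meta-embeddings: a learned, per-token weighting
    over several pretrained embedding sets."""
    def __init__(self, emb_dims, proj_dim=256):
        super().__init__()
        self.projs = nn.ModuleList([nn.Linear(d, proj_dim) for d in emb_dims])
        self.attn = nn.Linear(proj_dim, 1)

    def forward(self, embeddings):
        # embeddings: list of (batch, seq_len, emb_dims[i]) tensors, one per set
        projected = torch.stack(
            [p(e) for p, e in zip(self.projs, embeddings)], dim=2)
        # projected: (batch, seq_len, n_sets, proj_dim)
        weights = torch.softmax(self.attn(projected), dim=2)  # weights over sets
        return (weights * projected).sum(dim=2)               # (batch, seq_len, proj_dim)
```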
Stochastic Block Models are a Discrete Surface Tension
Title | Stochastic Block Models are a Discrete Surface Tension |
Authors | Zachary M. Boyd, Mason A. Porter, Andrea L. Bertozzi |
Abstract | Networks, which represent agents and interactions between them, arise in myriad applications throughout the sciences, engineering, and even the humanities. To understand large-scale structure in a network, a common task is to cluster a network’s nodes into sets called “communities”, such that there are dense connections within communities but sparse connections between them. A popular and statistically principled method to perform such clustering is to use a family of generative models known as stochastic block models (SBMs). In this paper, we show that maximum likelihood estimation in an SBM is a network analog of a well-known continuum surface-tension problem that arises from an application in metallurgy. To illustrate the utility of this relationship, we implement network analogs of three surface-tension algorithms, with which we successfully recover planted community structure in synthetic networks and which yield fascinating insights on empirical networks that we construct from hyperspectral videos. |
Tasks | Video Semantic Segmentation |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02485v2 |
http://arxiv.org/pdf/1806.02485v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-block-models-are-a-discrete |
Repo | https://github.com/zboyd2/SBM-surface-tension |
Framework | none |
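For context, the quantity being maximized can be sketched directly: the profile log-likelihood of a node partition under a Bernoulli SBM, with block densities at their maximum-likelihood estimates. The paper's contribution is relating the maximization of this objective to a discrete surface-tension problem; the sketch below only evaluates the objective.

```python
import numpy as np

def sbm_profile_log_likelihood(A, labels, k):
    """Profile log-likelihood of a partition under a Bernoulli SBM, with each
    block-pair probability set to its ML estimate (the observed edge density).
    Assumes A is a symmetric, zero-diagonal numpy adjacency matrix and labels
    is an integer numpy array in {0, ..., k-1}."""
    ll = 0.0
    for r in range(k):
        for s in range(k):
            rows, cols = np.where(labels == r)[0], np.where(labels == s)[0]
            pairs = len(rows) * len(cols) - (len(rows) if r == s else 0)
            edges = A[np.ix_(rows, cols)].sum()
            # if edges == 0 or edges == pairs, the ML term is exactly zero
            if pairs > 0 and 0 < edges < pairs:
                p = edges / pairs
                ll += edges * np.log(p) + (pairs - edges) * np.log(1 - p)
    return ll
```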
Investigating Human Priors for Playing Video Games
Title | Investigating Human Priors for Playing Video Games |
Authors | Rachit Dubey, Pulkit Agrawal, Deepak Pathak, Thomas L. Griffiths, Alexei A. Efros |
Abstract | What makes humans so good at solving seemingly complex video games? Unlike computers, humans bring in a great deal of prior knowledge about the world, enabling efficient decision making. This paper investigates the role of human priors for solving video games. Given a sample game, we conduct a series of ablation studies to quantify the importance of various priors on human performance. We do this by modifying the video game environment to systematically mask different types of visual information that could be used by humans as priors. We find that removal of some prior knowledge causes a drastic degradation in the speed with which human players solve the game, e.g. from 2 minutes to over 20 minutes. Furthermore, our results indicate that general priors, such as the importance of objects and visual consistency, are critical for efficient game-play. Videos and the game manipulations are available at https://rach0012.github.io/humanRL_website/ |
Tasks | Decision Making |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10217v3 |
http://arxiv.org/pdf/1802.10217v3.pdf | |
PWC | https://paperswithcode.com/paper/investigating-human-priors-for-playing-video |
Repo | https://github.com/rach0012/humanRL_prior_games |
Framework | none |
Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting
Title | Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting |
Authors | Xialei Liu, Marc Masana, Luis Herranz, Joost Van de Weijer, Antonio M. Lopez, Andrew D. Bagdanov |
Abstract | In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to other state-of-the-art approaches to lifelong learning without forgetting. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02950v4 |
http://arxiv.org/pdf/1802.02950v4.pdf | |
PWC | https://paperswithcode.com/paper/rotate-your-networks-better-weight |
Repo | https://github.com/xialeiliu/RotateNetworks |
Framework | tf |
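As background, here is a minimal sketch of standard EWC with a diagonal Fisher estimate, the component whose accuracy the proposed rotation improves. The rotation itself is omitted, and all names are illustrative assumptions.

```python
import torch

def diagonal_fisher(model, data_loader, loss_fn):
    """Diagonal Fisher Information estimate: mean squared gradient of the
    loss over the data. EWC assumes this diagonal captures parameter
    importance; the paper's factorized rotation of each layer makes that
    assumption more accurate."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

def ewc_penalty(model, fisher, old_params, lam=1.0):
    """Quadratic penalty anchoring parameters to their post-task values,
    weighted by the diagonal Fisher; added to the new task's loss."""
    return lam / 2 * sum((fisher[n] * (p - old_params[n]) ** 2).sum()
                         for n, p in model.named_parameters())
```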
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Title | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks |
Authors | Jonathan Frankle, Michael Carbin |
Abstract | Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy. However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance. We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the “lottery ticket hypothesis:” dense, randomly-initialized, feed-forward networks contain subnetworks (“winning tickets”) that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective. We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis and the importance of these fortuitous initializations. We consistently find winning tickets that are less than 10-20% of the size of several fully-connected and convolutional feed-forward architectures for MNIST and CIFAR10. Above this size, the winning tickets that we find learn faster than the original network and reach higher test accuracy. |
Tasks | Network Pruning |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03635v5 |
http://arxiv.org/pdf/1803.03635v5.pdf | |
PWC | https://paperswithcode.com/paper/the-lottery-ticket-hypothesis-finding-sparse |
Repo | https://github.com/gcastex/PruNet |
Framework | pytorch |
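The identification procedure described in the abstract, iterative magnitude pruning with rewinding to the original initialization, can be sketched as follows. `train_fn` is an assumed user-supplied training loop, and a full implementation would also zero the gradients of pruned weights during retraining so they stay pruned.

```python
import copy
import torch

def find_winning_ticket(model, train_fn, prune_fraction=0.2, rounds=5):
    """Sketch of iterative magnitude pruning: train, prune the
    smallest-magnitude surviving weights, rewind the survivors to their
    original initialization, and repeat."""
    init_state = copy.deepcopy(model.state_dict())
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}
    for _ in range(rounds):
        train_fn(model)
        for n, p in model.named_parameters():
            alive = p.detach().abs()[masks[n].bool()]
            threshold = alive.quantile(prune_fraction)     # cut lowest fraction of survivors
            masks[n] *= (p.detach().abs() > threshold).float()
        model.load_state_dict(init_state)                  # rewind to the original init
        with torch.no_grad():
            for n, p in model.named_parameters():
                p *= masks[n]                              # apply the sparsity mask
    return model, masks
```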
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction
Title | Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction |
Authors | Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid |
Abstract | Despite learning-based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner. Recent approaches to single view depth estimation explore the possibility of learning without full supervision via minimizing photometric error. In this paper, we explore the use of stereo sequences for learning depth and visual odometry. The use of stereo sequences enables the use of both spatial (between left-right pairs) and temporal (forward-backward) photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale. At test time our framework is able to estimate single view depth and two-view odometry from a monocular sequence. We also show how we can improve on a standard photometric warp loss by considering a warp of deep features. We show through extensive experiments that: (i) jointly training for single view depth and visual odometry improves depth prediction because of the additional constraint imposed on depths and achieves competitive results for visual odometry; (ii) deep feature-based warping loss improves upon simple photometric warp loss for both single view depth estimation and visual odometry. Our method outperforms existing learning-based methods on the KITTI driving dataset in both tasks. The source code is available at https://github.com/Huangying-Zhan/Depth-VO-Feat |
Tasks | Depth And Camera Motion, Depth Estimation, Monocular Depth Estimation, Visual Odometry |
Published | 2018-03-11 |
URL | http://arxiv.org/abs/1803.03893v3 |
http://arxiv.org/pdf/1803.03893v3.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-monocular-depth-1 |
Repo | https://github.com/Huangying-Zhan/Depth-VO-Feat |
Framework | caffe2 |
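The photometric warp loss at the heart of this line of work can be sketched directly: back-project target pixels using the predicted depth, transform them into the source frame with the predicted relative pose, and sample the source image at the projected locations. The paper's feature reconstruction loss applies the same warp to deep feature maps instead of raw pixels. Shapes and variable names below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def photometric_warp_loss(img_src, img_tgt, depth_tgt, K, K_inv, T_tgt_to_src):
    """Sketch: K is the 3x3 intrinsics matrix, T_tgt_to_src a 4x4 relative
    pose, depth_tgt the predicted target-view depth (batch, 1, h, w)."""
    b, _, h, w = img_tgt.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)
    cam = K_inv @ pix * depth_tgt.reshape(b, 1, -1)       # back-project to 3-D
    cam_h = torch.cat([cam, torch.ones(b, 1, h * w)], dim=1)
    src = (T_tgt_to_src @ cam_h)[:, :3]                   # points in source frame
    proj = K @ src
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-6)        # perspective divide
    grid = torch.stack([2 * uv[:, 0] / (w - 1) - 1,       # normalize to [-1, 1]
                        2 * uv[:, 1] / (h - 1) - 1], dim=-1).reshape(b, h, w, 2)
    warped = F.grid_sample(img_src, grid, align_corners=True)
    return (warped - img_tgt).abs().mean()                # L1 photometric error
```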
xView: Objects in Context in Overhead Imagery
Title | xView: Objects in Context in Overhead Imagery |
Authors | Darius Lam, Richard Kuzma, Kevin McGee, Samuel Dooley, Michael Laielli, Matthew Klaric, Yaroslav Bulatov, Brendan McCord |
Abstract | We introduce a new large-scale dataset for the advancement of object detection techniques and overhead object detection research. This satellite imagery dataset enables research progress pertaining to four key computer vision frontiers. We utilize a novel process for geospatial category detection and bounding box annotation with three stages of quality control. Our data is collected from WorldView-3 satellites at 0.3m ground sample distance, providing higher resolution imagery than most public satellite imagery datasets. We compare xView to other object detection datasets in both natural and overhead imagery domains and then provide a baseline analysis using the Single Shot MultiBox Detector. xView is one of the largest and most diverse publicly available object-detection datasets to date, with over 1 million objects across 60 classes in over 1,400 km^2 of imagery. |
Tasks | Object Detection |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07856v1 |
http://arxiv.org/pdf/1802.07856v1.pdf | |
PWC | https://paperswithcode.com/paper/xview-objects-in-context-in-overhead-imagery |
Repo | https://github.com/dellemc-hpc-ai/satellite_imagery_demo |
Framework | none |
Patch-based Progressive 3D Point Set Upsampling
Title | Patch-based Progressive 3D Point Set Upsampling |
Authors | Wang Yifan, Shihao Wu, Hui Huang, Daniel Cohen-Or, Olga Sorkine-Hornung |
Abstract | We present a detail-driven deep neural network for point set upsampling. A high-resolution point set is essential for point-based rendering and surface reconstruction. Inspired by the recent success of neural image super-resolution techniques, we progressively train a cascade of patch-based upsampling networks on different levels of detail end-to-end. We propose a series of architectural design contributions that lead to a substantial performance boost. The effect of each technical contribution is demonstrated in an ablation study. Qualitative and quantitative experiments show that our method significantly outperforms the state-of-the-art learning-based and optimization-based approaches, both in terms of handling low-resolution inputs and revealing high-fidelity details. |
Tasks | Point Cloud Super Resolution, Point Set Upsampling, Super-Resolution |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11286v3 |
http://arxiv.org/pdf/1811.11286v3.pdf | |
PWC | https://paperswithcode.com/paper/patch-based-progressive-3d-point-set |
Repo | https://github.com/yifita/3PU |
Framework | tf |
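A heavily simplified sketch of the progressive idea: a cascade of small networks, each doubling the point count, trained end-to-end across levels of detail. The per-point MLP below is a toy stand-in for the paper's detail-driven upsampling unit, not its actual architecture.

```python
import torch
import torch.nn as nn

class UpsampleUnit(nn.Module):
    """Toy 2x upsampling step: an MLP predicts two offsets per input point."""
    def __init__(self, feat=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, feat), nn.ReLU(),
                                 nn.Linear(feat, 2 * 3))

    def forward(self, pts):                 # pts: (batch, n, 3)
        offsets = self.mlp(pts).reshape(pts.size(0), -1, 2, 3)
        return (pts.unsqueeze(2) + offsets).reshape(pts.size(0), -1, 3)

class ProgressiveUpsampler(nn.Module):
    """Cascade of units, each doubling the point count, in the spirit of the
    paper's progressive multi-level-of-detail training (3 levels -> 8x)."""
    def __init__(self, levels=3):
        super().__init__()
        self.units = nn.ModuleList([UpsampleUnit() for _ in range(levels)])

    def forward(self, pts):
        for unit in self.units:
            pts = unit(pts)
        return pts
```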
A Quantitative Analysis of Multi-Winner Rules
Title | A Quantitative Analysis of Multi-Winner Rules |
Authors | Martin Lackner, Piotr Skowron |
Abstract | Choosing a suitable multi-winner voting rule is a hard and ambiguous task: what constitutes an “optimal” subset of alternatives varies widely with the context. In this paper, we offer a new perspective on measuring the quality of such subsets and—consequently—of multi-winner rules. We provide a quantitative analysis using methods from the theory of approximation algorithms and estimate how well multi-winner rules approximate two extreme objectives: a representation criterion defined via the Approval Chamberlin–Courant rule and a utilitarian criterion defined via Multi-winner Approval Voting. With both theoretical and experimental methods we classify multi-winner rules in terms of their quantitative alignment with these two opposing objectives. Our results provide fundamental information about the nature of multi-winner rules, and in particular about the necessary tradeoffs when choosing such a rule. |
Tasks | |
Published | 2018-01-04 |
URL | https://arxiv.org/abs/1801.01527v3 |
https://arxiv.org/pdf/1801.01527v3.pdf | |
PWC | https://paperswithcode.com/paper/a-quantitative-analysis-of-multi-winner-rules |
Repo | https://github.com/martinlackner/abcvoting |
Framework | none |
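The two extreme objectives are easy to make concrete. Below is a minimal sketch of the utilitarian Multi-winner Approval Voting score, the Approval Chamberlin–Courant representation score, and the classic greedy $(1-1/e)$-approximation of the latter; the linked abcvoting repository implements the actual rules studied in the paper.

```python
def av_score(committee, profile):
    """Multi-winner Approval Voting (utilitarian) score: total number of
    approved committee members across all voters."""
    return sum(len(ballot & committee) for ballot in profile)

def cc_score(committee, profile):
    """Approval Chamberlin-Courant (representation) score: number of voters
    with at least one approved candidate in the committee."""
    return sum(1 for ballot in profile if ballot & committee)

def greedy_cc(profile, candidates, k):
    """Greedy maximum-coverage approximation of Chamberlin-Courant:
    repeatedly add the candidate covering the most unrepresented voters."""
    committee = set()
    for _ in range(k):
        best = max(candidates - committee,
                   key=lambda c: cc_score(committee | {c}, profile))
        committee.add(best)
    return committee

# toy profile: each ballot is the set of candidates that voter approves
profile = [{"a", "b"}, {"a"}, {"c"}, {"b", "c"}, {"d"}]
print(greedy_cc(profile, {"a", "b", "c", "d"}, k=2))
```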