October 20, 2019

3108 words 15 mins read

Paper Group AWR 213

Dual CNN Models for Unsupervised Monocular Depth Estimation. StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction. MVDepthNet: Real-time Multiview Depth Estimation Neural Network. Boosting Binary Optimization via Binary Classification: A Case Study of Job Shop Scheduling. A Recurrent Neural Network for Sentiment Quant …

Dual CNN Models for Unsupervised Monocular Depth Estimation

Title Dual CNN Models for Unsupervised Monocular Depth Estimation
Authors Vamshi Krishna Repala, Shiv Ram Dubey
Abstract Unsupervised depth estimation is a recent trend that uses binocular stereo images to remove the need for ground-truth depth maps. In unsupervised depth computation, the disparity images are generated by training a CNN with an image reconstruction loss. In this paper, a dual CNN based model is presented for unsupervised depth estimation with 6 losses (DNM6), with an individual CNN for each view to generate the corresponding disparity map. The proposed dual CNN model is also extended with 12 losses (DNM12) by utilizing the cross disparities. The presented DNM6 and DNM12 models are evaluated on the KITTI driving and Cityscapes urban databases and compared with recent state-of-the-art results for unsupervised depth estimation. The code is available at: https://github.com/ishmav16/Dual-CNN-Models-for-Unsupervised-Monocular-Depth-Estimation.
Tasks Depth Estimation, Image Reconstruction, Monocular Depth Estimation
Published 2018-04-16
URL https://arxiv.org/abs/1804.06324v4
PDF https://arxiv.org/pdf/1804.06324v4.pdf
PWC https://paperswithcode.com/paper/dual-cnn-models-for-unsupervised-monocular
Repo https://github.com/ishmav16/Dual-CNN-Models-for-Unsupervised-Monocular-Depth-Estimation
Framework tf
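
A minimal sketch, not the authors’ DNM6/DNM12 code, of the kind of image reconstruction loss that drives such unsupervised training; it assumes PyTorch, left/right image tensors of shape (batch, 3, H, W), and a predicted left-view disparity in pixels.

```python
# Illustrative photometric reconstruction loss for unsupervised stereo training
# (a sketch, not the paper's exact loss terms).
import torch
import torch.nn.functional as F

def warp_right_to_left(right_img, left_disparity):
    """Sample the right image at x - d(x) to reconstruct the left view."""
    b, _, h, w = right_img.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=right_img.device),
                            torch.arange(w, device=right_img.device),
                            indexing="ij")
    xs = xs.float().unsqueeze(0).expand(b, -1, -1) - left_disparity.squeeze(1)
    ys = ys.float().unsqueeze(0).expand(b, -1, -1)
    # grid_sample expects coordinates normalised to [-1, 1]
    grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
    return F.grid_sample(right_img, grid, align_corners=True)

def reconstruction_loss(left_img, right_img, left_disparity):
    # L1 difference between the left image and its reconstruction from the right
    return torch.abs(left_img - warp_right_to_left(right_img, left_disparity)).mean()
```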

StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction

Title StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction
Authors Sameh Khamis, Sean Fanello, Christoph Rhemann, Adarsh Kowdle, Julien Valentin, Shahram Izadi
Abstract This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVidia Titan X, producing high-quality, edge-preserved, quantization-free disparity maps. A key insight of this paper is that the network achieves a sub-pixel matching precision that is an order of magnitude higher than that of traditional stereo matching approaches. This allows us to achieve real-time performance by using a very low resolution cost volume that encodes all the information needed to achieve high disparity precision. Spatial precision is achieved by employing a learned edge-aware upsampling function. Our model uses a Siamese network to extract features from the left and right images. A first estimate of the disparity is computed in a very low resolution cost volume, then the model hierarchically re-introduces high-frequency details through a learned upsampling function that uses compact pixel-to-pixel refinement networks. Leveraging color input as a guide, this function is capable of producing high-quality edge-aware output. We achieve compelling results on multiple benchmarks, showing how the proposed method offers extreme flexibility at an acceptable computational budget.
Tasks Depth Estimation, Quantization, Stereo Matching, Stereo Matching Hand
Published 2018-07-24
URL http://arxiv.org/abs/1807.08865v1
PDF http://arxiv.org/pdf/1807.08865v1.pdf
PWC https://paperswithcode.com/paper/stereonet-guided-hierarchical-refinement-for
Repo https://github.com/meteorshowers/StereoNet
Framework pytorch
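
A rough sketch of the coarse-to-fine idea the abstract describes: a disparity estimated at low resolution is upsampled and corrected by a small colour-guided refinement network. Assumes PyTorch; the layer sizes and refinement design here are illustrative, not the published StereoNet architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRefinement(nn.Module):
    """Illustrative refinement block: predicts a disparity residual
    guided by the colour image (not the published StereoNet layers)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, coarse_disp, image):
        # Upsample the coarse disparity to image resolution, rescale its pixel
        # units, then add a learned residual that re-introduces edge details.
        up = F.interpolate(coarse_disp, size=image.shape[-2:],
                           mode="bilinear", align_corners=True)
        up = up * (image.shape[-1] / coarse_disp.shape[-1])
        return up + self.net(torch.cat([up, image], dim=1))
```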

MVDepthNet: Real-time Multiview Depth Estimation Neural Network

Title MVDepthNet: Real-time Multiview Depth Estimation Neural Network
Authors Kaixuan Wang, Shaojie Shen
Abstract Although deep neural networks have been widely applied to computer vision problems, extending them into multiview depth estimation is non-trivial. In this paper, we present MVDepthNet, a convolutional network to solve the depth estimation problem given several image-pose pairs from a localized monocular camera in neighbor viewpoints. Multiview observations are encoded in a cost volume and then combined with the reference image to estimate the depth map using an encoder-decoder network. By encoding the information from multiview observations into the cost volume, our method achieves real-time performance and the flexibility of traditional methods that can be applied regardless of the camera intrinsic parameters and the number of images. Geometric data augmentation is used to train MVDepthNet. We further apply MVDepthNet in a monocular dense mapping system that continuously estimates depth maps using a single localized moving camera. Experiments show that our method can generate depth maps efficiently and precisely.
Tasks Data Augmentation, Depth Estimation
Published 2018-07-23
URL http://arxiv.org/abs/1807.08563v1
PDF http://arxiv.org/pdf/1807.08563v1.pdf
PWC https://paperswithcode.com/paper/mvdepthnet-real-time-multiview-depth
Repo https://github.com/HKUST-Aerial-Robotics/MVDepthNet
Framework pytorch
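
A simplified sketch of a multiview (plane-sweep style) cost volume of the kind described above, assuming NumPy and a hypothetical `warp_to_reference` helper that applies the known intrinsics and relative pose for each depth hypothesis; the real MVDepthNet feeds such a volume, together with the reference image, into an encoder-decoder network.

```python
import numpy as np

def build_cost_volume(ref_img, src_imgs, depth_hypotheses, warp_to_reference):
    """Illustrative cost volume: for each sampled depth, warp every source
    image into the reference view and accumulate an absolute-difference cost.
    `warp_to_reference(src, depth)` is a hypothetical helper that applies the
    known camera intrinsics and relative pose for that depth plane."""
    h, w, _ = ref_img.shape
    cost = np.zeros((len(depth_hypotheses), h, w), dtype=np.float32)
    for d, depth in enumerate(depth_hypotheses):
        for src in src_imgs:
            warped = warp_to_reference(src, depth)
            cost[d] += np.abs(ref_img - warped).mean(axis=-1)
    return cost / len(src_imgs)
```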

Boosting Binary Optimization via Binary Classification: A Case Study of Job Shop Scheduling

Title Boosting Binary Optimization via Binary Classification: A Case Study of Job Shop Scheduling
Authors Oleg V. Shylo, Hesam Shams
Abstract Many optimization techniques evaluate solutions consecutively, where the next candidate for evaluation is determined by the results of previous evaluations. For example, these include iterative methods, “black box” optimization algorithms, simulated annealing, evolutionary algorithms and tabu search, to name a few. When solving an optimization problem, these algorithms evaluate a large number of solutions, which raises the following question: Is it possible to learn something about the optimum using these solutions? In this paper, we define this “learning” question in terms of a logistic regression model and explore its predictive accuracy computationally. The proposed model uses a collection of solutions to predict the components of the optimal solutions. To illustrate the utility of such predictions, we embed the logistic regression model into the tabu search algorithm for the job shop scheduling problem. The resulting framework is simple to implement, yet provides a significant boost to the performance of the standard tabu search.
Tasks
Published 2018-08-31
URL http://arxiv.org/abs/1808.10813v1
PDF http://arxiv.org/pdf/1808.10813v1.pdf
PWC https://paperswithcode.com/paper/boosting-binary-optimization-via-binary
Repo https://github.com/quasiquasar/gta-jobshop-data
Framework none
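
To make the “learn from evaluated solutions” idea concrete, a toy sketch that fits one logistic regression per binary variable on hypothetical (solution, objective) pairs and extrapolates to the best objective seen; this is an illustration of the general idea, not the authors’ exact model or its integration into tabu search.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical history of evaluated binary solutions and their objective values
# (lower is better), e.g. collected during a tabu-search run.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 40))
obj = X @ rng.random(40) + rng.normal(0, 0.1, 500)

predictions = []
for j in range(X.shape[1]):
    # Model P(x_j = 1 | objective value) and extrapolate to the best objective
    # observed so far; variables with high predicted probability could then be
    # favoured when biasing search moves.
    clf = LogisticRegression().fit(obj.reshape(-1, 1), X[:, j])
    predictions.append(clf.predict_proba([[obj.min()]])[0, 1])

print(np.round(predictions, 2))
```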

A Recurrent Neural Network for Sentiment Quantification

Title A Recurrent Neural Network for Sentiment Quantification
Authors Andrea Esuli, Alejandro Moreo Fernández, Fabrizio Sebastiani
Abstract Quantification is a supervised learning task that consists in predicting, given a set of classes C and a set D of unlabelled items, the prevalence (or relative frequency) p(c|D) of each class c in C. Quantification can in principle be solved by classifying all the unlabelled items and counting how many of them have been attributed to each class. However, this “classify and count” approach has been shown to yield suboptimal quantification accuracy; this has established quantification as a task of its own, and given rise to a number of methods specifically devised for it. We propose a recurrent neural network architecture for quantification (that we call QuaNet) that observes the classification predictions to learn higher-order “quantification embeddings”, which are then refined by incorporating quantification predictions of simple classify-and-count-like methods. We test QuaNet on sentiment quantification on text, showing that it substantially outperforms several state-of-the-art baselines.
Tasks
Published 2018-09-04
URL http://arxiv.org/abs/1809.00836v1
PDF http://arxiv.org/pdf/1809.00836v1.pdf
PWC https://paperswithcode.com/paper/a-recurrent-neural-network-for-sentiment
Repo https://github.com/HLT-ISTI/QuaNet
Framework none
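
For context, a minimal sketch of the classify-and-count baseline mentioned in the abstract, plus its standard “adjusted” variant, for the binary case; assumes NumPy and hypothetical classifier outputs.

```python
import numpy as np

def classify_and_count(pred_labels):
    """Prevalence estimate = fraction of items classified positive."""
    return np.mean(pred_labels)

def adjusted_classify_and_count(pred_labels, tpr, fpr):
    """Correct the raw count with the classifier's true/false positive rates
    estimated on held-out data: p = (cc - fpr) / (tpr - fpr)."""
    cc = classify_and_count(pred_labels)
    return np.clip((cc - fpr) / (tpr - fpr), 0.0, 1.0)

# Hypothetical predictions from a sentiment classifier over unlabelled items
preds = np.array([1, 0, 1, 1, 0, 0, 1, 0])
print(classify_and_count(preds))                     # 0.5
print(adjusted_classify_and_count(preds, 0.8, 0.2))  # 0.5
```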

Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming

Title Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming
Authors Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos
Abstract We design a new myopic strategy for a wide class of sequential design of experiment (DOE) problems, where the goal is to collect data in order to fulfil a certain problem-specific goal. Our approach, Myopic Posterior Sampling (MPS), is inspired by the classical posterior (Thompson) sampling algorithm for multi-armed bandits and leverages the flexibility of probabilistic programming and approximate Bayesian inference to address a broad set of problems. Empirically, this general-purpose strategy is competitive with more specialised methods in a wide array of DOE tasks, and more importantly, enables addressing complex DOE goals where no existing method seems applicable. On the theoretical side, we leverage ideas from adaptive submodularity and reinforcement learning to derive conditions under which MPS achieves sublinear regret against natural benchmark policies.
Tasks Bayesian Inference, Multi-Armed Bandits, Probabilistic Programming
Published 2018-05-25
URL http://arxiv.org/abs/1805.09964v1
PDF http://arxiv.org/pdf/1805.09964v1.pdf
PWC https://paperswithcode.com/paper/myopic-bayesian-design-of-experiments-via
Repo https://github.com/kirthevasank/mps
Framework none
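
A generic sketch of the myopic posterior-sampling loop: draw one model from the posterior, pick the experiment that looks best under that draw, observe, update. `model`, `utility`, and `run_experiment` are hypothetical hooks; MPS’s goal-specific utilities and probabilistic-programming machinery are not shown.

```python
# Generic myopic posterior-sampling loop (a sketch, not the authors' code).
# `model.sample_posterior()` and `utility(theta, x)` are hypothetical hooks:
# the first draws one plausible world from the current posterior, the second
# scores how much experiment x would advance the DOE goal in that world.
def myopic_posterior_sampling(model, designs, utility, n_rounds, run_experiment):
    history = []
    for _ in range(n_rounds):
        theta = model.sample_posterior()                   # Thompson-style draw
        x = max(designs, key=lambda d: utility(theta, d))  # act greedily on it
        y = run_experiment(x)                              # collect the data
        model.update(x, y)                                 # posterior update
        history.append((x, y))
    return history
```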

Deep Counterfactual Regret Minimization

Title Deep Counterfactual Regret Minimization
Authors Noam Brown, Adam Lerer, Sam Gross, Tuomas Sandholm
Abstract Counterfactual Regret Minimization (CFR) is the leading framework for solving large imperfect-information games. It converges to an equilibrium by iteratively traversing the game tree. In order to deal with extremely large games, abstraction is typically applied before running CFR. The abstracted game is solved with tabular CFR, and its solution is mapped back to the full game. This process can be problematic because aspects of abstraction are often manual and domain specific, abstraction algorithms may miss important strategic nuances of the game, and there is a chicken-and-egg problem because determining a good abstraction requires knowledge of the equilibrium of the game. This paper introduces Deep Counterfactual Regret Minimization, a form of CFR that obviates the need for abstraction by instead using deep neural networks to approximate the behavior of CFR in the full game. We show that Deep CFR is principled and achieves strong performance in large poker games. This is the first non-tabular variant of CFR to be successful in large games.
Tasks
Published 2018-11-01
URL https://arxiv.org/abs/1811.00164v3
PDF https://arxiv.org/pdf/1811.00164v3.pdf
PWC https://paperswithcode.com/paper/deep-counterfactual-regret-minimization
Repo https://github.com/TopologicLogic/CFRM-ES-CFU-XGBoost
Framework none
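
For background, a minimal sketch of regret matching, the tabular per-infoset update at the core of CFR that Deep CFR replaces with neural-network approximation; assumes NumPy.

```python
import numpy as np

def regret_matching(cumulative_regrets):
    """Turn cumulative counterfactual regrets into a strategy:
    play actions in proportion to their positive regret, or uniformly
    if no action has positive regret."""
    positive = np.maximum(cumulative_regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(cumulative_regrets, 1.0 / len(cumulative_regrets))

print(regret_matching(np.array([3.0, -1.0, 1.0])))  # [0.75, 0.0, 0.25]
```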

Probabilistic FastText for Multi-Sense Word Embeddings

Title Probabilistic FastText for Multi-Sense Word Embeddings
Authors Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar
Abstract We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of n-grams. This representation allows the model to share statistical strength across sub-word structures (e.g. Latin roots), producing accurate representations of rare, misspelt, or even unseen words. Moreover, each component of the mixture can capture a different word sense. Probabilistic FastText outperforms both FastText, which has no probabilistic model, and dictionary-level probabilistic embeddings, which do not incorporate subword structures, on several word-similarity benchmarks, including English RareWord and foreign language datasets. We also achieve state-of-the-art performance on benchmarks that measure ability to discern different meanings. Thus, the proposed model is the first to achieve multi-sense representations while having enriched semantics on rare words.
Tasks Word Embeddings
Published 2018-06-07
URL http://arxiv.org/abs/1806.02901v1
PDF http://arxiv.org/pdf/1806.02901v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-fasttext-for-multi-sense-word
Repo https://github.com/benathi/multisense-prob-fasttext
Framework none
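
A small sketch of the sub-word composition described above: the mean of a mixture component is built as the sum of the word’s character n-gram vectors, so rare or misspelt words still receive informative representations. The hashed embedding table and its size are hypothetical; the released code is linked in the entry above.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

# Hypothetical hashed n-gram embedding table (the real model learns these).
rng = np.random.default_rng(0)
table = rng.normal(size=(100_000, 50))

def component_mean(word):
    """Mean of one Gaussian mixture component: sum of the word's n-gram vectors."""
    idx = [hash(g) % table.shape[0] for g in char_ngrams(word)]
    return table[idx].sum(axis=0)

print(component_mean("misspelt").shape)  # (50,)
```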

Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code

Title Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code
Authors Nghi D. Q. Bui, Lingxiao Jiang
Abstract Translating a program written in one programming language to another can be useful for software development tasks that need functionality implementations in different languages. Although past studies have considered this problem, they may be either specific to the language grammars, or specific to certain kinds of code elements (e.g., tokens, phrases, API uses). This paper proposes a new approach to automatically learn cross-language representations for various kinds of structural code elements that may be used for program translation. Our key idea is two-fold: First, we normalize and enrich code token streams with additional structural and semantic information, and train cross-language vector representations for the tokens (a.k.a. shared embeddings) based on word2vec, a neural-network-based technique for producing word embeddings; Second, hierarchically from the bottom up, we construct shared embeddings for code elements of higher levels of granularity (e.g., expressions, statements, methods) from the embeddings of their constituents, and then build mappings among code elements across languages based on similarities among embeddings. Our preliminary evaluations on about 40,000 Java and C# source files from 9 software projects show that our approach can automatically learn shared embeddings for various code elements in different languages and identify their cross-language mappings with reasonable Mean Average Precision scores. When compared with an existing tool for mapping library API methods, our approach identifies many more mappings accurately. The mapping results and code can be accessed at https://github.com/bdqnghi/hierarchical-programming-language-mapping. We believe that our idea of learning cross-language vector representations with code structural information can be a useful step towards automated program translation.
Tasks Word Embeddings
Published 2018-03-13
URL http://arxiv.org/abs/1803.04715v1
PDF http://arxiv.org/pdf/1803.04715v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-learning-of-cross-language
Repo https://github.com/bdqnghi/hierarchical-programming-language-mapping
Framework none
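
A toy sketch of the bottom-up composition and cross-language matching idea, assuming token embeddings have already been trained (e.g. with word2vec) and using a simple mean as the composition function; the paper’s normalization, enrichment, and composition steps are richer than this.

```python
import numpy as np

def compose(child_vectors):
    """Illustrative composition: embed a higher-level code element
    (expression, statement, method) as the mean of its constituents."""
    return np.mean(child_vectors, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_cross_language_match(java_element, csharp_elements):
    """Map a Java element to the most similar C# element by cosine similarity."""
    return max(csharp_elements.items(), key=lambda kv: cosine(java_element, kv[1]))

# Hypothetical pre-trained token embeddings
rng = np.random.default_rng(1)
java_stmt = compose(rng.normal(size=(4, 64)))
cs_pool = {"stmt_a": rng.normal(size=64),
           "stmt_b": java_stmt + 0.01 * rng.normal(size=64)}
print(best_cross_language_match(java_stmt, cs_pool)[0])  # "stmt_b"
```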

Theory and Experiments on Vector Quantized Autoencoders

Title Theory and Experiments on Vector Quantized Autoencoders
Authors Aurko Roy, Ashish Vaswani, Arvind Neelakantan, Niki Parmar
Abstract Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks. There has been a surge in interest in discrete latent variable models, however, despite several recent improvements, the training of discrete latent variable models has remained challenging and their performance has mostly failed to match their continuous counterparts. Recent work on vector quantized autoencoders (VQ-VAE) has made substantial progress in this direction, with its perplexity almost matching that of a VAE on datasets such as CIFAR-10. In this work, we investigate an alternate training technique for VQ-VAE, inspired by its connection to the Expectation Maximization (EM) algorithm. Training the discrete bottleneck with EM helps us achieve better image generation results on CIFAR-10, and together with knowledge distillation, allows us to develop a non-autoregressive machine translation model whose accuracy almost matches a strong greedy autoregressive baseline Transformer, while being 3.3 times faster at inference.
Tasks Image Generation, Latent Variable Models, Machine Translation
Published 2018-05-28
URL http://arxiv.org/abs/1805.11063v2
PDF http://arxiv.org/pdf/1805.11063v2.pdf
PWC https://paperswithcode.com/paper/theory-and-experiments-on-vector-quantized
Repo https://github.com/jaywalnut310/Vector-Quantized-Autoencoders
Framework tf
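
An illustrative sketch of an EM-style (here, hard-EM, k-means-like) update of the discrete bottleneck’s codebook: assign encoder outputs to their nearest codes, then move each code to the mean of its assignments. Assumes NumPy; the paper’s actual training procedure differs in detail.

```python
import numpy as np

def em_codebook_step(encoder_outputs, codebook):
    """One hard-EM step on the discrete bottleneck (illustrative only).
    E-step: assign each encoder output to its nearest code.
    M-step: move each code to the mean of its assigned outputs."""
    dists = ((encoder_outputs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    new_codebook = codebook.copy()
    for k in range(codebook.shape[0]):
        members = encoder_outputs[assign == k]
        if len(members) > 0:
            new_codebook[k] = members.mean(axis=0)
    return assign, new_codebook

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 16))    # hypothetical encoder outputs
codes = rng.normal(size=(8, 16))  # hypothetical codebook of 8 entries
assign, codes = em_codebook_step(z, codes)
```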

Traffic Signs in the Wild: Highlights from the IEEE Video and Image Processing Cup 2017 Student Competition [SP Competitions]

Title Traffic Signs in the Wild: Highlights from the IEEE Video and Image Processing Cup 2017 Student Competition [SP Competitions]
Authors Dogancan Temel, Ghassan AlRegib
Abstract Robust and reliable traffic sign detection is necessary to bring autonomous vehicles onto our roads. State-of-the-art algorithms successfully perform traffic sign detection over existing databases that mostly lack severe challenging conditions. The VIP Cup 2017 competition focused on detecting such traffic signs under challenging conditions. To facilitate this task and the competition, we introduced a video dataset denoted as CURE-TSD that includes a variety of challenging conditions. The goal of this challenge was to implement traffic sign detection algorithms that can robustly perform under such challenging conditions. In this article, we share an overview of the VIP Cup 2017 experience, including the competition setup, teams, technical approaches, participation statistics, and the competition experience through finalist team members’ and organizers’ eyes.
Tasks Autonomous Vehicles
Published 2018-10-15
URL http://arxiv.org/abs/1810.06169v2
PDF http://arxiv.org/pdf/1810.06169v2.pdf
PWC https://paperswithcode.com/paper/traffic-signs-in-the-wild-highlights-from-the
Repo https://github.com/olivesgatech/CURE-TSR
Framework pytorch

DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images

Title DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images
Authors Ilke Demir, Krzysztof Koperski, David Lindenbaum, Guan Pang, Jing Huang, Saikat Basu, Forest Hughes, Devis Tuia, Ramesh Raskar
Abstract We present the DeepGlobe 2018 Satellite Image Understanding Challenge, which includes three public competitions for segmentation, detection, and classification tasks on satellite images. Similar to other challenges in the computer vision domain such as DAVIS and COCO, DeepGlobe proposes three datasets and corresponding evaluation methodologies, coherently bundled in three competitions with a dedicated workshop co-located with CVPR 2018. We observed that satellite imagery is a rich and structured source of information, yet it is less investigated than everyday images by computer vision researchers. However, bridging modern computer vision with remote sensing data analysis could have a critical impact on the way we understand our environment and lead to major breakthroughs in global urban planning or climate change research. Keeping this bridging objective in mind, DeepGlobe aims to bring together researchers from different domains to raise awareness of remote sensing in the computer vision community and vice versa. We aim to improve and evaluate state-of-the-art satellite image understanding approaches, which can hopefully serve as reference benchmarks for future research on the same topic. In this paper, we analyze characteristics of each dataset, define the evaluation criteria of the competitions, and provide baselines for each task.
Tasks
Published 2018-05-17
URL http://arxiv.org/abs/1805.06561v1
PDF http://arxiv.org/pdf/1805.06561v1.pdf
PWC https://paperswithcode.com/paper/deepglobe-2018-a-challenge-to-parse-the-earth
Repo https://github.com/chenwydj/ultra_high_resolution_segmentation
Framework pytorch

PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding

Title PartNet: A Large-scale Benchmark for Fine-grained and Hierarchical Part-level 3D Object Understanding
Authors Kaichun Mo, Shilin Zhu, Angel X. Chang, Li Yi, Subarna Tripathi, Leonidas J. Guibas, Hao Su
Abstract We present PartNet: a consistent, large-scale dataset of 3D objects annotated with fine-grained, instance-level, and hierarchical 3D part information. Our dataset consists of 573,585 part instances over 26,671 3D models covering 24 object categories. This dataset enables and serves as a catalyst for many tasks such as shape analysis, dynamic 3D scene modeling and simulation, affordance analysis, and others. Using our dataset, we establish three benchmarking tasks for evaluating 3D part recognition: fine-grained semantic segmentation, hierarchical semantic segmentation, and instance segmentation. We benchmark four state-of-the-art 3D deep learning algorithms for fine-grained semantic segmentation and three baseline methods for hierarchical semantic segmentation. We also propose a novel method for part instance segmentation and demonstrate its superior performance over existing methods.
Tasks 3D Object Understanding, Instance Segmentation, Semantic Segmentation
Published 2018-12-06
URL http://arxiv.org/abs/1812.02713v1
PDF http://arxiv.org/pdf/1812.02713v1.pdf
PWC https://paperswithcode.com/paper/partnet-a-large-scale-benchmark-for-fine
Repo https://github.com/daerduoCarey/partnet_anno_system
Framework none

Non-linear Attributed Graph Clustering by Symmetric NMF with PU Learning

Title Non-linear Attributed Graph Clustering by Symmetric NMF with PU Learning
Authors Seiji Maekawa, Koh Takeuchi, Makoto Onizuka
Abstract We consider the clustering problem of attributed graphs. Our challenge is how to design an effective and efficient clustering method that precisely captures the hidden relationship between the topology and the attributes in real-world graphs. We propose Non-linear Attributed Graph Clustering by Symmetric Non-negative Matrix Factorization with Positive Unlabeled Learning. The features of our method are three-fold: 1) it learns a non-linear projection function between the different cluster assignments of the topology and the attributes of graphs, so as to capture the complicated relationship between the topology and the attributes in real-world graphs; 2) it leverages positive unlabeled learning to take the effect of partially observed positive edges into the cluster assignment; and 3) it achieves efficient computational complexity, $O((n^2+mn)kt)$, where $n$ is the vertex size, $m$ is the attribute size, $k$ is the number of clusters, and $t$ is the number of iterations for learning the cluster assignment. We conducted extensive experiments with various clustering methods on various real datasets to validate that our method outperforms previous clustering methods in clustering quality.
Tasks Graph Clustering
Published 2018-09-21
URL http://arxiv.org/abs/1810.00946v1
PDF http://arxiv.org/pdf/1810.00946v1.pdf
PWC https://paperswithcode.com/paper/non-linear-attributed-graph-clustering-by
Repo https://github.com/seijimaekawa/NAGC
Framework none
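
For reference, a minimal symmetric NMF sketch with damped multiplicative updates on an adjacency matrix, assuming NumPy; the paper’s NAGC model additionally factorizes attributes, learns a non-linear projection between the two cluster assignments, and applies PU-learning weights, none of which are shown here.

```python
import numpy as np

def symmetric_nmf(A, k, n_iter=200, beta=0.5, eps=1e-9, seed=0):
    """Factorise a non-negative adjacency matrix A ≈ H @ H.T with H >= 0 using
    damped multiplicative updates; rows of H act as soft cluster assignments
    (illustrative only, not the paper's full NAGC model)."""
    rng = np.random.default_rng(seed)
    H = np.abs(rng.normal(size=(A.shape[0], k)))
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps
        H *= (1.0 - beta) + beta * numer / denom
    return H

# Hypothetical symmetric adjacency matrix of a small graph
A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
H = symmetric_nmf(A, k=2)
print(H.argmax(axis=1))  # cluster labels
```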

Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation

Title Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
Authors Shani Gamrian, Yoav Goldberg
Abstract Despite the remarkable success of Deep RL in learning control policies from raw pixels, the resulting models do not generalize. We demonstrate that a trained agent fails completely when facing small visual changes, and that fine-tuning—the common transfer learning paradigm—fails to adapt to these changes, to the extent that it is faster to re-train the model from scratch. We show that by separating the visual transfer task from the control policy we achieve substantially better sample efficiency and transfer behavior, allowing an agent trained on the source task to transfer well to the target tasks. The visual mapping from the target to the source domain is performed using unaligned GANs, resulting in a control policy that can be further improved using imitation learning from imperfect demonstrations. We demonstrate the approach on synthetic visual variants of the Breakout game, as well as on transfer between subsequent levels of Road Fighter, a Nintendo car-driving game. A visualization of our approach can be seen in https://youtu.be/4mnkzYyXMn4 and https://youtu.be/KCGTrQi6Ogo .
Tasks Image-to-Image Translation, Imitation Learning, Transfer Learning
Published 2018-05-31
URL https://arxiv.org/abs/1806.07377v6
PDF https://arxiv.org/pdf/1806.07377v6.pdf
PWC https://paperswithcode.com/paper/transfer-learning-for-related-reinforcement
Repo https://github.com/PeteXC/Transfer-Learning
Framework pytorch
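
A sketch of the deployment-time flow the abstract describes: each frame from the target visual variant is translated back into the source domain by the unaligned-GAN generator before the source-trained policy acts on it. `env`, `generator`, and `policy` are hypothetical stand-ins for trained components.

```python
# Sketch of the transfer-time loop (hypothetical `env`, `generator`, `policy`):
# the agent never sees target-domain pixels directly; every frame is first
# translated into the source domain the policy was trained on.
def run_transferred_policy(env, generator, policy, n_steps):
    obs = env.reset()
    total_reward = 0.0
    for _ in range(n_steps):
        source_like_obs = generator(obs)   # GAN maps target -> source visuals
        action = policy(source_like_obs)   # policy trained on the source task
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            obs = env.reset()
    return total_reward
```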