October 20, 2019

Paper Group AWR 353

Incremental Few-Shot Learning with Attention Attractor Networks. CoT: Cooperative Training for Generative Modeling of Discrete Data. Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm. Approximating Poker Probabilities with Deep Learning. Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic S …

Incremental Few-Shot Learning with Attention Attractor Networks


Title	Incremental Few-Shot Learning with Attention Attractor Networks
Authors	Mengye Ren, Renjie Liao, Ethan Fetaya, Richard S. Zemel
Abstract	Machine learning classifiers are often trained to recognize a set of pre-defined classes. However, in many applications, it is often desirable to have the flexibility of learning additional concepts, with limited data and without re-training on the full training set. This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes, and several extra novel classes are being considered, each with only a few labeled examples. After learning the novel classes, the model is then evaluated on the overall classification performance on both base and novel classes. To this end, we propose a meta-learning model, the Attention Attractor Network, which regularizes the learning of novel classes. In each episode, we train a set of new weights to recognize novel classes until they converge, and we show that the technique of recurrent back-propagation can back-propagate through the optimization process and facilitate the learning of these parameters. We demonstrate that the learned attractor network can help recognize novel classes while remembering old classes without the need to review the original training set, outperforming various baselines.
Tasks	Few-Shot Learning, Meta-Learning
Published	2018-10-16
URL	https://arxiv.org/abs/1810.07218v3
PDF	https://arxiv.org/pdf/1810.07218v3.pdf
PWC	https://paperswithcode.com/paper/incremental-few-shot-learning-with-attention
Repo	https://github.com/renmengye/inc-few-shot-attractor-public
Framework	tf

CoT: Cooperative Training for Generative Modeling of Discrete Data


Title	CoT: Cooperative Training for Generative Modeling of Discrete Data
Authors	Sidi Lu, Lantao Yu, Siyuan Feng, Yaoming Zhu, Weinan Zhang, Yong Yu
Abstract	In this paper, we study the generative models of sequential discrete data. To tackle the exposure bias problem inherent in maximum likelihood estimation (MLE), generative adversarial networks (GANs) are introduced to penalize the unrealistic generated samples. To exploit the supervision signal from the discriminator, most previous models leverage REINFORCE to address the non-differentiable problem of sequential discrete data. However, because of the unstable property of the training signal during the dynamic process of adversarial training, the effectiveness of REINFORCE, in this case, is hardly guaranteed. To deal with such a problem, we propose a novel approach called Cooperative Training (CoT) to improve the training of sequence generative models. CoT transforms the min-max game of GANs into a joint maximization framework and manages to explicitly estimate and optimize Jensen-Shannon divergence. Moreover, CoT works without the necessity of pre-training via MLE, which is crucial to the success of previous methods. In the experiments, compared to existing state-of-the-art methods, CoT shows superior or at least competitive performance on sample quality, diversity, as well as training stability.
Tasks
Published	2018-04-11
URL	https://arxiv.org/abs/1804.03782v3
PDF	https://arxiv.org/pdf/1804.03782v3.pdf
PWC	https://paperswithcode.com/paper/cot-cooperative-training-for-generative
Repo	https://github.com/desire2020/CoT
Framework	tf
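
The abstract above says CoT explicitly estimates and optimizes the Jensen-Shannon divergence. As a minimal sketch of the quantity itself (not of the CoT training procedure), the snippet below computes the divergence between two small discrete distributions. The function names and the two example distributions are hypothetical and chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists."""
    # Terms with p_i == 0 contribute nothing, so they are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical next-token distributions over a 3-word vocabulary.
data_dist = [0.7, 0.2, 0.1]
model_dist = [0.1, 0.2, 0.7]
print(js_divergence(data_dist, model_dist))
```

The divergence is symmetric, equals zero when the two distributions match, and is bounded above by log 2 (in nats), which is reached when the distributions have disjoint support.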

Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm


Title	Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm
Authors	Ziniu Hu, Yang Wang, Qu Peng, Hang Li
Abstract	Although click data is widely used in search systems in practice, so far the inherent bias, most notably position bias, has prevented it from being used in training of a ranker for search, i.e., learning-to-rank. Recently, a number of authors have proposed new techniques referred to as ‘unbiased learning-to-rank’, which can reduce position bias and train a relatively high-performance ranker using click data. Most of the algorithms, based on the inverse propensity weighting (IPW) principle, first estimate the click bias at each position, and then train an unbiased ranker with the estimated biases using a learning-to-rank algorithm. However, there has not been a method for pairwise learning-to-rank that can jointly conduct debiasing of click data and training of a ranker using a pairwise loss function. In this paper, we propose a novel algorithm, which can jointly estimate the biases at click positions and the biases at unclick positions, and learn an unbiased ranker. Experiments on benchmark data show that our algorithm can significantly outperform existing algorithms. In addition, an online A/B test at a commercial search engine shows that our algorithm can effectively conduct debiasing of click data and enhance relevance ranking.
Tasks	Learning-To-Rank
Published	2018-09-16
URL	http://arxiv.org/abs/1809.05818v2
PDF	http://arxiv.org/pdf/1809.05818v2.pdf
PWC	https://paperswithcode.com/paper/unbiased-lambdamart-an-unbiased-pairwise
Repo	https://github.com/acbull/Unbiased_LambdaMart
Framework	none
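
The abstract above combines inverse propensity weighting with a pairwise loss, using separate biases for clicked and unclicked positions. The sketch below shows one way such a weighted pairwise loss could look. It assumes the per-position bias ratios are already known and uses a plain logistic pair loss; the function name, the argument names `t_plus` and `t_minus`, and the example numbers are hypothetical, and the joint estimation of the biases described in the abstract is not shown.

```python
import math


def debiased_pairwise_loss(scores, clicks, t_plus, t_minus):
    """Pairwise logistic loss with inverse-propensity weights.

    scores[i]  : ranker score of the document shown at position i
    clicks[i]  : 1 if position i was clicked, else 0
    t_plus[i]  : assumed bias ratio for a click at position i
    t_minus[j] : assumed bias ratio for a non-click at position j
    """
    loss = 0.0
    for i, clicked_i in enumerate(clicks):
        if not clicked_i:
            continue
        for j, clicked_j in enumerate(clicks):
            if clicked_j:
                continue
            # Logistic loss: small when the clicked document outscores the unclicked one.
            pair_loss = math.log1p(math.exp(-(scores[i] - scores[j])))
            # Divide by both biases so pairs from rarely examined positions count more.
            loss += pair_loss / (t_plus[i] * t_minus[j])
    return loss


# Hypothetical query with three results; only the second one was clicked.
print(debiased_pairwise_loss(
    scores=[0.2, 1.5, -0.3],
    clicks=[0, 1, 0],
    t_plus=[1.0, 0.5, 0.25],
    t_minus=[1.0, 1.0, 1.0],
))
```

With all bias ratios set to 1 the function reduces to an ordinary pairwise logistic loss, which makes the effect of the weights easy to inspect.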

Approximating Poker Probabilities with Deep Learning


Title	Approximating Poker Probabilities with Deep Learning
Authors	Brandon Da Silva
Abstract	Many poker systems, whether created with heuristics or machine learning, rely on the probability of winning as a key input. However, calculating the precise probability using combinatorics is an intractable problem, so instead we approximate it. Monte Carlo simulation is an effective technique that can be used to approximate the probability that a player will win and/or tie a hand. However, without the use of a memory-intensive lookup table or a supercomputer, it becomes infeasible to run millions of times when training an agent with self-play. To combat the space-time tradeoff, we use deep learning to approximate the probabilities obtained from the Monte Carlo simulation with high accuracy. The learned model proves to be a lightweight alternative to Monte Carlo simulation, which ultimately allows us to use the probabilities as inputs during self-play efficiently. The source code and optimized neural network can be found at https://github.com/brandinho/Poker-Probability-Approximation
Tasks	Card Games, Game of Poker
Published	2018-08-22
URL	http://arxiv.org/abs/1808.07220v2
PDF	http://arxiv.org/pdf/1808.07220v2.pdf
PWC	https://paperswithcode.com/paper/approximating-poker-probabilities-with-deep
Repo	https://github.com/brandinho/Poker-Probability-Approximation
Framework	tf
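
To illustrate the Monte Carlo idea mentioned in the abstract above, the sketch below estimates a much simpler poker probability than winning a hand: the chance of being dealt a pocket pair. This toy event is an assumption chosen because its exact value (3/51) is easy to check; the function name and trial count are hypothetical, and the sketch is not the paper's simulator.

```python
import random


def estimate_pocket_pair_probability(trials, seed=0):
    """Monte Carlo estimate of the chance that two hole cards share a rank."""
    rng = random.Random(seed)
    deck = [(rank, suit) for rank in range(13) for suit in range(4)]
    hits = 0
    for _ in range(trials):
        # Deal two distinct cards without replacement.
        first, second = rng.sample(deck, 2)
        if first[0] == second[0]:
            hits += 1
    return hits / trials


# After the first card, 3 of the remaining 51 cards match its rank.
exact = 3 / 51
estimate = estimate_pocket_pair_probability(100_000)
print(f"estimate={estimate:.4f} exact={exact:.4f}")
```

The estimate tightens only as the trial count grows, which is the cost the abstract describes when such simulations must run millions of times during self-play.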

Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic Segmentation


Title	Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic Segmentation
Authors	Shang-Wei Hung, Shao-Yuan Lo, Hsueh-Ming Hang
Abstract	Semantic segmentation has made encouraging progress due to the success of deep convolutional networks in recent years. Meanwhile, depth sensors have become prevalent, so depth maps can be acquired more easily. However, few studies focus on the RGB-D semantic segmentation task. Exploiting depth information effectively to improve performance remains a challenge. In this paper, we propose a novel solution named LDFNet, which incorporates Luminance, Depth and Color information by a fusion-based network. It includes a sub-network to process depth maps and employs luminance images to assist the processing of depth information. LDFNet outperforms the other state-of-the-art systems on the Cityscapes dataset, and its inference speed is faster than most of the existing networks. The experimental results show the effectiveness of the proposed multi-modal fusion network and its potential for practical applications.
Tasks	Autonomous Driving, Scene Understanding, Semantic Segmentation
Published	2018-09-24
URL	https://arxiv.org/abs/1809.09077v3
PDF	https://arxiv.org/pdf/1809.09077v3.pdf
PWC	https://paperswithcode.com/paper/incorporating-luminance-depth-and-color
Repo	https://github.com/shangweihung/LDFNet
Framework	pytorch
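
The abstract above mentions luminance images without saying how they are obtained. A common way to derive luminance from RGB is a weighted sum with the ITU-R BT.601 luma coefficients; the sketch below assumes that convention, which may differ from what LDFNet actually uses. The function names and the tiny example image are hypothetical.

```python
def rgb_to_luminance(pixel):
    """Luma of one (R, G, B) pixel using BT.601 weights (an assumed convention)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def luminance_image(rgb_image):
    """Convert a nested list of RGB pixels into a single-channel luminance map."""
    return [[rgb_to_luminance(px) for px in row] for row in rgb_image]


# Hypothetical 1x3 image: pure red, pure green, pure blue.
tiny_image = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
print(luminance_image(tiny_image))
```

Green contributes most to the result and blue least, reflecting the eye's differing sensitivity to the three channels.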

Implicit Maximum Likelihood Estimation


Title	Implicit Maximum Likelihood Estimation
Authors	Ke Li, Jitendra Malik
Abstract	Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
Tasks
Published	2018-09-24
URL	http://arxiv.org/abs/1809.09087v2
PDF	http://arxiv.org/pdf/1809.09087v2.pdf
PWC	https://paperswithcode.com/paper/implicit-maximum-likelihood-estimation
Repo	https://github.com/baraklevy20/IMLE
Framework	none

Online Learning for Effort Reduction in Interactive Neural Machine Translation


Title	Online Learning for Effort Reduction in Interactive Neural Machine Translation
Authors	Álvaro Peris, Francisco Casacuberta
Abstract	Neural machine translation systems require large amounts of training data and resources. Even with this, the quality of the translations may be insufficient for some users or domains. In such cases, the output of the system must be revised by a human agent. This can be done in a post-editing stage or following an interactive machine translation protocol. We explore the incremental update of neural machine translation systems during the post-editing or interactive translation processes. Such modifications aim to incorporate the new knowledge, from the edited sentences, into the translation system. Updates to the model are performed on-the-fly, as sentences are corrected, via online learning techniques. In addition, we implement a novel interactive, adaptive system, able to react to single-character interactions. This system greatly reduces the human effort required for obtaining high-quality translations. In order to stress our proposals, we conduct exhaustive experiments varying the amount and type of data available for training. Results show that online learning effectively achieves the objective of reducing the human effort required during the post-editing or the interactive machine translation stages. Moreover, these adaptive systems also perform well in scenarios with scarce resources. We show that a neural machine translation system can be rapidly adapted to a specific domain, exclusively by means of online learning techniques.
Tasks	Machine Translation
Published	2018-02-10
URL	http://arxiv.org/abs/1802.03594v2
PDF	http://arxiv.org/pdf/1802.03594v2.pdf
PWC	https://paperswithcode.com/paper/online-learning-for-effort-reduction-in
Repo	https://github.com/lvapeab/nmt-keras
Framework	tf

Exploiting Structure for Fast Kernel Learning


Title	Exploiting Structure for Fast Kernel Learning
Authors	Trefor W. Evans, Prasanth B. Nair
Abstract	We propose two methods for exact Gaussian process (GP) inference and learning on massive image, video, spatial-temporal, or multi-output datasets with missing values (or “gaps”) in the observed responses. The first method ignores the gaps using sparse selection matrices and a highly effective low-rank preconditioner is introduced to accelerate computations. The second method introduces a novel approach to GP training whereby response values are inferred on the gaps before explicitly training the model. We find this second approach to be greatly advantageous for the class of problems considered. Both of these novel approaches make extensive use of Kronecker matrix algebra to design massively scalable algorithms which have low memory requirements. We demonstrate exact GP inference for a spatial-temporal climate modelling problem with 3.7 million training points as well as a video reconstruction problem with 1 billion points.
Tasks	Video Reconstruction
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03351v1
PDF	http://arxiv.org/pdf/1808.03351v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-structure-for-fast-kernel-learning
Repo	https://github.com/treforevans/gp_grid
Framework	none
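
The abstract above credits Kronecker matrix algebra for the low memory requirements. One standard identity behind such savings is that a Kronecker product can be applied to a vector without ever forming the large matrix: with row-major flattening, (A ⊗ B) x equals the flattened A X Bᵀ, where X is x reshaped to a matrix. The sketch below checks this identity on a small example in pure Python. The helper names are hypothetical, and this is not code from the gp_grid repository.

```python
def matmul(a, b):
    """Plain dense matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def kron_matvec(a, b, x):
    """Compute (a kron b) @ x without building the Kronecker product."""
    n_a, n_b = len(a[0]), len(b[0])
    # Reshape x row-major into an (n_a x n_b) matrix.
    x_mat = [x[i * n_b:(i + 1) * n_b] for i in range(n_a)]
    y_mat = matmul(matmul(a, x_mat), transpose(b))
    return [value for row in y_mat for value in row]


def kron_dense(a, b):
    """Explicit Kronecker product, used here only to check the fast version."""
    return [[a_val * b_val for a_val in a_row for b_val in b_row]
            for a_row in a for b_row in b]


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
x = [1, 2, 3, 4]
dense_result = [sum(r * v for r, v in zip(row, x)) for row in kron_dense(a, b)]
print(kron_matvec(a, b, x), dense_result)
```

For factors of size n × n, the dense product has n⁴ entries, while the reshaped form only touches the two n × n factors and an n × n intermediate.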

Unsupervised Multilingual Word Embeddings


Title	Unsupervised Multilingual Word Embeddings
Authors	Xilun Chen, Claire Cardie
Abstract	Multilingual Word Embeddings (MWEs) represent words from multiple languages in a single distributional vector space. Unsupervised MWE (UMWE) methods acquire multilingual embeddings without cross-lingual supervision, which is a significant advantage over traditional supervised approaches and opens many new possibilities for low-resource languages. Prior art for learning UMWEs, however, merely relies on a number of independently trained Unsupervised Bilingual Word Embeddings (UBWEs) to obtain multilingual embeddings. These methods fail to leverage the interdependencies that exist among many languages. To address this shortcoming, we propose a fully unsupervised framework for learning MWEs that directly exploits the relations between all language pairs. Our model substantially outperforms previous approaches in the experiments on multilingual word translation and cross-lingual word similarity. In addition, our model even beats supervised approaches trained with cross-lingual resources.
Tasks	Multilingual Word Embeddings, Word Embeddings
Published	2018-08-27
URL	http://arxiv.org/abs/1808.08933v2
PDF	http://arxiv.org/pdf/1808.08933v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-multilingual-word-embeddings
Repo	https://github.com/ccsasuke/umwe
Framework	pytorch

HSI-CNN: A Novel Convolution Neural Network for Hyperspectral Image


Title	HSI-CNN: A Novel Convolution Neural Network for Hyperspectral Image
Authors	Yanan Luo, Jie Zou, Chengfei Yao, Tao Li, Gang Bai
Abstract	With the development of deep learning, the performance of hyperspectral image (HSI) classification has been greatly improved in recent years. The shortage of training samples has become a bottleneck for further improvement of performance. In this paper, we propose a novel convolutional neural network framework for the characteristics of hyperspectral image data, called HSI-CNN. Firstly, the spectral-spatial feature is extracted from a target pixel and its neighbors. Then, a number of one-dimensional feature maps, obtained by convolution operation on spectral-spatial features, are stacked into a two-dimensional matrix. Finally, the two-dimensional matrix, considered as an image, is fed into a standard CNN. This is why we call it HSI-CNN. In addition, we also implement two deep network classification models, called HSI-CNN+XGBoost and HSI-CapsNet, in order to compare the performance of our framework. Experiments show that the performance of hyperspectral image classification is improved efficiently with the HSI-CNN framework. We evaluate the model’s performance using four popular HSI datasets, which are the Kennedy Space Center (KSC), Indian Pines (IP), Pavia University scene (PU) and Salinas scene (SA). As far as we know, HSI-CNN achieves state-of-the-art accuracy among all methods known to us on these datasets: 99.28%, 99.09%, 99.42%, and 98.95%, respectively.
Tasks	Hyperspectral Image Classification, Image Classification
Published	2018-02-28
URL	http://arxiv.org/abs/1802.10478v1
PDF	http://arxiv.org/pdf/1802.10478v1.pdf
PWC	https://paperswithcode.com/paper/hsi-cnn-a-novel-convolution-neural-network
Repo	https://github.com/eecn/Hyperspectral-Classification
Framework	pytorch

Meta-Learning with Latent Embedding Optimization


Title	Meta-Learning with Latent Embedding Optimization
Authors	Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell
Abstract	Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space. The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks. Further analysis indicates LEO is able to capture uncertainty in the data, and can perform adaptation more effectively by optimizing in latent space.
Tasks	Few-Shot Learning, Meta-Learning
Published	2018-07-16
URL	http://arxiv.org/abs/1807.05960v3
PDF	http://arxiv.org/pdf/1807.05960v3.pdf
PWC	https://paperswithcode.com/paper/meta-learning-with-latent-embedding
Repo	https://github.com/deepmind/leo
Framework	tf

When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?


Title	When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?
Authors	Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Janani Padmanabhan, Graham Neubig
Abstract	The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases – providing gains of up to 20 BLEU points in the most favorable setting.
Tasks	Machine Translation, Word Embeddings
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06323v2
PDF	http://arxiv.org/pdf/1804.06323v2.pdf
PWC	https://paperswithcode.com/paper/when-and-why-are-pre-trained-word-embeddings
Repo	https://github.com/neulab/word-embeddings-for-nmt
Framework	none

InGAN: Capturing and Remapping the “DNA” of a Natural Image


Title	InGAN: Capturing and Remapping the “DNA” of a Natural Image
Authors	Assaf Shocher, Shai Bagon, Phillip Isola, Michal Irani
Abstract	Generative Adversarial Networks (GANs) typically learn a distribution of images in a large image dataset, and are then able to generate new images from this distribution. However, each natural image has its own internal statistics, captured by its unique distribution of patches. In this paper we propose an “Internal GAN” (InGAN) - an image-specific GAN - which trains on a single input image and learns its internal distribution of patches. It is then able to synthesize a plethora of new natural images of significantly different sizes, shapes and aspect-ratios - all with the same internal patch-distribution (same “DNA”) as the input image. In particular, despite large changes in global size/shape of the image, all elements inside the image maintain their local size/shape. InGAN is fully unsupervised, requiring no additional data other than the input image itself. Once trained on the input image, it can remap the input to any size or shape in a single feedforward pass, while preserving the same internal patch distribution. InGAN provides a unified framework for a variety of tasks, bridging the gap between textures and natural images.
Tasks
Published	2018-12-01
URL	http://arxiv.org/abs/1812.00231v2
PDF	http://arxiv.org/pdf/1812.00231v2.pdf
PWC	https://paperswithcode.com/paper/internal-distribution-matching-for-natural
Repo	https://github.com/assafshocher/InGAN
Framework	pytorch

Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery


Title	Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery
Authors	Tim G. J. Rudner, Marc Rußwurm, Jakub Fil, Ramona Pelich, Benjamin Bischke, Veronika Kopackova, Piotr Bilinski
Abstract	We propose a novel approach for rapid segmentation of flooded buildings by fusing multiresolution, multisensor, and multitemporal satellite imagery in a convolutional neural network. Our model significantly expedites the generation of satellite imagery-based flood maps, crucial for first responders and local authorities in the early stages of flood events. By incorporating multitemporal satellite imagery, our model allows for rapid and accurate post-disaster damage assessment and can be used by governments to better coordinate medium- and long-term financial assistance programs for affected areas. The network consists of multiple streams of encoder-decoder architectures that extract spatiotemporal information from medium-resolution images and spatial information from high-resolution images before fusing the resulting representations into a single medium-resolution segmentation map of flooded buildings. We compare our model to state-of-the-art methods for building footprint segmentation as well as to alternative fusion approaches for the segmentation of flooded buildings and find that our model performs best on both tasks. We also demonstrate that our model produces highly accurate segmentation maps of flooded buildings using only publicly available medium-resolution data instead of significantly more detailed but sparsely available very high-resolution data. We release the first open-source dataset of fully preprocessed and labeled multiresolution, multispectral, and multitemporal satellite images of disaster sites along with our source code.
Tasks	Flooded Building Segmentation
Published	2018-12-05
URL	http://arxiv.org/abs/1812.01756v1
PDF	http://arxiv.org/pdf/1812.01756v1.pdf
PWC	https://paperswithcode.com/paper/multimathbf3net-segmenting-flooded-buildings
Repo	https://github.com/FrontierDevelopmentLab/multi3net
Framework	pytorch

Video Logo Retrieval based on local Features


Title	Video Logo Retrieval based on local Features
Authors	Bochen Guan, Hanrong Ye, Hong Liu, William Sethares
Abstract	Estimation of the frequency and duration of logos in videos is important in the advertisement industry as a way of estimating the impact of ad purchases. Since logos occupy only a small area in the videos, the popular methods of image retrieval could fail. This paper develops an algorithm called Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm based on the spatial distribution of local image descriptors that measure the distance between the query image (the logo) and a collection of down-sampled video images. VLR uses local features to overcome the weakness of global feature-based models such as convolutional neural networks (CNN). Meanwhile, VLR is flexible and does not require training. The performance of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and Stanford I2V), and compared with other state-of-the-art logo retrieval or detection algorithms. Overall, VLR shows significantly higher accuracy compared with the existing methods.
Tasks	Image Retrieval, Video Retrieval
Published	2018-08-11
URL	https://arxiv.org/abs/1808.03735v3
PDF	https://arxiv.org/pdf/1808.03735v3.pdf
PWC	https://paperswithcode.com/paper/target-image-video-search-based-on-local
Repo	https://github.com/gbc8181/TISLF
Framework	none