Paper Group AWR 353
Incremental Few-Shot Learning with Attention Attractor Networks
Title | Incremental Few-Shot Learning with Attention Attractor Networks |
Authors | Mengye Ren, Renjie Liao, Ethan Fetaya, Richard S. Zemel |
Abstract | Machine learning classifiers are often trained to recognize a set of pre-defined classes. However, in many applications, it is often desirable to have the flexibility of learning additional concepts, with limited data and without re-training on the full training set. This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes, and several extra novel classes are being considered, each with only a few labeled examples. After learning the novel classes, the model is then evaluated on the overall classification performance on both base and novel classes. To this end, we propose a meta-learning model, the Attention Attractor Network, which regularizes the learning of novel classes. In each episode, we train a set of new weights to recognize novel classes until they converge, and we show that the technique of recurrent back-propagation can back-propagate through the optimization process and facilitate the learning of these parameters. We demonstrate that the learned attractor network can help recognize novel classes while remembering old classes without the need to review the original training set, outperforming various baselines. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2018-10-16 |
URL | https://arxiv.org/abs/1810.07218v3, https://arxiv.org/pdf/1810.07218v3.pdf |
PWC | https://paperswithcode.com/paper/incremental-few-shot-learning-with-attention |
Repo | https://github.com/renmengye/inc-few-shot-attractor-public |
Framework | tf |
CoT: Cooperative Training for Generative Modeling of Discrete Data
Title | CoT: Cooperative Training for Generative Modeling of Discrete Data |
Authors | Sidi Lu, Lantao Yu, Siyuan Feng, Yaoming Zhu, Weinan Zhang, Yong Yu |
Abstract | In this paper, we study the generative models of sequential discrete data. To tackle the exposure bias problem inherent in maximum likelihood estimation (MLE), generative adversarial networks (GANs) are introduced to penalize the unrealistic generated samples. To exploit the supervision signal from the discriminator, most previous models leverage REINFORCE to address the non-differentiable problem of sequential discrete data. However, because of the unstable property of the training signal during the dynamic process of adversarial training, the effectiveness of REINFORCE, in this case, is hardly guaranteed. To deal with such a problem, we propose a novel approach called Cooperative Training (CoT) to improve the training of sequence generative models. CoT transforms the min-max game of GANs into a joint maximization framework and manages to explicitly estimate and optimize Jensen-Shannon divergence. Moreover, CoT works without the necessity of pre-training via MLE, which is crucial to the success of previous methods. In the experiments, compared to existing state-of-the-art methods, CoT shows superior or at least competitive performance on sample quality, diversity, as well as training stability. |
Tasks | |
Published | 2018-04-11 |
URL | https://arxiv.org/abs/1804.03782v3, https://arxiv.org/pdf/1804.03782v3.pdf |
PWC | https://paperswithcode.com/paper/cot-cooperative-training-for-generative |
Repo | https://github.com/desire2020/CoT |
Framework | tf |
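The CoT abstract above says the method explicitly estimates and optimizes the Jensen-Shannon divergence. As a reminder of what that quantity is, here is a minimal sketch for two small discrete distributions. It is not the authors' training procedure; the function names and the two example distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical distributions over a three-token vocabulary.
data_dist = [0.5, 0.3, 0.2]
model_dist = [0.2, 0.3, 0.5]

print(js_divergence(data_dist, model_dist))
print(js_divergence(data_dist, data_dist))  # identical distributions give 0.0
```

The divergence is symmetric in its two arguments and bounded above by log 2, which is reached when the two distributions have disjoint support.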
Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm
Title | Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm |
Authors | Ziniu Hu, Yang Wang, Qu Peng, Hang Li |
Abstract | Although click data is widely used in search systems in practice, so far the inherent bias, most notably position bias, has prevented it from being used in training of a ranker for search, i.e., learning-to-rank. Recently, a number of authors have proposed new techniques referred to as ‘unbiased learning-to-rank’, which can reduce position bias and train a relatively high-performance ranker using click data. Most of the algorithms, based on the inverse propensity weighting (IPW) principle, first estimate the click bias at each position, and then train an unbiased ranker with the estimated biases using a learning-to-rank algorithm. However, there has not been a method for pairwise learning-to-rank that can jointly conduct debiasing of click data and training of a ranker using a pairwise loss function. In this paper, we propose a novel algorithm, which can jointly estimate the biases at click positions and the biases at unclick positions, and learn an unbiased ranker. Experiments on benchmark data show that our algorithm can significantly outperform existing algorithms. In addition, an online A/B test at a commercial search engine shows that our algorithm can effectively conduct debiasing of click data and enhance relevance ranking. |
Tasks | Learning-To-Rank |
Published | 2018-09-16 |
URL | http://arxiv.org/abs/1809.05818v2, http://arxiv.org/pdf/1809.05818v2.pdf |
PWC | https://paperswithcode.com/paper/unbiased-lambdamart-an-unbiased-pairwise |
Repo | https://github.com/acbull/Unbiased_LambdaMart |
Framework | none |
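The Unbiased LambdaMART abstract above builds on inverse propensity weighting, with separate biases at click and unclick positions. A minimal sketch of that weighting idea applied to one clicked/unclicked pair is below. It assumes a plain logistic pairwise loss and fixed, made-up bias values; the paper's algorithm estimates the biases jointly with the ranker, which is not shown here, and all names are hypothetical.

```python
import math

def pairwise_logistic_loss(score_clicked, score_unclicked):
    """Logistic loss that is small when the clicked document scores higher."""
    return math.log(1.0 + math.exp(-(score_clicked - score_unclicked)))

def debiased_pair_loss(score_clicked, score_unclicked,
                       pos_clicked, pos_unclicked,
                       click_bias, unclick_bias):
    """Divide the pair loss by the bias estimates at the two positions."""
    weight = 1.0 / (click_bias[pos_clicked] * unclick_bias[pos_unclicked])
    return weight * pairwise_logistic_loss(score_clicked, score_unclicked)

# Hypothetical bias estimates for positions 0..2, normalised to position 0.
click_bias = [1.0, 0.6, 0.3]
unclick_bias = [1.0, 1.2, 1.5]

# Same scores, but the click happened at position 0 versus position 2.
top = debiased_pair_loss(2.0, 1.0, 0, 1, click_bias, unclick_bias)
low = debiased_pair_loss(2.0, 1.0, 2, 1, click_bias, unclick_bias)
print(top, low)
```

A click at a low-ranked position is less likely to be observed, so the pair it forms receives a larger weight than the same pair formed by a click at the top position.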
Approximating Poker Probabilities with Deep Learning
Title | Approximating Poker Probabilities with Deep Learning |
Authors | Brandon Da Silva |
Abstract | Many poker systems, whether created with heuristics or machine learning, rely on the probability of winning as a key input. However, calculating the precise probability using combinatorics is an intractable problem, so instead we approximate it. Monte Carlo simulation is an effective technique that can be used to approximate the probability that a player will win and/or tie a hand. However, without the use of a memory-intensive lookup table or a supercomputer, it becomes infeasible to run millions of times when training an agent with self-play. To combat the space-time tradeoff, we use deep learning to approximate the probabilities obtained from the Monte Carlo simulation with high accuracy. The learned model proves to be a lightweight alternative to Monte Carlo simulation, which ultimately allows us to use the probabilities as inputs during self-play efficiently. The source code and optimized neural network can be found at https://github.com/brandinho/Poker-Probability-Approximation |
Tasks | Card Games, Game of Poker |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07220v2, http://arxiv.org/pdf/1808.07220v2.pdf |
PWC | https://paperswithcode.com/paper/approximating-poker-probabilities-with-deep |
Repo | https://github.com/brandinho/Poker-Probability-Approximation |
Framework | tf |
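The poker abstract above relies on Monte Carlo simulation to approximate probabilities that are costly to compute exactly. The sketch below illustrates the technique on a much simpler event than winning a hand: being dealt a pocket pair, whose exact probability (3/51) is known and can be used as a check. The event, function name, and trial count are illustrative assumptions, not taken from the paper or its repository.

```python
import random

RANKS = range(13)
SUITS = range(4)
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

def estimate_pocket_pair(num_trials, seed=0):
    """Monte Carlo estimate of the chance that two hole cards share a rank."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_trials):
        first, second = rng.sample(DECK, 2)  # deal two cards without replacement
        if first[0] == second[0]:
            hits += 1
    return hits / num_trials

exact = 3 / 51  # after the first card, 3 of the remaining 51 cards match its rank
estimate = estimate_pocket_pair(100_000)
print(exact, estimate)
```

The estimate's error shrinks roughly with the square root of the number of trials, which is why running such simulations millions of times during self-play becomes expensive.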
Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic Segmentation
Title | Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic Segmentation |
Authors | Shang-Wei Hung, Shao-Yuan Lo, Hsueh-Ming Hang |
Abstract | Semantic segmentation has made encouraging progress due to the success of deep convolutional networks in recent years. Meanwhile, depth sensors have become prevalent, so depth maps can be acquired more easily. However, there are few studies that focus on the RGB-D semantic segmentation task. Exploiting depth information effectively to improve performance is a challenge. In this paper, we propose a novel solution named LDFNet, which incorporates Luminance, Depth and Color information by a fusion-based network. It includes a sub-network to process depth maps and employs luminance images to assist in processing the depth information. LDFNet outperforms the other state-of-the-art systems on the Cityscapes dataset, and its inference speed is faster than most of the existing networks. The experimental results show the effectiveness of the proposed multi-modal fusion network and its potential for practical applications. |
Tasks | Autonomous Driving, Scene Understanding, Semantic Segmentation |
Published | 2018-09-24 |
URL | https://arxiv.org/abs/1809.09077v3, https://arxiv.org/pdf/1809.09077v3.pdf |
PWC | https://paperswithcode.com/paper/incorporating-luminance-depth-and-color |
Repo | https://github.com/shangweihung/LDFNet |
Framework | pytorch |
Implicit Maximum Likelihood Estimation
Title | Implicit Maximum Likelihood Estimation |
Authors | Ke Li, Jitendra Malik |
Abstract | Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.09087v2, http://arxiv.org/pdf/1809.09087v2.pdf |
PWC | https://paperswithcode.com/paper/implicit-maximum-likelihood-estimation |
Repo | https://github.com/baraklevy20/IMLE |
Framework | none |
Online Learning for Effort Reduction in Interactive Neural Machine Translation
Title | Online Learning for Effort Reduction in Interactive Neural Machine Translation |
Authors | Álvaro Peris, Francisco Casacuberta |
Abstract | Neural machine translation systems require large amounts of training data and resources. Even with this, the quality of the translations may be insufficient for some users or domains. In such cases, the output of the system must be revised by a human agent. This can be done in a post-editing stage or following an interactive machine translation protocol. We explore the incremental update of neural machine translation systems during the post-editing or interactive translation processes. Such modifications aim to incorporate the new knowledge, from the edited sentences, into the translation system. Updates to the model are performed on-the-fly, as sentences are corrected, via online learning techniques. In addition, we implement a novel interactive, adaptive system, able to react to single-character interactions. This system greatly reduces the human effort required for obtaining high-quality translations. To stress-test our proposals, we conduct exhaustive experiments varying the amount and type of data available for training. Results show that online learning effectively achieves the objective of reducing the human effort required during the post-editing or the interactive machine translation stages. Moreover, these adaptive systems also perform well in scenarios with scarce resources. We show that a neural machine translation system can be rapidly adapted to a specific domain, exclusively by means of online learning techniques. |
Tasks | Machine Translation |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03594v2, http://arxiv.org/pdf/1802.03594v2.pdf |
PWC | https://paperswithcode.com/paper/online-learning-for-effort-reduction-in |
Repo | https://github.com/lvapeab/nmt-keras |
Framework | tf |
Exploiting Structure for Fast Kernel Learning
Title | Exploiting Structure for Fast Kernel Learning |
Authors | Trefor W. Evans, Prasanth B. Nair |
Abstract | We propose two methods for exact Gaussian process (GP) inference and learning on massive image, video, spatial-temporal, or multi-output datasets with missing values (or “gaps”) in the observed responses. The first method ignores the gaps using sparse selection matrices and a highly effective low-rank preconditioner is introduced to accelerate computations. The second method introduces a novel approach to GP training whereby response values are inferred on the gaps before explicitly training the model. We find this second approach to be greatly advantageous for the class of problems considered. Both of these novel approaches make extensive use of Kronecker matrix algebra to design massively scalable algorithms which have low memory requirements. We demonstrate exact GP inference for a spatial-temporal climate modelling problem with 3.7 million training points as well as a video reconstruction problem with 1 billion points. |
Tasks | Video Reconstruction |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03351v1, http://arxiv.org/pdf/1808.03351v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-structure-for-fast-kernel-learning |
Repo | https://github.com/treforevans/gp_grid |
Framework | none |
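The kernel-learning abstract above attributes its scalability and low memory use to Kronecker matrix algebra. The standard identity behind such savings is that a product with a Kronecker-structured matrix can be computed from the small factors without ever forming the large matrix. The sketch below checks this on tiny integer matrices in dependency-free Python. It is a generic illustration, not code from the gp_grid repository, and all helper names are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def kron(a, b):
    """Dense Kronecker product, built only to check the fast version."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]

def kron_matvec(a, b, x):
    """Compute (a kron b) @ x as a @ X @ b^T, with x reshaped row-major to X."""
    q = len(b[0])
    x_mat = [x[r * q:(r + 1) * q] for r in range(len(a[0]))]
    y_mat = matmul(matmul(a, x_mat), transpose(b))
    return [value for row in y_mat for value in row]

a = [[1, 2], [3, 4]]
b = [[0, 1], [2, 3]]
x = [1, 2, 3, 4]

dense = [sum(row[c] * x[c] for c in range(len(x))) for row in kron(a, b)]
fast = kron_matvec(a, b, x)
print(dense, fast)
```

For factors of size n by n, the dense product touches n^4 entries, while the factored form needs only two small matrix products and never stores the n^2 by n^2 matrix.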
Unsupervised Multilingual Word Embeddings
Title | Unsupervised Multilingual Word Embeddings |
Authors | Xilun Chen, Claire Cardie |
Abstract | Multilingual Word Embeddings (MWEs) represent words from multiple languages in a single distributional vector space. Unsupervised MWE (UMWE) methods acquire multilingual embeddings without cross-lingual supervision, which is a significant advantage over traditional supervised approaches and opens many new possibilities for low-resource languages. Prior art for learning UMWEs, however, merely relies on a number of independently trained Unsupervised Bilingual Word Embeddings (UBWEs) to obtain multilingual embeddings. These methods fail to leverage the interdependencies that exist among many languages. To address this shortcoming, we propose a fully unsupervised framework for learning MWEs that directly exploits the relations between all language pairs. Our model substantially outperforms previous approaches in the experiments on multilingual word translation and cross-lingual word similarity. In addition, our model even beats supervised approaches trained with cross-lingual resources. |
Tasks | Multilingual Word Embeddings, Word Embeddings |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08933v2, http://arxiv.org/pdf/1808.08933v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-multilingual-word-embeddings |
Repo | https://github.com/ccsasuke/umwe |
Framework | pytorch |
HSI-CNN: A Novel Convolution Neural Network for Hyperspectral Image
Title | HSI-CNN: A Novel Convolution Neural Network for Hyperspectral Image |
Authors | Yanan Luo, Jie Zou, Chengfei Yao, Tao Li, Gang Bai |
Abstract | With the development of deep learning, the performance of hyperspectral image (HSI) classification has been greatly improved in recent years. The shortage of training samples has become a bottleneck for further improvement of performance. In this paper, we propose a novel convolutional neural network framework for the characteristics of hyperspectral image data, called HSI-CNN. Firstly, the spectral-spatial feature is extracted from a target pixel and its neighbors. Then, a number of one-dimensional feature maps, obtained by convolution operation on spectral-spatial features, are stacked into a two-dimensional matrix. Finally, the two-dimensional matrix, treated as an image, is fed into a standard CNN. This is why we call it HSI-CNN. In addition, we also implement two deep network classification models, called HSI-CNN+XGBoost and HSI-CapsNet, in order to compare the performance of our framework. Experiments show that the performance of hyperspectral image classification is improved efficiently with the HSI-CNN framework. We evaluate the model’s performance using four popular HSI datasets, which are the Kennedy Space Center (KSC), Indian Pines (IP), Pavia University scene (PU) and Salinas scene (SA). As far as we know, HSI-CNN achieves state-of-the-art accuracy among all methods known to us on these datasets: 99.28%, 99.09%, 99.42%, and 98.95%, respectively. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10478v1, http://arxiv.org/pdf/1802.10478v1.pdf |
PWC | https://paperswithcode.com/paper/hsi-cnn-a-novel-convolution-neural-network |
Repo | https://github.com/eecn/Hyperspectral-Classification |
Framework | pytorch |
Meta-Learning with Latent Embedding Optimization
Title | Meta-Learning with Latent Embedding Optimization |
Authors | Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell |
Abstract | Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space. The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks. Further analysis indicates LEO is able to capture uncertainty in the data, and can perform adaptation more effectively by optimizing in latent space. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05960v3, http://arxiv.org/pdf/1807.05960v3.pdf |
PWC | https://paperswithcode.com/paper/meta-learning-with-latent-embedding |
Repo | https://github.com/deepmind/leo |
Framework | tf |
When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?
Title | When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation? |
Authors | Ye Qi, Devendra Singh Sachan, Matthieu Felix, Sarguna Janani Padmanabhan, Graham Neubig |
Abstract | The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases – providing gains of up to 20 BLEU points in the most favorable setting. |
Tasks | Machine Translation, Word Embeddings |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06323v2, http://arxiv.org/pdf/1804.06323v2.pdf |
PWC | https://paperswithcode.com/paper/when-and-why-are-pre-trained-word-embeddings |
Repo | https://github.com/neulab/word-embeddings-for-nmt |
Framework | none |
InGAN: Capturing and Remapping the “DNA” of a Natural Image
Title | InGAN: Capturing and Remapping the “DNA” of a Natural Image |
Authors | Assaf Shocher, Shai Bagon, Phillip Isola, Michal Irani |
Abstract | Generative Adversarial Networks (GANs) typically learn a distribution of images in a large image dataset, and are then able to generate new images from this distribution. However, each natural image has its own internal statistics, captured by its unique distribution of patches. In this paper we propose an “Internal GAN” (InGAN) - an image-specific GAN - which trains on a single input image and learns its internal distribution of patches. It is then able to synthesize a plethora of new natural images of significantly different sizes, shapes and aspect-ratios - all with the same internal patch-distribution (same “DNA”) as the input image. In particular, despite large changes in global size/shape of the image, all elements inside the image maintain their local size/shape. InGAN is fully unsupervised, requiring no additional data other than the input image itself. Once trained on the input image, it can remap the input to any size or shape in a single feedforward pass, while preserving the same internal patch distribution. InGAN provides a unified framework for a variety of tasks, bridging the gap between textures and natural images. |
Tasks | |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00231v2, http://arxiv.org/pdf/1812.00231v2.pdf |
PWC | https://paperswithcode.com/paper/internal-distribution-matching-for-natural |
Repo | https://github.com/assafshocher/InGAN |
Framework | pytorch |
Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery
Title | Multi$^{\mathbf{3}}$Net: Segmenting Flooded Buildings via Fusion of Multiresolution, Multisensor, and Multitemporal Satellite Imagery |
Authors | Tim G. J. Rudner, Marc Rußwurm, Jakub Fil, Ramona Pelich, Benjamin Bischke, Veronika Kopackova, Piotr Bilinski |
Abstract | We propose a novel approach for rapid segmentation of flooded buildings by fusing multiresolution, multisensor, and multitemporal satellite imagery in a convolutional neural network. Our model significantly expedites the generation of satellite imagery-based flood maps, crucial for first responders and local authorities in the early stages of flood events. By incorporating multitemporal satellite imagery, our model allows for rapid and accurate post-disaster damage assessment and can be used by governments to better coordinate medium- and long-term financial assistance programs for affected areas. The network consists of multiple streams of encoder-decoder architectures that extract spatiotemporal information from medium-resolution images and spatial information from high-resolution images before fusing the resulting representations into a single medium-resolution segmentation map of flooded buildings. We compare our model to state-of-the-art methods for building footprint segmentation as well as to alternative fusion approaches for the segmentation of flooded buildings and find that our model performs best on both tasks. We also demonstrate that our model produces highly accurate segmentation maps of flooded buildings using only publicly available medium-resolution data instead of significantly more detailed but sparsely available very high-resolution data. We release the first open-source dataset of fully preprocessed and labeled multiresolution, multispectral, and multitemporal satellite images of disaster sites along with our source code. |
Tasks | Flooded Building Segmentation |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01756v1, http://arxiv.org/pdf/1812.01756v1.pdf |
PWC | https://paperswithcode.com/paper/multimathbf3net-segmenting-flooded-buildings |
Repo | https://github.com/FrontierDevelopmentLab/multi3net |
Framework | pytorch |
Video Logo Retrieval based on local Features
Title | Video Logo Retrieval based on local Features |
Authors | Bochen Guan, Hanrong Ye, Hong Liu, William Sethares |
Abstract | Estimation of the frequency and duration of logos in videos is important in the advertisement industry as a way of estimating the impact of ad purchases. Since logos occupy only a small area in the videos, the popular methods of image retrieval could fail. This paper develops an algorithm called Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm based on the spatial distribution of local image descriptors that measure the distance between the query image (the logo) and a collection of down-sampled video images. VLR uses local features to overcome the weakness of global feature-based models such as convolutional neural networks (CNN). Meanwhile, VLR is flexible and does not require training. The performance of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and Stanford I2V), and compared with other state-of-the-art logo retrieval or detection algorithms. Overall, VLR shows significantly higher accuracy compared with the existing methods. |
Tasks | Image Retrieval, Video Retrieval |
Published | 2018-08-11 |
URL | https://arxiv.org/abs/1808.03735v3, https://arxiv.org/pdf/1808.03735v3.pdf |
PWC | https://paperswithcode.com/paper/target-image-video-search-based-on-local |
Repo | https://github.com/gbc8181/TISLF |
Framework | none |