October 21, 2019

3248 words 16 mins read

Paper Group AWR 43

Reinforced Continual Learning. Hierarchical Discrete Distribution Decomposition for Match Density Estimation. simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions. Memorize or generalize? Searching for a compositional RNN in a haystack. Learning Tree-based Deep Model for Recommender Systems. Characte …

Reinforced Continual Learning

Title Reinforced Continual Learning
Authors Ju Xu, Zhanxing Zhu
Abstract Most artificial intelligence models have limited ability to solve new tasks quickly without forgetting previously acquired knowledge. The recently emerging paradigm of continual learning aims to solve this issue, in which the model learns various tasks in a sequential fashion. In this work, a novel approach for continual learning is proposed, which searches for the best neural architecture for each coming task via sophisticatedly designed reinforcement learning strategies. We name it Reinforced Continual Learning. Our method not only performs well at preventing catastrophic forgetting but also fits new tasks well. The experiments on sequential classification tasks for variants of the MNIST and CIFAR-100 datasets demonstrate that the proposed approach outperforms existing continual learning alternatives for deep networks.
Tasks Continual Learning
Published 2018-05-31
URL http://arxiv.org/abs/1805.12369v1
PDF http://arxiv.org/pdf/1805.12369v1.pdf
PWC https://paperswithcode.com/paper/reinforced-continual-learning
Repo https://github.com/xujinfan/Reinforced-Continual-Learning
Framework tf
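
To make the controller idea concrete, here is a minimal sketch, assuming a REINFORCE-style policy over a handful of candidate expansion sizes and a mocked train-and-evaluate step; the action set, reward shape, and all names are illustrative, not taken from the authors' repository.

```python
# Minimal sketch: a policy samples how many units to add for a new task and is
# updated with REINFORCE, using (mocked) validation accuracy as the reward.
import numpy as np

rng = np.random.default_rng(0)
actions = np.array([0, 8, 16, 32])   # candidate numbers of units to add (assumed)
theta = np.zeros(len(actions))       # policy logits
baseline, lr = 0.0, 0.1

def mock_train_and_eval(units_added):
    # Stand-in for: expand the network, train on the new task, return val accuracy.
    return 0.7 + 0.005 * units_added - 0.0001 * units_added ** 2 + rng.normal(0, 0.01)

for step in range(200):
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    a = rng.choice(len(actions), p=probs)      # sample an architecture decision
    reward = mock_train_and_eval(actions[a])
    baseline += 0.05 * (reward - baseline)     # moving-average reward baseline
    grad = -probs
    grad[a] += 1.0                             # gradient of log pi(a)
    theta += lr * (reward - baseline) * grad   # REINFORCE update

print("preferred expansion:", actions[int(np.argmax(theta))])
```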

Hierarchical Discrete Distribution Decomposition for Match Density Estimation

Title Hierarchical Discrete Distribution Decomposition for Match Density Estimation
Authors Zhichao Yin, Trevor Darrell, Fisher Yu
Abstract Explicit representations of the global match distributions of pixel-wise correspondences between pairs of images are desirable for uncertainty estimation and downstream applications. However, the computation of the match density for each pixel may be prohibitively expensive due to the large number of candidates. In this paper, we propose Hierarchical Discrete Distribution Decomposition (HD^3), a framework suitable for learning probabilistic pixel correspondences in both optical flow and stereo matching. We decompose the full match density into multiple scales hierarchically, and estimate the local matching distributions at each scale conditioned on the matching and warping at coarser scales. The local distributions can then be composed together to form the global match density. Despite its simplicity, our probabilistic method achieves state-of-the-art results for both optical flow and stereo matching on established benchmarks. We also find the estimated uncertainty is a good indication of the reliability of the predicted correspondences.
Tasks Density Estimation, Optical Flow Estimation, Stereo Matching
Published 2018-12-15
URL http://arxiv.org/abs/1812.06264v3
PDF http://arxiv.org/pdf/1812.06264v3.pdf
PWC https://paperswithcode.com/paper/hierarchical-discrete-distribution
Repo https://github.com/ucbdrive/hd3
Framework pytorch
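
The coarse-to-fine composition can be sketched as follows, assuming a radius-1 displacement window and nearest-neighbour upsampling; the sizes and the random "logits" are placeholders for the network outputs at each scale, not the authors' settings.

```python
# Sketch of HD^3-style composition: at each scale, a local discrete distribution
# over displacement candidates is converted to an expected residual flow,
# which is added to the upsampled coarser estimate.
import numpy as np

def expected_flow(local_probs, radius):
    # local_probs: (H, W, (2r+1)^2) softmax over displacement candidates
    offs = np.array([(dy, dx) for dy in range(-radius, radius + 1)
                              for dx in range(-radius, radius + 1)], float)
    return local_probs @ offs                 # (H, W, 2) expected displacement

def upsample2x(flow):
    # Nearest-neighbour upsampling; displacement magnitudes double with resolution.
    return 2.0 * flow.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
flow = np.zeros((4, 4, 2))                    # coarsest-scale estimate
for scale in range(3):                        # compose three finer scales
    H, W = flow.shape[0] * 2, flow.shape[1] * 2
    logits = rng.normal(size=(H, W, 9))       # stand-in for network predictions
    probs = np.exp(logits)
    probs /= probs.sum(-1, keepdims=True)
    flow = upsample2x(flow) + expected_flow(probs, radius=1)

print(flow.shape)                             # (32, 32, 2) final flow field
```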

simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions

Title simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Authors Fenglin Liu, Xuancheng Ren, Yuanxin Liu, Houfeng Wang, Xu Sun
Abstract The encoder-decoder framework has shown recent success in image captioning. Visual attention, which is good at detailedness, and semantic attention, which is good at comprehensiveness, have been separately proposed to ground the caption on the image. In this paper, we propose the Stepwise Image-Topic Merging Network (simNet) that makes use of the two kinds of attention at the same time. At each time step when generating the caption, the decoder adaptively merges the attentive information in the extracted topics and the image according to the generated context, so that the visual information and the semantic information can be effectively combined. The proposed approach is evaluated on two benchmark datasets and achieves state-of-the-art performance. (The code is available at https://github.com/lancopku/simNet.)
Tasks Image Captioning
Published 2018-08-27
URL http://arxiv.org/abs/1808.08732v1
PDF http://arxiv.org/pdf/1808.08732v1.pdf
PWC https://paperswithcode.com/paper/simnet-stepwise-image-topic-merging-network
Repo https://github.com/lancopku/simNet
Framework pytorch
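
A minimal PyTorch sketch of the stepwise merging gate: a learned gate, conditioned on the decoder state, mixes a visual-attention context and a topic-attention context. The module name and the 512-dimensional sizes are illustrative assumptions, not the released code.

```python
import torch
import torch.nn as nn

class StepwiseMerge(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Scores how much to trust the visual context vs. the topic context.
        self.gate = nn.Linear(3 * dim, 1)

    def forward(self, h, ctx_visual, ctx_topic):
        beta = torch.sigmoid(self.gate(torch.cat([h, ctx_visual, ctx_topic], -1)))
        return beta * ctx_visual + (1 - beta) * ctx_topic  # merged context

merge = StepwiseMerge(dim=512)
h, cv, ct = (torch.randn(2, 512) for _ in range(3))
print(merge(h, cv, ct).shape)  # torch.Size([2, 512])
```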

Memorize or generalize? Searching for a compositional RNN in a haystack

Title Memorize or generalize? Searching for a compositional RNN in a haystack
Authors Adam Liška, Germán Kruszewski, Marco Baroni
Abstract Neural networks are very powerful learning systems, but they do not readily generalize from one task to another. This is partly because they do not learn in a compositional way, that is, by discovering skills that are shared by different tasks, and recombining them to solve new problems. In this paper, we explore the compositional generalization capabilities of recurrent neural networks (RNNs). We first propose the lookup table composition domain as a simple setup to test compositional behaviour and show that it is theoretically possible for a standard RNN to learn to behave compositionally in this domain when trained with standard gradient descent and provided with additional supervision. We then remove this additional supervision and perform a search over a large number of model initializations to investigate the proportion of RNNs that can still converge to a compositional solution. We discover that a small but non-negligible proportion of RNNs do reach partial compositional solutions even without special architectural constraints. This suggests that a combination of gradient descent and evolutionary strategies directly favouring the minority models that developed more compositional approaches might suffice to lead standard RNNs towards compositional solutions.
Tasks
Published 2018-02-18
URL http://arxiv.org/abs/1802.06467v2
PDF http://arxiv.org/pdf/1802.06467v2.pdf
PWC https://paperswithcode.com/paper/memorize-or-generalize-searching-for-a
Repo https://github.com/i-machine-think/machine-tasks
Framework none
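
The lookup-table composition domain is easy to reproduce. The following sketch generates composed-table examples, assuming 3-bit strings and eight random bijections; the exact string length and table count in the paper may differ.

```python
# Data generation for the lookup-table composition task: random bijections over
# 3-bit strings are composed, and a model must apply e.g. "t1 . t2" to an input.
import itertools
import random

random.seed(0)
inputs = [''.join(bits) for bits in itertools.product('01', repeat=3)]

def random_table():
    outs = inputs[:]
    random.shuffle(outs)
    return dict(zip(inputs, outs))            # a bijective lookup table

tables = {f't{i}': random_table() for i in range(1, 9)}

def compose(name1, name2, x):
    return tables[name2][tables[name1][x]]    # apply the first table, then the second

print('t1 t2', '101', '->', compose('t1', 't2', '101'))
```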

Learning Tree-based Deep Model for Recommender Systems

Title Learning Tree-based Deep Model for Recommender Systems
Authors Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, Kun Gai
Abstract Model-based methods for recommender systems have been studied extensively in recent years. In systems with a large corpus, however, the calculation cost for the learnt model to predict all user-item preferences is tremendous, which makes full-corpus retrieval extremely difficult. To overcome the calculation barriers, models such as matrix factorization resort to inner product form (i.e., model user-item preference as the inner product of user, item latent factors) and indexes to facilitate efficient approximate k-nearest neighbor searches. However, it remains challenging to incorporate more expressive interaction forms between user and item features, e.g., interactions through deep neural networks, because of the calculation cost. In this paper, we focus on the problem of introducing arbitrary advanced models to recommender systems with large corpus. We propose a novel tree-based method which can provide logarithmic complexity w.r.t. corpus size even with more expressive models such as deep neural networks. Our main idea is to predict user interests from coarse to fine by traversing tree nodes in a top-down fashion and making decisions for each user-node pair. We also show that the tree structure can be jointly learnt towards better compatibility with users’ interest distribution and hence facilitate both training and prediction. Experimental evaluations with two large-scale real-world datasets show that the proposed method significantly outperforms traditional methods. Online A/B test results in Taobao display advertising platform also demonstrate the effectiveness of the proposed method in production environments.
Tasks Recommendation Systems
Published 2018-01-08
URL http://arxiv.org/abs/1801.02294v5
PDF http://arxiv.org/pdf/1801.02294v5.pdf
PWC https://paperswithcode.com/paper/learning-tree-based-deep-model-for
Repo https://github.com/baldandbrave/RecSysCOEN6313
Framework tf
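
The top-down retrieval can be sketched with an implicit binary tree and a fixed beam per level; the scorer below is a deterministic mock standing in for the learned deep user-node preference model, and the depth and beam width are assumptions.

```python
# TDM-style retrieval sketch: traverse a binary tree over items top-down,
# keeping a beam of the highest-scoring nodes per level, so retrieval cost is
# logarithmic in corpus size.
import heapq
import random

DEPTH, BEAM = 10, 4                           # 2**10 = 1024 leaf items (assumed)

def score(user, node):
    # Stand-in for a learned user-node preference network.
    random.seed(hash((user, node)) % (2 ** 32))
    return random.random()

def retrieve(user, k=BEAM):
    frontier = [1]                            # root of an implicit binary tree
    for _ in range(DEPTH):
        children = [c for n in frontier for c in (2 * n, 2 * n + 1)]
        frontier = heapq.nlargest(k, children, key=lambda n: score(user, n))
    return frontier                           # leaf nodes = recommended items

print(retrieve(user=42))
```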

Character-level Recurrent Neural Networks in Practice: Comparing Training and Sampling Schemes

Title Character-level Recurrent Neural Networks in Practice: Comparing Training and Sampling Schemes
Authors Cedric De Boom, Thomas Demeester, Bart Dhoedt
Abstract Recurrent neural networks are nowadays successfully used in an abundance of applications, ranging from text, speech and image processing to recommender systems. Backpropagation through time is the algorithm commonly used to train these networks on specific tasks. Many deep learning frameworks have their own implementation of training and sampling procedures for recurrent neural networks, while there are in fact multiple other possibilities to choose from and other parameters to tune. In the existing literature this is often overlooked or ignored. In this paper we therefore give an overview of possible training and sampling schemes for character-level recurrent neural networks to solve the task of predicting the next token in a given sequence. We test these different schemes on a variety of datasets, neural network architectures and parameter settings, and formulate a number of take-home recommendations. The choice of training and sampling scheme turns out to be subject to a number of trade-offs, such as training stability, sampling time, model performance and implementation effort, but is largely independent of the data. Perhaps the most surprising result is that transferring hidden states for correctly initializing the model on subsequences often leads to unstable training behavior depending on the dataset.
Tasks Recommendation Systems
Published 2018-01-02
URL http://arxiv.org/abs/1801.00632v2
PDF http://arxiv.org/pdf/1801.00632v2.pdf
PWC https://paperswithcode.com/paper/character-level-recurrent-neural-networks-in
Repo https://github.com/cedricdeboom/character-level-rnn-datasets
Framework none
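
The two batching regimes the paper compares can be sketched as pure slicing over an integer-encoded character stream; the RNN and the hidden-state transfer itself are omitted, and the sequence and batch sizes are arbitrary.

```python
# Scheme A: independent windows (hidden state reset every batch).
# Scheme B: contiguous lanes, where batch t+1 continues batch t, so the hidden
# state can be carried across consecutive batches.
import numpy as np

text = np.arange(1000)                 # stand-in for an encoded character stream
seq_len, batch_size = 50, 4

# Scheme A: random windows; state is re-initialized for each batch.
starts = np.random.default_rng(0).integers(0, len(text) - seq_len - 1, batch_size)
windows = np.stack([text[s:s + seq_len] for s in starts])

# Scheme B: split the stream into contiguous lanes and step through them.
lanes = text[:len(text) // batch_size * batch_size].reshape(batch_size, -1)
stream_batches = [lanes[:, i:i + seq_len]
                  for i in range(0, lanes.shape[1] - seq_len, seq_len)]

print(windows.shape, len(stream_batches), stream_batches[0].shape)
```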

Contrastive Explanations with Local Foil Trees

Title Contrastive Explanations with Local Foil Trees
Authors Jasper van der Waa, Marcel Robeer, Jurriaan van Diggelen, Matthieu Brinkhuis, Mark Neerincx
Abstract Recent advances in interpretable Machine Learning (iML) and eXplainable AI (XAI) construct explanations based on the importance of features in classification tasks. However, in a high-dimensional feature space this approach may become infeasible without restraining the set of important features. We propose to utilize the human tendency to ask questions like “Why this output (the fact) instead of that output (the foil)?” to reduce the number of features to those that play a main role in the contrast being asked about. Our proposed method utilizes locally trained one-versus-all decision trees to identify the disjoint set of rules that causes the tree to classify data points as the foil and not as the fact. In this study we illustrate this approach on three benchmark classification tasks.
Tasks Interpretable Machine Learning
Published 2018-06-19
URL http://arxiv.org/abs/1806.07470v1
PDF http://arxiv.org/pdf/1806.07470v1.pdf
PWC https://paperswithcode.com/paper/contrastive-explanations-with-local-foil
Repo https://github.com/MarcelRobeer/ContrastiveExplanation
Framework none
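
A hedged sketch of the pipeline on toy data: perturb around the instance, label the samples with the black box, fit a one-versus-all "foil" tree, and read off the rules along the instance's decision path. The sampling scheme, tree depth, and the choice of black box are assumptions, not the authors' defaults.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]                                     # instance to explain
fact = black_box.predict([x])[0]
foil = 1                                     # "why fact instead of foil?"

# Local neighbourhood labelled by the black box, one-vs-all for the foil class.
rng = np.random.default_rng(0)
local_X = x + rng.normal(0, X.std(0), size=(2000, X.shape[1]))
local_y = (black_box.predict(local_X) == foil).astype(int)
foil_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(local_X, local_y)

# Rules on the instance's path explain what separates it from the foil class.
t = foil_tree.tree_
node, rules = 0, []
while t.children_left[node] != -1:           # -1 marks a leaf
    f, thr = t.feature[node], t.threshold[node]
    if x[f] <= thr:
        rules.append(f"feature[{f}] <= {thr:.2f}")
        node = t.children_left[node]
    else:
        rules.append(f"feature[{f}] > {thr:.2f}")
        node = t.children_right[node]
print(f"fact={fact}, foil={foil}:", " AND ".join(rules))
```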

Non-Projective Dependency Parsing via Latent Heads Representation (LHR)

Title Non-Projective Dependency Parsing via Latent Heads Representation (LHR)
Authors Matteo Grella, Simone Cangialosi
Abstract In this paper, we introduce a novel approach based on a bidirectional recurrent autoencoder to perform globally optimized non-projective dependency parsing via semi-supervised learning. The syntactic analysis is completed at the end of the neural process that generates a Latent Heads Representation (LHR), without any algorithmic constraint and with linear complexity. The resulting “latent syntactic structure” can be used directly in other semantic tasks. The LHR is transformed into the usual dependency tree by computing a simple vector similarity. We believe that our model has the potential to compete with much more complex state-of-the-art parsing architectures.
Tasks Dependency Parsing
Published 2018-02-06
URL http://arxiv.org/abs/1802.02116v1
PDF http://arxiv.org/pdf/1802.02116v1.pdf
PWC https://paperswithcode.com/paper/non-projective-dependency-parsing-via-latent
Repo https://github.com/GrellaCangialosi/LHRParser
Framework none
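
The decoding step, turning latent head vectors into arcs by vector similarity, can be sketched as follows; random vectors stand in for the autoencoder's outputs, and cosine similarity is an assumed choice of metric.

```python
# LHR-style decoding sketch: each token's predicted latent-head vector is
# matched to the most similar token encoding (plus a virtual root), giving a
# head for every token with no parsing algorithm.
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 32
encodings = rng.normal(size=(n, d))         # token encodings
latent_heads = rng.normal(size=(n, d))      # predicted latent head per token
root = rng.normal(size=(1, d))              # virtual root vector

candidates = np.vstack([root, encodings])   # index 0 = root, i + 1 = token i

def unit(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

sim = unit(latent_heads) @ unit(candidates).T   # (n, n+1) cosine similarities
np.fill_diagonal(sim[:, 1:], -np.inf)           # a token cannot head itself
heads = sim.argmax(axis=1)                      # 0 = root, else head token + 1
print(heads)
```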

Sdf-GAN: Semi-supervised Depth Fusion with Multi-scale Adversarial Networks

Title Sdf-GAN: Semi-supervised Depth Fusion with Multi-scale Adversarial Networks
Authors Can Pu, Runzi Song, Radim Tylecek, Nanbo Li, Robert B Fisher
Abstract Refining raw disparity maps from different algorithms to exploit their complementary advantages is still challenging. Uncertainty estimation and complex disparity relationships among pixels limit the accuracy and robustness of existing methods, and there is no standard method for fusing different kinds of depth data. In this paper, we introduce a new method to fuse disparity maps from different sources, while incorporating supplementary information (intensity, gradient, etc.) into a refiner network to better refine raw disparity inputs. A discriminator network classifies disparities at different receptive fields and scales. Assuming a Markov Random Field for the refined disparity map produces better estimates of the true disparity distribution. Both fully supervised and semi-supervised versions of the algorithm are proposed. The approach includes a more robust loss function to inpaint invalid disparity values and requires much less labeled data to train in the semi-supervised learning mode. The algorithm can be generalized to fuse depths from different kinds of depth sources. Experiments explored different fusion opportunities: stereo-monocular fusion, stereo-ToF fusion and stereo-stereo fusion. The experiments show the superiority of the proposed algorithm compared with the most recent algorithms on public synthetic datasets (Scene Flow, SYNTH3, our synthetic garden dataset) and real datasets (the KITTI 2015 dataset and the Trimbot2020 Garden dataset).
Tasks
Published 2018-03-18
URL https://arxiv.org/abs/1803.06657v3
PDF https://arxiv.org/pdf/1803.06657v3.pdf
PWC https://paperswithcode.com/paper/sdf-gan-semi-supervised-depth-fusion-with
Repo https://github.com/jcshim/sh
Framework none
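
A sketch of the masked refinement loss, assuming a toy refiner, random inputs, and a validity mask: invalid disparities contribute nothing to the loss, so the network is free to inpaint them. The real architecture, guidance channels, and the adversarial and MRF terms are omitted.

```python
import torch
import torch.nn as nn

refiner = nn.Sequential(                       # toy stand-in for the refiner net
    nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

disp_a = torch.rand(1, 1, 64, 64)              # raw disparity from algorithm A
disp_b = torch.rand(1, 1, 64, 64)              # raw disparity from algorithm B
intensity = torch.rand(1, 1, 64, 64)           # supplementary guidance
gradient = intensity - torch.roll(intensity, 1, dims=-1)
gt = torch.rand(1, 1, 64, 64)
valid = (torch.rand(1, 1, 64, 64) > 0.2).float()   # 0 where disparity is invalid

refined = refiner(torch.cat([disp_a, disp_b, intensity, gradient], dim=1))
# Masked L1: only valid pixels are penalized; holes are inpainted by the net.
loss = ((refined - gt).abs() * valid).sum() / valid.sum().clamp(min=1)
loss.backward()
print(float(loss))
```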

eXclusive Autoencoder (XAE) for Nucleus Detection and Classification on Hematoxylin and Eosin (H&E) Stained Histopathological Images

Title eXclusive Autoencoder (XAE) for Nucleus Detection and Classification on Hematoxylin and Eosin (H&E) Stained Histopathological Images
Authors Chao-Hui Huang, Daniel Racoceanu
Abstract In this paper, we introduce a novel feature extraction approach, named the exclusive autoencoder (XAE), a supervised version of the autoencoder (AE) that largely improves the performance of nucleus detection and classification on hematoxylin and eosin (H&E) histopathological images. The proposed XAE can be used in any AE-based algorithm, as long as the data labels are also provided in the feature extraction phase. In the experiments, we evaluated an approach combining an XAE with a fully connected neural network (FCN) and compared it with several AE-based methods. For a nucleus detection problem (considered as a nucleus/non-nucleus classification problem) on breast cancer H&E images, the F-score of the proposed XAE+FCN approach reached 96.64% while the state of the art was at 84.49%. For nucleus classification on colorectal cancer H&E images annotated with four nucleus categories (epithelial, inflammatory, fibroblast and miscellaneous), the F-score of the proposed method reached 70.4%. We also propose a lymphocyte segmentation method based on nucleus detection and classification: in the detection step, performance improved over the cutting-edge approach from 90% to 98.67%, and the obtained Dice coefficient for segmentation reached 88.31% while the cutting-edge approach was at 74%.
Tasks
Published 2018-11-27
URL http://arxiv.org/abs/1811.11243v1
PDF http://arxiv.org/pdf/1811.11243v1.pdf
PWC https://paperswithcode.com/paper/exclusive-autoencoder-xae-for-nucleus
Repo https://github.com/huangch/xae4hne
Framework tf
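
One plausible reading of the "exclusive" mechanism, sketched in PyTorch: the latent code is partitioned into per-class blocks, and during training only the block belonging to the sample's label is allowed to be active. This is an illustrative interpretation under that assumption, not the released TensorFlow code.

```python
import torch
import torch.nn as nn

n_classes, block = 2, 8
enc = nn.Linear(64, n_classes * block)
dec = nn.Linear(n_classes * block, 64)

x = torch.randn(16, 64)
y = torch.randint(0, n_classes, (16,))

z = torch.relu(enc(x))
mask = torch.zeros(16, n_classes * block)
for c in range(n_classes):                 # activate only the label's latent block
    mask[y == c, c * block:(c + 1) * block] = 1.0

recon = dec(z * mask)                      # reconstruct through the exclusive code
loss = ((recon - x) ** 2).mean()
loss.backward()
print(float(loss))
```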

Physically-Inspired Gaussian Process Models for Post-Transcriptional Regulation in Drosophila

Title Physically-Inspired Gaussian Process Models for Post-Transcriptional Regulation in Drosophila
Authors Andrés F. López-Lopera, Nicolas Durrande, Mauricio A. Alvarez
Abstract The regulatory process of Drosophila is thoroughly studied for understanding a great variety of biological principles. While pattern-forming gene networks are analysed in the transcription step, post-transcriptional events (e.g. translation, protein processing) play an important role in establishing protein expression patterns and levels. Since the post-transcriptional regulation of Drosophila depends on spatiotemporal interactions between mRNAs and gap proteins, proper physically-inspired stochastic models are required to study the link between both quantities. Previous research attempts have shown that using Gaussian processes (GPs) and differential equations leads to promising predictions when analysing regulatory networks. Here we aim at further investigating two types of physically-inspired GP models based on a reaction-diffusion equation where the main difference lies in where the prior is placed. While one of them has been studied previously using protein data only, the other is novel and yields a simple approach requiring only the differentiation of kernel functions. In contrast to other stochastic frameworks, discretising the spatial space is not required here. Both GP models are tested under different conditions depending on the availability of gap gene mRNA expression data. Finally, their performances are assessed on a high-resolution dataset describing the blastoderm stage of the early embryo of Drosophila melanogaster.
Tasks Gaussian Processes
Published 2018-08-29
URL https://arxiv.org/abs/1808.10026v3
PDF https://arxiv.org/pdf/1808.10026v3.pdf
PWC https://paperswithcode.com/paper/physically-inspired-gaussian-processes-for
Repo https://github.com/anfelopera/PhysicallyGPDrosophila
Framework none
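
The "differentiation of kernel functions" idea can be sketched in one dimension: a linear operator applied to a GP (here a plain derivative, standing in for the reaction-diffusion operator) yields cross-covariances obtained by differentiating the kernel, so one quantity can be predicted from observations of the other.

```python
# Toy 1D example: given noiseless observations of f with an RBF GP prior,
# the posterior mean of f' uses only the differentiated kernel.
import numpy as np

def k(x, y, l=1.0):             # RBF kernel
    return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2 * l ** 2))

def dk(x, y, l=1.0):            # d/dx k(x, y) = Cov(f'(x), f(y))
    return -(x[:, None] - y[None, :]) / l ** 2 * k(x, y, l)

X = np.linspace(0, 5, 8)
f = np.sin(X)                   # observations of f

Xs = np.linspace(0, 5, 100)
mean_df = dk(Xs, X) @ np.linalg.solve(k(X, X) + 1e-8 * np.eye(len(X)), f)
print(np.abs(mean_df - np.cos(Xs)).max())   # roughly matches the true derivative
```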

Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning

Title Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning
Authors Matthew Jagielski, Alina Oprea, Battista Biggio, Chang Liu, Cristina Nita-Rotaru, Bo Li
Abstract As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms. In this paper, we perform the first systematic study of poisoning attacks and their countermeasures for linear regression models. In poisoning attacks, attackers deliberately influence the training data to manipulate the results of a predictive model. We propose a theoretically-grounded optimization framework specifically designed for linear regression and demonstrate its effectiveness on a range of datasets and models. We also introduce a fast statistical attack that requires limited knowledge of the training process. Finally, we design a new principled defense method that is highly resilient against all poisoning attacks. We provide formal guarantees about its convergence and an upper bound on the effect of poisoning attacks when the defense is deployed. We extensively evaluate our attacks and defenses on three realistic datasets from the health care, loan assessment, and real estate domains.
Tasks
Published 2018-04-01
URL http://arxiv.org/abs/1804.00308v1
PDF http://arxiv.org/pdf/1804.00308v1.pdf
PWC https://paperswithcode.com/paper/manipulating-machine-learning-poisoning
Repo https://github.com/jagielski/manip-ml
Framework none
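
The defense can be sketched as trimmed least squares in the spirit of the paper's approach: alternately fit a linear model on the points with the smallest residuals, which down-weights poisoned samples. Toy data; the clean-sample count is assumed known here, which is a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clean, n_poison = 200, 20
X = rng.normal(size=(n_clean, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 0.1, n_clean)
Xp = rng.normal(size=(n_poison, 1))
yp = -10.0 * Xp[:, 0] + 5.0                       # adversarially crafted points
Xa, ya = np.vstack([X, Xp]), np.concatenate([y, yp])

def fit(X, y):                                    # least squares with intercept
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.lstsq(A, y, rcond=None)[0]

w = fit(Xa, ya)
for _ in range(20):                               # trimmed-loss alternation
    A = np.hstack([Xa, np.ones((len(Xa), 1))])
    resid = (A @ w - ya) ** 2
    keep = np.argsort(resid)[:n_clean]            # keep the best-fitting points
    w = fit(Xa[keep], ya[keep])

print("naive slope:", fit(Xa, ya)[0], "trimmed slope:", w[0])
```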

Estimating Time-Varying Graphical Models

Title Estimating Time-Varying Graphical Models
Authors Jilei Yang, Jie Peng
Abstract In this paper, we study time-varying graphical models based on data measured over a temporal grid. Such models are motivated by the needs to describe and understand evolving interacting relationships among a set of random variables in many real applications, for instance the study of how stocks interact with each other and how such interactions change over time. We propose a new model, LOcal Group Graphical Lasso Estimation (loggle), under the assumption that the graph topology changes gradually over time. Specifically, loggle uses a novel local group-lasso type penalty to efficiently incorporate information from neighboring time points and to impose structural smoothness of the graphs. We implement an ADMM based algorithm to fit the loggle model. This algorithm utilizes blockwise fast computation and pseudo-likelihood approximation to improve computational efficiency. An R package loggle has also been developed. We evaluate the performance of loggle by simulation experiments. We also apply loggle to S&P 500 stock price data and demonstrate that loggle is able to reveal the interacting relationships among stocks and among industrial sectors in a time period that covers the recent global financial crisis.
Tasks
Published 2018-04-11
URL http://arxiv.org/abs/1804.03811v1
PDF http://arxiv.org/pdf/1804.03811v1.pdf
PWC https://paperswithcode.com/paper/estimating-time-varying-graphical-models
Repo https://github.com/jlyang1990/loggle_test
Framework none
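
A simplified sketch of the local estimation idea: at each time point, form a kernel-weighted sample covariance from neighbouring observations and apply a graphical-lasso penalty. loggle's group penalty tying neighbouring graphs together is replaced here by independent per-time fits for brevity, and the bandwidth and penalty values are arbitrary.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

rng = np.random.default_rng(0)
T, p = 100, 5
X = rng.normal(size=(T, p))                 # one observation per time point

def local_precision(t0, bandwidth=10.0, alpha=0.2):
    w = np.exp(-0.5 * ((np.arange(T) - t0) / bandwidth) ** 2)
    w /= w.sum()
    mu = w @ X
    S = (X - mu).T @ ((X - mu) * w[:, None])   # weighted covariance at time t0
    return graphical_lasso(S, alpha=alpha)[1]  # sparse precision matrix

theta = local_precision(t0=50)
print((np.abs(theta) > 1e-6).astype(int))      # estimated graph adjacency
```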

Object Captioning and Retrieval with Natural Language

Title Object Captioning and Retrieval with Natural Language
Authors Anh Nguyen, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis
Abstract We address the problem of jointly learning vision and language to understand the object in a fine-grained manner. The key idea of our approach is the use of object descriptions to provide the detailed understanding of an object. Based on this idea, we propose two new architectures to solve two related problems: object captioning and natural language-based object retrieval. The goal of the object captioning task is to simultaneously detect the object and generate its associated description, while in the object retrieval task, the goal is to localize an object given an input query. We demonstrate that both problems can be solved effectively using hybrid end-to-end CNN-LSTM networks. The experimental results on our new challenging dataset show that our methods outperform recent methods by a fair margin, while providing a detailed understanding of the object and having fast inference time. The source code will be made available.
Tasks
Published 2018-03-16
URL http://arxiv.org/abs/1803.06152v1
PDF http://arxiv.org/pdf/1803.06152v1.pdf
PWC https://paperswithcode.com/paper/object-captioning-and-retrieval-with-natural
Repo https://github.com/nqanh/object_captioning
Framework tf
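
The retrieval half of the task can be sketched as scoring detected region features against an LSTM encoding of the query and returning the best-matching box; every module, size, and the bilinear scorer below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

vocab, d = 100, 64
embed = nn.Embedding(vocab, d)
lstm = nn.LSTM(d, d, batch_first=True)
score = nn.Bilinear(d, d, 1)                  # region-query compatibility score

query = torch.randint(0, vocab, (1, 5))       # token ids of the language query
_, (h, _) = lstm(embed(query))
q = h[-1]                                     # (1, d) query encoding

regions = torch.randn(8, d)                   # CNN features of 8 candidate boxes
scores = score(regions, q.expand(8, -1)).squeeze(-1)
print("best region:", int(scores.argmax()))
```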

Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints

Title Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
Authors Reza Mahjourian, Martin Wicke, Anelia Angelova
Abstract We present a novel approach for unsupervised learning of depth and ego-motion from monocular video. Unsupervised learning removes the need for separate supervisory signals (depth or ego-motion ground truth, or multi-view video). Prior work in unsupervised depth learning uses pixel-wise or gradient-based losses, which only consider pixels in small local neighborhoods. Our main contribution is to explicitly consider the inferred 3D geometry of the scene, enforcing consistency of the estimated 3D point clouds and ego-motion across consecutive frames. This is a challenging task and is solved by a novel (approximate) backpropagation algorithm for aligning 3D structures. We combine this novel 3D-based loss with 2D losses based on photometric quality of frame reconstructions using estimated depth and ego-motion from adjacent frames. We also incorporate validity masks to avoid penalizing areas in which no useful information exists. We test our algorithm on the KITTI dataset and on a video dataset captured on an uncalibrated mobile phone camera. Our proposed approach consistently improves depth estimates on both datasets, and outperforms the state-of-the-art for both depth and ego-motion. Because we only require a simple video, learning depth and ego-motion on large and varied datasets becomes possible. We demonstrate this by training on the low quality uncalibrated video dataset and evaluating on KITTI, ranking among top performing prior methods which are trained on KITTI itself.
Tasks Depth And Camera Motion, Depth Estimation
Published 2018-02-15
URL http://arxiv.org/abs/1802.05522v2
PDF http://arxiv.org/pdf/1802.05522v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-of-depth-and-ego-motion-2
Repo https://github.com/xinshuoweng/deep_icp_tensorflow
Framework tf
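
The 3D consistency idea can be sketched as backprojecting depth maps to point clouds, moving frame t's cloud into frame t+1 with the estimated ego-motion, and measuring a residual between the clouds. The paper aligns the clouds with an approximate differentiable ICP step; plain brute-force nearest neighbours are used below, and the intrinsics and motion are made-up toy values.

```python
import numpy as np

fx = fy = 50.0
cx = cy = 16.0                                # assumed pinhole intrinsics
H = W = 32
u, v = np.meshgrid(np.arange(W), np.arange(H))

def backproject(depth):
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], -1).reshape(-1, 3)

rng = np.random.default_rng(0)
depth_t = 1.0 + rng.random((H, W))
R = np.eye(3)
t = np.array([0.05, 0.0, 0.02])               # estimated ego-motion (toy)
cloud_t = backproject(depth_t) @ R.T + t      # frame t, moved into frame t+1
cloud_t1 = backproject(depth_t + 0.01)        # frame t+1 (toy)

# Nearest-neighbour residual between the two clouds (3D consistency loss).
d2 = ((cloud_t[:, None, :] - cloud_t1[None, :, :]) ** 2).sum(-1)
loss = np.sqrt(d2.min(axis=1)).mean()
print("3D alignment residual:", loss)
```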