October 15, 2019


Paper Group NANR 197

Training Deep AutoEncoders for Recommender Systems. Variational Inference and Deep Generative Models. Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation. Visually Guided Spatial Relation Extraction from Text. Manzanilla: An Image Annotation Tool for TKB Building. Cross-Modal Ranking with Soft Consistency and Noisy La …

Training Deep AutoEncoders for Recommender Systems


Title	Training Deep AutoEncoders for Recommender Systems
Authors	Oleksii Kuchaiev, Boris Ginsburg
Abstract	This paper proposes a new model for the rating prediction task in recommender systems which significantly outperforms previous state-of-the-art models on a time-split Netflix data set. Our model is based on a deep autoencoder with 6 layers and is trained end-to-end without any layer-wise pre-training. We empirically demonstrate that: a) deep autoencoder models generalize much better than shallow ones, b) non-linear activation functions with negative parts are crucial for training deep models, and c) heavy use of regularization techniques such as dropout is necessary to prevent over-fitting. We also propose a new training algorithm based on iterative output re-feeding to overcome the natural sparseness of collaborative filtering. The new algorithm significantly speeds up training and improves model performance. Our code is publicly available.
Tasks	Recommendation Systems
Published	2018-01-01
URL	https://openreview.net/forum?id=SkNQeiRpb
PDF	https://openreview.net/pdf?id=SkNQeiRpb
PWC	https://paperswithcode.com/paper/training-deep-autoencoders-for-recommender
Repo
Framework
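
The re-feeding idea in the abstract above can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the two-layer network, the SELU activation (chosen only as one example of an activation with a negative part), the convention that a zero means "unrated", and the names `masked_mse` and `forward`. Gradient updates are omitted, so this shows only the two forward passes and their losses, not the authors' implementation.

```python
import numpy as np

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    """SELU activation: keeps a non-zero negative part."""
    return scale * np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def masked_mse(pred, target):
    """Mean squared error over observed entries only (zero = unrated, by assumption)."""
    mask = target != 0
    return np.sum(mask * (pred - target) ** 2) / max(mask.sum(), 1)

def forward(x, weights):
    """Tiny fully connected autoencoder with SELU after every layer."""
    h = x
    for W, b in weights:
        h = selu(h @ W + b)
    return h

rng = np.random.default_rng(0)
n_items, hidden = 8, 4
weights = [
    (rng.normal(0.0, 0.1, (n_items, hidden)), np.zeros(hidden)),
    (rng.normal(0.0, 0.1, (hidden, n_items)), np.zeros(n_items)),
]

# Two users with a handful of ratings each; all other entries are unrated.
x = np.zeros((2, n_items))
x[0, [1, 5]] = [4.0, 2.0]
x[1, [0, 3, 7]] = [5.0, 3.0, 1.0]

# Pass 1: sparse input, loss computed on the observed ratings only.
y = forward(x, weights)
loss_sparse = masked_mse(y, x)

# Pass 2 (re-feeding): the dense output y serves as both input and target.
y_refed = forward(y, weights)
loss_refeed = masked_mse(y_refed, y)
```

In a training loop, each of the two losses would be followed by its own weight update; the second pass sees a dense target, which is how re-feeding sidesteps the sparsity of the rating vectors.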

Variational Inference and Deep Generative Models


Title	Variational Inference and Deep Generative Models
Authors	Wilker Aziz, Philip Schulz
Abstract	NLP has seen a surge in neural network models in recent years. These models provide state-of-the-art performance on many supervised tasks. Unsupervised and semi-supervised learning, however, have received comparatively little attention. Deep generative models (DGMs) make it possible to integrate neural networks with probabilistic graphical models. Using DGMs one can easily design latent variable models that account for missing observations and thereby enable unsupervised and semi-supervised learning with neural networks. The method of choice for training these models is variational inference. This tutorial offers a general introduction to variational inference followed by a thorough and example-driven discussion of how to use variational methods for training DGMs. It provides both the mathematical background necessary for deriving the learning algorithms and practical implementation guidelines. Importantly, the tutorial will cover models with continuous and discrete variables. We provide practical coding exercises implemented in IPython notebooks as well as short notes on the more intricate mathematical details that the audience can use as a reference after the tutorial. We expect that with these additional materials the tutorial will have a long-lasting impact on the community.
Tasks	Latent Variable Models, Machine Translation, Natural Language Inference
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-5003/
PDF	https://www.aclweb.org/anthology/P18-5003
PWC	https://paperswithcode.com/paper/variational-inference-and-deep-generative
Repo
Framework
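
For readers unfamiliar with the training objective mentioned in the abstract above, the standard evidence lower bound (ELBO) that variational inference maximizes can be written as follows. The notation is an assumption made here, not taken from the tutorial: $x$ is an observation, $z$ a latent variable, $p_\theta$ the generative model and $q_\phi$ the variational approximation.

$$
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x, z) - \log q_\phi(z \mid x)\big] \;=\; \mathrm{ELBO}(\theta, \phi; x)
$$

$$
\log p_\theta(x) - \mathrm{ELBO}(\theta, \phi; x) \;=\; \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\big) \;\ge\; 0
$$

The second line shows why maximizing the ELBO over $\phi$ tightens the bound: the gap is exactly the KL divergence between the approximation and the true posterior.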

Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation


Title	Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation
Authors	Nima Pourdamghani, Marjan Ghazvininejad, Kevin Knight
Abstract	We present a method for improving word alignments using word similarities. This method is based on encouraging common alignment links between semantically similar words. We use word vectors trained on monolingual data to estimate similarity. Our experiments on translating fifteen languages into English show consistent BLEU score improvements across the languages.
Tasks	Machine Translation, Morphological Analysis, Word Alignment
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2083/
PDF	https://www.aclweb.org/anthology/N18-2083
PWC	https://paperswithcode.com/paper/using-word-vectors-to-improve-word-alignments
Repo
Framework

Visually Guided Spatial Relation Extraction from Text


Title	Visually Guided Spatial Relation Extraction from Text
Authors	Taher Rahgooy, Umar Manzoor, Parisa Kordjamshidi
Abstract	Extraction of spatial relations from sentences with complex/nesting relationships is very challenging, as it often requires resolving inherent semantic ambiguities. We seek help from the visual modality to fill the information gap in the text modality and resolve spatial semantic ambiguities. We use various recent vision and language datasets and techniques to train inter-modality alignment models and visual relationship classifiers, and we propose a novel global inference model to integrate these components into our structured output prediction model for spatial role and relation extraction. Our global inference model enables us to utilize the visual and geometric relationships between objects and improves the state-of-the-art results of spatial information extraction from text.
Tasks	Activity Recognition, Image Captioning, Image Retrieval, Object Localization, Question Answering, Relation Extraction, Visual Question Answering
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2124/
PDF	https://www.aclweb.org/anthology/N18-2124
PWC	https://paperswithcode.com/paper/visually-guided-spatial-relation-extraction
Repo
Framework

Manzanilla: An Image Annotation Tool for TKB Building


Title	Manzanilla: An Image Annotation Tool for TKB Building
Authors	Arianne Reimerink, Pilar León-Araúz
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1170/
PDF	https://www.aclweb.org/anthology/L18-1170
PWC	https://paperswithcode.com/paper/manzanilla-an-image-annotation-tool-for-tkb
Repo
Framework

Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking


Title	Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking
Authors	Chenglong Li, Chengli Zhu, Yan Huang, Jin Tang, Liang Wang
Abstract	Due to the complementary benefits of visible (RGB) and thermal infrared (T) data, RGB-T object tracking has recently attracted increasing attention for boosting performance under adverse illumination conditions. Existing RGB-T tracking methods usually localize a target object with a bounding box, in which the trackers or detectors are often affected by the inclusion of background clutter. To address this problem, this paper presents a novel approach to suppress background effects for RGB-T tracking. Our approach relies on a novel cross-modal manifold ranking algorithm. First, we integrate the soft cross-modality consistency into the ranking model, which allows the sparse inconsistency to account for the different properties of the two modalities. Second, we propose an optimal query learning method to handle label noise in the queries. In particular, we introduce an intermediate variable to represent the optimal labels, and formulate it as an $l_1$-optimization based sparse learning problem. Moreover, we propose a single unified optimization algorithm to solve the proposed model with stable and efficient convergence behavior. Finally, the ranking results are incorporated into the patch-based object features to address the background effects, and the structured SVM is then adopted to perform RGB-T tracking. Extensive experiments suggest that the proposed approach performs well against the state-of-the-art methods on large-scale benchmark datasets.
Tasks	Object Tracking, Rgb-T Tracking, Sparse Learning
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Chenglong_Li_Cross-Modal_Ranking_with_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Chenglong_Li_Cross-Modal_Ranking_with_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/cross-modal-ranking-with-soft-consistency-and
Repo
Framework
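
As background for the ranking step described in the abstract above, here is a minimal sketch of classic single-graph manifold ranking, which scores nodes by solving $(I - \alpha S) f = y$ on a normalized affinity graph. This is not the paper's cross-modal model: it has no soft consistency term and no $l_1$ label learning. The function name, the Gaussian affinity, and the toy patch features are all assumptions made for illustration.

```python
import numpy as np

def manifold_ranking(features, query, alpha=0.9, sigma=1.0):
    """Classic manifold ranking on a fully connected Gaussian affinity graph."""
    # Pairwise squared distances between node features.
    sq = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)              # no self-loops
    d = W.sum(axis=1)                     # node degrees
    S = W / np.sqrt(np.outer(d, d))       # symmetric normalization D^-1/2 W D^-1/2
    # Closed-form ranking scores, up to a constant factor.
    return np.linalg.solve(np.eye(len(W)) - alpha * S, query)

# Toy data: two well-separated groups of "patches" (hypothetical features).
features = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # group A
    [3.0, 3.0], [3.1, 3.0], [3.0, 3.1],   # group B
])
query = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # one labelled node in group A

scores = manifold_ranking(features, query)
```

Nodes that sit on the same cluster as the query receive much higher scores than nodes in the other cluster, which is the property a tracker can exploit to down-weight background patches inside a bounding box.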

An inference-based policy gradient method for learning options


Title	An inference-based policy gradient method for learning options
Authors	Matthew J. A. Smith, Herke van Hoof, Joelle Pineau
Abstract	In the pursuit of increasingly intelligent learning systems, abstraction plays a vital role in enabling sophisticated decisions to be made in complex environments. The options framework provides a formalism for such abstraction over sequences of decisions. However, most models require that options be given a priori, presumably specified by hand, which is neither efficient nor scalable. Indeed, it is preferable to learn options directly from interaction with the environment. Despite several efforts, this remains a difficult problem: many approaches require access to a model of the environmental dynamics, and inferred options are often not interpretable, which limits our ability to explain the system behavior for verification or debugging purposes. In this work we develop a novel policy gradient method for the automatic learning of policies with options. This algorithm uses inference methods to simultaneously improve all of the options available to an agent, and thus can be employed in an off-policy manner, without observing option labels. Experimental results show that the options learned can be interpreted. Further, we find that the method presented here is more sample efficient than existing methods, leading to faster and more stable learning of policies with options.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=rJIgf7bAZ
PDF	https://openreview.net/pdf?id=rJIgf7bAZ
PWC	https://paperswithcode.com/paper/an-inference-based-policy-gradient-method-for
Repo
Framework

Autostacker: an Automatic Evolutionary Hierarchical Machine Learning System


Title	Autostacker: an Automatic Evolutionary Hierarchical Machine Learning System
Authors	Boyuan Chen, Warren Mo, Ishanu Chattopadhyay, Hod Lipson
Abstract	This work provides an automatic machine learning (AutoML) modelling architecture called Autostacker. Autostacker improves the prediction accuracy of machine learning baselines by utilizing an innovative hierarchical stacking architecture and an efficient parameter search algorithm. Neither prior domain knowledge about the data nor feature preprocessing is needed. We significantly reduce the time of AutoML with a naturally inspired algorithm, Parallel Hill Climbing (PHC). By parallelizing PHC, Autostacker can provide candidate pipelines with sufficient prediction accuracy within a short amount of time. These pipelines can be used as is or as a starting point for human experts to build on. By focusing on the modelling process, Autostacker breaks the tradition of following fixed-order pipelines by exploring not only single-model pipelines but also innovative combinations and structures. As we will show in the experiment section, Autostacker achieves significantly better performance in terms of both test accuracy and time cost compared with initial human trials and a recent popular AutoML system.
Tasks	AutoML
Published	2018-01-01
URL	https://openreview.net/forum?id=SyvCD-b0W
PDF	https://openreview.net/pdf?id=SyvCD-b0W
PWC	https://paperswithcode.com/paper/autostacker-an-automatic-evolutionary
Repo
Framework
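
The search strategy named in the abstract above can be sketched generically. The code below is a plain hill climber started from several points, with the best result kept; the climbers run one after another here, although they are independent and could run in parallel. The toy objective, the grid neighbourhood, and all function names are hypothetical, and the real system searches over pipeline configurations that the abstract does not specify.

```python
import random

def hill_climb(score, start, neighbors, steps, rng):
    """Greedy local search: accept a random neighbour only if it scores higher."""
    best, best_score = start, score(start)
    for _ in range(steps):
        candidate = rng.choice(neighbors(best))
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score

def parallel_hill_climbing(score, starts, neighbors, steps=200, seed=0):
    """Run one independent climber per start point and keep the best result."""
    rng = random.Random(seed)
    results = [hill_climb(score, s, neighbors, steps, rng) for s in starts]
    return max(results, key=lambda r: r[1])

# Toy stand-in for "pipeline quality": peak at (3, -1) on an integer grid.
def toy_score(p):
    return -((p[0] - 3) ** 2 + (p[1] + 1) ** 2)

def grid_neighbors(p):
    return [(p[0] + dx, p[1] + dy) for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]]

best, best_score = parallel_hill_climbing(
    toy_score, starts=[(0, 0), (10, 10), (-5, 4)], neighbors=grid_neighbors
)
```

Multiple start points reduce the chance that every climber stalls in the same poor local optimum, at the cost of more evaluations.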

Learning Attribute Representations With Localization for Flexible Fashion Search


Title	Learning Attribute Representations With Localization for Flexible Fashion Search
Authors	Kenan E. Ak, Ashraf A. Kassim, Joo Hwee Lim, Jo Yew Tham
Abstract	In this paper, we investigate ways of conducting a detailed fashion search using query images and attributes. A credible fashion search platform should be able to (1) find images that share the same attributes as the query image, (2) allow users to manipulate certain attributes, e.g. replace collar attribute from round to v-neck, and (3) handle region-specific attribute manipulations, e.g. replacing the color attribute of the sleeve region without changing the color attribute of other regions. A key challenge to be addressed is that fashion products have multiple attributes and it is important for each of these attributes to have representative features. To address these challenges, we propose the FashionSearchNet which uses a weakly supervised localization method to extract regions of attributes. By doing so, unrelated features can be ignored thus improving the similarity learning. Also, FashionSearchNet incorporates a new procedure that enables region awareness to be able to handle region-specific requests. FashionSearchNet outperforms the most recent fashion search techniques and is shown to be able to carry out different search scenarios using the dynamic queries.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Ak_Learning_Attribute_Representations_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Ak_Learning_Attribute_Representations_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/learning-attribute-representations-with
Repo
Framework

Model Specialization for Inference Via End-to-End Distillation, Pruning, and Cascades


Title	Model Specialization for Inference Via End-to-End Distillation, Pruning, and Cascades
Authors	Daniel Kang, Karey Shi, Thao Ngyuen, Stephanie Mallard, Peter Bailis, Matei Zaharia
Abstract	The availability of general-purpose reference and benchmark datasets such as ImageNet has spurred the development of popular general-purpose reference model architectures and pre-trained weights. However, in practice, neural networks are often employed to perform specific, more restrictive tasks that are narrower in scope and complexity. Thus, simply fine-tuning or transfer learning from a general-purpose network inherits a large computational cost that may not be necessary for a given task. In this work, we investigate the potential for model specialization, or reducing a model’s computational footprint by leveraging task-specific knowledge, such as a restricted inference distribution. We study three methods for model specialization: 1) task-aware distillation, 2) task-aware pruning, and 3) specialized model cascades. We evaluate their performance on a range of classification tasks. Moreover, for the first time, we investigate how these techniques complement one another, enabling up to 5× speedups with no loss in accuracy and 9.8× speedups while remaining within 2.5% of a highly accurate ResNet on specialized image classification tasks. These results suggest that simple and easy-to-implement specialization procedures may benefit a large number of practical applications in which the representational power of general-purpose networks need not be inherited.
Tasks	Image Classification
Published	2018-01-01
URL	https://openreview.net/forum?id=ryOG3fWCW
PDF	https://openreview.net/pdf?id=ryOG3fWCW
PWC	https://paperswithcode.com/paper/model-specialization-for-inference-via-end-to
Repo
Framework
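
Of the three techniques listed in the abstract above, the cascade is the simplest to illustrate. The sketch below assumes a confidence-threshold rule: a cheap model answers when its top class probability is high enough, and only the remaining inputs are passed to an expensive model. The threshold value, the function name `cascade_predict`, and the two toy models are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def cascade_predict(x, small_model, large_model, threshold=0.9):
    """Use the small model where it is confident; fall back to the large one elsewhere."""
    probs = small_model(x)
    confident = probs.max(axis=1) >= threshold
    preds = probs.argmax(axis=1)
    if not confident.all():
        # Only the uncertain inputs pay the cost of the large model.
        preds[~confident] = large_model(x[~confident]).argmax(axis=1)
    return preds, confident

# Toy binary classifiers (hypothetical): each reads one feature as P(class 1).
def small_model(x):
    return np.stack([1.0 - x[:, 0], x[:, 0]], axis=1)

def large_model(x):
    return np.stack([1.0 - x[:, 1], x[:, 1]], axis=1)

x = np.array([
    [0.95, 0.0],   # small model is confident: class 1
    [0.50, 0.8],   # small model is unsure, so the large model decides
    [0.05, 1.0],   # small model is confident: class 0
])
preds, confident = cascade_predict(x, small_model, large_model)
```

The speedup of a cascade depends on how often the small model is confident, so the threshold trades accuracy against the fraction of inputs that reach the large model.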

TRL: Discriminative Hints for Scalable Reverse Curriculum Learning


Title	TRL: Discriminative Hints for Scalable Reverse Curriculum Learning
Authors	Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu
Abstract	Deep reinforcement learning algorithms have proven successful in a variety of domains. However, tasks with sparse rewards remain challenging when the state space is large. Goal-oriented tasks are among the most typical problems in this domain, where a reward can only be received when the final goal is accomplished. In this work, we propose a potential solution to such problems with the introduction of an experience-based tendency reward mechanism, which provides the agent with additional hints based on discriminative learning over past experiences during an automated reverse curriculum. This mechanism not only provides dense additional learning signals on what states lead to success, but also allows the agent to retain only this tendency reward instead of the whole history of experience during multi-phase curriculum learning. We extensively study the advantages of our method on standard sparse reward domains such as Maze and Super Mario Bros and show that our method performs more efficiently and robustly than prior approaches in tasks with long time horizons and large state spaces. In addition, we demonstrate that, using an optional keyframe scheme with a very small number of key states, our approach can solve difficult robot manipulation challenges directly from perception and sparse rewards.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=rJssAZ-0-
PDF	https://openreview.net/pdf?id=rJssAZ-0-
PWC	https://paperswithcode.com/paper/trl-discriminative-hints-for-scalable-reverse
Repo
Framework

Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates


Title	Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates
Authors	Quoc Tran Dinh
Abstract	We develop two new non-ergodic alternating proximal augmented Lagrangian algorithms (NEAPAL) to solve a class of nonsmooth constrained convex optimization problems. Our approach relies on a novel combination of the augmented Lagrangian framework, an alternating/linearization scheme, Nesterov’s acceleration techniques, and an adaptive strategy for parameters. Our algorithms have several new features compared to existing methods. Firstly, they have a Nesterov acceleration step on the primal variables, rather than on the dual ones as in several methods in the literature. Secondly, they achieve non-ergodic optimal convergence rates under standard assumptions, i.e. an $\mathcal{O}\left(\frac{1}{k}\right)$ rate without any smoothness or strong convexity-type assumption, or an $\mathcal{O}\left(\frac{1}{k^2}\right)$ rate under only semi-strong convexity, where $k$ is the iteration counter. Thirdly, they preserve or have better per-iteration complexity compared to existing algorithms. Fourthly, they can be implemented in a parallel fashion. Finally, all the parameters are adaptively updated without heuristic tuning. We verify our algorithms on different numerical examples and compare them with some state-of-the-art methods.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7730-non-ergodic-alternating-proximal-augmented-lagrangian-algorithms-with-optimal-rates
PDF	http://papers.nips.cc/paper/7730-non-ergodic-alternating-proximal-augmented-lagrangian-algorithms-with-optimal-rates.pdf
PWC	https://paperswithcode.com/paper/non-ergodic-alternating-proximal-augmented
Repo
Framework

Classification of Moral Foundations in Microblog Political Discourse


Title	Classification of Moral Foundations in Microblog Political Discourse
Authors	Kristen Johnson, Dan Goldwasser
Abstract	Previous works in computer science, as well as political and social science, have shown correlation in text between political ideologies and the moral foundations expressed within that text. Additional work has shown that policy frames, which are used by politicians to bias the public towards their stance on an issue, are also correlated with political ideology. Based on these associations, this work takes a first step towards modeling both the language and how politicians frame issues on Twitter, in order to predict the moral foundations that are used by politicians to express their stances on issues. The contributions of this work include a dataset annotated for the moral foundations, annotation guidelines, and probabilistic graphical models which show the usefulness of jointly modeling abstract political slogans, as opposed to the unigrams of previous works, with policy frames for the prediction of the morality underlying political tweets.
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-1067/
PDF	https://www.aclweb.org/anthology/P18-1067
PWC	https://paperswithcode.com/paper/classification-of-moral-foundations-in
Repo
Framework

Proceedings of ACL 2018, System Demonstrations


Title	Proceedings of ACL 2018, System Demonstrations
Authors
Abstract
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-4000/
PDF	https://www.aclweb.org/anthology/P18-4000
PWC	https://paperswithcode.com/paper/proceedings-of-acl-2018-system-demonstrations
Repo
Framework

Understanding Local Minima in Neural Networks by Loss Surface Decomposition


Title	Understanding Local Minima in Neural Networks by Loss Surface Decomposition
Authors	Hanock Kwak, Byoung-Tak Zhang
Abstract	To provide principled ways of designing proper Deep Neural Network (DNN) models, it is essential to understand the loss surface of DNNs under realistic assumptions. We introduce interesting aspects for understanding the local minima and overall structure of the loss surface. The parameter domain of the loss surface can be decomposed into regions in which activation values (zero or one for rectified linear units) are consistent. We found that, in each region, the loss surface has properties similar to those of linear neural networks, where every local minimum is a global minimum. This means that every differentiable local minimum is the global minimum of the corresponding region. We prove this for a neural network with one hidden layer using rectified linear units under realistic assumptions. There are poor regions that lead to poor local minima, and we explain why such regions exist even in overparameterized DNNs.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=SJDYgPgCZ
PDF	https://openreview.net/pdf?id=SJDYgPgCZ
PWC	https://paperswithcode.com/paper/understanding-local-minima-in-neural-networks
Repo
Framework