Paper Group NANR 197
Training Deep AutoEncoders for Recommender Systems
Title | Training Deep AutoEncoders for Recommender Systems |
Authors | Oleksii Kuchaiev, Boris Ginsburg |
Abstract | This paper proposes a new model for the rating prediction task in recommender systems which significantly outperforms previous state-of-the-art models on a time-split Netflix data set. Our model is based on a deep autoencoder with 6 layers and is trained end-to-end without any layer-wise pre-training. We empirically demonstrate that: a) deep autoencoder models generalize much better than the shallow ones, b) non-linear activation functions with negative parts are crucial for training deep models, and c) heavy use of regularization techniques such as dropout is necessary to prevent over-fitting. We also propose a new training algorithm based on iterative output re-feeding to overcome the natural sparseness of collaborative filtering. The new algorithm significantly speeds up training and improves model performance. Our code is publicly available. |
Tasks | Recommendation Systems |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SkNQeiRpb |
https://openreview.net/pdf?id=SkNQeiRpb | |
PWC | https://paperswithcode.com/paper/training-deep-autoencoders-for-recommender |
Repo | |
Framework | |
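The iterative output re-feeding mentioned in the abstract above can be illustrated with a small sketch. This is one plausible reading, not the authors' implementation: the deep autoencoder is swapped for a plain linear map `f(x) = W x`, and the helper names (`predict`, `masked_mse`, `sgd_step`, `train_with_refeeding`) are hypothetical. The first update uses a loss over observed ratings only; the second treats the dense output as a fixed new example, so every entry contributes to the loss.

```python
def predict(W, x):
    """Linear stand-in for the autoencoder (an assumption): f(x) = W x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def masked_mse(pred, target, mask):
    """Mean squared error over the entries where mask is 1."""
    n = sum(mask)
    if n == 0:
        return 0.0
    return sum(m * (p - t) ** 2 for p, t, m in zip(pred, target, mask)) / n


def sgd_step(W, inp, target, mask, lr):
    """One in-place gradient step on masked_mse(predict(W, inp), target, mask)."""
    n = sum(mask)
    if n == 0:
        return
    pred = predict(W, inp)
    for i, row in enumerate(W):
        if mask[i]:
            grad_out = 2.0 * (pred[i] - target[i]) / n
            for j in range(len(row)):
                row[j] -= lr * grad_out * inp[j]


def train_with_refeeding(W, x, lr=0.01):
    """Sparse pass on observed ratings, then one dense re-feeding pass."""
    observed = [1 if v != 0 else 0 for v in x]
    sgd_step(W, x, x, observed, lr)              # loss only on observed ratings
    dense = predict(W, x)                        # output treated as a fixed new example
    sgd_step(W, dense, dense, [1] * len(x), lr)  # every entry now counts


ratings = [5.0, 0.0, 3.0]                        # 0.0 marks an unrated item
W = [[0.0] * 3 for _ in range(3)]
for _ in range(20):
    train_with_refeeding(W, ratings)
print(masked_mse(predict(W, ratings), ratings, [1, 0, 1]))
```

The masked loss is what lets training proceed on sparse rating vectors; the second pass is where the re-feeding happens, since its input and target are both the model's own dense output.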
Variational Inference and Deep Generative Models
Title | Variational Inference and Deep Generative Models |
Authors | Wilker Aziz, Philip Schulz |
Abstract | NLP has seen a surge in neural network models in recent years. These models provide state-of-the-art performance on many supervised tasks. Unsupervised and semi-supervised learning has only been addressed scarcely, however. Deep generative models (DGMs) make it possible to integrate neural networks with probabilistic graphical models. Using DGMs one can easily design latent variable models that account for missing observations and thereby enable unsupervised and semi-supervised learning with neural networks. The method of choice for training these models is variational inference. This tutorial offers a general introduction to variational inference followed by a thorough and example-driven discussion of how to use variational methods for training DGMs. It provides both the mathematical background necessary for deriving the learning algorithms as well as practical implementation guidelines. Importantly, the tutorial will cover models with continuous and discrete variables. We provide practical coding exercises implemented in IPython notebooks as well as short notes on the more intricate mathematical details that the audience can use as a reference after the tutorial. We expect that with these additional materials the tutorial will have a long-lasting impact on the community. |
Tasks | Latent Variable Models, Machine Translation, Natural Language Inference |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-5003/ |
https://www.aclweb.org/anthology/P18-5003 | |
PWC | https://paperswithcode.com/paper/variational-inference-and-deep-generative |
Repo | |
Framework | |
Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation
Title | Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation |
Authors | Nima Pourdamghani, Marjan Ghazvininejad, Kevin Knight |
Abstract | We present a method for improving word alignments using word similarities. This method is based on encouraging common alignment links between semantically similar words. We use word vectors trained on monolingual data to estimate similarity. Our experiments on translating fifteen languages into English show consistent BLEU score improvements across the languages. |
Tasks | Machine Translation, Morphological Analysis, Word Alignment |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2083/ |
https://www.aclweb.org/anthology/N18-2083 | |
PWC | https://paperswithcode.com/paper/using-word-vectors-to-improve-word-alignments |
Repo | |
Framework | |
Visually Guided Spatial Relation Extraction from Text
Title | Visually Guided Spatial Relation Extraction from Text |
Authors | Taher Rahgooy, Umar Manzoor, Parisa Kordjamshidi |
Abstract | Extraction of spatial relations from sentences with complex/nesting relationships is very challenging, as it often requires resolving inherent semantic ambiguities. We seek help from the visual modality to fill the information gap in the text modality and resolve spatial semantic ambiguities. We use various recent vision and language datasets and techniques to train inter-modality alignment models and visual relationship classifiers, and propose a novel global inference model to integrate these components into our structured output prediction model for spatial role and relation extraction. Our global inference model enables us to utilize the visual and geometric relationships between objects and improves the state-of-the-art results of spatial information extraction from text. |
Tasks | Activity Recognition, Image Captioning, Image Retrieval, Object Localization, Question Answering, Relation Extraction, Visual Question Answering |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2124/ |
https://www.aclweb.org/anthology/N18-2124 | |
PWC | https://paperswithcode.com/paper/visually-guided-spatial-relation-extraction |
Repo | |
Framework | |
Manzanilla: An Image Annotation Tool for TKB Building
Title | Manzanilla: An Image Annotation Tool for TKB Building |
Authors | Arianne Reimerink, Pilar León-Araúz |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1170/ |
https://www.aclweb.org/anthology/L18-1170 | |
PWC | https://paperswithcode.com/paper/manzanilla-an-image-annotation-tool-for-tkb |
Repo | |
Framework | |
Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking
Title | Cross-Modal Ranking with Soft Consistency and Noisy Labels for Robust RGB-T Tracking |
Authors | Chenglong Li, Chengli Zhu, Yan Huang, Jin Tang, Liang Wang |
Abstract | Due to the complementary benefits of visible (RGB) and thermal infrared (T) data, RGB-T object tracking has recently attracted increasing attention for boosting performance under adverse illumination conditions. Existing RGB-T tracking methods usually localize a target object with a bounding box, in which the trackers or detectors are often affected by the inclusion of background clutter. To address this problem, this paper presents a novel approach to suppress background effects for RGB-T tracking. Our approach relies on a novel cross-modal manifold ranking algorithm. First, we integrate the soft cross-modality consistency into the ranking model, which allows the sparse inconsistency to account for the different properties between these two modalities. Second, we propose an optimal query learning method to handle label noise in queries. In particular, we introduce an intermediate variable to represent the optimal labels, and formulate it as an $l_1$-optimization based sparse learning problem. Moreover, we propose a single unified optimization algorithm to solve the proposed model with stable and efficient convergence behavior. Finally, the ranking results are incorporated into the patch-based object features to address the background effects, and the structured SVM is then adopted to perform RGB-T tracking. Extensive experiments suggest that the proposed approach performs well against the state-of-the-art methods on large-scale benchmark datasets. |
Tasks | Object Tracking, RGB-T Tracking, Sparse Learning |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Chenglong_Li_Cross-Modal_Ranking_with_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Chenglong_Li_Cross-Modal_Ranking_with_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/cross-modal-ranking-with-soft-consistency-and |
Repo | |
Framework | |
An inference-based policy gradient method for learning options
Title | An inference-based policy gradient method for learning options |
Authors | Matthew J. A. Smith, Herke van Hoof, Joelle Pineau |
Abstract | In the pursuit of increasingly intelligent learning systems, abstraction plays a vital role in enabling sophisticated decisions to be made in complex environments. The options framework provides a formalism for such abstraction over sequences of decisions. However, most models require that options be given a priori, presumably specified by hand, which is neither efficient nor scalable. Indeed, it is preferable to learn options directly from interaction with the environment. Despite several efforts, this remains a difficult problem: many approaches require access to a model of the environmental dynamics, and inferred options are often not interpretable, which limits our ability to explain the system behavior for verification or debugging purposes. In this work we develop a novel policy gradient method for the automatic learning of policies with options. This algorithm uses inference methods to simultaneously improve all of the options available to an agent, and thus can be employed in an off-policy manner, without observing option labels. Experimental results show that the options learned can be interpreted. Further, we find that the method presented here is more sample efficient than existing methods, leading to faster and more stable learning of policies with options. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJIgf7bAZ |
https://openreview.net/pdf?id=rJIgf7bAZ | |
PWC | https://paperswithcode.com/paper/an-inference-based-policy-gradient-method-for |
Repo | |
Framework | |
Autostacker: an Automatic Evolutionary Hierarchical Machine Learning System
Title | Autostacker: an Automatic Evolutionary Hierarchical Machine Learning System |
Authors | Boyuan Chen, Warren Mo, Ishanu Chattopadhyay, Hod Lipson |
Abstract | This work provides an automatic machine learning (AutoML) modelling architecture called Autostacker. Autostacker improves the prediction accuracy of machine learning baselines by utilizing an innovative hierarchical stacking architecture and an efficient parameter search algorithm. Neither prior domain knowledge about the data nor feature preprocessing is needed. We significantly reduce the time of AutoML with a naturally inspired algorithm, Parallel Hill Climbing (PHC). By parallelizing PHC, Autostacker can provide candidate pipelines with sufficient prediction accuracy within a short amount of time. These pipelines can be used as is or as a starting point for human experts to build on. By focusing on the modelling process, Autostacker breaks the tradition of following fixed-order pipelines by exploring not only single-model pipelines but also innovative combinations and structures. As we will show in the experiment section, Autostacker achieves significantly better performance, both in terms of test accuracy and time cost, compared with initial human trials and a recent popular AutoML system. |
Tasks | AutoML |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SyvCD-b0W |
https://openreview.net/pdf?id=SyvCD-b0W | |
PWC | https://paperswithcode.com/paper/autostacker-an-automatic-evolutionary |
Repo | |
Framework | |
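As a rough illustration of the hill-climbing search named in the Autostacker abstract above, the sketch below runs several independent climbers, each replacing its candidate only when a mutated child scores higher. It is a generic sketch under stated assumptions, not Autostacker's search: the candidates are plain numbers rather than pipelines, the climbers run in a sequential loop rather than in parallel, and `parallel_hill_climb`, `score`, and `mutate` are hypothetical names.

```python
import random


def parallel_hill_climb(score, mutate, initial, iterations, rng):
    """Run one hill climber per initial candidate and return the best survivor.

    Each climber is independent of the others, so the inner loop could be
    distributed across workers; it is sequential here to keep the sketch short.
    """
    population = list(initial)
    for _ in range(iterations):
        for i, candidate in enumerate(population):
            child = mutate(candidate, rng)
            if score(child) > score(candidate):   # keep only strict improvements
                population[i] = child
    return max(population, key=score)


# Toy objective (an assumption): maximise -(x - 3)^2, whose optimum is x = 3.
def toy_score(x):
    return -(x - 3.0) ** 2


def toy_mutate(x, rng):
    return x + rng.uniform(-1.0, 1.0)


best = parallel_hill_climb(toy_score, toy_mutate, [0.0, 10.0, -5.0], 200, random.Random(0))
print(best)
```

Because a candidate is only ever replaced by a better one, the returned candidate never scores below the best starting point.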
Learning Attribute Representations With Localization for Flexible Fashion Search
Title | Learning Attribute Representations With Localization for Flexible Fashion Search |
Authors | Kenan E. Ak, Ashraf A. Kassim, Joo Hwee Lim, Jo Yew Tham |
Abstract | In this paper, we investigate ways of conducting a detailed fashion search using query images and attributes. A credible fashion search platform should be able to (1) find images that share the same attributes as the query image, (2) allow users to manipulate certain attributes, e.g. replace collar attribute from round to v-neck, and (3) handle region-specific attribute manipulations, e.g. replacing the color attribute of the sleeve region without changing the color attribute of other regions. A key challenge to be addressed is that fashion products have multiple attributes and it is important for each of these attributes to have representative features. To address these challenges, we propose the FashionSearchNet which uses a weakly supervised localization method to extract regions of attributes. By doing so, unrelated features can be ignored thus improving the similarity learning. Also, FashionSearchNet incorporates a new procedure that enables region awareness to be able to handle region-specific requests. FashionSearchNet outperforms the most recent fashion search techniques and is shown to be able to carry out different search scenarios using the dynamic queries. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Ak_Learning_Attribute_Representations_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Ak_Learning_Attribute_Representations_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-attribute-representations-with |
Repo | |
Framework | |
Model Specialization for Inference Via End-to-End Distillation, Pruning, and Cascades
Title | Model Specialization for Inference Via End-to-End Distillation, Pruning, and Cascades |
Authors | Daniel Kang, Karey Shi, Thao Ngyuen, Stephanie Mallard, Peter Bailis, Matei Zaharia |
Abstract | The availability of general-purpose reference and benchmark datasets such as ImageNet has spurred the development of popular general-purpose reference model architectures and pre-trained weights. However, in practice, neural networks are often employed to perform specific, more restrictive tasks that are narrower in scope and complexity. Thus, simply fine-tuning or transfer learning from a general-purpose network inherits a large computational cost that may not be necessary for a given task. In this work, we investigate the potential for model specialization, or reducing a model's computational footprint by leveraging task-specific knowledge, such as a restricted inference distribution. We study three methods for model specialization: 1) task-aware distillation, 2) task-aware pruning, and 3) specialized model cascades, and evaluate their performance on a range of classification tasks. Moreover, for the first time, we investigate how these techniques complement one another, enabling up to 5× speedups with no loss in accuracy and 9.8× speedups while remaining within 2.5% of a highly accurate ResNet on specialized image classification tasks. These results suggest that simple and easy-to-implement specialization procedures may benefit a large number of practical applications in which the representational power of general-purpose networks need not be inherited. |
Tasks | Image Classification |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=ryOG3fWCW |
https://openreview.net/pdf?id=ryOG3fWCW | |
PWC | https://paperswithcode.com/paper/model-specialization-for-inference-via-end-to |
Repo | |
Framework | |
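The third technique in the abstract above, a specialized model cascade, can be sketched as follows. This is a minimal illustration, assuming each model is a callable returning a `(label, confidence)` pair and that a fixed confidence threshold decides when to defer; the function name, the threshold value, and the toy models are hypothetical and not taken from the paper.

```python
def cascade_predict(x, small_model, large_model, threshold=0.9):
    """Answer with the cheap model when it is confident, else defer.

    Returns (label, which_model_answered).
    """
    label, confidence = small_model(x)
    if confidence >= threshold:
        return label, "small"
    label, _ = large_model(x)
    return label, "large"


# Toy stand-ins (assumptions): the small model is only sure about short inputs.
def toy_small(text):
    return ("short", 0.95) if len(text) < 5 else ("unknown", 0.2)


def toy_large(text):
    return ("long", 0.99)


print(cascade_predict("abc", toy_small, toy_large))
print(cascade_predict("a much longer input", toy_small, toy_large))
```

The speedup in such a cascade comes from how often the first branch is taken, so the threshold trades accuracy against the share of inputs that reach the expensive model.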
TRL: Discriminative Hints for Scalable Reverse Curriculum Learning
Title | TRL: Discriminative Hints for Scalable Reverse Curriculum Learning |
Authors | Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu |
Abstract | Deep reinforcement learning algorithms have proven successful in a variety of domains. However, tasks with sparse rewards remain challenging when the state space is large. Goal-oriented tasks are among the most typical problems in this domain, where a reward can only be received when the final goal is accomplished. In this work, we propose a potential solution to such problems with the introduction of an experience-based tendency reward mechanism, which provides the agent with additional hints based on discriminative learning on past experiences during an automated reverse curriculum. This mechanism not only provides dense additional learning signals on what states lead to success, but also allows the agent to retain only this tendency reward instead of the whole histories of experience during multi-phase curriculum learning. We extensively study the advantages of our method on standard sparse reward domains like Maze and Super Mario Bros and show that our method performs more efficiently and robustly than prior approaches in tasks with long time horizons and large state spaces. In addition, we demonstrate that using an optional keyframe scheme with a very small quantity of key states, our approach can solve difficult robot manipulation challenges directly from perception and sparse rewards. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJssAZ-0- |
https://openreview.net/pdf?id=rJssAZ-0- | |
PWC | https://paperswithcode.com/paper/trl-discriminative-hints-for-scalable-reverse |
Repo | |
Framework | |
Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates
Title | Non-Ergodic Alternating Proximal Augmented Lagrangian Algorithms with Optimal Rates |
Authors | Quoc Tran Dinh |
Abstract | We develop two new non-ergodic alternating proximal augmented Lagrangian algorithms (NEAPAL) to solve a class of nonsmooth constrained convex optimization problems. Our approach relies on a novel combination of the augmented Lagrangian framework, an alternating/linearization scheme, Nesterov's acceleration techniques, and an adaptive strategy for parameters. Our algorithms have several new features compared to existing methods. Firstly, they apply a Nesterov acceleration step to the primal variables, rather than to the dual one as in several methods in the literature. Secondly, they achieve non-ergodic optimal convergence rates under standard assumptions, i.e. an $\mathcal{O}\left(\frac{1}{k}\right)$ rate without any smoothness or strong convexity-type assumption, or an $\mathcal{O}\left(\frac{1}{k^2}\right)$ rate under only semi-strong convexity, where $k$ is the iteration counter. Thirdly, they preserve or have better per-iteration complexity compared to existing algorithms. Fourthly, they can be implemented in a parallel fashion. Finally, all the parameters are adaptively updated without heuristic tuning. We verify our algorithms on different numerical examples and compare them with some state-of-the-art methods. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7730-non-ergodic-alternating-proximal-augmented-lagrangian-algorithms-with-optimal-rates |
http://papers.nips.cc/paper/7730-non-ergodic-alternating-proximal-augmented-lagrangian-algorithms-with-optimal-rates.pdf | |
PWC | https://paperswithcode.com/paper/non-ergodic-alternating-proximal-augmented |
Repo | |
Framework | |
Classification of Moral Foundations in Microblog Political Discourse
Title | Classification of Moral Foundations in Microblog Political Discourse |
Authors | Kristen Johnson, Dan Goldwasser |
Abstract | Previous works in computer science, as well as political and social science, have shown correlation in text between political ideologies and the moral foundations expressed within that text. Additional work has shown that policy frames, which are used by politicians to bias the public towards their stance on an issue, are also correlated with political ideology. Based on these associations, this work takes a first step towards modeling both the language and how politicians frame issues on Twitter, in order to predict the moral foundations that are used by politicians to express their stances on issues. The contributions of this work include a dataset annotated for the moral foundations, annotation guidelines, and probabilistic graphical models which show the usefulness of jointly modeling abstract political slogans, as opposed to the unigrams of previous works, with policy frames for the prediction of the morality underlying political tweets. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1067/ |
https://www.aclweb.org/anthology/P18-1067 | |
PWC | https://paperswithcode.com/paper/classification-of-moral-foundations-in |
Repo | |
Framework | |
Proceedings of ACL 2018, System Demonstrations
Title | Proceedings of ACL 2018, System Demonstrations |
Authors | |
Abstract | |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-4000/ |
https://www.aclweb.org/anthology/P18-4000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-acl-2018-system-demonstrations |
Repo | |
Framework | |
Understanding Local Minima in Neural Networks by Loss Surface Decomposition
Title | Understanding Local Minima in Neural Networks by Loss Surface Decomposition |
Authors | Hanock Kwak, Byoung-Tak Zhang |
Abstract | To provide principled ways of designing proper Deep Neural Network (DNN) models, it is essential to understand the loss surface of DNNs under realistic assumptions. We introduce interesting aspects for understanding the local minima and overall structure of the loss surface. The parameter domain of the loss surface can be decomposed into regions in which activation values (zero or one for rectified linear units) are consistent. We found that, in each region, the loss surface has properties similar to those of linear neural networks, where every local minimum is a global minimum. This means that every differentiable local minimum is the global minimum of the corresponding region. We prove this for a neural network with one hidden layer using rectified linear units under realistic assumptions. There are poor regions that lead to poor local minima, and we explain why such regions exist even in overparameterized DNNs. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJDYgPgCZ |
https://openreview.net/pdf?id=SJDYgPgCZ | |
PWC | https://paperswithcode.com/paper/understanding-local-minima-in-neural-networks |
Repo | |
Framework | |
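The decomposition described in the abstract above can be made concrete for a one-hidden-layer ReLU network with a scalar output. The sketch below is illustrative only: the function names and the toy weights are assumptions, not the authors' code. It computes the zero/one activation pattern that a parameter setting induces on a dataset, and shows that while the pattern is unchanged the network output equals that of a network whose ReLUs are replaced by the fixed mask.

```python
def pre_activations(W1, x):
    """Hidden-layer inputs z_k = w1_k . x for each hidden unit k."""
    return [sum(w * v for w, v in zip(row, x)) for row in W1]


def activation_pattern(W1, X):
    """Zero/one pattern per data point and hidden unit; it labels the region."""
    return tuple(
        tuple(1 if z > 0 else 0 for z in pre_activations(W1, x)) for x in X
    )


def relu_output(W1, w2, x):
    """Ordinary forward pass with rectified linear units."""
    return sum(v * max(z, 0.0) for v, z in zip(w2, pre_activations(W1, x)))


def masked_linear_output(W1, w2, x, pattern_row):
    """Forward pass with the ReLU replaced by a fixed zero/one mask."""
    return sum(v * a * z for v, a, z in zip(w2, pattern_row, pre_activations(W1, x)))


X = [[1.0, 0.5], [-1.0, 2.0]]          # two toy data points
w2 = [2.0, -1.0]
W1 = [[1.0, -1.0], [0.5, 2.0]]
W1_nearby = [[1.1, -0.9], [0.5, 2.0]]  # small perturbation of the first unit

pattern = activation_pattern(W1, X)
print(pattern == activation_pattern(W1_nearby, X))  # same region if True
print(relu_output(W1_nearby, w2, X[0]),
      masked_linear_output(W1_nearby, w2, X[0], pattern[0]))
```

With the mask held fixed, the output is a product of the two weight layers with no remaining nonlinearity, which is the sense in which each region resembles a linear network.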