July 26, 2019

Paper Group NAWR 3

DropoutNet: Addressing Cold Start in Recommender Systems

Title DropoutNet: Addressing Cold Start in Recommender Systems
Authors Maksims Volkovs, Guangwei Yu, Tomi Poutanen
Abstract Latent models have become the default choice for recommender systems due to their performance and scalability. However, research in this area has primarily focused on modeling user-item interactions, and few latent models have been developed for cold start. Deep learning has recently achieved remarkable success showing excellent results for diverse input types. Inspired by these results we propose a neural network based latent model called DropoutNet to address the cold start problem in recommender systems. Unlike existing approaches that incorporate additional content-based objective terms, we instead focus on the optimization and show that neural network models can be explicitly trained for cold start through dropout. Our model can be applied on top of any existing latent model effectively providing cold start capabilities, and full power of deep architectures. Empirically we demonstrate state-of-the-art accuracy on publicly available benchmarks. Code is available at https://github.com/layer6ai-labs/DropoutNet.
Tasks Recommendation Systems
Published 2017-12-01
URL http://papers.nips.cc/paper/7081-dropoutnet-addressing-cold-start-in-recommender-systems
PDF http://papers.nips.cc/paper/7081-dropoutnet-addressing-cold-start-in-recommender-systems.pdf
PWC https://paperswithcode.com/paper/dropoutnet-addressing-cold-start-in
Repo https://github.com/layer6ai-labs/DropoutNet
Framework tf
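
The abstract's idea of training explicitly for cold start through dropout can be illustrated with a minimal sketch. It assumes each user or item has a preference vector from an existing latent model plus a content feature vector, and that "dropout" means zeroing the whole preference vector for a random subset of rows so the network must rely on content alone. The function names, the `drop_rate` parameter, and the toy data are hypothetical and are not taken from the authors' repository.

```python
import random

def drop_preference_inputs(pref_batch, drop_rate, rng):
    """Zero the entire preference vector for a random subset of rows."""
    dropped = []
    for row in pref_batch:
        if rng.random() < drop_rate:
            # Simulate a cold-start user/item: no preference information.
            dropped.append([0.0] * len(row))
        else:
            dropped.append(list(row))
    return dropped

def build_inputs(pref_batch, content_batch, drop_rate, rng):
    """Concatenate (possibly dropped) preference vectors with content features."""
    prefs = drop_preference_inputs(pref_batch, drop_rate, rng)
    return [p + c for p, c in zip(prefs, content_batch)]

# Toy data (hypothetical): 3 users, 2-d preference vectors, 3-d content vectors.
pref = [[0.5, -1.2], [0.3, 0.8], [1.1, 0.1]]
content = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]

all_cold = build_inputs(pref, content, drop_rate=1.0, rng=random.Random(0))
all_warm = build_inputs(pref, content, drop_rate=0.0, rng=random.Random(0))
```

In a training loop, the network would be asked to reproduce the original relevance scores from these inputs, so rows with zeroed preferences teach it to fall back on content.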

A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size

Title A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size
Authors Masato Neishi, Jin Sakuma, Satoshi Tohda, Shonosuke Ishiwatari, Naoki Yoshinaga, Masashi Toyoda
Abstract In this paper, we describe the team UT-IIS's system and results for the WAT 2017 translation tasks. We further investigated several tricks including a novel technique for initializing embedding layers using only the parallel corpus, which increased the BLEU score by 1.28, found a practical large batch size of 256, and gained insights regarding hyperparameter settings. Ultimately, our system obtained a better result than the state-of-the-art system of WAT 2016. Our code is available on https://github.com/nem6ishi/wat17.
Tasks Machine Translation, Word Embeddings
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5708/
PDF https://www.aclweb.org/anthology/W17-5708
PWC https://paperswithcode.com/paper/a-bag-of-useful-tricks-for-practical-neural
Repo https://github.com/nem6ishi/wat17
Framework tf

Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition

Title Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition
Authors Junwu Weng, Chaoqun Weng, Junsong Yuan
Abstract Motivated by previous success of using non-parametric methods to recognize objects, e.g., NBNN, we extend it to recognize actions using skeletons. Each 3D action is presented by a sequence of 3D poses. Similar to NBNN, our proposed Spatio-Temporal-NBNN applies stage-to-class distance to classify actions. However, ST-NBNN takes the spatio-temporal structure of 3D actions into consideration and relaxes the Naive Bayes assumption of NBNN. Specifically, ST-NBNN adopts bilinear classifiers to identify both key temporal stages as well as spatial joints for action classification. Although only using a linear classifier, experiments on three benchmark datasets show that by combining the strength of both non-parametric and parametric models, ST-NBNN can achieve competitive performance compared with state-of-the-art results using sophisticated models such as deep learning. Moreover, by identifying key skeleton joints and temporal stages for each action class, our ST-NBNN can capture the essential spatio-temporal patterns that play key roles of recognizing actions, which is not always achievable by using end-to-end models.
Tasks Action Classification, Skeleton Based Action Recognition, Temporal Action Localization
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Weng_Spatio-Temporal_Naive-Bayes_Nearest-Neighbor_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Weng_Spatio-Temporal_Naive-Bayes_Nearest-Neighbor_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-naive-bayes-nearest-neighbor
Repo https://github.com/Wuie/ST-NBNN-demo
Framework none
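
For readers unfamiliar with the NBNN baseline that the abstract builds on, the following is a minimal sketch of plain NBNN classification: each query descriptor is matched to its nearest neighbor in every class, and the class with the smallest summed distance wins. It does not include the learned bilinear weights over temporal stages and joints that ST-NBNN adds. The function names, labels, and descriptors are illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nbnn_classify(query_descriptors, class_descriptors):
    """Plain NBNN: sum nearest-neighbor distances per class, pick the minimum.

    class_descriptors maps a label to the pool of descriptors for that class.
    """
    totals = {}
    for label, pool in class_descriptors.items():
        totals[label] = sum(
            min(sq_dist(d, ref) for ref in pool) for d in query_descriptors
        )
    best = min(totals, key=totals.get)
    return best, totals

# Toy 2-d descriptors (hypothetical), two action classes.
classes = {
    "wave": [[0.0, 0.0], [1.0, 0.0]],
    "kick": [[5.0, 5.0], [6.0, 5.0]],
}
query = [[0.1, 0.0], [0.9, 0.1]]
label, totals = nbnn_classify(query, classes)
```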

FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling

Title FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling
Authors Yuanming Hu, Baoyuan Wang, Stephen Lin
Abstract Improvements in color constancy have arisen from the use of convolutional neural networks (CNNs). However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors. Image patches with estimation ambiguity not only appear with great frequency in photographs, but also significantly degrade the quality of network training and inference. To overcome this problem, we present a fully convolutional network architecture in which patches throughout an image can carry different confidence weights according to the value they provide for color constancy estimation. These confidence weights are learned and applied within a novel pooling layer where the local estimates are merged into a global solution. With this formulation, the network is able to determine “what to learn” and “how to pool” automatically from color constancy datasets without additional supervision. The proposed network also allows for end-to-end training, and achieves higher efficiency and accuracy. On standard benchmarks, our network outperforms the previous state-of-the-art while achieving 120x greater efficiency.
Tasks Color Constancy
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Hu_FC4_Fully_Convolutional_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Hu_FC4_Fully_Convolutional_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/fc4-fully-convolutional-color-constancy-with
Repo https://github.com/yuanming-hu/fc4
Framework tf
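
The pooling step described in the abstract, where local estimates are merged into a global solution using confidence weights, can be sketched as a confidence-weighted sum followed by normalization. This is an assumption about the form of the pooling, not the authors' code: here each local estimate is an RGB illuminant vector, it is normalized to unit length before weighting, and the confidences are given rather than learned. All names and values are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def confidence_weighted_pool(local_estimates, confidences):
    """Merge per-patch RGB illuminant estimates into one global estimate."""
    summed = [0.0, 0.0, 0.0]
    for est, c in zip(local_estimates, confidences):
        unit = normalize(est)
        for k in range(3):
            summed[k] += c * unit[k]
    return normalize(summed)

# Two patches: one trusted fully, one ignored (confidence 0).
trusted_only = confidence_weighted_pool(
    [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]], [1.0, 0.0]
)
# Equal confidence: the result lies halfway between the two directions.
balanced = confidence_weighted_pool(
    [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]], [0.5, 0.5]
)
```

With zero confidence an ambiguous patch contributes nothing to the global estimate, which is the behavior the abstract motivates.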

Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks

Title Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks
Authors Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, Jiashi Feng
Abstract We introduce a new problem of gaze anticipation on egocentric videos. This substantially extends the conventional gaze prediction problem to future frames by no longer confining it on the current frame. To solve this problem, we propose a new generative adversarial neural network based model, Deep Future Gaze (DFG). DFG generates multiple future frames conditioned on the single current frame and anticipates corresponding future gazes in next few seconds. It consists of two networks: generator and discriminator. The generator uses a two-stream spatial temporal convolution architecture (3D-CNN) explicitly untangling the foreground and the background to generate future frames. It then attaches another 3D-CNN for gaze anticipation based on these synthetic frames. The discriminator plays against the generator by differentiating the synthetic frames of the generator from the real frames. Through competition with discriminator, the generator progressively improves quality of the future frames and thus anticipates future gaze better. Experimental results on the publicly available egocentric datasets show that DFG significantly outperforms all well-established baselines. Moreover, we demonstrate that DFG achieves better performance of gaze prediction on current frames than state-of-the-art methods. This is due to benefiting from learning motion discriminative representations in frame generation. We further contribute a new egocentric dataset (OST) in the object search task. DFG also achieves the best performance for this challenging dataset.
Tasks Gaze Prediction
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Zhang_Deep_Future_Gaze_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_Deep_Future_Gaze_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/deep-future-gaze-gaze-anticipation-on
Repo https://github.com/Mengmi/deepfuturegaze_gan
Framework torch

Reconstructing the house from the ad: Structured prediction on real estate classifieds

Title Reconstructing the house from the ad: Structured prediction on real estate classifieds
Authors Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder
Abstract In this paper, we address the (to the best of our knowledge) new problem of extracting a structured description of real estate properties from their natural language descriptions in classifieds. We survey and present several models to (a) identify important entities of a property (e.g., rooms) from classifieds and (b) structure them into a tree format, with the entities as nodes and edges representing a part-of relation. Experiments show that a graph-based system deriving the tree from an initially fully connected entity graph outperforms a transition-based system starting from only the entity nodes, since it better reconstructs the tree.
Tasks Dependency Parsing, Named Entity Recognition, Structured Prediction
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2044/
PDF https://www.aclweb.org/anthology/E17-2044
PWC https://paperswithcode.com/paper/reconstructing-the-house-from-the-ad
Repo https://github.com/bekou/ad_data
Framework none

Arc-Hybrid Non-Projective Dependency Parsing with a Static-Dynamic Oracle

Title Arc-Hybrid Non-Projective Dependency Parsing with a Static-Dynamic Oracle
Authors Miryam de Lhoneux, Sara Stymne, Joakim Nivre
Abstract In this paper, we extend the arc-hybrid system for transition-based parsing with a swap transition that enables reordering of the words and construction of non-projective trees. Although this extension breaks the arc-decomposability of the transition system, we show how the existing dynamic oracle for this system can be modified and combined with a static oracle only for the swap transition. Experiments on 5 languages show that the new system gives competitive accuracy and is significantly better than a system trained with a purely static oracle.
Tasks Dependency Parsing
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6314/
PDF https://www.aclweb.org/anthology/W17-6314
PWC https://paperswithcode.com/paper/arc-hybrid-non-projective-dependency-parsing
Repo https://github.com/UppsalaNLP/uuparser
Framework none

Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes

Title Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
Authors Taylor W. Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez
Abstract We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent parameters and the state space. We also replace the original Gaussian Process-based model with a Bayesian Neural Network, enabling more scalable inference. Thus, we expand the scope of the HiP-MDP to applications with higher dimensions and more complex dynamics.
Tasks Transfer Learning
Published 2017-12-01
URL http://papers.nips.cc/paper/7205-robust-and-efficient-transfer-learning-with-hidden-parameter-markov-decision-processes
PDF http://papers.nips.cc/paper/7205-robust-and-efficient-transfer-learning-with-hidden-parameter-markov-decision-processes.pdf
PWC https://paperswithcode.com/paper/robust-and-efficient-transfer-learning-with-1
Repo https://github.com/dtak/hip-mdp-public
Framework tf

Evaluating the morphological competence of Machine Translation Systems

Title Evaluating the morphological competence of Machine Translation Systems
Authors Franck Burlot, François Yvon
Abstract
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4705/
PDF https://www.aclweb.org/anthology/W17-4705
PWC https://paperswithcode.com/paper/evaluating-the-morphological-competence-of
Repo https://github.com/franckbrl/morpheval
Framework none

Deep learning for decentralized parking lot occupancy detection

Title Deep learning for decentralized parking lot occupancy detection
Authors G. Amato, F. Carrara, F. Falchi, C. Gennaro, C. Meghini, C. Vairo
Abstract A smart camera is a vision system capable of extracting application-specific information from the captured images. The paper proposes a decentralized and efficient solution for visual parking lot occupancy detection based on a deep Convolutional Neural Network (CNN) specifically designed for smart cameras. This solution is compared with state-of-the-art approaches using two visual datasets: PKLot, already existing in literature, and CNRPark-EXT. The former is an existing dataset that allowed us to exhaustively compare with previous works. The latter dataset has been created in the context of this research, accumulating data across various seasons of the year, to test our approach in particularly challenging situations, exhibiting occlusions, and diverse and difficult viewpoints. This dataset is publicly available to the scientific community and is another contribution of our research. Our experiments show that our solution outperforms and generalizes the best performing approaches on both datasets. The performance of our proposed CNN architecture on the parking lot occupancy detection task is comparable to the well-known AlexNet, which is three orders of magnitude larger.
Tasks
Published 2017-04-15
URL http://cnrpark.it/
PDF http://www.nmis.isti.cnr.it/falchi/Draft/2017-ESWA-Draft.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-decentralized-parking-lot
Repo https://github.com/fabiocarrara/deep-parking
Framework none

Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing

Title Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing
Authors Aoqian Zhang, Shaoxu Song, Jianmin Wang, Philip S. Yu
Abstract Errors are prevalent in time series data, such as GPS trajectories or sensor readings. Existing methods focus more on anomaly detection but not on repairing the detected anomalies. By simply filtering out the dirty data via anomaly detection, applications could still be unreliable over the incomplete time series. Instead of simply discarding anomalies, we propose to (iteratively) repair them in time series data, by creatively bonding the beauty of temporal nature in anomaly detection with the widely considered minimum change principle in data repairing. Our major contributions include: (1) a novel framework of iterative minimum repairing (IMR) over time series data, (2) explicit analysis on convergence of the proposed iterative minimum repairing, and (3) efficient estimation of parameters in each iteration. Remarkably, with incremental computation, we reduce the complexity of parameter estimation from O(n) to O(1). Experiments on real datasets demonstrate the superiority of our proposal compared to the state-of-the-art approaches. In particular, we show that (the proposed) repairing indeed improves the time series classification application.
Tasks Anomaly Detection, Time Series, Time Series Classification
Published 2017-06-10
URL https://dl.acm.org/citation.cfm?id=3115410
PDF http://www.vldb.org/pvldb/vol10/p1046-song.pdf
PWC https://paperswithcode.com/paper/time-series-data-cleaning-from-anomaly
Repo https://github.com/zaqthss/vldb17-imr
Framework none

Social Bias in Elicited Natural Language Inferences

Title Social Bias in Elicited Natural Language Inferences
Authors Rachel Rudinger, Chandler May, Benjamin Van Durme
Abstract We analyze the Stanford Natural Language Inference (SNLI) corpus in an investigation of bias and stereotyping in NLP data. The SNLI human-elicitation protocol makes it prone to amplifying bias and stereotypical associations, which we demonstrate statistically (using pointwise mutual information) and with qualitative examples.
Tasks Language Modelling, Natural Language Inference, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1609/
PDF https://www.aclweb.org/anthology/W17-1609
PWC https://paperswithcode.com/paper/social-bias-in-elicited-natural-language
Repo https://github.com/cjmay/snli-ethics
Framework none
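
The abstract mentions pointwise mutual information (PMI) as the statistic used to surface associations. As a minimal sketch, PMI for a pair (x, y) is log2 of p(x, y) / (p(x) p(y)), estimated here from raw co-occurrence counts. The function name and the toy pairs are hypothetical placeholders; they are not drawn from SNLI or from the authors' code.

```python
import math
from collections import Counter

def pmi_table(pairs):
    """Estimate PMI (in bits) for every observed (x, y) pair from counts."""
    pair_counts = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    n = len(pairs)
    table = {}
    for (x, y), c in pair_counts.items():
        p_xy = c / n
        p_x = x_counts[x] / n
        p_y = y_counts[y] / n
        table[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return table

# Perfectly associated toy pairs: "a" always occurs with "x", "b" with "y".
associated = pmi_table([("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")])
# Independent toy pairs: every combination occurs equally often.
independent = pmi_table([("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")])
```

A positive score means the pair co-occurs more often than independence would predict, and a score near zero means no association.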

Supervised and unsupervised segmentation using superpixels, model estimation, and Graph Cut.

Title Supervised and unsupervised segmentation using superpixels, model estimation, and Graph Cut.
Authors Jiří Borovec, Jan Švihlík, Jan Kybic, David Habart
Abstract Image segmentation is widely used as an initial phase of many image analysis tasks. It is often advantageous to first group pixels into compact, edge-respecting superpixels, because these reduce the size of the segmentation problem and thus the segmentation time by an order of magnitude. In addition, features calculated from superpixel regions are more robust than features calculated from fixed pixel neighborhoods. We present a fast and general multiclass image segmentation method consisting of the following steps: (i) computation of superpixels; (ii) extraction of superpixel-based descriptors; (iii) calculating image-based class probabilities in a supervised or unsupervised manner; and (iv) regularized superpixel classification using graph cut. We apply this segmentation pipeline to five real-world medical imaging applications and compare the results with three baseline methods: pixelwise graph cut segmentation, supertexton-based segmentation, and classical superpixel-based segmentation. On all datasets, we outperform the baseline results. We also show that unsupervised segmentation is surprisingly efficient in many situations. Unsupervised segmentation provides similar results to the supervised method but does not require manually annotated training data, which is often expensive to obtain.
Tasks Semantic Segmentation
Published 2017-11-06
URL https://doi.org/10.1117/1.JEI.26.6.061610
PDF https://doi.org/10.1117/1.JEI.26.6.061610
PWC https://paperswithcode.com/paper/supervised-and-unsupervised-segmentation
Repo https://github.com/Borda/pyImSegm
Framework none

Deriving Consensus for Multi-Parallel Corpora: an English Bible Study

Title Deriving Consensus for Multi-Parallel Corpora: an English Bible Study
Authors Patrick Xia, David Yarowsky
Abstract What can you do with multiple noisy versions of the same text? We present a method which generates a single consensus between multi-parallel corpora. By maximizing a function of linguistic features between word pairs, we jointly learn a single corpus-wide multiway alignment: a consensus between 27 versions of the English Bible. We additionally produce English paraphrases, word-level distributions of tags, and consensus dependency parses. Our method is language independent and applicable to any multi-parallel corpora. Given the Bible's unique role as alignable bitext for over 800 of the world's languages, this consensus alignment and resulting resources offer value for multilingual annotation projection, and also shed potential insights into the Bible itself.
Tasks Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2076/
PDF https://www.aclweb.org/anthology/I17-2076
PWC https://paperswithcode.com/paper/deriving-consensus-for-multi-parallel-corpora
Repo https://github.com/pitrack/monolign
Framework none

Comparative Evaluation of Hand-Crafted and Learned Local Features

Title Comparative Evaluation of Hand-Crafted and Learned Local Features
Authors Johannes L. Schönberger, Hans Hardmeier, Torsten Sattler, Marc Pollefeys
Abstract Matching local image descriptors is a key step in many computer vision applications. For more than a decade, hand-crafted descriptors such as SIFT have been used for this task. Recently, multiple new descriptors learned from data have been proposed and shown to improve on SIFT in terms of discriminative power. This paper is dedicated to an extensive experimental evaluation of learned local features to establish a single evaluation protocol that ensures comparable results. In terms of matching performance, we evaluate the different descriptors regarding standard criteria. However, considering matching performance in isolation only provides an incomplete measure of a descriptor’s quality. For example, finding additional correct matches between similar images does not necessarily lead to a better performance when trying to match images under extreme viewpoint or illumination changes. Besides pure descriptor matching, we thus also evaluate the different descriptors in the context of image-based reconstruction. This enables us to study the descriptor performance on a set of more practical criteria including image retrieval, the ability to register images under strong viewpoint and illumination changes, and the accuracy and completeness of the reconstructed cameras and scenes. To facilitate future research, the full evaluation pipeline is made publicly available.
Tasks Image Retrieval
Published 2017-07-01
URL https://www.cvg.ethz.ch/research/local-feature-evaluation/schoenberger2017comparative.pdf
PDF https://www.cvg.ethz.ch/research/local-feature-evaluation/schoenberger2017comparative.pdf
PWC https://paperswithcode.com/paper/comparative-evaluation-of-hand-crafted-and-1
Repo https://github.com/ahojnnes/local-feature-evaluation
Framework none