Paper Group NAWR 3
DropoutNet: Addressing Cold Start in Recommender Systems
Title | DropoutNet: Addressing Cold Start in Recommender Systems |
Authors | Maksims Volkovs, Guangwei Yu, Tomi Poutanen |
Abstract | Latent models have become the default choice for recommender systems due to their performance and scalability. However, research in this area has primarily focused on modeling user-item interactions, and few latent models have been developed for cold start. Deep learning has recently achieved remarkable success, showing excellent results for diverse input types. Inspired by these results, we propose a neural network based latent model called DropoutNet to address the cold start problem in recommender systems. Unlike existing approaches that incorporate additional content-based objective terms, we instead focus on the optimization and show that neural network models can be explicitly trained for cold start through dropout. Our model can be applied on top of any existing latent model, effectively providing cold start capabilities and the full power of deep architectures. Empirically we demonstrate state-of-the-art accuracy on publicly available benchmarks. Code is available at https://github.com/layer6ai-labs/DropoutNet. |
Tasks | Recommendation Systems |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7081-dropoutnet-addressing-cold-start-in-recommender-systems |
http://papers.nips.cc/paper/7081-dropoutnet-addressing-cold-start-in-recommender-systems.pdf | |
PWC | https://paperswithcode.com/paper/dropoutnet-addressing-cold-start-in |
Repo | https://github.com/layer6ai-labs/DropoutNet |
Framework | tf |
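The key mechanism in DropoutNet is training-time dropout applied to the *input* preference vectors, so the network learns to fall back on content when collaborative information is missing. Below is a minimal numpy sketch of that idea for a single user tower; the dimensions, the single dense layer, and the dropout rate are illustrative stand-ins, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, d_latent, d_content, d_out = 100, 32, 50, 64

# Pretrained CF preference vectors and user content features (toy data).
U_pref = rng.normal(size=(n_users, d_latent))
U_content = rng.normal(size=(n_users, d_content))

# One dense layer mapping [preference ; content] to a shared space (stand-in).
W = rng.normal(scale=0.1, size=(d_latent + d_content, d_out))

def user_tower(pref, content, dropout_rate=0.5, training=True):
    """Forward pass with DropoutNet-style *input* dropout: with probability
    `dropout_rate`, a user's preference vector is zeroed out, so the tower
    must rely on content alone (the cold-start condition)."""
    if training:
        keep = (rng.random(len(pref)) >= dropout_rate).astype(pref.dtype)
        pref = pref * keep[:, None]
    x = np.concatenate([pref, content], axis=1)
    return np.maximum(x @ W, 0.0)  # ReLU

warm = user_tower(U_pref, U_content, training=True)                   # mixed warm/cold batch
cold = user_tower(np.zeros_like(U_pref), U_content, training=False)   # pure cold start
print(warm.shape, cold.shape)
```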
A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size
Title | A Bag of Useful Tricks for Practical Neural Machine Translation: Embedding Layer Initialization and Large Batch Size |
Authors | Masato Neishi, Jin Sakuma, Satoshi Tohda, Shonosuke Ishiwatari, Naoki Yoshinaga, Masashi Toyoda |
Abstract | In this paper, we describe the team UT-IIS's system and results for the WAT 2017 translation tasks. We further investigated several tricks including a novel technique for initializing embedding layers using only the parallel corpus, which increased the BLEU score by 1.28, found a practical large batch size of 256, and gained insights regarding hyperparameter settings. Ultimately, our system obtained a better result than the state-of-the-art system of WAT 2016. Our code is available at https://github.com/nem6ishi/wat17. |
Tasks | Machine Translation, Word Embeddings |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5708/ |
https://www.aclweb.org/anthology/W17-5708 | |
PWC | https://paperswithcode.com/paper/a-bag-of-useful-tricks-for-practical-neural |
Repo | https://github.com/nem6ishi/wat17 |
Framework | tf |
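The two tricks are easy to state concretely: copy embeddings pretrained on the parallel corpus into the NMT embedding layer before training, and use a large batch size (256). The sketch below illustrates only the initialization step; the vocabulary, dimensions, and the pretrained vector dictionary are toy stand-ins, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
emb_dim = 8

# Vectors pretrained on the parallel corpus only (e.g. with word2vec);
# random stand-ins here, purely for illustration.
pretrained = {"the": rng.normal(size=emb_dim), "cat": rng.normal(size=emb_dim)}

vocab = ["<pad>", "<unk>", "the", "cat", "sat"]                  # NMT vocabulary
embedding = rng.normal(scale=0.1, size=(len(vocab), emb_dim))    # default init

# Overwrite the rows for words covered by the pretrained vectors.
for idx, word in enumerate(vocab):
    if word in pretrained:
        embedding[idx] = pretrained[word]

BATCH_SIZE = 256  # the "practical large batch size" reported in the paper
print(embedding.shape, BATCH_SIZE)
```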
Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition
Title | Spatio-Temporal Naive-Bayes Nearest-Neighbor (ST-NBNN) for Skeleton-Based Action Recognition |
Authors | Junwu Weng, Chaoqun Weng, Junsong Yuan |
Abstract | Motivated by previous success of using non-parametric methods to recognize objects, e.g., NBNN, we extend it to recognize actions using skeletons. Each 3D action is represented by a sequence of 3D poses. Similar to NBNN, our proposed Spatio-Temporal-NBNN applies a stage-to-class distance to classify actions. However, ST-NBNN takes the spatio-temporal structure of 3D actions into consideration and relaxes the Naive Bayes assumption of NBNN. Specifically, ST-NBNN adopts bilinear classifiers to identify both key temporal stages and spatial joints for action classification. Although it only uses a linear classifier, experiments on three benchmark datasets show that by combining the strength of both non-parametric and parametric models, ST-NBNN can achieve competitive performance compared with state-of-the-art results using sophisticated models such as deep learning. Moreover, by identifying key skeleton joints and temporal stages for each action class, our ST-NBNN can capture the essential spatio-temporal patterns that play key roles in recognizing actions, which is not always achievable by using end-to-end models. |
Tasks | Action Classification, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Weng_Spatio-Temporal_Naive-Bayes_Nearest-Neighbor_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017/papers/Weng_Spatio-Temporal_Naive-Bayes_Nearest-Neighbor_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-naive-bayes-nearest-neighbor |
Repo | https://github.com/Wuie/ST-NBNN-demo |
Framework | none |
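A rough numpy sketch of the stage-to-class distance: each class keeps reference pose descriptors per temporal stage, a query action is matched stage-by-stage with nearest-neighbour search, and the per-stage distances are combined with learned weights. The uniform stage weights and toy dimensions below are placeholders; in the paper the stage and joint weights are learned with a bilinear classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_stages, feat = 3, 5, 9  # toy sizes

# Reference pose descriptors per class and temporal stage (from training skeletons).
refs = rng.normal(size=(n_classes, n_stages, 40, feat))

# Weights over temporal stages; ST-NBNN learns these (and analogous joint
# weights inside the descriptor distance) -- uniform placeholders here.
stage_w = np.ones(n_stages) / n_stages

def stage_to_class_distance(query):
    """query: (n_stages, feat) descriptors of one action."""
    d = np.empty((n_classes, n_stages))
    for c in range(n_classes):
        for t in range(n_stages):
            diffs = refs[c, t] - query[t]              # NN search within stage t
            d[c, t] = np.min(np.sum(diffs ** 2, axis=1))
    return d

query = rng.normal(size=(n_stages, feat))
D = stage_to_class_distance(query)
scores = -(D * stage_w).sum(axis=1)  # weighted stage-to-class distances
print("predicted class:", int(np.argmax(scores)))
```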
FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling
Title | FC4: Fully Convolutional Color Constancy With Confidence-Weighted Pooling |
Authors | Yuanming Hu, Baoyuan Wang, Stephen Lin |
Abstract | Improvements in color constancy have arisen from the use of convolutional neural networks (CNNs). However, the patch-based CNNs that exist for this problem are faced with the issue of estimation ambiguity, where a patch may contain insufficient information to establish a unique or even a limited possible range of illumination colors. Image patches with estimation ambiguity not only appear with great frequency in photographs, but also significantly degrade the quality of network training and inference. To overcome this problem, we present a fully convolutional network architecture in which patches throughout an image can carry different confidence weights according to the value they provide for color constancy estimation. These confidence weights are learned and applied within a novel pooling layer where the local estimates are merged into a global solution. With this formulation, the network is able to determine “what to learn” and “how to pool” automatically from color constancy datasets without additional supervision. The proposed network also allows for end-to-end training, and achieves higher efficiency and accuracy. On standard benchmarks, our network outperforms the previous state-of-the-art while achieving 120x greater efficiency. |
Tasks | Color Constancy |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Hu_FC4_Fully_Convolutional_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017/papers/Hu_FC4_Fully_Convolutional_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/fc4-fully-convolutional-color-constancy-with |
Repo | https://github.com/yuanming-hu/fc4 |
Framework | tf |
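Confidence-weighted pooling itself is a small operation: each spatial location of the fully convolutional output emits an illuminant estimate and a confidence, and the global illuminant is the confidence-weighted, normalized average. A numpy sketch with random stand-in network outputs (not the authors' network):

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 6, 8  # spatial size of the network's output grid (toy)

local_est = np.abs(rng.normal(size=(H, W, 3)))  # stand-in local RGB illuminant estimates
confidence = np.abs(rng.normal(size=(H, W)))    # stand-in per-location confidences

def confidence_weighted_pool(est, conf, eps=1e-8):
    """Merge local illuminant estimates into one global estimate,
    weighting each location by its (learned) confidence."""
    weighted = (est * conf[..., None]).sum(axis=(0, 1))
    pooled = weighted / (conf.sum() + eps)
    return pooled / (np.linalg.norm(pooled) + eps)  # unit-norm illuminant colour

print(confidence_weighted_pool(local_est, confidence))
```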
Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks
Title | Deep Future Gaze: Gaze Anticipation on Egocentric Videos Using Adversarial Networks |
Authors | Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, Jiashi Feng |
Abstract | We introduce a new problem of gaze anticipation on egocentric videos. This substantially extends the conventional gaze prediction problem to future frames by no longer confining it to the current frame. To solve this problem, we propose a new generative adversarial neural network based model, Deep Future Gaze (DFG). DFG generates multiple future frames conditioned on the single current frame and anticipates corresponding future gazes in the next few seconds. It consists of two networks: a generator and a discriminator. The generator uses a two-stream spatial-temporal convolution architecture (3D-CNN) that explicitly untangles the foreground and the background to generate future frames. It then attaches another 3D-CNN for gaze anticipation based on these synthetic frames. The discriminator plays against the generator by differentiating the synthetic frames of the generator from the real frames. Through competition with the discriminator, the generator progressively improves the quality of the future frames and thus anticipates future gaze better. Experimental results on publicly available egocentric datasets show that DFG significantly outperforms all well-established baselines. Moreover, we demonstrate that DFG achieves better gaze prediction performance on current frames than state-of-the-art methods, benefiting from the motion-discriminative representations learned during frame generation. We further contribute a new egocentric dataset (OST) for the object search task. DFG also achieves the best performance on this challenging dataset. |
Tasks | Gaze Prediction |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Zhang_Deep_Future_Gaze_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_Deep_Future_Gaze_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/deep-future-gaze-gaze-anticipation-on |
Repo | https://github.com/Mengmi/deepfuturegaze_gan |
Framework | torch |
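The generator's two-stream untangling amounts to compositing a per-frame foreground, a shared background, and a soft mask into the synthetic future frames. The numpy sketch below shows only that compositing step on random stand-in tensors; the 3D-CNN streams, the gaze-anticipation head, and the adversarial training are omitted, so treat this as an illustration of the idea rather than the DFG model.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W, C = 4, 8, 8, 3  # number of future frames and frame size (toy)

# Stand-ins for the two generator streams (in DFG both are 3D-CNN outputs
# conditioned on the single current frame).
foreground = rng.random(size=(T, H, W, C))  # moving content per future frame
background = rng.random(size=(H, W, C))     # static content, shared over time
mask = rng.random(size=(T, H, W, 1))        # soft foreground mask in [0, 1]

# Two-stream compositing: each synthetic frame mixes foreground and background.
future_frames = mask * foreground + (1.0 - mask) * background
print(future_frames.shape)  # (T, H, W, C); gaze anticipation runs on these frames
```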
Reconstructing the house from the ad: Structured prediction on real estate classifieds
Title | Reconstructing the house from the ad: Structured prediction on real estate classifieds |
Authors | Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder |
Abstract | In this paper, we address the (to the best of our knowledge) new problem of extracting a structured description of real estate properties from their natural language descriptions in classifieds. We survey and present several models to (a) identify important entities of a property (e.g., rooms) from classifieds and (b) structure them into a tree format, with the entities as nodes and edges representing a part-of relation. Experiments show that a graph-based system, which derives the tree from an initially fully connected entity graph, outperforms a transition-based system starting from only the entity nodes, since it better reconstructs the tree. |
Tasks | Dependency Parsing, Named Entity Recognition, Structured Prediction |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2044/ |
https://www.aclweb.org/anthology/E17-2044 | |
PWC | https://paperswithcode.com/paper/reconstructing-the-house-from-the-ad |
Repo | https://github.com/bekou/ad_data |
Framework | none |
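To make step (b) concrete, the sketch below scores part-of edges over a fully connected entity graph and decodes a tree with a deliberately simplified greedy attachment (each entity takes its best-scoring head among earlier entities). The entity list and the scores are invented, and the paper's actual graph-based decoder differs; this only illustrates going from an edge-scored graph to a part-of tree.

```python
import numpy as np

rng = np.random.default_rng(0)

# Entities extracted from one classified ad (step (a) of the paper); the
# scores stand in for a learned edge scorer over the fully connected graph.
entities = ["house", "floor", "bedroom", "bathroom", "shower"]
n = len(entities)
score = rng.random(size=(n, n))  # score[h, d]: how likely h is the part-of parent of d

# Simplified tree decoder: process entities in order and attach each one to
# its best-scoring head among the already-attached entities (guarantees a tree).
parent = {entities[0]: None}  # first entity acts as the root
for d in range(1, n):
    h = int(np.argmax(score[:d, d]))  # best head among earlier entities
    parent[entities[d]] = entities[h]

for child, head in parent.items():
    if head is not None:
        print(f"{child} -> part-of -> {head}")
```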
Arc-Hybrid Non-Projective Dependency Parsing with a Static-Dynamic Oracle
Title | Arc-Hybrid Non-Projective Dependency Parsing with a Static-Dynamic Oracle |
Authors | Miryam de Lhoneux, Sara Stymne, Joakim Nivre |
Abstract | In this paper, we extend the arc-hybrid system for transition-based parsing with a swap transition that enables reordering of the words and construction of non-projective trees. Although this extension breaks the arc-decomposability of the transition system, we show how the existing dynamic oracle for this system can be modified and combined with a static oracle only for the swap transition. Experiments on 5 languages show that the new system gives competitive accuracy and is significantly better than a system trained with a purely static oracle. |
Tasks | Dependency Parsing |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6314/ |
https://www.aclweb.org/anthology/W17-6314 | |
PWC | https://paperswithcode.com/paper/arc-hybrid-non-projective-dependency-parsing |
Repo | https://github.com/UppsalaNLP/uuparser |
Framework | none |
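A toy sketch of the transition system: SHIFT, LEFT-ARC, and RIGHT-ARC follow the usual arc-hybrid definitions, and SWAP reorders words by moving the stack top back into the buffer. This is one common formulation written from the abstract alone, not necessarily the paper's exact preconditions, and the static-dynamic oracle is not reproduced; the transition sequence is hand-picked purely to exercise the operations.

```python
from dataclasses import dataclass, field

@dataclass
class Config:
    stack: list   # word indices, 0 is the artificial ROOT
    buffer: list
    arcs: set = field(default_factory=set)  # (head, dependent) pairs

def shift(c):      # push the first buffer word onto the stack
    c.stack.append(c.buffer.pop(0))

def left_arc(c):   # first buffer word becomes head of the stack top
    dep = c.stack.pop()
    c.arcs.add((c.buffer[0], dep))

def right_arc(c):  # second stack word becomes head of the stack top
    dep = c.stack.pop()
    c.arcs.add((c.stack[-1], dep))

def swap(c):       # move the stack top back into the buffer (after its first
    top = c.stack.pop()          # word), reordering the two words
    c.buffer.insert(1, top)

# Sentence "ROOT(0) A(1) B(2) C(3)" with a hand-picked transition sequence;
# a real parser chooses transitions with a classifier guided by the oracle.
c = Config(stack=[0], buffer=[1, 2, 3])
for t in (shift, swap, shift, left_arc, shift, shift, right_arc, right_arc):
    t(c)
print(sorted(c.arcs))  # [(0, 1), (1, 2), (1, 3)]
```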
Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes
Title | Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes |
Authors | Taylor W. Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez |
Abstract | We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent parameters and the state space. We also replace the original Gaussian Process-based model with a Bayesian Neural Network, enabling more scalable inference. Thus, we expand the scope of the HiP-MDP to applications with higher dimensions and more complex dynamics. |
Tasks | Transfer Learning |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7205-robust-and-efficient-transfer-learning-with-hidden-parameter-markov-decision-processes |
http://papers.nips.cc/paper/7205-robust-and-efficient-transfer-learning-with-hidden-parameter-markov-decision-processes.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-efficient-transfer-learning-with-1 |
Repo | https://github.com/dtak/hip-mdp-public |
Framework | tf |
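The core transfer idea, adapting to a new task instance by re-inferring only a low-dimensional latent embedding while the shared dynamics network stays fixed, can be sketched in a few lines of PyTorch. The plain MLP below stands in for the paper's Bayesian neural network, and the transitions are random toy data, so this is a sketch of the mechanism rather than the HiP-MDP implementation.

```python
import torch

torch.manual_seed(0)
state_dim, action_dim, latent_dim = 4, 2, 3

# Shared dynamics model: next_state = f([state, action, w_b]).
# In the paper f is a Bayesian neural network; a plain MLP stands in here.
f = torch.nn.Sequential(
    torch.nn.Linear(state_dim + action_dim + latent_dim, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, state_dim),
)
for p in f.parameters():       # keep the shared dynamics network fixed
    p.requires_grad_(False)

# A new task instance: infer only its latent embedding w_b from a handful
# of observed transitions (random toy data below).
s, a, s_next = torch.randn(16, state_dim), torch.randn(16, action_dim), torch.randn(16, state_dim)

w_b = torch.zeros(latent_dim, requires_grad=True)
opt = torch.optim.Adam([w_b], lr=0.1)
for _ in range(50):
    pred = f(torch.cat([s, a, w_b.expand(len(s), -1)], dim=1))
    loss = ((pred - s_next) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
print("inferred task embedding:", w_b.detach())
```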
Evaluating the morphological competence of Machine Translation Systems
Title | Evaluating the morphological competence of Machine Translation Systems |
Authors | Franck Burlot, François Yvon |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4705/ |
https://www.aclweb.org/anthology/W17-4705 | |
PWC | https://paperswithcode.com/paper/evaluating-the-morphological-competence-of |
Repo | https://github.com/franckbrl/morpheval |
Framework | none |
Deep learning for decentralized parking lot occupancy detection
Title | Deep learning for decentralized parking lot occupancy detection |
Authors | G. Amato, F. Carrara, F. Falchi, C. Gennaro, C. Meghini, C. Vairo |
Abstract | A smart camera is a vision system capable of extracting application-specific information from the captured images. The paper proposes a decentralized and efficient solution for visual parking lot occupancy detection based on a deep Convolutional Neural Network (CNN) specifically designed for smart cameras. This solution is compared with state-of-the-art approaches using two visual datasets: PKLot, already existing in the literature, and CNRPark-EXT. The former is an existing dataset that allowed us to compare exhaustively with previous works. The latter dataset was created in the context of this research, accumulating data across various seasons of the year to test our approach in particularly challenging situations exhibiting occlusions and diverse, difficult viewpoints. This dataset is publicly available to the scientific community and is another contribution of our research. Our experiments show that our solution outperforms and generalizes the best-performing approaches on both datasets. The performance of our proposed CNN architecture on the parking lot occupancy detection task is comparable to that of the well-known AlexNet, which is three orders of magnitude larger. |
Tasks | |
Published | 2017-04-15 |
URL | http://cnrpark.it/ |
http://www.nmis.isti.cnr.it/falchi/Draft/2017-ESWA-Draft.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-decentralized-parking-lot |
Repo | https://github.com/fabiocarrara/deep-parking |
Framework | none |
Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing
Title | Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing |
Authors | Aoqian Zhang, Shaoxu Song, Jianmin Wang, Philip S. Yu |
Abstract | Errors are prevalent in time series data, such as GPS trajectories or sensor readings. Existing methods focus more on anomaly detection but not on repairing the detected anomalies. By simply filtering out the dirty data via anomaly detection, applications could still be unreliable over the incomplete time series. Instead of simply discarding anomalies, we propose to (iteratively) repair them in time series data, by creatively bonding the beauty of temporal nature in anomaly detection with the widely considered minimum change principle in data repairing. Our major contributions include: (1) a novel framework of iterative minimum repairing (IMR) over time series data, (2) explicit analysis on convergence of the proposed iterative minimum repairing, and (3) efficient estimation of parameters in each iteration. Remarkably, with incremental computation, we reduce the complexity of parameter estimation from O(n) to O(1). Experiments on real datasets demonstrate the superiority of our proposal compared to the state-of-the-art approaches. In particular, we show that (the proposed) repairing indeed improves the time series classification application. |
Tasks | Anomaly Detection, Time Series, Time Series Classification |
Published | 2017-06-10 |
URL | https://dl.acm.org/citation.cfm?id=3115410 |
http://www.vldb.org/pvldb/vol10/p1046-song.pdf | |
PWC | https://paperswithcode.com/paper/time-series-data-cleaning-from-anomaly |
Repo | https://github.com/zaqthss/vldb17-imr |
Framework | none |
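A heavily simplified sketch of the iterate-then-minimally-repair loop: predict each point from its (already repaired) neighbours and move anomalous points only as far as needed. The moving-median predictor and the fixed threshold here replace the paper's AR-model parameter estimation (and its O(1) incremental updates), so this illustrates the minimum-change principle rather than IMR itself.

```python
import numpy as np

def iterative_minimum_repair(x, window=2, tau=1.0, max_iter=20):
    """Simplified IMR-style loop: repair anomalous points toward a local
    prediction by the minimum amount needed, then iterate to convergence.
    The paper instead fits an AR model on repair differences and updates
    its parameters incrementally in each iteration."""
    y = x.astype(float).copy()
    for _ in range(max_iter):
        changed = False
        for i in range(window, len(y) - window):
            pred = np.median(np.r_[y[i - window:i], y[i + 1:i + window + 1]])
            if abs(y[i] - pred) > tau:                      # anomalous point
                y[i] = pred + np.sign(y[i] - pred) * tau    # minimal repair
                changed = True
        if not changed:                                     # converged
            break
    return y

dirty = np.array([1.0, 1.1, 0.9, 9.0, 1.2, 1.0, 1.1, 8.5, 0.95, 1.05])
print(iterative_minimum_repair(dirty))
```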
Social Bias in Elicited Natural Language Inferences
Title | Social Bias in Elicited Natural Language Inferences |
Authors | Rachel Rudinger, Chandler May, Benjamin Van Durme |
Abstract | We analyze the Stanford Natural Language Inference (SNLI) corpus in an investigation of bias and stereotyping in NLP data. The SNLI human-elicitation protocol makes it prone to amplifying bias and stereotypical associations, which we demonstrate statistically (using pointwise mutual information) and with qualitative examples. |
Tasks | Language Modelling, Natural Language Inference, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1609/ |
https://www.aclweb.org/anthology/W17-1609 | |
PWC | https://paperswithcode.com/paper/social-bias-in-elicited-natural-language |
Repo | https://github.com/cjmay/snli-ethics |
Framework | none |
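The statistical tool is plain pointwise mutual information between a premise category and a hypothesis word. A self-contained toy example (the word pairs below are invented for illustration, not SNLI data):

```python
import math
from collections import Counter

# Toy (premise-category, hypothesis-word) pairs standing in for SNLI pairs;
# the paper computes the same PMI statistic over categories such as gender.
pairs = [
    ("woman", "cooking"), ("woman", "shopping"), ("woman", "cooking"),
    ("man", "driving"), ("man", "cooking"), ("man", "working"),
]

joint = Counter(pairs)
cat = Counter(c for c, _ in pairs)
word = Counter(w for _, w in pairs)
N = len(pairs)

def pmi(c, w):
    """PMI(c, w) = log[ p(c, w) / (p(c) p(w)) ]."""
    return math.log((joint[(c, w)] / N) / ((cat[c] / N) * (word[w] / N)))

print(f"PMI(woman, cooking) = {pmi('woman', 'cooking'):.3f}")
print(f"PMI(man,   cooking) = {pmi('man', 'cooking'):.3f}")
```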
Supervised and unsupervised segmentation using superpixels, model estimation, and Graph Cut.
Title | Supervised and unsupervised segmentation using superpixels, model estimation, and Graph Cut. |
Authors | Jiří Borovec, Jan Švihlík, Jan Kybic, David Habart |
Abstract | Image segmentation is widely used as an initial phase of many image analysis tasks. It is often advantageous to first group pixels into compact, edge-respecting superpixels, because these reduce the size of the segmentation problem and thus the segmentation time by orders of magnitude. In addition, features calculated from superpixel regions are more robust than features calculated from fixed pixel neighborhoods. We present a fast and general multiclass image segmentation method consisting of the following steps: (i) computation of superpixels; (ii) extraction of superpixel-based descriptors; (iii) calculating image-based class probabilities in a supervised or unsupervised manner; and (iv) regularized superpixel classification using graph cut. We apply this segmentation pipeline to five real-world medical imaging applications and compare the results with three baseline methods: pixelwise graph cut segmentation, supertexton-based segmentation, and classical superpixel-based segmentation. On all datasets, we outperform the baseline results. We also show that unsupervised segmentation is surprisingly efficient in many situations. Unsupervised segmentation provides similar results to the supervised method but does not require manually annotated training data, which is often expensive to obtain. |
Tasks | Semantic Segmentation |
Published | 2017-11-06 |
URL | https://doi.org/10.1117/1.JEI.26.6.061610 |
https://doi.org/10.1117/1.JEI.26.6.061610 | |
PWC | https://paperswithcode.com/paper/supervised-and-unsupervised-segmentation |
Repo | https://github.com/Borda/pyImSegm |
Framework | none |
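Steps (i)-(iii) of the pipeline can be sketched with off-the-shelf components, assuming scikit-image and scikit-learn are available; step (iv), the graph-cut regularisation of the superpixel labels, is omitted here since it needs a max-flow solver. This is a sketch of the overall flow, not the authors' implementation (which uses richer colour/texture descriptors and model estimation).

```python
import numpy as np
from skimage import data, segmentation
from sklearn.cluster import KMeans

image = data.astronaut()  # stock RGB image as a stand-in input

# (i) superpixels
labels = segmentation.slic(image, n_segments=200, compactness=10, start_label=0)

# (ii) one descriptor per superpixel (mean colour only, for brevity)
n_sp = labels.max() + 1
feats = np.array([image[labels == s].mean(axis=0) for s in range(n_sp)])

# (iii) unsupervised class assignment by clustering the descriptors
sp_class = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(feats)

segmentation_map = sp_class[labels]  # map superpixel labels back to pixels
print(segmentation_map.shape, np.unique(sp_class))
```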
Deriving Consensus for Multi-Parallel Corpora: an English Bible Study
Title | Deriving Consensus for Multi-Parallel Corpora: an English Bible Study |
Authors | Patrick Xia, David Yarowsky |
Abstract | What can you do with multiple noisy versions of the same text? We present a method which generates a single consensus between multi-parallel corpora. By maximizing a function of linguistic features between word pairs, we jointly learn a single corpus-wide multiway alignment: a consensus between 27 versions of the English Bible. We additionally produce English paraphrases, word-level distributions of tags, and consensus dependency parses. Our method is language independent and applicable to any multi-parallel corpora. Given the Bible's unique role as alignable bitext for over 800 of the world's languages, this consensus alignment and resulting resources offer value for multilingual annotation projection, and also shed potential insights into the Bible itself. |
Tasks | Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2076/ |
https://www.aclweb.org/anthology/I17-2076 | |
PWC | https://paperswithcode.com/paper/deriving-consensus-for-multi-parallel-corpora |
Repo | https://github.com/pitrack/monolign |
Framework | none |
Comparative Evaluation of Hand-Crafted and Learned Local Features
Title | Comparative Evaluation of Hand-Crafted and Learned Local Features |
Authors | Johannes L. Schönberger, Hans Hardmeier, Torsten Sattler, Marc Pollefeys |
Abstract | Matching local image descriptors is a key step in many computer vision applications. For more than a decade, hand-crafted descriptors such as SIFT have been used for this task. Recently, multiple new descriptors learned from data have been proposed and shown to improve on SIFT in terms of discriminative power. This paper is dedicated to an extensive experimental evaluation of learned local features to establish a single evaluation protocol that ensures comparable results. In terms of matching performance, we evaluate the different descriptors regarding standard criteria. However, considering matching performance in isolation only provides an incomplete measure of a descriptor’s quality. For example, finding additional correct matches between similar images does not necessarily lead to a better performance when trying to match images under extreme viewpoint or illumination changes. Besides pure descriptor matching, we thus also evaluate the different descriptors in the context of image-based reconstruction. This enables us to study the descriptor performance on a set of more practical criteria including image retrieval, the ability to register images under strong viewpoint and illumination changes, and the accuracy and completeness of the reconstructed cameras and scenes. To facilitate future research, the full evaluation pipeline is made publicly available. |
Tasks | Image Retrieval |
Published | 2017-07-01 |
URL | https://www.cvg.ethz.ch/research/local-feature-evaluation/schoenberger2017comparative.pdf |
https://www.cvg.ethz.ch/research/local-feature-evaluation/schoenberger2017comparative.pdf | |
PWC | https://paperswithcode.com/paper/comparative-evaluation-of-hand-crafted-and-1 |
Repo | https://github.com/ahojnnes/local-feature-evaluation |
Framework | none |
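The matching-performance part of such an evaluation typically boils down to nearest-neighbour matching with Lowe's ratio test, then scoring the putative matches against ground-truth correspondences. A numpy sketch with random stand-in descriptors (the descriptor sizes and ratio threshold are arbitrary choices, not the paper's protocol):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in descriptors for two images (rows = keypoints).  In the evaluation
# these would be SIFT or learned descriptors extracted at detected keypoints.
desc1 = rng.normal(size=(200, 128)); desc1 /= np.linalg.norm(desc1, axis=1, keepdims=True)
desc2 = rng.normal(size=(180, 128)); desc2 /= np.linalg.norm(desc2, axis=1, keepdims=True)

def ratio_test_matches(d1, d2, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test, the standard
    matching criterion used in descriptor evaluations."""
    dists = np.linalg.norm(d1[:, None, :] - d2[None, :, :], axis=2)
    nn = np.argsort(dists, axis=1)[:, :2]  # two nearest neighbours per query
    best = dists[np.arange(len(d1)), nn[:, 0]]
    second = dists[np.arange(len(d1)), nn[:, 1]]
    keep = best < ratio * second
    return np.stack([np.arange(len(d1))[keep], nn[keep, 0]], axis=1)

matches = ratio_test_matches(desc1, desc2)
print(f"{len(matches)} putative matches")  # precision/recall would then be
                                           # computed against ground-truth correspondences
```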