May 7, 2019

3099 words 15 mins read

Paper Group AWR 26

Fully Character-Level Neural Machine Translation without Explicit Segmentation. A Kernel Test of Goodness of Fit. Latent Tree Language Model. The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain. Combining Data-driven and Model-driven Methods for Robust Facial Landmark …

Fully Character-Level Neural Machine Translation without Explicit Segmentation

Title Fully Character-Level Neural Machine Translation without Explicit Segmentation
Authors Jason Lee, Kyunghyun Cho, Thomas Hofmann
Abstract Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT’15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment.
Tasks Machine Translation
Published 2016-10-10
URL http://arxiv.org/abs/1610.03017v3
PDF http://arxiv.org/pdf/1610.03017v3.pdf
PWC https://paperswithcode.com/paper/fully-character-level-neural-machine
Repo https://github.com/newhiwoong/Keras-Applications
Framework none
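
A minimal sketch (not the paper's exact architecture, which uses multiple filter widths, highway layers and a bidirectional GRU) of the core idea from the abstract: a character-level convolutional encoder whose max-pooling shortens the source sequence before the recurrent layer, so character input can be processed at roughly subword-level speed.

```python
# Minimal sketch: character embeddings -> 1-D convolution -> max-pooling to
# shorten the sequence -> recurrent encoder. Sizes are illustrative only.
import torch
import torch.nn as nn

class CharConvEncoder(nn.Module):
    def __init__(self, vocab_size=300, char_dim=128, hidden_dim=256, pool_stride=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, char_dim)
        self.conv = nn.Conv1d(char_dim, hidden_dim, kernel_size=5, padding=2)
        self.pool = nn.MaxPool1d(kernel_size=pool_stride, stride=pool_stride)
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, char_ids):                  # (batch, seq_len)
        x = self.embed(char_ids).transpose(1, 2)  # (batch, char_dim, seq_len)
        x = torch.relu(self.conv(x))
        x = self.pool(x)                          # sequence length reduced ~5x
        x = x.transpose(1, 2)
        outputs, _ = self.rnn(x)
        return outputs

enc = CharConvEncoder()
dummy = torch.randint(0, 300, (2, 100))           # two sentences of 100 characters
print(enc(dummy).shape)                           # torch.Size([2, 20, 512])
```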

A Kernel Test of Goodness of Fit

Title A Kernel Test of Goodness of Fit
Authors Kacper Chwialkowski, Heiko Strathmann, Arthur Gretton
Abstract We propose a nonparametric statistical test for goodness-of-fit: given a set of samples, the test determines how likely it is that these were generated from a target density function. The measure of goodness-of-fit is a divergence constructed via Stein’s method using functions from a Reproducing Kernel Hilbert Space. Our test statistic is based on an empirical estimate of this divergence, taking the form of a V-statistic in terms of the log gradients of the target density and the kernel. We derive a statistical test, both for i.i.d. and non-i.i.d. samples, where we estimate the null distribution quantiles using a wild bootstrap procedure. We apply our test to quantifying convergence of approximate Markov Chain Monte Carlo methods, statistical model criticism, and evaluating quality of fit vs model complexity in nonparametric density estimation.
Tasks Density Estimation
Published 2016-02-09
URL http://arxiv.org/abs/1602.02964v4
PDF http://arxiv.org/pdf/1602.02964v4.pdf
PWC https://paperswithcode.com/paper/a-kernel-test-of-goodness-of-fit
Repo https://github.com/karlnapf/kernel_goodness_of_fit
Framework none
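
A small numpy sketch of the test statistic for a one-dimensional standard normal target (score function s(x) = -x) with an RBF kernel. The wild bootstrap that the paper uses to estimate null quantiles is omitted; only the V-statistic is shown.

```python
# Kernel Stein discrepancy V-statistic for a 1-D N(0,1) target with an RBF kernel.
import numpy as np

def ksd_vstat(x, sigma=1.0):
    x = np.asarray(x, dtype=float)
    s = -x                                  # score of N(0,1): d/dx log p(x) = -x
    diff = x[:, None] - x[None, :]          # pairwise differences x_i - x_j
    k = np.exp(-diff**2 / (2 * sigma**2))   # RBF kernel matrix
    # Stein kernel u_p(x_i, x_j) built from the score and kernel derivatives:
    u = k * (s[:, None] * s[None, :]
             + s[:, None] * diff / sigma**2
             - s[None, :] * diff / sigma**2
             + 1.0 / sigma**2
             - diff**2 / sigma**4)
    return u.mean()                         # V-statistic (diagonal terms included)

rng = np.random.default_rng(0)
print(ksd_vstat(rng.normal(size=500)))                # near 0: samples fit the target
print(ksd_vstat(rng.normal(loc=1.0, size=500)))       # clearly larger: mismatch
```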

Latent Tree Language Model

Title Latent Tree Language Model
Authors Tomas Brychcin
Abstract In this paper we introduce the Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes the syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving nodes according to Gibbs sampling. We introduce two algorithms to infer a tree for a given sentence. The first is based on Gibbs sampling; it is fast, but does not guarantee finding the most probable tree. The second is based on dynamic programming; it is slower, but guarantees finding the most probable tree. We provide a comparison of both algorithms. We combine LTLM with a 4-gram Modified Kneser-Ney language model via linear interpolation. Our experiments with English and Czech corpora show significant perplexity reductions (up to 46% for English and 49% for Czech) compared with the standalone 4-gram Modified Kneser-Ney language model.
Tasks Language Modelling
Published 2016-07-24
URL http://arxiv.org/abs/1607.07057v3
PDF http://arxiv.org/pdf/1607.07057v3.pdf
PWC https://paperswithcode.com/paper/latent-tree-language-model
Repo https://github.com/brychcin/LTLM
Framework none
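
A tiny sketch of the linear interpolation used to combine LTLM with the modified Kneser-Ney model and of the perplexity being reported. The per-word probability functions `p_ltlm` and `p_kn` are hypothetical placeholders for the two models.

```python
# Linear interpolation of two language models and the resulting perplexity.
import math

def interpolated_log_prob(words, p_ltlm, p_kn, lam=0.5):
    """Log probability of a sentence under the interpolated model."""
    return sum(math.log(lam * p_ltlm(w, i, words) + (1 - lam) * p_kn(w, i, words))
               for i, w in enumerate(words))

def perplexity(sentences, p_ltlm, p_kn, lam=0.5):
    log_prob = sum(interpolated_log_prob(s, p_ltlm, p_kn, lam) for s in sentences)
    n_words = sum(len(s) for s in sentences)
    return math.exp(-log_prob / n_words)

# Toy usage with uniform dummy models over a 10-word vocabulary:
uniform = lambda w, i, ctx: 0.1
print(perplexity([["a", "b", "c"]], uniform, uniform))    # 10.0 for a uniform model
```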

The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain

Title The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain
Authors Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith
Abstract This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, edges between any combination of nodes can be materialized and scored to reveal latent relationships between those nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendation systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) by dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
Tasks Anomaly Detection
Published 2016-09-02
URL http://arxiv.org/abs/1609.00464v2
PDF http://arxiv.org/pdf/1609.00464v2.pdf
PWC https://paperswithcode.com/paper/the-semantic-knowledge-graph-a-compact-auto
Repo https://github.com/shalder/knowledge_graph
Framework none
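
A toy sketch of the central mechanism: nodes are terms in an inverted index, an edge between two nodes is materialized from the intersection of their postings lists, and the edge is scored against corpus statistics. The lift-based score here is only an illustration, not the paper's exact relatedness measure.

```python
# Toy semantic-knowledge-graph edge materialization from an inverted index.
from collections import defaultdict

docs = {
    0: "java software engineer",
    1: "java developer spring",
    2: "registered nurse hospital",
    3: "java spring hibernate developer",
}

postings = defaultdict(set)                  # inverted index: term -> doc ids
for doc_id, text in docs.items():
    for term in text.split():
        postings[term].add(doc_id)

def edge(term_a, term_b):
    """Materialize and score the edge between two term nodes."""
    inter = postings[term_a] & postings[term_b]
    p_a = len(postings[term_a]) / len(docs)
    p_b = len(postings[term_b]) / len(docs)
    p_ab = len(inter) / len(docs)
    lift = p_ab / (p_a * p_b) if p_a and p_b else 0.0
    return inter, lift

print(edge("java", "spring"))    # ({1, 3}, 1.33...): strongly related terms
print(edge("java", "nurse"))     # (set(), 0.0): no latent relationship found
```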

Combining Data-driven and Model-driven Methods for Robust Facial Landmark Detection

Title Combining Data-driven and Model-driven Methods for Robust Facial Landmark Detection
Authors Hongwen Zhang, Qi Li, Zhenan Sun, Yunfan Liu
Abstract Facial landmark detection is an important yet challenging task for real-world computer vision applications. This paper proposes an effective and robust approach for facial landmark detection by combining data-driven and model-driven methods. First, a Fully Convolutional Network (FCN) is trained to compute response maps of all facial landmark points. Such a data-driven method can make full use of holistic information in a facial image for global estimation of facial landmarks. After that, the maximum points in the response maps are fitted with a pre-trained Point Distribution Model (PDM) to generate the initial facial shape. This model-driven method is able to correct the inaccurate locations of outliers by considering the shape prior information. Finally, a weighted version of Regularized Landmark Mean-Shift (RLMS) is employed to fine-tune the facial shape iteratively. This Estimation-Correction-Tuning process combines the advantages of the data-driven method's global robustness (FCN), the model-driven method's outlier-correction capability (PDM), and the non-parametric optimization of RLMS. Results of extensive experiments demonstrate that our approach achieves state-of-the-art performance on challenging datasets including 300W, AFLW, AFW and COFW. The proposed method is able to produce satisfactory detection results on face images with exaggerated expressions, large head poses, and partial occlusions.
Tasks Facial Landmark Detection
Published 2016-11-30
URL http://arxiv.org/abs/1611.10152v2
PDF http://arxiv.org/pdf/1611.10152v2.pdf
PWC https://paperswithcode.com/paper/combining-data-driven-and-model-driven
Repo https://github.com/HongwenZhang/ECT-FaceAlignment
Framework none
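
A numpy sketch of the estimation-correction step: take the maxima of per-landmark response maps (the data-driven estimate), then project the raw shape onto a PCA shape basis standing in for the Point Distribution Model (the model-driven correction). The shape basis here is random and purely illustrative.

```python
# Estimation (argmax of response maps) followed by PDM-style correction
# (projection onto a low-dimensional shape subspace).
import numpy as np

def argmax_landmarks(response_maps):
    """response_maps: (num_landmarks, H, W) -> (num_landmarks, 2) (x, y) coords."""
    n, h, w = response_maps.shape
    flat = response_maps.reshape(n, -1).argmax(axis=1)
    return np.stack([flat % w, flat // w], axis=1).astype(float)

def pdm_correct(shape, mean_shape, basis):
    """Project a raw shape onto the PDM subspace: s ≈ mean + basis @ params."""
    params = basis.T @ (shape.ravel() - mean_shape)   # least-squares coefficients
    return (mean_shape + basis @ params).reshape(shape.shape)

rng = np.random.default_rng(0)
maps = rng.random((68, 64, 64))                          # fake response maps
raw = argmax_landmarks(maps)
mean_shape = np.full(68 * 2, 32.0)                       # toy mean shape
basis, _ = np.linalg.qr(rng.normal(size=(68 * 2, 10)))   # orthonormal 10-D shape basis
print(pdm_correct(raw, mean_shape, basis).shape)         # (68, 2)
```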

Specialized Support Vector Machines for Open-set Recognition

Title Specialized Support Vector Machines for Open-set Recognition
Authors Pedro Ribeiro Mendes Júnior, Terrance E. Boult, Jacques Wainer, Anderson Rocha
Abstract Often, when dealing with real-world recognition problems, we do not need, and often cannot have, knowledge of the entire set of possible classes that might appear during operational testing. In such cases, we need robust classification methods able to deal with the “unknown” and properly reject samples belonging to classes never seen during training. Nevertheless, almost all classifiers to date were developed for the closed-set scenario, i.e., the classification setup in which it is assumed that all test samples belong to one of the classes with which the classifier was trained. In the open-set scenario, however, a test sample can belong to none of the known classes and the classifier must properly reject it by classifying it as unknown. In this work, we extend upon the well-known Support Vector Machines (SVM) classifier and introduce the Specialized Support Vector Machines (SSVM), which is suitable for recognition in open-set setups. SSVM balances the empirical risk and the risk of the unknown and ensures that the region of the feature space in which a test sample would be classified as known (one of the known classes) is always bounded, ensuring a finite risk of the unknown. In this work, we also highlight the properties of the SVM classifier related to the open-set scenario, and provide necessary and sufficient conditions for an RBF SVM to have bounded open-space risk.
Tasks Open Set Learning
Published 2016-06-13
URL https://arxiv.org/abs/1606.03802v9
PDF https://arxiv.org/pdf/1606.03802v9.pdf
PWC https://paperswithcode.com/paper/specialized-support-vector-machines-for-open
Repo https://github.com/pedrormjunior/ssvm-results
Framework none
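
SSVM itself is not available in scikit-learn; the sketch below only illustrates the open-set protocol the abstract describes: a test sample is assigned to a known class only if some one-vs-rest RBF SVM score is confident enough, and is otherwise rejected as unknown.

```python
# Open-set rejection with thresholded one-vs-rest RBF SVM scores (illustrative
# protocol only, not the SSVM classifier from the paper).
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
X_known = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in centers])
y_known = np.repeat([0, 1, 2], 50)

clf = OneVsRestClassifier(SVC(kernel="rbf", gamma=1.0)).fit(X_known, y_known)

def predict_open_set(X, threshold=0.0):
    scores = clf.decision_function(X)                  # (n_samples, n_classes)
    best = scores.argmax(axis=1)
    return np.where(scores.max(axis=1) > threshold, best, -1)   # -1 means "unknown"

X_test = np.array([[0.1, 0.0], [2.9, 0.1], [10.0, -10.0]])
print(predict_open_set(X_test))   # e.g. [ 0  1 -1]: the far-away sample is rejected
```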

Model-based Deep Hand Pose Estimation

Title Model-based Deep Hand Pose Estimation
Authors Xingyi Zhou, Qingfu Wan, Wei Zhang, Xiangyang Xue, Yichen Wei
Abstract Previous learning-based hand pose estimation methods do not fully exploit the prior information in hand model geometry. Instead, they usually rely on a separate model fitting step to generate valid hand poses. Such post-processing is inconvenient and sub-optimal. In this work, we propose a model-based deep learning approach that adopts a forward kinematics based layer to ensure the geometric validity of estimated poses. For the first time, we show that embedding such a non-linear generative process in deep learning is feasible for hand pose estimation. Our approach is verified on challenging public datasets and achieves state-of-the-art performance.
Tasks Hand Pose Estimation, Pose Estimation
Published 2016-06-22
URL http://arxiv.org/abs/1606.06854v1
PDF http://arxiv.org/pdf/1606.06854v1.pdf
PWC https://paperswithcode.com/paper/model-based-deep-hand-pose-estimation
Repo https://github.com/tenstep/DeepModel
Framework none
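
A simplified sketch of the key component: a differentiable forward kinematics layer. The paper's layer handles the full 3-D hand model; this planar chain only shows that joint positions can be produced from predicted angles with differentiable operations, so a loss on positions back-propagates to the angles.

```python
# Differentiable forward kinematics for a planar kinematic chain (simplified).
import torch
import torch.nn as nn

class PlanarForwardKinematics(nn.Module):
    def __init__(self, bone_lengths):
        super().__init__()
        self.register_buffer("lengths", torch.tensor(bone_lengths))

    def forward(self, angles):                     # angles: (batch, num_joints)
        cum_angles = torch.cumsum(angles, dim=1)   # each joint rotates its children
        steps = torch.stack([torch.cos(cum_angles), torch.sin(cum_angles)], dim=-1)
        positions = torch.cumsum(self.lengths[None, :, None] * steps, dim=1)
        return positions                           # (batch, num_joints, 2)

fk = PlanarForwardKinematics([1.0, 0.8, 0.5])
angles = torch.zeros(1, 3, requires_grad=True)
pos = fk(angles)
print(pos)                                         # straight chain along the x-axis
pos.sum().backward()                               # gradients flow back to the angles
print(angles.grad.shape)                           # torch.Size([1, 3])
```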

Smart Reply: Automated Response Suggestion for Email

Title Smart Reply: Automated Response Suggestion for Email
Authors Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter Young, Vivek Ramavajjala
Abstract In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. It generates semantically diverse suggestions that can be used as complete email responses with just one tap on mobile. The system is currently used in Inbox by Gmail and is responsible for assisting with 10% of all mobile responses. It is designed to work at very high throughput and process hundreds of millions of messages daily. The system exploits state-of-the-art, large-scale deep learning. We describe the architecture of the system as well as the challenges that we faced while building it, like response diversity and scalability. We also introduce a new method for semantic clustering of user-generated content that requires only a modest amount of explicitly labeled data.
Tasks
Published 2016-06-15
URL http://arxiv.org/abs/1606.04870v1
PDF http://arxiv.org/pdf/1606.04870v1.pdf
PWC https://paperswithcode.com/paper/smart-reply-automated-response-suggestion-for
Repo https://github.com/yatindma/Automated-Response-Suggestion-for-Email
Framework none
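
A toy sketch of one component the abstract highlights: enforcing response diversity by keeping at most one suggested reply per semantic cluster. Cluster labels are hand-assigned here; in the real system they come from the learned semantic clustering.

```python
# Pick the top-k candidate replies while allowing at most one per semantic cluster.
def diverse_suggestions(ranked_candidates, clusters, k=3):
    """ranked_candidates: replies sorted by model score, best first."""
    picked, used_clusters = [], set()
    for reply in ranked_candidates:
        cluster = clusters[reply]
        if cluster not in used_clusters:
            picked.append(reply)
            used_clusters.add(cluster)
        if len(picked) == k:
            break
    return picked

candidates = ["Yes, I can.", "Sure, I can do that.", "No, sorry.", "Let me check."]
clusters = {"Yes, I can.": "affirmative", "Sure, I can do that.": "affirmative",
            "No, sorry.": "negative", "Let me check.": "defer"}
print(diverse_suggestions(candidates, clusters))
# ['Yes, I can.', 'No, sorry.', 'Let me check.']  -- one reply per cluster
```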

Adversarially Learned Inference

Title Adversarially Learned Inference
Authors Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville
Abstract We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process. The generation network maps samples from stochastic latent variables to the data space while the inference network maps training examples in data space to the space of latent variables. An adversarial game is cast between these two networks and a discriminative network is trained to distinguish between joint latent/data-space samples from the generative network and joint samples from the inference network. We illustrate the ability of the model to learn mutually coherent inference and generation networks through inspection of model samples and reconstructions, and confirm the usefulness of the learned representations by obtaining performance competitive with the state of the art on the semi-supervised SVHN and CIFAR10 tasks.
Tasks Image Generation, Image-to-Image Translation
Published 2016-06-02
URL http://arxiv.org/abs/1606.00704v3
PDF http://arxiv.org/pdf/1606.00704v3.pdf
PWC https://paperswithcode.com/paper/adversarially-learned-inference
Repo https://github.com/lkhphuc/Anomaly-BiGAN
Framework pytorch
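
A compact sketch of the ALI game with toy MLPs: the discriminator sees joint (x, z) pairs from the inference network (real x, inferred z) and from the generation network (generated x, prior z). Network sizes and the single forward pass shown are illustrative only.

```python
# One step of the ALI objective with toy fully-connected networks.
import torch
import torch.nn as nn
import torch.nn.functional as F

x_dim, z_dim = 784, 64
encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))
discriminator = nn.Sequential(nn.Linear(x_dim + z_dim, 256), nn.ReLU(), nn.Linear(256, 1))

x_real = torch.rand(32, x_dim)            # stand-in for a batch of images
z_prior = torch.randn(32, z_dim)

z_hat = encoder(x_real)                   # inference network: x -> z
x_fake = decoder(z_prior)                 # generation network: z -> x

d_real = discriminator(torch.cat([x_real, z_hat], dim=1))    # (x, q(z|x)) pairs
d_fake = discriminator(torch.cat([x_fake, z_prior], dim=1))  # (p(x|z), z) pairs

# Discriminator tries to tell the two joint distributions apart ...
d_loss = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) + \
         F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
# ... while the encoder and decoder are trained with the labels flipped.
g_loss = F.binary_cross_entropy_with_logits(d_real, torch.zeros_like(d_real)) + \
         F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
print(d_loss.item(), g_loss.item())
```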

ASAGA: Asynchronous Parallel SAGA

Title ASAGA: Asynchronous Parallel SAGA
Authors Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien
Abstract We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates. Through a novel perspective, we revisit and clarify a subtle but important technical issue present in a large fraction of the recent convergence rate proofs for asynchronous parallel optimization algorithms, and propose a simplification of the recently introduced “perturbed iterate” framework that resolves it. We thereby prove that ASAGA can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions. We present results of an implementation on a 40-core architecture illustrating the practical speedup as well as the hardware overhead.
Tasks
Published 2016-06-15
URL http://arxiv.org/abs/1606.04809v3
PDF http://arxiv.org/pdf/1606.04809v3.pdf
PWC https://paperswithcode.com/paper/asaga-asynchronous-parallel-saga
Repo https://github.com/RemiLeblond/ASAGA
Framework none
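
A serial (single-threaded) SAGA sketch on a least-squares problem, to show the variance-reduced update the paper builds on. ASAGA runs updates like this asynchronously across cores without locks, which is not reproduced here.

```python
# Serial SAGA on least squares: keep a table of past per-sample gradients and
# their running average, and use them to reduce the variance of each update.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.01 * rng.normal(size=n)

def grad_i(x, i):                      # gradient of the i-th squared-error term
    return A[i] * (A[i] @ x - b[i])

x = np.zeros(d)
memory = np.array([grad_i(x, i) for i in range(n)])   # table of past gradients
avg = memory.mean(axis=0)
step = 1.0 / (3 * (np.linalg.norm(A, axis=1) ** 2).max())

for _ in range(20 * n):                # roughly 20 passes over the data
    i = rng.integers(n)
    g = grad_i(x, i)
    x -= step * (g - memory[i] + avg)  # SAGA update: variance-reduced gradient
    avg += (g - memory[i]) / n         # keep the running average consistent
    memory[i] = g

print(np.linalg.norm(A @ x - b))       # small residual: close to the least-squares fit
```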

Machine Comprehension Using Match-LSTM and Answer Pointer

Title Machine Comprehension Using Match-LSTM and Answer Pointer
Authors Shuohang Wang, Jing Jiang
Abstract Machine comprehension of text is an important problem in natural language processing. A recently released dataset, the Stanford Question Answering Dataset (SQuAD), offers a large number of real questions and their answers created by humans through crowdsourcing. SQuAD provides a challenging testbed for evaluating machine comprehension algorithms, partly because, compared with previous datasets, in SQuAD the answers do not come from a small set of candidate answers and they have variable lengths. We propose an end-to-end neural architecture for the task. The architecture is based on match-LSTM, a model we proposed previously for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015) to constrain the output tokens to be from the input sequences. We propose two ways of using Pointer Net for our task. Our experiments show that both of our models substantially outperform the best results obtained by Rajpurkar et al. (2016) using logistic regression and manually crafted features.
Tasks Natural Language Inference, Question Answering, Reading Comprehension
Published 2016-08-29
URL http://arxiv.org/abs/1608.07905v2
PDF http://arxiv.org/pdf/1608.07905v2.pdf
PWC https://paperswithcode.com/paper/machine-comprehension-using-match-lstm-and
Repo https://github.com/shuohangwang/SeqMatchSeq
Framework torch
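
A minimal sketch of the boundary-style answer pointer: attention over the passage encodings gives a distribution over start positions, and the end pointer is conditioned on a summary of that start decision. The match-LSTM passage/question encoder itself is not reproduced, and the exact conditioning differs from the paper.

```python
# Boundary-style answer pointer over precomputed passage encodings.
import torch
import torch.nn as nn

class AnswerPointer(nn.Module):
    def __init__(self, hidden_dim=128):
        super().__init__()
        self.start_attn = nn.Linear(hidden_dim, 1)
        self.end_attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, passage_enc):                       # (batch, seq_len, hidden)
        start_logits = self.start_attn(passage_enc).squeeze(-1)
        start_probs = torch.softmax(start_logits, dim=1)
        # Summarize the start decision and condition the end pointer on it.
        start_summary = (start_probs.unsqueeze(-1) * passage_enc).sum(dim=1)
        expanded = start_summary.unsqueeze(1).expand_as(passage_enc)
        end_logits = self.end_attn(torch.cat([passage_enc, expanded], dim=-1)).squeeze(-1)
        return start_logits, end_logits                   # token-level span scores

pointer = AnswerPointer()
enc = torch.randn(2, 50, 128)                             # fake passage encodings
start, end = pointer(enc)
print(start.argmax(dim=1), end.argmax(dim=1))             # predicted span boundaries
```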

Estimating Mixture Models via Mixtures of Polynomials

Title Estimating Mixture Models via Mixtures of Polynomials
Authors Sida I. Wang, Arun Tejasvi Chaganty, Percy Liang
Abstract Mixture modeling is a general technique for making any simple model more expressive through weighted combination. This generality and simplicity in part explains the success of the Expectation Maximization (EM) algorithm, in which updates are easy to derive for a wide class of mixture models. However, the likelihood of a mixture model is non-convex, so EM has no known global convergence guarantees. Recently, method-of-moments approaches have offered global guarantees for some mixture models, but they do not extend easily to the range of mixture models that exist. In this work, we present Polymom, a unifying framework based on the method of moments in which estimation procedures are easily derivable, just as in EM. Polymom is applicable when the moments of a single mixture component are polynomials of the parameters. Our key observation is that the moments of the mixture model are a mixture of these polynomials, which allows us to cast estimation as a Generalized Moment Problem. We solve its relaxations using semidefinite optimization, and then extract parameters using ideas from computer algebra. This framework allows us to draw insights and apply tools from convex optimization, computer algebra and the theory of moments to study problems in statistical estimation.
Tasks
Published 2016-03-28
URL http://arxiv.org/abs/1603.08482v1
PDF http://arxiv.org/pdf/1603.08482v1.pdf
PWC https://paperswithcode.com/paper/estimating-mixture-models-via-mixtures-of
Repo https://github.com/sidaw/mompy
Framework none
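
A toy illustration of the key observation: the moments of a mixture are mixtures of polynomials in the component parameters. For an equal-weight mixture of two unit-variance Gaussians, the first two moments give polynomial equations in the two means; here they are solved with a generic root finder rather than the paper's semidefinite relaxation.

```python
# Moment equations for an equal-weight mixture of two unit-variance Gaussians.
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(0)
true_means = np.array([-2.0, 3.0])
data = np.concatenate([rng.normal(m, 1.0, 5000) for m in true_means])

m1, m2 = data.mean(), (data ** 2).mean()       # empirical first and second moments

def moment_equations(mu):
    # E[x]   = 0.5*mu1 + 0.5*mu2
    # E[x^2] = 0.5*(mu1^2 + 1) + 0.5*(mu2^2 + 1)
    return [0.5 * (mu[0] + mu[1]) - m1,
            0.5 * (mu[0] ** 2 + mu[1] ** 2) + 1.0 - m2]

print(np.sort(fsolve(moment_equations, [-1.0, 1.0])))   # close to [-2, 3]
```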

Fashion Landmark Detection in the Wild

Title Fashion Landmark Detection in the Wild
Authors Ziwei Liu, Sijie Yan, Ping Luo, Xiaogang Wang, Xiaoou Tang
Abstract Visual fashion analysis has attracted much attention in recent years. Previous work represented clothing regions by either bounding boxes or human joints. This work presents fashion landmark detection, or fashion alignment, which aims to predict the positions of functional key points defined on the fashion items, such as the corners of the neckline, hemline, and cuff. To encourage future studies, we introduce a fashion landmark dataset with over 120K images, where each image is labeled with eight landmarks. With this dataset, we study fashion alignment by cascading multiple convolutional neural networks in three stages. These stages gradually improve the accuracy of landmark predictions. Extensive experiments demonstrate the effectiveness of the proposed method, as well as its generalization ability to pose estimation. Fashion landmarks are also compared to clothing bounding boxes and human joints in two applications, fashion attribute prediction and clothes retrieval, showing that fashion landmarks are a more discriminative representation for understanding fashion images.
Tasks Pose Estimation
Published 2016-08-10
URL http://arxiv.org/abs/1608.03049v1
PDF http://arxiv.org/pdf/1608.03049v1.pdf
PWC https://paperswithcode.com/paper/fashion-landmark-detection-in-the-wild
Repo https://github.com/shumming/GLE_FLD
Framework pytorch
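
A minimal sketch of the cascading idea only: each stage receives the image together with the previous stage's landmark estimate and predicts a correction. The actual three-stage networks in the paper are considerably more involved; sizes and the toy backbone below are illustrative.

```python
# Three-stage cascade in which every stage refines the previous landmark estimate.
import torch
import torch.nn as nn

NUM_LANDMARKS = 8                                    # the dataset labels 8 landmarks

class Stage(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32 + 2 * NUM_LANDMARKS, 2 * NUM_LANDMARKS)

    def forward(self, image, prev_landmarks):
        feats = self.backbone(image)
        delta = self.head(torch.cat([feats, prev_landmarks], dim=1))
        return prev_landmarks + delta                # refine the previous estimate

stages = nn.ModuleList([Stage() for _ in range(3)])
image = torch.rand(4, 3, 224, 224)
landmarks = torch.zeros(4, 2 * NUM_LANDMARKS)        # start from a neutral guess
for stage in stages:
    landmarks = stage(image, landmarks)
print(landmarks.shape)                               # torch.Size([4, 16])
```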

MITRE at SemEval-2016 Task 6: Transfer Learning for Stance Detection

Title MITRE at SemEval-2016 Task 6: Transfer Learning for Stance Detection
Authors Guido Zarrella, Amy Marsh
Abstract We describe MITRE’s submission to the SemEval-2016 Task 6, Detecting Stance in Tweets. This effort achieved the top score in Task A on supervised stance detection, producing an average F1 score of 67.8 when assessing whether a tweet author was in favor or against a topic. We employed a recurrent neural network initialized with features learned via distant supervision on two large unlabeled datasets. We trained embeddings of words and phrases with the word2vec skip-gram method, then used those features to learn sentence representations via a hashtag prediction auxiliary task. These sentence vectors were then fine-tuned for stance detection on several hundred labeled examples. The result was a high performing system that used transfer learning to maximize the value of the available training data.
Tasks Stance Detection, Transfer Learning
Published 2016-06-13
URL http://arxiv.org/abs/1606.03784v1
PDF http://arxiv.org/pdf/1606.03784v1.pdf
PWC https://paperswithcode.com/paper/mitre-at-semeval-2016-task-6-transfer
Repo https://github.com/DamiFur/Twitter-semeval2016
Framework none
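
A small sketch of the transfer-learning recipe from the abstract: initialize the embedding layer of a recurrent classifier from pretrained skip-gram vectors, then fine-tune on the small labeled stance set. A random matrix stands in for the real word2vec vectors, and the hashtag-prediction pretraining stage is not shown.

```python
# RNN stance classifier whose embeddings are initialized from pretrained vectors.
import torch
import torch.nn as nn

vocab_size, embed_dim, num_classes = 5000, 100, 3     # FAVOR / AGAINST / NONE

pretrained = torch.randn(vocab_size, embed_dim)       # stand-in for word2vec vectors

class StanceClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=False)
        self.rnn = nn.LSTM(embed_dim, 128, batch_first=True)
        self.out = nn.Linear(128, num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        _, (h, _) = self.rnn(self.embed(token_ids))
        return self.out(h[-1])                         # logits over stance labels

model = StanceClassifier()
tweets = torch.randint(0, vocab_size, (8, 30))         # a fake labeled mini-batch
print(model(tweets).shape)                             # torch.Size([8, 3])
```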

Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes

Title Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes
Authors Tobias Pohlen, Alexander Hermans, Markus Mathias, Bastian Leibe
Abstract Semantic image segmentation is an essential component of modern autonomous driving systems, as an accurate understanding of the surrounding scene is crucial to navigation and action planning. Current state-of-the-art approaches in semantic image segmentation rely on pre-trained networks that were initially developed for classifying images as a whole. While these networks exhibit outstanding recognition performance (i.e., what is visible?), they lack localization accuracy (i.e., where precisely is something located?). Therefore, additional processing steps have to be performed in order to obtain pixel-accurate segmentation masks at the full image resolution. To alleviate this problem we propose a novel ResNet-like architecture that exhibits strong localization and recognition performance. We combine multi-scale context with pixel-level accuracy by using two processing streams within our network: One stream carries information at the full image resolution, enabling precise adherence to segment boundaries. The other stream undergoes a sequence of pooling operations to obtain robust features for recognition. The two streams are coupled at the full image resolution using residuals. Without additional processing steps and without pre-training, our approach achieves an intersection-over-union score of 71.8% on the Cityscapes dataset.
Tasks Autonomous Driving, Real-Time Semantic Segmentation, Semantic Segmentation
Published 2016-11-24
URL http://arxiv.org/abs/1611.08323v2
PDF http://arxiv.org/pdf/1611.08323v2.pdf
PWC https://paperswithcode.com/paper/full-resolution-residual-networks-for
Repo https://github.com/robin-chan/decision-rules
Framework tf
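
A minimal sketch of the two-stream coupling the abstract describes: a full-resolution residual stream and a pooled stream exchange information in an FRRU-like unit, and the pooled result is added back to the residual stream at full resolution. Channel counts and depth are illustrative, not the paper's configuration.

```python
# FRRU-style unit coupling a full-resolution residual stream with a pooled stream.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FRRUnit(nn.Module):
    def __init__(self, res_ch=32, pool_ch=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(res_ch + pool_ch, pool_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(pool_ch, pool_ch, 3, padding=1), nn.ReLU())
        self.to_residual = nn.Conv2d(pool_ch, res_ch, 1)

    def forward(self, residual, pooled):
        # Bring the residual stream down to the pooled resolution and fuse.
        down = F.max_pool2d(residual, kernel_size=2)
        pooled = self.conv(torch.cat([down, pooled], dim=1))
        # Send the fused features back up to full resolution as a residual.
        up = F.interpolate(self.to_residual(pooled), scale_factor=2, mode="nearest")
        return residual + up, pooled

unit = FRRUnit()
residual = torch.rand(1, 32, 128, 128)       # full-resolution stream
pooled = torch.rand(1, 64, 64, 64)           # downsampled stream
residual, pooled = unit(residual, pooled)
print(residual.shape, pooled.shape)          # full resolution preserved throughout
```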