Paper Group AWR 26
Fully Character-Level Neural Machine Translation without Explicit Segmentation
Title | Fully Character-Level Neural Machine Translation without Explicit Segmentation |
Authors | Jason Lee, Kyunghyun Cho, Thomas Hofmann |
Abstract | Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT’15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of BLEU score and human judgment. |
Tasks | Machine Translation |
Published | 2016-10-10 |
URL | http://arxiv.org/abs/1610.03017v3 |
PDF | http://arxiv.org/pdf/1610.03017v3.pdf |
PWC | https://paperswithcode.com/paper/fully-character-level-neural-machine |
Repo | https://github.com/newhiwoong/Keras-Applications |
Framework | none |
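As a rough illustration of the encoder idea in the abstract above (embed characters, convolve, then max-pool so the source representation becomes shorter), here is a minimal pure-Python sketch. It is not the authors' architecture: the single filter, the ReLU, the 1-dimensional `embed` lookup, and the `stride` value are all assumptions chosen to keep the example small.

```python
def char_conv_maxpool(text, embed, filt, stride):
    """Toy character encoder: embed, convolve with one filter, ReLU, max-pool.

    embed  -- hypothetical lookup: character -> list of floats (dimension d)
    filt   -- one convolution filter: `width` rows of d weights each
    stride -- pooling window; the output is roughly len(text) / stride long
    """
    xs = [embed[c] for c in text]
    width = len(filt)
    conv = []
    # "Valid" convolution: one activation per window of `width` characters.
    for i in range(len(xs) - width + 1):
        act = sum(
            x * w
            for k in range(width)
            for x, w in zip(xs[i + k], filt[k])
        )
        conv.append(max(act, 0.0))  # ReLU non-linearity (an assumption)
    # Non-overlapping max-pooling shortens the sequence by about `stride`.
    return [max(conv[i:i + stride]) for i in range(0, len(conv), stride)]


# Hypothetical 1-dimensional embedding: a -> 1.0, b -> 2.0, ...
embed = {c: [float(ord(c) - 96)] for c in "abcdefgh"}
pooled = char_conv_maxpool("abcdefgh", embed, filt=[[1.0], [1.0]], stride=3)
print(len("abcdefgh"), "characters ->", len(pooled), "pooled positions")
```

The point of the pooling step is the length reduction: a recurrent layer placed on top would see 3 positions here instead of 8.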
A Kernel Test of Goodness of Fit
Title | A Kernel Test of Goodness of Fit |
Authors | Kacper Chwialkowski, Heiko Strathmann, Arthur Gretton |
Abstract | We propose a nonparametric statistical test for goodness-of-fit: given a set of samples, the test determines how likely it is that these were generated from a target density function. The measure of goodness-of-fit is a divergence constructed via Stein’s method using functions from a Reproducing Kernel Hilbert Space. Our test statistic is based on an empirical estimate of this divergence, taking the form of a V-statistic in terms of the log gradients of the target density and the kernel. We derive a statistical test, both for i.i.d. and non-i.i.d. samples, where we estimate the null distribution quantiles using a wild bootstrap procedure. We apply our test to quantifying convergence of approximate Markov Chain Monte Carlo methods, statistical model criticism, and evaluating quality of fit vs model complexity in nonparametric density estimation. |
Tasks | Density Estimation |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.02964v4 |
PDF | http://arxiv.org/pdf/1602.02964v4.pdf |
PWC | https://paperswithcode.com/paper/a-kernel-test-of-goodness-of-fit |
Repo | https://github.com/karlnapf/kernel_goodness_of_fit |
Framework | none |
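The test statistic described in the abstract above is a V-statistic built from the log gradient of the target density and a kernel. The sketch below shows one way that could look in one dimension, assuming a Gaussian (RBF) kernel with bandwidth `sigma` and a standard normal target whose score is `-x`. These choices are illustrative assumptions, and the wild bootstrap used to estimate the null quantiles is not included.

```python
import math


def stein_kernel(x, y, score, sigma=1.0):
    """Stein kernel h_p(x, y) for a 1-D RBF kernel k and target score function."""
    d = x - y
    k = math.exp(-d * d / (2.0 * sigma ** 2))
    dk_dx = -d / sigma ** 2 * k                       # derivative of k in x
    dk_dy = d / sigma ** 2 * k                        # derivative of k in y
    d2k = (1.0 / sigma ** 2 - d * d / sigma ** 4) * k  # mixed second derivative
    return (score(x) * score(y) * k
            + score(x) * dk_dy
            + score(y) * dk_dx
            + d2k)


def ksd_v_statistic(samples, score, sigma=1.0):
    """V-statistic: average of the Stein kernel over all ordered sample pairs."""
    n = len(samples)
    total = sum(stein_kernel(x, y, score, sigma) for x in samples for y in samples)
    return total / (n * n)


def normal_score(x):
    """d/dx log p(x) for a standard normal target."""
    return -x


near = [-1.0, -0.5, 0.0, 0.5, 1.0]   # roughly centred on the target
far = [x + 3.0 for x in near]        # same shape, shifted away from it
ksd_near = ksd_v_statistic(near, normal_score)
ksd_far = ksd_v_statistic(far, normal_score)
print(ksd_near, ksd_far)
```

Only the score of the target is needed, so the normalising constant of the density never has to be computed.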
Latent Tree Language Model
Title | Latent Tree Language Model |
Authors | Tomas Brychcin |
Abstract | In this paper we introduce the Latent Tree Language Model (LTLM), a novel approach to language modeling that encodes the syntax and semantics of a given sentence as a tree of word roles. The learning phase iteratively updates the trees by moving nodes according to Gibbs sampling. We introduce two algorithms to infer a tree for a given sentence. The first one is based on Gibbs sampling. It is fast, but is not guaranteed to find the most probable tree. The second one is based on dynamic programming. It is slower, but is guaranteed to find the most probable tree. We provide a comparison of both algorithms. We combine LTLM with a 4-gram Modified Kneser-Ney language model via linear interpolation. Our experiments with English and Czech corpora show significant perplexity reductions (up to 46% for English and 49% for Czech) compared with a standalone 4-gram Modified Kneser-Ney language model. |
Tasks | Language Modelling |
Published | 2016-07-24 |
URL | http://arxiv.org/abs/1607.07057v3 |
PDF | http://arxiv.org/pdf/1607.07057v3.pdf |
PWC | https://paperswithcode.com/paper/latent-tree-language-model |
Repo | https://github.com/brychcin/LTLM |
Framework | none |
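The abstract above combines LTLM with a 4-gram model by linear interpolation and reports perplexity. A minimal sketch of those two operations follows; the interpolation weight `lam` and all per-word probabilities are made-up values for illustration, not figures from the paper.

```python
import math


def interpolate(p_ltlm, p_ngram, lam):
    """Linear interpolation of two models' probabilities for the same word."""
    return lam * p_ltlm + (1.0 - lam) * p_ngram


def perplexity(word_probs):
    """Perplexity: exp of the average negative log-probability per word."""
    avg_log = sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(-avg_log)


# Hypothetical per-word probabilities from each model on a 3-word sentence.
ltlm = [0.10, 0.02, 0.30]
ngram = [0.05, 0.08, 0.20]
mixed = [interpolate(a, b, lam=0.5) for a, b in zip(ltlm, ngram)]

print(perplexity(ltlm), perplexity(ngram), perplexity(mixed))
```

In this toy case the interpolated model has lower perplexity than either input, because each model covers for the word on which the other is weak.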
The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain
Title | The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain |
Authors | Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith |
Abstract | This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendation systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) by dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain. |
Tasks | Anomaly Detection |
Published | 2016-09-02 |
URL | http://arxiv.org/abs/1609.00464v2 |
PDF | http://arxiv.org/pdf/1609.00464v2.pdf |
PWC | https://paperswithcode.com/paper/the-semantic-knowledge-graph-a-compact-auto |
Repo | https://github.com/shalder/knowledge_graph |
Framework | none |
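To make the node/edge description in the abstract above concrete, the sketch below builds an inverted and an uninverted index over a tiny corpus and materializes an edge as the intersection of two postings lists. The corpus, the function names, and the lift-style `relatedness` score are assumptions for illustration; the paper's own scoring function is not reproduced here.

```python
from collections import defaultdict


def build_indexes(docs):
    """Return (inverted, uninverted): term -> doc ids, and doc id -> terms."""
    inverted = defaultdict(set)
    uninverted = {}
    for doc_id, text in enumerate(docs):
        terms = set(text.lower().split())
        uninverted[doc_id] = terms
        for term in terms:
            inverted[term].add(doc_id)
    return dict(inverted), uninverted


def edge(inverted, a, b):
    """Edge between two term nodes: the documents both postings lists share."""
    return inverted.get(a, set()) & inverted.get(b, set())


def relatedness(inverted, a, b, total_docs):
    """Simple stand-in score: P(b | a) / P(b), i.e. lift. Zero if no shared docs."""
    docs_a = inverted.get(a, set())
    docs_b = inverted.get(b, set())
    if not docs_a or not docs_b:
        return 0.0
    shared = len(docs_a & docs_b)
    return (shared / len(docs_a)) / (len(docs_b) / total_docs)


docs = [
    "java developer spring",
    "java developer hibernate",
    "nurse hospital care",
    "java barista coffee",
]
inverted, uninverted = build_indexes(docs)
print(edge(inverted, "java", "developer"))
print(relatedness(inverted, "java", "developer", len(docs)))
```

No edge is stored anywhere: each one is computed on demand from the postings lists, which is what keeps the representation compact.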
Combining Data-driven and Model-driven Methods for Robust Facial Landmark Detection
Title | Combining Data-driven and Model-driven Methods for Robust Facial Landmark Detection |
Authors | Hongwen Zhang, Qi Li, Zhenan Sun, Yunfan Liu |
Abstract | Facial landmark detection is an important yet challenging task for real-world computer vision applications. This paper proposes an effective and robust approach for facial landmark detection by combining data- and model-driven methods. First, a Fully Convolutional Network (FCN) is trained to compute response maps of all facial landmark points. Such a data-driven method can make full use of holistic information in a facial image for global estimation of facial landmarks. After that, the maximum points in the response maps are fitted with a pre-trained Point Distribution Model (PDM) to generate the initial facial shape. This model-driven method is able to correct the inaccurate locations of outliers by considering the shape prior information. Finally, a weighted version of Regularized Landmark Mean-Shift (RLMS) is employed to fine-tune the facial shape iteratively. This Estimation-Correction-Tuning process combines the global robustness of the data-driven method (FCN), the outlier correction capability of the model-driven method (PDM), and the non-parametric optimization of RLMS. Results of extensive experiments demonstrate that our approach achieves state-of-the-art performance on challenging datasets including 300W, AFLW, AFW and COFW. The proposed method is able to produce satisfactory detection results on face images with exaggerated expressions, large head poses, and partial occlusions. |
Tasks | Facial Landmark Detection |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10152v2 |
PDF | http://arxiv.org/pdf/1611.10152v2.pdf |
PWC | https://paperswithcode.com/paper/combining-data-driven-and-model-driven |
Repo | https://github.com/HongwenZhang/ECT-FaceAlignment |
Framework | none |
Specialized Support Vector Machines for Open-set Recognition
Title | Specialized Support Vector Machines for Open-set Recognition |
Authors | Pedro Ribeiro Mendes Júnior, Terrance E. Boult, Jacques Wainer, Anderson Rocha |
Abstract | Often, when dealing with real-world recognition problems, we do not need, and often cannot have, knowledge of the entire set of possible classes that might appear during operational testing. In such cases, we need to think of robust classification methods able to deal with the “unknown” and properly reject samples belonging to classes never seen during training. Notwithstanding, almost all existing classifiers to date were developed for the closed-set scenario, i.e., the classification setup in which it is assumed that all test samples belong to one of the classes with which the classifier was trained. In the open-set scenario, however, a test sample can belong to none of the known classes and the classifier must properly reject it by classifying it as unknown. In this work, we extend the well-known Support Vector Machines (SVM) classifier and introduce the Specialized Support Vector Machines (SSVM), which is suitable for recognition in open-set setups. SSVM balances the empirical risk and the risk of the unknown and ensures that the region of the feature space in which a test sample would be classified as known (one of the known classes) is always bounded, ensuring a finite risk of the unknown. In this work, we also highlight the properties of the SVM classifier related to the open-set scenario, and provide necessary and sufficient conditions for an RBF SVM to have bounded open-space risk. |
Tasks | Open Set Learning |
Published | 2016-06-13 |
URL | https://arxiv.org/abs/1606.03802v9 |
PDF | https://arxiv.org/pdf/1606.03802v9.pdf |
PWC | https://paperswithcode.com/paper/specialized-support-vector-machines-for-open |
Repo | https://github.com/pedrormjunior/ssvm-results |
Framework | none |
Model-based Deep Hand Pose Estimation
Title | Model-based Deep Hand Pose Estimation |
Authors | Xingyi Zhou, Qingfu Wan, Wei Zhang, Xiangyang Xue, Yichen Wei |
Abstract | Previous learning-based hand pose estimation methods do not fully exploit the prior information in hand model geometry. Instead, they usually rely on a separate model fitting step to generate valid hand poses. Such post-processing is inconvenient and sub-optimal. In this work, we propose a model-based deep learning approach that adopts a forward-kinematics-based layer to ensure the geometric validity of estimated poses. For the first time, we show that embedding such a non-linear generative process in deep learning is feasible for hand pose estimation. Our approach is verified on challenging public datasets and achieves state-of-the-art performance. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2016-06-22 |
URL | http://arxiv.org/abs/1606.06854v1 |
PDF | http://arxiv.org/pdf/1606.06854v1.pdf |
PWC | https://paperswithcode.com/paper/model-based-deep-hand-pose-estimation |
Repo | https://github.com/tenstep/DeepModel |
Framework | none |
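The abstract above relies on a forward kinematics layer, which maps joint angles to joint positions so that every output is a geometrically valid pose. The sketch below shows the idea on a planar (2-D) chain of bones. The 2-D setting, the bone lengths, and the angles are simplifying assumptions; the paper works with a 3-D hand model.

```python
import math


def forward_kinematics(bone_lengths, joint_angles):
    """Positions of each joint in a planar chain rooted at the origin.

    Each angle is relative to the previous bone, so bone lengths are
    preserved by construction whatever angles are supplied.
    """
    x, y, theta = 0.0, 0.0, 0.0
    positions = []
    for length, angle in zip(bone_lengths, joint_angles):
        theta += angle                     # accumulate rotation along the chain
        x += length * math.cos(theta)
        y += length * math.sin(theta)
        positions.append((x, y))
    return positions


# Hypothetical two-bone finger: first bone along x, second bent by 90 degrees.
pose = forward_kinematics([1.0, 1.0], [0.0, math.pi / 2])
print(pose)
```

Because the mapping is built from sines and cosines, it is differentiable in the angles, which is what allows such a step to sit inside a network trained by gradient descent.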
Smart Reply: Automated Response Suggestion for Email
Title | Smart Reply: Automated Response Suggestion for Email |
Authors | Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter Young, Vivek Ramavajjala |
Abstract | In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. It generates semantically diverse suggestions that can be used as complete email responses with just one tap on mobile. The system is currently used in Inbox by Gmail and is responsible for assisting with 10% of all mobile responses. It is designed to work at very high throughput and process hundreds of millions of messages daily. The system exploits state-of-the-art, large-scale deep learning. We describe the architecture of the system as well as the challenges that we faced while building it, like response diversity and scalability. We also introduce a new method for semantic clustering of user-generated content that requires only a modest amount of explicitly labeled data. |
Tasks | |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04870v1 |
PDF | http://arxiv.org/pdf/1606.04870v1.pdf |
PWC | https://paperswithcode.com/paper/smart-reply-automated-response-suggestion-for |
Repo | https://github.com/yatindma/Automated-Response-Suggestion-for-Email |
Framework | none |
Adversarially Learned Inference
Title | Adversarially Learned Inference |
Authors | Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville |
Abstract | We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process. The generation network maps samples from stochastic latent variables to the data space while the inference network maps training examples in data space to the space of latent variables. An adversarial game is cast between these two networks and a discriminative network is trained to distinguish between joint latent/data-space samples from the generative network and joint samples from the inference network. We illustrate the ability of the model to learn mutually coherent inference and generation networks through inspection of model samples and reconstructions, and confirm the usefulness of the learned representations by obtaining performance competitive with the state of the art on the semi-supervised SVHN and CIFAR10 tasks. |
Tasks | Image Generation, Image-to-Image Translation |
Published | 2016-06-02 |
URL | http://arxiv.org/abs/1606.00704v3 |
PDF | http://arxiv.org/pdf/1606.00704v3.pdf |
PWC | https://paperswithcode.com/paper/adversarially-learned-inference |
Repo | https://github.com/lkhphuc/Anomaly-BiGAN |
Framework | pytorch |
ASAGA: Asynchronous Parallel SAGA
Title | ASAGA: Asynchronous Parallel SAGA |
Authors | Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien |
Abstract | We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates. Through a novel perspective, we revisit and clarify a subtle but important technical issue present in a large fraction of the recent convergence rate proofs for asynchronous parallel optimization algorithms, and propose a simplification of the recently introduced “perturbed iterate” framework that resolves it. We thereby prove that ASAGA can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions. We present results of an implementation on a 40-core architecture illustrating the practical speedup as well as the hardware overhead. |
Tasks | |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04809v3 |
PDF | http://arxiv.org/pdf/1606.04809v3.pdf |
PWC | https://paperswithcode.com/paper/asaga-asynchronous-parallel-saga |
Repo | https://github.com/RemiLeblond/ASAGA |
Framework | none |
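For readers unfamiliar with SAGA, the incremental gradient method that ASAGA parallelizes, here is a minimal sequential sketch on a toy one-dimensional least-squares problem. It does not model the asynchronous, multi-core execution that is the paper's contribution; the objective, the step size of 1/3, and the iteration count are assumptions chosen for the example.

```python
import random


def saga(targets, step, iters, seed=0):
    """Sequential SAGA on f(x) = (1/n) * sum_i 0.5 * (x - targets[i]) ** 2.

    The minimiser of this objective is the mean of `targets`.
    """
    rng = random.Random(seed)
    n = len(targets)
    x = 0.0
    memory = [x - a for a in targets]  # last gradient seen for each f_i
    avg = sum(memory) / n              # running average of the memory table
    for _ in range(iters):
        i = rng.randrange(n)
        grad = x - targets[i]          # gradient of f_i at the current x
        # SAGA step: fresh gradient, minus its stale copy, plus table average.
        x -= step * (grad - memory[i] + avg)
        avg += (grad - memory[i]) / n  # keep the average in sync in O(1)
        memory[i] = grad
    return x


targets = [1.0, 2.0, 3.0, 6.0]
x_hat = saga(targets, step=1.0 / 3.0, iters=2000)
print(x_hat, sum(targets) / len(targets))
```

The memory table is what distinguishes SAGA from plain stochastic gradient descent: the correction term shrinks the variance of each step, so a constant step size still converges.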
Machine Comprehension Using Match-LSTM and Answer Pointer
Title | Machine Comprehension Using Match-LSTM and Answer Pointer |
Authors | Shuohang Wang, Jing Jiang |
Abstract | Machine comprehension of text is an important problem in natural language processing. A recently released dataset, the Stanford Question Answering Dataset (SQuAD), offers a large number of real questions and their answers created by humans through crowdsourcing. SQuAD provides a challenging testbed for evaluating machine comprehension algorithms, partly because compared with previous datasets, in SQuAD the answers do not come from a small set of candidate answers and they have variable lengths. We propose an end-to-end neural architecture for the task. The architecture is based on match-LSTM, a model we proposed previously for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015) to constrain the output tokens to be from the input sequences. We propose two ways of using Pointer Net for our task. Our experiments show that both of our models substantially outperform the best results obtained by Rajpurkar et al. (2016) using logistic regression and manually crafted features. |
Tasks | Natural Language Inference, Question Answering, Reading Comprehension |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.07905v2 |
PDF | http://arxiv.org/pdf/1608.07905v2.pdf |
PWC | https://paperswithcode.com/paper/machine-comprehension-using-match-lstm-and |
Repo | https://github.com/shuohangwang/SeqMatchSeq |
Framework | torch |
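Because the answers in SQuAD are spans of the input passage, one natural way to use pointer-style outputs is to pick a start and an end position. The sketch below selects the span that maximizes the product of a start probability and an end probability, subject to the start not coming after the end. Treating the two distributions as independent, and the probability values themselves, are assumptions for illustration and not necessarily how the paper's models decode.

```python
def best_span(start_probs, end_probs):
    """Return (start, end) with start <= end maximising start_p * end_p.

    Runs in one pass by remembering the best start seen so far.
    """
    best = (0, 0)
    best_score = -1.0
    best_start = 0
    for end in range(len(end_probs)):
        if start_probs[end] > start_probs[best_start]:
            best_start = end
        score = start_probs[best_start] * end_probs[end]
        if score > best_score:
            best_score = score
            best = (best_start, end)
    return best


# Hypothetical pointer distributions over a 4-token passage.
start_probs = [0.1, 0.6, 0.2, 0.1]
end_probs = [0.5, 0.1, 0.1, 0.3]
span = best_span(start_probs, end_probs)
print(span)
```

Taking the two argmaxes separately would give start 1 and end 0 here, which is not a valid span; the constraint is what makes the decoded answer well formed.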
Estimating Mixture Models via Mixtures of Polynomials
Title | Estimating Mixture Models via Mixtures of Polynomials |
Authors | Sida I. Wang, Arun Tejasvi Chaganty, Percy Liang |
Abstract | Mixture modeling is a general technique for making any simple model more expressive through weighted combination. This generality and simplicity in part explain the success of the Expectation Maximization (EM) algorithm, in which updates are easy to derive for a wide class of mixture models. However, the likelihood of a mixture model is non-convex, so EM has no known global convergence guarantees. Recently, method of moments approaches offer global guarantees for some mixture models, but they do not extend easily to the range of mixture models that exist. In this work, we present Polymom, a unifying framework based on method of moments in which estimation procedures are easily derivable, just as in EM. Polymom is applicable when the moments of a single mixture component are polynomials of the parameters. Our key observation is that the moments of the mixture model are a mixture of these polynomials, which allows us to cast estimation as a Generalized Moment Problem. We solve its relaxations using semidefinite optimization, and then extract parameters using ideas from computer algebra. This framework allows us to draw insights and apply tools from convex optimization, computer algebra and the theory of moments to study problems in statistical estimation. |
Tasks | |
Published | 2016-03-28 |
URL | http://arxiv.org/abs/1603.08482v1 |
PDF | http://arxiv.org/pdf/1603.08482v1.pdf |
PWC | https://paperswithcode.com/paper/estimating-mixture-models-via-mixtures-of |
Repo | https://github.com/sidaw/mompy |
Framework | none |
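The key observation in the abstract above is that when a component's moments are polynomials in its parameters, the mixture's moments are a weighted mixture of those polynomials. The sketch below checks this forward direction for a mixture of one-dimensional Gaussians. The choice of Gaussians and the parameter values are assumptions; the estimation step (the Generalized Moment Problem and its semidefinite relaxation) is not shown.

```python
def gaussian_moments(mu, var):
    """First three raw moments of N(mu, var), each a polynomial in (mu, var)."""
    return [
        mu,                        # E[x]
        mu ** 2 + var,             # E[x^2]
        mu ** 3 + 3.0 * mu * var,  # E[x^3]
    ]


def mixture_moments(weights, params):
    """Moments of the mixture: the same polynomials, averaged with the weights."""
    per_component = [gaussian_moments(mu, var) for mu, var in params]
    return [
        sum(w * moments[k] for w, moments in zip(weights, per_component))
        for k in range(3)
    ]


# Hypothetical symmetric two-component mixture.
weights = [0.5, 0.5]
params = [(-1.0, 1.0), (1.0, 1.0)]
moments = mixture_moments(weights, params)
print(moments)
```

Estimation runs this relationship in reverse: given empirical moments on the left-hand side, recover the weights and parameters that produce them.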
Fashion Landmark Detection in the Wild
Title | Fashion Landmark Detection in the Wild |
Authors | Ziwei Liu, Sijie Yan, Ping Luo, Xiaogang Wang, Xiaoou Tang |
Abstract | Visual fashion analysis has attracted much attention in recent years. Previous work represented clothing regions by either bounding boxes or human joints. This work presents fashion landmark detection, or fashion alignment, which is to predict the positions of functional key points defined on fashion items, such as the corners of the neckline, hemline, and cuff. To encourage future studies, we introduce a fashion landmark dataset with over 120K images, where each image is labeled with eight landmarks. With this dataset, we study fashion alignment by cascading multiple convolutional neural networks in three stages. These stages gradually improve the accuracy of landmark predictions. Extensive experiments demonstrate the effectiveness of the proposed method, as well as its generalization ability to pose estimation. Fashion landmarks are also compared to clothing bounding boxes and human joints in two applications, fashion attribute prediction and clothes retrieval, showing that fashion landmarks are a more discriminative representation for understanding fashion images. |
Tasks | Pose Estimation |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03049v1 |
PDF | http://arxiv.org/pdf/1608.03049v1.pdf |
PWC | https://paperswithcode.com/paper/fashion-landmark-detection-in-the-wild |
Repo | https://github.com/shumming/GLE_FLD |
Framework | pytorch |
MITRE at SemEval-2016 Task 6: Transfer Learning for Stance Detection
Title | MITRE at SemEval-2016 Task 6: Transfer Learning for Stance Detection |
Authors | Guido Zarrella, Amy Marsh |
Abstract | We describe MITRE’s submission to the SemEval-2016 Task 6, Detecting Stance in Tweets. This effort achieved the top score in Task A on supervised stance detection, producing an average F1 score of 67.8 when assessing whether a tweet author was in favor of or against a topic. We employed a recurrent neural network initialized with features learned via distant supervision on two large unlabeled datasets. We trained embeddings of words and phrases with the word2vec skip-gram method, then used those features to learn sentence representations via a hashtag prediction auxiliary task. These sentence vectors were then fine-tuned for stance detection on several hundred labeled examples. The result was a high-performing system that used transfer learning to maximize the value of the available training data. |
Tasks | Stance Detection, Transfer Learning |
Published | 2016-06-13 |
URL | http://arxiv.org/abs/1606.03784v1 |
PDF | http://arxiv.org/pdf/1606.03784v1.pdf |
PWC | https://paperswithcode.com/paper/mitre-at-semeval-2016-task-6-transfer |
Repo | https://github.com/DamiFur/Twitter-semeval2016 |
Framework | none |
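The abstract above reports an average F1 score over the "in favor" and "against" stances. A minimal sketch of such a metric follows, assuming the average is the plain mean of the per-label F1 scores for `FAVOR` and `AGAINST`; the label strings and the example predictions are hypothetical.

```python
def f1_for_label(gold, pred, label):
    """F1 for one label, computed from true/false positives and false negatives."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def stance_score(gold, pred):
    """Mean of the F1 scores for the two stance labels (assumed definition)."""
    favor = f1_for_label(gold, pred, "FAVOR")
    against = f1_for_label(gold, pred, "AGAINST")
    return (favor + against) / 2.0


# Hypothetical gold labels and system predictions for four tweets.
gold = ["FAVOR", "AGAINST", "NONE", "AGAINST"]
pred = ["FAVOR", "AGAINST", "AGAINST", "NONE"]
score = stance_score(gold, pred)
print(score)
```

Under this definition a tweet with no stance still affects the score indirectly: predicting a stance for it counts as a false positive for that label.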
Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes
Title | Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes |
Authors | Tobias Pohlen, Alexander Hermans, Markus Mathias, Bastian Leibe |
Abstract | Semantic image segmentation is an essential component of modern autonomous driving systems, as an accurate understanding of the surrounding scene is crucial to navigation and action planning. Current state-of-the-art approaches in semantic image segmentation rely on pre-trained networks that were initially developed for classifying images as a whole. While these networks exhibit outstanding recognition performance (i.e., what is visible?), they lack localization accuracy (i.e., where precisely is something located?). Therefore, additional processing steps have to be performed in order to obtain pixel-accurate segmentation masks at the full image resolution. To alleviate this problem we propose a novel ResNet-like architecture that exhibits strong localization and recognition performance. We combine multi-scale context with pixel-level accuracy by using two processing streams within our network: One stream carries information at the full image resolution, enabling precise adherence to segment boundaries. The other stream undergoes a sequence of pooling operations to obtain robust features for recognition. The two streams are coupled at the full image resolution using residuals. Without additional processing steps and without pre-training, our approach achieves an intersection-over-union score of 71.8% on the Cityscapes dataset. |
Tasks | Autonomous Driving, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08323v2 |
PDF | http://arxiv.org/pdf/1611.08323v2.pdf |
PWC | https://paperswithcode.com/paper/full-resolution-residual-networks-for |
Repo | https://github.com/robin-chan/decision-rules |
Framework | tf |
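The abstract above reports its result as an intersection-over-union (IoU) score. The sketch below computes per-class IoU and its mean over flat lists of pixel labels. The tiny label lists are made up, and averaging only over classes that occur in either list is an assumption about how absent classes are handled.

```python
def class_iou(gold, pred, cls):
    """IoU for one class: pixels labelled cls in both / pixels labelled cls in either."""
    pairs = list(zip(gold, pred))
    inter = sum(1 for g, p in pairs if g == cls and p == cls)
    union = sum(1 for g, p in pairs if g == cls or p == cls)
    return inter / union if union else None


def mean_iou(gold, pred, classes):
    """Mean IoU over the classes that appear in the gold or predicted labels."""
    values = [class_iou(gold, pred, c) for c in classes]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


# Hypothetical 4-pixel image with two classes.
gold = [0, 0, 1, 1]
pred = [0, 1, 1, 1]
miou = mean_iou(gold, pred, classes=[0, 1])
print(class_iou(gold, pred, 0), class_iou(gold, pred, 1), miou)
```

Unlike pixel accuracy, IoU penalizes both missed pixels and spurious ones for each class, so small classes are not drowned out by large ones.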