Paper Group NANR 3
Context Contrasted Feature and Gated Multi-Scale Aggregation for Scene Segmentation. Vocabulary-Informed Visual Feature Augmentation for One-Shot Learning. Debugging Sequence-to-Sequence Models with Seq2Seq-Vis. Product Quantization Network for Fast Image Retrieval. Sparse Covariance Modeling in High Dimensions with Gaussian Processes. Assessing the Impact of Incremental Error Detection and Correction. A Case Study on the Italian Universal Dependency Treebank. On the Use of Speaker-Aware Language Model Adaptation Techniques for Meeting Speech Recognition [In Chinese]. Context-Sensitive Generation of Open-Domain Conversational Responses. Towards Efficient Machine Translation Evaluation by Modelling Annotators. Evolutionary Expectation Maximization for Generative Models with Binary Latents. Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima. KDGAN: Knowledge Distillation with Generative Adversarial Networks. Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces. An Interpretable Neural Network with Topical Information for Relevant Emotion Ranking. CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning.
Context Contrasted Feature and Gated Multi-Scale Aggregation for Scene Segmentation
Title | Context Contrasted Feature and Gated Multi-Scale Aggregation for Scene Segmentation |
Authors | Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, Gang Wang |
Abstract | Scene segmentation is a challenging task, as it needs to label every pixel in the image. It is crucial to exploit discriminative context and aggregate multi-scale features to achieve better segmentation. In this paper, we first propose a novel context contrasted local feature that not only leverages the informative context but also spotlights the local information in contrast to the context. The proposed context contrasted local feature greatly improves the parsing performance, especially for inconspicuous objects and background stuff. Furthermore, we propose a gated-sum scheme to selectively aggregate multi-scale features for each spatial position. The gates in this scheme control the information flow of different scale features. Their values are generated from the testing image by the proposed network learnt from the training data, so that they are adaptive not only to the training data but also to the specific testing image. Without bells and whistles, the proposed approach consistently achieves state-of-the-art results on three popular scene segmentation datasets: Pascal Context, SUN-RGBD and COCO Stuff. |
Tasks | Scene Segmentation, Semantic Segmentation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Ding_Context_Contrasted_Feature_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Ding_Context_Contrasted_Feature_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/context-contrasted-feature-and-gated-multi |
Repo | |
Framework | |
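A minimal PyTorch sketch of the gated-sum aggregation described in the abstract above. The 1x1-convolution gate predictor and sigmoid gating are assumptions about the exact form, not the authors' architecture; what it illustrates is that the gate values are computed from the input feature maps themselves, so they adapt to the specific test image.

```python
import torch
import torch.nn as nn

class GatedSum(nn.Module):
    """Selectively aggregate multi-scale feature maps with per-position gates.

    Sketch only: one assumed 1x1 conv per scale predicts a gate map from
    that scale's features, and the gated maps are summed.
    """
    def __init__(self, channels, num_scales):
        super().__init__()
        self.gate_convs = nn.ModuleList(
            [nn.Conv2d(channels, 1, kernel_size=1) for _ in range(num_scales)]
        )

    def forward(self, features):
        # features: list of (N, C, H, W) maps, one per scale, same spatial size
        out = 0
        for feat, conv in zip(features, self.gate_convs):
            gate = torch.sigmoid(conv(feat))  # (N, 1, H, W), values in (0, 1)
            out = out + gate * feat           # gate controls this scale's information flow
        return out
```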
Vocabulary-Informed Visual Feature Augmentation for One-Shot Learning
Title | Vocabulary-Informed Visual Feature Augmentation for One-Shot Learning |
Authors | Jianqi Ma, Hangyu Lin, Yinda Zhang, Yanwei Fu, Xiangyang Xue |
Abstract | A natural solution for one-shot learning is to augment the training data to handle the data-deficiency problem. However, directly augmenting in the image domain may not necessarily generate training data that sufficiently explores the intra-class space for one-shot classification. Inspired by recent vocabulary-informed learning, we propose to generate synthetic training data with the guidance of the semantic word space. Essentially, we train an auto-encoder as a bridge that enables transformation between the image feature space and the semantic space. Besides directly augmenting image features, we transform the image features to the semantic space using the encoder and perform the data augmentation there. The decoder then synthesizes image features for the augmented instances from the semantic space. Experiments on three datasets show that our data augmentation method effectively improves the performance of one-shot classification. An extensive study shows that data augmented in the semantic space are complementary to those augmented in the image space, and thus boost the classification accuracy dramatically. Source code and dataset will be available. |
Tasks | Data Augmentation, One-Shot Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B1mAkPxCZ |
https://openreview.net/pdf?id=B1mAkPxCZ | |
PWC | https://paperswithcode.com/paper/vocabulary-informed-visual-feature |
Repo | |
Framework | |
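The encode-perturb-decode loop from the abstract above can be sketched as follows, assuming an already-trained autoencoder and an illustrative Gaussian perturbation in semantic space (the paper's actual augmentation operation may differ):

```python
import torch
import torch.nn as nn

class FeatureAugmenter(nn.Module):
    """Sketch of semantic-space augmentation. The layer sizes (2048-d image
    features, 300-d word vectors) and the noise model are assumptions."""
    def __init__(self, feat_dim=2048, sem_dim=300):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, sem_dim), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(sem_dim, feat_dim), nn.ReLU())

    def augment(self, feat, n_aug=5, noise_std=0.1):
        # Map the single shot's image feature into semantic space, perturb it
        # there, then decode each perturbed point back into an image feature.
        z = self.encoder(feat)                           # (1, sem_dim)
        noise = noise_std * torch.randn(n_aug, z.size(1))
        return self.decoder(z + noise)                   # (n_aug, feat_dim) synthetic features
```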
Debugging Sequence-to-Sequence Models with Seq2Seq-Vis
Title | Debugging Sequence-to-Sequence Models with Seq2Seq-Vis |
Authors | Hendrik Strobelt, Sebastian Gehrmann, Michael Behrisch, Adam Perer, Hanspeter Pfister, Alexander Rush |
Abstract | Neural attention-based sequence-to-sequence models (seq2seq) (Sutskever et al., 2014; Bahdanau et al., 2014) have proven to be accurate and robust for many sequence prediction tasks. They have become the standard approach for automatic translation of text, at the cost of increased model complexity and uncertainty. End-to-end trained neural models act as a black box, which makes it difficult to examine model decisions and attribute errors to a specific part of a model. The highly connected and high-dimensional internal representations pose a challenge for analysis and visualization tools. The development of methods to understand seq2seq predictions is crucial for systems in production settings, as mistakes involving language are often very apparent to human readers. For instance, a widely publicized incident resulted from a translation system mistakenly translating "good morning" into "attack them", leading to a wrongful arrest (Hern, 2017). |
Tasks | |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5451/ |
https://www.aclweb.org/anthology/W18-5451 | |
PWC | https://paperswithcode.com/paper/debugging-sequence-to-sequence-models-with |
Repo | |
Framework | |
Product Quantization Network for Fast Image Retrieval
Title | Product Quantization Network for Fast Image Retrieval |
Authors | Tan Yu, Junsong Yuan, Chen Fang, Hailin Jin |
Abstract | Product quantization has been widely used in fast image retrieval due to its effectiveness in coding high-dimensional visual features. By extending hard assignment to soft assignment, we make it feasible to incorporate product quantization as a layer of a convolutional neural network and propose our product quantization network. Meanwhile, we come up with a novel asymmetric triplet loss, which effectively boosts the retrieval accuracy of the proposed product quantization network based on asymmetric similarity. Through the proposed product quantization network, we can obtain a discriminative and compact image representation in an end-to-end manner, which further enables fast and accurate image retrieval. Comprehensive experiments conducted on public benchmark datasets demonstrate the state-of-the-art performance of the proposed product quantization network. |
Tasks | Image Retrieval, Quantization |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Tan_Yu_Product_Quantization_Network_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Tan_Yu_Product_Quantization_Network_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/product-quantization-network-for-fast-image |
Repo | |
Framework | |
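A sketch of the soft-assignment idea that makes product quantization differentiable, per the abstract above; the softmax temperature `alpha` and the exact distance scaling are assumptions rather than the paper's formulation:

```python
import torch

def soft_product_quantize(x, codebooks, alpha=10.0):
    """Replace hard nearest-codeword assignment with a softmax over scaled
    negative distances, so the quantizer can sit inside a CNN as a layer.

    x:         (N, M, D)  N vectors split into M subvectors of dimension D
    codebooks: (M, K, D)  K codewords per subspace
    """
    # Squared distances between each subvector and each codeword: (N, M, K)
    d2 = ((x.unsqueeze(2) - codebooks.unsqueeze(0)) ** 2).sum(-1)
    w = torch.softmax(-alpha * d2, dim=-1)          # soft assignment weights
    # Soft-quantized subvectors: convex combination of codewords, (N, M, D)
    return torch.einsum('nmk,mkd->nmd', w, codebooks)
```

As `alpha` grows, the softmax sharpens toward the hard assignment of classical product quantization.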
Sparse Covariance Modeling in High Dimensions with Gaussian Processes
Title | Sparse Covariance Modeling in High Dimensions with Gaussian Processes |
Authors | Rui Li, Kishan Kc, Feng Cui, Justin Domke, Anne Haake |
Abstract | This paper studies statistical relationships among components of high-dimensional observations varying across non-random covariates. We propose to model the observation elements' changing covariances as sparse multivariate stochastic processes. In particular, our novel covariance modeling method reduces dimensionality by relating the observation vectors to a lower-dimensional subspace. To characterize the changing correlations, we jointly model the latent factors and the factor loadings as collections of basis functions that vary with the covariates as Gaussian processes. Automatic relevance determination (ARD) encodes basis sparsity through their coefficients to account for the inherent redundancy. Experiments conducted across domains show performance superior to state-of-the-art methods. |
Tasks | Gaussian Processes |
Published | 2018-01-01 |
URL | http://papers.nips.cc/paper/7354-sparse-covariance-modeling-in-high-dimensions-with-gaussian-processes |
http://papers.nips.cc/paper/7354-sparse-covariance-modeling-in-high-dimensions-with-gaussian-processes.pdf | |
PWC | https://paperswithcode.com/paper/sparse-covariance-modeling-in-high-dimensions |
Repo | |
Framework | |
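Read as a worked equation, under notation assumed here (the abstract does not give the paper's symbols), the changing-covariance construction is roughly:

```latex
% Assumed notation: p-dimensional observation y(t) at covariate value t,
% latent dimension q << p, loadings Lambda(t) and factors f(t) built from GPs.
y(t) = \Lambda(t)\, f(t) + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^{2} I_p)
\quad\Rightarrow\quad
\operatorname{Cov}\!\left[ y(t) \right] = \Lambda(t)\,\Lambda(t)^{\top} + \sigma^{2} I_p,
% taking f(t) to have identity covariance; each loading is expanded in basis
% functions whose coefficients carry ARD priors that prune redundant bases:
\qquad \Lambda_{ij}(t) = \sum_{b=1}^{B} c_{ijb}\, \phi_b(t),
\qquad c_{ijb} \sim \mathcal{N}\!\left(0, \tau_b^{-1}\right).
```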
Assessing the Impact of Incremental Error Detection and Correction. A Case Study on the Italian Universal Dependency Treebank
Title | Assessing the Impact of Incremental Error Detection and Correction. A Case Study on the Italian Universal Dependency Treebank |
Authors | Chiara Alzetta, Felice Dell'Orletta, Simonetta Montemagni, Maria Simi, Giulia Venturi |
Abstract | Detection and correction of errors and inconsistencies in "gold treebanks" are becoming more and more central topics of corpus annotation. The paper illustrates a new incremental method for enhancing treebanks, with particular emphasis on the extension of error patterns across different textual genres and registers. The impact and role of corrections have been assessed in a dependency parsing experiment carried out with four different parsers, whose results are promising. For both evaluation datasets, the performance of the parsers increases in terms of the standard LAS and UAS measures and of a more focused measure taking into account only relations involved in error patterns, as well as at the level of individual dependencies. |
Tasks | Dependency Parsing |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6001/ |
https://www.aclweb.org/anthology/W18-6001 | |
PWC | https://paperswithcode.com/paper/assessing-the-impact-of-incremental-error |
Repo | |
Framework | |
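For reference, the LAS and UAS measures used in the evaluation above are plain attachment accuracies; a minimal sketch, with an assumed token representation of (head index, dependency label) pairs:

```python
def attachment_scores(gold, pred):
    """UAS: fraction of tokens whose predicted head matches gold.
    LAS: fraction whose head *and* dependency label both match gold."""
    assert len(gold) == len(pred)
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    las_hits = sum(g == p for g, p in zip(gold, pred))  # tuple equality: head and label
    n = len(gold)
    return uas_hits / n, las_hits / n

# Example: three tokens; the third has the right head but the wrong label,
# so UAS = 1.0 while LAS = 2/3.
gold = [(2, 'nsubj'), (0, 'root'), (2, 'obj')]
pred = [(2, 'nsubj'), (0, 'root'), (2, 'nmod')]
print(attachment_scores(gold, pred))  # (1.0, 0.666...)
```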
會議語音辨識使用語者資訊之語言模型調適技術 (On the Use of Speaker-Aware Language Model Adaptation Techniques for Meeting Speech Recognition) [In Chinese]
Title | 會議語音辨識使用語者資訊之語言模型調適技術 (On the Use of Speaker-Aware Language Model Adaptation Techniques for Meeting Speech Recognition) [In Chinese] |
Authors | Ying-wen Chen, Tien-hong Lo, Hsiu-jui Chang, Wei-Cheng Chao, Berlin Chen |
Abstract | |
Tasks | Language Modelling, Speech Recognition |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1004/ |
https://www.aclweb.org/anthology/O18-1004 | |
PWC | https://paperswithcode.com/paper/eeae3e34-ea12c-eaee3e-a1eae-aeaee-on-the-use |
Repo | |
Framework | |
Context-Sensitive Generation of Open-Domain Conversational Responses
Title | Context-Sensitive Generation of Open-Domain Conversational Responses |
Authors | Weinan Zhang, Yiming Cui, Yifa Wang, Qingfu Zhu, Lingzhi Li, Lianqiang Zhou, Ting Liu |
Abstract | Despite the success of existing work on single-turn conversation generation, human conversation is actually a context-sensitive process once coherence is taken into consideration. Inspired by existing studies, this paper proposes static and dynamic attention-based approaches for context-sensitive generation of open-domain conversational responses. Experimental results on two public datasets show that the proposed static attention-based approach outperforms all baselines on both automatic and human evaluation. |
Tasks | Information Retrieval, Machine Translation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1206/ |
https://www.aclweb.org/anthology/C18-1206 | |
PWC | https://paperswithcode.com/paper/context-sensitive-generation-of-open-domain |
Repo | |
Framework | |
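A rough sketch of what a *static* attention over context utterances could look like, in the spirit of the abstract above; the abstract does not spell out the authors' static/dynamic formulations, so the dot-product scoring and the choice of query are assumptions:

```python
import torch
import torch.nn.functional as F

def static_context_attention(context_states, query):
    """One set of attention weights over the context representations,
    computed once and reused for every decoding step (hence 'static';
    a dynamic variant would recompute weights at each step).

    context_states: (T, H) encoder states of previous utterances
    query:          (H,)   e.g. the final state of the current utterance
    """
    scores = context_states @ query        # (T,) dot-product relevance
    weights = F.softmax(scores, dim=0)     # fixed for the whole response
    return weights @ context_states        # (H,) static context vector
```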
Towards Efficient Machine Translation Evaluation by Modelling Annotators
Title | Towards Efficient Machine Translation Evaluation by Modelling Annotators |
Authors | Nitika Mathur, Timothy Baldwin, Trevor Cohn |
Abstract | Accurate evaluation of translation has long been a difficult yet important problem. Current evaluations use direct assessment (DA), based on crowd-sourced judgements from a large pool of workers, along with quality-control checks and a robust method for combining redundant judgements. In this paper we show that the quality-control mechanism is overly conservative, which increases the time and expense of the evaluation. We propose a model that does not rely on a pre-processing step to filter workers and that takes into account varying annotator reliabilities. Our model effectively weights each worker's scores based on the inferred precision of the worker, and is much more reliable than the mean of either the raw scores or the standardised scores. We also show that DA does not deliver on the promise of longitudinal evaluation, and propose redesigning the structure of the annotation tasks to address this problem. |
Tasks | Machine Translation |
Published | 2018-12-01 |
URL | https://www.aclweb.org/anthology/U18-1010/ |
https://www.aclweb.org/anthology/U18-1010 | |
PWC | https://paperswithcode.com/paper/towards-efficient-machine-translation |
Repo | |
Framework | |
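The core weighting idea, precision-weighted averaging of worker scores, can be sketched as follows. Here the per-worker variances are supplied directly rather than inferred by a model, which is an illustrative simplification:

```python
import numpy as np

def precision_weighted_mean(scores, worker_ids, worker_var):
    """Weight each score by its worker's precision (inverse variance),
    so unreliable workers are down-weighted instead of filtered out."""
    w = np.array([1.0 / worker_var[i] for i in worker_ids])  # precision weights
    s = np.asarray(scores, dtype=float)
    return float((w * s).sum() / w.sum())

# A careless worker (high variance) barely moves the aggregate:
scores, workers = [80, 40, 75], ['a', 'b', 'c']
worker_var = {'a': 1.0, 'b': 25.0, 'c': 2.0}
print(precision_weighted_mean(scores, workers, worker_var))  # ~77.3, vs. raw mean 65.0
```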
Evolutionary Expectation Maximization for Generative Models with Binary Latents
Title | Evolutionary Expectation Maximization for Generative Models with Binary Latents |
Authors | Enrico Guiraud, Jakob Drefs, Joerg Luecke |
Abstract | We establish a theoretical link between evolutionary algorithms and variational parameter optimization of probabilistic generative models with binary hidden variables. While the novel approach is independent of the actual generative model, here we use two such models to investigate its applicability and scalability: a noisy-OR Bayes Net (as a standard example of binary data) and Binary Sparse Coding (as a model for continuous data). Learning of probabilistic generative models is first formulated as approximate maximum likelihood optimization using variational expectation maximization (EM). We choose truncated posteriors as variational distributions in which discrete latent states serve as variational parameters. In the variational E-step, the latent states are then optimized according to a tractable free-energy objective. Given a data point, we can show that evolutionary algorithms can be used for the variational optimization loop by (A) considering the bit-vectors of the latent states as genomes of individuals, and by (B) defining the fitness of the individuals as the (log) joint probabilities given by the used generative model. As a proof of concept, we apply the novel evolutionary EM approach to the optimization of the parameters of noisy-OR Bayes nets and binary sparse coding on artificial and real data (natural image patches). Using point mutations and single-point cross-over for the evolutionary algorithm, we find that scalable variational EM algorithms are obtained which efficiently improve the data likelihood. In general we believe that, with the link established here, standard as well as recent results in the field of evolutionary optimization can be leveraged to address the difficult problem of parameter optimization in generative models. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SyjjD1WRb |
https://openreview.net/pdf?id=SyjjD1WRb | |
PWC | https://paperswithcode.com/paper/evolutionary-expectation-maximization-for |
Repo | |
Framework | |
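A compact sketch of the evolutionary E-step described above: bit-vector latent states as genomes, the log joint probability as fitness, and point mutation plus single-point crossover as variation operators. The selection scheme (keep the fittest P of parents and children) is an assumption:

```python
import numpy as np

def evolutionary_e_step(log_joint, population, n_gens=10, mut_rate=0.05, rng=None):
    """Evolve the truncated set of binary latent states for one data point.

    log_joint:  callable scoring a 0/1 bit-vector, e.g. log p(y, s | theta)
    population: (P, H) array of 0/1 ints, the current truncated posterior support
    """
    rng = rng or np.random.default_rng(0)
    pop = population.copy()
    P, H = pop.shape
    for _ in range(n_gens):
        # Single-point crossover between randomly paired parents
        pa, pb = pop[rng.integers(P, size=P)], pop[rng.integers(P, size=P)]
        cut = rng.integers(1, H, size=P)
        children = np.where(np.arange(H) < cut[:, None], pa, pb)
        # Point mutations: flip each bit with small probability
        flips = rng.random(children.shape) < mut_rate
        children = np.where(flips, 1 - children, children)
        # Selection: keep the P fittest of parents + children
        cand = np.vstack([pop, children])
        fitness = np.array([log_joint(s) for s in cand])
        pop = cand[np.argsort(fitness)[-P:]]
    return pop
```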
Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima
Title | Third-order Smoothness Helps: Faster Stochastic Optimization Algorithms for Finding Local Minima |
Authors | Yaodong Yu, Pan Xu, Quanquan Gu |
Abstract | We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm only needs $\tilde{O}(\epsilon^{-10/3})$ stochastic gradient evaluations to converge to an approximate local minimum $\mathbf{x}$, which satisfies $\|\nabla f(\mathbf{x})\|_2\leq\epsilon$ and $\lambda_{\min}(\nabla^2 f(\mathbf{x}))\geq -\sqrt{\epsilon}$ in unconstrained stochastic optimization, where $\tilde{O}(\cdot)$ hides polylogarithmic terms and constants. This improves upon the $\tilde{O}(\epsilon^{-7/2})$ gradient complexity achieved by the state-of-the-art stochastic local-minima-finding algorithms by a factor of $\tilde{O}(\epsilon^{-1/6})$. Experiments on two nonconvex optimization problems demonstrate the effectiveness of our algorithm and corroborate our theory. |
Tasks | Stochastic Optimization |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7704-third-order-smoothness-helps-faster-stochastic-optimization-algorithms-for-finding-local-minima |
http://papers.nips.cc/paper/7704-third-order-smoothness-helps-faster-stochastic-optimization-algorithms-for-finding-local-minima.pdf | |
PWC | https://paperswithcode.com/paper/third-order-smoothness-helps-faster |
Repo | |
Framework | |
KDGAN: Knowledge Distillation with Generative Adversarial Networks
Title | KDGAN: Knowledge Distillation with Generative Adversarial Networks |
Authors | Xiaojie Wang, Rui Zhang, Yu Sun, Jianzhong Qi |
Abstract | Knowledge distillation (KD) aims to train a lightweight classifier suitable for providing accurate inference with constrained resources in multi-label learning. Instead of directly consuming feature-label pairs, the classifier is trained by a teacher, i.e., a high-capacity model whose training may be resource-hungry. The accuracy of a classifier trained this way is usually suboptimal because it is difficult to learn the true data distribution from the teacher. An alternative is to adversarially train the classifier against a discriminator in a two-player game akin to generative adversarial networks (GAN), which can ensure that the classifier learns the true data distribution at the equilibrium of this game. However, such a two-player game may take an excessively long time to reach equilibrium due to high-variance gradient updates. To address these limitations, we propose a three-player game named KDGAN, consisting of a classifier, a teacher, and a discriminator. The classifier and the teacher learn from each other via distillation losses and are adversarially trained against the discriminator via adversarial losses. By simultaneously optimizing the distillation and adversarial losses, the classifier learns the true data distribution at the equilibrium. We approximate the discrete distribution learned by the classifier (or the teacher) with a concrete distribution, from which we generate continuous samples to obtain low-variance gradient updates that speed up training. Extensive experiments using real datasets confirm the superiority of KDGAN in both accuracy and training speed. |
Tasks | Multi-Label Learning |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7358-kdgan-knowledge-distillation-with-generative-adversarial-networks |
http://papers.nips.cc/paper/7358-kdgan-knowledge-distillation-with-generative-adversarial-networks.pdf | |
PWC | https://paperswithcode.com/paper/kdgan-knowledge-distillation-with-generative |
Repo | |
Framework | |
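A PyTorch sketch of the three losses in the three-player game described above. The exact loss forms and weightings are illustrative assumptions, but the concrete (Gumbel-softmax) relaxation for low-variance gradients follows the abstract:

```python
import torch
import torch.nn.functional as F

def kdgan_losses(cls_logits, tch_logits, disc, features, temperature=1.0):
    """Mutual distillation between classifier and teacher, plus an adversarial
    loss for the classifier against the discriminator.

    disc is assumed to map (features, label distribution) -> probability in (0, 1)
    that the label sample is 'real'.
    """
    # Mutual distillation: pull each player's softened distribution toward the other's
    kd_cls = F.kl_div(F.log_softmax(cls_logits / temperature, dim=-1),
                      F.softmax(tch_logits / temperature, dim=-1).detach(),
                      reduction='batchmean')
    kd_tch = F.kl_div(F.log_softmax(tch_logits / temperature, dim=-1),
                      F.softmax(cls_logits / temperature, dim=-1).detach(),
                      reduction='batchmean')
    # Concrete relaxation of the discrete label distribution: differentiable,
    # low-variance samples instead of REINFORCE-style discrete ones
    sample = F.gumbel_softmax(cls_logits, tau=temperature, hard=False)
    adv_cls = -torch.log(disc(features, sample) + 1e-8).mean()  # fool the discriminator
    return kd_cls, kd_tch, adv_cls
```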
Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces
Title | Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces |
Authors | Anthony Rios, Ramakanth Kavuluru |
Abstract | Large multi-label datasets contain labels that occur thousands of times (frequent group), those that occur only a few times (few-shot group), and labels that never appear in the training dataset (zero-shot group). Multi-label few- and zero-shot label prediction is mostly unexplored on datasets with large label spaces, especially for text classification. In this paper, we perform a fine-grained evaluation to understand how state-of-the-art methods perform on infrequent labels. Furthermore, we develop few- and zero-shot methods for multi-label text classification when there is a known structure over the label space, and evaluate them on two publicly available medical text datasets: MIMIC II and MIMIC III. For few-shot labels we achieve improvements of 6.2% and 4.8% in R@10 for MIMIC II and MIMIC III, respectively, over prior efforts; the corresponding R@10 improvements for zero-shot labels are 17.3% and 19%. |
Tasks | Multi-Label Classification, Multi-Label Learning, Multi-Label Text Classification, Text Classification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1352/ |
https://www.aclweb.org/anthology/D18-1352 | |
PWC | https://paperswithcode.com/paper/few-shot-and-zero-shot-multi-label-learning |
Repo | |
Framework | |
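R@10, the metric reported above, is the recall of each instance's gold labels within the ten highest-scored labels, averaged over instances; a minimal sketch:

```python
import numpy as np

def recall_at_k(scores, gold, k=10):
    """scores: (N, L) array of label scores; gold: list of sets of gold label indices."""
    recalls = []
    for s, g in zip(scores, gold):
        if not g:
            continue  # skip instances with no gold labels
        topk = set(np.argsort(s)[::-1][:k])     # indices of the k highest scores
        recalls.append(len(topk & g) / len(g))  # fraction of gold labels recovered
    return float(np.mean(recalls))
```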
An Interpretable Neural Network with Topical Information for Relevant Emotion Ranking
Title | An Interpretable Neural Network with Topical Information for Relevant Emotion Ranking |
Authors | Yang Yang, Deyu Zhou, Yulan He |
Abstract | Text might express or evoke multiple emotions with varying intensities. As such, it is crucial to predict and rank multiple relevant emotions by their intensities. Moreover, as emotions might be evoked by hidden topics, it is important to unveil and incorporate such topical information to understand how the emotions are evoked. We propose a novel interpretable neural network approach for relevant emotion ranking. Specifically, motivated by transfer learning, the neural network is initialized to make the hidden layer approximate the behavior of topic models. Moreover, a novel error function is defined to optimize the whole neural network for relevant emotion ranking. Experimental results on three real-world corpora show that the proposed approach performs remarkably better than state-of-the-art emotion detection approaches and multi-label learning methods. Moreover, the extracted emotion-associated topic words indeed represent emotion-evoking events and are in line with our common-sense knowledge. |
Tasks | Common Sense Reasoning, Emotion Classification, Multi-Label Learning, Topic Models, Transfer Learning |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1379/ |
https://www.aclweb.org/anthology/D18-1379 | |
PWC | https://paperswithcode.com/paper/an-interpretable-neural-network-with-topical |
Repo | |
Framework | |
CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning
Title | CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning |
Authors | Wissam Siblini, Pascale Kuntz, Frank Meyer |
Abstract | Extreme Multi-label Learning (XML) considers large sets of items described by a number of labels that can exceed one million. Tree-based methods, which hierarchically partition the problem into small-scale sub-problems, are particularly promising in this context for reducing learning/prediction complexity and opening the way to parallelization. However, the current best approaches do not exploit tree randomization, which has shown its efficiency in random forests, and they resort to complex partitioning strategies. To overcome these limits, we introduce a new random-forest-based algorithm with a very fast partitioning approach called CRAFTML. Experimental comparisons on nine datasets from the XML literature show that it outperforms the other tree-based approaches. Moreover, with a parallelized implementation restricted to five cores, it is competitive with the best state-of-the-art methods, which run on one-hundred-core machines. |
Tasks | Multi-Label Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2216 |
http://proceedings.mlr.press/v80/siblini18a/siblini18a.pdf | |
PWC | https://paperswithcode.com/paper/craftml-an-efficient-clustering-based-random |
Repo | |
Framework | |
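A toy sketch of the clustering-based tree construction described above, using k-means on a random low-dimensional projection for fast, randomized splits. CRAFTML's actual instance/label hashing, split criteria, and forest aggregation are richer than this:

```python
import numpy as np
from sklearn.cluster import KMeans

class ClusterTree:
    """Recursively partition instances with k-means on a random projection;
    leaves store mean label vectors. Parameters are illustrative defaults."""
    def __init__(self, k=4, max_leaf=50, proj_dim=32, rng=None):
        self.k, self.max_leaf, self.proj_dim = k, max_leaf, proj_dim
        self.rng = rng or np.random.default_rng(0)

    def fit(self, X, Y):
        if len(X) <= self.max_leaf:
            self.leaf = Y.mean(axis=0)   # average label vector of the leaf
            return self
        # The random projection injects the tree randomization the abstract mentions
        self.P = self.rng.normal(size=(X.shape[1], self.proj_dim))
        self.km = KMeans(n_clusters=self.k, n_init=3).fit(X @ self.P)
        self.children = []
        for c in range(self.k):
            m = self.km.labels_ == c
            child = ClusterTree(self.k, self.max_leaf, self.proj_dim, self.rng)
            self.children.append(child.fit(X[m], Y[m]))
        self.leaf = None
        return self

    def predict(self, x):
        if self.leaf is not None:
            return self.leaf
        c = int(self.km.predict((x @ self.P).reshape(1, -1))[0])
        return self.children[c].predict(x)
```

A forest would train several such trees with different random seeds and average their leaf predictions.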