January 25, 2020

2772 words · 14 min read

Paper Group NANR 9


Compact Feature Learning for Multi-Domain Image Classification

Title Compact Feature Learning for Multi-Domain Image Classification
Authors Yajing Liu, Xinmei Tian, Ya Li, Zhiwei Xiong, Feng Wu
Abstract The goal of multi-domain learning is to improve performance over multiple domains by making full use of all their training data. However, variations in feature distributions across domains make multi-domain learning a non-trivial problem. State-of-the-art work on multi-domain classification aims to extract domain-invariant features and domain-specific features independently. However, such methods treat the features from different classes as one general distribution and try to match these distributions across domains, which mixes features from different classes across domains and degrades classification performance. Additionally, existing works only force the features shared among domains to be orthogonal to the features in the domain-specific network; redundant features between the domain-specific networks still remain, which may shrink the discriminative ability of domain-specific features. Therefore, we propose an end-to-end network to obtain more optimal features, which we call compact features. We propose to extract domain-invariant features by matching the joint distributions of different domains, which have distinct boundaries between different classes. Moreover, we add an orthogonal constraint between the private features across domains to ensure the discriminative ability of the domain-specific space. The proposed method is validated on three landmark datasets, and the results demonstrate its effectiveness.
Tasks Image Classification
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Liu_Compact_Feature_Learning_for_Multi-Domain_Image_Classification_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Liu_Compact_Feature_Learning_for_Multi-Domain_Image_Classification_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/compact-feature-learning-for-multi-domain
Repo
Framework
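The orthogonal constraint between private features across domains can be illustrated numerically. The sketch below is an assumption on my part, not the paper's implementation: it penalises the squared dot products between paired domain-specific feature vectors, which is zero exactly when each pair is orthogonal.

```python
import numpy as np

def orthogonality_penalty(private_a, private_b):
    """Sum of squared dot products between paired feature vectors from
    two domain-specific spaces; zero when every pair is orthogonal.
    A toy stand-in for the paper's orthogonal constraint."""
    # private_a, private_b: (n, d) batches of domain-specific features
    return float(np.sum(np.sum(private_a * private_b, axis=1) ** 2))
```

Added to the training objective, a term like this discourages the private subspaces of different domains from encoding redundant, overlapping features.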

Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation

Title Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation
Authors Artur Grigorev, Artem Sevastopolsky, Alexander Vakhitov, Victor Lempitsky
Abstract We present a new deep learning approach to pose-guided resynthesis of human photographs. At the heart of the new approach is the estimation of the complete body surface texture based on a single photograph. Since the input photograph always observes only a part of the surface, we suggest a new inpainting method that completes the texture of the human body. Rather than working directly with colors of texture elements, the inpainting network estimates an appropriate source location in the input image for each element of the body surface. This correspondence field between the input image and the texture is then further warped into the target image coordinate frame based on the desired pose, effectively establishing the correspondence between the source and the target view even when the pose change is drastic. The final convolutional network then uses the established correspondence and all other available information to synthesize the output image. A fully-convolutional architecture with deformable skip connections guided by the estimated correspondence field is used. We show state-of-the-art results for pose-guided image synthesis. Additionally, we demonstrate the performance of our system for garment transfer and pose-guided face resynthesis.
Tasks Image Generation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Grigorev_Coordinate-Based_Texture_Inpainting_for_Pose-Guided_Human_Image_Generation_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Grigorev_Coordinate-Based_Texture_Inpainting_for_Pose-Guided_Human_Image_Generation_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/coordinate-based-texture-inpainting-for-pose-1
Repo
Framework
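The core idea of coordinate-based inpainting is that the network predicts, for each output element, *where in the source image to look*, rather than a colour. A minimal nearest-neighbour version of that lookup (the paper's networks are convolutional and differentiable; this sketch only illustrates the coordinate-sampling step) can be written as:

```python
import numpy as np

def warp_by_coordinates(source, coords):
    """For each target pixel, fetch the source pixel at the predicted
    (row, col) coordinate. Nearest-neighbour sampling for simplicity;
    the real pipeline would use differentiable bilinear sampling."""
    h, w = coords.shape[:2]
    out = np.zeros((h, w) + source.shape[2:], dtype=source.dtype)
    for i in range(h):
        for j in range(w):
            r, c = coords[i, j]
            r = int(round(min(max(r, 0), source.shape[0] - 1)))  # clamp to image
            c = int(round(min(max(c, 0), source.shape[1] - 1)))
            out[i, j] = source[r, c]
    return out
```

An identity coordinate field reproduces the source; warping the coordinate field into the target pose reassembles the texture in the new view.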

Assessing Back-Translation as a Corpus Generation Strategy for non-English Tasks: A Study in Reading Comprehension and Word Sense Disambiguation

Title Assessing Back-Translation as a Corpus Generation Strategy for non-English Tasks: A Study in Reading Comprehension and Word Sense Disambiguation
Authors Fabricio Monsalve, Kervy Rivas Rojas, Marco Antonio Sobrevilla Cabezudo, Arturo Oncevay
Abstract Corpora curated by experts have sustained Natural Language Processing mainly in English, but the expense of corpus creation is a barrier to development in other languages. We therefore propose a corpus generation strategy that requires only a machine translation system between English and the target language in both directions, filtering for the best translations by computing automatic translation metrics and the task performance score. Studying Reading Comprehension in Spanish and Word Sense Disambiguation in Portuguese, we find that a more quality-oriented metric has high potential for corpus selection without degrading task performance. We conclude that it is possible to systematise the building of quality corpora using machine translation and automatic metrics, beyond some prior effort to clean and process the data.
Tasks Machine Translation, Reading Comprehension, Word Sense Disambiguation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4010/
PDF https://www.aclweb.org/anthology/W19-4010
PWC https://paperswithcode.com/paper/assessing-back-translation-as-a-corpus
Repo
Framework
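The filtering step described in the abstract (keep only translations whose automatic-metric score is high enough) has a simple shape. The sketch below is illustrative: `token_f1` is a crude stand-in I am substituting for a real translation metric such as BLEU, and the threshold is arbitrary.

```python
def filter_backtranslations(pairs, score_fn, threshold):
    """Keep (original, round_trip) pairs whose metric score meets the
    threshold; the survivors form the generated corpus."""
    return [(src, bt) for src, bt in pairs if score_fn(src, bt) >= threshold]

def token_f1(a, b):
    """Token-overlap F1: a crude placeholder for an automatic
    translation metric like BLEU or chrF."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    overlap = len(ta & tb)
    if overlap == 0:
        return 0.0
    p, r = overlap / len(tb), overlap / len(ta)
    return 2 * p * r / (p + r)
```

In the paper's setting, the scored pair would be an English sentence and its round-trip back-translation, with the task performance score applied as a second filter.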

Attentive Region Embedding Network for Zero-Shot Learning

Title Attentive Region Embedding Network for Zero-Shot Learning
Authors Guo-Sen Xie, Li Liu, Xiaobo Jin, Fan Zhu, Zheng Zhang, Jie Qin, Yazhou Yao, Ling Shao
Abstract Zero-shot learning (ZSL) aims to classify images from unseen categories by merely utilizing seen-class images as the training data. Existing works on ZSL mainly leverage global features or learned global regions to construct embeddings into the semantic space. However, few of them study the discriminative power implied in local image regions (parts), which in some sense correspond to semantic attributes, have stronger discrimination than attributes, and can thus assist the semantic transfer between seen and unseen classes. In this paper, to discover such (semantic) regions, we propose the attentive region embedding network (AREN), which is tailored to advance the ZSL task. Specifically, AREN is end-to-end trainable and consists of two network branches, i.e., the attentive region embedding (ARE) stream and the attentive compressed second-order embedding (ACSE) stream. ARE is capable of discovering multiple part regions under the guidance of the attention and the compatibility loss. Moreover, a novel adaptive thresholding mechanism is proposed for suppressing redundant (such as background) attention regions. To further guarantee more stable semantic transfer from the perspective of second-order collaboration, ACSE is incorporated into AREN. In comprehensive evaluations on four benchmarks, our models achieve state-of-the-art performance under the ZSL setting and compelling results under the generalized ZSL setting.
Tasks Zero-Shot Learning
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Xie_Attentive_Region_Embedding_Network_for_Zero-Shot_Learning_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Xie_Attentive_Region_Embedding_Network_for_Zero-Shot_Learning_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/attentive-region-embedding-network-for-zero
Repo
Framework
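The "adaptive thresholding" idea, suppressing weak (often background) attention regions, can be sketched as follows. The mean-based cutoff here is my assumption standing in for the paper's learned mechanism; the point is only the prune-then-renormalise pattern.

```python
import numpy as np

def adaptive_threshold_attention(att, scale=1.0):
    """Zero out attention weights below scale * mean(att), then
    renormalise the survivors to sum to one. A toy version of
    adaptive attention thresholding."""
    mask = att >= scale * att.mean()
    pruned = att * mask
    total = pruned.sum()
    return pruned / total if total > 0 else att
```

After thresholding, the remaining mass is concentrated on the strongest part regions, which is what lets the embedding focus on discriminative parts rather than background.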

On the Robustness of Self-Attentive Models

Title On the Robustness of Self-Attentive Models
Authors Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh
Abstract This work examines the robustness of self-attentive neural networks against adversarial input perturbations. Specifically, we investigate the attention and feature extraction mechanisms of state-of-the-art recurrent neural networks and self-attentive architectures for sentiment analysis, entailment and machine translation under adversarial attacks. We also propose a novel attack algorithm for generating more natural adversarial examples that could mislead neural models but not humans. Experimental results show that, compared to recurrent neural models, self-attentive models are more robust against adversarial perturbation. In addition, we provide theoretical explanations for their superior robustness to support our claims.
Tasks Machine Translation, Sentiment Analysis
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1147/
PDF https://www.aclweb.org/anthology/P19-1147
PWC https://paperswithcode.com/paper/on-the-robustness-of-self-attentive-models
Repo
Framework

UTFPR at SemEval-2019 Task 6: Relying on Compositionality to Find Offense

Title UTFPR at SemEval-2019 Task 6: Relying on Compositionality to Find Offense
Authors Gustavo Henrique Paetzold
Abstract We present the UTFPR system for the OffensEval shared task of SemEval 2019: A character-to-word-to-sentence compositional RNN model trained exclusively over the training data provided by the organizers. We find that, although not very competitive for the task at hand, it offers a robust solution to the orthographic irregularity inherent to tweets.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2140/
PDF https://www.aclweb.org/anthology/S19-2140
PWC https://paperswithcode.com/paper/utfpr-at-semeval-2019-task-6-relying-on
Repo
Framework

Dual Learning: Theoretical Study and Algorithmic Extensions

Title Dual Learning: Theoretical Study and Algorithmic Extensions
Authors Zhibing Zhao, Yingce Xia, Tao Qin, Tie-Yan Liu
Abstract Dual learning has been successfully applied in many machine learning applications, including machine translation, image-to-image transformation, etc. The high-level idea of dual learning is very intuitive: if we map an x from one domain to another and then map it back, we should recover the original x. Although its effectiveness has been empirically verified, theoretical understanding of dual learning is still missing. In this paper, we conduct a theoretical study to understand why and when dual learning can improve a mapping function. Based on the theoretical discoveries, we extend dual learning by introducing more related mappings and propose highly symmetric frameworks, cycle dual learning and multipath dual learning, in both of which we can leverage the feedback signals from additional domains to improve the qualities of the mappings. We prove that both cycle dual learning and multipath dual learning can boost the performance of standard dual learning under mild conditions. Experiments on WMT 14 English↔German and MultiUN English↔French translations verify our theoretical findings on dual learning, and the results on the translations among English, French, and Spanish of MultiUN demonstrate the efficacy of cycle dual learning and multipath dual learning.
Tasks Machine Translation
Published 2019-05-01
URL https://openreview.net/forum?id=HyMxAi05Km
PDF https://openreview.net/pdf?id=HyMxAi05Km
PWC https://paperswithcode.com/paper/dual-learning-theoretical-study-and
Repo
Framework
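The dual-learning signal, and its cycle extension, reduce to a round-trip reconstruction error. The sketch below is a schematic of that idea (real systems measure translation likelihoods rather than exact distances, and `forward`/`backward` would be learned models):

```python
def dual_reconstruction_error(x, forward, backward, distance):
    """Standard dual learning: map x to the other domain and back,
    then measure how far the round trip drifted from x."""
    return distance(x, backward(forward(x)))

def cycle_reconstruction_error(x, mappings, distance):
    """Cycle dual learning: compose mappings around a ring of domains
    (e.g. En -> Fr -> Es -> En) and compare the result with x."""
    y = x
    for f in mappings:
        y = f(y)
    return distance(x, y)
```

Minimising these errors over learned mappings is what lets feedback from additional domains improve each individual mapping.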

On-device Structured and Context Partitioned Projection Networks

Title On-device Structured and Context Partitioned Projection Networks
Authors Sujith Ravi, Zornitsa Kozareva
Abstract A challenging problem in on-device text classification is to build highly accurate neural models that fit in a small memory footprint and have low latency. To address this challenge, we propose an on-device neural network, SGNN++, which dynamically learns compact projection vectors from raw text using structured and context-dependent partition projections. We show that this results in accelerated inference and performance improvements. We conduct extensive evaluation on multiple conversational tasks and languages such as English, Japanese, Spanish and French. Our SGNN++ model significantly outperforms all baselines, improves upon existing on-device neural models and even surpasses RNN, CNN and BiLSTM models on dialog act and intent prediction. Through a series of ablation studies we show the impact of the partitioned projections and structured information, leading to a 10% improvement. We study the impact of model size on accuracy and introduce quantization-aware training for SGNN++ to further reduce the model size while preserving the same quality. Finally, we show fast inference on mobile phones.
Tasks Text Classification
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1368/
PDF https://www.aclweb.org/anthology/P19-1368
PWC https://paperswithcode.com/paper/on-device-structured-and-context-partitioned
Repo
Framework
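To make "compact projection vectors from raw text" concrete, here is an LSH-style sketch of the general projection idea, hashing tokens into signed votes per bit and keeping the sign. This is my illustration of the family of techniques, not the paper's exact SGNN++ structured, context-partitioned projection.

```python
import hashlib

def projection_bits(text, n_bits=16, seed=0):
    """Project raw text to a fixed-size bit vector without a vocabulary:
    each token casts a +1/-1 vote per bit via a hash, and the sign of
    the vote total gives the bit. An LSH-style sketch of on-device
    projection networks."""
    votes = [0] * n_bits
    for token in text.lower().split():
        digest = hashlib.md5(f"{seed}:{token}".encode()).digest()
        for b in range(n_bits):
            byte, bit = divmod(b, 8)
            votes[b] += 1 if (digest[byte] >> bit) & 1 else -1
    return [1 if v > 0 else 0 for v in votes]
```

Because the projection is computed on the fly from hashes, there is no embedding table to store, which is what makes the memory footprint small enough for on-device use.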

Feature Space Perturbations Yield More Transferable Adversarial Examples

Title Feature Space Perturbations Yield More Transferable Adversarial Examples
Authors Nathan Inkawhich, Wei Wen, Hai (Helen) Li, Yiran Chen
Abstract Many recent works have shown that deep learning models are vulnerable to quasi-imperceptible input perturbations, yet practitioners cannot fully explain this behavior. This work describes a transfer-based blackbox targeted adversarial attack on deep feature space representations that also provides insights into cross-model class representations of deep CNNs. The attack is explicitly designed for transferability and drives the feature space representation of a source image at layer L towards the representation of a target image at L. The attack yields highly transferable targeted examples, which outperform competition-winning methods by over 30% in targeted attack metrics. We also show that the choice of layer L from which to generate examples is important, that transferability characteristics are agnostic to the blackbox model, and that well-trained deep models have similar highly abstract representations.
Tasks Adversarial Attack
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Inkawhich_Feature_Space_Perturbations_Yield_More_Transferable_Adversarial_Examples_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Inkawhich_Feature_Space_Perturbations_Yield_More_Transferable_Adversarial_Examples_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/feature-space-perturbations-yield-more
Repo
Framework
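The attack's objective, driving the source image's layer-L features toward the target image's features, can be sketched with a linear feature map so the gradient is analytic. In the paper the feature map is a CNN up to layer L and the gradient comes from backpropagation; everything below is a simplified assumption.

```python
import numpy as np

def feature_space_attack(x_src, x_tgt, W, steps=100, lr=0.1):
    """Gradient descent on ||W x - W x_tgt||^2: perturb x_src until its
    features (here a linear map W) match the target's features.
    A toy version of a feature-space targeted attack."""
    feats_tgt = W @ x_tgt
    x = x_src.astype(float).copy()
    for _ in range(steps):
        grad = 2 * W.T @ (W @ x - feats_tgt)  # analytic gradient for linear W
        x -= lr * grad
    return x
```

A real attack would also bound the perturbation (e.g. an L-infinity ball around `x_src`) so the adversarial image stays visually close to the source.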

Embedded Block Residual Network: A Recursive Restoration Model for Single-Image Super-Resolution

Title Embedded Block Residual Network: A Recursive Restoration Model for Single-Image Super-Resolution
Authors Yajun Qiu, Ruxin Wang, Dapeng Tao, Jun Cheng
Abstract Single-image super-resolution restores the lost structures and textures from low-resolution images and has attracted extensive attention from the research community. The top performers in this field include deep or wide convolutional neural networks and recurrent neural networks. However, these methods force a single model to process all kinds of textures and structures. A typical operation is that a certain layer restores textures based on the ones recovered by the preceding layers, ignoring the characteristics of image textures. In this paper, we argue that the lower-frequency and higher-frequency information in images have different levels of complexity and should be restored by models of different representational capacity. Inspired by this, we propose a novel embedded block residual network (EBRN), an incremental recovery process for texture super-resolution in which different modules of the model restore information of different frequencies: shallower modules recover lower-frequency information, while deeper modules restore higher-frequency information. Extensive experiments indicate that the proposed EBRN model achieves superior performance and visual improvements over the state-of-the-art.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Qiu_Embedded_Block_Residual_Network_A_Recursive_Restoration_Model_for_Single-Image_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Qiu_Embedded_Block_Residual_Network_A_Recursive_Restoration_Model_for_Single-Image_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/embedded-block-residual-network-a-recursive
Repo
Framework

Comparative judgments are more consistent than binary classification for labelling word complexity

Title Comparative judgments are more consistent than binary classification for labelling word complexity
Authors Sian Gooding, Ekaterina Kochmar, Advait Sarkar, Alan Blackwell
Abstract Lexical simplification systems replace complex words with simple ones based on a model of which words are complex in context. We explore how users can help train complex word identification models through labelling more efficiently and reliably. We show that using an interface where annotators make comparative rather than binary judgments leads to more reliable and consistent labels, and explore whether comparative judgments may provide a faster way for collecting labels.
Tasks Complex Word Identification, Lexical Simplification
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4024/
PDF https://www.aclweb.org/anthology/W19-4024
PWC https://paperswithcode.com/paper/comparative-judgments-are-more-consistent
Repo
Framework

AudioCaps: Generating Captions for Audios in The Wild

Title AudioCaps: Generating Captions for Audios in The Wild
Authors Chris Dongjoo Kim, Byeongchang Kim, Hyunmin Lee, Gunhee Kim
Abstract We explore the problem of Audio Captioning: generating natural language descriptions for any kind of audio in the wild, which has been surprisingly unexplored in previous research. We contribute a large-scale dataset of 46K audio clips with human-written text pairs collected via crowdsourcing on the AudioSet dataset. Our thorough empirical studies not only show that our collected captions are indeed faithful to the audio inputs but also discover which forms of audio representation and captioning models are effective for audio captioning. From extensive experiments, we also propose two novel components that help improve audio captioning performance: the top-down multi-scale encoder and aligned semantic attention.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1011/
PDF https://www.aclweb.org/anthology/N19-1011
PWC https://paperswithcode.com/paper/audiocaps-generating-captions-for-audios-in
Repo
Framework

Exploiting Open IE for Deriving Multiple Premises Entailment Corpus

Title Exploiting Open IE for Deriving Multiple Premises Entailment Corpus
Authors Martin Víta, Jakub Klímek
Abstract Natural language inference (NLI) is a key part of natural language understanding. The NLI task is defined as the decision problem of whether a given sentence, the hypothesis, can be inferred from a given text. Typically we deal with a text consisting of just a single premise sentence, which is called the single premise entailment (SPE) task. Recently, a derived task of NLI from multiple premises (MPE) was introduced together with the first annotated corpus and several strong baselines. Nevertheless, further development in the MPE field requires access to large amounts of annotated data. In this paper we introduce a novel method for rapidly deriving MPE corpora from existing annotated NLI (SPE) data that does not require any additional annotation work. The proposed approach is based on an open information extraction system. We demonstrate the method on the well-known SNLI corpus, provide the first evaluations over the obtained corpus, and state a strong baseline.
Tasks Natural Language Inference, Open Information Extraction
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1144/
PDF https://www.aclweb.org/anthology/R19-1144
PWC https://paperswithcode.com/paper/exploiting-open-ie-for-deriving-multiple
Repo
Framework

A Skip Connection Architecture for Localization of Image Manipulations

Title A Skip Connection Architecture for Localization of Image Manipulations
Authors Ghazal Mazaheri, Niluthpol Chowdhury Mithun, Jawadul H. Bappy, Amit K. Roy-Chowdhury
Abstract Detection and localization of image manipulations have become of increasing interest to researchers in recent years due to the significant rise of malicious content-changing image tampering on the web. One of the major challenges for an image manipulation detection method is to discriminate between the tampered regions and other regions in an image. We observe that most manipulated images leave some traces near the boundaries of manipulated regions, including blurred edges. In order to exploit these traces for localizing the tampered regions, we propose an encoder-decoder based network where we fuse representations from early layers in the encoder (which are richer in low-level spatial cues, like edges) by skip pooling with representations of the last layer of the decoder, and use them for manipulation detection. In addition, we utilize resampling features extracted from image patches by feeding them to LSTM cells to capture the transition between manipulated and non-manipulated blocks in the frequency domain, and combine the output of the LSTM with our encoder. The overall framework is capable of detecting different types of image manipulations simultaneously, including copy-move, removal, and splicing. Experimental results on two standard benchmark datasets (CASIA 1.0 and NIST’16) demonstrate that the proposed method achieves significantly better performance than state-of-the-art methods and baselines.
Tasks Image Manipulation Detection
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPRW_2019/html/Media_Forensics/Mazaheri_A_Skip_Connection_Architecture_for_Localization_of_Image_Manipulations_CVPRW_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPRW_2019/papers/Media%20Forensics/Mazaheri_A_Skip_Connection_Architecture_for_Localization_of_Image_Manipulations_CVPRW_2019_paper.pdf
PWC https://paperswithcode.com/paper/a-skip-connection-architecture-for
Repo
Framework

Proceedings of the 2nd Workshop on New Frontiers in Summarization

Title Proceedings of the 2nd Workshop on New Frontiers in Summarization
Authors
Abstract
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5400/
PDF https://www.aclweb.org/anthology/D19-5400
PWC https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-new
Repo
Framework