October 15, 2019

2833 words 14 mins read

Paper Group NANR 109

Depth-Aware Stereo Video Retargeting. Contour location via entropy reduction leveraging multiple information sources. Texterra at SemEval-2018 Task 7: Exploiting Syntactic Information for Relation Extraction and Classification in Scientific Papers. NNEval: Neural Network based Evaluation Metric for Image Captioning. Cross-Lingual Content Scoring. T …

Depth-Aware Stereo Video Retargeting

Title Depth-Aware Stereo Video Retargeting
Authors Bing Li, Chia-Wen Lin, Boxin Shi, Tiejun Huang, Wen Gao, C.-C. Jay Kuo
Abstract Compared with traditional video retargeting, stereo video retargeting poses new challenges because stereo video contains the depth information of salient objects and its temporal dynamics. In this work, we propose a depth-aware stereo video retargeting method that imposes a depth fidelity constraint. The proposed method reconstructs the 3D scene to obtain the depth information of salient objects. We cast retargeting as a constrained optimization problem, where the total cost function includes the shape, temporal and depth distortions of salient objects. As a result, the solution preserves the shape, temporal and depth fidelity of salient objects simultaneously. Experimental results demonstrate that the depth-aware retargeting method achieves higher retargeting quality and provides a better user experience.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Depth-Aware_Stereo_Video_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Depth-Aware_Stereo_Video_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/depth-aware-stereo-video-retargeting
Repo
Framework
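
The abstract frames retargeting as a constrained optimization whose total cost combines shape, temporal and depth distortion terms. The toy sketch below is not the paper's formulation: it merely illustrates such a combined cost by optimizing per-column scale factors for one frame, with assumed per-column saliency and disparity inputs.

```python
# Toy sketch (not the paper's model): per-column scaling with shape,
# temporal and depth distortion terms combined into one cost.
import numpy as np
from scipy.optimize import minimize

def retarget_scales(saliency, disparity, prev_scales, target_ratio,
                    w_shape=1.0, w_temporal=0.5, w_depth=0.5):
    n = len(saliency)

    def cost(s):
        shape = np.sum(saliency * (s - 1.0) ** 2)                 # salient columns keep their width
        temporal = np.sum((s - prev_scales) ** 2)                 # stay close to the previous frame
        depth = np.sum(saliency * (disparity * (s - 1.0)) ** 2)   # limit disparity (depth) distortion
        return w_shape * shape + w_temporal * temporal + w_depth * depth

    # Constraint: the average scale must hit the target width ratio.
    cons = {"type": "eq", "fun": lambda s: np.mean(s) - target_ratio}
    res = minimize(cost, x0=np.full(n, target_ratio),
                   bounds=[(0.2, 1.0)] * n, constraints=[cons])
    return res.x

sal, disp = np.random.rand(12), np.random.rand(12)
scales = retarget_scales(sal, disp, prev_scales=np.full(12, 0.75), target_ratio=0.75)
```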

Contour location via entropy reduction leveraging multiple information sources

Title Contour location via entropy reduction leveraging multiple information sources
Authors Alexandre Marques, Remi Lam, Karen Willcox
Abstract We introduce an algorithm to locate contours of functions that are expensive to evaluate. The problem of locating contours arises in many applications, including classification, constrained optimization, and performance analysis of mechanical and dynamical systems (reliability, probability of failure, stability, etc.). Our algorithm locates contours using information from multiple sources, which are available in the form of relatively inexpensive, biased, and possibly noisy approximations to the original function. Considering multiple information sources can lead to significant cost savings. We also introduce the concept of contour entropy, a formal measure of uncertainty about the location of the zero contour of a function approximated by a statistical surrogate model. Our algorithm locates contours efficiently by maximizing the reduction of contour entropy per unit cost.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7768-contour-location-via-entropy-reduction-leveraging-multiple-information-sources
PDF http://papers.nips.cc/paper/7768-contour-location-via-entropy-reduction-leveraging-multiple-information-sources.pdf
PWC https://paperswithcode.com/paper/contour-location-via-entropy-reduction-1
Repo
Framework
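
The paper's algorithm picks the information source and location that maximize expected contour-entropy reduction per unit cost. The snippet below is a heavily simplified single-source sketch: it fits one GP surrogate and uses the current contour entropy divided by cost as a greedy proxy acquisition, an assumption made only to keep the example short.

```python
# Simplified single-source sketch of contour entropy (the paper fuses multiple
# information sources and maximizes *expected* entropy reduction per cost).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def contour_entropy(gp, X):
    mu, sigma = gp.predict(X, return_std=True)
    p = norm.cdf((0.0 - mu) / np.maximum(sigma, 1e-9))   # P(f(x) < 0): which side of the zero contour
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def next_query(gp, candidates, cost):
    return candidates[np.argmax(contour_entropy(gp, candidates) / cost)]

f = lambda x: np.sin(3 * x[:, 0]) - 0.3                   # stand-in for the expensive function
X = np.random.uniform(-1, 1, size=(8, 1))
gp = GaussianProcessRegressor().fit(X, f(X))
cand = np.linspace(-1, 1, 200).reshape(-1, 1)
x_next = next_query(gp, cand, cost=np.ones(len(cand)))    # where to evaluate next
```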

Texterra at SemEval-2018 Task 7: Exploiting Syntactic Information for Relation Extraction and Classification in Scientific Papers

Title Texterra at SemEval-2018 Task 7: Exploiting Syntactic Information for Relation Extraction and Classification in Scientific Papers
Authors Andrey Sysoev, Vladimir Mayorov
Abstract In this work we evaluate the applicability of entity pair models and neural network architectures for relation extraction and classification in scientific papers at SemEval-2018. We carry out experiments with representing entity pairs through sentence tokens and through the shortest path in the dependency tree, comparing approaches based on convolutional and recurrent neural networks. With a convolutional network applied to the shortest path in the dependency tree we were ranked eighth in subtask 1.1 ("clean data") and ninth in 1.2 ("noisy data"). A similar model applied to separate parts of the shortest path reached ninth (extraction track) and seventh (classification track) positions in the subtask 2 ranking.
Tasks Relation Classification, Relation Extraction
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1131/
PDF https://www.aclweb.org/anthology/S18-1131
PWC https://paperswithcode.com/paper/texterra-at-semeval-2018-task-7-exploiting
Repo
Framework
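
For the shortest-dependency-path variant, a compact convolutional encoder over the path tokens is enough to convey the idea. The sketch below is an assumption-laden reconstruction (embedding size, filter count and the single-path input are illustrative, not the authors' exact configuration).

```python
# Hedged sketch of a CNN over the shortest dependency path between two entities.
import torch
import torch.nn as nn

class PathCNN(nn.Module):
    def __init__(self, vocab_size, n_relations, emb_dim=100, n_filters=128, k=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=k, padding=k // 2)
        self.out = nn.Linear(n_filters, n_relations)

    def forward(self, path_ids):                  # (batch, path_len) tokens along the dependency path
        x = self.emb(path_ids).transpose(1, 2)    # (batch, emb_dim, path_len)
        x = torch.relu(self.conv(x))
        x = x.max(dim=2).values                   # max-pool over the path positions
        return self.out(x)                        # relation logits

model = PathCNN(vocab_size=5000, n_relations=6)
logits = model(torch.randint(1, 5000, (4, 7)))    # a batch of 4 paths of length 7
```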

NNEval: Neural Network based Evaluation Metric for Image Captioning

Title NNEval: Neural Network based Evaluation Metric for Image Captioning
Authors Naeha Sharif, Lyndon White, Mohammed Bennamoun, Syed Afaq Ali Shah
Abstract The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems. Existing metrics to automatically evaluate image captioning systems fail to achieve a satisfactory level of correlation with human judgements at the sentence level. Moreover, these metrics, unlike humans, tend to focus on specific aspects of quality, such as n-gram overlap or semantic meaning. In this paper, we present the first learning-based metric to evaluate image captions. Our proposed framework enables us to incorporate both lexical and semantic information into a single learned metric. This results in an evaluator that takes into account various linguistic features to assess caption quality. The experiments we performed to assess the proposed metric show improvements upon the state of the art in terms of correlation with human judgements and demonstrate its superior robustness to distractions.
Tasks Image Captioning
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Naeha_Sharif_NNEval_Neural_Network_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Naeha_Sharif_NNEval_Neural_Network_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/nneval-neural-network-based-evaluation-metric
Repo
Framework
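
The key idea is a learned metric: a small network maps a vector of lexical and semantic features for a candidate caption to a quality score. The sketch below assumes the features are precomputed scores (e.g. n-gram metrics and embedding similarities); the concrete feature set and layer sizes in the paper differ.

```python
# Minimal sketch of a learned caption-evaluation metric over precomputed features.
import torch
import torch.nn as nn

class LearnedCaptionMetric(nn.Module):
    def __init__(self, n_features, hidden=72):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),               # trained to separate human from machine captions
        )

    def forward(self, feats):
        # The probability of the "human-quality" class is used as the metric score.
        return torch.softmax(self.net(feats), dim=-1)[:, 1]

metric = LearnedCaptionMetric(n_features=6)      # e.g. several n-gram scores + semantic similarities
scores = metric(torch.rand(3, 6))                # one quality score per candidate caption
```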

Cross-Lingual Content Scoring

Title Cross-Lingual Content Scoring
Authors Andrea Horbach, Sebastian Stennmanns, Torsten Zesch
Abstract We investigate the feasibility of cross-lingual content scoring, a scenario where training and test data in an automatic scoring task are from two different languages. Cross-lingual scoring can contribute to educational equality by allowing answers in multiple languages. Training a model in one language and applying it to another language might also help to overcome data sparsity issues by re-using trained models from other languages. As there is no suitable dataset available for this new task, we create a comparable bi-lingual corpus by extending the English ASAP dataset with German answers. Our experiments with cross-lingual scoring based on machine-translating either training or test data show a considerable drop in scoring quality.
Tasks Machine Translation
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-0550/
PDF https://www.aclweb.org/anthology/W18-0550
PWC https://paperswithcode.com/paper/cross-lingual-content-scoring
Repo
Framework
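
The cross-lingual setup boils down to machine-translating one side so that training and test answers share a language, then running a standard content-scoring model. The pipeline below is a deliberately simple stand-in: translate() is a placeholder for any MT system, and the bag-of-words classifier is an illustrative choice, not the paper's scoring model.

```python
# "Translate, then score" sketch for cross-lingual content scoring.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def translate(texts, src="de", tgt="en"):
    # Placeholder: plug in an MT system so training and test data share a language.
    return texts

train_answers_en = ["the cell membrane controls transport", "plants need light to grow"]
train_scores = [2, 1]
test_answers_de = ["die Zellmembran steuert den Transport"]

scorer = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scorer.fit(train_answers_en, train_scores)
predicted = scorer.predict(translate(test_answers_de))
```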

Transferring from Formal Newswire Domain with Hypernet for Twitter POS Tagging

Title Transferring from Formal Newswire Domain with Hypernet for Twitter POS Tagging
Authors Tao Gui, Qi Zhang, Jingjing Gong, Minlong Peng, Di Liang, Keyu Ding, Xuanjing Huang
Abstract Part-of-Speech (POS) tagging for Twitter has received considerable attention in recent years. Because most POS tagging methods are based on supervised models, they usually require a large amount of labeled data for training. However, the existing labeled datasets for Twitter are much smaller than those for newswire text. Hence, to help POS tagging for Twitter, most domain adaptation methods try to leverage newswire datasets by learning the shared features between the two domains. However, from a linguistic perspective, Twitter users not only tend to mimic the formal expressions of traditional media, like news, but they also appear to be developing linguistically informal styles. Therefore, POS tagging for the formal Twitter context can be learned together with the newswire dataset, while POS tagging for the informal Twitter context should be learned separately. To achieve this task, in this work, we propose a hypernetwork-based method to generate different parameters to separately model contexts with different expression styles. Experimental results on three different datasets show that our approach achieves better performance than state-of-the-art methods in most cases.
Tasks Domain Adaptation, Multi-Task Learning, Part-Of-Speech Tagging, Stock Prediction
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1275/
PDF https://www.aclweb.org/anthology/D18-1275
PWC https://paperswithcode.com/paper/transferring-from-formal-newswire-domain-with
Repo
Framework
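
The central mechanism is a hypernetwork: a small network that generates the tagger's parameters from a representation of the context's style, so formal and informal contexts are modelled with different parameters. The sketch below generates only the output layer of a token classifier and uses made-up sizes; it illustrates the idea rather than reproducing the paper's architecture.

```python
# Hedged sketch: a hypernetwork generates the POS classifier's weights from a style embedding.
import torch
import torch.nn as nn

class HyperTagger(nn.Module):
    def __init__(self, feat_dim, n_tags, style_dim=16):
        super().__init__()
        self.feat_dim, self.n_tags = feat_dim, n_tags
        # style embedding -> flattened weight matrix and bias of the tag classifier
        self.hyper = nn.Linear(style_dim, feat_dim * n_tags + n_tags)

    def forward(self, token_feats, style):            # (batch, seq, feat_dim), (batch, style_dim)
        params = self.hyper(style)
        W = params[:, : self.feat_dim * self.n_tags].view(-1, self.n_tags, self.feat_dim)
        b = params[:, self.feat_dim * self.n_tags :]
        # apply the per-example generated classifier to every token
        return torch.einsum("bsf,btf->bst", token_feats, W) + b.unsqueeze(1)

tagger = HyperTagger(feat_dim=64, n_tags=12)
logits = tagger(torch.rand(2, 9, 64), torch.rand(2, 16))   # (batch=2, seq=9, tags=12)
```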

The Context-Dependent Additive Recurrent Neural Net

Title The Context-Dependent Additive Recurrent Neural Net
Authors Quan Hung Tran, Tuan Lai, Gholamreza Haffari, Ingrid Zukerman, Trung Bui, Hung Bui
Abstract Contextual sequence mapping is one of the fundamental problems in Natural Language Processing (NLP). Here, instead of relying solely on the information presented in the text, the learning agent has access to a strong external signal that assists the learning process. In this paper, we propose a novel family of Recurrent Neural Network units, the Context-dependent Additive Recurrent Neural Network (CARNN), designed specifically to address this type of problem. Experimental results on public datasets for the dialog problem (Babi dialog Task 6 and Frame), contextual language modelling (Switchboard and Penn Tree Bank) and question answering (Trec QA) show that our novel CARNN-based architectures outperform previous methods.
Tasks Language Modelling, Question Answering
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1115/
PDF https://www.aclweb.org/anthology/N18-1115
PWC https://paperswithcode.com/paper/the-context-dependent-additive-recurrent
Repo
Framework
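
The abstract does not spell out the cell equations, so the snippet below only conveys the flavour of a context-dependent additive recurrent unit: the new hidden state is an additive blend of the transformed input and the previous state, with gates computed from the external context signal. Treat it as an illustrative cell, not the published CARNN definition.

```python
# Illustrative context-dependent additive recurrent cell (not the exact CARNN equations).
import torch
import torch.nn as nn

class ContextAdditiveCell(nn.Module):
    def __init__(self, input_dim, hidden_dim, context_dim):
        super().__init__()
        self.inp = nn.Linear(input_dim, hidden_dim)
        self.gate_x = nn.Linear(input_dim + context_dim, hidden_dim)
        self.gate_h = nn.Linear(input_dim + context_dim, hidden_dim)

    def forward(self, x, h, c):                        # input, previous hidden state, context vector
        z = torch.cat([x, c], dim=-1)
        a = torch.sigmoid(self.gate_x(z))              # how much of the new input to add
        b = torch.sigmoid(self.gate_h(z))              # how much history to keep
        return a * torch.tanh(self.inp(x)) + b * h     # additive update; no full recurrent matrix on h

cell = ContextAdditiveCell(32, 64, 16)
context = torch.rand(4, 16)                            # e.g. an encoding of the dialog state
h = torch.zeros(4, 64)
for x in torch.rand(5, 4, 32):                         # 5 time steps, batch of 4
    h = cell(x, h, context)
```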

Partially Shared Multi-Task Convolutional Neural Network With Local Constraint for Face Attribute Learning

Title Partially Shared Multi-Task Convolutional Neural Network With Local Constraint for Face Attribute Learning
Authors Jiajiong Cao, Yingming Li, Zhongfei Zhang
Abstract In this paper, we study the face attribute learning problem by considering the identity information and attribute relationships simultaneously. In particular, we first introduce a Partially Shared Multi-task Convolutional Neural Network (PS-MCNN), in which four Task Specific Networks (TSNets) and one Shared Network (SNet) are connected by Partially Shared (PS) structures to learn better shared and task specific representations. To utilize identity information to further boost the performance, we introduce a local learning constraint which minimizes the difference between the representations of each sample and its local geometric neighbours with the same identity. Consequently, we present a local constraint regularized multi-task network, called Partially Shared Multi-task Convolutional Neural Network with Local Constraint (PS-MCNN-LC), where PS structure and local constraint are integrated together to help the framework learn better attribute representations. The experimental results on CelebA and LFWA demonstrate the promise of the proposed methods.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Cao_Partially_Shared_Multi-Task_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Cao_Partially_Shared_Multi-Task_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/partially-shared-multi-task-convolutional
Repo
Framework
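
Of the two ingredients, the local constraint is easy to isolate: it pulls each sample's representation toward those of its same-identity neighbours. The sketch below implements that regulariser in a simplified form (neighbours are all same-identity samples in the batch); how it is weighted against the attribute losses and how the PS structure shares layers follow the paper.

```python
# Simplified local constraint: same-identity samples should have similar representations.
import torch

def local_constraint_loss(features, identities):
    """features: (N, D) representations; identities: (N,) identity ids."""
    loss, groups = features.new_zeros(()), 0
    for ident in identities.unique():
        idx = (identities == ident).nonzero(as_tuple=True)[0]
        if len(idx) < 2:
            continue
        group = features[idx]
        center = group.mean(dim=0, keepdim=True)
        loss = loss + ((group - center) ** 2).sum(dim=1).mean()
        groups += 1
    return loss / max(groups, 1)

reps = torch.rand(8, 128)
ids = torch.tensor([0, 0, 1, 1, 1, 2, 2, 3])
reg = local_constraint_loss(reps, ids)   # added to the multi-task attribute losses
```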

SSNet: Scale Selection Network for Online 3D Action Prediction

Title SSNet: Scale Selection Network for Online 3D Action Prediction
Authors Jun Liu, Amir Shahroudy, Gang Wang, Ling-Yu Duan, Alex C. Kot
Abstract In action prediction (early action recognition), the goal is to predict the class label of an ongoing action using its observed part so far. In this paper, we focus on online action prediction in streaming 3D skeleton sequences. A dilated convolutional network is introduced to model the motion dynamics in temporal dimension via a sliding window over the time axis. As there are significant temporal scale variations of the observed part of the ongoing action at different progress levels, we propose a novel window scale selection scheme to make our network focus on the performed part of the ongoing action and try to suppress the noise from the previous actions at each time step. Furthermore, an activation sharing scheme is proposed to deal with the overlapping computations among the adjacent steps, which allows our model to run more efficiently. The extensive experiments on two challenging datasets show the effectiveness of the proposed action prediction framework.
Tasks Temporal Action Localization
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Liu_SSNet_Scale_Selection_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_SSNet_Scale_Selection_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/ssnet-scale-selection-network-for-online-3d
Repo
Framework
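
The sketch below captures the two ingredients named in the abstract, with assumed sizes: a stack of dilated temporal convolutions over the most recent skeleton frames, plus a small head that predicts the temporal scale (how much of the window belongs to the ongoing action). The actual scale-selection scheme in the paper is more involved.

```python
# Hedged sketch of dilated temporal convolutions with an action head and a scale head.
import torch
import torch.nn as nn

class ScaleSelectSketch(nn.Module):
    def __init__(self, feat_dim, n_classes, channels=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv1d(feat_dim, channels, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=4, dilation=4), nn.ReLU(),
        )
        self.cls = nn.Linear(channels, n_classes)    # label of the ongoing action
        self.scale = nn.Linear(channels, 1)          # fraction of the window covered by that action

    def forward(self, window):                       # (batch, feat_dim, T): most recent frames
        h = self.backbone(window)[:, :, -1]          # features at the current time step
        return self.cls(h), torch.sigmoid(self.scale(h))

net = ScaleSelectSketch(feat_dim=75, n_classes=20)   # e.g. 25 joints x 3 coordinates per frame
logits, scale = net(torch.rand(2, 75, 48))
```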

Weakly Supervised Region Proposal Network and Object Detection

Title Weakly Supervised Region Proposal Network and Object Detection
Authors Peng Tang, Xinggang Wang, Angtian Wang, Yongluan Yan, Wenyu Liu, Junzhou Huang, Alan Yuille
Abstract The Convolutional Neural Network (CNN) based region proposal generation method (i.e., the region proposal network), trained using bounding box annotations, is an essential component in modern fully supervised object detectors. However, Weakly Supervised Object Detection (WSOD) has not benefited from CNN-based proposal generation due to the absence of bounding box annotations, and still relies on standard proposal generation methods such as selective search. In this paper, we propose a weakly supervised region proposal network which is trained using only image-level annotations. The weakly supervised region proposal network consists of two stages. The first stage evaluates the objectness scores of sliding-window boxes by exploiting the low-level information in the CNN, and the second stage refines the proposals from the first stage using a region-based CNN classifier. Our proposed region proposal network is suitable for WSOD, can be plugged into a WSOD network easily, and can share its convolutional computations with the WSOD network. Experiments on the PASCAL VOC and ImageNet detection datasets show that our method achieves state-of-the-art performance for WSOD, with a performance gain of about 3% on average.
Tasks Object Detection, Weakly Supervised Object Detection
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Peng_Tang_Weakly_Supervised_Region_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Peng_Tang_Weakly_Supervised_Region_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-region-proposal-network-and
Repo
Framework
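
The first stage scores sliding windows from low-level evidence; the toy function below approximates that idea with a plain edge map (boxes whose borders align with strong edges score higher). It is only a caricature of stage one; the actual method uses the CNN's low-level responses and a second, region-based refinement stage.

```python
# Toy stand-in for the first-stage objectness score over sliding windows.
import numpy as np

def objectness(edge_map, box):
    """edge_map: (H, W) low-level edge response; box: (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    border = np.concatenate([
        edge_map[y0, x0:x1], edge_map[y1 - 1, x0:x1],
        edge_map[y0:y1, x0], edge_map[y0:y1, x1 - 1],
    ])
    inside = edge_map[y0:y1, x0:x1]
    return float(border.mean() - 0.5 * inside.mean())   # strong borders, quiet interior

edges = np.random.rand(120, 160)
boxes = [(10, 10, 60, 60), (40, 30, 100, 90)]
ranked = sorted(boxes, key=lambda b: objectness(edges, b), reverse=True)
```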

ECNU at SemEval-2018 Task 1: Emotion Intensity Prediction Using Effective Features and Machine Learning Models

Title ECNU at SemEval-2018 Task 1: Emotion Intensity Prediction Using Effective Features and Machine Learning Models
Authors Huimin Xu, Man Lan, Yuanbin Wu
Abstract This paper describes our submissions to SemEval 2018 Task 1. The task is affect intensity prediction in tweets and comprises five subtasks; we participated in all subtasks for English tweets. We extracted several traditional NLP, sentiment lexicon, emotion lexicon and domain-specific features from tweets, and adopted supervised machine learning algorithms to perform emotion intensity prediction.
Tasks Emotion Classification, Feature Engineering, Sentiment Analysis, Tokenization
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1035/
PDF https://www.aclweb.org/anthology/S18-1035
PWC https://paperswithcode.com/paper/ecnu-at-semeval-2018-task-1-emotion-intensity
Repo
Framework
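
A feature union of sparse n-gram features and dense lexicon counts feeding a regressor reproduces the general shape of such a system. The lexicon, features and learner below are placeholders chosen for brevity, not the system's actual configuration.

```python
# Sketch of a feature-based intensity regressor (toy lexicon and features).
import numpy as np
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVR

JOY_LEXICON = {"happy", "glad", "delighted", "yay"}

def lexicon_counts(tweets):
    return np.array([[sum(w in JOY_LEXICON for w in t.lower().split())] for t in tweets])

features = FeatureUnion([
    ("ngrams", TfidfVectorizer(ngram_range=(1, 2))),
    ("lexicon", FunctionTransformer(lexicon_counts)),
])
model = make_pipeline(features, SVR())

tweets = ["so happy today yay", "this is awful", "glad it worked out"]
intensity = [0.9, 0.1, 0.7]
model.fit(tweets, intensity)
print(model.predict(["feeling delighted"]))
```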

A State-transition Framework to Answer Complex Questions over Knowledge Base

Title A State-transition Framework to Answer Complex Questions over Knowledge Base
Authors Sen Hu, Lei Zou, Xinbo Zhang
Abstract Although natural language question answering over knowledge graphs has been studied in the literature, existing methods have some limitations in answering complex questions. To address this, we propose a State Transition-based approach to translate a complex natural language question N into a semantic query graph (SQG), which is then matched against the underlying knowledge graph to find the answers to question N. To generate the SQG, we propose four primitive operations (expand, fold, connect and merge) and a learning-based state transition approach. Extensive experiments on several benchmarks (such as QALD, WebQuestions and ComplexQuestions) with two knowledge bases (DBpedia and Freebase) confirm the superiority of our approach compared with state-of-the-art methods.
Tasks Knowledge Graphs, Question Answering
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1234/
PDF https://www.aclweb.org/anthology/D18-1234
PWC https://paperswithcode.com/paper/a-state-transition-framework-to-answer
Repo
Framework
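
The four primitive operations lend themselves to a small illustration on a toy semantic query graph. The class below is a simplification invented for this post; the paper's states, operation semantics and learned transition policy are richer.

```python
# Toy semantic query graph with the four primitive operations from the abstract.
class SQG:
    def __init__(self):
        self.nodes, self.edges = set(), set()

    def expand(self, node, relation, new_node):       # grow the graph from an existing node
        self.nodes |= {node, new_node}
        self.edges.add((node, relation, new_node))

    def connect(self, a, relation, b):                # link two nodes already in the graph
        self.edges.add((a, relation, b))

    def merge(self, a, b):                            # collapse two nodes denoting the same entity
        rename = lambda x: a if x == b else x
        self.edges = {(rename(s), r, rename(o)) for (s, r, o) in self.edges}
        self.nodes.discard(b)

    def fold(self, node):                             # drop a node that turned out to be redundant
        self.nodes.discard(node)
        self.edges = {e for e in self.edges if node not in (e[0], e[2])}

# "Who directed the films starring Tom Hanks?"
g = SQG()
g.expand("?film", "starring", "Tom_Hanks")
g.expand("?film", "directed_by", "?who")   # the SQG is then matched against the knowledge graph
```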

Deep Attention Neural Tensor Network for Visual Question Answering

Title Deep Attention Neural Tensor Network for Visual Question Answering
Authors Yalong Bai, Jianlong Fu, Tiejun Zhao, Tao Mei
Abstract Visual question answering (VQA) has drawn great attention in cross-modal learning problems, which enables a machine to answer a natural language question given a reference image. Significant progress has been made by learning rich embedding features from images and questions with bilinear models, while neglecting the key role of answers. In this paper, we propose a novel deep attention neural tensor network (DA-NTN) for visual question answering, which can discover the joint correlations over images, questions and answers with tensor-based representations. First, we model one of the pairwise interactions (e.g., image and question) by bilinear features, which are further encoded with the third dimension (e.g., answer) into a triplet by a bilinear tensor product. Second, we decompose the correlation of different triplets by answer and question types, and further propose a slice-wise attention module on the tensor to select the most discriminative reasoning process for inference. Third, we optimize the proposed DA-NTN by learning a label regression with KL-divergence losses. Such a design enables scalable training and fast convergence over a large answer set. We integrate the proposed DA-NTN structure into state-of-the-art VQA models (e.g., MLB and MUTAN). Extensive experiments demonstrate superior accuracy over the original MLB and MUTAN models, with 1.98% and 1.70% relative improvements on the VQA-2.0 dataset, respectively.
Tasks Deep Attention, Question Answering, Visual Question Answering
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Yalong_Bai_Deep_Attention_Neural_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Yalong_Bai_Deep_Attention_Neural_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/deep-attention-neural-tensor-network-for
Repo
Framework
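
The tensor part is the most concrete piece of the abstract: each tensor slice scores a fused image-question feature against an answer embedding, and an attention over slices selects the most relevant one. The sketch below implements that slice-wise bilinear scoring with assumed dimensions; the full DA-NTN, its answer/question-type decomposition and the KL-based label regression are in the paper.

```python
# Sketch of slice-wise bilinear tensor scoring with attention over slices.
import torch
import torch.nn as nn

class TripletTensorScorer(nn.Module):
    def __init__(self, vq_dim, ans_dim, n_slices=8):
        super().__init__()
        self.T = nn.Parameter(torch.randn(n_slices, vq_dim, ans_dim) * 0.01)
        self.attn = nn.Linear(vq_dim + ans_dim, n_slices)

    def forward(self, vq, ans):                        # fused image-question feature, answer embedding
        slice_scores = torch.einsum("bd,kde,be->bk", vq, self.T, ans)   # one score per tensor slice
        weights = torch.softmax(self.attn(torch.cat([vq, ans], dim=-1)), dim=-1)
        return (weights * slice_scores).sum(dim=-1)    # relevance of the (image, question, answer) triplet

scorer = TripletTensorScorer(vq_dim=512, ans_dim=300)
score = scorer(torch.rand(4, 512), torch.rand(4, 300))  # (4,)
```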

Classifier-to-Generator Attack: Estimation of Training Data Distribution from Classifier

Title Classifier-to-Generator Attack: Estimation of Training Data Distribution from Classifier
Authors Kosuke Kusano, Jun Sakuma
Abstract Suppose a deep classification model is trained with samples that need to be kept private for privacy or confidentiality reasons. In this setting, can an adversary obtain the private samples if the classification model is given to the adversary? We call this reverse engineering against the classification model the Classifier-to-Generator (C2G) Attack. This situation arises when the classification model is embedded into mobile devices for offline prediction (e.g., object recognition for self-driving cars and face recognition for mobile phone authentication). For the C2G attack, we introduce a novel GAN, PreImageGAN. In PreImageGAN, the generator is designed to estimate the sample distribution conditioned on the preimage of the classification model $f$, $P(X \mid f(X)=y)$, where $X$ is the random variable on the sample space and $y$ is the probability vector representing the target label arbitrarily specified by the adversary. In experiments, we demonstrate that PreImageGAN works successfully with hand-written character recognition and face recognition. In character recognition, we show that, given a recognition model of hand-written digits, PreImageGAN allows the adversary to extract alphabet letter images without knowing that the model is built for alphabet letter images. In face recognition, we show that, when an adversary obtains a face recognition model for a set of individuals, PreImageGAN allows the adversary to extract face images of specific individuals contained in the set, even when the adversary has no knowledge of the faces of those individuals.
Tasks Face Recognition, Object Recognition
Published 2018-01-01
URL https://openreview.net/forum?id=SJOl4DlCZ
PDF https://openreview.net/pdf?id=SJOl4DlCZ
PWC https://paperswithcode.com/paper/classifier-to-generator-attack-estimation-of
Repo
Framework
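
The conditioning trick is the part worth sketching: the generator receives, besides noise, the probability vector y that the adversary wants the target classifier to output, and is trained so that generated samples land in the preimage of y. Below is only a generator skeleton with placeholder sizes; the discriminator, the target classifier f and the training loop are omitted.

```python
# Skeleton of a generator conditioned on a target label-probability vector y.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, noise_dim, label_dim, img_pixels=28 * 28):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + label_dim, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, img_pixels), nn.Tanh(),
        )

    def forward(self, z, y):
        # y is the output the adversary wants the target classifier f to produce,
        # e.g. a one-hot (or softened) vector for the chosen class.
        return self.net(torch.cat([z, y], dim=-1))

G = ConditionalGenerator(noise_dim=64, label_dim=10)
y_target = torch.zeros(16, 10)
y_target[:, 3] = 1.0                                  # ask for samples the classifier maps to class 3
candidates = G(torch.randn(16, 64), y_target)         # candidate preimages with f(x) close to y_target
```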

A Transformer-Based Multi-Source Automatic Post-Editing System

Title A Transformer-Based Multi-Source Automatic Post-Editing System
Authors Santanu Pal, Nico Herbig, Antonio Krüger, Josef van Genabith
Abstract This paper presents our English–German Automatic Post-Editing (APE) system submitted to the APE Task organized at WMT 2018 (Chatterjee et al., 2018). The proposed model is an extension of the transformer architecture: two separate self-attention-based encoders encode the machine translation output (mt) and the source (src), followed by a joint encoder that attends over a combination of these two encoded sequences (encsrc and encmt) for generating the post-edited sentence. We compare this multi-source architecture (i.e., {src, mt} → pe) to a monolingual transformer (i.e., mt → pe) model and an ensemble combining the multi-source {src, mt} → pe and single-source mt → pe models. For both the PBSMT and the NMT task, the ensemble yields the best results, followed by the multi-source model and last the single-source approach. Our best model, the ensemble, achieves a BLEU score of 66.16 and 74.22 for the PBSMT and NMT task, respectively.
Tasks Automatic Post-Editing, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6468/
PDF https://www.aclweb.org/anthology/W18-6468
PWC https://paperswithcode.com/paper/a-transformer-based-multi-source-automatic
Repo
Framework
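
The multi-source layout described in the abstract (separate encoders for src and mt, a joint encoder over both, a decoder generating pe) can be wired up compactly with standard transformer modules. The sketch below is a rough approximation with assumed hyperparameters; it omits positional encodings, masking and the ensembling.

```python
# Rough sketch of a multi-source {src, mt} -> pe transformer.
import torch
import torch.nn as nn

class MultiSourceAPE(nn.Module):
    def __init__(self, vocab, d_model=256, nhead=8, layers=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_model)
        make_enc = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), layers)
        self.enc_src, self.enc_mt, self.enc_joint = make_enc(), make_enc(), make_enc()
        self.dec = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), layers)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, src_ids, mt_ids, pe_ids):
        enc_src = self.enc_src(self.emb(src_ids))                       # encode the source sentence
        enc_mt = self.enc_mt(self.emb(mt_ids))                          # encode the MT output
        memory = self.enc_joint(torch.cat([enc_src, enc_mt], dim=1))    # joint encoding of both
        return self.out(self.dec(self.emb(pe_ids), memory))             # logits for the post-edit

model = MultiSourceAPE(vocab=1000)
logits = model(torch.randint(0, 1000, (2, 12)),
               torch.randint(0, 1000, (2, 11)),
               torch.randint(0, 1000, (2, 10)))
```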