October 15, 2019

Paper Group NANR 49

Visual Grounding via Accumulated Attention. Semantic Pleonasm Detection. Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset. Gated ConvNets for Letter-Based ASR. A Self-Organizing Memory Network. Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation. Findings of the WMT 2018 Shared Task on Parall …

Visual Grounding via Accumulated Attention


Title	Visual Grounding via Accumulated Attention
Authors	Chaorui Deng, Qi Wu, Qingyao Wu, Fuyuan Hu, Fan Lyu, Mingkui Tan
Abstract	Visual Grounding (VG) aims to locate the most relevant object or region in an image, based on a natural language query. The query can be a phrase, a sentence or even a multi-round dialogue. There are three main challenges in VG: 1) what is the main focus in a query; 2) how to understand an image; 3) how to locate an object. Most existing methods combine all the information curtly, which may suffer from the problem of information redundancy (i.e. ambiguous query, complicated image and a large number of objects). In this paper, we formulate these challenges as three attention problems and propose an accumulated attention (A-ATT) mechanism to reason among them jointly. Our A-ATT mechanism can circularly accumulate the attention for useful information in image, query, and objects, while the noises are ignored gradually. We evaluate the performance of A-ATT on four popular datasets (namely ReferCOCO, ReferCOCO+, ReferCOCOg, and Guesswhat?!), and the experimental results show the superiority of the proposed method in terms of accuracy.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Deng_Visual_Grounding_via_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Deng_Visual_Grounding_via_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/visual-grounding-via-accumulated-attention
Repo
Framework

Semantic Pleonasm Detection


Title	Semantic Pleonasm Detection
Authors	Omid Kashefi, Andrew T. Lucas, Rebecca Hwa
Abstract	Pleonasms are words that are redundant. To aid the development of systems that detect pleonasms in text, we introduce an annotated corpus of semantic pleonasms. We validate the integrity of the corpus with interannotator agreement analyses. We also compare it against alternative resources in terms of their effects on several automatic redundancy detection methods.
Tasks	Machine Translation
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2036/
PDF	https://www.aclweb.org/anthology/N18-2036
PWC	https://paperswithcode.com/paper/semantic-pleonasm-detection
Repo
Framework

Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset


Title	Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset
Authors	Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, Lorenzo Torresani, Manohar Paluri
Abstract	This paper introduces a large-scale, multi-label and multitask video dataset named Scenes-Objects-Actions (SOA). Most prior video datasets are based on a predefined taxonomy, which is used to define the keyword queries issued to search engines. The videos retrieved by the search engines are then verified for correctness by human annotators. Datasets collected in this manner tend to generate high classification accuracy as search engines typically rank “easy” videos first. The SOA dataset adopts a different approach. We rely on uniform sampling to get a better representation of videos on the Web. Trained annotators are asked to provide free-form text labels describing each video in three different aspects: scene, object and action. These raw labels are then merged, split and renamed to generate a taxonomy for SOA. All the annotations are verified again based on the taxonomy. The final dataset includes 562K videos with 3.64M annotations spanning 49 categories for scenes, 356 for objects, and 148 for actions. We show that datasets collected in this way are quite challenging by evaluating existing popular video models on SOA. We provide in-depth analysis about the performance of different models on SOA, and highlight potential new directions in video classification. A key feature of SOA is that it enables the empirical study of correlation among scene, object and action recognition in video. We present results of this study and further analyze the potential of using the information learned from one task to improve the others. We compare SOA with existing datasets in the context of transfer learning and demonstrate that pre-training on SOA consistently improves the accuracy on a wide variety of datasets. We believe that the challenges presented by SOA offer the opportunity for further advancement in video analysis as we progress from single-label classification towards a more comprehensive understanding of video data.
Tasks	Temporal Action Localization, Transfer Learning
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Heng_Wang_Scenes-Objects-Actions_A_Multi-Task_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Heng_Wang_Scenes-Objects-Actions_A_Multi-Task_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/scenes-objects-actions-a-multi-task-multi
Repo
Framework

Gated ConvNets for Letter-Based ASR


Title	Gated ConvNets for Letter-Based ASR
Authors	Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert
Abstract	In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model. The acoustic model requires only audio transcription for training – no alignment annotations, nor any forced alignment step is needed. At inference, our decoder takes only a word list and a language model, and is fed with letter scores from the acoustic model – no phonetic word lexicon is needed. Key ingredients for the acoustic model are Gated Linear Units and high dropout. We show near state-of-the-art results in word error rate on the LibriSpeech corpus with MFSC features, both on the clean and other configurations.
Tasks	Language Modelling, Speech Recognition
Published	2018-01-01
URL	https://openreview.net/forum?id=Hyig0zb0Z
PDF	https://openreview.net/pdf?id=Hyig0zb0Z
PWC	https://paperswithcode.com/paper/gated-convnets-for-letter-based-asr
Repo
Framework
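
The abstract above names Gated Linear Units as a key ingredient of the acoustic model. The following is a minimal sketch of the gating operation alone, assuming the common formulation in which the input features are split into two halves and one half gates the other through a sigmoid. The function names and the flat-list input are illustrative assumptions, not the authors' implementation, and the abstract does not describe the layer layout.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def glu(x: list[float]) -> list[float]:
    """Gated Linear Unit: split x into halves (a, b), return a * sigmoid(b).

    Assumption: the split is over a flat feature vector; a real ConvNet
    would split along the channel dimension of a convolution output.
    """
    if len(x) % 2 != 0:
        raise ValueError("GLU input must have an even number of features")
    half = len(x) // 2
    a, b = x[:half], x[half:]
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]


if __name__ == "__main__":
    # A gate input of 0.0 passes exactly half of the value through.
    print(glu([1.0, -2.0, 0.0, 0.0]))
```

A gate input that is strongly negative drives the output toward zero, while a strongly positive one passes the value through almost unchanged, which is what lets the layer learn which features to keep.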

A Self-Organizing Memory Network


Title	A Self-Organizing Memory Network
Authors	Callie Federer, Joel Zylberberg
Abstract	Working memory requires information about external stimuli to be represented in the brain even after those stimuli go away. This information is encoded in the activities of neurons, and neural activities change over timescales of tens of milliseconds. Information in working memory, however, is retained for tens of seconds, suggesting the question of how time-varying neural activities maintain stable representations. Prior work shows that, if the neural dynamics are in the “null space” of the representation - so that changes to neural activity do not affect the downstream read-out of stimulus information - then information can be retained for periods much longer than the time-scale of individual-neuronal activities. The prior work, however, requires precisely constructed synaptic connectivity matrices, without explaining how this would arise in a biological neural network. To identify mechanisms through which biological networks can self-organize to learn memory function, we derived biologically plausible synaptic plasticity rules that dynamically modify the connectivity matrix to enable information storing. Networks implementing this plasticity rule can successfully learn to form memory representations even if only 10% of the synapses are plastic, they are robust to synaptic noise, and they can represent information about multiple stimuli.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=Syl3_2JCZ
PDF	https://openreview.net/pdf?id=Syl3_2JCZ
PWC	https://paperswithcode.com/paper/a-self-organizing-memory-network
Repo
Framework

Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation


Title	Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation
Authors	Kai Li, Junliang Xing, Chi Su, Weiming Hu, Yundong Zhang, Stephen Maybank
Abstract	Facial age estimation from a face image is an important yet very challenging task in computer vision, since humans with different races and/or genders, exhibit quite different patterns in their facial aging processes. To deal with the influence of race and gender, previous methods perform age estimation within each population separately. In practice, however, it is often very difficult to collect and label sufficient data for each population. Therefore, it would be helpful to exploit an existing large labeled dataset of one (source) population to improve the age estimation performance on another (target) population with only a small labeled dataset available. In this work, we propose a Deep Cross-Population (DCP) age estimation model to achieve this goal. In particular, our DCP model develops a two-stage training strategy. First, a novel cost-sensitive multi-task loss function is designed to learn transferable aging features by training on the source population. Second, a novel order-preserving pair-wise loss function is designed to align the aging features of the two populations. By doing so, our DCP model can transfer the knowledge encoded in the source population to the target population. Extensive experiments on the two of the largest benchmark datasets show that our DCP model outperforms several strong baseline methods and many state-of-the-art methods.
Tasks	Age Estimation
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Deep_Cost-Sensitive_and_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Deep_Cost-Sensitive_and_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/deep-cost-sensitive-and-order-preserving
Repo
Framework

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering


Title	Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering
Authors	Philipp Koehn, Huda Khayrallah, Kenneth Heafield, Mikel L. Forcada
Abstract	We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems. Seventeen participants from companies, national research labs, and universities participated in this task.
Tasks	Machine Translation, Outlier Detection
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-6453/
PDF	https://www.aclweb.org/anthology/W18-6453
PWC	https://paperswithcode.com/paper/findings-of-the-wmt-2018-shared-task-on-2
Repo
Framework
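
The sub-selection step described in the abstract above can be sketched as a simple rank-and-cut over scored sentence pairs. This is only an illustration: the function name, the `(score, pair)` tuple layout, and the choice to measure the fraction in sentence pairs are assumptions, since the abstract does not say how the 1% and 10% budgets are counted.

```python
def select_top_fraction(scored_pairs, fraction):
    """Keep the highest-scoring share of sentence pairs.

    scored_pairs: list of (score, (source, target)) tuples (hypothetical layout).
    fraction: share of the input to keep, e.g. 0.01 or 0.10.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Always keep at least one pair when the input is non-empty.
    keep = max(1, int(len(scored_pairs) * fraction))
    ranked = sorted(scored_pairs, key=lambda item: item[0], reverse=True)
    return [pair for _, pair in ranked[:keep]]


if __name__ == "__main__":
    # Toy corpus: 100 pairs with made-up quality scores.
    corpus = [(i / 100, (f"src{i}", f"tgt{i}")) for i in range(100)]
    print(select_top_fraction(corpus, 0.01))
    print(len(select_top_fraction(corpus, 0.10)))
```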

CRIM at SemEval-2018 Task 9: A Hybrid Approach to Hypernym Discovery


Title	CRIM at SemEval-2018 Task 9: A Hybrid Approach to Hypernym Discovery
Authors	Gabriel Bernier-Colborne, Caroline Barrière
Abstract	This report describes the system developed by the CRIM team for the hypernym discovery task at SemEval 2018. This system exploits a combination of supervised projection learning and unsupervised pattern-based hypernym discovery. It was ranked first on the 3 sub-tasks for which we submitted results.
Tasks	Hypernym Discovery, Relation Extraction
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1116/
PDF	https://www.aclweb.org/anthology/S18-1116
PWC	https://paperswithcode.com/paper/crim-at-semeval-2018-task-9-a-hybrid-approach
Repo
Framework

Domain Generalization With Adversarial Feature Learning


Title	Domain Generalization With Adversarial Feature Learning
Authors	Haoliang Li, Sinno Jialin Pan, Shiqi Wang, Alex C. Kot
Abstract	In this paper, we tackle the problem of domain generalization: how to learn a generalized feature representation for an “unseen” target domain by taking the advantage of multiple seen source-domain data. We present a novel framework based on adversarial autoencoders to learn a generalized latent feature representation across domains for domain generalization. To be specific, we extend adversarial autoencoders by imposing the Maximum Mean Discrepancy (MMD) measure to align the distributions among different domains, and matching the aligned distribution to an arbitrary prior distribution via adversarial feature learning. In this way, the learned feature representation is supposed to be universal to the seen source domains because of the MMD regularization, and is expected to generalize well on the target domain because of the introduction of the prior distribution. We proposed an algorithm to jointly train different components of our proposed framework. Extensive experiments on various vision tasks demonstrate that our proposed framework can learn better generalized features for the unseen target domain compared with state-of-the-art domain generalization methods.
Tasks	Domain Generalization
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Domain_Generalization_With_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Domain_Generalization_With_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/domain-generalization-with-adversarial
Repo
Framework
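
The abstract above relies on the Maximum Mean Discrepancy (MMD) to align feature distributions across domains. As a rough illustration of the measure itself, the sketch below computes the standard biased estimate of squared MMD between two samples. The RBF kernel, the fixed bandwidth, and all function names are assumptions made for the example; the abstract does not specify the kernel or how the term enters the training loss.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


if __name__ == "__main__":
    domain_a = [[0.0, 0.0], [1.0, 1.0]]   # toy features from one domain
    domain_b = [[5.0, 5.0], [6.0, 6.0]]   # toy features from a shifted domain
    print(mmd_squared(domain_a, domain_a))  # identical samples give zero
    print(mmd_squared(domain_a, domain_b))  # shifted samples give a positive value
```

Minimizing a quantity of this kind over the encoder's outputs pushes features from different source domains toward a common distribution.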

Adversarial reading networks for machine comprehension


Title	Adversarial reading networks for machine comprehension
Authors	Quentin Grail, Julien Perez
Abstract	Machine reading has recently shown remarkable progress thanks to differentiable reasoning models. In this context, End-to-End trainable Memory Networks (MemN2N) have demonstrated promising performance on simple natural language based reasoning tasks such as factual reasoning and basic deduction. However, the task of machine comprehension is currently bound to a supervised setting and available question answering datasets. In this paper we explore the paradigm of adversarial learning and self-play for the task of machine reading comprehension. Inspired by the successful propositions in the domain of game learning, we present a novel approach of training for this task that is based on the definition of a coupled attention-based memory model. On one hand, a reader network is in charge of finding answers regarding a passage of text and a question. On the other hand, a narrator network is in charge of obfuscating spans of text in order to minimize the probability of success of the reader. We evaluated the model on several question-answering corpora. The proposed learning paradigm and associated models present encouraging results.
Tasks	Machine Reading Comprehension, Question Answering, Reading Comprehension
Published	2018-01-01
URL	https://openreview.net/forum?id=Hy3MvSlRW
PDF	https://openreview.net/pdf?id=Hy3MvSlRW
PWC	https://paperswithcode.com/paper/adversarial-reading-networks-for-machine
Repo
Framework

Learning Typed Entailment Graphs with Global Soft Constraints


Title	Learning Typed Entailment Graphs with Global Soft Constraints
Authors	Mohammad Javad Hosseini, Nathanael Chambers, Siva Reddy, Xavier R. Holt, Shay B. Cohen, Mark Johnson, Mark Steedman
Abstract	This paper presents a new method for learning typed entailment graphs from text. We extract predicate-argument structures from multiple-source news corpora, and compute local distributional similarity scores to learn entailments between predicates with typed arguments (e.g., person contracted disease). Previous work has used transitivity constraints to improve local decisions, but these constraints are intractable on large graphs. We instead propose a scalable method that learns globally consistent similarity scores based on new soft constraints that consider both the structures across typed entailment graphs and inside each graph. Learning takes only a few hours to run over 100K predicates and our results show large improvements over local similarity scores on two entailment data sets. We further show improvements over paraphrases and entailments from the Paraphrase Database, and prior state-of-the-art entailment graphs. We show that the entailment graphs improve performance in a downstream task.
Tasks
Published	2018-01-01
URL	https://www.aclweb.org/anthology/Q18-1048/
PDF	https://www.aclweb.org/anthology/Q18-1048
PWC	https://paperswithcode.com/paper/learning-typed-entailment-graphs-with-global
Repo
Framework

Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms


Title	Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms
Authors	Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song
Abstract	Existing black-box attacks on deep neural networks (DNNs) have largely focused on transferability, where an adversarial instance generated for a locally trained model can “transfer” to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model’s class probabilities, which do not rely on transferability. We also propose strategies to decouple the number of queries required to generate each adversarial sample from the dimensionality of the input. An iterative variant of our attack achieves close to 100% attack success rates for both targeted and untargeted attacks on DNNs. We carry out a thorough comparative evaluation of black-box attacks and show that Gradient Estimation attacks achieve attack success rates similar to state-of-the-art white-box attacks on the MNIST and CIFAR-10 datasets. We also apply the Gradient Estimation attacks successfully against real-world classifiers hosted by Clarifai. Further, we evaluate black-box attacks against state-of-the-art defenses based on adversarial training and show that the Gradient Estimation attacks are very effective even against these defenses.
Tasks
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Arjun_Nitin_Bhagoji_Practical_Black-box_Attacks_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Arjun_Nitin_Bhagoji_Practical_Black-box_Attacks_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/practical-black-box-attacks-on-deep-neural
Repo
Framework
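
The core idea in the abstract above, estimating a gradient from query access alone, can be illustrated with two-sided finite differences. The sketch below is a generic numerical technique under stated assumptions, not the authors' attack: `toy_model_score` is a hypothetical stand-in for a black-box model output, the step size `delta` is arbitrary, and the query-reduction strategies mentioned in the abstract are not included (this version spends two queries per input dimension).

```python
def estimate_gradient(query, x, delta=1e-3):
    """Two-sided finite-difference gradient estimate using only calls to query.

    query: black-box function mapping a list of floats to a scalar score.
    x: the input point, as a list of floats.
    """
    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += delta
        minus[i] -= delta
        # Central difference along coordinate i: two queries per dimension.
        grad.append((query(plus) - query(minus)) / (2.0 * delta))
    return grad


def toy_model_score(x):
    """Hypothetical black box; its true gradient is 2 * x."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(estimate_gradient(toy_model_score, [1.0, -2.0, 0.5]))
```

With an estimate like this in hand, an attacker can take the same kind of gradient step a white-box attack would take, which is why the per-dimension query cost is the main practical obstacle.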

Learning to Select: Problem, Solution, and Applications


Title	Learning to Select: Problem, Solution, and Applications
Authors	Heechang Ryu, Donghyun Kim, Hayong Shin
Abstract	We propose a “Learning to Select” problem that selects the best among the flexible size candidates. This makes decisions based not only on the properties of the candidate, but also on the environment to which they belong. For example, job dispatching in the manufacturing factory is a typical “Learning to Select” problem. We propose Variable-Length CNN which combines the classification power using hidden features from CNN and the idea of flexible input from Learning to Rank algorithms. This not only can handle flexible candidates using Dynamic Computation Graph, but also is computationally efficient because it only builds a network with the necessary sizes to fit the situation. We applied the algorithm to the job dispatching problem which uses the dispatching log data obtained from the virtual fine-tuned factory. Our proposed algorithm shows considerably better performance than other comparable algorithms.
Tasks	Learning-To-Rank
Published	2018-01-01
URL	https://openreview.net/forum?id=SJVHY9lCb
PDF	https://openreview.net/pdf?id=SJVHY9lCb
PWC	https://paperswithcode.com/paper/learning-to-select-problem-solution-and
Repo
Framework

Very Large-Scale Global SfM by Distributed Motion Averaging


Title	Very Large-Scale Global SfM by Distributed Motion Averaging
Authors	Siyu Zhu, Runze Zhang, Lei Zhou, Tianwei Shen, Tian Fang, Ping Tan, Long Quan
Abstract	Global Structure-from-Motion (SfM) techniques have demonstrated superior efficiency and accuracy than the conventional incremental approach in many recent studies. This work proposes a divide-and-conquer framework to solve very large global SfM at the scale of millions of images. Specifically, we first divide all images into multiple partitions that preserve strong data association for well posed and parallel local motion averaging. Then, we solve a global motion averaging that determines cameras at partition boundaries and a similarity transformation per partition to register all cameras in a single coordinate frame. Finally, local and global motion averaging are iterated until convergence. Since local camera poses are fixed during the global motion average, we can avoid caching the whole reconstruction in memory at once. This distributed framework significantly enhances the efficiency and robustness of large-scale motion averaging.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Zhu_Very_Large-Scale_Global_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhu_Very_Large-Scale_Global_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/very-large-scale-global-sfm-by-distributed
Repo
Framework

Uncertainty-aware generative models for inferring document class prevalence


Title	Uncertainty-aware generative models for inferring document class prevalence
Authors	Katherine Keith, Brendan O'Connor
Abstract	Prevalence estimation is the task of inferring the relative frequency of classes of unlabeled examples in a group (for example, the proportion of a document collection with positive sentiment). Previous work has focused on aggregating and adjusting discriminative individual classifiers to obtain prevalence point estimates. But imperfect classifier accuracy ought to be reflected in uncertainty over the predicted prevalence for scientifically valid inference. In this work, we present (1) a generative probabilistic modeling approach to prevalence estimation, and (2) the construction and evaluation of prevalence confidence intervals; in particular, we demonstrate that an off-the-shelf discriminative classifier can be given a generative re-interpretation, by backing out an implicit individual-level likelihood function, which can be used to conduct fast and simple group-level Bayesian inference. Empirically, we demonstrate our approach provides better confidence interval coverage than an alternative, and is dramatically more robust to shifts in the class prior between training and testing.
Tasks	Bayesian Inference, Epidemiology
Published	2018-10-01
URL	https://www.aclweb.org/anthology/D18-1487/
PDF	https://www.aclweb.org/anthology/D18-1487
PWC	https://paperswithcode.com/paper/uncertainty-aware-generative-models-for
Repo
Framework
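
One way to read the generative re-interpretation in the abstract above is sketched here. The assumption, which is an interpretation rather than something the abstract states, is that the implicit likelihood of a document under class y is proportional to the classifier's posterior p(y | x) divided by the training class prior. Under that assumption, a posterior over the prevalence theta can be computed on a grid with a uniform prior. All names, the grid size, and the binary-class setting are illustrative.

```python
import math


def prevalence_posterior(pos_probs, train_prior, grid_size=101):
    """Grid posterior over prevalence theta, assuming a uniform prior on theta.

    pos_probs: classifier probabilities p(y=1 | x_i) for each unlabeled document.
    train_prior: fraction of positive examples in the classifier's training data.
    Assumed likelihood per document:
        theta * p / train_prior + (1 - theta) * (1 - p) / (1 - train_prior)
    """
    if not 0.0 < train_prior < 1.0:
        raise ValueError("train_prior must be strictly between 0 and 1")
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for theta in thetas:
        ll = 0.0
        for p in pos_probs:
            lik = (theta * p / train_prior
                   + (1.0 - theta) * (1.0 - p) / (1.0 - train_prior))
            ll += math.log(lik) if lik > 0.0 else float("-inf")
        log_post.append(ll)
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return thetas, [w / total for w in weights]


def credible_interval(thetas, posterior, mass=0.95):
    """Central interval read off the cumulative grid posterior."""
    tail = (1.0 - mass) / 2.0
    cum, lower, upper, found_lower = 0.0, thetas[0], thetas[-1], False
    for t, p in zip(thetas, posterior):
        cum += p
        if not found_lower and cum >= tail:
            lower, found_lower = t, True
        if cum >= 1.0 - tail:
            upper = t
            break
    return lower, upper


if __name__ == "__main__":
    probs = [0.9] * 10 + [0.1] * 10  # made-up classifier outputs
    thetas, post = prevalence_posterior(probs, train_prior=0.5)
    print(thetas[post.index(max(post))], credible_interval(thetas, post))
```

The interval widens when the classifier's probabilities are close to the training prior, which is the sense in which imperfect classifier accuracy shows up as uncertainty over the prevalence.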