Paper Group NANR 49
Visual Grounding via Accumulated Attention. Semantic Pleonasm Detection. Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset. Gated ConvNets for Letter-Based ASR. A Self-Organizing Memory Network. Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation. Findings of the WMT 2018 Shared Task on Parall …
Visual Grounding via Accumulated Attention
Title | Visual Grounding via Accumulated Attention |
Authors | Chaorui Deng, Qi Wu, Qingyao Wu, Fuyuan Hu, Fan Lyu, Mingkui Tan |
Abstract | Visual Grounding (VG) aims to locate the most relevant object or region in an image, based on a natural language query. The query can be a phrase, a sentence or even a multi-round dialogue. There are three main challenges in VG: 1) what is the main focus in a query; 2) how to understand an image; 3) how to locate an object. Most existing methods combine all the information curtly, which may suffer from the problem of information redundancy (i.e. ambiguous query, complicated image and a large number of objects). In this paper, we formulate these challenges as three attention problems and propose an accumulated attention (A-ATT) mechanism to reason among them jointly. Our A-ATT mechanism can circularly accumulate the attention for useful information in image, query, and objects, while the noises are ignored gradually. We evaluate the performance of A-ATT on four popular datasets (namely ReferCOCO, ReferCOCO+, ReferCOCOg, and Guesswhat?!), and the experimental results show the superiority of the proposed method in terms of accuracy. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Deng_Visual_Grounding_via_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Deng_Visual_Grounding_via_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/visual-grounding-via-accumulated-attention |
Repo | |
Framework | |
Semantic Pleonasm Detection
Title | Semantic Pleonasm Detection |
Authors | Omid Kashefi, Andrew T. Lucas, Rebecca Hwa |
Abstract | Pleonasms are words that are redundant. To aid the development of systems that detect pleonasms in text, we introduce an annotated corpus of semantic pleonasms. We validate the integrity of the corpus with interannotator agreement analyses. We also compare it against alternative resources in terms of their effects on several automatic redundancy detection methods. |
Tasks | Machine Translation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2036/ |
https://www.aclweb.org/anthology/N18-2036 | |
PWC | https://paperswithcode.com/paper/semantic-pleonasm-detection |
Repo | |
Framework | |
Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset
Title | Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset |
Authors | Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, Lorenzo Torresani, Manohar Paluri |
Abstract | This paper introduces a large-scale, multi-label and multitask video dataset named Scenes-Objects-Actions (SOA). Most prior video datasets are based on a predefined taxonomy, which is used to define the keyword queries issued to search engines. The videos retrieved by the search engines are then verified for correctness by human annotators. Datasets collected in this manner tend to generate high classification accuracy as search engines typically rank “easy” videos first. The SOA dataset adopts a different approach. We rely on uniform sampling to get a better representation of videos on the Web. Trained annotators are asked to provide free-form text labels describing each video in three different aspects: scene, object and action. These raw labels are then merged, split and renamed to generate a taxonomy for SOA. All the annotations are verified again based on the taxonomy. The final dataset includes 562K videos with 3.64M annotations spanning 49 categories for scenes, 356 for objects, and 148 for actions. We show that datasets collected in this way are quite challenging by evaluating existing popular video models on SOA. We provide in-depth analysis about the performance of different models on SOA, and highlight potential new directions in video classification. A key feature of SOA is that it enables the empirical study of correlation among scene, object and action recognition in video. We present results of this study and further analyze the potential of using the information learned from one task to improve the others. We compare SOA with existing datasets in the context of transfer learning and demonstrate that pre-training on SOA consistently improves the accuracy on a wide variety of datasets. We believe that the challenges presented by SOA offer the opportunity for further advancement in video analysis as we progress from single-label classification towards a more comprehensive understanding of video data. |
Tasks | Temporal Action Localization, Transfer Learning |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Heng_Wang_Scenes-Objects-Actions_A_Multi-Task_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Heng_Wang_Scenes-Objects-Actions_A_Multi-Task_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/scenes-objects-actions-a-multi-task-multi |
Repo | |
Framework | |
Gated ConvNets for Letter-Based ASR
Title | Gated ConvNets for Letter-Based ASR |
Authors | Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert |
Abstract | In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model. The acoustic model requires only audio transcription for training – no alignment annotations, nor any forced alignment step is needed. At inference, our decoder takes only a word list and a language model, and is fed with letter scores from the acoustic model – no phonetic word lexicon is needed. Key ingredients for the acoustic model are Gated Linear Units and high dropout. We show near state-of-the-art results in word error rate on the LibriSpeech corpus with MFSC features, both on the clean and other configurations. |
Tasks | Language Modelling, Speech Recognition |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Hyig0zb0Z |
https://openreview.net/pdf?id=Hyig0zb0Z | |
PWC | https://paperswithcode.com/paper/gated-convnets-for-letter-based-asr |
Repo | |
Framework | |
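
The abstract above names Gated Linear Units as a key ingredient of the acoustic model. As a rough illustration only, the sketch below applies the commonly used GLU formulation (the first half of the features multiplied by a sigmoid of the second half) to a flat feature vector. The function names `sigmoid` and `glu`, and the choice to split a single plain list, are assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def glu(features: Sequence[float]) -> List[float]:
    """Gated Linear Unit on a flat vector (hypothetical helper).

    The first half of the vector carries values, the second half
    carries gates; each value is scaled by sigmoid(gate).
    """
    if len(features) % 2 != 0:
        raise ValueError("GLU needs an even number of features")
    half = len(features) // 2
    values, gates = features[:half], features[half:]
    return [v * sigmoid(g) for v, g in zip(values, gates)]


# Gates of 0.0 pass exactly half of each value through.
print(glu([1.0, 2.0, 0.0, 0.0]))
```

In a convolutional layer the same gating would typically be applied along the channel dimension, so the layer produces twice as many channels as it outputs.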
A Self-Organizing Memory Network
Title | A Self-Organizing Memory Network |
Authors | Callie Federer, Joel Zylberberg |
Abstract | Working memory requires information about external stimuli to be represented in the brain even after those stimuli go away. This information is encoded in the activities of neurons, and neural activities change over timescales of tens of milliseconds. Information in working memory, however, is retained for tens of seconds, suggesting the question of how time-varying neural activities maintain stable representations. Prior work shows that, if the neural dynamics are in the “null space” of the representation - so that changes to neural activity do not affect the downstream read-out of stimulus information - then information can be retained for periods much longer than the time-scale of individual-neuronal activities. The prior work, however, requires precisely constructed synaptic connectivity matrices, without explaining how this would arise in a biological neural network. To identify mechanisms through which biological networks can self-organize to learn memory function, we derived biologically plausible synaptic plasticity rules that dynamically modify the connectivity matrix to enable information storing. Networks implementing this plasticity rule can successfully learn to form memory representations even if only 10% of the synapses are plastic, they are robust to synaptic noise, and they can represent information about multiple stimuli. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Syl3_2JCZ |
https://openreview.net/pdf?id=Syl3_2JCZ | |
PWC | https://paperswithcode.com/paper/a-self-organizing-memory-network |
Repo | |
Framework | |
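
The null-space idea in the abstract above can be illustrated with a toy linear read-out. The sketch below assumes a single read-out vector and removes from an arbitrary activity change the component that would alter the read-out. The names `readout` and `project_to_null_space` are hypothetical, and this is not the plasticity rule derived in the paper, only the geometric property that prior work relies on.

```python
from typing import List, Sequence


def readout(weights: Sequence[float], activity: Sequence[float]) -> float:
    """Linear read-out of stimulus information from neural activity."""
    return sum(w * a for w, a in zip(weights, activity))


def project_to_null_space(weights: Sequence[float],
                          change: Sequence[float]) -> List[float]:
    """Remove the part of `change` that the read-out can see."""
    norm_sq = sum(w * w for w in weights)
    if norm_sq == 0.0:
        return list(change)  # a zero read-out sees nothing anyway
    scale = readout(weights, change) / norm_sq
    return [c - scale * w for c, w in zip(change, weights)]


weights = [1.0, 2.0, 2.0]          # assumed read-out vector
activity = [0.5, 1.0, -0.5]        # assumed current neural activity
drift = project_to_null_space(weights, [3.0, -1.0, 4.0])
new_activity = [a + d for a, d in zip(activity, drift)]

# Activity changed, but the decoded value did not.
print(readout(weights, activity), readout(weights, new_activity))
```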
Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation
Title | Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation |
Authors | Kai Li, Junliang Xing, Chi Su, Weiming Hu, Yundong Zhang, Stephen Maybank |
Abstract | Facial age estimation from a face image is an important yet very challenging task in computer vision, since humans of different races and/or genders exhibit quite different patterns in their facial aging processes. To deal with the influence of race and gender, previous methods perform age estimation within each population separately. In practice, however, it is often very difficult to collect and label sufficient data for each population. Therefore, it would be helpful to exploit an existing large labeled dataset of one (source) population to improve the age estimation performance on another (target) population with only a small labeled dataset available. In this work, we propose a Deep Cross-Population (DCP) age estimation model to achieve this goal. In particular, our DCP model develops a two-stage training strategy. First, a novel cost-sensitive multi-task loss function is designed to learn transferable aging features by training on the source population. Second, a novel order-preserving pair-wise loss function is designed to align the aging features of the two populations. By doing so, our DCP model can transfer the knowledge encoded in the source population to the target population. Extensive experiments on two of the largest benchmark datasets show that our DCP model outperforms several strong baseline methods and many state-of-the-art methods. |
Tasks | Age Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Deep_Cost-Sensitive_and_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Deep_Cost-Sensitive_and_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/deep-cost-sensitive-and-order-preserving |
Repo | |
Framework | |
Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering
Title | Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering |
Authors | Philipp Koehn, Huda Khayrallah, Kenneth Heafield, Mikel L. Forcada |
Abstract | We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems. Seventeen participants from companies, national research labs, and universities participated in this task. |
Tasks | Machine Translation, Outlier Detection |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6453/ |
https://www.aclweb.org/anthology/W18-6453 | |
PWC | https://paperswithcode.com/paper/findings-of-the-wmt-2018-shared-task-on-2 |
Repo | |
Framework | |
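
The sub-selection step described in the abstract above (score every sentence pair, then keep the best 1% or 10%) can be sketched as follows. This is a minimal illustration that assumes the fraction is measured in sentence pairs and that a higher score means higher quality. The function name `select_top_fraction` and the toy data are hypothetical and do not come from the shared task.

```python
from typing import List, Sequence, Tuple

ScoredPair = Tuple[float, Tuple[str, str]]  # (quality score, (source, target))


def select_top_fraction(scored_pairs: Sequence[ScoredPair],
                        fraction: float) -> List[Tuple[str, str]]:
    """Keep the highest-scoring `fraction` of sentence pairs."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not scored_pairs:
        return []
    ranked = sorted(scored_pairs, key=lambda item: item[0], reverse=True)
    keep = max(1, int(len(ranked) * fraction))  # always keep at least one pair
    return [pair for _, pair in ranked[:keep]]


toy_corpus = [
    (0.91, ("guten Morgen", "good morning")),
    (0.12, ("klicken Sie hier", "buy cheap watches")),
    (0.78, ("vielen Dank", "thank you very much")),
    (0.05, ("%%%", "###")),
]
print(select_top_fraction(toy_corpus, 0.5))
```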
CRIM at SemEval-2018 Task 9: A Hybrid Approach to Hypernym Discovery
Title | CRIM at SemEval-2018 Task 9: A Hybrid Approach to Hypernym Discovery |
Authors | Gabriel Bernier-Colborne, Caroline Barrière |
Abstract | This report describes the system developed by the CRIM team for the hypernym discovery task at SemEval 2018. This system exploits a combination of supervised projection learning and unsupervised pattern-based hypernym discovery. It was ranked first on the 3 sub-tasks for which we submitted results. |
Tasks | Hypernym Discovery, Relation Extraction |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1116/ |
https://www.aclweb.org/anthology/S18-1116 | |
PWC | https://paperswithcode.com/paper/crim-at-semeval-2018-task-9-a-hybrid-approach |
Repo | |
Framework | |
Domain Generalization With Adversarial Feature Learning
Title | Domain Generalization With Adversarial Feature Learning |
Authors | Haoliang Li, Sinno Jialin Pan, Shiqi Wang, Alex C. Kot |
Abstract | In this paper, we tackle the problem of domain generalization: how to learn a generalized feature representation for an “unseen” target domain by taking advantage of multiple seen source-domain data. We present a novel framework based on adversarial autoencoders to learn a generalized latent feature representation across domains for domain generalization. To be specific, we extend adversarial autoencoders by imposing the Maximum Mean Discrepancy (MMD) measure to align the distributions among different domains, and matching the aligned distribution to an arbitrary prior distribution via adversarial feature learning. In this way, the learned feature representation is supposed to be universal to the seen source domains because of the MMD regularization, and is expected to generalize well on the target domain because of the introduction of the prior distribution. We propose an algorithm to jointly train different components of our proposed framework. Extensive experiments on various vision tasks demonstrate that our proposed framework can learn better generalized features for the unseen target domain compared with state-of-the-art domain generalization methods. |
Tasks | Domain Generalization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Domain_Generalization_With_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Domain_Generalization_With_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/domain-generalization-with-adversarial |
Repo | |
Framework | |
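
The Maximum Mean Discrepancy measure mentioned in the abstract above can be sketched for two small sets of feature vectors. The example below computes the standard biased estimate of squared MMD with a Gaussian (RBF) kernel. The kernel choice, the `bandwidth` parameter and the function names are assumptions for illustration. The paper aligns several source domains inside an adversarial autoencoder, which this two-sample sketch does not attempt to reproduce.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, bandwidth: float = 1.0) -> float:
    """Gaussian kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector],
                bandwidth: float = 1.0) -> float:
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a: Sequence[Vector], b: Sequence[Vector]) -> float:
        total = sum(rbf_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


domain_a = [[0.0, 0.0], [1.0, 0.0]]   # toy features from one domain
domain_b = [[5.0, 5.0], [6.0, 5.0]]   # toy features from a shifted domain

print(mmd_squared(domain_a, domain_a))  # identical samples give zero
print(mmd_squared(domain_a, domain_b))  # shifted samples give a positive value
```

Used as a regularizer, a term like this would be minimized so that features from different domains become hard to tell apart.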
Adversarial reading networks for machine comprehension
Title | Adversarial reading networks for machine comprehension |
Authors | Quentin Grail, Julien Perez |
Abstract | Machine reading has recently shown remarkable progress thanks to differentiable reasoning models. In this context, End-to-End trainable Memory Networks (MemN2N) have demonstrated promising performance on simple natural language based reasoning tasks such as factual reasoning and basic deduction. However, the task of machine comprehension is currently bound to a supervised setting and available question answering datasets. In this paper we explore the paradigm of adversarial learning and self-play for the task of machine reading comprehension. Inspired by the successful propositions in the domain of game learning, we present a novel approach of training for this task that is based on the definition of a coupled attention-based memory model. On one hand, a reader network is in charge of finding answers regarding a passage of text and a question. On the other hand, a narrator network is in charge of obfuscating spans of text in order to minimize the probability of success of the reader. We evaluated the model on several question-answering corpora. The proposed learning paradigm and associated models present encouraging results. |
Tasks | Machine Reading Comprehension, Question Answering, Reading Comprehension |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Hy3MvSlRW |
https://openreview.net/pdf?id=Hy3MvSlRW | |
PWC | https://paperswithcode.com/paper/adversarial-reading-networks-for-machine |
Repo | |
Framework | |
Learning Typed Entailment Graphs with Global Soft Constraints
Title | Learning Typed Entailment Graphs with Global Soft Constraints |
Authors | Mohammad Javad Hosseini, Nathanael Chambers, Siva Reddy, Xavier R. Holt, Shay B. Cohen, Mark Johnson, Mark Steedman |
Abstract | This paper presents a new method for learning typed entailment graphs from text. We extract predicate-argument structures from multiple-source news corpora, and compute local distributional similarity scores to learn entailments between predicates with typed arguments (e.g., person contracted disease). Previous work has used transitivity constraints to improve local decisions, but these constraints are intractable on large graphs. We instead propose a scalable method that learns globally consistent similarity scores based on new soft constraints that consider both the structures across typed entailment graphs and inside each graph. Learning takes only a few hours to run over 100K predicates and our results show large improvements over local similarity scores on two entailment data sets. We further show improvements over paraphrases and entailments from the Paraphrase Database, and prior state-of-the-art entailment graphs. We show that the entailment graphs improve performance in a downstream task. |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/Q18-1048/ |
https://www.aclweb.org/anthology/Q18-1048 | |
PWC | https://paperswithcode.com/paper/learning-typed-entailment-graphs-with-global |
Repo | |
Framework | |
Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms
Title | Practical Black-box Attacks on Deep Neural Networks using Efficient Query Mechanisms |
Authors | Arjun Nitin Bhagoji, Warren He, Bo Li, Dawn Song |
Abstract | Existing black-box attacks on deep neural networks (DNNs) have largely focused on transferability, where an adversarial instance generated for a locally trained model can “transfer” to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model’s class probabilities, which do not rely on transferability. We also propose strategies to decouple the number of queries required to generate each adversarial sample from the dimensionality of the input. An iterative variant of our attack achieves close to 100% attack success rates for both targeted and untargeted attacks on DNNs. We carry out a thorough comparative evaluation of black-box attacks and show that Gradient Estimation attacks achieve attack success rates similar to state-of-the-art white-box attacks on the MNIST and CIFAR-10 datasets. We also apply the Gradient Estimation attacks successfully against real-world classifiers hosted by Clarifai. Further, we evaluate black-box attacks against state-of-the-art defenses based on adversarial training and show that the Gradient Estimation attacks are very effective even against these defenses. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Arjun_Nitin_Bhagoji_Practical_Black-box_Attacks_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Arjun_Nitin_Bhagoji_Practical_Black-box_Attacks_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/practical-black-box-attacks-on-deep-neural |
Repo | |
Framework | |
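
To make the idea of estimating gradients from query access more concrete, the sketch below uses a two-sided finite-difference estimator. The abstract above does not specify the estimator, so this choice is an assumption, as are the names `estimate_gradient` and `toy_loss`. Here `toy_loss` stands in for a loss computed from a remote model's returned class probabilities.

```python
from typing import Callable, List, Sequence


def estimate_gradient(query_loss: Callable[[Sequence[float]], float],
                      x: Sequence[float],
                      delta: float = 1e-3) -> List[float]:
    """Estimate d(loss)/dx using only black-box queries.

    Each input dimension costs two queries, one on either side of x.
    """
    gradient = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += delta
        minus[i] -= delta
        gradient.append((query_loss(plus) - query_loss(minus)) / (2.0 * delta))
    return gradient


def toy_loss(v: Sequence[float]) -> float:
    """Hypothetical stand-in for a loss derived from model outputs."""
    return v[0] ** 2 + 3.0 * v[1]


print(estimate_gradient(toy_loss, [1.0, 2.0]))  # close to [2.0, 3.0]
```

This naive version needs two queries per input dimension, which is the cost that the query-reduction strategies mentioned in the abstract aim to decouple from the input dimensionality.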
Learning to Select: Problem, Solution, and Applications
Title | Learning to Select: Problem, Solution, and Applications |
Authors | Heechang Ryu, Donghyun Kim, Hayong Shin |
Abstract | We propose a “Learning to Select” problem that selects the best among a variable number of candidates. This makes decisions based not only on the properties of the candidates, but also on the environment to which they belong. For example, job dispatching in a manufacturing factory is a typical “Learning to Select” problem. We propose Variable-Length CNN, which combines the classification power of hidden features from a CNN with the idea of flexible input from Learning to Rank algorithms. This not only handles a variable number of candidates using a Dynamic Computation Graph, but is also computationally efficient because it only builds a network with the necessary sizes to fit the situation. We applied the algorithm to the job dispatching problem, which uses the dispatching log data obtained from the virtual fine-tuned factory. Our proposed algorithm shows considerably better performance than other comparable algorithms. |
Tasks | Learning-To-Rank |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJVHY9lCb |
https://openreview.net/pdf?id=SJVHY9lCb | |
PWC | https://paperswithcode.com/paper/learning-to-select-problem-solution-and |
Repo | |
Framework | |
Very Large-Scale Global SfM by Distributed Motion Averaging
Title | Very Large-Scale Global SfM by Distributed Motion Averaging |
Authors | Siyu Zhu, Runze Zhang, Lei Zhou, Tianwei Shen, Tian Fang, Ping Tan, Long Quan |
Abstract | Global Structure-from-Motion (SfM) techniques have demonstrated greater efficiency and accuracy than the conventional incremental approach in many recent studies. This work proposes a divide-and-conquer framework to solve very large global SfM at the scale of millions of images. Specifically, we first divide all images into multiple partitions that preserve strong data association for well-posed and parallel local motion averaging. Then, we solve a global motion averaging that determines cameras at partition boundaries and a similarity transformation per partition to register all cameras in a single coordinate frame. Finally, local and global motion averaging are iterated until convergence. Since local camera poses are fixed during the global motion averaging, we can avoid caching the whole reconstruction in memory at once. This distributed framework significantly enhances the efficiency and robustness of large-scale motion averaging. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhu_Very_Large-Scale_Global_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhu_Very_Large-Scale_Global_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/very-large-scale-global-sfm-by-distributed |
Repo | |
Framework | |
Uncertainty-aware generative models for inferring document class prevalence
Title | Uncertainty-aware generative models for inferring document class prevalence |
Authors | Katherine Keith, Brendan O'Connor |
Abstract | Prevalence estimation is the task of inferring the relative frequency of classes of unlabeled examples in a group: for example, the proportion of a document collection with positive sentiment. Previous work has focused on aggregating and adjusting discriminative individual classifiers to obtain prevalence point estimates. But imperfect classifier accuracy ought to be reflected in uncertainty over the predicted prevalence for scientifically valid inference. In this work, we present (1) a generative probabilistic modeling approach to prevalence estimation, and (2) the construction and evaluation of prevalence confidence intervals; in particular, we demonstrate that an off-the-shelf discriminative classifier can be given a generative re-interpretation, by backing out an implicit individual-level likelihood function, which can be used to conduct fast and simple group-level Bayesian inference. Empirically, we demonstrate our approach provides better confidence interval coverage than an alternative, and is dramatically more robust to shifts in the class prior between training and testing. |
Tasks | Bayesian Inference, Epidemiology |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1487/ |
https://www.aclweb.org/anthology/D18-1487 | |
PWC | https://paperswithcode.com/paper/uncertainty-aware-generative-models-for |
Repo | |
Framework | |
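
One way to read the "implicit likelihood" idea in the abstract above is sketched below for the binary case. It assumes that a classifier's posterior p(y=1|x), divided by the training class prior, is proportional to p(x|y=1), and it places a uniform prior on the prevalence over a grid. These modeling choices, the function names, and the parameters `train_prior` and `grid_size` are assumptions for illustration and may differ from the paper's actual formulation.

```python
import math
from typing import List, Sequence, Tuple


def prevalence_posterior(probs: Sequence[float], train_prior: float = 0.5,
                         grid_size: int = 101) -> Tuple[List[float], List[float]]:
    """Grid posterior over the positive-class prevalence of a group.

    probs: classifier outputs p(y=1 | x_i), each strictly inside (0, 1).
    """
    if any(not 0.0 < p < 1.0 for p in probs):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for theta in grid:
        total = 0.0
        for p in probs:
            # Mixture of the two implicit class likelihoods, weighted by theta.
            like = theta * p / train_prior + (1.0 - theta) * (1.0 - p) / (1.0 - train_prior)
            total += math.log(like)
        log_post.append(total)
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]  # subtract peak for stability
    norm = sum(weights)
    return grid, [w / norm for w in weights]


def credible_interval(grid: Sequence[float], posterior: Sequence[float],
                      mass: float = 0.9) -> Tuple[float, float]:
    """Equal-tailed interval read off the grid posterior."""
    tail = (1.0 - mass) / 2.0
    cumulative, lower, upper, found_lower = 0.0, grid[0], grid[-1], False
    for theta, w in zip(grid, posterior):
        cumulative += w
        if not found_lower and cumulative >= tail:
            lower, found_lower = theta, True
        if cumulative >= 1.0 - tail:
            upper = theta
            break
    return lower, upper


grid, post = prevalence_posterior([0.9] * 8 + [0.1] * 2)
print(credible_interval(grid, post, mass=0.9))
```

The interval width reflects both the group size and how confident the classifier is on each document, which a simple count of predicted positives would not capture.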