February 1, 2020

3338 words 16 mins read

Paper Group AWR 225

CommonGen: A Constrained Text Generation Dataset Towards Generative Commonsense Reasoning

Title CommonGen: A Constrained Text Generation Dataset Towards Generative Commonsense Reasoning
Authors Bill Yuchen Lin, Ming Shen, Yu Xing, Pei Zhou, Xiang Ren
Abstract Rational humans can generate sentences that cover a certain set of concepts while describing natural and common scenes. For example, given {apple(noun), tree(noun), pick(verb)}, humans can easily come up with scenes like “a boy is picking an apple from a tree” via their generative commonsense reasoning ability. However, we find this capacity has not been well learned by machines. Most prior works in machine commonsense focus on discriminative reasoning tasks with a multi-choice question answering setting. Herein, we present CommonGen: a challenging dataset for testing generative commonsense reasoning with a constrained text generation task. We collect 37k concept-sets as inputs and 90k human-written sentences as associated outputs. Additionally, we provide high-quality rationales behind the reasoning process for the development and test sets from the human annotators. We demonstrate the difficulty of the task by examining a wide range of sequence generation methods with both automatic metrics and human evaluation. The state-of-the-art pre-trained generation model, UniLM, is still far from human performance in this task. Our data and code are publicly available at http://inklab.usc.edu/CommonGen/.
Tasks Question Answering, Text Generation
Published 2019-11-09
URL https://arxiv.org/abs/1911.03705v1
PDF https://arxiv.org/pdf/1911.03705v1.pdf
PWC https://paperswithcode.com/paper/commongen-a-constrained-text-generation
Repo https://github.com/INK-USC/CommonGen
Framework pytorch
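
As a rough illustration of the task interface (not the paper’s UniLM setup), the sketch below feeds a CommonGen-style concept-set to a generic pretrained seq2seq model via Hugging Face Transformers. The prompt format and the choice of T5 are assumptions for illustration only.

```python
# A minimal sketch of the CommonGen task format: given a concept-set,
# generate a sentence covering the concepts. T5 and the prompt wording
# are stand-ins; the paper benchmarks UniLM and other generators.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

concepts = ["apple", "tree", "pick"]                 # one concept-set
prompt = "generate a sentence with: " + " ".join(concepts)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, num_beams=5, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```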

Correction of Electron Back-scattered Diffraction datasets using an evolutionary algorithm

Title Correction of Electron Back-scattered Diffraction datasets using an evolutionary algorithm
Authors Florian Strub, Marie-Agathe Charpagne, Tresa M. Pollock
Abstract In materials science and particularly electron microscopy, Electron Back-scatter Diffraction (EBSD) is a common and powerful mapping technique for collecting local crystallographic data at the sub-micron scale. The quality of the reconstruction of the maps is critical to study the spatial distribution of phases and crystallographic orientation relationships between phases, a key interest in materials science. However, EBSD data is known to suffer from distortions that arise from several instrument and detector artifacts. In this paper, we present an unsupervised method that corrects those distortions, and enables or enhances phase differentiation in EBSD data. The method uses a segmented electron image of the phases of interest (laths, precipitates, voids, inclusions), gathered using detectors that generate less distorted data, of the same area as the EBSD map, and then searches for the best transformation to correct the distortions of the initial EBSD data. To do so, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is implemented to distort the EBSD data until it matches the reference electron image. Fast and versatile, this method does not require any human annotation and can be applied to large datasets and wide areas, where the distortions are significant. Moreover, this method makes very few assumptions about the shape of the distortion function. Some application examples in multiphase materials with feature sizes down to 1 $\mu$m are presented, including a Titanium alloy and a Nickel-base superalloy.
Tasks
Published 2019-03-07
URL http://arxiv.org/abs/1903.02982v1
PDF http://arxiv.org/pdf/1903.02982v1.pdf
PWC https://paperswithcode.com/paper/correction-of-electron-back-scattered
Repo https://github.com/MLmicroscopy/distortions
Framework none
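
The core loop is easy to sketch: parametrize a transform, warp the EBSD map, score it against the reference image, and let CMA-ES drive the search. The affine parametrization and mean-squared-error objective below are illustrative simplifications; the paper allows more general distortion functions.

```python
# A minimal sketch of the correction idea: CMA-ES searches for a transform
# that warps the EBSD map onto a less distorted reference electron image.
import numpy as np
import cma                                  # pip install cma
from scipy.ndimage import affine_transform

def objective(params, ebsd, reference):
    """Dissimilarity between the warped EBSD map and the reference image."""
    a, b, c, d, tx, ty = params
    warped = affine_transform(ebsd, np.array([[a, b], [c, d]]),
                              offset=(tx, ty), order=1)
    return float(np.mean((warped - reference) ** 2))

# Smooth toy images standing in for segmented detector data.
yy, xx = np.mgrid[0:64, 0:64]
reference = np.sin(xx / 6.0) + np.cos(yy / 9.0)
ebsd = affine_transform(reference, [[1.02, 0.01], [0.0, 0.98]], offset=(1, -1))

x0 = [1, 0, 0, 1, 0, 0]                     # start from the identity transform
es = cma.CMAEvolutionStrategy(x0, 0.1, {"maxfevals": 2000})
es.optimize(lambda p: objective(p, ebsd, reference))
print("best transform:", es.result.xbest)
```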

Are We Consistently Biased? Multidimensional Analysis of Biases in Distributional Word Vectors

Title Are We Consistently Biased? Multidimensional Analysis of Biases in Distributional Word Vectors
Authors Anne Lauscher, Goran Glavaš
Abstract Word embeddings have recently been shown to reflect many of the pronounced societal biases (e.g., gender bias or racial bias). Existing studies are, however, limited in scope and do not investigate the consistency of biases across relevant dimensions like embedding models, types of texts, and different languages. In this work, we present a systematic study of biases encoded in distributional word vector spaces: we analyze how consistent the bias effects are across languages, corpora, and embedding models. Furthermore, we analyze the cross-lingual biases encoded in bilingual embedding spaces, indicative of the effects of bias transfer encompassed in cross-lingual transfer of NLP models. Our study yields some unexpected findings, e.g., that biases can be emphasized or downplayed by different embedding models or that user-generated content may be less biased than encyclopedic text. We hope our work catalyzes bias research in NLP and informs the development of bias reduction techniques.
Tasks Cross-Lingual Transfer, Word Embeddings
Published 2019-04-26
URL http://arxiv.org/abs/1904.11783v2
PDF http://arxiv.org/pdf/1904.11783v2.pdf
PWC https://paperswithcode.com/paper/are-we-consistently-biased-multidimensional
Repo https://github.com/umanlp/XWEAT
Framework none
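
The XWEAT framework builds on the Word Embedding Association Test (WEAT). A minimal sketch of its effect-size statistic, with random vectors standing in for real embeddings, looks like this:

```python
# WEAT effect size: association of two target word sets (X, Y) with two
# attribute sets (A, B) via cosine similarity. Toy vectors only.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    """s(w, A, B): mean similarity to A minus mean similarity to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    s_x = [association(w, A, B) for w in X]
    s_y = [association(w, A, B) for w in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y)

rng = np.random.default_rng(0)
emb = lambda n: list(rng.standard_normal((n, 50)))  # 50-d toy vectors
print(weat_effect_size(emb(8), emb(8), emb(8), emb(8)))
```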

GA-Net: Guided Aggregation Net for End-to-end Stereo Matching

Title GA-Net: Guided Aggregation Net for End-to-end Stereo Matching
Authors Feihu Zhang, Victor Prisacariu, Ruigang Yang, Philip H. S. Torr
Abstract In the stereo matching task, matching cost aggregation is crucial in both traditional methods and deep neural network models in order to accurately estimate disparities. We propose two novel neural net layers, aimed at capturing local and whole-image cost dependencies, respectively. The first is a semi-global aggregation layer, a differentiable approximation of semi-global matching; the second is a local guided aggregation layer that follows a traditional cost-filtering strategy to refine thin structures. These two layers can be used to replace the widely used 3D convolutional layer, which is computationally costly and memory-consuming as it has cubic computational/memory complexity. In the experiments, we show that nets with a two-layer guided aggregation block easily outperform the state-of-the-art GC-Net, which has nineteen 3D convolutional layers. We also train a deep guided aggregation network (GA-Net) which achieves better accuracy than state-of-the-art methods on both the Scene Flow dataset and KITTI benchmarks.
Tasks Stereo Matching
Published 2019-04-13
URL http://arxiv.org/abs/1904.06587v1
PDF http://arxiv.org/pdf/1904.06587v1.pdf
PWC https://paperswithcode.com/paper/ga-net-guided-aggregation-net-for-end-to-end
Repo https://github.com/HKBU-HPML/FADNet
Framework pytorch
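
To make the semi-global aggregation idea concrete, here is a heavily simplified, single-direction sketch: sweep the cost volume left to right, mixing each pixel’s matching cost with its neighbor’s aggregated cost using guidance weights. The paper’s SGA layer aggregates in four directions with richer per-pixel, per-direction weights; this is not its exact formulation.

```python
# Simplified soft recurrence standing in for one direction of SGA.
import torch

def sga_left_to_right(cost, w_cur, w_prev):
    """cost: (B, D, H, W) cost volume; w_cur, w_prev: (B, 1, H, W) guidance
    weights normalized so w_cur + w_prev = 1 (keeps the recurrence stable)."""
    out = torch.zeros_like(cost)
    out[..., 0] = cost[..., 0]
    for x in range(1, cost.shape[-1]):
        out[..., x] = (w_cur[..., x] * cost[..., x]
                       + w_prev[..., x] * out[..., x - 1])
    return out

cost = torch.rand(1, 48, 32, 32)             # toy cost volume, 48 disparities
w = torch.sigmoid(torch.rand(1, 1, 32, 32))  # toy guidance weights
aggregated = sga_left_to_right(cost, w, 1 - w)
print(aggregated.shape)                      # torch.Size([1, 48, 32, 32])
```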

Image Restoration Using Deep Regulated Convolutional Networks

Title Image Restoration Using Deep Regulated Convolutional Networks
Authors Peng Liu, Xiaoxiao Zhou, Junyi Yang, El Basha Mohammad D, Ruogu Fang
Abstract While the depth of convolutional neural networks has attracted substantial attention in deep learning research, the width of these networks has recently received greater interest. The width of networks, defined as the size of the receptive fields and the density of the channels, has demonstrated crucial importance in low-level vision tasks such as image denoising and restoration. However, the limited generalization ability, due to the increased width of networks, creates a bottleneck in designing wider networks. In this paper, we propose the Deep Regulated Convolutional Network (RC-Net), a deep network composed of regulated sub-network blocks cascaded by skip-connections, to overcome this bottleneck. Specifically, the Regulated Convolution block (RC-block), featuring a combination of large and small convolution filters, balances the effectiveness of prominent feature extraction and the generalization ability of the network. RC-Nets have several compelling advantages: they embrace diversified features through large-small filter combinations, alleviate hazy boundaries and blurred details in image denoising and super-resolution problems, and stabilize the learning process. Our proposed RC-Nets outperform state-of-the-art approaches with significant performance gains in various image restoration tasks while demonstrating promising generalization ability. The code is available at https://github.com/cswin/RC-Nets.
Tasks Denoising, Image Denoising, Image Restoration, Super-Resolution
Published 2019-10-19
URL https://arxiv.org/abs/1910.08853v1
PDF https://arxiv.org/pdf/1910.08853v1.pdf
PWC https://paperswithcode.com/paper/image-restoration-using-deep-regulated
Repo https://github.com/cswin/RC-Nets
Framework none
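
A minimal sketch of the large-small filter idea: run a large and a small convolution in parallel, concatenate, fuse, and add a skip connection. The filter sizes and channel counts here are illustrative, not the paper’s exact configuration.

```python
import torch
import torch.nn as nn

class RCBlock(nn.Module):
    """Toy regulated convolution block: parallel 7x7 and 3x3 branches."""
    def __init__(self, channels):
        super().__init__()
        self.large = nn.Conv2d(channels, channels, kernel_size=7, padding=3)
        self.small = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        features = torch.cat([self.large(x), self.small(x)], dim=1)
        return self.act(x + self.fuse(features))     # skip connection

block = RCBlock(32)
print(block(torch.rand(1, 32, 64, 64)).shape)        # torch.Size([1, 32, 64, 64])
```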

Short Text Language Identification for Under Resourced Languages

Title Short Text Language Identification for Under Resourced Languages
Authors Bernardt Duvenhage
Abstract The paper presents a hierarchical naive Bayesian and lexicon-based classifier for short text language identification (LID) useful for under-resourced languages. The algorithm is evaluated on short pieces of text in the 11 official South African languages, some of which are closely related. The algorithm is compared to recent approaches using test sets from previous works on South African languages as well as the Discriminating between Similar Languages (DSL) shared tasks’ datasets. Remaining research opportunities and pressing concerns in evaluating and comparing LID approaches are also discussed.
Tasks Language Identification
Published 2019-11-18
URL https://arxiv.org/abs/1911.07555v2
PDF https://arxiv.org/pdf/1911.07555v2.pdf
PWC https://paperswithcode.com/paper/short-text-language-identification-for-under
Repo https://github.com/praekelt/feersum-lid-shared-task
Framework none
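
The hierarchical idea is easy to sketch: try an exact lexicon lookup first, and fall back to a character n-gram naive Bayes classifier for out-of-lexicon text. The tiny training set and lexicon below are illustrative only.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["the cat sits", "goeie more almal", "sawubona unjani"]
train_langs = ["eng", "afr", "zul"]

nb = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(1, 4)),
    MultinomialNB(),
)
nb.fit(train_texts, train_langs)

lexicon = {"hello": "eng", "hallo": "afr"}        # toy word -> language map

def identify(text):
    words = text.lower().split()
    hits = [lexicon[w] for w in words if w in lexicon]
    if hits:                                      # lexicon route first
        return max(set(hits), key=hits.count)
    return nb.predict([text])[0]                  # naive Bayes fallback

print(identify("hallo almal"), identify("sawubona"))
```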

Better Rewards Yield Better Summaries: Learning to Summarise Without References

Title Better Rewards Yield Better Summaries: Learning to Summarise Without References
Authors Florian Böhm, Yang Gao, Christian M. Meyer, Ori Shapira, Ido Dagan, Iryna Gurevych
Abstract Reinforcement Learning (RL) based document summarisation systems yield state-of-the-art performance in terms of ROUGE scores, because they directly use ROUGE as the reward during training. However, summaries with high ROUGE scores often receive low human judgement. To find a better reward function that can guide RL to generate human-appealing summaries, we learn a reward function from human ratings on 2,500 summaries. Our reward function only takes the document and system summary as input. Hence, once trained, it can be used to train RL-based summarisation systems without using any reference summaries. We show that our learned rewards have significantly higher correlation with human ratings than previous approaches. Human evaluation experiments show that, compared to state-of-the-art supervised-learning systems and ROUGE-as-rewards RL summarisation systems, the RL systems using our learned rewards during training generate summaries with higher human ratings. The learned reward function and our source code are available at https://github.com/yg211/summary-reward-no-reference.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.01214v1
PDF https://arxiv.org/pdf/1909.01214v1.pdf
PWC https://paperswithcode.com/paper/better-rewards-yield-better-summaries
Repo https://github.com/UKPLab/emnlp2019-summary-reward
Framework pytorch
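
A minimal sketch of a reference-free learned reward: encode the document and candidate summary together and regress a scalar score against human ratings. The BERT encoder and regression head below are assumptions for illustration; see the released code for the paper’s actual model and training objective.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class SummaryReward(nn.Module):
    """Toy reward model: BERT over (document, summary) pair -> scalar."""
    def __init__(self, name="bert-base-uncased"):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(name)
        self.encoder = AutoModel.from_pretrained(name)
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, document, summary):
        batch = self.tokenizer(document, summary, truncation=True,
                               return_tensors="pt")
        hidden = self.encoder(**batch).last_hidden_state[:, 0]  # [CLS] token
        return self.head(hidden).squeeze(-1)        # scalar reward

model = SummaryReward()
reward = model("The full source document ...", "A candidate summary.")
loss = nn.functional.mse_loss(reward, torch.tensor([0.8]))  # human rating
loss.backward()
```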

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Title Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation
Authors Daniel Loureiro, Alipio Jorge
Abstract Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full-coverage of WordNet, and without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems using powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features, requiring disambiguation against the full sense inventory, and revealing shortcomings to be improved. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.
Tasks Language Modelling, Word Embeddings, Word Sense Disambiguation
Published 2019-06-24
URL https://arxiv.org/abs/1906.10007v1
PDF https://arxiv.org/pdf/1906.10007v1.pdf
PWC https://paperswithcode.com/paper/language-modelling-makes-sense-propagating
Repo https://github.com/danlou/LMMS
Framework none
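
A minimal sketch of the full-coverage idea: sense embeddings are averaged contextual embeddings of annotated examples, missing senses inherit the average of their WordNet neighbors (here via hypernyms), and disambiguation is a nearest-neighbor lookup. The vectors and sense keys below are toy stand-ins for LMMS’s actual pipeline.

```python
import numpy as np
from nltk.corpus import wordnet as wn   # requires nltk.download("wordnet")

dim = 32
rng = np.random.default_rng(0)
# Pretend only a few sense keys got embeddings from annotated corpora.
sense_vecs = {"bank%1:17:01::": rng.standard_normal(dim),
              "bank%1:14:00::": rng.standard_normal(dim)}

def synset_vec(synset):
    """Average the vectors of a synset's lemmas; fall back to hypernyms."""
    vecs = [sense_vecs[l.key()] for l in synset.lemmas() if l.key() in sense_vecs]
    if not vecs:
        vecs = [synset_vec(h) for h in synset.hypernyms()] or [np.zeros(dim)]
    return np.mean(vecs, axis=0)

def disambiguate(context_vec, lemma):
    """1-NN over all candidate senses of the lemma."""
    candidates = wn.synsets(lemma)
    sims = [context_vec @ synset_vec(s) for s in candidates]
    return candidates[int(np.argmax(sims))]

print(disambiguate(rng.standard_normal(dim), "bank"))
```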

Technical Report of the Video Event Reconstruction and Analysis (VERA) System – Shooter Localization, Models, Interface, and Beyond

Title Technical Report of the Video Event Reconstruction and Analysis (VERA) System – Shooter Localization, Models, Interface, and Beyond
Authors Junwei Liang, Jay D. Aronson, Alexander Hauptmann
Abstract Every minute, hundreds of hours of video are uploaded to social media sites and the Internet from around the world. This material creates a visual record of the experiences of a significant percentage of humanity and can help illuminate how we live in the present moment. When properly analyzed, this video can also help analysts to reconstruct events of interest, including war crimes, human rights violations, and terrorist acts. Machine learning and computer vision can play a crucial role in this process. In this technical report, we describe the Video Event Reconstruction and Analysis (VERA) system. This new tool brings together a variety of capabilities we have developed over the past few years (including video synchronization and geolocation to order unstructured videos lacking metadata over time and space, and sound recognition algorithms) to enable the reconstruction and analysis of events captured on video. Among other uses, VERA enables the localization of a shooter from just a few videos that include the sound of gunshots. To demonstrate the efficacy of this suite of tools, we present the results of estimating the shooter’s location of the Las Vegas Shooting in 2017 and show that VERA accurately predicts the shooter’s location using only the first few gunshots. We then point out future directions that can help improve the system and further reduce unnecessary human labor in the process. All of the components of VERA run through a web interface that enables human-in-the-loop verification to ensure accurate estimations. All relevant source code, including the web interface and machine learning models, is freely available on GitHub. We hope that researchers and software developers will be inspired to improve and expand this system moving forward to better meet the needs of human rights and public safety.
Tasks Gunshot Detection, Shooter Localization, Temporal Localization, Video Synchronization
Published 2019-05-26
URL https://arxiv.org/abs/1905.13313v5
PDF https://arxiv.org/pdf/1905.13313v5.pdf
PWC https://paperswithcode.com/paper/190513313
Repo https://github.com/JunweiLiang/VERA_Shooter_Localization
Framework tf
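
The localization principle is classic multilateration: once videos are synchronized, gunshot arrival times at known camera positions constrain the source via time differences of arrival, solvable by least squares. The geometry and noise-free timings below are illustrative, not VERA’s actual pipeline.

```python
import numpy as np
from scipy.optimize import least_squares

SPEED_OF_SOUND = 343.0  # m/s

cams = np.array([[0, 0], [120, 10], [60, 90], [-40, 70]], dtype=float)
true_src = np.array([35.0, 45.0])
# Arrival time = distance / c, plus an unknown common emission time t0.
t = np.linalg.norm(cams - true_src, axis=1) / SPEED_OF_SOUND + 2.0

def residuals(params):
    x, y, t0 = params
    predicted = np.linalg.norm(cams - [x, y], axis=1) / SPEED_OF_SOUND + t0
    return predicted - t

fit = least_squares(residuals, x0=[0.0, 0.0, 0.0])
print("estimated source:", fit.x[:2])   # ~ [35, 45]
```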

Reliable and Efficient Image Cropping: A Grid Anchor based Approach

Title Reliable and Efficient Image Cropping: A Grid Anchor based Approach
Authors Hui Zeng, Lida Li, Zisheng Cao, Lei Zhang
Abstract Image cropping aims to improve the composition as well as aesthetic quality of an image by removing extraneous content from it. Existing image cropping databases provide only one or several human-annotated bounding boxes as the groundtruth, which cannot reflect the non-uniqueness and flexibility of image cropping in practice. The employed evaluation metrics such as intersection-over-union cannot reliably reflect the real performance of cropping models, either. This work revisits the problem of image cropping, and presents a grid anchor based formulation by considering the special properties and requirements (e.g., local redundancy, content preservation, aspect ratio) of image cropping. Our formulation reduces the search space of candidate crops from millions to less than one hundred. Consequently, a grid anchor based cropping benchmark is constructed, where all crops of each image are annotated and more reliable evaluation metrics are defined. We also design an effective and lightweight network module, which simultaneously considers the region of interest and region of discard for more accurate image cropping. Our model can stably output visually pleasing crops for images of different scenes and run at a speed of 125 FPS. Code and dataset are available at: https://github.com/HuiZeng/Grid-Anchor-based-Image-Cropping.
Tasks Image Cropping
Published 2019-04-09
URL http://arxiv.org/abs/1904.04441v1
PDF http://arxiv.org/pdf/1904.04441v1.pdf
PWC https://paperswithcode.com/paper/reliable-and-efficient-image-cropping-a-grid
Repo https://github.com/HuiZeng/Grid-Anchor-based-Image-Cropping
Framework pytorch
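
The grid anchor formulation is simple to sketch: instead of scoring millions of arbitrary boxes, enumerate crops whose corners lie on a coarse grid and keep only those meeting area and aspect-ratio constraints. The grid size and thresholds below are illustrative, not the paper’s exact values.

```python
def grid_anchor_crops(width, height, bins=4, min_area_ratio=0.5,
                      aspect_range=(0.5, 2.0)):
    """Enumerate candidate crops (x0, y0, x1, y1) on a bins x bins grid."""
    xs = [round(i * width / bins) for i in range(bins + 1)]
    ys = [round(j * height / bins) for j in range(bins + 1)]
    crops = []
    for x0 in xs:
        for y0 in ys:
            for x1 in xs:
                for y1 in ys:
                    w, h = x1 - x0, y1 - y0
                    if w <= 0 or h <= 0:
                        continue
                    area_ok = w * h >= min_area_ratio * width * height
                    if area_ok and aspect_range[0] <= w / h <= aspect_range[1]:
                        crops.append((x0, y0, x1, y1))
    return crops

candidates = grid_anchor_crops(640, 480)
print(len(candidates), "candidate crops")   # a handful, not millions
```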

ShrinkTeaNet: Million-scale Lightweight Face Recognition via Shrinking Teacher-Student Networks

Title ShrinkTeaNet: Million-scale Lightweight Face Recognition via Shrinking Teacher-Student Networks
Authors Chi Nhan Duong, Khoa Luu, Kha Gia Quach, Ngan Le
Abstract Large-scale face recognition in the wild has recently achieved mature performance in many real-world applications. However, such systems are built on GPU platforms and mostly deploy heavy deep network architectures. Given a high-performance heavy network as a teacher, this work presents a simple and elegant teacher-student learning paradigm, namely ShrinkTeaNet, to train a portable student network that has significantly fewer parameters and competitive accuracy against the teacher network. Unlike prior teacher-student frameworks, which mainly focus on accuracy and compression ratios in closed-set problems, our proposed teacher-student network proves more robust in the open-set problem, i.e., large-scale face recognition. In addition, this work introduces a novel Angular Distillation Loss for distilling the feature direction and the sample distributions of the teacher’s hypersphere to its student. The ShrinkTeaNet framework can then efficiently guide the student’s learning process with the teacher’s knowledge presented in both intermediate and final stages of the feature embedding. Evaluations on LFW, CFP-FP, AgeDB, IJB-B and IJB-C Janus, and MegaFace with one million distractors have demonstrated the efficiency of the proposed approach in learning robust student networks with satisfying accuracy and compact sizes. ShrinkTeaNet supports lightweight architectures that achieve 99.77% on LFW and 95.64% on the large-scale MegaFace protocol.
Tasks Face Recognition
Published 2019-05-25
URL https://arxiv.org/abs/1905.10620v1
PDF https://arxiv.org/pdf/1905.10620v1.pdf
PWC https://paperswithcode.com/paper/shrinkteanet-million-scale-lightweight-face
Repo https://github.com/david-svitov/margindistillation
Framework none
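
The angular distillation idea can be sketched as penalizing the angle between teacher and student embeddings so the student inherits the teacher’s feature directions on the hypersphere. This is a simplified reading, not the paper’s exact loss.

```python
import torch
import torch.nn.functional as F

def angular_distillation_loss(student_feat, teacher_feat):
    """1 - cosine similarity between L2-normalized embeddings."""
    s = F.normalize(student_feat, dim=1)
    t = F.normalize(teacher_feat, dim=1).detach()   # teacher is frozen
    return (1.0 - (s * t).sum(dim=1)).mean()

student = torch.rand(8, 128, requires_grad=True)    # light student embeddings
teacher = torch.rand(8, 128)                        # heavy teacher embeddings
loss = angular_distillation_loss(student, teacher)
loss.backward()
print(loss.item())
```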

Multilinear Compressive Learning

Title Multilinear Compressive Learning
Authors Dat Thanh Tran, Mehmet Yamac, Aysen Degerli, Moncef Gabbouj, Alexandros Iosifidis
Abstract Compressive Learning is an emerging topic that combines signal acquisition via compressive sensing and machine learning to perform inference tasks directly on a small number of measurements. Many data modalities naturally have a multi-dimensional or tensorial format, with each dimension or tensor mode representing different features such as the spatial and temporal information in video sequences or the spatial and spectral information in hyperspectral images. However, in existing compressive learning frameworks, the compressive sensing component utilizes either random or learned linear projection on the vectorized signal to perform signal acquisition, thus discarding the multi-dimensional structure of the signals. In this paper, we propose Multilinear Compressive Learning, a framework that takes into account the tensorial nature of multi-dimensional signals in the acquisition step and builds the subsequent inference model on the structurally sensed measurements. Our theoretical complexity analysis shows that the proposed framework is more efficient compared to its vector-based counterpart in both memory and computation requirement. With extensive experiments, we also empirically show that our Multilinear Compressive Learning framework outperforms the vector-based framework in object classification and face recognition tasks, and scales favorably when the dimensionalities of the original signals increase, making it highly efficient for high-dimensional multi-dimensional signals.
Tasks Compressive Sensing, Face Recognition, Object Classification
Published 2019-05-17
URL https://arxiv.org/abs/1905.07481v2
PDF https://arxiv.org/pdf/1905.07481v2.pdf
PWC https://paperswithcode.com/paper/multilinear-compressive-learning
Repo https://github.com/viebboy/MultilinearCompressiveLearningFramework
Framework tf
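
A minimal sketch of multilinear sensing: instead of vectorizing a tensor and applying one large projection, compress each mode with its own small matrix (a Tucker-style mode product). The shapes below are illustrative only.

```python
import numpy as np

def multilinear_sense(x, mats):
    """Apply one sensing matrix per tensor mode: Y = X x1 P1 x2 P2 x3 P3."""
    for mode, p in enumerate(mats):
        x = np.moveaxis(np.tensordot(p, x, axes=(1, mode)), 0, mode)
    return x

x = np.random.rand(32, 32, 3)                    # e.g. a small image tensor
mats = [np.random.randn(8, 32),                  # compress height 32 -> 8
        np.random.randn(8, 32),                  # compress width  32 -> 8
        np.random.randn(2, 3)]                   # compress channels 3 -> 2
y = multilinear_sense(x, mats)
print(x.size, "->", y.size)                      # 3072 -> 128 measurements
```

The mode-wise matrices have 32*8 + 32*8 + 3*2 = 518 parameters, versus 3072*128 for an equivalent vectorized projection, which is where the memory and computation savings come from.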

Additive Adversarial Learning for Unbiased Authentication

Title Additive Adversarial Learning for Unbiased Authentication
Authors Jian Liang, Yuren Cao, Chenbin Zhang, Shiyu Chang, Kun Bai, Zenglin Xu
Abstract Authentication is the task of verifying the correspondence between data instances and personal identities. Typical authentication applications include face recognition, person re-identification, authentication based on mobile devices, and so on. The recently emerging data-driven authentication process may encounter undesired biases, i.e., the models are often trained in one domain (e.g., for people wearing spring outfits) while required to apply in other domains (e.g., after they change into summer outfits). To address this issue, we propose a novel two-stage method that disentangles the class/identity from domain differences, and we consider multiple types of domain difference. In the first stage, we learn disentangled representations by a one-versus-rest disentangle learning (OVRDL) mechanism. In the second stage, we improve the disentanglement by an additive adversarial learning (AAL) mechanism. Moreover, we discuss the necessity of avoiding a learning dilemma due to disentangling causally related types of domain difference. Comprehensive evaluation results demonstrate the effectiveness and superiority of the proposed method.
Tasks Face Recognition, Person Re-Identification
Published 2019-05-16
URL https://arxiv.org/abs/1905.06517v3
PDF https://arxiv.org/pdf/1905.06517v3.pdf
PWC https://paperswithcode.com/paper/additive-adversarial-learning-for-unbiased
Repo https://github.com/langlrsw/AAL-unbiased-authentication
Framework tf
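
As a generic illustration of the adversarial ingredient (not the paper’s exact two-stage OVRDL/AAL mechanism), the sketch below trains an identity branch so that a domain discriminator cannot recover the domain from it, via gradient reversal.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad                       # push encoder toward invariance

encoder = nn.Linear(64, 32)                # identity feature extractor
domain_head = nn.Linear(32, 4)             # tries to predict the domain

x = torch.rand(16, 64)
domains = torch.randint(0, 4, (16,))
features = encoder(x)
logits = domain_head(GradReverse.apply(features))
loss = nn.functional.cross_entropy(logits, domains)
loss.backward()                            # encoder gradient is reversed
```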

DIVA: Domain Invariant Variational Autoencoders

Title DIVA: Domain Invariant Variational Autoencoders
Authors Maximilian Ilse, Jakub M. Tomczak, Christos Louizos, Max Welling
Abstract We consider the problem of domain generalization, namely, how to learn representations given data from a set of domains that generalize to data from a previously unseen domain. We propose the Domain Invariant Variational Autoencoder (DIVA), a generative model that tackles this problem by learning three independent latent subspaces, one for the domain, one for the class, and one for any residual variations. We highlight that due to the generative nature of our model we can also incorporate unlabeled data from known or previously unseen domains. To the best of our knowledge, this has not been done before in a domain generalization setting. This property is highly desirable in fields like medical imaging where labeled data is scarce. We experimentally evaluate our model on the rotated MNIST benchmark and a malaria cell images dataset where we show that (i) the learned subspaces are indeed complementary to each other, (ii) we improve upon recent works on this task and (iii) incorporating unlabeled data can boost the performance even further.
Tasks Domain Generalization
Published 2019-05-24
URL https://arxiv.org/abs/1905.10427v2
PDF https://arxiv.org/pdf/1905.10427v2.pdf
PWC https://paperswithcode.com/paper/diva-domain-invariant-variational
Repo https://github.com/AMLab-Amsterdam/DIVA
Framework pytorch
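
A minimal sketch of DIVA’s factorization: three independent encoders map an input to domain (zd), class (zy), and residual (zx) latents, and one decoder reconstructs from their concatenation. The auxiliary classifiers and conditional priors from the paper are omitted for brevity.

```python
import torch
import torch.nn as nn

class TinyDIVA(nn.Module):
    def __init__(self, in_dim=784, z=8):
        super().__init__()
        make_enc = lambda: nn.Linear(in_dim, 2 * z)   # outputs mu and logvar
        self.enc_d, self.enc_y, self.enc_x = make_enc(), make_enc(), make_enc()
        self.dec = nn.Linear(3 * z, in_dim)

    def sample(self, enc, x):
        mu, logvar = enc(x).chunk(2, dim=1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.

    def forward(self, x):
        zd, zy, zx = (self.sample(e, x)
                      for e in (self.enc_d, self.enc_y, self.enc_x))
        return self.dec(torch.cat([zd, zy, zx], dim=1)), (zd, zy, zx)

model = TinyDIVA()
recon, latents = model(torch.rand(4, 784))
print(recon.shape)                                    # torch.Size([4, 784])
```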

COBS: a Compact Bit-Sliced Signature Index

Title COBS: a Compact Bit-Sliced Signature Index
Authors Timo Bingmann, Phelim Bradley, Florian Gauger, Zamin Iqbal
Abstract We present COBS, a COmpact Bit-sliced Signature index, which is a cross-over between an inverted index and Bloom filters. Our target application is to index $k$-mers of DNA samples or $q$-grams from text documents and process approximate pattern matching queries on the corpus with a user-chosen coverage threshold. Query results may contain a number of false positives which decreases exponentially with the query length. We compare COBS to seven other index software packages on 100000 microbial DNA samples. COBS’ compact but simple data structure outperforms the other indexes in construction time and query performance with Mantis by Pandey et al. in second place. However, unlike Mantis and other previous work, COBS does not need the complete index in RAM and is thus designed to scale to larger document sets.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09624v2
PDF https://arxiv.org/pdf/1905.09624v2.pdf
PWC https://paperswithcode.com/paper/cobs-a-compact-bit-sliced-signature-index
Repo https://github.com/bingmann/cobs
Framework none
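
A minimal sketch of the bit-sliced signature idea: each document gets a Bloom-filter bit row over its k-mers, and a query counts, per document, how many of its k-mers hit and applies the coverage threshold. Real COBS stores the matrix column-wise ("bit-sliced") and compacts it; this toy version keeps plain rows.

```python
import hashlib

BITS, K = 1 << 12, 8

def kmers(s):
    return {s[i:i + K] for i in range(len(s) - K + 1)}

def positions(kmer, n_hashes=3):
    h = int(hashlib.sha1(kmer.encode()).hexdigest(), 16)
    return [(h >> (16 * i)) % BITS for i in range(n_hashes)]

def build_row(doc):
    row = 0
    for km in kmers(doc):
        for p in positions(km):
            row |= 1 << p
    return row

docs = {"sampleA": "ACGTACGTGGTTAACCGGAT", "sampleB": "TTTTGGGGCCCCAAAATTTT"}
index = {name: build_row(seq) for name, seq in docs.items()}

def query(q, threshold=0.8):
    qk = kmers(q)
    hits = {}
    for name, row in index.items():
        n = sum(all(row >> p & 1 for p in positions(km)) for km in qk)
        if n >= threshold * len(qk):
            hits[name] = n
    return hits

print(query("ACGTACGTGGTT"))   # matches sampleA; false positives decay
                               # exponentially with query length
```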