Paper Group AWR 356
Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs. The Sensitivity of Counterfactual Fairness to Unmeasured Confounding. Alternating Roles Dialog Model with Large-scale Pre-trained Language Models. FinBERT: Financial Sentiment Analysis with Pre-trained Language Models. Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study …
Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs
Title | Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs |
Authors | Alex Warstadt, Yu Cao, Ioana Grosu, Wei Peng, Hagen Blix, Yining Nie, Anna Alsop, Shikha Bordia, Haokun Liu, Alicia Parrish, Sheng-Fu Wang, Jason Phang, Anhad Mohananey, Phu Mon Htut, Paloma Jeretič, Samuel R. Bowman |
Abstract | Though state-of-the-art sentence representation models can perform tasks requiring significant knowledge of grammar, it is an open question how best to evaluate their grammatical knowledge. We explore five experimental methods inspired by prior work evaluating pretrained sentence representation models. We use a single linguistic phenomenon, negative polarity item (NPI) licensing in English, as a case study for our experiments. NPIs like “any” are grammatical only if they appear in a licensing environment like negation (“Sue doesn’t have any cats” vs. “Sue has any cats”). This phenomenon is challenging because of the variety of NPI licensing environments that exist. We introduce an artificially generated dataset that manipulates key features of NPI licensing for the experiments. We find that BERT has significant knowledge of these features, but its success varies widely across different experimental methods. We conclude that a variety of methods is necessary to reveal all relevant aspects of a model’s grammatical knowledge in a given domain. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02597v2 |
PDF | https://arxiv.org/pdf/1909.02597v2.pdf |
PWC | https://paperswithcode.com/paper/investigating-berts-knowledge-of-language |
Repo | https://github.com/alexwarstadt/data_generation |
Framework | none |
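The licensing contrast quoted in the abstract lends itself to a simple masked-LM probe. Below is a minimal sketch (one plausible probe of ours, not necessarily any of the paper's five methods) that scores the licensed vs. unlicensed NPI sentence with BERT's pseudo-log-likelihood; the function name and scoring recipe are our assumptions.

```python
# Minimal sketch: compare BERT's pseudo-log-likelihood for a licensed vs.
# unlicensed NPI sentence by masking one token at a time.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_log_likelihood(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Mask each real token (skip [CLS]/[SEP]) and sum its log-probability.
    for i in range(1, len(ids) - 1):
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

good = "Sue doesn't have any cats."
bad = "Sue has any cats."
# A model with knowledge of NPI licensing should prefer the licensed sentence.
print(pseudo_log_likelihood(good) > pseudo_log_likelihood(bad))
```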
The Sensitivity of Counterfactual Fairness to Unmeasured Confounding
Title | The Sensitivity of Counterfactual Fairness to Unmeasured Confounding |
Authors | Niki Kilbertus, Philip J. Ball, Matt J. Kusner, Adrian Weller, Ricardo Silva |
Abstract | Causal approaches to fairness have seen substantial recent interest, both from the machine learning community and from wider parties interested in ethical prediction algorithms. In no small part, this has been due to the fact that causal models allow one to simultaneously leverage data and expert knowledge to remove discriminatory effects from predictions. However, one of the primary assumptions in causal modeling is that you know the causal graph. This introduces a new opportunity for bias, caused by misspecifying the causal model. One common way for misspecification to occur is via unmeasured confounding: the true causal effect between variables is partially described by unobserved quantities. In this work we design tools to assess the sensitivity of fairness measures to this confounding for the popular class of non-linear additive noise models (ANMs). Specifically, we give a procedure for computing the maximum difference between two counterfactually fair predictors, where one has become biased due to confounding. For the case of bivariate confounding our technique can be swiftly computed via a sequence of closed-form updates. For multivariate confounding we give an algorithm that can be efficiently solved via automatic differentiation. We demonstrate our new sensitivity analysis tools in real-world fairness scenarios to assess the bias arising from confounding. |
Tasks | |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.01040v1 |
PDF | https://arxiv.org/pdf/1907.01040v1.pdf |
PWC | https://paperswithcode.com/paper/the-sensitivity-of-counterfactual-fairness-to |
Repo | https://github.com/nikikilbertus/cf-fairness-sensitivity |
Framework | none |
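The abstract's multivariate procedure relies on automatic differentiation to search for worst-case confounding. The deliberately toy sketch below only illustrates that optimization pattern on a linear ANM with a single confounding parameter; the data-generating model, the `tanh` parameterization, and the abduction step are our simplifications, not the paper's algorithm.

```python
# Toy sketch: use autodiff to find the confounding strength rho that maximizes
# the discrepancy between a confounding-free counterfactual predictor and one
# whose noise-abduction step is distorted by rho.
import torch

torch.manual_seed(0)
a = torch.randn(500)                   # protected attribute
u = torch.randn(500)                   # latent noise of x
x = 0.8 * a + u                        # ANM: x = f(a) + u

rho = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.Adam([rho], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    # Under confounding rho, part of x's abducted noise is attributed to a
    # hidden cause shared with a (a mis-specified abduction step).
    u_hat = (x - 0.8 * a) - torch.tanh(rho) * a
    pred_conf = 1.5 * u_hat            # counterfactual prediction with a := 0
    pred_base = 1.5 * (x - 0.8 * a)    # rho = 0 (unconfounded) predictor
    gap = ((pred_conf - pred_base) ** 2).mean()
    (-gap).backward()                  # gradient *ascent* on the discrepancy
    opt.step()
print("worst-case rho:", torch.tanh(rho).item(), "gap:", gap.item())
```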
Alternating Roles Dialog Model with Large-scale Pre-trained Language Models
Title | Alternating Roles Dialog Model with Large-scale Pre-trained Language Models |
Authors | Qingyang Wu, Yichi Zhang, Yu Li, Zhou Yu |
Abstract | Existing dialog system models require extensive human annotations and are difficult to generalize to different tasks. The recent success of large pre-trained language models such as BERT and GPT-2 (Devlin et al., 2019; Radford et al., 2019) has suggested the effectiveness of incorporating language priors in downstream NLP tasks. However, how much pre-trained language models can help dialog response generation is still under exploration. In this paper, we propose a simple, general, and effective framework: Alternating Roles Dialog Model (ARDM). ARDM models each speaker separately and takes advantage of large pre-trained language models. It requires no supervision from human annotations such as belief states or dialog acts to achieve effective conversations. ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ. Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as persuasion. In persuasion tasks, ARDM is capable of generating human-like responses to persuade people to donate to a charity. |
Tasks | Language Modelling |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03756v2 |
PDF | https://arxiv.org/pdf/1910.03756v2.pdf |
PWC | https://paperswithcode.com/paper/alternating-recurrent-dialog-model-with-large |
Repo | https://github.com/budzianowski/multiwoz |
Framework | pytorch |
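A minimal sketch of the alternating-roles idea (our simplification, not the released ARDM code): keep one GPT-2 per speaker, let each continue the shared dialog history on its own turns, and fine-tune each model only on its role's utterances. The turn format and generation settings below are illustrative assumptions.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
user_lm = GPT2LMHeadModel.from_pretrained("gpt2")    # would be fine-tuned on user turns
system_lm = GPT2LMHeadModel.from_pretrained("gpt2")  # would be fine-tuned on system turns

def next_turn(model, history: str, max_new_tokens: int = 40) -> str:
    ids = tok(history, return_tensors="pt")["input_ids"]
    out = model.generate(ids, max_new_tokens=max_new_tokens,
                         do_sample=True, top_p=0.9,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

history = "User: I need a cheap restaurant in the city centre.\nSystem:"
reply = next_turn(system_lm, history)   # the system model speaks on system turns
history += reply + "\nUser:"
followup = next_turn(user_lm, history)  # the user model speaks on user turns
print(reply, followup)
```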
FinBERT: Financial Sentiment Analysis with Pre-trained Language Models
Title | FinBERT: Financial Sentiment Analysis with Pre-trained Language Models |
Authors | Dogu Araci |
Abstract | Financial sentiment analysis is a challenging task due to the specialized language and lack of labeled data in that domain. General-purpose models are not effective enough because of the specialized language used in a financial context. We hypothesize that pre-trained language models can help with this problem because they require fewer labeled examples and they can be further trained on domain-specific corpora. We introduce FinBERT, a language model based on BERT, to tackle NLP tasks in the financial domain. Our results show improvement in every measured metric on current state-of-the-art results for two financial sentiment analysis datasets. We find that even with a smaller training set and fine-tuning only a part of the model, FinBERT outperforms state-of-the-art machine learning methods. |
Tasks | Language Modelling, Sentiment Analysis |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10063v1 |
PDF | https://arxiv.org/pdf/1908.10063v1.pdf |
PWC | https://paperswithcode.com/paper/finbert-financial-sentiment-analysis-with-pre |
Repo | https://github.com/ProsusAI/finBERT |
Framework | none |
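Assuming the released checkpoint is published on the Hugging Face hub under `ProsusAI/finbert` (see the repo above for the canonical path), inference reduces to a standard classification pipeline:

```python
# Usage sketch: three-class financial sentiment with the released FinBERT weights.
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

name = "ProsusAI/finbert"
clf = pipeline("text-classification",
               model=AutoModelForSequenceClassification.from_pretrained(name),
               tokenizer=AutoTokenizer.from_pretrained(name))

print(clf("Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in 2007."))
# -> e.g. [{'label': 'positive', 'score': ...}]
```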
Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study
Title | Representation of Constituents in Neural Language Models: Coordination Phrase as a Case Study |
Authors | Aixiu An, Peng Qian, Ethan Wilcox, Roger Levy |
Abstract | Neural language models have achieved state-of-the-art performance on many NLP tasks, and recently have been shown to learn a number of hierarchically-sensitive syntactic dependencies between individual words. However, equally important for language processing is the ability to combine words into phrasal constituents, and use constituent-level features to drive downstream expectations. Here we investigate neural models’ ability to represent constituent-level features, using coordinated noun phrases as a case study. We assess whether different neural language models trained on English and French represent phrase-level number and gender features, and use those features to drive downstream expectations. Our results suggest that models use a linear combination of NP constituent number to drive CoordNP/verb number agreement. This behavior is highly regular and even sensitive to local syntactic context; however, it differs crucially from observed human behavior. Models have less success with gender agreement. Models trained on large corpora perform best, and there is no obvious advantage for models trained using explicit syntactic supervision. |
Tasks | |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04625v1 |
PDF | https://arxiv.org/pdf/1909.04625v1.pdf |
PWC | https://paperswithcode.com/paper/representation-of-constituents-in-neural |
Repo | https://github.com/cpllab/rnn_psycholing_coordination |
Framework | none |
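A minimal agreement probe in the spirit of the abstract (our illustration, not the paper's exact stimuli or models): compare an LM's probability of a plural vs. singular verb after a coordinated noun phrase, which is plural as a whole even when each conjunct is singular.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def continuation_logprob(prefix: str, word: str) -> float:
    ids = tok(prefix, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        logits = lm(ids).logits[0, -1]          # next-token distribution
    wid = tok(" " + word)["input_ids"][0]       # leading space: GPT-2 BPE convention
    return torch.log_softmax(logits, dim=-1)[wid].item()

prefix = "The cat and the dog near the house"
# The CoordNP is plural, so "are" should be preferred over "is".
print(continuation_logprob(prefix, "are") > continuation_logprob(prefix, "is"))
```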
Utilizing Temporal Information in Deep Convolutional Network for Efficient Soccer Ball Detection and Tracking
Title | Utilizing Temporal Information in Deep Convolutional Network for Efficient Soccer Ball Detection and Tracking |
Authors | Anna Kukleva, Mohammad Asif Khan, Hafez Farazi, Sven Behnke |
Abstract | Soccer ball detection is identified as one of the critical challenges in the RoboCup competition. It requires an efficient vision system capable of detection with high precision and recall while providing robustness and low inference time. In this work, we present a novel convolutional neural network (CNN) approach to detect the soccer ball in an image sequence. In contrast to the existing methods where only the current frame or an image is used for the detection, we make use of the history of frames. Using history allows us to track the ball efficiently in situations where it disappears or is partially occluded in some of the frames. Our approach exploits spatio-temporal correlation and detects the ball based on the trajectory of its movements. We present our results with three convolutional methods, namely temporal convolutional networks (TCN), ConvLSTM, and ConvGRU. We first solve the detection task for an image using a fully convolutional encoder-decoder architecture, and later, we use it as an input to our temporal models and jointly learn the detection task in sequences of images. We evaluate all our experiments on a novel dataset prepared as a part of this work. Furthermore, we present empirical results to support the effectiveness of using the history of the ball in challenging scenarios. |
Tasks | Game of Football |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02406v2 |
PDF | https://arxiv.org/pdf/1909.02406v2.pdf |
PWC | https://paperswithcode.com/paper/utilizing-temporal-information-in |
Repo | https://github.com/AIS-Bonn/TemporalBallDetection |
Framework | pytorch |
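A minimal sketch of using frame history (our simplification; the paper compares TCN, ConvLSTM, and ConvGRU heads on top of a per-frame encoder): stack the last k frames along the channel axis and predict a ball-location heatmap with a small fully convolutional network. All layer sizes below are illustrative.

```python
import torch
import torch.nn as nn

class BallDetector(nn.Module):
    def __init__(self, history: int = 4):
        super().__init__()
        c = 3 * history                      # k RGB frames stacked channel-wise
        self.encoder = nn.Sequential(
            nn.Conv2d(c, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # heatmap logits
        )

    def forward(self, frames):               # frames: (B, k, 3, H, W)
        b, k, c, h, w = frames.shape
        return self.decoder(self.encoder(frames.view(b, k * c, h, w)))

net = BallDetector(history=4)
heatmap = net(torch.randn(2, 4, 3, 128, 128))
print(heatmap.shape)                         # -> torch.Size([2, 1, 128, 128])
```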
Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts
Title | Labeling, Cutting, Grouping: an Efficient Text Line Segmentation Method for Medieval Manuscripts |
Authors | Michele Alberti, Lars Vögtlin, Vinaychandran Pondenkandath, Mathias Seuret, Rolf Ingold, Marcus Liwicki |
Abstract | This paper introduces a new method for text-line extraction by integrating deep-learning based pre-classification and state-of-the-art segmentation methods. Text-line extraction in complex handwritten documents poses a significant challenge, even to the most modern computer vision algorithms. Historical manuscripts are a particularly hard class of documents as they present several forms of noise, such as degradation, bleed-through, interlinear glosses, and elaborated scripts. In this work, we propose a novel method which uses pixel-level semantic segmentation as an intermediate task, followed by a text-line extraction step. We measured the performance of our method on a recent dataset of challenging medieval manuscripts and surpassed state-of-the-art results by reducing the error by 80.7%. Furthermore, we demonstrate the effectiveness of our approach on various other datasets written in different scripts. Hence, our contribution is two-fold. First, we demonstrate that semantic pixel segmentation can be used as a strong denoising pre-processing step before performing text line extraction. Second, we introduce a novel, simple and robust algorithm that leverages the high-quality semantic segmentation to achieve a text-line extraction performance of 99.42% line IU on a challenging dataset. |
Tasks | Denoising, Semantic Segmentation |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.11894v2 |
PDF | https://arxiv.org/pdf/1906.11894v2.pdf |
PWC | https://paperswithcode.com/paper/labeling-cutting-grouping-an-efficient-text |
Repo | https://github.com/DIVA-DIA/Text-Line-Segmentation-Method-for-Medieval-Manuscripts |
Framework | none |
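A heavily simplified sketch of the grouping stage (assuming a text/background mask from the semantic-segmentation network is already available; the threshold and clustering rule are our placeholders, not the paper's algorithm): label connected components and cluster them into lines by vertical centroid.

```python
import numpy as np
from scipy import ndimage

def group_into_lines(text_mask: np.ndarray, line_gap: int = 20):
    labels, n = ndimage.label(text_mask)                # connected components
    centroids = ndimage.center_of_mass(text_mask, labels, range(1, n + 1))
    # Sort components top-to-bottom; start a new line when the vertical jump
    # between consecutive centroids exceeds line_gap pixels.
    order = sorted(range(n), key=lambda i: centroids[i][0])
    lines, current, last_y = [], [], None
    for i in order:
        y = centroids[i][0]
        if last_y is not None and y - last_y > line_gap:
            lines.append(current)
            current = []
        current.append(i + 1)                           # component labels are 1-based
        last_y = y
    if current:
        lines.append(current)
    return lines

mask = np.zeros((100, 200), dtype=np.uint8)
mask[10:20, 5:60] = 1; mask[12:22, 80:150] = 1; mask[60:70, 5:120] = 1
print(group_into_lines(mask))   # -> [[1, 2], [3]]: two components on line 1, one on line 2
```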
PointAtrousGraph: Deep Hierarchical Encoder-Decoder with Point Atrous Convolution for Unorganized 3D Points
Title | PointAtrousGraph: Deep Hierarchical Encoder-Decoder with Point Atrous Convolution for Unorganized 3D Points |
Authors | Liang Pan, Chee-Meng Chew, Gim Hee Lee |
Abstract | Motivated by the success of encoding multi-scale contextual information for image analysis, we propose our PointAtrousGraph (PAG) - a deep permutation-invariant hierarchical encoder-decoder for efficiently exploiting multi-scale edge features in point clouds. Our PAG is constructed by several novel modules, such as Point Atrous Convolution (PAC), Edge-preserved Pooling (EP) and Edge-preserved Unpooling (EU). Similar to atrous convolution, our PAC can effectively enlarge receptive fields of filters and thus densely learn multi-scale point features. Following the idea of non-overlapping max-pooling operations, we propose our EP to preserve critical edge features during subsampling. Correspondingly, our EU modules gradually recover spatial information for edge features. In addition, we introduce chained skip subsampling/upsampling modules that directly propagate edge features to the final stage. In particular, our proposed auxiliary loss functions further improve performance. Experimental results show that our PAG outperforms previous state-of-the-art methods on various 3D semantic perception applications. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09798v2 |
PDF | https://arxiv.org/pdf/1907.09798v2.pdf |
PWC | https://paperswithcode.com/paper/pointatrousgraph-deep-hierarchical-encoder |
Repo | https://github.com/paul007pl/PointAtrousGraph |
Framework | tf |
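A compact sketch of Point Atrous Convolution as we read it from the abstract (dilated k-NN edge features; this is our PyTorch rendering, not the authors' TensorFlow implementation): take the k·d nearest neighbors of each point, keep every d-th one, and max-pool an MLP over the edge features.

```python
import torch
import torch.nn as nn

def point_atrous_conv(x, mlp, k=8, d=2):
    # x: (N, C) point features; dilation d enlarges the receptive field
    dist = torch.cdist(x, x)                          # (N, N) pairwise distances
    idx = dist.topk(k * d + 1, largest=False).indices[:, 1::d][:, :k]  # skip self
    neighbors = x[idx]                                # (N, k, C)
    center = x.unsqueeze(1).expand_as(neighbors)
    edges = torch.cat([center, neighbors - center], dim=-1)  # (N, k, 2C) edge features
    return mlp(edges).max(dim=1).values               # (N, C_out): max over neighbors

mlp = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 64))
points = torch.randn(1024, 3)
features = point_atrous_conv(points, mlp, k=8, d=2)
print(features.shape)   # -> torch.Size([1024, 64])
```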
Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition
Title | Region Attention Networks for Pose and Occlusion Robust Facial Expression Recognition |
Authors | Kai Wang, Xiaojiang Peng, Jianfei Yang, Debin Meng, Yu Qiao |
Abstract | Occlusion and pose variations, which can change facial appearance significantly, are two major obstacles for automatic Facial Expression Recognition (FER). Though automatic FER has made substantial progress in the past few decades, occlusion robustness and pose invariance have received relatively little attention, especially in real-world scenarios. This paper addresses the real-world pose and occlusion robust FER problem with three-fold contributions. First, to stimulate research on FER under real-world occlusions and pose variations, we build several in-the-wild facial expression datasets with manual annotations for the community. Second, we propose a novel Region Attention Network (RAN), to adaptively capture the importance of facial regions for FER under occlusion and pose variation. The RAN aggregates and embeds a variable number of region features produced by a backbone convolutional neural network into a compact fixed-length representation. Last, inspired by the fact that facial expressions are mainly defined by facial action units, we propose a region biased loss to encourage high attention weights for the most important regions. We validate our RAN and region biased loss on both our built test datasets and four popular datasets: FERPlus, AffectNet, RAF-DB, and SFEW. Extensive experiments show that our RAN and region biased loss substantially improve FER performance under occlusion and pose variation. Our method also achieves state-of-the-art results on FERPlus, AffectNet, RAF-DB, and SFEW. Code and the collected test data will be publicly available. |
Tasks | Facial Expression Recognition |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04075v2 |
PDF | https://arxiv.org/pdf/1905.04075v2.pdf |
PWC | https://paperswithcode.com/paper/region-attention-networks-for-pose-and |
Repo | https://github.com/kaiwang960112/Challenge-condition-FER-dataset |
Framework | pytorch |
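A schematic sketch of the region-attention idea (our reading of the abstract, not the released code): score each cropped region's CNN feature with a sigmoid attention weight, aggregate into a fixed-length vector, and bias the best crop's weight above the whole-face weight by a margin. The region layout (index 0 = whole face) and margin value are assumptions.

```python
import torch
import torch.nn as nn

class RegionAttention(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, region_feats):          # (B, R, D) from a shared CNN backbone
        w = self.attn(region_feats)           # (B, R, 1) per-region importance
        pooled = (w * region_feats).sum(1) / w.sum(1).clamp(min=1e-6)
        return pooled, w.squeeze(-1)

def region_biased_loss(weights, full_face_idx=0, margin=0.02):
    # Encourage the best cropped region to outscore the full-face weight by a margin.
    best_crop = weights[:, 1:].max(dim=1).values
    return torch.relu(weights[:, full_face_idx] + margin - best_crop).mean()

ran = RegionAttention()
feats = torch.randn(4, 6, 512)                # 4 faces, 6 regions (index 0 = whole face)
pooled, w = ran(feats)
print(pooled.shape, region_biased_loss(w).item())
```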
Focus-Enhanced Scene Text Recognition with Deformable Convolutions
Title | Focus-Enhanced Scene Text Recognition with Deformable Convolutions |
Authors | Linjie Deng, Yanxiang Gong, Xinchen Lu, Xin Yi, Zheng Ma, Mei Xie |
Abstract | Recently, scene text recognition methods based on deep learning have sprung up in the computer vision area. The existing methods achieve strong performances, but the recognition of irregular text is still challenging due to the various shapes and distorted patterns. When reading words in the real world, we normally do not mentally rectify them but instead adjust our focus and visual field. Similarly, by utilizing deformable convolutional layers whose geometric structures are adjustable, we present an enhanced recognition network that handles irregular text without a rectification step. We conducted a number of experiments; the results on public benchmarks demonstrate the effectiveness of our proposed components and show that our method achieves satisfactory performance. The code will be publicly available at https://github.com/Alpaca07/dtr soon. |
Tasks | Scene Text Recognition |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.10998v2 |
PDF | https://arxiv.org/pdf/1908.10998v2.pdf |
PWC | https://paperswithcode.com/paper/focus-enhanced-scene-text-recognition-with |
Repo | https://github.com/Alpaca07/dtr |
Framework | pytorch |
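A minimal sketch of swapping a standard convolution for a deformable one, using torchvision's `DeformConv2d` (the paper's full recognizer is more involved; the block layout here is our simplification):

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformBlock(nn.Module):
    def __init__(self, cin, cout, k=3):
        super().__init__()
        # A plain conv predicts the 2D sampling offsets for each kernel position.
        self.offset = nn.Conv2d(cin, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(cin, cout, k, padding=k // 2)

    def forward(self, x):
        return self.deform(x, self.offset(x))  # sampling grid bends to the text shape

block = DeformBlock(32, 64)
out = block(torch.randn(1, 32, 32, 100))       # a text-line feature map
print(out.shape)                               # -> torch.Size([1, 64, 32, 100])
```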
Adversarial Self-Defense for Cycle-Consistent GANs
Title | Adversarial Self-Defense for Cycle-Consistent GANs |
Authors | Dina Bashkirova, Ben Usman, Kate Saenko |
Abstract | The goal of unsupervised image-to-image translation is to map images from one domain to another without the ground truth correspondence between the two domains. State-of-the-art methods learn the correspondence using large numbers of unpaired examples from both domains and are based on generative adversarial networks. In order to preserve the semantics of the input image, the adversarial objective is usually combined with a cycle-consistency loss that penalizes incorrect reconstruction of the input image from the translated one. However, if the target mapping is many-to-one, e.g. aerial photos to maps, such a restriction forces the generator to hide information in low-amplitude structured noise that is undetectable by the human eye or by the discriminator. In this paper, we show how such self-attacking behavior of unsupervised translation methods affects their performance and provide two defense techniques. We perform a quantitative evaluation of the proposed techniques and show that making the translation model more robust to the self-adversarial attack increases its generation quality and reconstruction reliability and makes the model less sensitive to low-amplitude perturbations. |
Tasks | Adversarial Attack, Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01517v1 |
PDF | https://arxiv.org/pdf/1908.01517v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-self-defense-for-cycle-consistent |
Repo | https://github.com/dbash/pix2pix_cyclegan_guess_noise |
Framework | pytorch |
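A sketch of a noise-based defense as we understand it from the abstract and the repo name (one plausible rendering, not the exact released losses): corrupt the translated image with low-amplitude noise before the backward mapping, so the generator cannot rely on hidden structured signals to reconstruct the input.

```python
import torch

def cycle_loss_with_noise(G_ab, G_ba, real_a, sigma=0.05):
    fake_b = G_ab(real_a)
    noisy_b = fake_b + sigma * torch.randn_like(fake_b)  # destroy hidden codes
    recon_a = G_ba(noisy_b)
    return (recon_a - real_a).abs().mean()               # L1 cycle-consistency

# Toy usage with identity "generators" standing in for the CycleGAN networks:
G_ab = G_ba = torch.nn.Identity()
print(cycle_loss_with_noise(G_ab, G_ba, torch.randn(2, 3, 64, 64)).item())
```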
MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis
Title | MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis |
Authors | Animesh Karnewar, Oliver Wang |
Abstract | While Generative Adversarial Networks (GANs) have seen huge successes in image synthesis tasks, they are notoriously difficult to adapt to different datasets, in part due to instability during training and sensitivity to hyperparameters. One commonly accepted reason for this instability is that gradients passing from the discriminator to the generator become uninformative when there isn’t enough overlap in the supports of the real and fake distributions. In this work, we propose the Multi-Scale Gradient Generative Adversarial Network (MSG-GAN), a simple but effective technique for addressing this by allowing the flow of gradients from the discriminator to the generator at multiple scales. This technique provides a stable approach for high resolution image synthesis, and serves as an alternative to the commonly used progressive growing technique. We show that MSG-GAN converges stably on a variety of image datasets of different sizes, resolutions and domains, as well as different types of loss functions and architectures, all with the same set of fixed hyperparameters. When compared to state-of-the-art GANs, our approach matches or exceeds the performance in most of the cases we tried. |
Tasks | Image Generation |
Published | 2019-03-14 |
URL | https://arxiv.org/abs/1903.06048v3 |
PDF | https://arxiv.org/pdf/1903.06048v3.pdf |
PWC | https://paperswithcode.com/paper/msg-gan-multi-scale-gradients-gan-for-more |
Repo | https://github.com/manicman1999/StyleGAN-Tensorflow-2.0 |
Framework | tf |
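A skeleton of the multi-scale gradient idea (our reduction of the abstract, with illustrative layer sizes): the generator emits an RGB image at every resolution, and the discriminator would ingest the matching scale at each of its stages, so gradients flow between the two networks at all scales.

```python
import torch
import torch.nn as nn

class MSGGenerator(nn.Module):
    def __init__(self, z_dim=64, ch=64):
        super().__init__()
        self.stem = nn.ConvTranspose2d(z_dim, ch, 4)           # 1x1 -> 4x4
        self.up = nn.ModuleList(
            [nn.ConvTranspose2d(ch, ch, 4, 2, 1) for _ in range(3)])
        self.to_rgb = nn.ModuleList([nn.Conv2d(ch, 3, 1) for _ in range(4)])

    def forward(self, z):
        x = self.stem(z.view(z.size(0), -1, 1, 1))
        outs = [self.to_rgb[0](x)]                             # 4x4 RGB
        for up, rgb in zip(self.up, self.to_rgb[1:]):
            x = torch.relu(up(x))
            outs.append(rgb(x))                                # 8x8, 16x16, 32x32 RGB
        return outs                                            # every scale goes to D

g = MSGGenerator()
for img in g(torch.randn(2, 64)):
    print(tuple(img.shape))   # (2,3,4,4) (2,3,8,8) (2,3,16,16) (2,3,32,32)
```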
Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models
Title | Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models |
Authors | Grusha Prasad, Marten van Schijndel, Tal Linzen |
Abstract | Neural language models (LMs) perform well on tasks that require sensitivity to syntactic structure. Drawing on the syntactic priming paradigm from psycholinguistics, we propose a novel technique to analyze the representations that enable such success. By establishing a gradient similarity metric between structures, this technique allows us to reconstruct the organization of the LMs’ syntactic representational space. We use this technique to demonstrate that LSTM LMs’ representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that the LMs track abstract properties of the sentence. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10579v1 |
PDF | https://arxiv.org/pdf/1909.10579v1.pdf |
PWC | https://paperswithcode.com/paper/using-priming-to-uncover-the-organization-of |
Repo | https://github.com/grushaprasad/RNN-Priming |
Framework | none |
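A sketch of the priming-as-adaptation recipe this line of work builds on (our condensation; the learning rate, step count, and stimuli are placeholders): briefly fine-tune the LM on "prime" sentences, then measure how much the target sentence's surprisal drops.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")

def surprisal(model, sentence):
    ids = tok(sentence, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        loss = model(ids, labels=ids).loss     # mean NLL per predicted token
    return loss.item() * (ids.shape[1] - 1)    # total surprisal in nats

def adapt(model, primes, lr=1e-5, steps=1):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        for s in primes:
            ids = tok(s, return_tensors="pt")["input_ids"]
            model(ids, labels=ids).loss.backward()
            opt.step()
            opt.zero_grad()

target = "The author that the critics praised was talented."
lm = GPT2LMHeadModel.from_pretrained("gpt2")
before = surprisal(lm, target)
adapt(lm, ["The actor that the fans admired was humble."])   # same structure as target
print(before - surprisal(lm, target))   # > 0 means the prime structure helped
```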
Invariant Transform Experience Replay
Title | Invariant Transform Experience Replay |
Authors | Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Juan Rojas, Paul Weng |
Abstract | Deep Reinforcement Learning (RL) is a promising approach for adaptive robot control, but its application to robotics is currently hindered by high sample requirements. To alleviate this issue, we propose to exploit the symmetries present in robotic tasks. Intuitively, symmetries from observed trajectories define transformations that leave the space of feasible RL trajectories invariant and can be used to generate new feasible trajectories for training. Based on this data augmentation idea, we formulate a general framework, called Invariant Transform Experience Replay, which we instantiate with two techniques. First, Kaleidoscope Experience Replay exploits reflectional symmetries. Second, Goal-augmented Experience Replay takes advantage of lax goal definitions. In the Fetch tasks from OpenAI Gym, our experimental results show significant increases in learning rates and success rates. In particular, we attain 13×, 3×, and 5× speedups in the pushing, sliding, and pick-and-place tasks, respectively, in the multi-goal setting. Invariant transformations on RL trajectories are a promising methodology to speed up learning in deep RL. |
Tasks | Data Augmentation |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10707v4 |
PDF | https://arxiv.org/pdf/1909.10707v4.pdf |
PWC | https://paperswithcode.com/paper/invariant-transform-experience-replay |
Repo | https://github.com/YijiongLin/ITER_KER_GER |
Framework | none |
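A minimal sketch of the Kaleidoscope Experience Replay idea (our reading; the state layout and symmetry plane are assumptions): reflect stored transitions across a robot-frame symmetry plane, here y = 0, to obtain extra feasible transitions essentially for free.

```python
import numpy as np

def reflect_y(vec):
    out = vec.copy()
    out[1] = -out[1]          # assumes index 1 is the y-coordinate
    return out

def augment(transition):
    s, a, r, s2, goal = transition
    reflected = (reflect_y(s), reflect_y(a), r, reflect_y(s2), reflect_y(goal))
    return [transition, reflected]    # both go into the replay buffer

t = (np.array([0.5, 0.2, 0.1]), np.array([0.0, 0.3, 0.0]),
     -1.0, np.array([0.5, 0.4, 0.1]), np.array([0.6, -0.1, 0.1]))
for s, a, r, s2, g in augment(t):
    print(s, a, g)
```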
Use of a Capsule Network to Detect Fake Images and Videos
Title | Use of a Capsule Network to Detect Fake Images and Videos |
Authors | Huy H. Nguyen, Junichi Yamagishi, Isao Echizen |
Abstract | The revolution in computer hardware, especially in graphics processing units and tensor processing units, has enabled significant advances in computer graphics and artificial intelligence algorithms. In addition to their many beneficial applications in daily life and business, computer-generated/manipulated images and videos can be used for malicious purposes that violate security systems, privacy, and social trust. The deepfake phenomenon and its variations enable a normal user to use his or her personal computer to easily create fake videos of anybody from a short real online video. Several countermeasures have been introduced to deal with attacks using such videos. However, most of them are targeted at certain domains and are ineffective when applied to other domains or new attacks. In this paper, we introduce a capsule network that can detect various kinds of attacks, from presentation attacks using printed images and replayed videos to attacks using fake videos created using deep learning. It uses far fewer parameters than traditional convolutional neural networks while achieving similar performance. Moreover, we explain, for the first time in the literature, the theory behind the application of capsule networks to the forensics problem through detailed analysis and visualization. |
Tasks | Detect Forged Images And Videos |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12467v2 |
PDF | https://arxiv.org/pdf/1910.12467v2.pdf |
PWC | https://paperswithcode.com/paper/use-of-a-capsule-network-to-detect-fake |
Repo | https://github.com/nii-yamagishilab/Capsule-Forensics |
Framework | pytorch |
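A compact sketch of the capsule machinery involved, the squash nonlinearity and routing-by-agreement (this is the generic mechanism from Sabour et al., not the authors' forensics architecture; the capsule dimensions are illustrative):

```python
import torch

def squash(v, dim=-1, eps=1e-8):
    n2 = (v ** 2).sum(dim, keepdim=True)
    return (n2 / (1 + n2)) * v / (n2.sqrt() + eps)   # shrink short vectors toward 0

def route(u_hat, iters=3):
    # u_hat: (B, in_caps, out_caps, D) prediction vectors from lower capsules
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)
    for _ in range(iters):
        c = torch.softmax(b, dim=2).unsqueeze(-1)    # coupling coefficients
        s = (c * u_hat).sum(1)                       # (B, out_caps, D)
        v = squash(s)
        b = b + (u_hat * v.unsqueeze(1)).sum(-1)     # agreement update
    return v                                         # output capsule poses

out = route(torch.randn(2, 32, 2, 8))   # 2 output capsules, e.g. "real" vs. "fake"
print(out.shape, out.norm(dim=-1))      # capsule length acts as class probability
```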