Paper Group AWR 22
Claim Extraction in Biomedical Publications using Deep Discourse Model and Transfer Learning
Title | Claim Extraction in Biomedical Publications using Deep Discourse Model and Transfer Learning |
Authors | Titipat Achakulvisut, Chandra Bhagavatula, Daniel Acuna, Konrad Kording |
Abstract | Claims are a fundamental unit of scientific discourse. The exponential growth in the number of scientific publications makes automatic claim extraction an important problem for researchers who are overwhelmed by this information overload. Such an automated claim extraction system is useful for both manual and programmatic exploration of scientific knowledge. In this paper, we introduce a new dataset of 1,500 scientific abstracts from the biomedical domain with expert annotations for each sentence indicating whether the sentence presents a scientific claim. We introduce a new model for claim extraction and compare it to several baseline models including rule-based and deep learning techniques. Moreover, we show that a transfer learning approach with a fine-tuning step allows us to improve performance by leveraging a large discourse-annotated dataset. Our final model increases the F1-score by over 14 percentage points compared to a baseline model without transfer learning. We release a publicly accessible tool for discourse and claim prediction along with an annotation tool. We discuss further applications beyond biomedical literature. |
Tasks | Transfer Learning |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00962v2 |
PDF | https://arxiv.org/pdf/1907.00962v2.pdf |
PWC | https://paperswithcode.com/paper/claim-extraction-in-biomedical-publications |
Repo | https://github.com/titipata/detecting-scientific-claim |
Framework | pytorch |
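The transfer-learning recipe the abstract describes — pretrain a sentence classifier on a large discourse-annotated corpus, then swap the output head and fine-tune on claim labels — can be sketched as follows. This is a minimal illustration, not the authors' code; the Bi-LSTM encoder, the dimensions, and the five discourse roles are assumptions.

```python
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    """Bi-LSTM sentence encoder with a swappable classification head."""
    def __init__(self, emb_dim=300, hidden=256, n_classes=5):
        super().__init__()
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                   # x: (batch, seq_len, emb_dim)
        _, (h, _) = self.encoder(x)         # h: (2, batch, hidden)
        return self.head(torch.cat([h[0], h[1]], dim=-1))

# Step 1: pretrain on the large discourse-annotated corpus
# (n_classes=5 assumes five discourse roles; adjust to the corpus).
model = SentenceClassifier(n_classes=5)
# ... train with nn.CrossEntropyLoss() on discourse labels ...

# Step 2: swap the head for binary claim labels and fine-tune,
# keeping the pretrained encoder weights.
model.head = nn.Linear(2 * 256, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # small LR for fine-tuning
```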
STConvS2S: Spatiotemporal Convolutional Sequence to Sequence Network for Weather Forecasting
Title | STConvS2S: Spatiotemporal Convolutional Sequence to Sequence Network for Weather Forecasting |
Authors | Rafaela C. Nascimento, Yania M. Souto, Eduardo Ogasawara, Fabio Porto, Eduardo Bezerra |
Abstract | Applying machine learning models to meteorological data brings many opportunities to the Geosciences field, such as predicting future weather conditions more accurately. In recent years, modeling meteorological data with deep neural networks has become a relevant area of investigation. These works apply either recurrent neural networks (RNNs) or some hybrid approach mixing RNNs and convolutional neural networks (CNNs). In this work, we propose STConvS2S (short for Spatiotemporal Convolutional Sequence to Sequence Network), a new deep learning architecture built for learning both spatial and temporal data dependencies in weather data, using fully convolutional layers. Computational experiments using observations of air temperature and rainfall show that our architecture captures spatiotemporal context and outperforms baseline models and the state-of-the-art architecture for the weather forecasting task. |
Tasks | Weather Forecasting |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.00134v3 |
PDF | https://arxiv.org/pdf/1912.00134v3.pdf |
PWC | https://paperswithcode.com/paper/stconvs2s-spatiotemporal-convolutional |
Repo | https://github.com/MLRG-CEFET-RJ/stconvs2s |
Framework | pytorch |
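A minimal sketch of the fully convolutional idea: stack 3D convolutions factorized into a temporal kernel and a spatial kernel over (time, lat, lon) grids. Layer sizes are illustrative, and the real STConvS2S adds causal constraints and a proper encoder-decoder; this shows only the building block.

```python
import torch
import torch.nn as nn

class STBlock(nn.Module):
    """One block: a temporal 3D conv followed by a spatial 3D conv."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.temporal = nn.Conv3d(c_in, c_out, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.spatial = nn.Conv3d(c_out, c_out, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.act = nn.ReLU()

    def forward(self, x):                    # x: (batch, channels, time, lat, lon)
        return self.act(self.spatial(self.act(self.temporal(x))))

model = nn.Sequential(STBlock(1, 32), STBlock(32, 32), nn.Conv3d(32, 1, kernel_size=1))
x = torch.randn(8, 1, 5, 32, 32)             # 5 past 32x32 air-temperature grids
y_hat = model(x)                             # (8, 1, 5, 32, 32): 5 forecast steps
```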
Sensitivity Analysis of Deep Neural Networks
Title | Sensitivity Analysis of Deep Neural Networks |
Authors | Hai Shu, Hongtu Zhu |
Abstract | Deep neural networks (DNNs) have achieved superior performance in various prediction tasks, but can be very vulnerable to adversarial examples or perturbations. Therefore, it is crucial to measure the sensitivity of DNNs to various forms of perturbations in real applications. We introduce a novel perturbation manifold and its associated influence measure to quantify the effects of various perturbations on DNN classifiers. Such perturbations include various external and internal perturbations to input samples and network parameters. The proposed measure is motivated by information geometry and provides desirable invariance properties. We demonstrate that our influence measure is useful for four model building tasks: detecting potential ‘outliers’, analyzing the sensitivity of model architectures, comparing network sensitivity between training and test sets, and locating vulnerable areas. Experiments show reasonably good performance of the proposed measure for the popular DNN models ResNet50 and DenseNet121 on CIFAR10 and MNIST datasets. |
Tasks | |
Published | 2019-01-22 |
URL | http://arxiv.org/abs/1901.07152v1 |
PDF | http://arxiv.org/pdf/1901.07152v1.pdf |
PWC | https://paperswithcode.com/paper/sensitivity-analysis-of-deep-neural-networks |
Repo | https://github.com/shu-hai/SA_DNN |
Framework | tf |
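The paper's influence measure is built on a perturbation manifold from information geometry; the sketch below substitutes a much simpler first-order proxy — the gradient norm of the loss with respect to the input — just to make the "sensitivity score per sample" idea concrete. It is not the authors' measure.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50(num_classes=10).eval()   # e.g., a CIFAR10 classifier

def gradient_sensitivity(x, y):
    """First-order proxy: L2 norm of d(loss)/d(input) per sample.
    (The paper's measure is information-geometric; this is a simplification.)"""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return grad.flatten(1).norm(dim=1)           # one score per input

x = torch.randn(4, 3, 32, 32)                    # CIFAR10-sized inputs
y = torch.randint(0, 10, (4,))
print(gradient_sensitivity(x, y))                # larger = more sensitive
```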
Scaling and Benchmarking Self-Supervised Visual Representation Learning
Title | Scaling and Benchmarking Self-Supervised Visual Representation Learning |
Authors | Priya Goyal, Dhruv Mahajan, Abhinav Gupta, Ishan Misra |
Abstract | Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning - the ability to scale to large amounts of data because self-supervision requires no manual labels. In this work, we revisit this principle and scale two popular self-supervised approaches to 100 million images. We show that by scaling on various axes (including data size and problem ‘hardness’), one can largely match or even exceed the performance of supervised pre-training on a variety of tasks such as object detection, surface normal estimation (3D) and visual navigation using reinforcement learning. Scaling these methods also provides many interesting insights into the limitations of current self-supervised techniques and evaluations. We conclude that current self-supervised methods are not ‘hard’ enough to take full advantage of large scale data and do not seem to learn effective high level semantic representations. We also introduce an extensive benchmark across 9 different datasets and tasks. We believe that such a benchmark along with comparable evaluation settings is necessary to make meaningful progress. Code is at: https://github.com/facebookresearch/fair_self_supervision_benchmark. |
Tasks | Object Detection, Representation Learning, Visual Navigation |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01235v2 |
PDF | https://arxiv.org/pdf/1905.01235v2.pdf |
PWC | https://paperswithcode.com/paper/scaling-and-benchmarking-self-supervised |
Repo | https://github.com/facebookresearch/fair_self_supervision_benchmark |
Framework | pytorch |
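Benchmarks of this kind typically score self-supervised features by freezing the pretrained trunk and training only a linear classifier on top. A hedged sketch of that protocol (the trunk here is randomly initialized as a stand-in for pretrained weights):

```python
import torch
import torch.nn as nn
import torchvision.models as models

trunk = models.resnet50()            # stand-in for a self-supervised trunk
trunk.fc = nn.Identity()             # expose the 2048-d pooled features
for p in trunk.parameters():
    p.requires_grad = False          # freeze: only the probe is trained
trunk.eval()

probe = nn.Linear(2048, 1000)
opt = torch.optim.SGD(probe.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():
        feats = trunk(images)        # frozen features
    loss = loss_fn(probe(feats), labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

print(train_step(torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,))))
```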
Global Reactions to the Cambridge Analytica Scandal: An Inter-Language Social Media Study
Title | Global Reactions to the Cambridge Analytica Scandal: An Inter-Language Social Media Study |
Authors | Felipe González, Yihan Yu, Andrea Figueroa, Claudia López, Cecilia Aragon |
Abstract | Currently, there is a limited understanding of how data privacy concerns vary across the world. The Cambridge Analytica scandal triggered a wide-ranging discussion on social media about user data collection and use practices. We conducted an inter-language study of this online conversation to compare how people speaking different languages react to data privacy breaches. We collected tweets about the scandal written in Spanish and English between April and July 2018. We used the Meaning Extraction Method in both datasets to identify their main topics. They reveal a similar emphasis on Zuckerberg’s hearing in the US Congress and the scandal’s impact on political issues. However, our analysis also shows that while English speakers tend to attribute responsibilities to companies, Spanish speakers are more likely to connect them to people. These findings show the potential of inter-language comparisons of social media data to deepen the understanding of cultural differences in data privacy perspectives. |
Tasks | |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06213v1 |
PDF | https://arxiv.org/pdf/1910.06213v1.pdf |
PWC | https://paperswithcode.com/paper/global-reactions-to-the-cambridge-analytica |
Repo | https://github.com/gonzalezf/LA-WEB-Paper |
Framework | none |
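The Meaning Extraction Method the authors use reduces to factor-analyzing a binarized document-term matrix. A toy sketch with scikit-learn (varimax rotation requires a recent version; the corpus and parameters are illustrative):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import FactorAnalysis

tweets = [
    "zuckerberg testifies before congress about facebook data",
    "cambridge analytica used facebook data in political campaigns",
    "congress questions facebook over user privacy",
]

# MEM step 1: binary presence/absence of content words per document.
vec = CountVectorizer(stop_words="english")
X = (vec.fit_transform(tweets).toarray() > 0).astype(float)

# MEM step 2: factor-analyze the binary matrix (varimax rotation groups
# co-occurring words into candidate themes).
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
print(vec.get_feature_names_out())
print(fa.components_)   # rows = themes, columns = word loadings
```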
Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation
Title | Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation |
Authors | Ori Shapira, David Gabay, Yang Gao, Hadar Ronen, Ramakanth Pasunuru, Mohit Bansal, Yael Amsterdamer, Ido Dagan |
Abstract | Conducting a manual evaluation is considered an essential part of summary evaluation methodology. Traditionally, the Pyramid protocol, which exhaustively compares system summaries to references, has been perceived as very reliable, providing objective scores. Yet, due to the high cost of the Pyramid method and the expertise required, researchers have resorted to cheaper and less thorough manual evaluation methods, such as Responsiveness and pairwise comparison, attainable via crowdsourcing. We revisit the Pyramid approach, proposing a lightweight sampling-based version that is crowdsourcable. We analyze the performance of our method in comparison to original expert-based Pyramid evaluations, showing higher correlation relative to the common Responsiveness method. We release our crowdsourced Summary-Content-Units, along with all crowdsourcing scripts, for future evaluations. |
Tasks | |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05929v1 |
PDF | http://arxiv.org/pdf/1904.05929v1.pdf |
PWC | https://paperswithcode.com/paper/crowdsourcing-lightweight-pyramids-for-manual |
Repo | https://github.com/OriShapira/LitePyramids |
Framework | none |
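The sampling-based idea fits in a few lines: draw a fixed-size sample of Summary Content Units and score a system summary by the fraction a crowd worker judges present. The function below illustrates that scoring scheme, not the authors' exact protocol; `present_fn` stands in for the crowdsourced judgment.

```python
import random

def lite_pyramid_score(scu_pool, present_fn, sample_size=32, seed=0):
    """Score a system summary by the fraction of sampled Summary Content
    Units a crowd worker marks as present in it."""
    rng = random.Random(seed)
    sample = rng.sample(scu_pool, min(sample_size, len(scu_pool)))
    return sum(bool(present_fn(scu)) for scu in sample) / len(sample)

scus = [f"SCU-{i}" for i in range(100)]        # units drawn from reference summaries
score = lite_pyramid_score(scus, lambda scu: int(scu.split("-")[1]) % 2 == 0)
print(f"summary score: {score:.2f}")
```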
Discriminative Adversarial Domain Adaptation
Title | Discriminative Adversarial Domain Adaptation |
Authors | Hui Tang, Kui Jia |
Abstract | Given labeled instances on a source domain and unlabeled ones on a target domain, unsupervised domain adaptation aims to learn a task classifier that can well classify target instances. Recent advances rely on domain-adversarial training of deep networks to learn domain-invariant features. However, due to a mode-collapse issue induced by the separate design of task and domain classifiers, these methods are limited in aligning the joint distributions of feature and category across domains. To overcome this, we propose a novel adversarial learning method termed Discriminative Adversarial Domain Adaptation (DADA). Based on an integrated category and domain classifier, DADA has a novel adversarial objective that encourages a mutually inhibitory relation between category and domain predictions for any input instance. We show that under practical conditions, it defines a minimax game that can promote joint distribution alignment. Beyond traditional closed-set domain adaptation, we also extend DADA to the extremely challenging settings of partial and open-set domain adaptation. Experiments show the efficacy of our proposed methods, and we achieve a new state of the art for all three settings on benchmark datasets. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12036v2 |
PDF | https://arxiv.org/pdf/1911.12036v2.pdf |
PWC | https://paperswithcode.com/paper/discriminative-adversarial-domain-adaptation |
Repo | https://github.com/huitangtang/DADA-AAAI2020 |
Framework | pytorch |
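A loose sketch of the integrated category-domain classifier: one softmax over 2K outputs, where the first K columns read "source, category k" and the last K "target, category k", so category and domain predictions inhibit each other. This is a simplified reading of the abstract, not DADA's exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 10                                    # number of categories
feat = nn.Sequential(nn.Linear(256, 128), nn.ReLU())
clf = nn.Linear(128, 2 * K)               # [source categories | target categories]

def dada_like_losses(xs, ys, xt):
    ps = F.softmax(clf(feat(xs)), dim=1)
    pt = F.softmax(clf(feat(xt)), dim=1)
    # Category loss: a source sample with label y should land in column y.
    cat_loss = F.nll_loss(torch.log(ps + 1e-8), ys)
    # Domain loss: the classifier pushes source mass onto the first half
    # and target mass onto the second half ...
    dom_loss = -(torch.log(ps[:, :K].sum(1) + 1e-8).mean()
                 + torch.log(pt[:, K:].sum(1) + 1e-8).mean())
    # ... while the feature extractor is updated to *maximize* dom_loss
    # (e.g., via a gradient-reversal layer), giving the minimax game.
    return cat_loss, dom_loss

cat_l, dom_l = dada_like_losses(torch.randn(8, 256),
                                torch.randint(0, K, (8,)),
                                torch.randn(8, 256))
```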
Quantum Wasserstein Generative Adversarial Networks
Title | Quantum Wasserstein Generative Adversarial Networks |
Authors | Shouvanik Chakrabarti, Yiming Huang, Tongyang Li, Soheil Feizi, Xiaodi Wu |
Abstract | The study of quantum generative models is well-motivated, not only because of its importance in quantum machine learning and quantum chemistry but also because of the perspective of its implementation on near-term quantum machines. Inspired by previous studies on the adversarial training of classical and quantum generative models, we propose the first design of quantum Wasserstein Generative Adversarial Networks (WGANs), which has been shown to improve the robustness and the scalability of the adversarial training of quantum generative models even on noisy quantum hardware. Specifically, we propose a definition of the Wasserstein semimetric between quantum data, which inherits a few key theoretical merits of its classical counterpart. We also demonstrate how to turn the quantum Wasserstein semimetric into a concrete design of quantum WGANs that can be efficiently implemented on quantum machines. Our numerical study, via classical simulation of quantum systems, shows the more robust and scalable numerical performance of our quantum WGANs over other quantum GAN proposals. As a surprising application, our quantum WGAN has been used to generate a 3-qubit quantum circuit of ~50 gates that well approximates a 3-qubit 1-d Hamiltonian simulation circuit that requires over 10k gates using standard techniques. |
Tasks | Quantum Machine Learning |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1911.00111v1 |
PDF | https://arxiv.org/pdf/1911.00111v1.pdf |
PWC | https://paperswithcode.com/paper/quantum-wasserstein-generative-adversarial |
Repo | https://github.com/yiminghwang/qWGAN |
Framework | none |
The PlayStation Reinforcement Learning Environment (PSXLE)
Title | The PlayStation Reinforcement Learning Environment (PSXLE) |
Authors | Carlos Purves, Cătălina Cangea, Petar Veličković |
Abstract | We propose a new benchmark environment for evaluating Reinforcement Learning (RL) algorithms: the PlayStation Learning Environment (PSXLE), a PlayStation emulator modified to expose a simple control API that enables rich game-state representations. We argue that the PlayStation serves as a suitable progression for agent evaluation and propose a framework for such an evaluation. We build an action-driven abstraction for a PlayStation game with support for the OpenAI Gym interface and demonstrate its use by running OpenAI Baselines. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06101v1 |
PDF | https://arxiv.org/pdf/1912.06101v1.pdf |
PWC | https://paperswithcode.com/paper/the-playstation-reinforcement-learning |
Repo | https://github.com/carlospurves/psxle |
Framework | none |
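Since PSXLE exposes the OpenAI Gym interface, driving it looks like any other Gym environment. The sketch below uses the classic (pre-0.26) Gym API, matching the paper's era, and a stand-in environment id; consult the repo for the actual PSXLE registration names.

```python
import gym
# import psxle  # hypothetical: would register PlayStation environments

env = gym.make("CartPole-v1")   # stand-in; a PSXLE env id would go here
obs = env.reset()
done, total = False, 0.0
while not done:
    action = env.action_space.sample()          # replace with a trained policy
    obs, reward, done, info = env.step(action)
    total += reward
print("episode return:", total)
env.close()
```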
BERT for Joint Intent Classification and Slot Filling
Title | BERT for Joint Intent Classification and Slot Filling |
Authors | Qian Chen, Zhu Zhuo, Wen Wang |
Abstract | Intent classification and slot filling are two essential tasks for natural language understanding. They often suffer from small-scale human-labeled training data, resulting in poor generalization capability, especially for rare words. Recently, a new language representation model, BERT (Bidirectional Encoder Representations from Transformers), has made it possible to pre-train deep bidirectional representations on large-scale unlabeled corpora, yielding state-of-the-art models for a wide variety of natural language processing tasks after simple fine-tuning. However, little effort has gone into exploring BERT for natural language understanding. In this work, we propose a joint intent classification and slot filling model based on BERT. Experimental results demonstrate that our proposed model achieves significant improvement on intent classification accuracy, slot filling F1, and sentence-level semantic frame accuracy on several public benchmark datasets, compared to attention-based recurrent neural network models and slot-gated models. |
Tasks | Intent Classification, Slot Filling |
Published | 2019-02-28 |
URL | http://arxiv.org/abs/1902.10909v1 |
PDF | http://arxiv.org/pdf/1902.10909v1.pdf |
PWC | https://paperswithcode.com/paper/bert-for-joint-intent-classification-and-slot |
Repo | https://github.com/asadovsky/nn |
Framework | tf |
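The joint setup is straightforward to sketch with the `transformers` library: the pooled [CLS] vector feeds an intent head while per-token states feed a slot-tagging head, and the two cross-entropy losses are summed. The model name and head sizes below are illustrative, not the authors' configuration.

```python
import torch.nn as nn
from transformers import BertModel

class JointBert(nn.Module):
    """The [CLS] vector predicts the intent; per-token states predict slots."""
    def __init__(self, n_intents, n_slots):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        h = self.bert.config.hidden_size
        self.intent_head = nn.Linear(h, n_intents)
        self.slot_head = nn.Linear(h, n_slots)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        intent_logits = self.intent_head(out.pooler_output)   # (B, n_intents)
        slot_logits = self.slot_head(out.last_hidden_state)   # (B, T, n_slots)
        return intent_logits, slot_logits

# Joint training sums the two losses:
# loss = ce(intent_logits, intent_labels) \
#      + ce(slot_logits.flatten(0, 1), slot_labels.flatten())
```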
Ego-Pose Estimation and Forecasting as Real-Time PD Control
Title | Ego-Pose Estimation and Forecasting as Real-Time PD Control |
Authors | Ye Yuan, Kris Kitani |
Abstract | We propose the use of a proportional-derivative (PD) control based policy learned via reinforcement learning (RL) to estimate and forecast 3D human pose from egocentric videos. The method learns directly from unsegmented egocentric videos and motion capture data consisting of various complex human motions (e.g., crouching, hopping, bending, and motion transitions). We propose a video-conditioned recurrent control technique to forecast physically-valid and stable future motions of arbitrary length. We also introduce a value function based fail-safe mechanism which enables our method to run as a single pass algorithm over the video data. Experiments with both controlled and in-the-wild data show that our approach outperforms previous art in both quantitative metrics and visual quality of the motions, and is also robust enough to transfer directly to real-world scenarios. Additionally, our time analysis shows that the combined use of our pose estimation and forecasting can run at 30 FPS, making it suitable for real-time applications. |
Tasks | Egocentric Pose Estimation, Human Pose Forecasting, Motion Capture, Pose Estimation |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03173v2 |
PDF | https://arxiv.org/pdf/1906.03173v2.pdf |
PWC | https://paperswithcode.com/paper/ego-pose-estimation-and-forecasting-as-real |
Repo | https://github.com/Khrylx/EgoPose |
Framework | pytorch |
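The core control idea is plain PD tracking: the learned policy outputs target joint angles, and fixed gains convert the tracking error into joint torques. A minimal sketch with illustrative gains:

```python
import numpy as np

def pd_torque(q, qdot, q_target, kp=300.0, kd=30.0):
    """PD control: convert the tracking error into joint torques.
    The kp/kd gains are illustrative."""
    return kp * (q_target - q) - kd * qdot

q = np.zeros(3)                            # current joint angles (toy 3-DoF limb)
qdot = np.zeros(3)                         # current joint velocities
q_target = np.array([0.2, -0.1, 0.05])     # output of the learned policy
print(pd_torque(q, qdot, q_target))
```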
See and Read: Detecting Depression Symptoms in Higher Education Students Using Multimodal Social Media Data
Title | See and Read: Detecting Depression Symptoms in Higher Education Students Using Multimodal Social Media Data |
Authors | Paulo Mann, Aline Paes, Elton H. Matsushima |
Abstract | Mental disorders such as depression and anxiety have been increasing at alarming rates in the worldwide population. Notably, major depressive disorder has become a common problem among higher education students, aggravated, and perhaps even occasioned, by the academic pressure they face. While the reasons for this alarming situation remain unclear (although widely investigated), students already facing this problem must receive treatment, and the first step is screening for symptoms. Traditionally, screening relies on clinical consultations or questionnaires. Nowadays, however, the data shared on social media is a ubiquitous source that can be used to detect depression symptoms even when the student cannot afford or seek professional care. Previous works have relied on social media data to detect depression in the general population, usually focusing on either posted images or texts, or on metadata. In this work, we focus on detecting the severity of depression symptoms in higher education students by comparing deep learning models to feature-engineering models induced from both the pictures and the captions posted on Instagram. The experimental results show that students with a BDI score of 20 or higher can be detected with a recall of 0.92 and a precision of 0.69 in the best case, reached by a fusion model. Our findings show the potential of large-scale depression screening, which could shed light on at-risk students. |
Tasks | Feature Engineering |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01131v2 |
PDF | https://arxiv.org/pdf/1912.01131v2.pdf |
PWC | https://paperswithcode.com/paper/see-and-read-detecting-depression-symptoms-in |
Repo | https://github.com/paulomann/ReadOrSee |
Framework | pytorch |
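A toy sketch of the fusion idea: embed the picture and its caption separately, concatenate, and classify symptom severity. The dimensions are illustrative, and this is only one of the variants the paper compares.

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Concatenate an image embedding and a caption embedding, then classify."""
    def __init__(self, img_dim=2048, txt_dim=768, n_classes=2):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 256), nn.ReLU(), nn.Linear(256, n_classes))

    def forward(self, img_emb, txt_emb):
        return self.fuse(torch.cat([img_emb, txt_emb], dim=-1))

logits = LateFusion()(torch.randn(4, 2048), torch.randn(4, 768))
print(logits.shape)   # torch.Size([4, 2])
```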
A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models
Title | A Generalized Framework of Sequence Generation with Application to Undirected Sequence Models |
Authors | Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho |
Abstract | Undirected neural sequence models such as BERT (Devlin et al., 2019) have received renewed interest due to their success on discriminative natural language understanding tasks such as question-answering and natural language inference. The problem of generating sequences directly from these models has received relatively little attention, in part because generating from undirected models departs significantly from conventional monotonic generation in directed sequence models. We investigate this problem by proposing a generalized model of sequence generation that unifies decoding in directed and undirected models. The proposed framework models the process of generation rather than the resulting sequence, and under this framework, we derive various neural sequence models as special cases, such as autoregressive, semi-autoregressive, and refinement-based non-autoregressive models. This unification enables us to adapt decoding algorithms originally developed for directed sequence models to undirected sequence models. We demonstrate this by evaluating various handcrafted and learned decoding strategies on a BERT-like machine translation model (Lample & Conneau, 2019). The proposed approach achieves constant-time translation results on par with linear-time translation results from the same undirected sequence model, while both are competitive with the state-of-the-art on WMT’14 English-German translation. |
Tasks | Machine Translation, Natural Language Inference, Question Answering |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12790v2 |
PDF | https://arxiv.org/pdf/1905.12790v2.pdf |
PWC | https://paperswithcode.com/paper/a-generalized-framework-of-sequence |
Repo | https://github.com/nyu-dl/dl4mt-seqgen |
Framework | pytorch |
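One strategy the framework covers is iterative refinement decoding from an undirected model: start from an all-mask sequence, predict every position, then re-mask the least confident tokens and repeat with a shrinking mask budget. A toy sketch (the model interface is hypothetical):

```python
import torch

def mask_predict(model, length, mask_id, iterations=4):
    """Iterative refinement: predict all positions, keep the most confident,
    re-mask the rest, and repeat with a shrinking mask budget."""
    tokens = torch.full((1, length), mask_id, dtype=torch.long)
    for t in range(iterations):
        logits = model(tokens)                      # (1, length, vocab)
        probs, preds = logits.softmax(-1).max(-1)   # confidence and argmax
        tokens = preds
        n_mask = length * (iterations - t - 1) // iterations
        if n_mask > 0:
            worst = probs[0].argsort()[:n_mask]     # least confident positions
            tokens[0, worst] = mask_id
    return tokens

dummy = lambda toks: torch.randn(toks.shape[0], toks.shape[1], 100)  # stand-in model
print(mask_predict(dummy, length=6, mask_id=0))
```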
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Title | Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA |
Authors | Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach |
Abstract | Many visual scenes contain text that carries crucial information, and it is thus essential to understand text in images for downstream reasoning tasks. For example, a “deep water” label on a warning sign warns people about the danger in the scene. Recent work has explored the TextVQA task, which requires reading and understanding text in images to answer a question. However, existing approaches for TextVQA are mostly based on custom pairwise fusion mechanisms between pairs of modalities and are restricted to a single prediction step by casting TextVQA as a classification task. In this work, we propose a novel model for the TextVQA task based on a multimodal transformer architecture accompanied by a rich representation for text in images. Our model naturally fuses different modalities homogeneously by embedding them into a common semantic space where self-attention is applied to model inter- and intra-modality context. Furthermore, it enables iterative answer decoding with a dynamic pointer network, allowing the model to form an answer through multi-step prediction instead of one-step classification. Our model outperforms existing approaches on three benchmark datasets for the TextVQA task by a large margin. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06258v3 |
PDF | https://arxiv.org/pdf/1911.06258v3.pdf |
PWC | https://paperswithcode.com/paper/iterative-answer-prediction-with-pointer |
Repo | https://github.com/xinke-wang/Awesome-Text-VQA |
Framework | none |
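The dynamic pointer can be sketched as concatenating two score vectors at each decoding step: fixed-vocabulary logits and dot-product scores over the OCR tokens detected in the image, so the argmax may either emit a vocabulary word or copy an OCR token. Shapes below are illustrative.

```python
import torch
import torch.nn as nn

def pointer_augmented_logits(dec_state, vocab_proj, ocr_feats):
    """Concatenate fixed-vocabulary logits with dot-product scores over
    the image's OCR tokens; the argmax may emit a word or copy a token."""
    vocab_scores = vocab_proj(dec_state)                         # (B, vocab_size)
    ocr_scores = torch.bmm(ocr_feats,
                           dec_state.unsqueeze(-1)).squeeze(-1)  # (B, n_ocr)
    return torch.cat([vocab_scores, ocr_scores], dim=-1)

B, d, n_ocr = 2, 768, 10
logits = pointer_augmented_logits(torch.randn(B, d),
                                  nn.Linear(d, 5000),
                                  torch.randn(B, n_ocr, d))
print(logits.shape)   # torch.Size([2, 5010])
```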
3D Appearance Super-Resolution with Deep Learning
Title | 3D Appearance Super-Resolution with Deep Learning |
Authors | Yawei Li, Vagia Tsiminaki, Radu Timofte, Marc Pollefeys, Luc van Gool |
Abstract | We tackle the problem of retrieving high-resolution (HR) texture maps of objects that are captured from multiple viewpoints. In the multi-view case, model-based super-resolution (SR) methods have recently been shown to recover high-quality texture maps. On the other hand, the advent of deep learning-based methods has already had a significant impact on the problem of video and image SR. Yet, a deep learning-based approach to super-resolving the appearance of 3D objects is still missing. The main limitation to exploiting the power of deep learning techniques in the multi-view case is the lack of data. We introduce a 3D appearance SR (3DASR) dataset based on the existing ETH3D [42], SyB3R [31], and Middlebury datasets, and our collection of 3D scenes from TUM [21], Fountain [51] and Relief [53]. We provide the high- and low-resolution texture maps, the 3D geometric model, images, and projection matrices. We exploit the power of 2D learning-based SR methods and design networks suitable for the 3D multi-view case. We incorporate the geometric information by introducing normal maps and further improve the learning process. Experimental results demonstrate that our proposed networks successfully incorporate the 3D geometric information and super-resolve the texture maps. |
Tasks | Super-Resolution |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00925v2 |
PDF | https://arxiv.org/pdf/1906.00925v2.pdf |
PWC | https://paperswithcode.com/paper/190600925 |
Repo | https://github.com/ofsoundof/3D_Appearance_SR |
Framework | pytorch |
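A hedged sketch of the geometry-aware design: concatenate the normal map with the low-resolution texture as extra input channels to a standard SR network. The layer sizes, the fusion-by-concatenation choice, and the scale factor are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NormalGuidedSR(nn.Module):
    """Super-resolve a texture map with its normal map as extra input channels."""
    def __init__(self, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 + 3, 64, 3, padding=1), nn.ReLU(),   # RGB texture + normals
            nn.Conv2d(64, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),                          # rearrange to HR
        )

    def forward(self, lr_texture, normals):
        return self.body(torch.cat([lr_texture, normals], dim=1))

sr = NormalGuidedSR()(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
print(sr.shape)   # torch.Size([1, 3, 128, 128])
```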