Paper Group AWR 73
Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer!. Adaptive Propagation Graph Convolutional Network. Constant-Delay Enumeration for Nondeterministic Document Spanners. The Two-Pass Softmax Algorithm. Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting. Predict …
Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer!
Title | Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer! |
Authors | Claudia Schulz, Damir Juric |
Abstract | A large number of embeddings trained on medical data have emerged, but it remains unclear how well they represent medical terminology, in particular whether the close relationship of semantically similar medical terms is encoded in these embeddings. To date, only small datasets for testing medical term similarity are available, which does not allow conclusions to be drawn about the generalisability of embeddings to the enormous number of medical terms used by doctors. We present multiple automatically created large-scale medical term similarity datasets and confirm their high quality in an annotation study with doctors. We evaluate state-of-the-art word and contextual embeddings on our new datasets, comparing multiple vector similarity metrics and word vector aggregation techniques. Our results show that current embeddings are limited in their ability to adequately encode medical terms. The novel datasets thus form a challenging new benchmark for the development of medical embeddings able to accurately represent the whole medical terminology. |
Tasks | |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.11082v1 |
https://arxiv.org/pdf/2003.11082v1.pdf | |
PWC | https://paperswithcode.com/paper/can-embeddings-adequately-represent-medical |
Repo | https://github.com/babylonhealth/medisim |
Framework | none |
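For concreteness, here is a minimal sketch of the kind of evaluation the abstract describes: embed a multi-word medical term by averaging its word vectors and score a term pair with cosine similarity (one of several aggregation/metric combinations the paper compares). The `word_vectors` lookup is an assumed, pre-loaded embedding table, not part of the released datasets.

```python
import numpy as np

def term_similarity(term_a, term_b, word_vectors):
    """Score a pair of (possibly multi-word) medical terms by averaging
    word vectors and taking cosine similarity. `word_vectors` is assumed
    to be a dict mapping words to numpy arrays."""
    def embed(term):
        vecs = [word_vectors[w] for w in term.lower().split() if w in word_vectors]
        return np.mean(vecs, axis=0)
    a, b = embed(term_a), embed(term_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# e.g. term_similarity("myocardial infarction", "heart attack", word_vectors)
```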
Adaptive Propagation Graph Convolutional Network
Title | Adaptive Propagation Graph Convolutional Network |
Authors | Indro Spinelli, Simone Scardapane, Aurelio Uncini |
Abstract | Graph convolutional networks (GCNs) are a family of neural network models that perform inference on graph data by interleaving vertex-wise operations and message-passing exchanges across nodes. Concerning the latter, two key questions arise: (i) how to design a differentiable exchange protocol (e.g., a 1-hop Laplacian smoothing in the original GCN), and (ii) how to characterize the trade-off in complexity with respect to the local updates. In this paper, we show that state-of-the-art results can be achieved by adapting the number of communication steps independently at every node. In particular, we endow each node with a halting unit (inspired by Graves’ adaptive computation time) that after every exchange decides whether to continue communicating or not. We show that the proposed adaptive propagation GCN (AP-GCN) achieves superior or similar results to the best proposed models so far on a number of benchmarks, while requiring a small overhead in terms of additional parameters. We also investigate a regularization term to enforce an explicit trade-off between communication and accuracy. The code for the AP-GCN experiments is released as an open-source library. |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10306v1 |
https://arxiv.org/pdf/2002.10306v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-propagation-graph-convolutional |
Repo | https://github.com/spindro/AP-GCN |
Framework | pytorch |
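A minimal sketch of the halting idea described above, assuming a dense normalized adjacency matrix and a simple sigmoid halting unit; the released AP-GCN implementation differs in details.

```python
import torch
import torch.nn as nn

class AdaptivePropagation(nn.Module):
    """Sketch of adaptive propagation: after each message-passing step a
    per-node halting unit decides whether that node keeps communicating;
    the output is the halting-weighted sum of intermediate states."""
    def __init__(self, dim, max_steps=10, eps=0.05):
        super().__init__()
        self.halt = nn.Linear(dim, 1)
        self.max_steps, self.eps = max_steps, eps

    def forward(self, h, adj_norm):
        cum_halt = h.new_zeros(h.size(0), 1)   # accumulated halting probability
        remainder = h.new_ones(h.size(0), 1)
        out = torch.zeros_like(h)
        for _ in range(self.max_steps):
            h = adj_norm @ h                   # one propagation (1-hop smoothing) step
            p = torch.sigmoid(self.halt(h))
            still_running = (cum_halt < 1 - self.eps).float()
            p = p * still_running
            # nodes that would exceed the budget spend their remaining mass instead
            use_remainder = (cum_halt + p >= 1 - self.eps).float() * still_running
            w = p * (1 - use_remainder) + remainder * use_remainder
            out = out + w * h
            cum_halt = cum_halt + p
            remainder = remainder - p
        return out
```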
Constant-Delay Enumeration for Nondeterministic Document Spanners
Title | Constant-Delay Enumeration for Nondeterministic Document Spanners |
Authors | Antoine Amarilli, Pierre Bourhis, Stefan Mengel, Matthias Niewerth |
Abstract | We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the results with a small delay between any two consecutive results. Our goal is to have an algorithm which is tractable in combined complexity, i.e., in the sizes of the input document and the VA; while ensuring the best possible data complexity bounds in the input document size, i.e., constant delay in the document size. Several recent works at PODS’18 proposed such algorithms but with linear delay in the document size or with an exponential dependency on the size of the (generally nondeterministic) input VA. In particular, Florenzano et al. suggest that our desired runtime guarantees cannot be met for general sequential VAs. We refute this and show that, given a nondeterministic sequential VA and an input document, we can enumerate the mappings of the VA on the document with the following bounds: the preprocessing is linear in the document size and polynomial in the size of the VA, and the delay is independent of the document and polynomial in the size of the VA. The resulting algorithm thus achieves tractability in combined complexity and the best possible data complexity bounds. Moreover, it is rather easy to describe, in particular for the restricted case of so-called extended VAs. Finally, we evaluate our algorithm empirically using a prototype implementation. |
Tasks | |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02576v1 |
https://arxiv.org/pdf/2003.02576v1.pdf | |
PWC | https://paperswithcode.com/paper/constant-delay-enumeration-for-1 |
Repo | https://github.com/PoDMR/enum-spanner-rs |
Framework | none |
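The algorithm itself is intricate; the toy below only illustrates what is being enumerated (all spans of a document accepted by a nondeterministic automaton), using brute force rather than the paper's constant-delay construction.

```python
def enumerate_spans(document, nfa, start_state, accept_states):
    """Toy document spanner: yield every (start, end) span of `document`
    accepted by a nondeterministic automaton given as
    {state: {char: set_of_next_states}}. Brute-force illustration only."""
    n = len(document)
    for start in range(n + 1):
        states = {start_state}
        if start_state in accept_states:
            yield (start, start)               # empty span at this position
        for end in range(start, n):
            c = document[end]
            states = {t for s in states for t in nfa.get(s, {}).get(c, set())}
            if not states:
                break
            if states & accept_states:
                yield (start, end + 1)

# pattern "ab*" over "xabb": automaton {0: {'a': {1}}, 1: {'b': {1}}}, accepting {1}
# list(enumerate_spans("xabb", {0: {'a': {1}}, 1: {'b': {1}}}, 0, {1}))
# -> [(1, 2), (1, 3), (1, 4)]
```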
The Two-Pass Softmax Algorithm
Title | The Two-Pass Softmax Algorithm |
Authors | Marat Dukhan, Artsiom Ablavatski |
Abstract | The softmax (also called softargmax) function is widely used in machine learning models to normalize real-valued scores into a probability distribution. To avoid floating-point overflow, the softmax function is conventionally implemented in three passes: the first pass to compute the normalization constant, and two other passes to compute outputs from normalized inputs. We analyze two variants of the Three-Pass algorithm and demonstrate that in a well-optimized implementation on HPC-class processors, the performance of all three passes is limited by memory bandwidth. We then present a novel algorithm for softmax computation in just two passes. The proposed Two-Pass algorithm avoids both numerical overflow and the extra normalization pass by employing an exotic representation for intermediate values, where each value is represented as a pair of floating-point numbers: one representing the “mantissa” and another representing the “exponent”. Performance evaluation demonstrates that on out-of-cache inputs on an Intel Skylake-X processor the new Two-Pass algorithm outperforms the traditional Three-Pass algorithm by up to 28% in the AVX512 implementation and by up to 18% in the AVX2 implementation. The proposed Two-Pass algorithm also outperforms the traditional Three-Pass algorithm on Intel Broadwell and AMD Zen 2 processors. To foster reproducibility, we released an open-source implementation of the new Two-Pass Softmax algorithm and other experiments in this paper as part of the XNNPACK library at GitHub.com/google/XNNPACK. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04438v1 |
https://arxiv.org/pdf/2001.04438v1.pdf | |
PWC | https://paperswithcode.com/paper/the-two-pass-softmax-algorithm |
Repo | https://github.com/google/XNNPACK |
Framework | tf |
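A scalar Python sketch of the two-pass idea, accumulating the normalizer as a (mantissa, exponent) pair so no separate max-reduction pass is needed; the paper's actual contribution is the vectorized AVX2/AVX512 implementation in XNNPACK.

```python
import math

def two_pass_softmax(xs):
    """Sketch: represent each exp(x) as mantissa * 2**exponent so the
    normalizer can be accumulated in one pass without overflow, then
    normalize in a second pass."""
    LOG2E = 1.0 / math.log(2.0)

    def exp_pair(x):
        # exp(x) = 2**(x*log2(e)); split into integer exponent and mantissa in [1, 2)
        t = x * LOG2E
        e = math.floor(t)
        return math.pow(2.0, t - e), int(e)

    # Pass 1: accumulate sum(exp(x)) as a (mantissa, exponent) pair.
    m_sum, e_sum = exp_pair(xs[0])
    for x in xs[1:]:
        m, e = exp_pair(x)
        if e > e_sum:
            m_sum = m_sum * math.pow(2.0, e_sum - e) + m
            e_sum = e
        else:
            m_sum += m * math.pow(2.0, e - e_sum)

    # Pass 2: outputs are exp(x) / sum, computed in the same representation.
    return [(m / m_sum) * math.pow(2.0, e - e_sum) for m, e in map(exp_pair, xs)]

# two_pass_softmax([1000.0, 1000.0]) -> [0.5, 0.5] without overflowing
```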
Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting
Title | Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting |
Authors | Qiquan Shi, Jiaming Yin, Jiajun Cai, Andrzej Cichocki, Tatsuya Yokota, Lei Chen, Mingxuan Yuan, Jia Zeng |
Abstract | This work proposes a novel approach for multiple time series forecasting. First, the multi-way delay embedding transform (MDT) is employed to represent time series as low-rank block Hankel tensors (BHT). Then, the higher-order tensors are projected to compressed core tensors by applying Tucker decomposition. At the same time, the generalized tensor Autoregressive Integrated Moving Average (ARIMA) is explicitly used on consecutive core tensors to predict future samples. In this manner, the proposed approach tactically incorporates the unique advantages of MDT tensorization (to exploit mutual correlations) and tensor ARIMA coupled with low-rank Tucker decomposition into a unified framework. This framework exploits the low-rank structure of block Hankel tensors in the embedded space and captures the intrinsic correlations among multiple time series, which can thus improve the forecasting results, especially for multiple short time series. Experiments conducted on three public datasets and two industrial datasets verify that the proposed BHT-ARIMA effectively improves forecasting accuracy and reduces computational cost compared with the state-of-the-art methods. |
Tasks | Time Series, Time Series Forecasting |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.12135v1 |
https://arxiv.org/pdf/2002.12135v1.pdf | |
PWC | https://paperswithcode.com/paper/block-hankel-tensor-arima-for-multiple-short |
Repo | https://github.com/yokotatsuya/BHT-ARIMA |
Framework | none |
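A sketch of the first step (MDT tensorization) only, with assumed shapes: N short series of length T become a block Hankel tensor by stacking sliding windows along a new mode. Tucker compression and the tensor ARIMA on the core tensors are not shown.

```python
import numpy as np

def mdt_hankelize(X, tau):
    """Toy multi-way delay embedding: turn N series of length T (matrix X of
    shape (N, T)) into a block Hankel tensor of shape (N, tau, T - tau + 1)."""
    N, T = X.shape
    L = T - tau + 1
    H = np.empty((N, tau, L))
    for t in range(L):
        H[:, :, t] = X[:, t:t + tau]
    return H

# e.g. 5 series of length 12 with delay 4 -> tensor of shape (5, 4, 9)
H = mdt_hankelize(np.random.randn(5, 12), tau=4)
```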
Predictive analysis of Bitcoin price considering social sentiments
Title | Predictive analysis of Bitcoin price considering social sentiments |
Authors | Pratikkumar Prajapati |
Abstract | We report on the use of sentiment analysis on news and social media to analyze and predict the price of Bitcoin. Bitcoin is the leading cryptocurrency and has the highest market capitalization among digital currencies. Predicting Bitcoin values may help understand and predict potential market movement and future growth of the technology. Unlike (mostly) repeating phenomena such as weather, cryptocurrency values do not follow a repeating pattern, and past Bitcoin values alone reveal little about future ones. Humans follow general sentiment and technical analysis when investing in the market, so considering people’s sentiment can give a good degree of prediction. We focus on using social sentiment as a feature to predict future Bitcoin value, and in particular, consider Google News and Reddit posts. We find that social sentiment gives a good estimate of how future Bitcoin values may move. We achieve the lowest test RMSE of 434.87 using an LSTM that takes as inputs the historical price of various cryptocurrencies, the sentiment of news articles and the sentiment of Reddit posts. |
Tasks | Sentiment Analysis |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.10343v1 |
https://arxiv.org/pdf/2001.10343v1.pdf | |
PWC | https://paperswithcode.com/paper/predictive-analysis-of-bitcoin-price |
Repo | https://github.com/pratikpv/predicting_bitcoin_market |
Framework | none |
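A hedged sketch of the modelling setup: an LSTM over daily windows whose feature vectors concatenate prices and sentiment scores. The feature layout and sizes are illustrative, not the exact configuration behind the reported RMSE of 434.87.

```python
import torch
import torch.nn as nn

class PriceSentimentLSTM(nn.Module):
    """LSTM regressor over daily feature vectors that combine historical
    cryptocurrency prices with news/Reddit sentiment scores."""
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, days, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict the next-day Bitcoin price

# e.g. 30-day windows of [BTC, ETH, LTC prices, news sentiment, reddit sentiment]
model = PriceSentimentLSTM(n_features=5)
pred = model(torch.randn(8, 30, 5))
```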
Semantically-Guided Representation Learning for Self-Supervised Monocular Depth
Title | Semantically-Guided Representation Learning for Self-Supervised Monocular Depth |
Authors | Vitor Guizilini, Rui Hou, Jie Li, Rares Ambrus, Adrien Gaidon |
Abstract | Self-supervised learning is showing great promise for monocular depth estimation, using geometry as the only source of supervision. Depth networks are indeed capable of learning representations that relate visual appearance to 3D properties by implicitly leveraging category-level patterns. In this work we investigate how to leverage this semantic structure more directly to guide geometric representation learning, while remaining in the self-supervised regime. Instead of using semantic labels and proxy losses in a multi-task approach, we propose a new architecture leveraging fixed pretrained semantic segmentation networks to guide self-supervised representation learning via pixel-adaptive convolutions. Furthermore, we propose a two-stage training process to overcome a common semantic bias on dynamic objects via resampling. Our method improves upon the state of the art for self-supervised monocular depth prediction over all pixels, on fine-grained details, and per semantic category. |
Tasks | Depth Estimation, Monocular Depth Estimation, Representation Learning, Semantic Segmentation |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12319v1 |
https://arxiv.org/pdf/2002.12319v1.pdf | |
PWC | https://paperswithcode.com/paper/semantically-guided-representation-learning-1 |
Repo | https://github.com/TRI-ML/packnet-sfm |
Framework | none |
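A simplified pixel-adaptive convolution in the spirit of the guidance mechanism described above, assuming a Gaussian kernel over guidance-feature differences; the paper's module and training setup differ.

```python
import torch
import torch.nn.functional as F

def pixel_adaptive_conv(x, guide, weight, kernel_size=3):
    """x: (B, C, H, W) depth-network features; guide: (B, G, H, W) features from
    a frozen semantic segmentation network; weight: (C_out, C, k, k) learned
    filters. The spatially-invariant filters are modulated per pixel by a
    Gaussian kernel computed from guidance-feature differences."""
    B, C, H, W = x.shape
    pad = kernel_size // 2
    x_unf = F.unfold(x, kernel_size, padding=pad).view(B, C, kernel_size**2, H, W)
    g_unf = F.unfold(guide, kernel_size, padding=pad).view(B, guide.shape[1], kernel_size**2, H, W)
    g_center = guide.unsqueeze(2)                                   # (B, G, 1, H, W)
    k = torch.exp(-0.5 * ((g_unf - g_center) ** 2).sum(1, keepdim=True))  # (B, 1, k^2, H, W)
    w = weight.view(weight.shape[0], C, kernel_size**2)
    return torch.einsum('bckhw,ock->bohw', x_unf * k, w)

# e.g. pixel_adaptive_conv(torch.randn(2, 32, 64, 64), torch.randn(2, 8, 64, 64),
#                          torch.randn(32, 32, 3, 3))
```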
PM2.5-GNN: A Domain Knowledge Enhanced Graph Neural Network For PM2.5 Forecasting
Title | PM2.5-GNN: A Domain Knowledge Enhanced Graph Neural Network For PM2.5 Forecasting |
Authors | Shuo Wang, Yanran Li, Jiang Zhang, Qingye Meng, Lingwei Meng, Fei Gao |
Abstract | When predicting PM2.5 concentrations, it is necessary to consider complex information sources, since the concentrations are influenced by various factors over long periods. In this paper, we identify critical domain knowledge for PM2.5 forecasting and develop a novel graph-based model, PM2.5-GNN, capable of capturing long-term dependencies. On a real-world dataset, we validate the effectiveness of the proposed model and examine its ability to capture both fine-grained and long-term influences in the PM2.5 process. The proposed PM2.5-GNN has also been deployed online to provide a free forecasting service. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.12898v1 |
https://arxiv.org/pdf/2002.12898v1.pdf | |
PWC | https://paperswithcode.com/paper/pm25-gnn-a-domain-knowledge-enhanced-graph |
Repo | https://github.com/shawnwang-tech/PM2.5-GNN |
Framework | pytorch |
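A minimal message-passing step with edge attributes (e.g. distance and wind features) as a sketch of how domain knowledge can enter a PM2.5 GNN; the released PM2.5-GNN model is more involved.

```python
import torch
import torch.nn as nn

class PM25MessagePassing(nn.Module):
    """One message-passing step over a station graph: messages are computed
    from both endpoint states and edge attributes (e.g. distance, wind),
    aggregated per station, and used to update the station state with a GRU."""
    def __init__(self, node_dim, edge_dim, hidden=32):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * node_dim + edge_dim, hidden), nn.ReLU())
        self.upd = nn.GRUCell(hidden, node_dim)

    def forward(self, h, edge_index, edge_attr):
        # h: (num_nodes, node_dim); edge_index: (2, num_edges); edge_attr: (num_edges, edge_dim)
        src, dst = edge_index
        m = self.msg(torch.cat([h[src], h[dst], edge_attr], dim=-1))
        agg = torch.zeros(h.size(0), m.size(1), device=h.device).index_add_(0, dst, m)
        return self.upd(agg, h)
```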
Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem
Title | Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem |
Authors | Liu Liu, Dylan Campbell, Hongdong Li, Dingfu Zhou, Xibin Song, Ruigang Yang |
Abstract | Conventional absolute camera pose estimation via a Perspective-n-Point (PnP) solver often assumes that the correspondences between 2D image pixels and 3D points are given. When the correspondences between 2D and 3D points are not known a priori, the task becomes the much more challenging blind PnP problem. This paper proposes a deep CNN model which simultaneously solves for both the 6-DoF absolute camera pose and 2D–3D correspondences. Our model comprises three neural modules connected in sequence. First, a two-stream PointNet-inspired network is applied directly to both the 2D image keypoints and the 3D scene points in order to extract discriminative point-wise features harnessing both local and contextual information. Second, a global feature matching module is employed to estimate a matchability matrix among all 2D–3D pairs. Third, the obtained matchability matrix is fed into a classification module to disambiguate inlier matches. The entire network is trained end-to-end, followed by a robust model fitting (P3P-RANSAC) at test time only to recover the 6-DoF camera pose. Extensive tests on both real and simulated data have shown that our method substantially outperforms existing approaches, and is capable of processing thousands of points a second with state-of-the-art accuracy. |
Tasks | |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.06752v1 |
https://arxiv.org/pdf/2003.06752v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-2d-3d-correspondences-to-solve-the |
Repo | https://github.com/Liumouliu/Deep_blind_PnP |
Framework | none |
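A sketch of the global matching step, assuming a Sinkhorn-style normalization turns 2D-3D feature similarities into a matchability matrix; the paper's exact matching module may differ.

```python
import torch

def matchability_matrix(f2d, f3d, n_iters=10, temperature=0.1):
    """f2d: (N2d, D) 2D keypoint features; f3d: (N3d, D) 3D point features.
    Alternating row/column normalization in log space (Sinkhorn) turns raw
    similarities into a near doubly stochastic matchability matrix."""
    log_m = f2d @ f3d.t() / temperature
    for _ in range(n_iters):
        log_m = log_m - torch.logsumexp(log_m, dim=1, keepdim=True)  # rows
        log_m = log_m - torch.logsumexp(log_m, dim=0, keepdim=True)  # columns
    return log_m.exp()
```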
Torch-Struct: Deep Structured Prediction Library
Title | Torch-Struct: Deep Structured Prediction Library |
Authors | Alexander M. Rush |
Abstract | The literature on structured prediction for NLP describes a rich collection of distributions and algorithms over sequences, segmentations, alignments, and trees; however, these algorithms are difficult to utilize in deep learning frameworks. We introduce Torch-Struct, a library for structured prediction designed to take advantage of and integrate with vectorized, auto-differentiation based frameworks. Torch-Struct includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API that connects to any deep learning model. The library utilizes batched, vectorized operations and exploits auto-differentiation to produce readable, fast, and testable code. Internally, we also include a number of general-purpose optimizations to provide cross-algorithm efficiency. Experiments show significant performance gains over fast baselines and case-studies demonstrate the benefits of the library. Torch-Struct is available at https://github.com/harvardnlp/pytorch-struct. |
Tasks | Structured Prediction |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00876v1 |
https://arxiv.org/pdf/2002.00876v1.pdf | |
PWC | https://paperswithcode.com/paper/torch-struct-deep-structured-prediction |
Repo | https://github.com/harvardnlp/pytorch-struct |
Framework | pytorch |
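Rather than guessing Torch-Struct's exact API, the sketch below shows the core trick the library builds on: compute a structured log-partition function (here, a linear-chain CRF) with batched log-sum-exp and obtain edge marginals by differentiating it.

```python
import torch

def linear_chain_marginals(log_potentials):
    """log_potentials: (batch, N-1, C, C) edge scores of a linear-chain CRF.
    Computes log Z with a vectorized forward pass, then edge marginals as
    the gradient of log Z with respect to the potentials."""
    log_potentials = log_potentials.detach().requires_grad_(True)
    alpha = torch.logsumexp(log_potentials[:, 0], dim=1)            # marginalize first label
    for t in range(1, log_potentials.shape[1]):
        alpha = torch.logsumexp(alpha.unsqueeze(-1) + log_potentials[:, t], dim=1)
    log_Z = torch.logsumexp(alpha, dim=1).sum()
    marginals, = torch.autograd.grad(log_Z, log_potentials)
    return marginals                                                # (batch, N-1, C, C)

# e.g. marginals = linear_chain_marginals(torch.randn(4, 9, 5, 5))
```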
FeatureNMS: Non-Maximum Suppression by Learning Feature Embeddings
Title | FeatureNMS: Non-Maximum Suppression by Learning Feature Embeddings |
Authors | Niels Ole Salscheider |
Abstract | Most state-of-the-art object detectors output multiple detections per object. The duplicates are removed in a post-processing step called Non-Maximum Suppression. Classical Non-Maximum Suppression has shortcomings in scenes that contain objects with high overlap: the idea of this heuristic is that a high bounding box overlap corresponds to a high probability of having a duplicate, an assumption that breaks down when distinct objects genuinely overlap. We propose FeatureNMS to solve this problem. FeatureNMS recognizes duplicates not only based on the intersection over union between bounding boxes, but also based on the difference of feature vectors. These feature vectors can encode more information like visual appearance. Our approach outperforms classical NMS and derived approaches and achieves state-of-the-art performance. |
Tasks | |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07662v1 |
https://arxiv.org/pdf/2002.07662v1.pdf | |
PWC | https://paperswithcode.com/paper/featurenms-non-maximum-suppression-by |
Repo | https://github.com/fzi-forschungszentrum-informatik/NNAD |
Framework | none |
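A sketch of the decision rule, with illustrative thresholds rather than the paper's values: suppress a box only if the IoU is very high, or if the IoU is moderate and the feature embeddings are close.

```python
import numpy as np

def box_iou(a, b):
    # boxes as (x1, y1, x2, y2)
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def feature_nms(boxes, scores, embeddings, iou_lo=0.4, iou_hi=0.7, dist_thresh=1.0):
    """Greedy NMS that keeps two highly overlapping boxes when their feature
    embeddings differ, since they likely belong to distinct objects."""
    keep = []
    for i in np.argsort(-scores):
        suppressed = False
        for j in keep:
            iou = box_iou(boxes[i], boxes[j])
            close = np.linalg.norm(embeddings[i] - embeddings[j]) < dist_thresh
            if iou > iou_hi or (iou > iou_lo and close):
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep
```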
Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization
Title | Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization |
Authors | Lukas P. Fröhlich, Edgar D. Klenske, Julia Vinogradska, Christian Daniel, Melanie N. Zeilinger |
Abstract | We consider the problem of robust optimization within the well-established Bayesian optimization (BO) framework. While BO is intrinsically robust to noisy evaluations of the objective function, standard approaches do not consider the case of uncertainty about the input parameters. In this paper, we propose Noisy-Input Entropy Search (NES), a novel information-theoretic acquisition function that is designed to find robust optima for problems with both input and measurement noise. NES is based on the key insight that the robust objective in many cases can be modeled as a Gaussian process; however, it cannot be observed directly. We evaluate NES on several benchmark problems from the optimization literature and from engineering. The results show that NES reliably finds robust optima, outperforming existing methods from the literature on all benchmarks. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02820v1 |
https://arxiv.org/pdf/2002.02820v1.pdf | |
PWC | https://paperswithcode.com/paper/noisy-input-entropy-search-for-efficient |
Repo | https://github.com/boschresearch/NoisyInputEntropySearch |
Framework | none |
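A sketch of the robust objective NES targets, g(x) = E_xi[f(x + xi)], estimated here by Monte Carlo over a fitted GP's posterior mean; the entropy-search acquisition function itself is not shown.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def robust_objective(gp: GaussianProcessRegressor, x, input_std=0.1, n_samples=256, rng=None):
    """Estimate the expectation of the modeled objective under Gaussian input
    noise at query point x, using the GP posterior mean as a surrogate."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.atleast_2d(x)
    xi = rng.normal(scale=input_std, size=(n_samples, x.shape[1]))
    return gp.predict(x + xi).mean()
```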
Unsupervised Question Decomposition for Question Answering
Title | Unsupervised Question Decomposition for Question Answering |
Authors | Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela |
Abstract | We aim to improve question answering (QA) by decomposing hard questions into easier sub-questions that existing QA systems can answer. Since collecting labeled decompositions is cumbersome, we propose an unsupervised approach to produce sub-questions. Specifically, by leveraging >10M questions from Common Crawl, we learn to map from the distribution of multi-hop questions to the distribution of single-hop sub-questions. We answer sub-questions with an off-the-shelf QA model and incorporate the resulting answers in a downstream, multi-hop QA system. On a popular multi-hop QA dataset, HotpotQA, we show large improvements over a strong baseline, especially on adversarial and out-of-domain questions. Our method is generally applicable and automatically learns to decompose questions of different classes, while matching the performance of decomposition methods that rely heavily on hand-engineering and annotation. |
Tasks | Question Answering |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09758v2 |
https://arxiv.org/pdf/2002.09758v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-question-decomposition-for |
Repo | https://github.com/facebookresearch/UnsupervisedDecomposition |
Framework | pytorch |
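A heavily simplified sketch of constructing a noisy "pseudo-decomposition": retrieve the most similar mined single-hop questions by embedding similarity. The paper's retrieval objective and the subsequent unsupervised sequence-to-sequence training are not shown.

```python
import numpy as np

def pseudo_decomposition(q_vec, subq_vecs, k=2):
    """Given an embedding of a multi-hop question (q_vec, shape (D,)) and
    embeddings of a pool of mined single-hop questions (subq_vecs, shape (N, D)),
    return the indices of the k most cosine-similar single-hop questions as a
    noisy decomposition to train on."""
    sims = subq_vecs @ q_vec / (np.linalg.norm(subq_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9)
    return np.argsort(-sims)[:k]
```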
ABSent: Cross-Lingual Sentence Representation Mapping with Bidirectional GANs
Title | ABSent: Cross-Lingual Sentence Representation Mapping with Bidirectional GANs |
Authors | Zuohui Fu, Yikun Xian, Shijie Geng, Yingqiang Ge, Yuting Wang, Xin Dong, Guang Wang, Gerard de Melo |
Abstract | A number of cross-lingual transfer learning approaches based on neural networks have been proposed for the case when large amounts of parallel text are at our disposal. However, in many real-world settings, the size of parallel annotated training data is restricted. Additionally, prior cross-lingual mapping research has mainly focused on the word level. This raises the question of whether such techniques can also be applied to effortlessly obtain cross-lingually aligned sentence representations. To this end, we propose an Adversarial Bi-directional Sentence Embedding Mapping (ABSent) framework, which learns mappings of cross-lingual sentence representations from limited quantities of parallel data. |
Tasks | Cross-Lingual Transfer, Sentence Embedding, Transfer Learning |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.11121v1 |
https://arxiv.org/pdf/2001.11121v1.pdf | |
PWC | https://paperswithcode.com/paper/absent-cross-lingual-sentence-representation |
Repo | https://github.com/zuohuif/ABSent |
Framework | none |
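A minimal sketch of adversarial embedding mapping in one direction, assuming pre-computed sentence embeddings; ABSent's bidirectional setup and further refinements are omitted, and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class AdversarialMapper(nn.Module):
    """A linear map takes source-language sentence embeddings into the target
    space while a discriminator tries to tell mapped source embeddings from
    real target embeddings."""
    def __init__(self, dim=512):
        super().__init__()
        self.map = nn.Linear(dim, dim, bias=False)
        self.disc = nn.Sequential(nn.Linear(dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

    def disc_loss(self, src, tgt):
        # discriminator: mapped source -> label 0, real target -> label 1
        logits = torch.cat([self.disc(self.map(src).detach()), self.disc(tgt)])
        labels = torch.cat([torch.zeros(len(src), 1, device=src.device),
                            torch.ones(len(tgt), 1, device=tgt.device)])
        return nn.functional.binary_cross_entropy_with_logits(logits, labels)

    def map_loss(self, src):
        # the mapper tries to fool the discriminator into predicting "target"
        logits = self.disc(self.map(src))
        return nn.functional.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```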
Synthetic Magnetic Resonance Images with Generative Adversarial Networks
Title | Synthetic Magnetic Resonance Images with Generative Adversarial Networks |
Authors | Antoine Delplace |
Abstract | Data augmentation is essential for medical research to increase the size of training datasets and achieve better results. In this work, we experiment with three GAN architectures and different loss functions to generate new brain MRIs. The results show the importance of hyperparameter tuning and of using a mini-batch similarity layer in the discriminator and a gradient penalty in the loss function to achieve convergence with high quality and realism. Moreover, substantial computation time is needed to generate images that are indistinguishable from the original dataset. |
Tasks | Data Augmentation |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2002.02527v1 |
https://arxiv.org/pdf/2002.02527v1.pdf | |
PWC | https://paperswithcode.com/paper/synthetic-magnetic-resonance-images-with |
Repo | https://github.com/antoinedelplace/MRI-Generation |
Framework | tf |
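A sketch of the gradient-penalty term mentioned in the abstract (as in WGAN-GP), applied to interpolations between real and generated slices; the generator/discriminator architectures and the mini-batch similarity layer are not shown. Written in PyTorch for brevity, although the linked repository uses TensorFlow.

```python
import torch

def gradient_penalty(discriminator, real, fake):
    """Penalize deviations of the discriminator's gradient norm from 1 on
    random interpolations between real and generated images."""
    alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    d_out = discriminator(interp)
    grads, = torch.autograd.grad(d_out.sum(), interp, create_graph=True)
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
```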