February 1, 2020

3196 words 16 mins read

Paper Group AWR 307

Geometric Graph Convolutional Neural Networks

Title Geometric Graph Convolutional Neural Networks
Authors Przemysław Spurek, Tomasz Danel, Jacek Tabor, Marek Śmieja, Łukasz Struski, Agnieszka Słowik, Łukasz Maziarka
Title Geometric Graph Convolutional Neural Networks
Abstract Graph Convolutional Networks (GCNs) have recently become the primary choice for learning from graph-structured data, superseding hash fingerprints in representing chemical compounds. However, GCNs lack the ability to take into account the ordering of node neighbors, even when there is a geometric interpretation of the graph vertices that provides an order based on their spatial positions. To remedy this issue, we propose the Geometric Graph Convolutional Network (geo-GCN), which uses spatial features to efficiently learn from graphs that can be naturally located in space. Our contribution is threefold: we propose a GCN-inspired architecture which (i) leverages node positions, (ii) is a proper generalisation of both GCNs and Convolutional Neural Networks (CNNs), and (iii) benefits from augmentation, which further improves performance and assures invariance with respect to the desired properties. Empirically, geo-GCN outperforms state-of-the-art graph-based methods on image classification and chemical tasks.
Tasks Image Classification
Published 2019-09-11
URL https://arxiv.org/abs/1909.05310v1
PDF https://arxiv.org/pdf/1909.05310v1.pdf
PWC https://paperswithcode.com/paper/geometric-graph-convolutional-neural-networks
Repo https://github.com/gmum/geo-gcn
Framework pytorch
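
The core idea above, weighting neighbor aggregation by node positions, can be sketched in a few lines. The layer below is a hedged illustration of position-conditioned message passing, not the authors' exact formulation; see the gmum/geo-gcn repo for the real implementation.

```python
# Sketch of a position-aware graph convolution in the spirit of geo-GCN:
# neighbor messages are gated by a learned function of the relative
# spatial offset between nodes. Layer details are assumptions.
import torch
import torch.nn as nn

class GeoConv(nn.Module):
    def __init__(self, in_dim, out_dim, pos_dim=2):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)   # transforms node features
        self.pos = nn.Linear(pos_dim, out_dim)  # scores relative positions

    def forward(self, x, pos, edge_index):
        # x: [N, in_dim] features, pos: [N, pos_dim] coordinates,
        # edge_index: [2, E] long tensor with rows (source, target).
        src, dst = edge_index
        gate = torch.relu(self.pos(pos[src] - pos[dst]))  # offset-dependent weight
        msg = gate * self.lin(x[src])                     # modulated neighbor message
        out = torch.zeros(x.size(0), msg.size(1), device=x.device)
        out.index_add_(0, dst, msg)                       # sum messages per target node
        return out
```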

Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

Title Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations
Authors Aarne Talman, Antti Suni, Hande Celikkanat, Sofoklis Kakouros, Jörg Tiedemann, Martti Vainio
Abstract In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge, this is the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark dataset in detail, and we train a number of different models, ranging from feature-based classifiers to neural network systems, for the prediction of discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally, we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and the methods for predicting prosodic prominence from text. The dataset and the code for the models are publicly available.
Tasks Prosody Prediction
Published 2019-08-06
URL https://arxiv.org/abs/1908.02262v1
PDF https://arxiv.org/pdf/1908.02262v1.pdf
PWC https://paperswithcode.com/paper/predicting-prosodic-prominence-from-text-with
Repo https://github.com/Helsinki-NLP/prosody
Framework pytorch
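
Since the paper's best model is BERT fine-tuned for token-level prediction, a minimal sketch looks like ordinary token classification. The checkpoint name and three-way label set here are assumptions; the actual training code is in the Helsinki-NLP/prosody repo.

```python
# Hedged sketch: prosodic prominence prediction as token classification
# with a pre-trained BERT. Checkpoint and label count are illustrative.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # e.g. 3 discretized prominence levels

enc = tokenizer("and so the story begins", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits        # [1, seq_len, num_labels]
print(logits.argmax(-1))                # per-token prominence class
```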

Recurrent Space-time Graph Neural Networks

Title Recurrent Space-time Graph Neural Networks
Authors Andrei Nicolicioiu, Iulia Duta, Marius Leordeanu
Abstract Learning in the space-time domain remains a very challenging problem in machine learning and computer vision. Current computational models for understanding spatio-temporal visual data are heavily rooted in the classical single-image based paradigm. It is not yet well understood how to integrate information in space and time into a single, general model. We propose a neural graph model, recurrent in space and time, suitable for capturing both the local appearance and the complex higher-level interactions of different entities and objects within the changing world scene. Nodes and edges in our graph have dedicated neural networks for processing information. Nodes operate over features extracted from local parts in space and time and previous memory states. Edges process messages between connected nodes at different locations and spatial scales or between past and present time. Messages are passed iteratively in order to transmit information globally and establish long range interactions. Our model is general and could learn to recognize a variety of high level spatio-temporal concepts and be applied to different learning tasks. We demonstrate, through extensive experiments and ablation studies, that our model outperforms strong baselines and top published methods on recognizing complex activities in video. Moreover, we obtain state-of-the-art performance on the challenging Something-Something human-object interaction dataset.
Tasks Action Recognition In Videos, Human-Object Interaction Detection, Video Understanding
Published 2019-04-11
URL https://arxiv.org/abs/1904.05582v4
PDF https://arxiv.org/pdf/1904.05582v4.pdf
PWC https://paperswithcode.com/paper/recurrent-space-time-graphs-for-video
Repo https://github.com/IuliaDuta/RSTG
Framework tf
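
A hedged sketch of one space-time step, spatial message passing within a frame followed by a recurrent update of each node's memory, is below. The message function, adjacency format, and dimensions are illustrative assumptions, not the RSTG code.

```python
# Minimal sketch of the recurrent space-time idea: nodes extracted from
# spatial regions exchange messages within each frame, then each node's
# state is updated over time with a recurrent cell.
import torch
import torch.nn as nn

class SpaceTimeStep(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, feats, state, adj):
        # feats: [N, dim] per-node features for the current frame
        # state: [N, dim] node memory carried over from the previous frame
        # adj:   [N, N] spatial adjacency (1 where nodes interact)
        N = feats.size(0)
        pair = torch.cat([feats.unsqueeze(1).expand(N, N, -1),
                          feats.unsqueeze(0).expand(N, N, -1)], dim=-1)
        messages = (adj.unsqueeze(-1) * self.msg(pair)).sum(1)  # aggregate in space
        return self.gru(messages, state)                        # recur in time
```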

Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification

Title Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification
Authors Xia Yuan, Liao Xiaoli, Li Shilei, Shi Qinwen, Wu Jinfa, Li Ke
Abstract The core of evidence-based medicine is to read and analyze numerous papers in the medical literature on a specific clinical problem and summarize the authoritative answers to that problem. Currently, to formulate a clear and focused clinical problem, the popular PICO framework is usually adopted, in which each clinical problem is considered to consist of four parts: patient/problem (P), intervention (I), comparison (C) and outcome (O). In this study, we compared several classification models that are commonly used in traditional machine learning. Next, we developed a multitask classification model based on a soft-margin SVM with a specialized feature engineering method that combines 1-2gram analysis with TF-IDF analysis. Finally, we trained and tested several generic models on an open-source data set from BioNLP 2018. The results show that the proposed multitask SVM classification model based on 1-2gram TF-IDF features exhibits the best performance among the tested models.
Tasks Feature Engineering
Published 2019-01-24
URL http://arxiv.org/abs/1901.08351v1
PDF http://arxiv.org/pdf/1901.08351v1.pdf
PWC https://paperswithcode.com/paper/extracting-pico-elements-from-rct-abstracts
Repo https://github.com/brucexia6116/PICO_elements_classification
Framework none
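
The feature pipeline described above maps directly onto standard tooling: TF-IDF over unigrams and bigrams feeding a soft-margin linear SVM. The sketch below uses a one-vs-rest wrapper as a stand-in for the paper's multitask setup, with toy sentences and labels that are purely illustrative.

```python
# Sketch of the 1-2gram TF-IDF + SVM pipeline: each PICO element is a
# binary sentence-level task sharing the same features (one-vs-rest as
# a stand-in for the paper's multitask classifier).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: sentences with multi-hot P/I/C/O labels.
X = ["patients with type 2 diabetes were randomized to metformin",
     "the primary outcome was fasting glucose versus placebo at 12 weeks"]
y = [[1, 1, 0, 0], [0, 0, 1, 1]]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # unigrams + bigrams with TF-IDF
    OneVsRestClassifier(LinearSVC()))      # soft-margin linear SVM per label
clf.fit(X, y)
print(clf.predict(["outcomes included weight loss at 12 weeks"]))
```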

One Size Does Not Fit All: Multi-Scale, Cascaded RNNs for Radar Classification

Title One Size Does Not Fit All: Multi-Scale, Cascaded RNNs for Radar Classification
Authors Dhrubojyoti Roy, Sangeeta Srivastava, Aditya Kusupati, Pranshu Jain, Manik Varma, Anish Arora
Abstract Edge sensing with micro-power pulse-Doppler radars is an emergent domain in monitoring and surveillance with several smart city applications. Existing solutions for the clutter versus multi-source radar classification task are limited in terms of either accuracy or efficiency, and in some cases struggle with a trade-off between false alarms and recall of sources. We find that this problem can be resolved by learning the classifier across multiple time-scales. We propose a multi-scale, cascaded recurrent neural network architecture, MSC-RNN, composed of an efficient multi-instance learning (MIL) Recurrent Neural Network (RNN) for clutter discrimination at the lower tier and a more complex RNN classifier for source classification at the upper tier. By conditionally invoking the upper RNN based on the lower tier's output, MSC-RNN achieves an overall accuracy of 0.972. Our approach holistically improves the accuracy and per-class recalls over ML models suitable for radar inferencing. Notably, we outperform cross-domain handcrafted feature engineering with time-domain deep feature learning, while also being up to $\sim$3$\times$ more efficient than a competitive solution.
Tasks Feature Engineering
Published 2019-09-06
URL https://arxiv.org/abs/1909.03082v1
PDF https://arxiv.org/pdf/1909.03082v1.pdf
PWC https://paperswithcode.com/paper/one-size-does-not-fit-all-multi-scale
Repo https://github.com/dhruboroy29/MSCRNN
Framework tf
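
The cascade logic is the interesting part: a cheap lower-tier RNN screens windows for clutter, and the costlier upper-tier classifier runs only on windows that pass. The sketch below illustrates that control flow; sizes, the threshold, and the MIL training details are assumptions.

```python
# Hedged sketch of the two-tier cascade: the upper RNN is invoked only
# when the lower tier believes the window contains a source, not clutter.
import torch
import torch.nn as nn

class CascadedClassifier(nn.Module):
    def __init__(self, in_dim, small=16, big=64, n_sources=3):
        super().__init__()
        self.lower = nn.GRU(in_dim, small, batch_first=True)
        self.gate = nn.Linear(small, 1)          # clutter vs. source score
        self.upper = nn.GRU(in_dim, big, batch_first=True)
        self.head = nn.Linear(big, n_sources)

    def forward(self, window, threshold=0.5):
        # window: [1, T, in_dim] radar feature sequence
        _, h = self.lower(window)
        p_source = torch.sigmoid(self.gate(h[-1]))
        if p_source.item() < threshold:
            return None                          # rejected as clutter; upper tier skipped
        _, h = self.upper(window)                # upper tier runs only on candidates
        return self.head(h[-1]).softmax(-1)      # source class probabilities
```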

Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity

Title Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity
Authors Yunsheng Bai, Hao Ding, Yang Qiao, Agustin Marinovic, Ken Gu, Ting Chen, Yizhou Sun, Wei Wang
Abstract We introduce a novel approach to graph-level representation learning, which is to embed an entire graph into a vector space where the embeddings of two graphs preserve their graph-graph proximity. Our approach, UGRAPHEMB, is a general framework that provides a novel means of performing graph-level embedding in a completely unsupervised and inductive manner. The learned neural network can be considered as a function that receives any graph as input, either seen or unseen in the training set, and transforms it into an embedding. A novel graph-level embedding generation mechanism, called Multi-Scale Node Attention (MSNA), is proposed. Experiments on five real graph datasets show that UGRAPHEMB achieves competitive accuracy in the tasks of graph classification, similarity ranking, and graph visualization.
Tasks Graph Classification, Graph Embedding, Graph Similarity, Representation Learning
Published 2019-04-01
URL https://arxiv.org/abs/1904.01098v2
PDF https://arxiv.org/pdf/1904.01098v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-inductive-whole-graph-embedding
Repo https://github.com/yunshengb/UGraphEmb
Framework tf
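
MSNA's readout can be pictured as attention-weighted pooling of node embeddings at several GNN depths, concatenated into one graph vector. The sketch below is an interpretation of that mechanism, not the reference yunshengb/UGraphEmb code.

```python
# Sketch of attention-based graph-level readout in the spirit of MSNA:
# node embeddings from several GNN layers ("scales") are each pooled with
# a learned attention over nodes, then concatenated.
import torch
import torch.nn as nn

class AttentionReadout(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.ctx = nn.Linear(dim, dim)  # produces a graph-level context vector

    def forward(self, h):
        # h: [N, dim] node embeddings from one scale
        context = torch.sigmoid(self.ctx(h.mean(0)))   # global summary vector
        scores = torch.softmax(h @ context, dim=0)     # attention weight per node
        return (scores.unsqueeze(-1) * h).sum(0)       # [dim] graph embedding

def multiscale_embed(node_embs_per_scale, readouts):
    # Concatenate pooled embeddings from each scale into the final vector.
    return torch.cat([r(h) for h, r in zip(node_embs_per_scale, readouts)])
```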

Robust Invisible Video Watermarking with Attention

Title Robust Invisible Video Watermarking with Attention
Authors Kevin Alex Zhang, Lei Xu, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Abstract The goal of video watermarking is to embed a message within a video file in a way such that it minimally impacts the viewing experience but can be recovered even if the video is redistributed and modified, allowing media producers to assert ownership over their content. This paper presents RivaGAN, a novel architecture for robust video watermarking which features a custom attention-based mechanism for embedding arbitrary data as well as two independent adversarial networks which critique the video quality and optimize for robustness. Using this technique, we are able to achieve state-of-the-art results in deep learning-based video watermarking and produce watermarked videos which have minimal visual distortion and are robust against common video processing operations.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.01285v1
PDF https://arxiv.org/pdf/1909.01285v1.pdf
PWC https://paperswithcode.com/paper/robust-invisible-video-watermarking-with
Repo https://github.com/DAI-Lab/RivaGAN
Framework pytorch
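
A toy version of the embedding side helps fix ideas: an attention map decides where in the frame to hide the message, and a mixing network decides how. This is a loose reading of the architecture; the two adversarial critics and the data-recovery decoder are omitted, and all shapes are illustrative.

```python
# Toy sketch of attention-guided watermark embedding: the message is
# broadcast over the frame, mixed into a residual, and applied only
# where the learned attention mask is high.
import torch
import torch.nn as nn

class WatermarkEncoder(nn.Module):
    def __init__(self, bits=32):
        super().__init__()
        self.attn = nn.Conv2d(3, 1, 3, padding=1)        # where to embed
        self.mix = nn.Conv2d(3 + bits, 3, 3, padding=1)  # how to embed

    def forward(self, frame, msg):
        # frame: [B, 3, H, W], msg: [B, bits] in {0, 1}
        B, _, H, W = frame.shape
        m = msg[:, :, None, None].expand(-1, -1, H, W)   # broadcast message
        a = torch.sigmoid(self.attn(frame))              # attention mask
        residual = self.mix(torch.cat([frame, m], 1))
        return frame + a * residual                      # watermarked frame
```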

SemBleu: A Robust Metric for AMR Parsing Evaluation

Title SemBleu: A Robust Metric for AMR Parsing Evaluation
Authors Linfeng Song, Daniel Gildea
Abstract Evaluating AMR parsing accuracy involves comparing pairs of AMR graphs. The major evaluation metric, SMATCH (Cai and Knight, 2013), searches for one-to-one mappings between the nodes of two AMRs with a greedy hill-climbing algorithm, which leads to search errors. We propose SEMBLEU, a robust metric that extends BLEU (Papineni et al., 2002) to AMRs. It does not suffer from search errors and considers non-local correspondences in addition to local ones. SEMBLEU is fully content-driven and punishes situations where a system’s output does not preserve most information from the input. Preliminary experiments on both sentence and corpus levels show that SEMBLEU has slightly higher consistency with human judgments than SMATCH. Our code is available at http://github.com/freesunshine0316/sembleu.
Tasks Amr Parsing
Published 2019-05-26
URL https://arxiv.org/abs/1905.10726v2
PDF https://arxiv.org/pdf/1905.10726v2.pdf
PWC https://paperswithcode.com/paper/sembleu-a-robust-metric-for-amr-parsing
Repo https://github.com/freesunshine0316/sembleu
Framework none
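
Conceptually, SEMBLEU treats length-k label paths in the AMR graph as the analogue of n-grams and scores candidate/reference overlap with BLEU-style clipped precisions. The toy implementation below illustrates that recipe under simplifying assumptions (acyclic graphs, add-one smoothing, no brevity penalty); the real metric lives in the linked repo.

```python
# Toy SEMBLEU-style scorer: "n-grams" are label paths of length k in the
# graph; overlap is clipped counts, combined by a geometric mean.
from collections import Counter
import math

def path_ngrams(nodes, edges, k):
    # nodes: {id: label}; edges: list of (src, relation, dst).
    # Assumes an acyclic graph, which AMRs usually (nearly) are.
    if k == 1:
        return Counter((nodes[n],) for n in nodes)
    grams = Counter()
    def walk(n, path):
        if len(path) == 2 * k - 1:         # k node labels + k-1 relations
            grams[tuple(path)] += 1
            return
        for s, rel, d in edges:
            if s == n:
                walk(d, path + [rel, nodes[d]])
    for n in nodes:
        walk(n, [nodes[n]])
    return grams

def sembleu(cand, ref, max_k=3):
    # cand/ref: (nodes_dict, edge_list); geometric mean of clipped precisions.
    precisions = []
    for k in range(1, max_k + 1):
        c, r = path_ngrams(*cand, k), path_ngrams(*ref, k)
        overlap = sum((c & r).values())    # clipped (min) counts
        precisions.append((overlap + 1) / (sum(c.values()) + 1))
    return math.exp(sum(math.log(p) for p in precisions) / max_k)
```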

Which Contrast Does Matter? Towards a Deep Understanding of MR Contrast using Collaborative GAN

Title Which Contrast Does Matter? Towards a Deep Understanding of MR Contrast using Collaborative GAN
Authors Dongwook Lee, Won-Jin Moon, Jong Chul Ye
Abstract Thanks to the recent success of generative adversarial networks (GANs) for image synthesis, there are many exciting GAN approaches that successfully synthesize MR image contrast from other images with different contrasts. These approaches are potentially important for image imputation problems, where a complete set of data is often difficult to obtain and image synthesis is one of the key solutions for handling the missing data problem. Unfortunately, the lack of scalability of the existing GAN-based image translation approaches poses a fundamental challenge to understanding the nature of the MR contrast imputation problem: which contrast does matter? Here, we present a systematic approach using Collaborative Generative Adversarial Networks (CollaGAN), which enables learning of the joint image manifold of multiple MR contrasts in order to investigate which contrasts are essential. Our experimental results show that the exogenous contrast from contrast agents is not replaceable, but endogenous contrasts such as T1, T2, etc. can be synthesized from the other contrasts. These findings may give important guidance for acquisition protocol design for MR in real clinical environments.
Tasks Image Generation, Image Imputation, Imputation
Published 2019-05-10
URL https://arxiv.org/abs/1905.04105v1
PDF https://arxiv.org/pdf/1905.04105v1.pdf
PWC https://paperswithcode.com/paper/which-contrast-does-matter-towards-a-deep
Repo https://github.com/jongcye/CollaGAN_MRI
Framework tf
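
The many-to-one setup CollaGAN relies on can be sketched compactly: the generator sees all contrasts with the target one masked out, plus an indicator of which contrast to synthesize. The tiny conv stack below is a placeholder for the paper's architecture, and the adversarial and cycle losses are omitted.

```python
# Hedged sketch of many-to-one contrast imputation: mask one contrast,
# tell the generator which one via a one-hot channel, synthesize it.
import torch
import torch.nn as nn

n_contrasts = 4
gen = nn.Sequential(nn.Conv2d(2 * n_contrasts, 32, 3, padding=1),
                    nn.ReLU(),
                    nn.Conv2d(32, 1, 3, padding=1))

imgs = torch.randn(1, n_contrasts, 64, 64)       # stacked MR contrasts
target = 2                                       # contrast to impute
mask = torch.ones(1, n_contrasts, 64, 64)
mask[:, target] = 0                              # hide the target contrast
onehot = torch.zeros(1, n_contrasts, 64, 64)
onehot[:, target] = 1                            # indicate which one to make
fake = gen(torch.cat([imgs * mask, onehot], 1))  # synthesized target contrast
```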

A Hierarchical Self-Attentive Model for Recommending User-Generated Item Lists

Title A Hierarchical Self-Attentive Model for Recommending User-Generated Item Lists
Authors Yun He, Jianling Wang, Wei Niu, James Caverlee
Abstract User-generated item lists are a popular feature of many different platforms. Examples include lists of books on Goodreads, playlists on Spotify and YouTube, collections of images on Pinterest, and lists of answers on question-answer sites like Zhihu. Recommending item lists is critical for increasing user engagement and connecting users to new items, but many approaches are designed for item-based recommendation, without careful consideration of the complex relationships between items and lists. Hence, in this paper, we propose a novel user-generated list recommendation model called AttList. Two unique features of AttList are careful modeling of (i) hierarchical user preference, which aggregates items to characterize the lists they belong to, and then aggregates these lists to estimate the user preference, naturally fitting into the hierarchical structure of item lists; and (ii) item and list consistency, through a novel self-attentive aggregation layer designed for capturing the consistency of neighboring items and lists to better model user preference. Through experiments over three real-world datasets reflecting different kinds of user-generated item lists, we find that AttList results in significant improvements in NDCG, Precision@k, and Recall@k versus a suite of state-of-the-art baselines. Furthermore, all code and data are available at https://github.com/heyunh2015/AttList.
Tasks
Published 2019-12-30
URL https://arxiv.org/abs/1912.13023v1
PDF https://arxiv.org/pdf/1912.13023v1.pdf
PWC https://paperswithcode.com/paper/a-hierarchical-self-attentive-model-for
Repo https://github.com/heyunh2015/AttList
Framework none
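
The hierarchy in (i) and (ii) amounts to applying the same self-attentive pooling twice: items are aggregated into a list embedding, and list embeddings into a user vector. The sketch below shows that shape with illustrative sizes; the actual model is in the linked repo.

```python
# Hedged sketch of hierarchical self-attentive aggregation:
# item level -> list embedding, list level -> user embedding.
import torch
import torch.nn as nn

class SelfAttentivePool(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1))

    def forward(self, x):                        # x: [n, dim] member embeddings
        w = torch.softmax(self.score(x), dim=0)  # attention over members
        return (w * x).sum(0)                    # [dim] aggregate

dim = 32
item_pool, list_pool = SelfAttentivePool(dim), SelfAttentivePool(dim)
lists = [torch.randn(5, dim), torch.randn(8, dim)]      # items in each list
list_embs = torch.stack([item_pool(l) for l in lists])  # item -> list level
user_emb = list_pool(list_embs)                         # list -> user level
```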

Learning Individual Causal Effects from Networked Observational Data

Title Learning Individual Causal Effects from Networked Observational Data
Authors Ruocheng Guo, Jundong Li, Huan Liu
Abstract Convenient access to observational data enables us to learn causal effects without randomized experiments. This research direction draws increasing attention in research areas such as economics, healthcare, and education. For example, we can study how a medicine (the treatment) causally affects the health condition (the outcome) of a patient using existing electronic health records. To validate causal effects learned from observational data, we have to control for confounding bias – the influence of variables that causally influence both the treatment and the outcome. Existing work along this line overwhelmingly relies on the unconfoundedness assumption that there do not exist unobserved confounders. However, this assumption is untestable and can even be untenable. In fact, an important fact ignored by the majority of previous work is that observational data can come with network information that can be utilized to infer hidden confounders. For example, in an observational study of the individual-level treatment effect of a medicine, instead of randomized experiments, the medicine is often assigned to each individual based on a series of factors. Some of the factors (e.g., socioeconomic status) can be challenging to measure and therefore become hidden confounders. Fortunately, the socioeconomic status of an individual can be reflected by whom she is connected to in social networks. With this fact in mind, we aim to exploit the network information to recognize patterns of hidden confounders, which would further allow us to learn valid individual causal effects from observational data. In this work, we propose a novel causal inference framework, the network deconfounder, which learns representations to unravel patterns of hidden confounders from the network information. Empirically, we perform extensive experiments to validate the effectiveness of the network deconfounder on various datasets.
Tasks Causal Inference
Published 2019-06-08
URL https://arxiv.org/abs/1906.03485v3
PDF https://arxiv.org/pdf/1906.03485v3.pdf
PWC https://paperswithcode.com/paper/learning-individual-treatment-effects-from
Repo https://github.com/rguo12/network-deconfounder-wsdm20
Framework pytorch
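
Structurally, the network deconfounder combines a graph layer, which mixes each unit's features with its neighbors' to surface hidden confounders, with separate treated and control outcome heads. The sketch below shows that skeleton only; the paper's balancing terms and training objective are omitted, and the layer choices are assumptions.

```python
# Hedged sketch: a minimal GNN encoder over the network, plus potential-
# outcome heads whose difference estimates the individual effect.
import torch
import torch.nn as nn

class NetDeconfounder(nn.Module):
    def __init__(self, in_dim, rep_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, rep_dim)
        self.head_t = nn.Linear(rep_dim, 1)   # outcome model if treated
        self.head_c = nn.Linear(rep_dim, 1)   # outcome model if control

    def forward(self, x, adj_norm):
        # adj_norm: [N, N] normalized adjacency; one propagation step mixes
        # each node's features with its neighbors' (a minimal GNN layer).
        z = torch.relu(adj_norm @ self.enc(x))
        y1, y0 = self.head_t(z), self.head_c(z)
        return y1 - y0                        # estimated individual effect
```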

Detecting the Starting Frame of Actions in Video

Title Detecting the Starting Frame of Actions in Video
Authors Iljung S. Kwak, Jian-Zhong Guo, Adam Hantman, David Kriegman, Kristin Branson
Abstract In this work, we address the problem of precisely localizing key frames of an action, for example, the precise time that a pitcher releases a baseball, or the precise time that a crowd begins to applaud. Key frame localization is a largely overlooked and important action-recognition problem, for example in the field of neuroscience, in which we would like to understand the neural activity that produces the start of a bout of an action. To address this problem, we introduce a novel structured loss function that properly weights the types of errors that matter in such applications: it more heavily penalizes extra and missed action start detections over small misalignments. Our structured loss is based on the best matching between predicted and labeled action starts. We train recurrent neural networks (RNNs) to minimize differentiable approximations of this loss. To evaluate these methods, we introduce the Mouse Reach Dataset, a large, annotated video dataset of mice performing a sequence of actions. The dataset was collected and labeled by experts for the purpose of neuroscience research. On this dataset, we demonstrate that our method outperforms related approaches and baseline methods using an unstructured loss.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03340v2
PDF https://arxiv.org/pdf/1906.03340v2.pdf
PWC https://paperswithcode.com/paper/detecting-the-starting-frame-of-actions-in
Repo https://github.com/iskwak/DetetctingActionStarts
Framework pytorch
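
The structured loss described above can be mimicked with an explicit matching: pair predicted and labeled starts, charge a small cost for temporal misalignment and a large constant cost for every extra or missed start. The (non-differentiable) sketch below uses Hungarian matching; the paper trains on differentiable approximations, and the cost values here are illustrative.

```python
# Sketch of a matching-based structured cost for action-start detection.
import numpy as np
from scipy.optimize import linear_sum_assignment

def start_detection_cost(pred, gt, miss_cost=10.0, align_scale=0.1):
    if len(pred) == 0 or len(gt) == 0:
        return miss_cost * (len(pred) + len(gt))
    # Pairwise alignment costs between predicted and labeled start frames.
    C = align_scale * np.abs(np.asarray(pred)[:, None] - np.asarray(gt)[None, :])
    C = np.minimum(C, 2 * miss_cost)      # never worse than leaving both unmatched
    rows, cols = linear_sum_assignment(C) # best one-to-one matching
    matched = C[rows, cols].sum()
    unmatched = (len(pred) - len(rows)) + (len(gt) - len(cols))
    return matched + miss_cost * unmatched

print(start_detection_cost([12, 40], [10, 41, 90]))  # small align cost + 1 miss
```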

Spring-Electrical Models For Link Prediction

Title Spring-Electrical Models For Link Prediction
Authors Yana Kashinskaya, Egor Samosvat, Akmal Artikov
Abstract We propose a link prediction algorithm that is based on spring-electrical models. The idea to study these models came from the fact that spring-electrical models have been successfully used for network visualization. A good network visualization usually implies that nodes similar in terms of network topology, e.g., connected and/or belonging to one cluster, tend to be visualized close to each other. Therefore, we assumed that the Euclidean distance between nodes in the obtained network layout correlates with the probability of a link between them. We evaluate the proposed method against several popular baselines and demonstrate its flexibility by applying it to undirected, directed and bipartite networks.
Tasks Link Prediction
Published 2019-05-24
URL https://arxiv.org/abs/1906.04548v1
PDF https://arxiv.org/pdf/1906.04548v1.pdf
PWC https://paperswithcode.com/paper/spring-electrical-models-for-link-prediction
Repo https://github.com/KashinYana/link-prediction
Framework none
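
Because networkx's spring_layout implements exactly this family of force-directed (Fruchterman-Reingold) models, the whole method fits in a few lines: compute a layout, then rank non-edges by endpoint distance. The example below is a minimal sketch, not the authors' evaluation code.

```python
# Sketch: spring-electrical layout as a link predictor. Closer node
# pairs in the layout are ranked as more likely links.
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
pos = nx.spring_layout(G, seed=42, dim=2)   # spring-electrical embedding

def link_score(u, v):
    return -np.linalg.norm(np.asarray(pos[u]) - np.asarray(pos[v]))

candidates = [(u, v) for u in G for v in G
              if u < v and not G.has_edge(u, v)]
ranked = sorted(candidates, key=lambda e: link_score(*e), reverse=True)
print(ranked[:5])   # top predicted links
```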

Compressed Sensing: From Research to Clinical Practice with Data-Driven Learning

Title Compressed Sensing: From Research to Clinical Practice with Data-Driven Learning
Authors Joseph Y. Cheng, Feiyu Chen, Christopher Sandino, Morteza Mardani, John M. Pauly, Shreyas S. Vasanawala
Abstract Compressed sensing in MRI enables high subsampling factors while maintaining diagnostic image quality. This technique enables shortened scan durations and/or improved image resolution. Further, compressed sensing can increase the diagnostic information and value from each scan performed. Overall, compressed sensing has significant clinical impact in improving the diagnostic quality and patient experience for imaging exams. However, a number of challenges exist when moving compressed sensing from research to the clinic. These challenges include hand-crafted image priors, sensitive tuning parameters, and long reconstruction times. Data-driven learning provides a solution to address these challenges. As a result, compressed sensing can have greater clinical impact. In this tutorial, we will review the compressed sensing formulation and outline steps needed to transform this formulation to a deep learning framework. Supplementary open source code in python will be used to demonstrate this approach with open databases. Further, we will discuss considerations in applying data-driven compressed sensing in the clinical setting.
Tasks
Published 2019-03-19
URL http://arxiv.org/abs/1903.07824v1
PDF http://arxiv.org/pdf/1903.07824v1.pdf
PWC https://paperswithcode.com/paper/compressed-sensing-from-research-to-clinical
Repo https://github.com/MRSRL/dl-cs
Framework tf
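
One common way to transform the compressed-sensing formulation into a deep learning framework is to unroll the iterative reconstruction: alternate data-consistency gradient steps on $|Ax - y|_2^2$ with a learned network standing in for the hand-crafted prior. The sketch below shows that pattern with a generic linear operator as an assumption; it is not the MRI pipeline from the tutorial's code.

```python
# Hedged sketch of an unrolled compressed-sensing reconstruction:
# gradient steps for data consistency, a small learned network as the
# regularizer/proximal step.
import torch
import torch.nn as nn

class UnrolledCS(nn.Module):
    def __init__(self, n, n_iters=5):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))      # learned step size
        self.prox = nn.Sequential(nn.Linear(n, n), nn.ReLU(),
                                  nn.Linear(n, n))       # learned prior step
        self.n_iters = n_iters

    def forward(self, y, A):
        # y: [m] measurements, A: [m, n] measurement operator
        x = A.t() @ y                                    # adjoint initialization
        for _ in range(self.n_iters):
            grad = A.t() @ (A @ x - y)                   # data-consistency gradient
            x = x - self.step * grad                     # gradient step
            x = x + self.prox(x)                         # residual learned prior
        return x
```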

Divide and Conquer the Embedding Space for Metric Learning

Title Divide and Conquer the Embedding Space for Metric Learning
Authors Artsiom Sanakoyeu, Vadim Tschernezki, Uta Büchler, Björn Ommer
Abstract Learning the embedding space, where semantically similar objects are located close together and dissimilar objects far apart, is a cornerstone of many computer vision applications. Existing approaches usually learn a single metric in the embedding space for all available data points, which may have a very complex non-uniform distribution with different notions of similarity between objects, e.g. appearance, shape, color or semantic meaning. Approaches for learning a single distance metric often struggle to encode all different types of relationships and do not generalize well. In this work, we propose a novel easy-to-implement divide and conquer approach for deep metric learning, which significantly improves the state-of-the-art performance of metric learning. Our approach utilizes the embedding space more efficiently by jointly splitting the embedding space and data into $K$ smaller sub-problems. It divides both the data and the embedding space into $K$ subsets and learns $K$ separate distance metrics in the non-overlapping subspaces of the embedding space, defined by groups of neurons in the embedding layer of the neural network. The proposed approach increases the convergence speed and improves generalization since the complexity of each sub-problem is reduced compared to the original one. We show that our approach outperforms the state-of-the-art by a large margin in retrieval, clustering and re-identification tasks on CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes and PKU VehicleID datasets.
Tasks Metric Learning
Published 2019-06-14
URL https://arxiv.org/abs/1906.05990v1
PDF https://arxiv.org/pdf/1906.05990v1.pdf
PWC https://paperswithcode.com/paper/divide-and-conquer-the-embedding-space-for-1
Repo https://github.com/CompVis/metric-learning-divide-and-conquer
Framework pytorch
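
The recipe in the abstract (cluster the data into $K$ groups, give each cluster a disjoint slice of the embedding dimensions, and train each slice as its own metric on its cluster) can be sketched as below. The clustering and the training-loop placeholder are simplifications of the full method.

```python
# Sketch of the divide-and-conquer split: K data clusters, K disjoint
# embedding-dimension groups, one sub-metric per pair.
import numpy as np
from sklearn.cluster import KMeans

K, emb_dim = 4, 128
embeddings = np.random.randn(1000, emb_dim)     # stand-in network outputs

clusters = KMeans(n_clusters=K, n_init=10).fit_predict(embeddings)
slices = np.array_split(np.arange(emb_dim), K)  # disjoint dimension groups

for k in range(K):
    sub_data = embeddings[clusters == k][:, slices[k]]
    # ... train a metric (e.g. triplet loss) for learner k on sub_data only ...
    print(f"learner {k}: {sub_data.shape[0]} samples, dims {slices[k][[0, -1]]}")
```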