February 1, 2020

3196 words 16 mins read

Paper Group AWR 307

Geometric Graph Convolutional Neural Networks

Title Geometric Graph Convolutional Neural Networks
Authors Przemysław Spurek, Tomasz Danel, Jacek Tabor, Marek Śmieja, Łukasz Struski, Agnieszka Słowik, Łukasz Maziarka
Title Geometric Graph Convolutional Neural Networks
Abstract Graph Convolutional Networks (GCNs) have recently become the primary choice for learning from graph-structured data, superseding hash fingerprints in representing chemical compounds. However, GCNs lack the ability to take into account the ordering of node neighbors, even when there is a geometric interpretation of the graph vertices that provides an order based on their spatial positions. To remedy this issue, we propose the Geometric Graph Convolutional Network (geo-GCN), which uses spatial features to efficiently learn from graphs that can be naturally located in space. Our contribution is threefold: we propose a GCN-inspired architecture which (i) leverages node positions, (ii) is a proper generalisation of both GCNs and Convolutional Neural Networks (CNNs), and (iii) benefits from augmentation, which further improves performance and assures invariance with respect to the desired properties. Empirically, geo-GCN outperforms state-of-the-art graph-based methods on image classification and chemical tasks.
Tasks Image Classification
Published 2019-09-11
URL https://arxiv.org/abs/1909.05310v1
PDF https://arxiv.org/pdf/1909.05310v1.pdf
PWC https://paperswithcode.com/paper/geometric-graph-convolutional-neural-networks
Repo https://github.com/gmum/geo-gcn
Framework pytorch
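
The core idea above, weighting neighbor aggregation by node positions, can be sketched in a few lines. The layer below is a hedged illustration of position-conditioned message passing, not the authors' exact formulation; see the gmum/geo-gcn repo for the real implementation.

```python
# Sketch of a position-aware graph convolution in the spirit of geo-GCN:
# neighbor messages are gated by a learned function of the relative
# spatial offset between nodes. Layer details are assumptions.
import torch
import torch.nn as nn

class GeoConv(nn.Module):
    def __init__(self, in_dim, out_dim, pos_dim=2):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)   # transforms node features
        self.pos = nn.Linear(pos_dim, out_dim)  # scores relative positions

    def forward(self, x, pos, edge_index):
        # x: [N, in_dim] features, pos: [N, pos_dim] coordinates,
        # edge_index: [2, E] long tensor with rows (source, target).
        src, dst = edge_index
        gate = torch.relu(self.pos(pos[src] - pos[dst]))  # offset-dependent weight
        msg = gate * self.lin(x[src])                     # modulated neighbor message
        out = torch.zeros(x.size(0), msg.size(1), device=x.device)
        out.index_add_(0, dst, msg)                       # sum messages per target node
        return out
```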

Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

Title Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations
Authors Aarne Talman, Antti Suni, Hande Celikkanat, Sofoklis Kakouros, Jörg Tiedemann, Martti Vainio
Abstract In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge, this is the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark dataset in detail, and we train a number of different models, ranging from feature-based classifiers to neural network systems, for the prediction of discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally, we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and the methods for predicting prosodic prominence from text. The dataset and the code for the models are publicly available.
Tasks Prosody Prediction
Published 2019-08-06
URL https://arxiv.org/abs/1908.02262v1
PDF https://arxiv.org/pdf/1908.02262v1.pdf
PWC https://paperswithcode.com/paper/predicting-prosodic-prominence-from-text-with
Repo https://github.com/Helsinki-NLP/prosody
Framework pytorch
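
Since the paper's best model is BERT fine-tuned for token-level prediction, a minimal sketch looks like ordinary token classification. The checkpoint name and three-way label set here are assumptions; the actual training code is in the Helsinki-NLP/prosody repo.

```python
# Hedged sketch: prosodic prominence prediction as token classification
# with a pre-trained BERT. Checkpoint and label count are illustrative.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # e.g. 3 discretized prominence levels

enc = tokenizer("and so the story begins", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits        # [1, seq_len, num_labels]
print(logits.argmax(-1))                # per-token prominence class
```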

Recurrent Space-time Graph Neural Networks

Title Recurrent Space-time Graph Neural Networks
Authors Andrei Nicolicioiu, Iulia Duta, Marius Leordeanu
Abstract Learning in the space-time domain remains a very challenging problem in machine learning and computer vision. Current computational models for understanding spatio-temporal visual data are heavily rooted in the classical single-image based paradigm. It is not yet well understood how to integrate information in space and time into a single, general model. We propose a neural graph model, recurrent in space and time, suitable for capturing both the local appearance and the complex higher-level interactions of different entities and objects within the changing world scene. Nodes and edges in our graph have dedicated neural networks for processing information. Nodes operate over features extracted from local parts in space and time and previous memory states. Edges process messages between connected nodes at different locations and spatial scales or between past and present time. Messages are passed iteratively in order to transmit information globally and establish long range interactions. Our model is general and could learn to recognize a variety of high level spatio-temporal concepts and be applied to different learning tasks. We demonstrate, through extensive experiments and ablation studies, that our model outperforms strong baselines and top published methods on recognizing complex activities in video. Moreover, we obtain state-of-the-art performance on the challenging Something-Something human-object interaction dataset.
Tasks Action Recognition In Videos, Human-Object Interaction Detection, Video Understanding
Published 2019-04-11
URL https://arxiv.org/abs/1904.05582v4
PDF https://arxiv.org/pdf/1904.05582v4.pdf
PWC https://paperswithcode.com/paper/recurrent-space-time-graphs-for-video
Repo https://github.com/IuliaDuta/RSTG
Framework tf
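
A hedged sketch of one space-time step, spatial message passing within a frame followed by a recurrent update of each node's memory, is below. The message function, adjacency format, and dimensions are illustrative assumptions, not the RSTG code.

```python
# Minimal sketch of the recurrent space-time idea: nodes extracted from
# spatial regions exchange messages within each frame, then each node's
# state is updated over time with a recurrent cell.
import torch
import torch.nn as nn

class SpaceTimeStep(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, feats, state, adj):
        # feats: [N, dim] per-node features for the current frame
        # state: [N, dim] node memory carried over from the previous frame
        # adj:   [N, N] spatial adjacency (1 where nodes interact)
        N = feats.size(0)
        pair = torch.cat([feats.unsqueeze(1).expand(N, N, -1),
                          feats.unsqueeze(0).expand(N, N, -1)], dim=-1)
        messages = (adj.unsqueeze(-1) * self.msg(pair)).sum(1)  # aggregate in space
        return self.gru(messages, state)                        # recur in time
```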

Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification

Title Extracting PICO elements from RCT abstracts using 1-2gram analysis and multitask classification
Authors Xia Yuan, Liao Xiaoli, Li Shilei, Shi Qinwen, Wu Jinfa, Li Ke
Abstract The core of evidence-based medicine is to read and analyze numerous papers in the medical literature on a specific clinical problem and summarize the authoritative answers to that problem. Currently, to formulate a clear and focused clinical problem, the popular PICO framework is usually adopted, in which each clinical problem is considered to consist of four parts: patient/problem (P), intervention (I), comparison (C) and outcome (O). In this study, we compared several classification models that are commonly used in traditional machine learning. Next, we developed a multitask classification model based on a soft-margin SVM with a specialized feature engineering method that combines 1-2gram analysis with TF-IDF analysis. Finally, we trained and tested several generic models on an open-source data set from BioNLP 2018. The results show that the proposed multitask SVM classification model based on 1-2gram TF-IDF features exhibits the best performance among the tested models.
Tasks Feature Engineering
Published 2019-01-24
URL http://arxiv.org/abs/1901.08351v1
PDF http://arxiv.org/pdf/1901.08351v1.pdf
PWC https://paperswithcode.com/paper/extracting-pico-elements-from-rct-abstracts
Repo https://github.com/brucexia6116/PICO_elements_classification
Framework none
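
The feature pipeline described above maps directly onto standard tooling: TF-IDF over unigrams and bigrams feeding a soft-margin linear SVM. The sketch below uses a one-vs-rest wrapper as a stand-in for the paper's multitask setup, with toy sentences and labels that are purely illustrative.

```python
# Sketch of the 1-2gram TF-IDF + SVM pipeline: each PICO element is a
# binary sentence-level task sharing the same features (one-vs-rest as
# a stand-in for the paper's multitask classifier).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data: sentences with multi-hot P/I/C/O labels.
X = ["patients with type 2 diabetes were randomized to metformin",
     "the primary outcome was fasting glucose versus placebo at 12 weeks"]
y = [[1, 1, 0, 0], [0, 0, 1, 1]]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # unigrams + bigrams with TF-IDF
    OneVsRestClassifier(LinearSVC()))      # soft-margin linear SVM per label
clf.fit(X, y)
print(clf.predict(["outcomes included weight loss at 12 weeks"]))
```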

One Size Does Not Fit All: Multi-Scale, Cascaded RNNs for Radar Classification

Title One Size Does Not Fit All: Multi-Scale, Cascaded RNNs for Radar Classification
Authors Dhrubojyoti Roy, Sangeeta Srivastava, Aditya Kusupati, Pranshu Jain, Manik Varma, Anish Arora
Abstract Edge sensing with micro-power pulse-Doppler radars is an emergent domain in monitoring and surveillance with several smart city applications. Existing solutions for the clutter versus multi-source radar classification task are limited in terms of either accuracy or efficiency, and in some cases struggle with a trade-off between false alarms and recall of sources. We find that this problem can be resolved by learning the classifier across multiple time-scales. We propose a multi-scale, cascaded recurrent neural network architecture, MSC-RNN, composed of an efficient multi-instance learning (MIL) Recurrent Neural Network (RNN) for clutter discrimination at the lower tier and a more complex RNN classifier for source classification at the upper tier. By conditionally invoking the upper RNN based on the lower tier's output, MSC-RNN achieves an overall accuracy of 0.972. Our approach holistically improves the accuracy and per-class recalls over ML models suitable for radar inferencing. Notably, we outperform cross-domain handcrafted feature engineering with time-domain deep feature learning, while also being up to $\sim$3$\times$ more efficient than a competitive solution.
Tasks Feature Engineering
Published 2019-09-06
URL https://arxiv.org/abs/1909.03082v1
PDF https://arxiv.org/pdf/1909.03082v1.pdf
PWC https://paperswithcode.com/paper/one-size-does-not-fit-all-multi-scale
Repo https://github.com/dhruboroy29/MSCRNN
Framework tf
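
The cascade logic is the interesting part: a cheap lower-tier RNN screens windows for clutter, and the costlier upper-tier classifier runs only on windows that pass. The sketch below illustrates that control flow; sizes, the threshold, and the MIL training details are assumptions.

```python
# Hedged sketch of the two-tier cascade: the upper RNN is invoked only
# when the lower tier believes the window contains a source, not clutter.
import torch
import torch.nn as nn

class CascadedClassifier(nn.Module):
    def __init__(self, in_dim, small=16, big=64, n_sources=3):
        super().__init__()
        self.lower = nn.GRU(in_dim, small, batch_first=True)
        self.gate = nn.Linear(small, 1)          # clutter vs. source score
        self.upper = nn.GRU(in_dim, big, batch_first=True)
        self.head = nn.Linear(big, n_sources)

    def forward(self, window, threshold=0.5):
        # window: [1, T, in_dim] radar feature sequence
        _, h = self.lower(window)
        p_source = torch.sigmoid(self.gate(h[-1]))
        if p_source.item() < threshold:
            return None                          # rejected as clutter; upper tier skipped
        _, h = self.upper(window)                # upper tier runs only on candidates
        return self.head(h[-1]).softmax(-1)      # source class probabilities
```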

Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity

Title Unsupervised Inductive Graph-Level Representation Learning via Graph-Graph Proximity
Authors Yunsheng Bai, Hao Ding, Yang Qiao, Agustin Marinovic, Ken Gu, Ting Chen, Yizhou Sun, Wei Wang
Abstract We introduce a novel approach to graph-level representation learning, which is to embed an entire graph into a vector space where the embeddings of two graphs preserve their graph-graph proximity. Our approach, UGRAPHEMB, is a general framework that provides a novel means of performing graph-level embedding in a completely unsupervised and inductive manner. The learned neural network can be considered as a function that receives any graph as input, either seen or unseen in the training set, and transforms it into an embedding. A novel graph-level embedding generation mechanism, called Multi-Scale Node Attention (MSNA), is proposed. Experiments on five real graph datasets show that UGRAPHEMB achieves competitive accuracy in the tasks of graph classification, similarity ranking, and graph visualization.
Tasks Graph Classification, Graph Embedding, Graph Similarity, Representation Learning
Published 2019-04-01
URL https://arxiv.org/abs/1904.01098v2
PDF https://arxiv.org/pdf/1904.01098v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-inductive-whole-graph-embedding
Repo https://github.com/yunshengb/UGraphEmb
Framework tf
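
MSNA's readout can be pictured as attention-weighted pooling of node embeddings at several GNN depths, concatenated into one graph vector. The sketch below is an interpretation of that mechanism, not the reference yunshengb/UGraphEmb code.

```python
# Sketch of attention-based graph-level readout in the spirit of MSNA:
# node embeddings from several GNN layers ("scales") are each pooled with
# a learned attention over nodes, then concatenated.
import torch
import torch.nn as nn

class AttentionReadout(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.ctx = nn.Linear(dim, dim)  # produces a graph-level context vector

    def forward(self, h):
        # h: [N, dim] node embeddings from one scale
        context = torch.sigmoid(self.ctx(h.mean(0)))   # global summary vector
        scores = torch.softmax(h @ context, dim=0)     # attention weight per node
        return (scores.unsqueeze(-1) * h).sum(0)       # [dim] graph embedding

def multiscale_embed(node_embs_per_scale, readouts):
    # Concatenate pooled embeddings from each scale into the final vector.
    return torch.cat([r(h) for h, r in zip(node_embs_per_scale, readouts)])
```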

Robust Invisible Video Watermarking with Attention

Title Robust Invisible Video Watermarking with Attention
Authors Kevin Alex Zhang, Lei Xu, Alfredo Cuesta-Infante, Kalyan Veeramachaneni
Abstract The goal of video watermarking is to embed a message within a video file in a way such that it minimally impacts the viewing experience but can be recovered even if the video is redistributed and modified, allowing media producers to assert ownership over their content. This paper presents RivaGAN, a novel architecture for robust video watermarking which features a custom attention-based mechanism for embedding arbitrary data as well as two independent adversarial networks which critique the video quality and optimize for robustness. Using this technique, we are able to achieve state-of-the-art results in deep learning-based video watermarking and produce watermarked videos which have minimal visual distortion and are robust against common video processing operations.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.01285v1
PDF https://arxiv.org/pdf/1909.01285v1.pdf
PWC https://paperswithcode.com/paper/robust-invisible-video-watermarking-with
Repo https://github.com/DAI-Lab/RivaGAN
Framework pytorch
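
A toy version of the embedding side helps fix ideas: an attention map decides where in the frame to hide the message, and a mixing network decides how. This is a loose reading of the architecture; the two adversarial critics and the data-recovery decoder are omitted, and all shapes are illustrative.

```python
# Toy sketch of attention-guided watermark embedding: the message is
# broadcast over the frame, mixed into a residual, and applied only
# where the learned attention mask is high.
import torch
import torch.nn as nn

class WatermarkEncoder(nn.Module):
    def __init__(self, bits=32):
        super().__init__()
        self.attn = nn.Conv2d(3, 1, 3, padding=1)        # where to embed
        self.mix = nn.Conv2d(3 + bits, 3, 3, padding=1)  # how to embed

    def forward(self, frame, msg):
        # frame: [B, 3, H, W], msg: [B, bits] in {0, 1}
        B, _, H, W = frame.shape
        m = msg[:, :, None, None].expand(-1, -1, H, W)   # broadcast message
        a = torch.sigmoid(self.attn(frame))              # attention mask
        residual = self.mix(torch.cat([frame, m], 1))
        return frame + a * residual                      # watermarked frame
```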

SemBleu: A Robust Metric for AMR Parsing Evaluation

Title SemBleu: A Robust Metric for AMR Parsing Evaluation
Authors Linfeng Song, Daniel Gildea
Abstract Evaluating AMR parsing accuracy involves comparing pairs of AMR graphs. The major evaluation metric, SMATCH (Cai and Knight, 2013), searches for one-to-one mappings between the nodes of two AMRs with a greedy hill-climbing algorithm, which leads to search errors. We propose SEMBLEU, a robust metric that extends BLEU (Papineni et al., 2002) to AMRs. It does not suffer from search errors and considers non-local correspondences in addition to local ones. SEMBLEU is fully content-driven and punishes situations where a system’s output does not preserve most information from the input. Preliminary experiments on both sentence and corpus levels show that SEMBLEU has slightly higher consistency with human judgments than SMATCH. Our code is available at http://github.com/freesunshine0316/sembleu.
Tasks Amr Parsing
Published 2019-05-26
URL https://arxiv.org/abs/1905.10726v2
PDF https://arxiv.org/pdf/1905.10726v2.pdf
PWC https://paperswithcode.com/paper/sembleu-a-robust-metric-for-amr-parsing
Repo https://github.com/freesunshine0316/sembleu
Framework none
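
Conceptually, SEMBLEU treats length-k label paths in the AMR graph as the analogue of n-grams and scores candidate/reference overlap with BLEU-style clipped precisions. The toy implementation below illustrates that recipe under simplifying assumptions (acyclic graphs, add-one smoothing, no brevity penalty); the real metric lives in the linked repo.

```python
# Toy SEMBLEU-style scorer: "n-grams" are label paths of length k in the
# graph; overlap is clipped counts, combined by a geometric mean.
from collections import Counter
import math

def path_ngrams(nodes, edges, k):
    # nodes: {id: label}; edges: list of (src, relation, dst).
    # Assumes an acyclic graph, which AMRs usually (nearly) are.
    if k == 1:
        return Counter((nodes[n],) for n in nodes)
    grams = Counter()
    def walk(n, path):
        if len(path) == 2 * k - 1:         # k node labels + k-1 relations
            grams[tuple(path)] += 1
            return
        for s, rel, d in edges:
            if s == n:
                walk(d, path + [rel, nodes[d]])
    for n in nodes:
        walk(n, [nodes[n]])
    return grams

def sembleu(cand, ref, max_k=3):
    # cand/ref: (nodes_dict, edge_list); geometric mean of clipped precisions.
    precisions = []
    for k in range(1, max_k + 1):
        c, r = path_ngrams(*cand, k), path_ngrams(*ref, k)
        overlap = sum((c & r).values())    # clipped (min) counts
        precisions.append((overlap + 1) / (sum(c.values()) + 1))
    return math.exp(sum(math.log(p) for p in precisions) / max_k)
```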

Which Contrast Does Matter? Towards a Deep Understanding of MR Contrast using Collaborative GAN

Title Which Contrast Does Matter? Towards a Deep Understanding of MR Contrast using Collaborative GAN
Authors Dongwook Lee, Won-Jin Moon, Jong Chul Ye
Abstract Thanks to the recent success of generative adversarial networks (GANs) for image synthesis, there are many exciting GAN approaches that successfully synthesize MR image contrast from other images with different contrasts. These approaches are potentially important for image imputation problems, where a complete set of data is often difficult to obtain and image synthesis is one of the key solutions for handling the missing data problem. Unfortunately, the lack of scalability of the existing GAN-based image translation approaches poses a fundamental challenge to understanding the nature of the MR contrast imputation problem: which contrast does matter? Here, we present a systematic approach using Collaborative Generative Adversarial Networks (CollaGAN), which enables learning of the joint image manifold of multiple MR contrasts in order to investigate which contrasts are essential. Our experimental results show that the exogenous contrast from contrast agents is not replaceable, but endogenous contrasts such as T1, T2, etc. can be synthesized from the other contrasts. These findings may give important guidance for acquisition protocol design for MR in real clinical environments.
Tasks Image Generation, Image Imputation, Imputation
Published 2019-05-10
URL https://arxiv.org/abs/1905.04105v1
PDF https://arxiv.org/pdf/1905.04105v1.pdf
PWC https://paperswithcode.com/paper/which-contrast-does-matter-towards-a-deep
Repo https://github.com/jongcye/CollaGAN_MRI
Framework tf
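
The many-to-one setup CollaGAN relies on can be sketched compactly: the generator sees all contrasts with the target one masked out, plus an indicator of which contrast to synthesize. The tiny conv stack below is a placeholder for the paper's architecture, and the adversarial and cycle losses are omitted.

```python
# Hedged sketch of many-to-one contrast imputation: mask one contrast,
# tell the generator which one via a one-hot channel, synthesize it.
import torch
import torch.nn as nn

n_contrasts = 4
gen = nn.Sequential(nn.Conv2d(2 * n_contrasts, 32, 3, padding=1),
                    nn.ReLU(),
                    nn.Conv2d(32, 1, 3, padding=1))

imgs = torch.randn(1, n_contrasts, 64, 64)       # stacked MR contrasts
target = 2                                       # contrast to impute
mask = torch.ones(1, n_contrasts, 64, 64)
mask[:, target] = 0                              # hide the target contrast
onehot = torch.zeros(1, n_contrasts, 64, 64)
onehot[:, target] = 1                            # indicate which one to make
fake = gen(torch.cat([imgs * mask, onehot], 1))  # synthesized target contrast
```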

A Hierarchical Self-Attentive Model for Recommending User-Generated Item Lists

Title A Hierarchical Self-Attentive Model for Recommending User-Generated Item Lists
Authors Yun He, Jianling Wang, Wei Niu, James Caverlee
Abstract User-generated item lists are a popular feature of many different platforms. Examples include lists of books on Goodreads, playlists on Spotify and YouTube, collections of images on Pinterest, and lists of answers on question-answer sites like Zhihu. Recommending item lists is critical for increasing user engagement and connecting users to new items, but many approaches are designed for item-based recommendation, without careful consideration of the complex relationships between items and lists. Hence, in this paper, we propose a novel user-generated list recommendation model called AttList. Two unique features of AttList are careful modeling of (i) hierarchical user preference, which aggregates items to characterize the lists they belong to, and then aggregates these lists to estimate the user preference, naturally fitting into the hierarchical structure of item lists; and (ii) item and list consistency, through a novel self-attentive aggregation layer designed for capturing the consistency of neighboring items and lists to better model user preference. Through experiments over three real-world datasets reflecting different kinds of user-generated item lists, we find that AttList results in significant improvements in NDCG, Precision@k, and Recall@k versus a suite of state-of-the-art baselines. Furthermore, all code and data are available at https://github.com/heyunh2015/AttList.
Tasks
Published 2019-12-30
URL https://arxiv.org/abs/1912.13023v1
PDF https://arxiv.org/pdf/1912.13023v1.pdf
PWC https://paperswithcode.com/paper/a-hierarchical-self-attentive-model-for
Repo https://github.com/heyunh2015/AttList
Framework none
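
The hierarchy in (i) and (ii) amounts to applying the same self-attentive pooling twice: items are aggregated into a list embedding, and list embeddings into a user vector. The sketch below shows that shape with illustrative sizes; the actual model is in the linked repo.

```python
# Hedged sketch of hierarchical self-attentive aggregation:
# item level -> list embedding, list level -> user embedding.
import torch
import torch.nn as nn

class SelfAttentivePool(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1))

    def forward(self, x):                        # x: [n, dim] member embeddings
        w = torch.softmax(self.score(x), dim=0)  # attention over members
        return (w * x).sum(0)                    # [dim] aggregate

dim = 32
item_pool, list_pool = SelfAttentivePool(dim), SelfAttentivePool(dim)
lists = [torch.randn(5, dim), torch.randn(8, dim)]      # items in each list
list_embs = torch.stack([item_pool(l) for l in lists])  # item -> list level
user_emb = list_pool(list_embs)                         # list -> user level
```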

Learning Individual Causal Effects from Networked Observational Data

Title Learning Individual Causal Effects from Networked Observational Data
Authors Ruocheng Guo, Jundong Li, Huan Liu
Abstract Convenient access to observational data enables us to learn causal effects without randomized experiments. This research direction draws increasing attention in research areas such as economics, healthcare, and education. For example, we can study how a medicine (the treatment) causally affects the health condition (the outcome) of a patient using existing electronic health records. To validate causal effects learned from observational data, we have to control for confounding bias – the influence of variables that causally influence both the treatment and the outcome. Existing work along this line overwhelmingly relies on the unconfoundedness assumption that there do not exist unobserved confounders. However, this assumption is untestable and can even be untenable. In fact, an important fact ignored by the majority of previous work is that observational data can come with network information that can be utilized to infer hidden confounders. For example, in an observational study of the individual-level treatment effect of a medicine, instead of randomized experiments, the medicine is often assigned to each individual based on a series of factors. Some of the factors (e.g., socioeconomic status) can be challenging to measure and therefore become hidden confounders. Fortunately, the socioeconomic status of an individual can be reflected by whom she is connected to in social networks. With this fact in mind, we aim to exploit the network information to recognize patterns of hidden confounders, which would further allow us to learn valid individual causal effects from observational data. In this work, we propose a novel causal inference framework, the network deconfounder, which learns representations to unravel patterns of hidden confounders from the network information. Empirically, we perform extensive experiments to validate the effectiveness of the network deconfounder on various datasets.
Tasks Causal Inference
Published 2019-06-08
URL https://arxiv.org/abs/1906.03485v3
PDF https://arxiv.org/pdf/1906.03485v3.pdf
PWC https://paperswithcode.com/paper/learning-individual-treatment-effects-from
Repo https://github.com/rguo12/network-deconfounder-wsdm20
Framework pytorch
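
Structurally, the network deconfounder combines a graph layer, which mixes each unit's features with its neighbors' to surface hidden confounders, with separate treated and control outcome heads. The sketch below shows that skeleton only; the paper's balancing terms and training objective are omitted, and the layer choices are assumptions.

```python
# Hedged sketch: a minimal GNN encoder over the network, plus potential-
# outcome heads whose difference estimates the individual effect.
import torch
import torch.nn as nn

class NetDeconfounder(nn.Module):
    def __init__(self, in_dim, rep_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, rep_dim)
        self.head_t = nn.Linear(rep_dim, 1)   # outcome model if treated
        self.head_c = nn.Linear(rep_dim, 1)   # outcome model if control

    def forward(self, x, adj_norm):
        # adj_norm: [N, N] normalized adjacency; one propagation step mixes
        # each node's features with its neighbors' (a minimal GNN layer).
        z = torch.relu(adj_norm @ self.enc(x))
        y1, y0 = self.head_t(z), self.head_c(z)
        return y1 - y0                        # estimated individual effect
```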

Detecting the Starting Frame of Actions in Video

Title Detecting the Starting Frame of Actions in Video
Authors Iljung S. Kwak, Jian-Zhong Guo, Adam Hantman, David Kriegman, Kristin Branson
Abstract In this work, we address the problem of precisely localizing key frames of an action, for example, the precise time that a pitcher releases a baseball, or the precise time that a crowd begins to applaud. Key frame localization is a largely overlooked and important action-recognition problem, for example in the field of neuroscience, in which we would like to understand the neural activity that produces the start of a bout of an action. To address this problem, we introduce a novel structured loss function that properly weights the types of errors that matter in such applications: it more heavily penalizes extra and missed action start detections over small misalignments. Our structured loss is based on the best matching between predicted and labeled action starts. We train recurrent neural networks (RNNs) to minimize differentiable approximations of this loss. To evaluate these methods, we introduce the Mouse Reach Dataset, a large, annotated video dataset of mice performing a sequence of actions. The dataset was collected and labeled by experts for the purpose of neuroscience research. On this dataset, we demonstrate that our method outperforms related approaches and baseline methods using an unstructured loss.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03340v2
PDF https://arxiv.org/pdf/1906.03340v2.pdf
PWC https://paperswithcode.com/paper/detecting-the-starting-frame-of-actions-in
Repo https://github.com/iskwak/DetetctingActionStarts
Framework pytorch
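
The structured loss described above can be mimicked with an explicit matching: pair predicted and labeled starts, charge a small cost for temporal misalignment and a large constant cost for every extra or missed start. The (non-differentiable) sketch below uses Hungarian matching; the paper trains on differentiable approximations, and the cost values here are illustrative.

```python
# Sketch of a matching-based structured cost for action-start detection.
import numpy as np
from scipy.optimize import linear_sum_assignment

def start_detection_cost(pred, gt, miss_cost=10.0, align_scale=0.1):
    if len(pred) == 0 or len(gt) == 0:
        return miss_cost * (len(pred) + len(gt))
    # Pairwise alignment costs between predicted and labeled start frames.
    C = align_scale * np.abs(np.asarray(pred)[:, None] - np.asarray(gt)[None, :])
    C = np.minimum(C, 2 * miss_cost)      # never worse than leaving both unmatched
    rows, cols = linear_sum_assignment(C) # best one-to-one matching
    matched = C[rows, cols].sum()
    unmatched = (len(pred) - len(rows)) + (len(gt) - len(cols))
    return matched + miss_cost * unmatched

print(start_detection_cost([12, 40], [10, 41, 90]))  # small align cost + 1 miss
```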

Spring-Electrical Models For Link Prediction

Title Spring-Electrical Models For Link Prediction
Authors Yana Kashinskaya, Egor Samosvat, Akmal Artikov
Abstract We propose a link prediction algorithm that is based on spring-electrical models. The idea to study these models came from the fact that spring-electrical models have been successfully used for network visualization. A good network visualization usually implies that nodes similar in terms of network topology, e.g., connected and/or belonging to one cluster, tend to be visualized close to each other. Therefore, we assumed that the Euclidean distance between nodes in the obtained network layout correlates with the probability of a link between them. We evaluate the proposed method against several popular baselines and demonstrate its flexibility by applying it to undirected, directed and bipartite networks.
Tasks Link Prediction
Published 2019-05-24
URL https://arxiv.org/abs/1906.04548v1
PDF https://arxiv.org/pdf/1906.04548v1.pdf
PWC https://paperswithcode.com/paper/spring-electrical-models-for-link-prediction
Repo https://github.com/KashinYana/link-prediction
Framework none
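
Because networkx's spring_layout implements exactly this family of force-directed (Fruchterman-Reingold) models, the whole method fits in a few lines: compute a layout, then rank non-edges by endpoint distance. The example below is a minimal sketch, not the authors' evaluation code.

```python
# Sketch: spring-electrical layout as a link predictor. Closer node
# pairs in the layout are ranked as more likely links.
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
pos = nx.spring_layout(G, seed=42, dim=2)   # spring-electrical embedding

def link_score(u, v):
    return -np.linalg.norm(np.asarray(pos[u]) - np.asarray(pos[v]))

candidates = [(u, v) for u in G for v in G
              if u < v and not G.has_edge(u, v)]
ranked = sorted(candidates, key=lambda e: link_score(*e), reverse=True)
print(ranked[:5])   # top predicted links
```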

Compressed Sensing: From Research to Clinical Practice with Data-Driven Learning

Title Compressed Sensing: From Research to Clinical Practice with Data-Driven Learning
Authors Joseph Y. Cheng, Feiyu Chen, Christopher Sandino, Morteza Mardani, John M. Pauly, Shreyas S. Vasanawala
Abstract Compressed sensing in MRI enables high subsampling factors while maintaining diagnostic image quality. This technique enables shortened scan durations and/or improved image resolution. Further, compressed sensing can increase the diagnostic information and value from each scan performed. Overall, compressed sensing has significant clinical impact in improving the diagnostic quality and patient experience for imaging exams. However, a number of challenges exist when moving compressed sensing from research to the clinic. These challenges include hand-crafted image priors, sensitive tuning parameters, and long reconstruction times. Data-driven learning provides a solution to address these challenges. As a result, compressed sensing can have greater clinical impact. In this tutorial, we will review the compressed sensing formulation and outline steps needed to transform this formulation to a deep learning framework. Supplementary open source code in python will be used to demonstrate this approach with open databases. Further, we will discuss considerations in applying data-driven compressed sensing in the clinical setting.
Tasks
Published 2019-03-19
URL http://arxiv.org/abs/1903.07824v1
PDF http://arxiv.org/pdf/1903.07824v1.pdf
PWC https://paperswithcode.com/paper/compressed-sensing-from-research-to-clinical
Repo https://github.com/MRSRL/dl-cs
Framework tf
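
One common way to transform the compressed-sensing formulation into a deep learning framework is to unroll the iterative reconstruction: alternate data-consistency gradient steps on $|Ax - y|_2^2$ with a learned network standing in for the hand-crafted prior. The sketch below shows that pattern with a generic linear operator as an assumption; it is not the MRI pipeline from the tutorial's code.

```python
# Hedged sketch of an unrolled compressed-sensing reconstruction:
# gradient steps for data consistency, a small learned network as the
# regularizer/proximal step.
import torch
import torch.nn as nn

class UnrolledCS(nn.Module):
    def __init__(self, n, n_iters=5):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))      # learned step size
        self.prox = nn.Sequential(nn.Linear(n, n), nn.ReLU(),
                                  nn.Linear(n, n))       # learned prior step
        self.n_iters = n_iters

    def forward(self, y, A):
        # y: [m] measurements, A: [m, n] measurement operator
        x = A.t() @ y                                    # adjoint initialization
        for _ in range(self.n_iters):
            grad = A.t() @ (A @ x - y)                   # data-consistency gradient
            x = x - self.step * grad                     # gradient step
            x = x + self.prox(x)                         # residual learned prior
        return x
```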

Divide and Conquer the Embedding Space for Metric Learning

Title Divide and Conquer the Embedding Space for Metric Learning
Authors Artsiom Sanakoyeu, Vadim Tschernezki, Uta Büchler, Björn Ommer
Abstract Learning the embedding space, where semantically similar objects are located close together and dissimilar objects far apart, is a cornerstone of many computer vision applications. Existing approaches usually learn a single metric in the embedding space for all available data points, which may have a very complex non-uniform distribution with different notions of similarity between objects, e.g. appearance, shape, color or semantic meaning. Approaches for learning a single distance metric often struggle to encode all different types of relationships and do not generalize well. In this work, we propose a novel easy-to-implement divide and conquer approach for deep metric learning, which significantly improves the state-of-the-art performance of metric learning. Our approach utilizes the embedding space more efficiently by jointly splitting the embedding space and data into $K$ smaller sub-problems. It divides both the data and the embedding space into $K$ subsets and learns $K$ separate distance metrics in the non-overlapping subspaces of the embedding space, defined by groups of neurons in the embedding layer of the neural network. The proposed approach increases the convergence speed and improves generalization since the complexity of each sub-problem is reduced compared to the original one. We show that our approach outperforms the state-of-the-art by a large margin in retrieval, clustering and re-identification tasks on CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes and PKU VehicleID datasets.
Tasks Metric Learning
Published 2019-06-14
URL https://arxiv.org/abs/1906.05990v1
PDF https://arxiv.org/pdf/1906.05990v1.pdf
PWC https://paperswithcode.com/paper/divide-and-conquer-the-embedding-space-for-1
Repo https://github.com/CompVis/metric-learning-divide-and-conquer
Framework pytorch
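
The recipe in the abstract (cluster the data into $K$ groups, give each cluster a disjoint slice of the embedding dimensions, and train each slice as its own metric on its cluster) can be sketched as below. The clustering and the training-loop placeholder are simplifications of the full method.

```python
# Sketch of the divide-and-conquer split: K data clusters, K disjoint
# embedding-dimension groups, one sub-metric per pair.
import numpy as np
from sklearn.cluster import KMeans

K, emb_dim = 4, 128
embeddings = np.random.randn(1000, emb_dim)     # stand-in network outputs

clusters = KMeans(n_clusters=K, n_init=10).fit_predict(embeddings)
slices = np.array_split(np.arange(emb_dim), K)  # disjoint dimension groups

for k in range(K):
    sub_data = embeddings[clusters == k][:, slices[k]]
    # ... train a metric (e.g. triplet loss) for learner k on sub_data only ...
    print(f"learner {k}: {sub_data.shape[0]} samples, dims {slices[k][[0, -1]]}")
```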