April 3, 2020

Paper Group AWR 6

Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization. Measuring Social Biases in Grounded Vision and Language Embeddings. RNA Secondary Structure Prediction By Learning Unrolled Algorithms. Stacked DeBERT: All Attention in Incomplete Data for Text Classification. Unifying Graph Convolutional Neural Networks …

Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization


Title	Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization
Authors	Yoonsik Kim, Jae Woong Soh, Gu Yong Park, Nam Ik Cho
Abstract	Real-noise denoising is a challenging task because the statistics of real-noise do not follow the normal distribution, and they are also spatially and temporally changing. In order to cope with various and complex real-noise, we propose a well-generalized denoising architecture and a transfer learning scheme. Specifically, we adopt an adaptive instance normalization to build a denoiser, which can regularize the feature map and prevent the network from overfitting to the training set. We also introduce a transfer learning scheme that transfers knowledge learned from synthetic-noise data to the real-noise denoiser. From the proposed transfer learning, the synthetic-noise denoiser can learn general features from various synthetic-noise data, and the real-noise denoiser can learn the real-noise characteristics from real data. From the experiments, we find that the proposed denoising method has great generalization ability, such that our network trained with synthetic-noise achieves the best performance for Darmstadt Noise Dataset (DND) among the methods from published papers. We can also see that the proposed transfer learning scheme robustly works for real-noise images through the learning with a very small number of labeled data.
Tasks	Denoising, Transfer Learning
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11244v2
PDF	https://arxiv.org/pdf/2002.11244v2.pdf
PWC	https://paperswithcode.com/paper/transfer-learning-from-synthetic-to-real-2
Repo	https://github.com/terryoo/AINDNet
Framework	tf
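
The abstract names adaptive instance normalization as the building block of the denoiser. As a rough illustration only, the sketch below normalizes each channel of a single feature map and rescales it to target statistics. The target means and standard deviations are supplied by hand here; how the paper's network produces them is not described in the abstract, so `target_means` and `target_stds` are hypothetical inputs, and this is not the AINDNet implementation.

```python
import math

def channel_stats(channel, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of one 2-D channel."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)

def adain(feature_map, target_means, target_stds, eps=1e-5):
    """Normalize every channel of one instance, then rescale to the target statistics.

    feature_map: list of channels, each a list of rows (C x H x W).
    target_means / target_stds: one value per channel (hypothetical inputs).
    """
    out = []
    for channel, t_mean, t_std in zip(feature_map, target_means, target_stds):
        mean, std = channel_stats(channel, eps)
        out.append([[t_std * (v - mean) / std + t_mean for v in row] for row in channel])
    return out

# Toy feature map with two 2x2 channels.
features = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 10.0], [20.0, 20.0]]]
result = adain(features, target_means=[0.0, 5.0], target_stds=[1.0, 2.0])
```

After the call, each output channel has approximately the requested mean and standard deviation, regardless of the statistics of the input channel.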


Measuring Social Biases in Grounded Vision and Language Embeddings


Title	Measuring Social Biases in Grounded Vision and Language Embeddings
Authors	Candace Ross, Boris Katz, Andrei Barbu
Abstract	We generalize the notion of social biases from language embeddings to grounded vision and language embeddings. Biases are present in grounded embeddings, and indeed seem to be equally or more significant than for ungrounded embeddings. This is despite the fact that vision and language can suffer from different biases, which one might hope could attenuate the biases in both. Multiple ways exist to generalize metrics measuring bias in word embeddings to this new setting. We introduce the space of generalizations (Grounded-WEAT and Grounded-SEAT) and demonstrate that three generalizations answer different yet important questions about how biases, language, and vision interact. These metrics are used on a new dataset, the first for grounded bias, created by extending standard linguistic bias benchmarks with 10,228 images from COCO, Conceptual Captions, and Google Images. Dataset construction is challenging because vision datasets are themselves very biased. The presence of these biases in systems will begin to have real-world consequences as they are deployed, making carefully measuring bias and then mitigating it critical to building a fair society.
Tasks	Word Embeddings
Published	2020-02-20
URL	https://arxiv.org/abs/2002.08911v1
PDF	https://arxiv.org/pdf/2002.08911v1.pdf
PWC	https://paperswithcode.com/paper/measuring-social-biases-in-grounded-vision
Repo	https://github.com/candacelax/bias-in-vision-and-language
Framework	pytorch
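
For orientation, the sketch below computes the effect size of the standard, ungrounded WEAT that the abstract says is being generalized. It is not the paper's Grounded-WEAT or Grounded-SEAT. The toy 2-D vectors are made up for illustration, and the use of the sample standard deviation in the denominator is an assumption (implementations differ on this point).

```python
import math
from statistics import mean, stdev

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def association(w, attr_a, attr_b):
    """s(w, A, B): how much closer w is to attribute set A than to attribute set B."""
    return mean(cosine(w, a) for a in attr_a) - mean(cosine(w, b) for b in attr_b)

def weat_effect_size(targets_x, targets_y, attr_a, attr_b):
    """Difference of mean associations of X and Y, scaled by the pooled spread."""
    s_x = [association(x, attr_a, attr_b) for x in targets_x]
    s_y = [association(y, attr_a, attr_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)

# Made-up embeddings: X leans towards attribute A, Y leans towards attribute B.
X = [[1.0, 0.1], [0.9, 0.2]]
Y = [[0.1, 1.0], [0.2, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]
effect = weat_effect_size(X, Y, A, B)
```

A positive value indicates that the X targets are more strongly associated with A than the Y targets are; swapping X and Y flips the sign.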

RNA Secondary Structure Prediction By Learning Unrolled Algorithms


Title	RNA Secondary Structure Prediction By Learning Unrolled Algorithms
Authors	Xinshi Chen, Yu Li, Ramzan Umarov, Xin Gao, Le Song
Abstract	In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts significantly better structures compared to previous SOTA (especially for pseudoknotted structures), while being as efficient as the fastest algorithms in terms of inference time.
Tasks
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05810v1
PDF	https://arxiv.org/pdf/2002.05810v1.pdf
PWC	https://paperswithcode.com/paper/rna-secondary-structure-prediction-by-1
Repo	https://github.com/ml4bio/e2efold
Framework	pytorch

Stacked DeBERT: All Attention in Incomplete Data for Text Classification


Title	Stacked DeBERT: All Attention in Incomplete Data for Text Classification
Authors	Gwenaelle Cunha Sergio, Minho Lee
Abstract	In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. This novel model improves robustness in incomplete data, when compared to existing systems, by designing a novel encoding scheme in BERT, a powerful language representation model solely based on attention mechanisms. Incomplete data in natural language processing refers to text with missing or incorrect words, and its presence can hinder the performance of current models that were not implemented to withstand such noise, but must still perform well even under duress. This is due to the fact that current approaches are built for and trained with clean and complete data, and thus are not able to extract features that can adequately represent incomplete data. Our proposed approach consists of obtaining intermediate input representations by applying an embedding layer to the input tokens followed by vanilla transformers. These intermediate features are given as input to novel denoising transformers which are responsible for obtaining richer input representations. The proposed approach takes advantage of stacks of multilayer perceptrons for the reconstruction of missing words’ embeddings by extracting more abstract and meaningful hidden feature vectors, and bidirectional transformers for improved embedding representation. We consider two datasets for training and evaluation: the Chatbot Natural Language Understanding Evaluation Corpus and Kaggle’s Twitter Sentiment Corpus. Our model shows improved F1-scores and better robustness in informal/incorrect texts present in tweets and in texts with Speech-to-Text error in the sentiment and intent classification tasks.
Tasks	Chatbot, Denoising, Intent Classification, Text Classification
Published	2020-01-01
URL	https://arxiv.org/abs/2001.00137v1
PDF	https://arxiv.org/pdf/2001.00137v1.pdf
PWC	https://paperswithcode.com/paper/stacked-debert-all-attention-in-incomplete
Repo	https://github.com/gcunhase/StackedDeBERT
Framework	pytorch

Unifying Graph Convolutional Neural Networks and Label Propagation


Title	Unifying Graph Convolutional Neural Networks and Label Propagation
Authors	Hongwei Wang, Jure Leskovec
Abstract	Label Propagation (LPA) and Graph Convolutional Neural Networks (GCN) are both message passing algorithms on graphs. Both solve the task of node classification, but LPA propagates node label information across the edges of the graph, while GCN propagates and transforms node feature information. However, while conceptually similar, the theoretical relationship between LPA and GCN has not yet been investigated. Here we study the relationship between LPA and GCN in terms of two aspects: (1) feature/label smoothing, where we analyze how the feature/label of one node is spread over its neighbors; and (2) feature/label influence, i.e., how much the initial feature/label of one node influences the final feature/label of another node. Based on our theoretical analysis, we propose an end-to-end model that unifies GCN and LPA for node classification. In our unified model, edge weights are learnable, and the LPA serves as regularization to assist the GCN in learning proper edge weights that lead to improved classification performance. Our model can also be seen as learning attention weights based on node labels, which is more task-oriented than existing feature-based attention models. In a number of experiments on real-world graphs, our model shows superiority over state-of-the-art GCN-based methods in terms of node classification accuracy.
Tasks	Node Classification
Published	2020-02-17
URL	https://arxiv.org/abs/2002.06755v1
PDF	https://arxiv.org/pdf/2002.06755v1.pdf
PWC	https://paperswithcode.com/paper/unifying-graph-convolutional-neural-networks-1
Repo	https://github.com/hwwang55/GCN-LPA
Framework	tf
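
To make the "LPA propagates node label information across the edges" statement concrete, here is a minimal sketch of plain label propagation with uniform edge weights. It is not the paper's unified GCN-LPA model, in which edge weights are learned; the toy path graph and the assumption that every node has at least one neighbour are choices made for this example.

```python
def label_propagation(adjacency, labels, num_classes, iterations=10):
    """Plain label propagation with uniform edge weights.

    adjacency: dict mapping node -> list of neighbours (each node needs >= 1 neighbour).
    labels: dict mapping labelled nodes -> class index; these stay clamped.
    Returns a dict mapping every node to its predicted class index.
    """
    # Labelled nodes start one-hot, unlabelled nodes start uniform.
    dist = {}
    for node in adjacency:
        if node in labels:
            dist[node] = [1.0 if c == labels[node] else 0.0 for c in range(num_classes)]
        else:
            dist[node] = [1.0 / num_classes] * num_classes

    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            if node in labels:
                updated[node] = dist[node]  # keep known labels fixed
            else:
                # Average the label distributions of the neighbours.
                updated[node] = [
                    sum(dist[n][c] for n in neighbours) / len(neighbours)
                    for c in range(num_classes)
                ]
        dist = updated

    return {node: max(range(num_classes), key=lambda c: dist[node][c]) for node in adjacency}

# Path graph 0 - 1 - 2 - 3 with the two end nodes labelled.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
predictions = label_propagation(graph, labels={0: 0, 3: 1}, num_classes=2)
```

Each unlabelled node ends up with the class of the labelled node it is closer to.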

BayesFlow: Learning complex stochastic models with invertible neural networks


Title	BayesFlow: Learning complex stochastic models with invertible neural networks
Authors	Stefan T. Radev, Ulf K. Mertens, Andreas Voss, Lynton Ardizzone, Ullrich Köthe
Abstract	Estimating the parameters of mathematical models is a common problem in almost all branches of science. However, this problem can prove notably difficult when processes and model descriptions become increasingly complex and an explicit likelihood function is not available. With this work, we propose a novel method for globally amortized Bayesian inference based on invertible neural networks which we call BayesFlow. The method uses simulation to learn a global estimator for the probabilistic mapping from observed data to underlying model parameters. A neural network pre-trained in this way can then, without additional training or optimization, infer full posteriors on arbitrarily many real data sets involving the same model family. In addition, our method incorporates a summary network trained to embed the observed data into maximally informative summary statistics. Learning summary statistics from data makes the method applicable to modeling scenarios where standard inference techniques with hand-crafted summary statistics fail. We demonstrate the utility of BayesFlow on challenging intractable models from population dynamics, epidemiology, cognitive science and ecology. We argue that BayesFlow provides a general framework for building reusable Bayesian parameter estimation machines for any process model from which data can be simulated.
Tasks	Bayesian Inference, Epidemiology
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06281v2
PDF	https://arxiv.org/pdf/2003.06281v2.pdf
PWC	https://paperswithcode.com/paper/bayesflow-learning-complex-stochastic-models
Repo	https://github.com/stefanradev93/cINN
Framework	tf

Improved Baselines with Momentum Contrastive Learning


Title	Improved Baselines with Momentum Contrastive Learning
Authors	Xinlei Chen, Haoqi Fan, Ross Girshick, Kaiming He
Abstract	Contrastive unsupervised learning has recently shown encouraging progress, e.g., in Momentum Contrast (MoCo) and SimCLR. In this note, we verify the effectiveness of two of SimCLR’s design improvements by implementing them in the MoCo framework. With simple modifications to MoCo—namely, using an MLP projection head and more data augmentation—we establish stronger baselines that outperform SimCLR and do not require large training batches. We hope this will make state-of-the-art unsupervised learning research more accessible. Code will be made public.
Tasks	Data Augmentation, Representation Learning, Self-Supervised Image Classification
Published	2020-03-09
URL	https://arxiv.org/abs/2003.04297v1
PDF	https://arxiv.org/pdf/2003.04297v1.pdf
PWC	https://paperswithcode.com/paper/improved-baselines-with-momentum-contrastive
Repo	https://github.com/ppwwyyxx/moco.tensorflow
Framework	tf

Bridging Ordinary-Label Learning and Complementary-Label Learning


Title	Bridging Ordinary-Label Learning and Complementary-Label Learning
Authors	Yasuhiro Katsura, Masato Uchida
Abstract	Unlike ordinary supervised pattern recognition, in the recently proposed framework of complementary-label learning, each label specifies one class that the pattern does not belong to. In this paper, we propose the natural generalization of learning from an ordinary label and a complementary label, specifically focused on one-versus-all and pairwise classification. We assume that annotation with a bag of complementary labels is equivalent to providing the rest of all the labels as the candidates of the one true class. Our derived classification risk is in a comprehensive form that includes those in the literature, and explicitly shows the relationship between the single and multiple ordinary/complementary labels. We further show both theoretically and experimentally that the classification error bound decreases monotonically as the number of complementary labels increases. This is consistent because the more complementary labels are provided, the less ambiguous the supervision becomes.
Tasks
Published	2020-02-06
URL	https://arxiv.org/abs/2002.02158v3
PDF	https://arxiv.org/pdf/2002.02158v3.pdf
PWC	https://paperswithcode.com/paper/bridging-ordinary-label-learning-and
Repo	https://github.com/YasuhiroKatsura/comp-labels
Framework	pytorch
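
The abstract's assumption that a bag of complementary labels is equivalent to offering the remaining labels as candidates can be illustrated with a small set operation. This is only a sketch of that equivalence; the 5-class problem and the function name `candidate_labels` are hypothetical, and the paper's risk formulation is not reproduced.

```python
def candidate_labels(all_labels, complementary_labels):
    """Labels that remain possible once the complementary ones are ruled out."""
    return set(all_labels) - set(complementary_labels)

classes = range(5)  # hypothetical 5-class problem

one_comp = candidate_labels(classes, {3})           # a single complementary label
bag = candidate_labels(classes, {0, 1, 3})          # a bag of complementary labels
ordinary = candidate_labels(classes, {0, 1, 3, 4})  # all but one class ruled out
```

With one complementary label, four candidates remain; with all but one class ruled out, the single remaining candidate acts as an ordinary label, which is the sense in which more complementary labels make the supervision less ambiguous.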

Predictively Encoded Graph Convolutional Network for Noise-Robust Skeleton-based Action Recognition


Title	Predictively Encoded Graph Convolutional Network for Noise-Robust Skeleton-based Action Recognition
Authors	Jongmin Yu, Yongsang Yoon, Moongu Jeon
Abstract	In skeleton-based action recognition, graph convolutional networks (GCNs), which model human body skeletons using graphical components such as nodes and connections, have achieved remarkable performance recently. However, current state-of-the-art methods for skeleton-based action recognition usually work on the assumption that completely observed skeletons will be provided. Applying this assumption in real scenarios may be problematic, since there is always a possibility that captured skeletons are incomplete or noisy. In this work, we propose a skeleton-based action recognition method which is robust to noise in the given skeleton features. The key insight of our approach is to train a model by maximizing the mutual information between normal and noisy skeletons in a predictive coding manner. We have conducted comprehensive experiments on skeleton-based action recognition with defective skeletons using the NTU-RGB+D and Kinetics-Skeleton datasets. The experimental results demonstrate that our approach achieves outstanding performance when skeleton samples are noisy, compared with existing state-of-the-art methods.
Tasks	Skeleton Based Action Recognition
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07514v1
PDF	https://arxiv.org/pdf/2003.07514v1.pdf
PWC	https://paperswithcode.com/paper/predictively-encoded-graph-convolutional
Repo	https://github.com/andreYoo/PeGCNs
Framework	pytorch

Large-scale biometry with interpretable neural network regression on UK Biobank body MRI


Title	Large-scale biometry with interpretable neural network regression on UK Biobank body MRI
Authors	Taro Langner, Håkan Ahlström, Joel Kullberg
Abstract	Objective: Automated analysis of MRI with deep regression has the potential to provide medical research with a wide range of biological metrics, inferred at high speed and accuracy. Methods: The UK Biobank study has successfully imaged more than 32,000 volunteer participants with neck-to-knee body MRI. Each scan is linked to extensive metadata, providing a comprehensive survey of imaged anatomy and related health states. Despite its potential for research, this vast amount of data presents a challenge to established methods of evaluation, which often rely on manual input. In this work, neural networks were trained for regression to infer various biological metrics from the neck-to-knee body MRI automatically, with a ResNet50 in 7-fold cross-validation. No manual intervention or ground truth segmentations are required for training. The examined fields span 64 variables derived from anthropometric measurements, dual-energy X-ray absorptiometry (DXA), atlas-based segmentations, and dedicated liver scans. Results: The standardized framework achieved a close fit to the target values (median R^2 > 0.97). Interpretation of aggregated saliency maps indicates that the network correctly targets specific body regions and limbs, and learned to emulate different modalities. On several body composition metrics, the quality of the predictions is within the range of variability observed between established gold standard techniques. Conclusion and Significance: The deep regression framework robustly inferred a wide range of medically relevant metrics from the image data. In practice, this technique could provide accurate, image-based measurements for medical research months or years before the more established reference methods have been fully applied.
Tasks
Published	2020-02-17
URL	https://arxiv.org/abs/2002.06862v2
PDF	https://arxiv.org/pdf/2002.06862v2.pdf
PWC	https://paperswithcode.com/paper/large-scale-biometry-with-interpretable
Repo	https://github.com/tarolangner/mri-biometry
Framework	pytorch

Frosting Weights for Better Continual Training


Title	Frosting Weights for Better Continual Training
Authors	Xiaofeng Zhu, Feng Liu, Goce Trajcevski, Dingding Wang
Abstract	Training a neural network model can be a lifelong learning process and is a computationally intensive one. A severe adverse effect that may occur in deep neural network models is that they can suffer from catastrophic forgetting during retraining on new data. To avoid such disruptions in the continuous learning, one appealing property is the additive nature of ensemble models. In this paper, we propose two generic ensemble approaches, gradient boosting and meta-learning, to solve the catastrophic forgetting problem in tuning pre-trained neural network models.
Tasks	Meta-Learning
Published	2020-01-07
URL	https://arxiv.org/abs/2001.01829v1
PDF	https://arxiv.org/pdf/2001.01829v1.pdf
PWC	https://paperswithcode.com/paper/frosting-weights-for-better-continual
Repo	https://github.com/XiaofengZhu/frosting_weights
Framework	tf

A Tool for Conducting User Studies on Mobile Devices


Title	A Tool for Conducting User Studies on Mobile Devices
Authors	Luca Costa, Mohammad Aliannejadi, Fabio Crestani
Abstract	With the ever-growing interest in the area of mobile information retrieval and the ongoing fast development of mobile devices and, as a consequence, mobile apps, an active research area lies in studying users’ behavior and the search queries they submit on mobile devices. However, many researchers need to develop an app that collects useful information from users while they search on their phones or participate in a user study. In this paper, we aim to address this need by providing a comprehensive Android app, called Omicron, which can be used to collect mobile query logs and perform user studies on mobile devices. Omicron, in its current version, can collect users’ mobile queries, relevant documents, sensor data as well as user activity and interaction data in various study settings. Furthermore, we designed Omicron in such a way that it is conveniently extendable to conduct more specific studies and collect other types of sensor data. Finally, we provide a tool to monitor the participants and their data both during and after the collection process.
Tasks	Information Retrieval
Published	2020-01-31
URL	https://arxiv.org/abs/2001.11913v1
PDF	https://arxiv.org/pdf/2001.11913v1.pdf
PWC	https://paperswithcode.com/paper/a-tool-for-conducting-user-studies-on-mobile
Repo	https://github.com/aliannejadi/Omicron
Framework	none

Channel Pruning via Optimal Thresholding


Title	Channel Pruning via Optimal Thresholding
Authors	Yun Ye, Ganmei You, Jong-Kae Fwu, Xia Zhu, Qing Yang, Yuan Zhu
Abstract	Structured pruning, especially channel pruning, is widely used for its reduced computational cost and compatibility with off-the-shelf hardware devices. Among existing works, weights are typically removed using a predefined global threshold, or a threshold computed from a predefined metric. Designs based on a predefined global threshold ignore the variation among different layers and weight distributions; they may therefore result in sub-optimal performance caused by over-pruning or under-pruning. In this paper, we present a simple yet effective method, termed Optimal Thresholding (OT), to prune channels with layer-dependent thresholds that optimally separate important from negligible channels. By using OT, most negligible or unimportant channels are pruned to achieve high sparsity while minimizing performance degradation. Since most important weights are preserved, the pruned model can be further fine-tuned and quickly converge with very few iterations. Our method demonstrates superior performance, especially when compared to the state-of-the-art designs at high levels of sparsity. On CIFAR-100, a DenseNet-121 pruned with OT and fine-tuned achieves 75.99% accuracy with only 1.46e8 FLOPs and 0.71M parameters. Code is available at: https://github.com/yeyun11/netslim.
Tasks
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04566v2
PDF	https://arxiv.org/pdf/2003.04566v2.pdf
PWC	https://paperswithcode.com/paper/channel-pruning-via-optimal-thresholding
Repo	https://github.com/yeyun11/netslim
Framework	pytorch

RAFT: Recurrent All-Pairs Field Transforms for Optical Flow


Title	RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
Authors	Zachary Teed, Jia Deng
Abstract	We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for optical flow. RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes. RAFT achieves state-of-the-art performance, with strong cross-dataset generalization and high efficiency in inference time, training speed, and parameter count. Code is available at https://github.com/princeton-vl/RAFT.
Tasks	Optical Flow Estimation
Published	2020-03-26
URL	https://arxiv.org/abs/2003.12039v1
PDF	https://arxiv.org/pdf/2003.12039v1.pdf
PWC	https://paperswithcode.com/paper/raft-recurrent-all-pairs-field-transforms-for
Repo	https://github.com/princeton-vl/RAFT
Framework	pytorch
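
As a small illustration of what a 4D correlation volume "for all pairs of pixels" looks like, the sketch below takes the dot product between every feature vector of one image and every feature vector of another. It covers a single scale only and omits the multi-scale pooling, the lookup operator, and the recurrent update; any scaling or normalization used in the RAFT code is likewise left out. The tiny 1x2 feature grids are made up for the example.

```python
def all_pairs_correlation(feat1, feat2):
    """Build corr[i][j][k][l] = <feat1[i][j], feat2[k][l]> for two H x W x D feature grids."""
    volume = []
    for row1 in feat1:                      # i: row in image 1
        volume_row = []
        for f1 in row1:                     # j: column in image 1
            plane = []
            for row2 in feat2:              # k: row in image 2
                plane.append([sum(a * b for a, b in zip(f1, f2)) for f2 in row2])  # l
            volume_row.append(plane)
        volume.append(volume_row)
    return volume

# Two 1x2 feature grids with 2-D features; the second is the first shifted by one pixel.
feat1 = [[[1.0, 0.0], [0.0, 1.0]]]
feat2 = [[[0.0, 1.0], [1.0, 0.0]]]
corr = all_pairs_correlation(feat1, feat2)
```

Here the pixel at column 0 of the first grid correlates most strongly with column 1 of the second grid, which is the kind of match a flow estimate would be read from.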

The troublesome kernel: why deep learning for inverse problems is typically unstable


Title	The troublesome kernel: why deep learning for inverse problems is typically unstable
Authors	Nina M. Gottschling, Vegard Antun, Ben Adcock, Anders C. Hansen
Abstract	There is overwhelming empirical evidence that Deep Learning (DL) leads to unstable methods in applications ranging from image classification and computer vision to voice recognition and automated diagnosis in medicine. Recently, a similar instability phenomenon has been discovered when DL is used to solve certain problems in computational science, namely, inverse problems in imaging. In this paper we present a comprehensive mathematical analysis explaining the many facets of the instability phenomenon in DL for inverse problems. Our main results not only explain why this phenomenon occurs, they also shed light on why finding a cure for instabilities is so difficult in practice. Additionally, these theorems show that instabilities are typically not rare events - rather, they can occur even when the measurements are subject to completely random noise - and consequently how easy it can be to destabilise certain trained neural networks. We also examine the delicate balance between reconstruction performance and stability, and in particular, how DL methods may outperform state-of-the-art sparse regularization methods, but at the cost of instability. Finally, we demonstrate a counterintuitive phenomenon: training a neural network may generically not yield an optimal reconstruction method for an inverse problem.
Tasks	Image Classification
Published	2020-01-05
URL	https://arxiv.org/abs/2001.01258v1
PDF	https://arxiv.org/pdf/2001.01258v1.pdf
PWC	https://paperswithcode.com/paper/the-troublesome-kernel-why-deep-learning-for
Repo	https://github.com/vegarant/troub_ker
Framework	none