Paper Group AWR 6
Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization. Measuring Social Biases in Grounded Vision and Language Embeddings. RNA Secondary Structure Prediction By Learning Unrolled Algorithms. Stacked DeBERT: All Attention in Incomplete Data for Text Classification. Unifying Graph Convolutional Neural Networks …
Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization
Title | Transfer Learning from Synthetic to Real-Noise Denoising with Adaptive Instance Normalization |
Authors | Yoonsik Kim, Jae Woong Soh, Gu Yong Park, Nam Ik Cho |
Abstract | Real-noise denoising is a challenging task because the statistics of real-noise do not follow the normal distribution, and they are also spatially and temporally changing. In order to cope with various and complex real-noise, we propose a well-generalized denoising architecture and a transfer learning scheme. Specifically, we adopt an adaptive instance normalization to build a denoiser, which can regularize the feature map and prevent the network from overfitting to the training set. We also introduce a transfer learning scheme that transfers knowledge learned from synthetic-noise data to the real-noise denoiser. From the proposed transfer learning, the synthetic-noise denoiser can learn general features from various synthetic-noise data, and the real-noise denoiser can learn the real-noise characteristics from real data. From the experiments, we find that the proposed denoising method has great generalization ability, such that our network trained with synthetic-noise achieves the best performance for Darmstadt Noise Dataset (DND) among the methods from published papers. We can also see that the proposed transfer learning scheme robustly works for real-noise images through the learning with a very small number of labeled data. |
Tasks | Denoising, Transfer Learning |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11244v2 |
https://arxiv.org/pdf/2002.11244v2.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-from-synthetic-to-real-2 |
Repo | https://github.com/terryoo/AINDNet |
Framework | tf |
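As a rough illustration of the adaptive instance normalization mentioned in the abstract above, the sketch below normalizes a single feature channel to zero mean and unit variance and then applies a scale and a shift supplied from outside. The function name `adaptive_instance_norm` and the parameters `gamma`, `beta`, and `eps` are assumptions made for this example. It is not the AINDNet implementation, which is available in the linked repository.

```python
import math

def adaptive_instance_norm(channel, gamma, beta, eps=1e-5):
    """Normalize one 2D feature channel, then apply an external scale and shift.

    channel: list of rows (nested lists of floats) for a single channel.
    gamma, beta: scale and shift (hypothetical inputs for this sketch).
    """
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)  # eps avoids division by zero on flat channels
    return [[gamma * (v - mean) / std + beta for v in row] for row in channel]

feature = [[1.0, 2.0], [3.0, 4.0]]
out = adaptive_instance_norm(feature, gamma=2.0, beta=0.5)
```

After normalization the channel's mean equals `beta` and its standard deviation is approximately `gamma`, regardless of the statistics of the input channel.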
Measuring Social Biases in Grounded Vision and Language Embeddings
Title | Measuring Social Biases in Grounded Vision and Language Embeddings |
Authors | Candace Ross, Boris Katz, Andrei Barbu |
Abstract | We generalize the notion of social biases from language embeddings to grounded vision and language embeddings. Biases are present in grounded embeddings, and indeed seem to be equally or more significant than for ungrounded embeddings. This is despite the fact that vision and language can suffer from different biases, which one might hope could attenuate the biases in both. Multiple ways exist to generalize metrics measuring bias in word embeddings to this new setting. We introduce the space of generalizations (Grounded-WEAT and Grounded-SEAT) and demonstrate that three generalizations answer different yet important questions about how biases, language, and vision interact. These metrics are used on a new dataset, the first for grounded bias, created by extending standard linguistic bias benchmarks with 10,228 images from COCO, Conceptual Captions, and Google Images. Dataset construction is challenging because vision datasets are themselves very biased. The presence of these biases in systems will begin to have real-world consequences as they are deployed, making carefully measuring bias and then mitigating it critical to building a fair society. |
Tasks | Word Embeddings |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08911v1 |
https://arxiv.org/pdf/2002.08911v1.pdf | |
PWC | https://paperswithcode.com/paper/measuring-social-biases-in-grounded-vision |
Repo | https://github.com/candacelax/bias-in-vision-and-language |
Framework | pytorch |
RNA Secondary Structure Prediction By Learning Unrolled Algorithms
Title | RNA Secondary Structure Prediction By Learning Unrolled Algorithms |
Authors | Xinshi Chen, Yu Li, Ramzan Umarov, Xin Gao, Le Song |
Abstract | In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts significantly better structures compared to previous SOTA (especially for pseudoknotted structures), while being as efficient as the fastest algorithms in terms of inference time. |
Tasks | |
Published | 2020-02-13 |
URL | https://arxiv.org/abs/2002.05810v1 |
https://arxiv.org/pdf/2002.05810v1.pdf | |
PWC | https://paperswithcode.com/paper/rna-secondary-structure-prediction-by-1 |
Repo | https://github.com/ml4bio/e2efold |
Framework | pytorch |
Stacked DeBERT: All Attention in Incomplete Data for Text Classification
Title | Stacked DeBERT: All Attention in Incomplete Data for Text Classification |
Authors | Gwenaelle Cunha Sergio, Minho Lee |
Abstract | In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. This novel model improves robustness in incomplete data, when compared to existing systems, by designing a novel encoding scheme in BERT, a powerful language representation model solely based on attention mechanisms. Incomplete data in natural language processing refers to text with missing or incorrect words, and its presence can hinder the performance of current models that were not implemented to withstand such noise, but must still perform well even under duress. This is due to the fact that current approaches are built for and trained with clean and complete data, and thus are not able to extract features that can adequately represent incomplete data. Our proposed approach consists of obtaining intermediate input representations by applying an embedding layer to the input tokens followed by vanilla transformers. These intermediate features are given as input to novel denoising transformers which are responsible for obtaining richer input representations. The proposed approach takes advantage of stacks of multilayer perceptrons for the reconstruction of missing words’ embeddings by extracting more abstract and meaningful hidden feature vectors, and bidirectional transformers for improved embedding representation. We consider two datasets for training and evaluation: the Chatbot Natural Language Understanding Evaluation Corpus and Kaggle’s Twitter Sentiment Corpus. Our model shows improved F1-scores and better robustness in informal/incorrect texts present in tweets and in texts with Speech-to-Text errors in the sentiment and intent classification tasks. |
Tasks | Chatbot, Denoising, Intent Classification, Text Classification |
Published | 2020-01-01 |
URL | https://arxiv.org/abs/2001.00137v1 |
https://arxiv.org/pdf/2001.00137v1.pdf | |
PWC | https://paperswithcode.com/paper/stacked-debert-all-attention-in-incomplete |
Repo | https://github.com/gcunhase/StackedDeBERT |
Framework | pytorch |
Unifying Graph Convolutional Neural Networks and Label Propagation
Title | Unifying Graph Convolutional Neural Networks and Label Propagation |
Authors | Hongwei Wang, Jure Leskovec |
Abstract | Label Propagation (LPA) and Graph Convolutional Neural Networks (GCN) are both message passing algorithms on graphs. Both solve the task of node classification, but LPA propagates node label information across the edges of the graph, while GCN propagates and transforms node feature information. However, while the two are conceptually similar, the theoretical relation between LPA and GCN has not yet been investigated. Here we study the relationship between LPA and GCN in terms of two aspects: (1) feature/label smoothing, where we analyze how the feature/label of one node is spread over its neighbors; and (2) feature/label influence, i.e., how much the initial feature/label of one node influences the final feature/label of another node. Based on our theoretical analysis, we propose an end-to-end model that unifies GCN and LPA for node classification. In our unified model, edge weights are learnable, and the LPA serves as regularization to assist the GCN in learning proper edge weights that lead to improved classification performance. Our model can also be seen as learning attention weights based on node labels, which is more task-oriented than existing feature-based attention models. In a number of experiments on real-world graphs, our model shows superiority over state-of-the-art GCN-based methods in terms of node classification accuracy. |
Tasks | Node Classification |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06755v1 |
https://arxiv.org/pdf/2002.06755v1.pdf | |
PWC | https://paperswithcode.com/paper/unifying-graph-convolutional-neural-networks-1 |
Repo | https://github.com/hwwang55/GCN-LPA |
Framework | tf |
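To make the label propagation step described in the abstract above concrete, here is a minimal sketch of plain LPA on an unweighted graph: each unlabeled node repeatedly takes the average of its neighbors' label distributions, while labeled nodes stay clamped. The function name `label_propagation`, the adjacency-dict input format, and the fixed step count are assumptions for illustration. The unified GCN-LPA model learns edge weights, which this sketch does not attempt.

```python
def label_propagation(adj, labels, num_classes, steps=10):
    """Plain label propagation with uniform edge weights.

    adj: dict mapping node -> list of neighbour nodes.
    labels: dict mapping labelled node -> class index.
    Returns a dict mapping every node -> predicted class index.
    """
    nodes = sorted(adj)
    # One-hot distributions for labelled nodes, all zeros for the rest.
    dist = {n: [1.0 if labels.get(n) == c else 0.0 for c in range(num_classes)]
            for n in nodes}
    for _ in range(steps):
        new = {}
        for n in nodes:
            if n in labels or not adj[n]:
                new[n] = dist[n]  # clamp known labels; skip isolated nodes
                continue
            neigh = adj[n]
            new[n] = [sum(dist[m][c] for m in neigh) / len(neigh)
                      for c in range(num_classes)]
        dist = new
    return {n: max(range(num_classes), key=lambda c: dist[n][c]) for n in nodes}

# Path graph 0 - 1 - 2 - 3 with the two end nodes labelled.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
pred = label_propagation(graph, labels={0: 0, 3: 1}, num_classes=2)
```

On this path graph each interior node ends up with the label of the nearer labeled endpoint.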
BayesFlow: Learning complex stochastic models with invertible neural networks
Title | BayesFlow: Learning complex stochastic models with invertible neural networks |
Authors | Stefan T. Radev, Ulf K. Mertens, Andreas Voss, Lynton Ardizzone, Ullrich Köthe |
Abstract | Estimating the parameters of mathematical models is a common problem in almost all branches of science. However, this problem can prove notably difficult when processes and model descriptions become increasingly complex and an explicit likelihood function is not available. With this work, we propose a novel method for globally amortized Bayesian inference based on invertible neural networks, which we call BayesFlow. The method uses simulation to learn a global estimator for the probabilistic mapping from observed data to underlying model parameters. A neural network pre-trained in this way can then, without additional training or optimization, infer full posteriors on arbitrarily many real data sets involving the same model family. In addition, our method incorporates a summary network trained to embed the observed data into maximally informative summary statistics. Learning summary statistics from data makes the method applicable to modeling scenarios where standard inference techniques with hand-crafted summary statistics fail. We demonstrate the utility of BayesFlow on challenging intractable models from population dynamics, epidemiology, cognitive science and ecology. We argue that BayesFlow provides a general framework for building reusable Bayesian parameter estimation machines for any process model from which data can be simulated. |
Tasks | Bayesian Inference, Epidemiology |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.06281v2 |
https://arxiv.org/pdf/2003.06281v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesflow-learning-complex-stochastic-models |
Repo | https://github.com/stefanradev93/cINN |
Framework | tf |
Improved Baselines with Momentum Contrastive Learning
Title | Improved Baselines with Momentum Contrastive Learning |
Authors | Xinlei Chen, Haoqi Fan, Ross Girshick, Kaiming He |
Abstract | Contrastive unsupervised learning has recently shown encouraging progress, e.g., in Momentum Contrast (MoCo) and SimCLR. In this note, we verify the effectiveness of two of SimCLR’s design improvements by implementing them in the MoCo framework. With simple modifications to MoCo—namely, using an MLP projection head and more data augmentation—we establish stronger baselines that outperform SimCLR and do not require large training batches. We hope this will make state-of-the-art unsupervised learning research more accessible. Code will be made public. |
Tasks | Data Augmentation, Representation Learning, Self-Supervised Image Classification |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04297v1 |
https://arxiv.org/pdf/2003.04297v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-baselines-with-momentum-contrastive |
Repo | https://github.com/ppwwyyxx/moco.tensorflow |
Framework | tf |
Bridging Ordinary-Label Learning and Complementary-Label Learning
Title | Bridging Ordinary-Label Learning and Complementary-Label Learning |
Authors | Yasuhiro Katsura, Masato Uchida |
Abstract | Unlike ordinary supervised pattern recognition, in the recently proposed framework of complementary-label learning, each label specifies one class that the pattern does not belong to. In this paper, we propose a natural generalization of learning from an ordinary label and a complementary label, specifically focused on one-versus-all and pairwise classification. We assume that annotation with a bag of complementary labels is equivalent to providing all the remaining labels as the candidates for the one true class. Our derived classification risk takes a comprehensive form that includes those in the literature, and it explicitly shows the relationship between single and multiple ordinary/complementary labels. We further show both theoretically and experimentally that the classification error bound decreases monotonically as the number of complementary labels grows. This is consistent with intuition: the more complementary labels are provided, the less ambiguous the supervision becomes. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02158v3 |
https://arxiv.org/pdf/2002.02158v3.pdf | |
PWC | https://paperswithcode.com/paper/bridging-ordinary-label-learning-and |
Repo | https://github.com/YasuhiroKatsura/comp-labels |
Framework | pytorch |
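The assumption stated in the abstract above, that a bag of complementary labels is equivalent to offering the remaining labels as candidates for the true class, can be sketched as a set difference. The function name `candidate_labels` and the five-class example are hypothetical and chosen only for illustration.

```python
def candidate_labels(all_labels, complementary_bag):
    """Return the classes still possible after ruling out the complementary labels."""
    return set(all_labels) - set(complementary_bag)

classes = range(5)
# Two complementary labels leave three candidates for the true class.
few = candidate_labels(classes, {1, 3})
# Four complementary labels leave a single candidate, i.e. an ordinary label.
many = candidate_labels(classes, {0, 1, 3, 4})
```

With one class fewer than the total number ruled out, the candidate set shrinks to a single class, which is how the ordinary-label case appears as one end of the same spectrum.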
Predictively Encoded Graph Convolutional Network for Noise-Robust Skeleton-based Action Recognition
Title | Predictively Encoded Graph Convolutional Network for Noise-Robust Skeleton-based Action Recognition |
Authors | Jongmin Yu, Yongsang Yoon, Moongu Jeon |
Abstract | In skeleton-based action recognition, graph convolutional networks (GCNs), which model human body skeletons using graphical components such as nodes and connections, have achieved remarkable performance recently. However, current state-of-the-art methods for skeleton-based action recognition usually work on the assumption that completely observed skeletons will be provided. Applying this assumption in real scenarios may be problematic, since there is always a possibility that captured skeletons are incomplete or noisy. In this work, we propose a skeleton-based action recognition method which is robust to noise in the given skeleton features. The key insight of our approach is to train a model by maximizing the mutual information between normal and noisy skeletons in a predictive coding manner. We have conducted comprehensive experiments on skeleton-based action recognition with defective skeletons using the NTU-RGB+D and Kinetics-Skeleton datasets. The experimental results demonstrate that our approach achieves outstanding performance when skeleton samples are noisy, compared with existing state-of-the-art methods. |
Tasks | Skeleton Based Action Recognition |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07514v1 |
https://arxiv.org/pdf/2003.07514v1.pdf | |
PWC | https://paperswithcode.com/paper/predictively-encoded-graph-convolutional |
Repo | https://github.com/andreYoo/PeGCNs |
Framework | pytorch |
Large-scale biometry with interpretable neural network regression on UK Biobank body MRI
Title | Large-scale biometry with interpretable neural network regression on UK Biobank body MRI |
Authors | Taro Langner, Håkan Ahlström, Joel Kullberg |
Abstract | Objective: Automated analysis of MRI with deep regression has the potential to provide medical research with a wide range of biological metrics, inferred at high speed and accuracy. Methods: The UK Biobank study has successfully imaged more than 32,000 volunteer participants with neck-to-knee body MRI. Each scan is linked to extensive metadata, providing a comprehensive survey of imaged anatomy and related health states. Despite its potential for research, this vast amount of data presents a challenge to established methods of evaluation, which often rely on manual input. In this work, neural networks were trained for regression to infer various biological metrics from the neck-to-knee body MRI automatically, with a ResNet50 in 7-fold cross-validation. No manual intervention or ground truth segmentations are required for training. The examined fields span 64 variables derived from anthropometric measurements, dual-energy X-ray absorptiometry (DXA), atlas-based segmentations, and dedicated liver scans. Results: The standardized framework achieved a close fit to the target values (median R^2 > 0.97). Interpretation of aggregated saliency maps indicates that the network correctly targets specific body regions and limbs, and learned to emulate different modalities. On several body composition metrics, the quality of the predictions is within the range of variability observed between established gold standard techniques. Conclusion and Significance: The deep regression framework robustly inferred a wide range of medically relevant metrics from the image data. In practice, this technique could provide accurate, image-based measurements for medical research months or years before the more established reference methods have been fully applied. |
Tasks | |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06862v2 |
https://arxiv.org/pdf/2002.06862v2.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-biometry-with-interpretable |
Repo | https://github.com/tarolangner/mri-biometry |
Framework | pytorch |
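The abstract above reports fit quality as R^2. For readers who want the arithmetic, the sketch below computes the usual coefficient of determination, 1 minus the ratio of residual to total sum of squares. The abstract does not say which definition was used, so this definition, the function name `r_squared`, and the toy numbers are assumptions.

```python
def r_squared(targets, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

# Toy values, not taken from the paper.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]
score = r_squared(targets, predictions)
```

Here the residual sum of squares is 0.10 and the total sum of squares is 5.0, giving a score of 0.98.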
Frosting Weights for Better Continual Training
Title | Frosting Weights for Better Continual Training |
Authors | Xiaofeng Zhu, Feng Liu, Goce Trajcevski, Dingding Wang |
Abstract | Training a neural network model can be a lifelong learning process and is a computationally intensive one. A severe adverse effect that may occur in deep neural network models is that they can suffer from catastrophic forgetting during retraining on new data. To avoid such disruptions in the continuous learning, one appealing property is the additive nature of ensemble models. In this paper, we propose two generic ensemble approaches, gradient boosting and meta-learning, to solve the catastrophic forgetting problem in tuning pre-trained neural network models. |
Tasks | Meta-Learning |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01829v1 |
https://arxiv.org/pdf/2001.01829v1.pdf | |
PWC | https://paperswithcode.com/paper/frosting-weights-for-better-continual |
Repo | https://github.com/XiaofengZhu/frosting_weights |
Framework | tf |
A Tool for Conducting User Studies on Mobile Devices
Title | A Tool for Conducting User Studies on Mobile Devices |
Authors | Luca Costa, Mohammad Aliannejadi, Fabio Crestani |
Abstract | With the ever-growing interest in the area of mobile information retrieval and the ongoing fast development of mobile devices and, as a consequence, mobile apps, an active research area lies in studying users’ behavior and the search queries users submit on mobile devices. However, many researchers need to develop an app that collects useful information from users while they search on their phones or participate in a user study. In this paper, we aim to address this need by providing a comprehensive Android app, called Omicron, which can be used to collect mobile query logs and perform user studies on mobile devices. Omicron, in its current version, can collect users’ mobile queries, relevant documents, sensor data as well as user activity and interaction data in various study settings. Furthermore, we designed Omicron in such a way that it is conveniently extendable to conduct more specific studies and collect other types of sensor data. Finally, we provide a tool to monitor the participants and their data both during and after the collection process. |
Tasks | Information Retrieval |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2001.11913v1 |
https://arxiv.org/pdf/2001.11913v1.pdf | |
PWC | https://paperswithcode.com/paper/a-tool-for-conducting-user-studies-on-mobile |
Repo | https://github.com/aliannejadi/Omicron |
Framework | none |
Channel Pruning via Optimal Thresholding
Title | Channel Pruning via Optimal Thresholding |
Authors | Yun Ye, Ganmei You, Jong-Kae Fwu, Xia Zhu, Qing Yang, Yuan Zhu |
Abstract | Structured pruning, especially channel pruning, is widely used for its reduced computational cost and its compatibility with off-the-shelf hardware devices. Among existing works, weights are typically removed using a predefined global threshold, or a threshold computed from a predefined metric. Designs based on a predefined global threshold ignore the variation among different layers and weight distributions; therefore, they may often result in sub-optimal performance caused by over-pruning or under-pruning. In this paper, we present a simple yet effective method, termed Optimal Thresholding (OT), to prune channels with layer-dependent thresholds that optimally separate important from negligible channels. By using OT, most negligible or unimportant channels are pruned to achieve high sparsity while minimizing performance degradation. Since most important weights are preserved, the pruned model can be further fine-tuned and quickly converge with very few iterations. Our method demonstrates superior performance, especially when compared to the state-of-the-art designs at high levels of sparsity. On CIFAR-100, a DenseNet-121 pruned and fine-tuned using OT achieves 75.99% accuracy with only 1.46e8 FLOPs and 0.71M parameters. Code is available at: https://github.com/yeyun11/netslim. |
Tasks | |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.04566v2 |
https://arxiv.org/pdf/2003.04566v2.pdf | |
PWC | https://paperswithcode.com/paper/channel-pruning-via-optimal-thresholding |
Repo | https://github.com/yeyun11/netslim |
Framework | pytorch |
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
Title | RAFT: Recurrent All-Pairs Field Transforms for Optical Flow |
Authors | Zachary Teed, Jia Deng |
Abstract | We introduce Recurrent All-Pairs Field Transforms (RAFT), a new deep network architecture for optical flow. RAFT extracts per-pixel features, builds multi-scale 4D correlation volumes for all pairs of pixels, and iteratively updates a flow field through a recurrent unit that performs lookups on the correlation volumes. RAFT achieves state-of-the-art performance, with strong cross-dataset generalization and high efficiency in inference time, training speed, and parameter count. Code is available at https://github.com/princeton-vl/RAFT. |
Tasks | Optical Flow Estimation |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.12039v1 |
https://arxiv.org/pdf/2003.12039v1.pdf | |
PWC | https://paperswithcode.com/paper/raft-recurrent-all-pairs-field-transforms-for |
Repo | https://github.com/princeton-vl/RAFT |
Framework | pytorch |
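The 4D correlation volume "for all pairs of pixels" described in the abstract above can be pictured with a small sketch: every feature vector in the first image is compared against every feature vector in the second. The sketch assumes a plain dot product as the similarity and builds only a single scale. The function name `all_pairs_correlation` and the nested-list layout are hypothetical, and the multi-scale pooling and recurrent lookups of RAFT are not shown.

```python
def all_pairs_correlation(feat1, feat2):
    """Single-scale all-pairs correlation (dot-product similarity assumed).

    feat1, feat2: H x W grids of D-dimensional feature vectors (nested lists).
    Returns corr where corr[i][j][k][l] = dot(feat1[i][j], feat2[k][l]).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return [[[[dot(f1, f2) for f2 in row2] for row2 in feat2]
             for f1 in row1] for row1 in feat1]

# Two tiny 1 x 2 feature maps with 2-dimensional features.
feat_a = [[[1.0, 0.0], [0.0, 1.0]]]
feat_b = [[[2.0, 3.0], [4.0, 5.0]]]
corr = all_pairs_correlation(feat_a, feat_b)
```

The result has shape H x W x H x W, so its size grows with the square of the number of pixels, which is why the features are typically computed at reduced resolution.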
The troublesome kernel: why deep learning for inverse problems is typically unstable
Title | The troublesome kernel: why deep learning for inverse problems is typically unstable |
Authors | Nina M. Gottschling, Vegard Antun, Ben Adcock, Anders C. Hansen |
Abstract | There is overwhelming empirical evidence that Deep Learning (DL) leads to unstable methods in applications ranging from image classification and computer vision to voice recognition and automated diagnosis in medicine. Recently, a similar instability phenomenon has been discovered when DL is used to solve certain problems in computational science, namely, inverse problems in imaging. In this paper we present a comprehensive mathematical analysis explaining the many facets of the instability phenomenon in DL for inverse problems. Our main results not only explain why this phenomenon occurs, they also shed light on why finding a cure for instabilities is so difficult in practice. Additionally, these theorems show that instabilities are typically not rare events - rather, they can occur even when the measurements are subject to completely random noise - and consequently how easy it can be to destabilise certain trained neural networks. We also examine the delicate balance between reconstruction performance and stability, and in particular, how DL methods may outperform state-of-the-art sparse regularization methods, but at the cost of instability. Finally, we demonstrate a counterintuitive phenomenon: training a neural network may generically not yield an optimal reconstruction method for an inverse problem. |
Tasks | Image Classification |
Published | 2020-01-05 |
URL | https://arxiv.org/abs/2001.01258v1 |
https://arxiv.org/pdf/2001.01258v1.pdf | |
PWC | https://paperswithcode.com/paper/the-troublesome-kernel-why-deep-learning-for |
Repo | https://github.com/vegarant/troub_ker |
Framework | none |