January 31, 2020


Paper Group AWR 374



Human uncertainty makes classification more robust

Title Human uncertainty makes classification more robust
Authors Joshua C. Peterson, Ruairidh M. Battleday, Thomas L. Griffiths, Olga Russakovsky
Abstract The classification performance of deep neural networks has begun to asymptote at near-perfect levels. However, their ability to generalize outside the training set and their robustness to adversarial attacks have not. In this paper, we make progress on this problem by training with full label distributions that reflect human perceptual uncertainty. We first present a new benchmark dataset which we call CIFAR10H, containing a full distribution of human labels for each image of the CIFAR10 test set. We then show that, while contemporary classifiers fail to exhibit human-like uncertainty on their own, explicit training on our dataset closes this gap, supports improved generalization to increasingly out-of-training-distribution test datasets, and confers robustness to adversarial attacks.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.07086v1
PDF https://arxiv.org/pdf/1908.07086v1.pdf
PWC https://paperswithcode.com/paper/human-uncertainty-makes-classification-more
Repo https://github.com/jcpeterson/cifar-10h
Framework pytorch
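
The training recipe reduces to replacing one-hot targets with the human label distribution inside the cross-entropy. A minimal PyTorch sketch, assuming `human_probs` holds normalized annotation counts per image (as CIFAR-10H provides):

```python
import torch
import torch.nn.functional as F

def soft_label_loss(logits, human_probs):
    """Cross-entropy against a full human label distribution.

    logits:      (batch, num_classes) raw model outputs
    human_probs: (batch, num_classes), rows sum to 1 (e.g. normalized
                 counts of human annotations per CIFAR-10H image)
    """
    log_probs = F.log_softmax(logits, dim=-1)
    # H(p_human, p_model): reduces to ordinary cross-entropy
    # when human_probs is one-hot.
    return -(human_probs * log_probs).sum(dim=-1).mean()

# usage: loss = soft_label_loss(model(images), human_probs)
```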

Deep Learning How to Fit an Intravoxel Incoherent Motion Model to Diffusion-Weighted MRI

Title Deep Learning How to Fit an Intravoxel Incoherent Motion Model to Diffusion-Weighted MRI
Authors Sebastiano Barbieri, Oliver J. Gurney-Champion, Remy Klaassen, Harriet C. Thoeny
Abstract Purpose: This prospective clinical study assesses the feasibility of training a deep neural network (DNN) for intravoxel incoherent motion (IVIM) model fitting to diffusion-weighted magnetic resonance imaging (DW-MRI) data and evaluates its performance. Methods: In May 2011, ten male volunteers (age range: 29 to 53 years, mean: 37 years) underwent DW-MRI of the upper abdomen on 1.5T and 3.0T magnetic resonance scanners. Regions of interest in the left and right liver lobe, pancreas, spleen, renal cortex, and renal medulla were delineated independently by two readers. DNNs were trained for IVIM model fitting using these data; results were compared to least-squares and Bayesian approaches to IVIM fitting. Intraclass Correlation Coefficients (ICC) were used to assess consistency of measurements between readers. Intersubject variability was evaluated using Coefficients of Variation (CV). The fitting error was calculated based on simulated data and the average fitting time of each method was recorded. Results: DNNs were trained successfully for IVIM parameter estimation. This approach was associated with high consistency between the two readers (ICCs between 50% and 97%), low intersubject variability of estimated parameter values (CVs between 9.2% and 28.4%), and the lowest error when compared with least-squares and Bayesian approaches. Fitting by DNNs was several orders of magnitude quicker than the other methods, but the networks may need to be re-trained for different acquisition protocols or imaged anatomical regions. Conclusion: DNNs are recommended for accurate and robust IVIM model fitting to DW-MRI data. Suitable software is available at (1).
Tasks
Published 2019-02-28
URL https://arxiv.org/abs/1903.00095v2
PDF https://arxiv.org/pdf/1903.00095v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-how-to-fit-an-intravoxel
Repo https://github.com/sebbarb/deep_ivim
Framework pytorch
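
A hedged sketch of the unsupervised idea: a small network maps each voxel's multi-b-value signal to the three IVIM parameters, and the loss reconstructs the measured (S0-normalized) signal through the biexponential IVIM equation. The b-value protocol and parameter ranges below are illustrative assumptions, not the paper's exact settings:

```python
import torch
import torch.nn as nn

b_values = torch.tensor([0., 10., 20., 60., 150., 300., 500., 1000.])  # assumed protocol

class IVIMNet(nn.Module):
    """Maps a voxel's multi-b-value signal to IVIM parameters (sketch)."""
    def __init__(self, n_b):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_b, 64), nn.ELU(),
            nn.Linear(64, 64), nn.ELU(),
            nn.Linear(64, 3),  # Dt, Fp, Dp before range constraints
        )

    def forward(self, s):                 # s: (batch, n_b), S0-normalized
        p = torch.sigmoid(self.net(s))
        Dt = p[:, 0] * 5e-3               # tissue diffusion range (assumed)
        Fp = p[:, 1]                      # perfusion fraction in [0, 1]
        Dp = p[:, 2] * 0.1                # pseudo-diffusion range (assumed)
        return Dt, Fp, Dp

def ivim_signal(Dt, Fp, Dp, b):
    """Biexponential IVIM model: S(b)/S0 = Fp*exp(-b*Dp) + (1-Fp)*exp(-b*Dt)."""
    b = b[None, :]
    return Fp[:, None] * torch.exp(-b * Dp[:, None]) + \
           (1 - Fp[:, None]) * torch.exp(-b * Dt[:, None])

# unsupervised training: loss = F.mse_loss(ivim_signal(*model(signals), b_values), signals)
```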

ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity

Title ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
Authors Otmar Ertl
Abstract The probability Jaccard similarity was recently proposed as a natural generalization of the Jaccard similarity to measure the proximity of sets whose elements are associated with relative frequencies or probabilities. In combination with a hash algorithm that maps those weighted sets to compact signatures which allow fast estimation of pairwise similarities, it constitutes a valuable method for big data applications such as near-duplicate detection, nearest neighbor search, or clustering. This paper introduces a class of locality-sensitive one-pass hash algorithms that are orders of magnitude faster than the original approach. The performance gain is achieved by calculating signature components not independently, but collectively. Four different algorithms are proposed based on this idea. Two of them are statistically equivalent to the original approach and can be used as direct replacements. The other two may even improve the estimation error by breaking the statistical independence of signature components. Moreover, the presented techniques can be specialized for the conventional Jaccard similarity, resulting in highly efficient algorithms that outperform traditional minwise hashing.
Tasks
Published 2019-11-02
URL https://arxiv.org/abs/1911.00675v1
PDF https://arxiv.org/pdf/1911.00675v1.pdf
PWC https://paperswithcode.com/paper/probminhash-a-class-of-locality-sensitive
Repo https://github.com/oertl/probminhash
Framework none
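
For orientation, here is the baseline P-MinHash construction that ProbMinHash accelerates: each signature component keeps the element whose hash-seeded exponential draw (rate equal to the element's weight) is smallest, and matching components between two signatures estimate the probability Jaccard similarity. This is the slow O(m·|set|) variant with independent components, not the paper's collective algorithms:

```python
import math
import random

def pminhash(weights, m=256):
    """Baseline one-draw-per-(element, component) P-MinHash sketch.

    weights: dict mapping element -> relative frequency (sums to 1)
    returns: signature of m elements; the shared per-(element, i) seed is
             what makes signatures of different sets comparable.
    """
    sig = [None] * m
    best = [math.inf] * m
    for d, w in weights.items():
        for i in range(m):
            # hash() stands in for a proper hash function here (it is
            # salted per process in CPython, so not stable across runs)
            rng = random.Random(hash((d, i)))
            x = rng.expovariate(w)        # Exp(rate = weight) draw
            if x < best[i]:
                best[i], sig[i] = x, d
    return sig

def estimate(sig_a, sig_b):
    """Fraction of equal components ~ probability Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```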

AMR Normalization for Fairer Evaluation

Title AMR Normalization for Fairer Evaluation
Authors Michael Wayne Goodman
Abstract Abstract Meaning Representation (AMR; Banarescu et al., 2013) encodes the meaning of sentences as a directed graph, and Smatch (Cai and Knight, 2013) is the primary metric for evaluating AMR graphs. Smatch, however, is unaware of some meaning-equivalent variations in graph structure allowed by the AMR Specification and gives different scores for AMRs exhibiting these variations. In this paper I propose four normalization methods for helping to ensure that conceptually equivalent AMRs are evaluated as equivalent. Equivalent AMRs with and without normalization can look quite different: comparing a gold corpus to itself with relation reification alone yields a difference of 25 Smatch points, suggesting that the outputs of two systems may not be directly comparable without normalization. The algorithms described in this paper are implemented on top of an existing open-source Python toolkit for AMR and will be released under the same license.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.01568v2
PDF https://arxiv.org/pdf/1909.01568v2.pdf
PWC https://paperswithcode.com/paper/amr-normalization-for-fairer-evaluation
Repo https://github.com/goodmami/penman
Framework none
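
The normalizations are built on top of the Penman toolkit's graph representation (the linked repo). A minimal round-trip with its public API; the specific normalization passes (relation inversion, reification, etc.) live in the paper's code and are not reproduced here:

```python
# pip install penman
import penman

g = penman.decode('(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))')

# triples expose the graph for rewrites of meaning-equivalent variants,
# e.g. an inverted role like :ARG0-of versus its :ARG0 counterpart
for source, role, target in g.triples:
    print(source, role, target)

print(penman.encode(g))  # re-serialized, consistently formatted AMR
```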

Unified Generative Adversarial Networks for Controllable Image-to-Image Translation

Title Unified Generative Adversarial Networks for Controllable Image-to-Image Translation
Authors Hao Tang, Hong Liu, Nicu Sebe
Abstract Controllable image-to-image translation, i.e., transferring an image from a source domain to a target one guided by controllable structures, has attracted much attention in both academia and industry. In this paper, we propose a unified Generative Adversarial Network (GAN) framework for controllable image-to-image translation. In addition to conditioning on a reference image, we show how the model can generate images conditioned on controllable structures, e.g., class labels, object keypoints, human skeletons and scene semantic maps. The proposed GAN framework consists of a single generator and a discriminator taking a conditional image and the target controllable structure as input. In this way, the conditional image can provide appearance information and the controllable structure can provide the structure information for generating the target result. Moreover, the proposed GAN learns the image-to-image mapping through three novel losses, i.e., color loss, controllable structure-guided cycle-consistency loss and controllable structure-guided self-identity preserving loss. Note that the proposed color loss handles the issue of “channel pollution” when back-propagating the gradients. In addition, we present the Fréchet ResNet Distance (FRD) to evaluate the quality of generated images. Extensive qualitative and quantitative experiments on two challenging image translation tasks with four different datasets demonstrate that the proposed GAN model generates convincing results, and significantly outperforms other state-of-the-art methods on both tasks. Meanwhile, the proposed GAN framework is a unified solution, thus it can be applied to solving other controllable structure-guided image-to-image translation tasks, such as landmark-guided facial expression translation and keypoint-guided person image generation.
Tasks Image Generation, Image-to-Image Translation
Published 2019-12-12
URL https://arxiv.org/abs/1912.06112v1
PDF https://arxiv.org/pdf/1912.06112v1.pdf
PWC https://paperswithcode.com/paper/unified-generative-adversarial-networks-for
Repo https://github.com/Ha0Tang/GestureGAN
Framework pytorch
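
A hedged sketch of two pieces of the framework: the generator conditions on the reference image concatenated channel-wise with the controllable structure, and the color loss penalizes each RGB channel separately. Reading the "channel pollution" fix as a per-channel L1 is my assumption, not necessarily the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def color_loss(fake, real):
    """Per-channel L1 sketch: each RGB channel of fake/real
    (batch, 3, H, W) gets its own loss term, so the gradient of one
    channel is computed independently of the others."""
    return sum(F.l1_loss(fake[:, c], real[:, c]) for c in range(fake.size(1)))

# generator input: reference image concatenated with the controllable
# structure (e.g. a keypoint/skeleton/semantic map) along channels
# x = torch.cat([cond_image, structure_map], dim=1)
```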

Efficient Blind Deblurring under High Noise Levels

Title Efficient Blind Deblurring under High Noise Levels
Authors Jérémy Anger, Mauricio Delbracio, Gabriele Facciolo
Abstract The goal of blind image deblurring is to recover a sharp image from a motion-blurred one without knowing the camera motion. Current state-of-the-art methods have a remarkably good performance on images with no noise or very low noise levels. However, the noiseless assumption is not realistic: low-light conditions, which require longer exposure times, are the main reason for the presence of motion blur. In fact, motion blur and moderate to high noise often appear together. Most works approach this problem by first estimating the blur kernel $k$ and then deconvolving the noisy blurred image. In this work, we first show that current state-of-the-art kernel estimation methods based on the $\ell_0$ gradient prior can be adapted to handle high noise levels while keeping their efficiency. Then, we show that a fast non-blind deconvolution method can be significantly improved by first denoising the blurry image. The proposed approach yields results that are equivalent to those obtained with much more computationally demanding methods.
Tasks Blind Image Deblurring, Deblurring, Denoising
Published 2019-04-19
URL https://arxiv.org/abs/1904.09154v2
PDF https://arxiv.org/pdf/1904.09154v2.pdf
PWC https://paperswithcode.com/paper/efficient-blind-deblurring-under-high-noise
Repo https://github.com/kidanger/high-noise-deblurring
Framework pytorch
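
A rough sketch of the second finding (denoise, then run fast non-blind deconvolution), with off-the-shelf scikit-image components standing in for the paper's methods; the kernel `k` is assumed to come from the adapted $\ell_0$-gradient-prior estimator:

```python
import numpy as np
from skimage.restoration import denoise_nl_means, richardson_lucy

def deblur_noisy(blurred_noisy, k, sigma):
    """blurred_noisy: 2D float image; k: estimated blur kernel (PSF);
    sigma: estimated noise level."""
    # step 1: denoising first makes the fast non-blind step far more stable
    denoised = denoise_nl_means(blurred_noisy, h=1.15 * sigma)
    # step 2: non-blind deconvolution with the estimated kernel
    return richardson_lucy(denoised, k, num_iter=30)
```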

FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow

Title FlowSeq: Non-Autoregressive Conditional Sequence Generation with Generative Flow
Authors Xuezhe Ma, Chunting Zhou, Xian Li, Graham Neubig, Eduard Hovy
Abstract Most sequence-to-sequence (seq2seq) models are autoregressive; they generate each token by conditioning on previously generated tokens. In contrast, non-autoregressive seq2seq models generate all tokens in one pass, which leads to increased efficiency through parallel processing on hardware such as GPUs. However, directly modeling the joint distribution of all tokens simultaneously is challenging, and even with increasingly complex model structures accuracy lags significantly behind autoregressive models. In this paper, we propose a simple, efficient, and effective model for non-autoregressive sequence generation using latent variable models. Specifically, we turn to generative flow, an elegant technique to model complex distributions using neural networks, and design several layers of flow tailored for modeling the conditional density of sequential latent variables. We evaluate this model on three neural machine translation (NMT) benchmark datasets, achieving comparable performance with state-of-the-art non-autoregressive NMT models and almost constant decoding time w.r.t. the sequence length.
Tasks Latent Variable Models, Machine Translation
Published 2019-09-05
URL https://arxiv.org/abs/1909.02480v3
PDF https://arxiv.org/pdf/1909.02480v3.pdf
PWC https://paperswithcode.com/paper/flowseq-non-autoregressive-conditional
Repo https://github.com/XuezheMax/flowseq
Framework pytorch
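
The flow layers follow the usual change-of-variables recipe. Below is a generic affine coupling step of the kind such models stack; it is a sketch, not FlowSeq's exact source-conditioned layer:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Invertible coupling layer: half the features are rescaled and
    shifted using parameters predicted from the other half."""
    def __init__(self, dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim // 2, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),  # predicts log-scale and shift
        )

    def forward(self, z):
        za, zb = z.chunk(2, dim=-1)
        log_s, t = self.net(za).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)          # keep scales numerically stable
        zb = zb * torch.exp(log_s) + t     # invertible affine transform
        logdet = log_s.sum(dim=-1)         # change-of-variables term
        return torch.cat([za, zb], dim=-1), logdet
```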

Simplifying Graph Convolutional Networks

Title Simplifying Graph Convolutional Networks
Authors Felix Wu, Tianyi Zhang, Amauri Holanda de Souza Jr., Christopher Fifty, Tao Yu, Kilian Q. Weinberger
Abstract Graph Convolutional Networks (GCNs) and their variants have experienced significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, may inherit unnecessary complexity and redundant computation. In this paper, we reduce this excess complexity through successively removing nonlinearities and collapsing weight matrices between consecutive layers. We theoretically analyze the resulting linear model and show that it corresponds to a fixed low-pass filter followed by a linear classifier. Notably, our experimental evaluation demonstrates that these simplifications do not negatively impact accuracy in many downstream applications. Moreover, the resulting model scales to larger datasets, is naturally interpretable, and yields up to two orders of magnitude speedup over FastGCN.
Tasks Graph Regression, Image Classification, Relation Extraction, Sentiment Analysis, Skeleton Based Action Recognition, Text Classification
Published 2019-02-19
URL https://arxiv.org/abs/1902.07153v2
PDF https://arxiv.org/pdf/1902.07153v2.pdf
PWC https://paperswithcode.com/paper/simplifying-graph-convolutional-networks
Repo https://github.com/cvignac/gnn_statistics
Framework tf
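
The collapsed model is easy to state exactly: propagate features K times with the normalized adjacency, then fit a linear classifier on the result. A NumPy sketch for a small dense graph:

```python
import numpy as np

def sgc_features(adj, X, K=2):
    """Simplified GCN preprocessing: S^K X, where
    S = D^{-1/2} (A + I) D^{-1/2} is the normalized adjacency with
    self-loops. A linear classifier on the output replaces the GCN."""
    A = adj + np.eye(adj.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    S = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    for _ in range(K):
        X = S @ X                              # K-hop feature smoothing
    return X

# then e.g. sklearn.linear_model.LogisticRegression()
#     .fit(sgc_features(A, X)[train_idx], y[train_idx])
```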

Should All Temporal Difference Learning Use Emphasis?

Title Should All Temporal Difference Learning Use Emphasis?
Authors Xiang Gu, Sina Ghiassian, Richard S. Sutton
Abstract Emphatic Temporal Difference (ETD) learning has recently been proposed as a convergent off-policy learning method. ETD was proposed mainly to address convergence issues of conventional Temporal Difference (TD) learning under off-policy training, but it differs from conventional TD learning even under on-policy training. A simple counterexample provided back in 2017 pointed to a potential class of problems where ETD converges but TD diverges. In this paper, we empirically show that ETD converges on a few other well-known on-policy experiments whereas TD either diverges or performs poorly. We also show that ETD outperforms TD on the mountain car prediction problem. Our results, together with a similar pattern observed under off-policy training in prior works, suggest that ETD might be a good substitute for conventional TD.
Tasks
Published 2019-03-01
URL http://arxiv.org/abs/1903.00194v1
PDF http://arxiv.org/pdf/1903.00194v1.pdf
PWC https://paperswithcode.com/paper/should-all-temporal-difference-learning-use
Repo https://github.com/Xiang-Gu/Should-ALL-Temporal-Difference-Learning-Use-Emphasis
Framework none
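
For reference, linear ETD(0) under on-policy training (interest 1, importance ratios 1) differs from TD(0) only by the followon trace F that scales each update. A NumPy sketch; `env_step` is an assumed helper returning (reward, next_features, done):

```python
import numpy as np

def etd0_episode(env_step, x0, w, alpha=0.01, gamma=0.99):
    """One episode of linear ETD(0) prediction with emphasis."""
    x, F = x0, 0.0
    done = False
    while not done:
        r, x_next, done = env_step(x)
        v_next = 0.0 if done else w @ x_next
        delta = r + gamma * v_next - w @ x    # TD error
        F = gamma * F + 1.0                   # followon trace (emphasis)
        w += alpha * F * delta * x            # emphasis-weighted TD update
        x = x_next
    return w
```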

Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks

Title Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks
Authors Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser
Abstract Convolutional neural networks (CNNs) introduce state-of-the-art results for various tasks at the price of high computational demands. Inspired by the observation that spatial correlation exists in CNN output feature maps (ofms), we propose a method to dynamically predict whether ofm activations are zero-valued or not according to their neighboring activation values, thereby avoiding the computation of zero-valued activations and reducing the number of convolution operations. We implement the zero activation predictor (ZAP) with a lightweight CNN, which imposes negligible overheads and is easy to train and deploy on existing models. Furthermore, without model retraining, the same ZAP can be tuned to many different operating points along the accuracy-savings trade-off curve. For example, using VGG-16 and the ILSVRC-2012 dataset, two different operating points achieve a reduction of 20% and 30% multiply-accumulate (MAC) operations with top-1/top-5 accuracy degradation of 0.1%/0.04% and 1.3%/0.7% without fine-tuning of the entire model, respectively. Considering one-epoch fine-tuning, 45% of MAC operations may be reduced with 1.3%/0.7% accuracy degradation.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07636v2
PDF https://arxiv.org/pdf/1909.07636v2.pdf
PWC https://paperswithcode.com/paper/thanks-for-nothing-predicting-zero-valued
Repo https://github.com/gilshm/zap
Framework pytorch
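
A hedged sketch of the predictor's role: a tiny convolution looks at the activations that were actually computed and flags which neighboring ofm positions are likely zero after ReLU, so their MACs can be skipped. The single-conv architecture and the thresholding below are my assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class ZAP(nn.Module):
    """Lightweight zero-activation predictor (sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.predictor = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, computed_ofm, threshold=0.0):
        # logits per position; mask = 1 where the activation is predicted
        # non-zero and therefore worth computing. Moving `threshold`
        # moves along the accuracy/savings trade-off curve without retraining.
        logits = self.predictor(computed_ofm)
        return (logits > threshold).float()
```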

DeepRED: Deep Image Prior Powered by RED

Title DeepRED: Deep Image Prior Powered by RED
Authors Gary Mataev, Michael Elad, Peyman Milanfar
Abstract Inverse problems in imaging are extensively studied, with a variety of strategies, tools, and theory that have been accumulated over the years. Recently, this field has been immensely influenced by the emergence of deep-learning techniques. One such contribution, which is the focus of this paper, is the Deep Image Prior (DIP) work by Ulyanov, Vedaldi, and Lempitsky (2018). DIP offers a new approach towards the regularization of inverse problems, obtained by forcing the recovered image to be synthesized from a given deep architecture. While DIP has been shown to be quite an effective unsupervised approach, its results still fall short when compared to state-of-the-art alternatives. In this work, we aim to boost DIP by adding an explicit prior, which enriches the overall regularization effect in order to lead to better-recovered images. More specifically, we propose to bring in the concept of Regularization by Denoising (RED), which leverages existing denoisers for regularizing inverse problems. Our work shows how the two (DIP and RED) can be merged into a highly effective unsupervised recovery process while avoiding the need to differentiate the chosen denoiser, leading to very effective results, demonstrated for several tested problems.
Tasks Deblurring, Denoising, Image Super-Resolution
Published 2019-03-25
URL https://arxiv.org/abs/1903.10176v3
PDF https://arxiv.org/pdf/1903.10176v3.pdf
PWC https://paperswithcode.com/paper/deepred-deep-image-prior-powered-by-red
Repo https://github.com/GaryMataev/DeepRED
Framework pytorch
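
A simplified sketch of the merged objective: the DIP data term plus the RED regularizer x^T(x - D(x)). Detaching the denoiser output below is a crude stand-in for the paper's ADMM scheme, which is what actually avoids differentiating the denoiser:

```python
import torch
import torch.nn.functional as F

def deepred_step(net, z, y, degrade, denoiser, mu=0.5):
    """One loss evaluation of a simplified DIP+RED objective.

    net:      DIP generator mapping fixed noise z to an image
    y:        degraded observation
    degrade:  forward operator (e.g. blur + downsample)
    denoiser: any off-the-shelf denoiser D(x)
    """
    x = net(z)                                  # DIP: image from fixed noise
    data = F.mse_loss(degrade(x), y)            # fidelity to the observation
    with torch.no_grad():
        x_denoised = denoiser(x)                # no gradient through D(x)
    red = (x * (x - x_denoised)).mean()         # RED term x^T(x - D(x))
    return data + mu * red
```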

Multilingual Neural Machine Translation with Knowledge Distillation

Title Multilingual Neural Machine Translation with Knowledge Distillation
Authors Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu
Abstract Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency of offline training and online serving. However, traditional multilingual translation usually yields inferior accuracy compared with the counterpart using individual models for each language pair, due to language diversity and model capacity limitations. In this paper, we propose a distillation-based approach to boost the accuracy of multilingual machine translation. Specifically, individual models are first trained and regarded as teachers, and then the multilingual model is trained to fit the training data and match the outputs of the individual models simultaneously through knowledge distillation. Experiments on the IWSLT, WMT, and TED talk translation datasets demonstrate the effectiveness of our method. In particular, we show that one model is enough to handle multiple languages (up to 44 languages in our experiment), with comparable or even better accuracy than individual models.
Tasks Machine Translation
Published 2019-02-27
URL http://arxiv.org/abs/1902.10461v3
PDF http://arxiv.org/pdf/1902.10461v3.pdf
PWC https://paperswithcode.com/paper/multilingual-neural-machine-translation-with-2
Repo https://github.com/RayeRen/multilingual-kd-pytorch
Framework pytorch
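
The per-language-pair objective combines fitting the data with matching the corresponding individual teacher. A standard distillation-loss sketch; the temperature and mixing weight are generic hyperparameters, not values from the paper:

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, targets, T=1.0, alpha=0.5):
    """Mix of data NLL and KL to the teacher's softened distribution.

    student_logits, teacher_logits: (batch, vocab)
    targets: (batch,) gold token indices
    """
    nll = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction='batchmean',
    ) * (T * T)                       # rescale gradients for temperature
    return (1 - alpha) * nll + alpha * kd
```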

Towards Recognizing Phrase Translation Processes: Experiments on English-French

Title Towards Recognizing Phrase Translation Processes: Experiments on English-French
Authors Yuming Zhai, Pooyan Safari, Gabriel Illouz, Alexandre Allauzen, Anne Vilnat
Abstract When translating phrases (words or groups of words), human translators, consciously or not, resort to different translation processes apart from the literal translation, such as Idiom Equivalence, Generalization, Particularization, Semantic Modulation, etc. Translators and linguists (such as Vinay and Darbelnet, Newmark, etc.) have proposed several typologies to characterize the different translation processes. However, to the best of our knowledge, there has been no effort to automatically classify these fine-grained translation processes. Recently, an English-French parallel corpus of TED Talks has been manually annotated with translation process categories, along with established annotation guidelines. Based on these annotated examples, we propose an automatic classification of translation processes at the subsentential level. Experimental results show that we can distinguish non-literal translation from literal translation with an accuracy of 87.09%, and achieve 55.20% accuracy when classifying among five non-literal translation processes. This work demonstrates that it is possible to automatically classify translation processes. Even with a small number of annotated examples, our experiments show the directions that we can follow in future work. One of our long-term objectives is leveraging this automatic classification to better control paraphrase extraction from bilingual parallel corpora.
Tasks
Published 2019-04-27
URL http://arxiv.org/abs/1904.12213v1
PDF http://arxiv.org/pdf/1904.12213v1.pdf
PWC https://paperswithcode.com/paper/towards-recognizing-phrase-translation
Repo https://github.com/YumingZHAI/ctp
Framework pytorch
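
A hedged sketch of one plausible classifier setup: embed the aligned English and French phrases with any multilingual encoder and classify the pair with standard pair-comparison features. The paper's actual feature set and architecture may differ:

```python
import torch
import torch.nn as nn

class ProcessClassifier(nn.Module):
    """Classifies an aligned phrase pair into translation-process
    categories (e.g. literal vs. the non-literal classes)."""
    def __init__(self, emb_dim, n_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4 * emb_dim, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, src_emb, tgt_emb):
        # concat + difference + product: common pair-classification features
        pair = torch.cat(
            [src_emb, tgt_emb, src_emb - tgt_emb, src_emb * tgt_emb], dim=-1)
        return self.mlp(pair)
```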

Multi-hop Reading Comprehension through Question Decomposition and Rescoring

Title Multi-hop Reading Comprehension through Question Decomposition and Rescoring
Authors Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi
Abstract Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs. We propose a system for multi-hop RC that decomposes a compositional question into simpler sub-questions that can be answered by off-the-shelf single-hop RC models. Since annotations for such decomposition are expensive, we recast sub-question generation as a span prediction problem and show that our method, trained using only 400 labeled examples, generates sub-questions that are as effective as human-authored sub-questions. We also introduce a new global rescoring approach that considers each decomposition (i.e. the sub-questions and their answers) to select the best final answer, greatly improving overall performance. Our experiments on HotpotQA show that this approach achieves the state-of-the-art results, while providing explainable evidence for its decision making in the form of sub-questions.
Tasks Decision Making, Multi-Hop Reading Comprehension, Question Generation, Reading Comprehension
Published 2019-06-07
URL https://arxiv.org/abs/1906.02916v2
PDF https://arxiv.org/pdf/1906.02916v2.pdf
PWC https://paperswithcode.com/paper/multi-hop-reading-comprehension-through
Repo https://github.com/shmsw25/DecompRC
Framework pytorch
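
The key trick is casting decomposition as span prediction over the question itself, which is why roughly 400 labeled examples suffice. A minimal sketch of such a span head; the surrounding encoder and the global rescorer are omitted:

```python
import torch
import torch.nn as nn

class SpanDecomposer(nn.Module):
    """Scores each question token as the start/end of a sub-question span."""
    def __init__(self, hidden):
        super().__init__()
        self.span_scores = nn.Linear(hidden, 2)   # start and end logits

    def forward(self, token_states):              # (batch, seq, hidden)
        start_logits, end_logits = self.span_scores(token_states).unbind(-1)
        return start_logits, end_logits

# the predicted span is copied out of the question to form a sub-question,
# answered by an off-the-shelf single-hop RC model; a rescorer then picks
# the best decomposition (sub-questions plus answers) as the final answer
```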

Comparative Analysis of CNN-based Spatiotemporal Reasoning in Videos

Title Comparative Analysis of CNN-based Spatiotemporal Reasoning in Videos
Authors Okan Köpüklü, Fabian Herzog, Gerhard Rigoll
Abstract Understanding actions and gestures in video streams requires temporal reasoning over the spatial content from different time instants, i.e., spatiotemporal (ST) modeling. In this paper, we make a comparative analysis of different ST modeling techniques. Since convolutional neural networks (CNNs) have proved to be effective feature extractors for static images, we apply ST modeling techniques to the features of static images from different time instants extracted by CNNs. All techniques are trained end-to-end together with a CNN feature extraction part and evaluated on two publicly available benchmarks: the Jester and the Something-Something datasets. The Jester dataset contains various dynamic and static hand gestures, whereas the Something-Something dataset contains actions of human-object interactions. The common characteristic of these two benchmarks is that the designed architectures need to capture the full temporal content of the actions/gestures in the correct order. Contrary to expectations, experimental results show that recurrent neural network (RNN) based ST modeling techniques yield inferior results compared to other techniques such as fully convolutional architectures. Codes and pretrained models of this work are publicly available.
Tasks Action Recognition In Videos, Human-Object Interaction Detection
Published 2019-09-11
URL https://arxiv.org/abs/1909.05165v1
PDF https://arxiv.org/pdf/1909.05165v1.pdf
PWC https://paperswithcode.com/paper/comparative-analysis-of-cnn-based
Repo https://github.com/fubel/stmodeling
Framework pytorch
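
As a concrete example of the compared techniques, the simplest fusion baseline concatenates the per-frame CNN features in temporal order and fuses them with an MLP; the RNN variants that underperformed replace this block with a GRU/LSTM. A hedged PyTorch sketch:

```python
import torch
import torch.nn as nn

class MLPFusion(nn.Module):
    """Order-aware fusion of per-frame CNN features (sketch)."""
    def __init__(self, feat_dim, n_frames, n_classes):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(n_frames * feat_dim, 512), nn.ReLU(),
            nn.Linear(512, n_classes),
        )

    def forward(self, frame_feats):               # (batch, n_frames, feat_dim)
        # flattening preserves temporal order, so the MLP can still
        # distinguish forward from reversed gestures
        return self.fuse(frame_feats.flatten(1))
```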