October 21, 2019

3144 words 15 mins read

Paper Group AWR 139

Handling Incomplete Heterogeneous Data using VAEs. N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules. Deep Defense: Training DNNs with Improved Adversarial Robustness. Using Deep Learning to Extend the Range of Air-Pollution Monitoring and Forecasting. Image Super-Resolution via Dual-State Recurrent Network …

Handling Incomplete Heterogeneous Data using VAEs

Title Handling Incomplete Heterogeneous Data using VAEs
Authors Alfredo Nazabal, Pablo M. Olmos, Zoubin Ghahramani, Isabel Valera
Abstract Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate at capturing the latent structure of vast amounts of complex high-dimensional data. However, existing VAEs still cannot directly handle data that are heterogeneous (mixed continuous and discrete) or incomplete (with missing data at random), which is indeed common in real-world applications. In this paper, we propose a general framework to design VAEs suitable for fitting incomplete heterogeneous data. The proposed HI-VAE includes likelihood models for real-valued, positive real-valued, interval, categorical, ordinal and count data, and allows estimating (and potentially imputing) missing data accurately. Furthermore, HI-VAE presents competitive predictive performance in supervised tasks, outperforming supervised models when trained on incomplete data.
Tasks
Published 2018-07-10
URL http://arxiv.org/abs/1807.03653v3
PDF http://arxiv.org/pdf/1807.03653v3.pdf
PWC https://paperswithcode.com/paper/handling-incomplete-heterogeneous-data-using
Repo https://github.com/probabilistic-learning/HI-VAE
Framework tf
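The core trick that lets HI-VAE train on incomplete data is evaluating the reconstruction likelihood only over observed entries. A minimal sketch of that idea, assuming a single Gaussian likelihood with fixed variance (the paper's actual decoder factorises over the six data types listed above):

```python
import numpy as np

def masked_gaussian_ll(x, mu, mask, sigma=1.0):
    """Gaussian log-likelihood summed over observed entries only.

    x, mu : (n, d) data and reconstruction means.
    mask  : (n, d) binary array, 1 where the entry is observed.
    Missing entries contribute nothing, so the training objective is
    driven only by the data that is actually present.
    """
    ll = -0.5 * ((x - mu) / sigma) ** 2 - 0.5 * np.log(2 * np.pi * sigma ** 2)
    return float((ll * mask).sum())
```

At imputation time, the decoder's output at the masked positions serves as the estimate of the missing values.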

N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules

Title N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules
Authors Shengchao Liu, Mehmet Furkan Demirel, Yingyu Liang
Abstract Machine learning techniques have recently been adopted in various applications in medicine, biology, chemistry, and material engineering. An important task is to predict the properties of molecules, which serves as the main subroutine in many downstream applications such as virtual screening and drug design. Despite the increasing interest, the key challenge is to construct proper representations of molecules for learning algorithms. This paper introduces the N-gram graph, a simple unsupervised representation for molecules. The method first embeds the vertices in the molecule graph. It then constructs a compact representation for the graph by assembling the vertex embeddings in short walks in the graph, which we show is equivalent to a simple graph neural network that needs no training. The representations can thus be efficiently computed and then used with supervised learning methods for prediction. Experiments on 60 tasks from 10 benchmark datasets demonstrate its advantages over both popular graph neural networks and traditional representation methods. This is complemented by theoretical analysis showing its strong representation and prediction power.
Tasks
Published 2018-06-24
URL https://arxiv.org/abs/1806.09206v2
PDF https://arxiv.org/pdf/1806.09206v2.pdf
PWC https://paperswithcode.com/paper/n-gram-graph-a-novel-molecule-representation
Repo https://github.com/chao1224/n_gram_graph
Framework none
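The "simple graph neural network that needs no training" reduces to an elementwise recursion over vertex embeddings, as we read it from the abstract: the 1-gram vector sums the vertex embeddings, and each higher gram extends walks by one edge. A hedged numpy sketch (dense adjacency, no normalisation; details may differ from the paper's implementation):

```python
import numpy as np

def n_gram_graph_embedding(adj, vertex_emb, n):
    """N-gram graph embedding via an untrained message-passing recursion.

    adj        : (V, V) adjacency matrix.
    vertex_emb : (V, d) per-vertex embeddings.
    c^(1)_i = f_i; c^(k)_i = f_i * sum_j adj[i, j] * c^(k-1)_j (elementwise),
    and the graph-level k-gram vector is sum_i c^(k)_i.
    Returns the concatenation of the 1..n gram vectors, shape (n*d,).
    """
    c = vertex_emb.copy()
    grams = [c.sum(axis=0)]
    for _ in range(1, n):
        c = vertex_emb * (adj @ c)  # walks extended by one edge
        grams.append(c.sum(axis=0))
    return np.concatenate(grams)
```

The resulting fixed-length vector can be fed directly to any supervised learner (e.g. gradient-boosted trees), which is what makes the representation cheap to use.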

Deep Defense: Training DNNs with Improved Adversarial Robustness

Title Deep Defense: Training DNNs with Improved Adversarial Robustness
Authors Ziang Yan, Yiwen Guo, Changshui Zhang
Abstract Despite the efficacy on a variety of computer vision tasks, deep neural networks (DNNs) are vulnerable to adversarial attacks, limiting their applications in security-critical systems. Recent works have shown the possibility of generating imperceptibly perturbed image inputs (a.k.a., adversarial examples) to fool well-trained DNN classifiers into making arbitrary predictions. To address this problem, we propose a training recipe named “deep defense”. Our core idea is to integrate an adversarial perturbation-based regularizer into the classification objective, such that the obtained models learn to resist potential attacks, directly and precisely. The whole optimization problem is solved just like training a recursive network. Experimental results demonstrate that our method outperforms training with adversarial/Parseval regularizations by large margins on various datasets (including MNIST, CIFAR-10 and ImageNet) and different DNN architectures. Code and models for reproducing our results are available at https://github.com/ZiangYan/deepdefense.pytorch
Tasks
Published 2018-02-23
URL http://arxiv.org/abs/1803.00404v3
PDF http://arxiv.org/pdf/1803.00404v3.pdf
PWC https://paperswithcode.com/paper/deep-defense-training-dnns-with-improved
Repo https://github.com/ZiangYan/deepdefense.pytorch
Framework pytorch
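The "perturbation-based regularizer" idea can be illustrated as a joint objective: examples that a minimal adversarial perturbation can fool cheaply (small perturbation norm relative to the input) incur a large penalty. This is a loose sketch of the shape of the objective, not the paper's exact formulation; lam and c are illustrative hyperparameters:

```python
import numpy as np

def deep_defense_objective(ce_loss, pert_norms, input_norms, lam=1.0, c=1.0):
    """Classification loss plus an adversarial-robustness penalty (sketch).

    pert_norms / input_norms : per-example L2 norms of a minimal
    adversarial perturbation and of the corresponding input.
    A small ratio means the example sits close to the decision
    boundary, so exp(-c * ratio) is large and training is pushed
    to enlarge that example's margin.
    """
    ratio = np.asarray(pert_norms, dtype=float) / np.asarray(input_norms, dtype=float)
    return float(ce_loss + lam * np.exp(-c * ratio).sum())
```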

Using Deep Learning to Extend the Range of Air-Pollution Monitoring and Forecasting

Title Using Deep Learning to Extend the Range of Air-Pollution Monitoring and Forecasting
Authors Philipp Haehnel, Jakub Marecek, Julien Monteil, Fearghal O’Donncha
Abstract Across numerous applications, forecasting relies on numerical solvers for partial differential equations (PDEs). Although the use of deep-learning techniques has been proposed, actual applications have been restricted by the fact that the training data are obtained using traditional PDE solvers. Thereby, the use of deep-learning techniques was limited to domains where the PDE solver was applicable. We demonstrate a deep-learning framework for air-pollution monitoring and forecasting that provides the ability to train across different model domains, as well as a reduction in the run-time by two orders of magnitude. It presents a first-of-a-kind implementation that combines deep-learning and domain-decomposition techniques to allow model deployments to extend beyond the domain(s) on which the model has been trained.
Tasks
Published 2018-10-22
URL https://arxiv.org/abs/1810.09425v3
PDF https://arxiv.org/pdf/1810.09425v3.pdf
PWC https://paperswithcode.com/paper/scaling-up-deep-learning-for-pde-based-models
Repo https://github.com/IBM/pde-deep-learning
Framework tf

Image Super-Resolution via Dual-State Recurrent Networks

Title Image Super-Resolution via Dual-State Recurrent Networks
Authors Wei Han, Shiyu Chang, Ding Liu, Mo Yu, Michael Witbrock, Thomas S. Huang
Abstract Advances in image super-resolution (SR) have recently benefited significantly from rapid developments in deep neural networks. Inspired by these recent discoveries, we note that many state-of-the-art deep SR architectures can be reformulated as a single-state recurrent neural network (RNN) with finite unfoldings. In this paper, we explore new structures for SR based on this compact RNN view, leading us to a dual-state design, the Dual-State Recurrent Network (DSRN). Compared to its single state counterparts that operate at a fixed spatial resolution, DSRN exploits both low-resolution (LR) and high-resolution (HR) signals jointly. Recurrent signals are exchanged between these states in both directions (both LR to HR and HR to LR) via delayed feedback. Extensive quantitative and qualitative evaluations on benchmark datasets and on a recent challenge demonstrate that the proposed DSRN performs favorably against state-of-the-art algorithms in terms of both memory consumption and predictive accuracy.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-05-07
URL http://arxiv.org/abs/1805.02704v1
PDF http://arxiv.org/pdf/1805.02704v1.pdf
PWC https://paperswithcode.com/paper/image-super-resolution-via-dual-state
Repo https://github.com/WeiHan3/dsrn
Framework tf

Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation

Title Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation
Authors Samuel Läubli, Rico Sennrich, Martin Volk
Abstract Recent research suggests that neural machine translation achieves parity with professional human translation on the WMT Chinese–English news translation task. We empirically test this claim with alternative evaluation protocols, contrasting the evaluation of single sentences and entire documents. In a pairwise ranking experiment, human raters assessing adequacy and fluency show a stronger preference for human over machine translation when evaluating documents as compared to isolated sentences. Our findings emphasise the need to shift towards document-level evaluation as machine translation improves to the degree that errors which are hard or impossible to spot at the sentence-level become decisive in discriminating quality of different translation outputs.
Tasks Machine Translation
Published 2018-08-21
URL http://arxiv.org/abs/1808.07048v1
PDF http://arxiv.org/pdf/1808.07048v1.pdf
PWC https://paperswithcode.com/paper/has-machine-translation-achieved-human-parity
Repo https://github.com/laeubli/parity
Framework none

Sliced-Wasserstein Autoencoder: An Embarrassingly Simple Generative Model

Title Sliced-Wasserstein Autoencoder: An Embarrassingly Simple Generative Model
Authors Soheil Kolouri, Phillip E. Pope, Charles E. Martin, Gustavo K. Rohde
Abstract In this paper we study generative modeling via autoencoders while using the elegant geometric properties of the optimal transport (OT) problem and the Wasserstein distances. We introduce Sliced-Wasserstein Autoencoders (SWAE), which are generative models that enable one to shape the distribution of the latent space into any samplable probability distribution without the need for training an adversarial network or defining a closed form for the distribution. In short, we regularize the autoencoder loss with the sliced-Wasserstein distance between the distribution of the encoded training samples and a predefined samplable distribution. We show that the proposed formulation has an efficient numerical solution that provides similar capabilities to Wasserstein Autoencoders (WAE) and Variational Autoencoders (VAE), while benefiting from an embarrassingly simple implementation.
Tasks
Published 2018-04-05
URL http://arxiv.org/abs/1804.01947v3
PDF http://arxiv.org/pdf/1804.01947v3.pdf
PWC https://paperswithcode.com/paper/sliced-wasserstein-autoencoder-an
Repo https://github.com/skolouri/swae
Framework pytorch
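The "embarrassingly simple" part is that in one dimension the Wasserstein distance between two equal-size samples is just the distance between their sorted values, so the sliced distance only needs random projections and a sort. A minimal sketch (squared-distance variant; projection count and RNG are illustrative choices):

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=50, seed=0):
    """Sliced-Wasserstein distance between two point clouds (sketch).

    x, y : (n, d) arrays with the same number of points n.
    Project both samples onto random unit directions; per slice, the
    1-D Wasserstein-2 distance is the mean squared gap between sorted
    projections, and the result averages over slices.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    px = np.sort(x @ theta.T, axis=0)  # (n, n_proj) sorted projections
    py = np.sort(y @ theta.T, axis=0)
    return float(np.mean((px - py) ** 2))
```

In SWAE this term is added to the reconstruction loss, comparing encoded training samples against draws from the chosen prior.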

R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering

Title R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Authors Pan Lu, Lei Ji, Wei Zhang, Nan Duan, Ming Zhou, Jianyong Wang
Abstract Recently, Visual Question Answering (VQA) has emerged as one of the most significant tasks in multimodal learning as it requires understanding both visual and textual modalities. Existing methods mainly rely on extracting image and question features to learn their joint feature embedding via multimodal fusion or attention mechanisms. Some recent studies utilize external VQA-independent models to detect candidate entities or attributes in images, which serve as semantic knowledge complementary to the VQA task. However, these candidate entities or attributes might be unrelated to the VQA task and have limited semantic capacities. To better utilize semantic knowledge in images, we propose a novel framework to learn visual relation facts for VQA. Specifically, we build up a Relation-VQA (R-VQA) dataset based on the Visual Genome dataset via a semantic similarity module, in which each instance consists of an image, a corresponding question, a correct answer and a supporting relation fact. A well-defined relation detector is then adopted to predict visual question-related relation facts. We further propose a multi-step attention model composed of visual attention and semantic attention sequentially to extract related visual knowledge and semantic knowledge. We conduct comprehensive experiments on the two benchmark datasets, demonstrating that our model achieves state-of-the-art performance and verifying the benefit of considering visual relation facts.
Tasks Question Answering, Semantic Similarity, Semantic Textual Similarity, Visual Question Answering
Published 2018-05-24
URL http://arxiv.org/abs/1805.09701v2
PDF http://arxiv.org/pdf/1805.09701v2.pdf
PWC https://paperswithcode.com/paper/r-vqa-learning-visual-relation-facts-with
Repo https://github.com/lupantech/rvqa
Framework none

Conditional GANs for Multi-Illuminant Color Constancy: Revolution or Yet Another Approach?

Title Conditional GANs for Multi-Illuminant Color Constancy: Revolution or Yet Another Approach?
Authors Oleksii Sidorov
Abstract Non-uniform and multi-illuminant color constancy are important tasks, the solution of which will allow discarding information about lighting conditions in the image. Non-uniform illumination and shadows distort colors of real-world objects and mostly do not contain valuable information. Thus, many computer vision and image processing techniques would benefit from automatic discarding of this information at the pre-processing step. In this work we propose a novel view of this classical problem via a generative end-to-end algorithm based on an image-conditioned Generative Adversarial Network. We also demonstrate the potential of the given approach for joint shadow detection and removal. Forced by the lack of training data, we render the largest existing shadow removal dataset and make it publicly available. It consists of approximately 6,000 pairs of wide field of view synthetic images with and without shadows.
Tasks Color Constancy, Shadow Detection, Shadow Detection And Removal
Published 2018-11-15
URL http://arxiv.org/abs/1811.06604v2
PDF http://arxiv.org/pdf/1811.06604v2.pdf
PWC https://paperswithcode.com/paper/conditional-gans-for-multi-illuminant-color
Repo https://github.com/acecreamu/angularGAN
Framework pytorch

Adversarial Autoencoders with Constant-Curvature Latent Manifolds

Title Adversarial Autoencoders with Constant-Curvature Latent Manifolds
Authors Daniele Grattarola, Lorenzo Livi, Cesare Alippi
Abstract Constant-curvature Riemannian manifolds (CCMs) have been shown to be ideal embedding spaces in many application domains, as their non-Euclidean geometry can naturally account for some relevant properties of data, like hierarchy and circularity. In this work, we introduce the CCM adversarial autoencoder (CCM-AAE), a probabilistic generative model trained to represent a data distribution on a CCM. Our method works by matching the aggregated posterior of the CCM-AAE with a probability distribution defined on a CCM, so that the encoder implicitly learns to represent data on the CCM to fool the discriminator network. The geometric constraint is also explicitly imposed by jointly training the CCM-AAE to maximise the membership degree of the embeddings to the CCM. While a few works in recent literature make use of either hyperspherical or hyperbolic manifolds for different learning tasks, ours is the first unified framework to seamlessly deal with CCMs of different curvatures. We show the effectiveness of our model on three different datasets characterised by non-trivial geometry: semi-supervised classification on MNIST, link prediction on two popular citation datasets, and graph-based molecule generation using the QM9 chemical database. Results show that our method improves upon other autoencoders based on Euclidean and non-Euclidean geometries on all tasks taken into account.
Tasks Link Prediction
Published 2018-12-11
URL http://arxiv.org/abs/1812.04314v2
PDF http://arxiv.org/pdf/1812.04314v2.pdf
PWC https://paperswithcode.com/paper/adversarial-autoencoders-with-constant
Repo https://github.com/danielegrattarola/ccm-aae
Framework none
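The explicit geometric constraint mentioned in the abstract — maximising the membership degree of embeddings to the CCM — has a particularly simple form in the positive-curvature case, where the manifold is a hypersphere. An illustrative sketch (squared-deviation penalty; the hyperbolic case would use the Minkowski inner product instead):

```python
import numpy as np

def spherical_membership_penalty(z, radius=1.0):
    """Penalty pushing embeddings onto a hyperspherical CCM (sketch).

    z : (n, d) latent embeddings.
    For positive curvature the CCM is a sphere of the given radius, so
    membership is measured by how far each embedding's norm deviates
    from that radius; minimising this term keeps the aggregated
    posterior on the manifold.
    """
    norms = np.linalg.norm(z, axis=1)
    return float(np.mean((norms - radius) ** 2))
```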

Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks

Title Automatic Ship Detection of Remote Sensing Images from Google Earth in Complex Scenes Based on Multi-Scale Rotation Dense Feature Pyramid Networks
Authors Xue Yang, Hao Sun, Kun Fu, Jirui Yang, Xian Sun, Menglong Yan, Zhi Guo
Abstract Ship detection has been playing a significant role in the field of remote sensing for a long time but it is still full of challenges. The main limitations of traditional ship detection methods usually lie in the complexity of application scenarios, the difficulty of intensive object detection and the redundancy of detection regions. In order to solve such problems, we propose a framework called Rotation Dense Feature Pyramid Networks (R-DFPN) which can effectively detect ships in different scenes including ocean and port. Specifically, we put forward the Dense Feature Pyramid Network (DFPN), which is aimed at solving the problem resulting from the narrow width of the ship. Compared with previous multi-scale detectors such as Feature Pyramid Network (FPN), DFPN builds the high-level semantic feature maps for all scales by means of dense connections, which enhances feature propagation and encourages feature reuse. Additionally, in the case of ship rotation and dense arrangement, we design a rotation anchor strategy to predict the minimum circumscribed rectangle of the object so as to reduce the redundant detection region and improve the recall. Furthermore, we also propose multi-scale ROI Align for the purpose of maintaining the completeness of semantic and spatial information. Experiments based on remote sensing images from Google Earth for ship detection show that our detection method based on the R-DFPN representation has state-of-the-art performance.
Tasks Object Detection
Published 2018-06-12
URL http://arxiv.org/abs/1806.04331v1
PDF http://arxiv.org/pdf/1806.04331v1.pdf
PWC https://paperswithcode.com/paper/automatic-ship-detection-of-remote-sensing
Repo https://github.com/DetectionTeamUCAS/RRPN_Faster-RCNN_Tensorflow
Framework tf

Pushing the bounds of dropout

Title Pushing the bounds of dropout
Authors Gábor Melis, Charles Blundell, Tomáš Kočiský, Karl Moritz Hermann, Chris Dyer, Phil Blunsom
Abstract We show that dropout training is best understood as performing MAP estimation concurrently for a family of conditional models whose objectives are themselves lower bounded by the original dropout objective. This discovery allows us to pick any model from this family after training, which leads to a substantial improvement on regularisation-heavy language modelling. The family includes models that compute a power mean over the sampled dropout masks, and their less stochastic subvariants with tighter and higher lower bounds than the fully stochastic dropout objective. We argue that since the deterministic subvariant’s bound is equal to its objective, and the highest amongst these models, the predominant view of it as a good approximation to MC averaging is misleading. Rather, deterministic dropout is the best available approximation to the true objective.
Tasks Language Modelling
Published 2018-05-23
URL http://arxiv.org/abs/1805.09208v2
PDF http://arxiv.org/pdf/1805.09208v2.pdf
PWC https://paperswithcode.com/paper/pushing-the-bounds-of-dropout
Repo https://github.com/deepmind/lamb
Framework tf
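The family of predictors the abstract describes — power means over the per-mask predictive distributions — is easy to state concretely. A sketch under the assumption that each sampled dropout mask yields one probability vector; p = 1 recovers standard MC averaging, and p → 0 approaches the geometric mean, whose log is the average log-probability:

```python
import numpy as np

def power_mean_predict(probs, p):
    """Power mean over per-mask predictive distributions (sketch).

    probs : (k, c) array, one class-probability vector per sampled
    dropout mask. Computes (mean_k probs^p)^(1/p) per class, then
    renormalises so the result sums to one.
    """
    m = np.mean(probs ** p, axis=0) ** (1.0 / p)
    return m / m.sum()
```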

Cauchy noise loss for stochastic optimization of random matrix models via free deterministic equivalents

Title Cauchy noise loss for stochastic optimization of random matrix models via free deterministic equivalents
Authors Tomohiro Hayase
Abstract For random matrix models, parameter estimation based on traditional likelihood functions is not straightforward, in particular when we have only one sample matrix. We introduce a new parameter optimization method for random matrix models which works even in such a case. The method is based on the spectral distribution instead of the traditional likelihood. In the method, Cauchy noise plays an essential role because the free deterministic equivalent, a tool in free probability theory, allows us to approximate the spectral distribution perturbed by Cauchy noise with a smooth and accessible density function. Moreover, we study an asymptotic property of the determination gap, which plays a similar role to the generalization gap. In addition, we propose a new dimensionality recovery method for the signal-plus-noise model, and experimentally demonstrate that it recovers the rank of the signal part even when the true rank is not small. It is a simultaneous rank selection and parameter estimation procedure.
Tasks Stochastic Optimization
Published 2018-04-09
URL https://arxiv.org/abs/1804.03154v4
PDF https://arxiv.org/pdf/1804.03154v4.pdf
PWC https://paperswithcode.com/paper/cauchy-noise-loss-for-stochastic-optimization
Repo https://github.com/ThayaFluss/cnl
Framework none

Standard Plane Detection in 3D Fetal Ultrasound Using an Iterative Transformation Network

Title Standard Plane Detection in 3D Fetal Ultrasound Using an Iterative Transformation Network
Authors Yuanwei Li, Bishesh Khanal, Benjamin Hou, Amir Alansary, Juan J. Cerrolaza, Matthew Sinclair, Jacqueline Matthew, Chandni Gupta, Caroline Knight, Bernhard Kainz, Daniel Rueckert
Abstract Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour-intensive and requires expert knowledge of fetal anatomy. We propose a new Iterative Transformation Network (ITN) for the automatic detection of standard planes in 3D volumes. ITN uses a convolutional neural network to learn the relationship between a 2D plane image and the transformation parameters required to move that plane towards the location/orientation of the standard plane in the 3D volume. During inference, the current plane image is passed iteratively to the network until it converges to the standard plane location. We explore the effect of using different transformation representations as regression outputs of ITN. Under a multi-task learning framework, we introduce additional classification probability outputs to the network to act as confidence measures for the regressed transformation parameters in order to further improve the localisation accuracy. When evaluated on 72 US volumes of fetal brain, our method achieves an error of 3.83mm/12.7 degrees and 3.80mm/12.6 degrees for the transventricular and transcerebellar planes respectively and takes 0.46s per plane. Source code is publicly available at https://github.com/yuanwei1989/plane-detection.
Tasks Multi-Task Learning
Published 2018-06-19
URL http://arxiv.org/abs/1806.07486v2
PDF http://arxiv.org/pdf/1806.07486v2.pdf
PWC https://paperswithcode.com/paper/standard-plane-detection-in-3d-fetal
Repo https://github.com/yuanwei1989/plane-detection
Framework tf
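ITN's inference procedure is a fixed-point iteration: the network predicts a transformation update from the current plane image, the plane is moved, and the loop repeats until the predicted step vanishes. A minimal sketch with the trained CNN stood in by an arbitrary callable (names and the additive update are illustrative; the paper explores several transformation representations):

```python
def iterative_refine(predict_delta, params, n_iter=50):
    """ITN-style inference loop (illustrative sketch).

    predict_delta stands in for the trained network: it maps the
    current plane parameters (e.g. translation + rotation) to an
    update moving the plane towards the standard plane.  Iterating
    converges when the predicted step shrinks to zero.
    """
    for _ in range(n_iter):
        delta = predict_delta(params)
        params = [p + d for p, d in zip(params, delta)]
    return params
```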

DPatch: An Adversarial Patch Attack on Object Detectors

Title DPatch: An Adversarial Patch Attack on Object Detectors
Authors Xin Liu, Huanrui Yang, Ziwei Liu, Linghao Song, Hai Li, Yiran Chen
Abstract Object detectors have emerged as an indispensable module in modern computer vision systems. In this work, we propose DPatch – a black-box adversarial-patch-based attack towards mainstream object detectors (i.e. Faster R-CNN and YOLO). Unlike the original adversarial patch, which only manipulates image-level classifiers, our DPatch simultaneously attacks the bounding box regression and object classification so as to disable their predictions. Compared to prior works, DPatch has several appealing properties: (1) DPatch can perform both untargeted and targeted effective attacks, degrading the mAP of Faster R-CNN and YOLO from 75.10% and 65.7% down to below 1%, respectively. (2) DPatch is small in size and its attacking effect is location-independent, making it very practical for implementing real-world attacks. (3) DPatch demonstrates great transferability among different detectors as well as training datasets. For example, DPatch trained on Faster R-CNN can effectively attack YOLO, and vice versa. Extensive evaluations imply that DPatch can perform effective attacks under a black-box setup, i.e., even without knowledge of the attacked network’s architecture and parameters. Successful realization of DPatch also illustrates the intrinsic vulnerability of modern detector architectures to such patch-based adversarial attacks.
Tasks Object Classification
Published 2018-06-05
URL http://arxiv.org/abs/1806.02299v4
PDF http://arxiv.org/pdf/1806.02299v4.pdf
PWC https://paperswithcode.com/paper/dpatch-an-adversarial-patch-attack-on-object
Repo https://github.com/veralauee/DPatch
Framework pytorch
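The location-independence property in point (2) corresponds to a simple mechanical step in training: the patch is pasted into the image at a (possibly randomised) position before each detector forward pass, and only the patch pixels are updated. A sketch of the pasting step (the gradient-based patch update against a real detector is omitted):

```python
import numpy as np

def apply_patch(image, patch, y, x):
    """Paste an adversarial patch into an image at (y, x) (sketch).

    Returns a copy so the clean image is preserved.  During DPatch
    training, (y, x) can be randomised so the learned patch disrupts
    the detector regardless of where it lands at test time.
    """
    img = image.copy()
    h, w = patch.shape[:2]
    img[y:y + h, x:x + w] = patch
    return img
```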