Paper Group AWR 322
Hierarchical Graph-to-Graph Translation for Molecules
Title | Hierarchical Graph-to-Graph Translation for Molecules |
Authors | Wengong Jin, Regina Barzilay, Tommi Jaakkola |
Abstract | Accelerating drug discovery relies heavily on automatic tools that optimize precursor molecules to endow them with better biochemical properties. Our work substantially extends prior state-of-the-art graph-to-graph translation methods for molecular optimization. In particular, we realize coherent multi-resolution representations by interweaving the encoding of substructure components with the atom-level encoding of the original molecular graph. Moreover, our graph decoder is fully autoregressive and interleaves each step of adding a new substructure with the process of resolving its attachment to the emerging molecule. We evaluate our model on multiple molecular optimization tasks and show that it significantly outperforms previous state-of-the-art baselines. |
Tasks | Drug Discovery, Graph-To-Graph Translation |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1907.11223v2 |
https://arxiv.org/pdf/1907.11223v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-resolution-autoregressive-graph-to |
Repo | https://github.com/wengong-jin/hgraph2graph |
Framework | pytorch |
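
The interleaved multi-resolution encoding described in the abstract can be pictured with a toy two-level message-passing encoder. This is a minimal sketch under assumed inputs (dense adjacency matrices and a precomputed atom-to-substructure assignment); it is not the architecture from the hgraph2graph repository.

```python
import torch
import torch.nn as nn

class TwoLevelEncoder(nn.Module):
    """Toy sketch: atom-level message passing, pooled into substructure
    nodes that exchange messages at a coarser level. All sizes, round
    counts, and the pooling rule are illustrative choices."""
    def __init__(self, atom_dim, hidden_dim):
        super().__init__()
        self.atom_mp = nn.Linear(atom_dim + hidden_dim, hidden_dim)
        self.sub_mp = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, atom_feats, atom_adj, assign, sub_adj):
        # atom_feats: (N, atom_dim); atom_adj: (N, N) 0/1 adjacency
        # assign: (S, N) 0/1 membership of atoms in S substructures
        # sub_adj: (S, S) adjacency between substructures
        h = torch.zeros(atom_feats.size(0), self.atom_mp.out_features)
        for _ in range(3):                      # atom-level rounds
            msg = atom_adj @ h                  # sum over neighbor states
            h = torch.relu(self.atom_mp(torch.cat([atom_feats, msg], -1)))
        # pool atom states into substructure states (mean over members)
        s = (assign @ h) / assign.sum(1, keepdim=True).clamp(min=1)
        for _ in range(2):                      # substructure-level rounds
            msg = sub_adj @ s
            s = torch.relu(self.sub_mp(torch.cat([s, msg], -1)))
        return h, s  # atom-level and substructure-level representations
```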
Image Synthesis From Reconfigurable Layout and Style
Title | Image Synthesis From Reconfigurable Layout and Style |
Authors | Wei Sun, Tianfu Wu |
Abstract | Despite remarkable recent progress on both unconditional and conditional image synthesis, it remains a long-standing problem to learn generative models that are capable of synthesizing realistic and sharp images from reconfigurable spatial layout (i.e., bounding boxes + class labels in an image lattice) and style (i.e., structural and appearance variations encoded by latent vectors), especially at high resolution. By reconfigurable, we mean that a model can preserve the intrinsic one-to-many mapping from a given layout to multiple plausible images with different styles, and is adaptive with respect to perturbations of a layout and style latent code. In this paper, we present a layout- and style-based architecture for generative adversarial networks (termed LostGANs) that can be trained end-to-end to generate images from reconfigurable layout and style. Inspired by the vanilla StyleGAN, the proposed LostGAN consists of two new components: (i) learning fine-grained mask maps in a weakly-supervised manner to bridge the gap between layouts and images, and (ii) learning object instance-specific layout-aware feature normalization (ISLA-Norm) in the generator to realize multi-object style generation. In experiments, the proposed method is tested on the COCO-Stuff dataset and the Visual Genome dataset with state-of-the-art performance obtained. The code and pretrained models are available at https://github.com/iVMCL/LostGANs. |
Tasks | Image Generation, Layout-to-Image Generation |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07500v1 |
https://arxiv.org/pdf/1908.07500v1.pdf | |
PWC | https://paperswithcode.com/paper/image-synthesis-from-reconfigurable-layout |
Repo | https://github.com/iVMCL/LostGANs |
Framework | pytorch |
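
The ISLA-Norm component named in the abstract can be approximated in a few lines: per-object style codes predict affine parameters that are spread over the feature map through soft object masks. A hedged sketch follows; the layer names, mask source, and normalization choice are assumptions rather than the released LostGANs code.

```python
import torch
import torch.nn as nn

class ISLANormSketch(nn.Module):
    """Sketch of instance-specific layout-aware normalization: each of
    the K objects contributes its own affine parameters, weighted by
    its (soft) mask over the spatial grid."""
    def __init__(self, num_features, style_dim):
        super().__init__()
        self.norm = nn.BatchNorm2d(num_features, affine=False)
        self.to_gamma = nn.Linear(style_dim, num_features)
        self.to_beta = nn.Linear(style_dim, num_features)

    def forward(self, x, styles, masks):
        # x: (B, C, H, W); styles: (B, K, style_dim), one code per object
        # masks: (B, K, H, W) soft object masks
        h = self.norm(x)
        gamma = self.to_gamma(styles)  # (B, K, C)
        beta = self.to_beta(styles)    # (B, K, C)
        # spread per-object affine parameters through the masks
        g = torch.einsum('bkc,bkhw->bchw', gamma, masks)
        b = torch.einsum('bkc,bkhw->bchw', beta, masks)
        return h * (1 + g) + b
```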
Incidental Supervision from Question-Answering Signals
Title | Incidental Supervision from Question-Answering Signals |
Authors | Hangfeng He, Qiang Ning, Dan Roth |
Abstract | Human annotations are costly for many natural language processing (NLP) tasks, especially those requiring NLP expertise. One promising solution is to use natural language to annotate natural language. However, how to obtain supervision signals or learn representations from natural language annotations remains an open problem. This paper studies the case where the annotations are in the format of question-answering (QA) and proposes an effective way to learn useful representations for other tasks. We also find that representations retrieved from question-answer meaning representation (QAMR) data can almost universally improve performance on a wide range of tasks, suggesting that this kind of natural language annotation indeed provides unique information on top of modern language models. |
Tasks | Question Answering |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00333v1 |
https://arxiv.org/pdf/1909.00333v1.pdf | |
PWC | https://paperswithcode.com/paper/incidental-supervision-from-question |
Repo | https://github.com/HornHehhf/ISfromQA |
Framework | pytorch |
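
In the spirit of the paper, QA-derived representations can be pulled from a pretrained encoder and concatenated with a downstream task's own features. The sketch below uses the stock `bert-base-uncased` checkpoint purely as a stand-in for an encoder trained on QAMR-style data; the paper's actual checkpoints and probing setup are not reproduced here.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Stand-in encoder: in the paper's setting this would be a model
# trained on QAMR-style question-answering annotations.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
qa_encoder = AutoModel.from_pretrained("bert-base-uncased")

@torch.no_grad()
def qa_features(sentence: str) -> torch.Tensor:
    """Token-level features to feed a downstream task alongside its
    own representations."""
    batch = tok(sentence, return_tensors="pt")
    return qa_encoder(**batch).last_hidden_state.squeeze(0)  # (T, hidden)
```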
RPGAN: GANs Interpretability via Random Routing
Title | RPGAN: GANs Interpretability via Random Routing |
Authors | Andrey Voynov, Artem Babenko |
Abstract | In this paper, we introduce Random Path Generative Adversarial Network (RPGAN) – an alternative design of GANs that can serve as a tool for generative model analysis. While the latent space of a typical GAN consists of input vectors randomly sampled from the standard Gaussian distribution, the latent space of RPGAN consists of random paths in the generator network. As we show, this design allows us to understand the factors of variation captured by different generator layers, providing natural interpretability. With experiments on standard benchmarks, we demonstrate that RPGAN reveals several interesting insights about the roles that different layers play in the image generation process. Aside from interpretability, the RPGAN model also provides competitive generation quality and allows efficient incremental learning on new data. |
Tasks | Image Generation |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10920v2 |
https://arxiv.org/pdf/1912.10920v2.pdf | |
PWC | https://paperswithcode.com/paper/rpgan-gans-interpretability-via-random |
Repo | https://github.com/anvoynov/RandomPathGAN |
Framework | pytorch |
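
The random-routing idea reads directly as code: every generator layer holds several parallel blocks, and the "latent code" is the tuple of block indices chosen at each layer. A minimal sketch, with the block structure and sizes invented for illustration:

```python
import torch
import torch.nn as nn

class RandomPathLayer(nn.Module):
    """One RPGAN-style layer: several parallel blocks, of which a path
    index selects exactly one per forward pass."""
    def __init__(self, num_blocks, channels):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_blocks)
        )

    def forward(self, x, idx):
        return self.blocks[idx](x)

layers = nn.ModuleList(RandomPathLayer(num_blocks=5, channels=16) for _ in range(4))
x = torch.randn(1, 16, 8, 8)
path = torch.randint(0, 5, (len(layers),))  # the random path replaces the usual z
for layer, idx in zip(layers, path):
    x = layer(x, idx.item())
```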
Neural Network Branching for Neural Network Verification
Title | Neural Network Branching for Neural Network Verification |
Authors | Jingyue Lu, M. Pawan Kumar |
Abstract | Formal verification of neural networks is essential for their deployment in safety-critical areas. Many available formal verification methods have been shown to be instances of a unified Branch and Bound (BaB) formulation. We propose a novel framework for designing an effective branching strategy for BaB. Specifically, we learn a graph neural network (GNN) to imitate the strong branching heuristic behaviour. Our framework differs from previous methods for learning to branch in two main aspects. Firstly, our framework directly treats the neural network we want to verify as a graph input for the GNN. Secondly, we develop an intuitive forward and backward embedding update schedule. Empirically, our framework achieves roughly a 50% reduction in both the number of branches and the time required for verification on various convolutional networks when compared to the best available hand-designed branching strategy. In addition, we show that our GNN model enjoys both horizontal and vertical transferability. Horizontally, the model trained on easy properties performs well on properties of increased difficulty levels. Vertically, the model trained on small neural networks achieves similar performance on large neural networks. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01329v1 |
https://arxiv.org/pdf/1912.01329v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-branching-for-neural-network-1 |
Repo | https://github.com/oval-group/GNN_branching |
Framework | pytorch |
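
The BaB formulation the paper builds on can be summarized by a generic loop in which a branching heuristic decides what to split next; in the paper that heuristic is a GNN imitating strong branching. Below is a hedged sketch where `score_branch`, `split`, and the bound functions are abstract placeholders, not the interface of the GNN_branching repository:

```python
import heapq

def branch_and_bound(root, lower_bound, upper_bound, split, score_branch, eps=1e-3):
    """Generic BaB loop for verification: maintain a heap of domains
    ordered by lower bound, prune domains that cannot beat the best
    upper bound, and let `score_branch` (the learned GNN in the paper)
    choose the branching decision for each domain."""
    best_ub = upper_bound(root)
    heap = [(lower_bound(root), 0, root)]
    tick = 1  # tie-breaker so domains are never compared directly
    while heap:
        lb, _, dom = heapq.heappop(heap)
        if lb > best_ub - eps:
            continue  # this domain cannot improve on the incumbent: prune
        decision = score_branch(dom)       # GNN picks the neuron to split on
        for child in split(dom, decision):
            clb, cub = lower_bound(child), upper_bound(child)
            best_ub = min(best_ub, cub)
            if clb < best_ub - eps:
                heapq.heappush(heap, (clb, tick, child))
                tick += 1
    return best_ub
```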
Optimizing Deep Neural Networks with Multiple Search Neuroevolution
Title | Optimizing Deep Neural Networks with Multiple Search Neuroevolution |
Authors | Ahmed Aly, David Weikersdorfer, Claire Delaunay |
Abstract | This paper presents an evolutionary metaheuristic called Multiple Search Neuroevolution (MSN) to optimize deep neural networks. The algorithm attempts to search multiple promising regions of the search space simultaneously, maintaining sufficient distance between them. It is tested by training neural networks for two tasks and compared with other optimization algorithms. The first task is to solve global optimization functions with challenging topographies. We found MSN to outperform classic optimization algorithms such as Evolution Strategies, reducing the number of optimization steps performed by at least 2x. The second task is to train a convolutional neural network (CNN) on the popular MNIST dataset. Using 3.33% of the training set, MSN reaches a validation accuracy of 90%. Stochastic Gradient Descent (SGD) was able to match the same accuracy figure while taking 7x fewer optimization steps. Despite lagging behind SGD, the fact that the MSN metaheuristic trains a 4.7M-parameter CNN suggests promise for future development. This is by far the largest network ever evolved using a pool of only 50 samples. |
Tasks | |
Published | 2019-01-17 |
URL | http://arxiv.org/abs/1901.05988v1 |
http://arxiv.org/pdf/1901.05988v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-deep-neural-networks-with-multiple |
Repo | https://github.com/AroMorin/DNNOP |
Framework | pytorch |
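
A rough sketch of one MSN iteration, for intuition only: several anchors each spawn a small pool of perturbed samples, move to their best sample, and are pushed apart if they collapse onto one region. The repulsion rule and hyperparameters here are assumptions, not the paper's exact operators.

```python
import torch

def msn_step(anchors, fitness, sigma=0.1, min_dist=1.0, pool=50):
    """One sketched MSN step over a shared sample budget `pool`;
    `fitness` maps a (n, d) batch of parameter vectors to (n,) losses."""
    new_anchors = []
    per_anchor = pool // len(anchors)
    for a in anchors:
        samples = a + sigma * torch.randn(per_anchor, a.numel())
        best = samples[fitness(samples).argmin()]  # greedy move to best sample
        # crude repulsion: re-perturb if too close to an accepted anchor
        while any((best - b).norm() < min_dist for b in new_anchors):
            best = best + min_dist * torch.randn_like(best)
        new_anchors.append(best)
    return new_anchors

# usage on a toy quadratic: anchors explore distinct basins
f = lambda xs: (xs ** 2).sum(dim=1)
anchors = [torch.randn(3) * 5 for _ in range(4)]
for _ in range(100):
    anchors = msn_step(anchors, f)
```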
FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture
Title | FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture |
Authors | Wenjun Liu, Yuchun Huang, Ying Li, Qi Chen |
Abstract | Timely, accurate and automatic detection of pavement cracks is necessary for making cost-effective decisions concerning road maintenance. Conventional crack detection algorithms focus on the design of single or multiple crack features and classifiers. However, complicated topological structures, varying degrees of damage and oil stains make the design of crack features difficult. In addition, the contextual information around a crack is not investigated extensively in the design process. Accordingly, such hand-designed features have limited discriminative adaptability and cannot fuse effectively with the classifiers. To solve these problems, this paper proposes a deep learning network for pavement crack detection. Using the Encoder-Decoder structure, crack characteristics with multiple contexts are automatically learned, and end-to-end crack detection is achieved. Specifically, we first propose the Multi-Dilation (MD) module, which can synthesize the crack features of multiple context sizes via dilated convolution with multiple rates. The crack MD features obtained in this module can describe cracks of different widths and topologies. Next, we propose the SE-Upsampling (SEU) module, which uses the Squeeze-and-Excitation learning operation to optimize the MD features. Finally, the above two modules are integrated to develop the fast crack detection network, namely, FPCNet. This network continuously optimizes the MD features step-by-step to realize fast pixel-level crack detection. Experiments are conducted on challenging public CFD datasets and G45 crack datasets involving various crack types under different shooting conditions. The distinct performance and speed improvements over all the datasets demonstrate that the proposed method outperforms other state-of-the-art crack detection methods. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02248v1 |
https://arxiv.org/pdf/1907.02248v1.pdf | |
PWC | https://paperswithcode.com/paper/fpcnet-fast-pavement-crack-detection-network |
Repo | https://github.com/YuchunHuang/FPCNet |
Framework | none |
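
The Multi-Dilation (MD) module lends itself to a compact sketch: parallel 3x3 convolutions with different dilation rates see crack context at several sizes, and a 1x1 convolution fuses them. The rates and the fusion step below are illustrative choices, not the exact FPCNet configuration:

```python
import torch
import torch.nn as nn

class MultiDilationSketch(nn.Module):
    """Parallel dilated branches capture context at multiple sizes;
    padding equals the dilation rate so spatial size is preserved."""
    def __init__(self, channels, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        feats = [torch.relu(b(x)) for b in self.branches]
        return self.fuse(torch.cat(feats, dim=1))
```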
Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression
Title | Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression |
Authors | Mu Li, Kede Ma, Jane You, David Zhang, Wangmeng Zuo |
Abstract | Precise estimation of the probabilistic structure of natural images plays an essential role in image compression. Despite the recent remarkable success of end-to-end optimized image compression, the latent codes are usually assumed to be fully statistically factorized in order to simplify entropy modeling. However, this assumption generally does not hold true and may hinder compression performance. Here we present context-based convolutional networks (CCNs) for efficient and effective entropy modeling. In particular, a 3D zigzag scanning order and a 3D code dividing technique are introduced to define proper coding contexts for parallel entropy decoding, both of which boil down to placing translation-invariant binary masks on the convolution filters of CCNs. We demonstrate the promise of CCNs for entropy modeling in both lossless and lossy image compression. For the former, we directly apply a CCN to the binarized representation of an image to compute the Bernoulli distribution of each code for entropy estimation. For the latter, the categorical distribution of each code is represented by a discretized mixture of Gaussian distributions, whose parameters are estimated by three CCNs. We then jointly optimize the CCN-based entropy model along with analysis and synthesis transforms for rate-distortion performance. Experiments on the Kodak and Tecnick datasets show that our methods powered by the proposed CCNs generally achieve comparable compression performance to the state-of-the-art while being much faster. |
Tasks | Image Compression |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10057v2 |
https://arxiv.org/pdf/1906.10057v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-and-effective-context-based |
Repo | https://github.com/limuhit/SCAE |
Framework | none |
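
The "translation-invariant binary masks on convolution filters" can be illustrated with a raster-order masked convolution, where each code is predicted only from already-decoded neighbors. Note this is a simplification: the paper's 3D zigzag scanning and code-dividing scheme are more elaborate than the 2D mask below.

```python
import torch
import torch.nn as nn

class MaskedConv2dSketch(nn.Conv2d):
    """Convolution whose kernel is zeroed at the current position and
    at all 'future' positions in raster order, so predictions depend
    only on codes that have already been decoded."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        kh, kw = self.kernel_size
        mask = torch.ones(kh, kw)
        mask[kh // 2, kw // 2:] = 0   # current position and to its right
        mask[kh // 2 + 1:, :] = 0     # all rows below
        self.register_buffer("mask", mask)

    def forward(self, x):
        return nn.functional.conv2d(
            x, self.weight * self.mask, self.bias,
            self.stride, self.padding, self.dilation, self.groups)
```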
Adaptive Attention Span in Transformers
Title | Adaptive Attention Span in Transformers |
Authors | Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin |
Abstract | We propose a novel self-attention mechanism that can learn its optimal attention span. This allows us to significantly extend the maximum context size used in Transformers while maintaining control over memory footprint and computation time. We show the effectiveness of our approach on the task of character-level language modeling, where we achieve state-of-the-art performance on text8 and enwik8 by using a maximum context of 8k characters. |
Tasks | Language Modelling |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07799v2 |
https://arxiv.org/pdf/1905.07799v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-attention-span-in-transformers |
Repo | https://github.com/facebookresearch/adaptive-span |
Framework | pytorch |
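
The core of the method is a soft mask with a learnable span z: attention weights beyond distance z are ramped down to zero over a width R, so z receives gradients and shorter spans cost less memory and compute. A sketch of that masking function, with shapes simplified to a single head:

```python
import torch
import torch.nn as nn

class AdaptiveSpanMask(nn.Module):
    """Soft span mask m(x) = clamp((R + z - x) / R, 0, 1), where x is
    the distance to the current position, z is learnable, and R is the
    ramp width. Applied to attention weights before renormalizing."""
    def __init__(self, max_span, ramp=32):
        super().__init__()
        self.z = nn.Parameter(torch.tensor(float(max_span)))
        self.ramp = ramp
        # distances of the last max_span positions: [max_span, ..., 1]
        self.register_buffer("dist", torch.arange(max_span, 0, -1).float())

    def forward(self, attn):
        # attn: (..., max_span) raw attention over recent positions
        mask = ((self.ramp + self.z - self.dist) / self.ramp).clamp(0, 1)
        attn = attn * mask
        return attn / attn.sum(dim=-1, keepdim=True).clamp(min=1e-8)
```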
Inductive general game playing
Title | Inductive general game playing |
Authors | Andrew Cropper, Richard Evans, Mark Law |
Abstract | General game playing (GGP) is a framework for evaluating an agent’s general intelligence across a wide range of tasks. In the GGP competition, an agent is given the rules of a game (described as a logic program) that it has never seen before. The task is for the agent to play the game, thus generating game traces. The winner of the GGP competition is the agent that gets the best total score over all the games. In this paper, we invert this task: a learner is given game traces and the task is to learn the rules that could produce the traces. This problem is central to inductive general game playing (IGGP). We introduce a technique that automatically generates IGGP tasks from GGP games. We introduce an IGGP dataset which contains traces from 50 diverse games, such as Sudoku, Sokoban, and Checkers. We claim that IGGP is difficult for existing inductive logic programming (ILP) approaches. To support this claim, we evaluate existing ILP systems on our dataset. Our empirical results show that most of the games cannot be correctly learned by existing systems. The best performing system solves only 40% of the tasks perfectly. Our results suggest that IGGP poses many challenges to existing approaches. Furthermore, because we can automatically generate IGGP tasks from GGP games, our dataset will continue to grow with the GGP competition, as new games are added every year. We therefore think that the IGGP problem and dataset will be valuable for motivating and evaluating future research. |
Tasks | |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09627v1 |
https://arxiv.org/pdf/1906.09627v1.pdf | |
PWC | https://paperswithcode.com/paper/inductive-general-game-playing |
Repo | https://github.com/andrewcropper/mlj19-iggp |
Framework | none |
Quality of syntactic implication of RL-based sentence summarization
Title | Quality of syntactic implication of RL-based sentence summarization |
Authors | Hoa T. Le, Christophe Cerisara, Claire Gardent |
Abstract | Work on summarization has explored both reinforcement learning (RL) optimization using ROUGE as a reward and syntax-aware models, such as models whose input is enriched with part-of-speech (POS) tags and dependency information. However, it is not clear what the respective impact of these approaches is beyond the standard ROUGE evaluation metric, especially as RL-based summarization is becoming more and more popular. In this paper, we provide a detailed comparison of these two approaches, and of their combination, along several dimensions that relate to the perceived quality of the generated summaries: number of repeated words, distribution of part-of-speech tags, impact of sentence length, relevance and grammaticality. Using the standard Gigaword sentence summarization task, we compare an RL self-critical sequence training (SCST) method with syntax-aware models that leverage POS tags and dependency information. We show that the combined model gives the best results on all qualitative evaluations, but also that training with RL alone, without any syntactic information, already gives nearly as good results as syntax-aware models, with fewer parameters and faster training convergence. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05493v1 |
https://arxiv.org/pdf/1912.05493v1.pdf | |
PWC | https://paperswithcode.com/paper/quality-of-syntactic-implication-of-rl-based |
Repo | https://github.com/lethienhoa/Eval-RL |
Framework | pytorch |
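
For reference, the SCST objective used in the compared RL method rewards sampled summaries that beat the greedy decode's ROUGE score. A minimal sketch, leaving the ROUGE scorer and the decoding itself abstract:

```python
import torch

def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical sequence training: the greedy decode's reward is
    the baseline, so sampled sequences scoring above it are reinforced
    and those below it are suppressed."""
    advantage = sample_reward - greedy_reward       # (batch,)
    seq_logprob = sample_logprobs.sum(dim=1)        # sum per-token log-probs
    return -(advantage.detach() * seq_logprob).mean()

# usage with dummy values standing in for ROUGE scores
lp = torch.randn(4, 20, requires_grad=True)  # per-token log-probs of samples
loss = scst_loss(lp, torch.tensor([0.3, 0.5, 0.2, 0.4]),
                 torch.tensor([0.35, 0.45, 0.25, 0.35]))
loss.backward()
```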
Cost Volume Pyramid Based Depth Inference for Multi-View Stereo
Title | Cost Volume Pyramid Based Depth Inference for Multi-View Stereo |
Authors | Jiayu Yang, Wei Mao, Jose M. Alvarez, Miaomiao Liu |
Abstract | We propose a cost volume-based neural network for depth inference from multi-view images. We demonstrate that building a cost volume pyramid in a coarse-to-fine manner, instead of constructing a cost volume at a fixed resolution, leads to a compact, lightweight network and allows us to infer high-resolution depth maps for better reconstruction results. To this end, we first build a cost volume based on uniform sampling of fronto-parallel planes across the entire depth range at the coarsest resolution of an image. Then, given the current depth estimate, we iteratively construct new cost volumes on the pixelwise depth residual to refine the depth map. While sharing a similar insight with Point-MVSNet in predicting and refining depth iteratively, we show that working on a cost volume pyramid leads to a more compact yet efficient network structure than Point-MVSNet's processing of 3D points. We further provide a detailed analysis of the relation between (residual) depth sampling and image resolution, which serves as a principle for building a compact cost volume pyramid. Experimental results on benchmark datasets show that our model runs 6x faster while achieving performance similar to state-of-the-art methods. Code is available at https://github.com/JiayuYANG/CVP-MVSNet |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08329v3 |
https://arxiv.org/pdf/1912.08329v3.pdf | |
PWC | https://paperswithcode.com/paper/cost-volume-pyramid-based-depth-inference-for |
Repo | https://github.com/JiayuYANG/CVP-MVSNet |
Framework | pytorch |
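
The refinement step of the pyramid can be sketched by the hypothesis layout alone: around the current upsampled depth map, a handful of per-pixel residual depth hypotheses are placed, over which the next cost volume (omitted here) is built and regularized. Shapes and the interval schedule below are illustrative:

```python
import torch

def residual_hypotheses(depth, interval, num=8):
    """Lay out `num` per-pixel depth hypotheses around the current
    estimate; each finer pyramid level shrinks `interval`, so the
    search is coarse-to-fine rather than at one fixed resolution."""
    offsets = torch.linspace(-1, 1, num) * interval          # (num,)
    return depth.unsqueeze(1) + offsets.view(1, num, 1, 1)   # (B, num, H, W)

# toy usage: a coarse depth map of shape (B, H, W)
depth = torch.full((2, 32, 40), 5.0)
hyp = residual_hypotheses(depth, interval=0.5)
print(hyp.shape)  # torch.Size([2, 8, 32, 40])
```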
Progressive Image Deraining Networks: A Better and Simpler Baseline
Title | Progressive Image Deraining Networks: A Better and Simpler Baseline |
Authors | Dongwei Ren, Wangmeng Zuo, Qinghua Hu, Pengfei Zhu, Deyu Meng |
Abstract | Along with the deraining performance improvement of deep networks, their structures and learning become more and more complicated and diverse, making it difficult to analyze the contribution of various network modules when developing new deraining networks. To handle this issue, this paper provides a better and simpler baseline deraining network by considering network architecture, input and output, and loss functions. Specifically, by repeatedly unfolding a shallow ResNet, progressive ResNet (PRN) is proposed to take advantage of recursive computation. A recurrent layer is further introduced to exploit the dependencies of deep features across stages, forming our progressive recurrent network (PReNet). Furthermore, intra-stage recursive computation of ResNet can be adopted in PRN and PReNet to notably reduce network parameters with graceful degradation in deraining performance. For network input and output, we take both the stage-wise result and the original rainy image as input to each ResNet and finally output the predicted residual image. As for loss functions, a single MSE or negative SSIM loss is sufficient to train PRN and PReNet. Experiments show that PRN and PReNet perform favorably on both synthetic and real rainy images. Considering their simplicity, efficiency and effectiveness, our models are expected to serve as a suitable baseline in future deraining research. The source codes are available at https://github.com/csdwren/PReNet. |
Tasks | Rain Removal |
Published | 2019-01-26 |
URL | https://arxiv.org/abs/1901.09221v3 |
https://arxiv.org/pdf/1901.09221v3.pdf | |
PWC | https://paperswithcode.com/paper/progressive-image-deraining-networks-a-better |
Repo | https://github.com/csdwren/PReNet |
Framework | pytorch |
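
The progressive design reduces to a short loop: a shallow residual body with shared weights is unfolded over T stages, each stage re-consuming the original rainy image concatenated with the previous estimate. A hedged sketch (the inter-stage recurrent layer of PReNet is omitted for brevity, making this closer to the paper's PRN variant):

```python
import torch
import torch.nn as nn

class PRNSketch(nn.Module):
    """Progressive deraining sketch: the same shallow body is applied
    for `stages` rounds; each round sees the rainy input concatenated
    with the current derained estimate."""
    def __init__(self, channels=32, stages=6):
        super().__init__()
        self.stages = stages
        self.head = nn.Conv2d(6, channels, 3, padding=1)  # rainy + estimate
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, rainy):
        x = rainy
        for _ in range(self.stages):  # weights shared across stages
            h = torch.relu(self.head(torch.cat([rainy, x], dim=1)))
            h = h + self.body(h)      # shallow residual body
            x = self.tail(h)          # updated derained estimate
        return x
```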
Symmetry-Based Disentangled Representation Learning requires Interaction with Environments
Title | Symmetry-Based Disentangled Representation Learning requires Interaction with Environments |
Authors | Hugo Caselles-Dupré, Michael Garcia-Ortiz, David Filliat |
Abstract | Finding a generally accepted formal definition of a disentangled representation in the context of an agent behaving in an environment is an important challenge towards the construction of data-efficient autonomous agents. Higgins et al. recently proposed Symmetry-Based Disentangled Representation Learning, a definition based on a characterization of symmetries in the environment using group theory. We build on their work and make theoretical and empirical observations that lead us to argue that Symmetry-Based Disentangled Representation Learning cannot be based only on static observations: agents should interact with the environment to discover its symmetries. Our experiments can be reproduced in Colab and the code is available on GitHub. |
Tasks | Representation Learning |
Published | 2019-03-30 |
URL | https://arxiv.org/abs/1904.00243v3 |
https://arxiv.org/pdf/1904.00243v3.pdf | |
PWC | https://paperswithcode.com/paper/symmetry-based-disentangled-representation |
Repo | https://github.com/Caselles/NeurIPS19-SBDRL |
Framework | pytorch |
Equivariant Transformer Networks
Title | Equivariant Transformer Networks |
Authors | Kai Sheng Tai, Peter Bailis, Gregory Valiant |
Abstract | How can prior knowledge on the transformation invariances of a domain be incorporated into the architecture of a neural network? We propose Equivariant Transformers (ETs), a family of differentiable image-to-image mappings that improve the robustness of models towards pre-defined continuous transformation groups. Through the use of specially-derived canonical coordinate systems, ETs incorporate functions that are equivariant by construction with respect to these transformations. We show empirically that ETs can be flexibly composed to improve model robustness towards more complicated transformation groups with several parameters. On a real-world image classification task, ETs improve the sample efficiency of ResNet classifiers, achieving relative improvements in error rate of up to 15% in the limited data regime while increasing model parameter count by less than 1%. |
Tasks | Image Classification |
Published | 2019-01-25 |
URL | https://arxiv.org/abs/1901.11399v2 |
https://arxiv.org/pdf/1901.11399v2.pdf | |
PWC | https://paperswithcode.com/paper/equivariant-transformer-networks |
Repo | https://github.com/stanford-futuredata/equivariant-transformers |
Framework | pytorch |
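
One concrete instance of a "specially-derived canonical coordinate system" is the log-polar transform, under which rotations and scalings about the image center become translations, for which convolutions are already equivariant. The sketch below illustrates that idea with `grid_sample`; the paper derives such coordinates for general transformation groups rather than hard-coding this one:

```python
import math
import torch
import torch.nn.functional as F

def log_polar_grid(size: int, radius: float = 1.0) -> torch.Tensor:
    """Build a (1, H, W, 2) sampling grid in [-1, 1] coordinates whose
    rows index log-radius and whose columns index angle, so rotation
    and scaling about the center act as translations after resampling."""
    theta = torch.linspace(-math.pi, math.pi, size)            # angle axis
    r = radius * torch.exp(torch.linspace(-4.0, 0.0, size))    # log-radius axis
    x = r.view(size, 1) * torch.cos(theta).view(1, size)
    y = r.view(size, 1) * torch.sin(theta).view(1, size)
    return torch.stack([x, y], dim=-1).unsqueeze(0)

img = torch.randn(1, 3, 64, 64)
canonical = F.grid_sample(img, log_polar_grid(64), align_corners=False)
```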