February 1, 2020

Paper Group AWR 327

GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud


Title	GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud
Authors	Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos
Abstract	Exploiting fine-grained semantic features on point cloud is still challenging due to its irregular and sparse structure in a non-Euclidean space. Among existing studies, PointNet provides an efficient and promising approach to learn shape features directly on unordered 3D point cloud and has achieved competitive performance. However, local feature that is helpful towards better contextual learning is not considered. Meanwhile, attention mechanism shows efficiency in capturing node representation on graph-based data by attending over neighboring nodes. In this paper, we propose a novel neural network for point cloud, dubbed GAPNet, to learn local geometric representations by embedding graph attention mechanism within stacked Multi-Layer-Perceptron (MLP) layers. Firstly, we introduce a GAPLayer to learn attention features for each point by highlighting different attention weights on neighborhood. Secondly, in order to exploit sufficient features, a multi-head mechanism is employed to allow GAPLayer to aggregate different features from independent heads. Thirdly, we propose an attention pooling layer over neighbors to capture local signature aimed at enhancing network robustness. Finally, GAPNet applies stacked MLP layers to attention features and local signature to fully extract local geometric structures. The proposed GAPNet architecture is tested on the ModelNet40 and ShapeNet part datasets, and achieves state-of-the-art performance in both shape classification and part segmentation tasks.
Tasks
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08705v1
PDF	https://arxiv.org/pdf/1905.08705v1.pdf
PWC	https://paperswithcode.com/paper/gapnet-graph-attention-based-point-neural
Repo	https://github.com/FrankCAN/GAPNet
Framework	tf
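
The neighborhood attention idea in the abstract above can be illustrated with a minimal sketch. It is not the GAPNet implementation: the function names `knn` and `attention_pool` are hypothetical, and the hand-written `dot_score` stands in for the scoring function that the network would learn. The sketch only shows the general pattern of picking the k nearest neighbors of a point, turning scores into softmax attention weights, and pooling neighbor features with those weights.

```python
import math


def knn(points, i, k):
    """Indices of the k points closest to points[i] (excluding i itself)."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return [j for _, j in dists[:k]]


def attention_pool(points, features, i, k, score):
    """Softmax-weighted sum of the features of the k nearest neighbors of i.

    `score` is an assumed stand-in for a learned attention function.
    """
    neighbours = knn(points, i, k)
    logits = [score(features[i], features[j]) for j in neighbours]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[i])
    pooled = [
        sum(w * features[j][d] for w, j in zip(weights, neighbours))
        for d in range(dim)
    ]
    return weights, pooled


def dot_score(f_i, f_j):
    """Hypothetical score: leaky ReLU of the dot product of two feature vectors."""
    s = sum(a * b for a, b in zip(f_i, f_j))
    return s if s > 0 else 0.2 * s


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
weights, pooled = attention_pool(points, features, 0, 2, dot_score)
```

In this toy input the distant fourth point is never selected as a neighbor, and the neighbor whose features align better with the query point receives the larger weight.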

Fine-Tuning Language Models from Human Preferences


Title	Fine-Tuning Language Models from Human Preferences
Authors	Daniel M. Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, Geoffrey Irving
Abstract	Reward learning enables the application of reinforcement learning (RL) to tasks where reward is defined by human judgment, building a model of reward by asking humans questions. Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and safe for real-world tasks. In this paper, we build on advances in generative pretraining of language models to apply reward learning to four natural language tasks: continuing text with positive sentiment or physically descriptive language, and summarization tasks on the TL;DR and CNN/Daily Mail datasets. For stylistic continuation we achieve good results with only 5,000 comparisons evaluated by humans. For summarization, models trained with 60,000 comparisons copy whole sentences from the input but skip irrelevant preamble; this leads to reasonable ROUGE scores and very good performance according to our human labelers, but may be exploiting the fact that labelers rely on simple heuristics.
Tasks	Language Modelling
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08593v2
PDF	https://arxiv.org/pdf/1909.08593v2.pdf
PWC	https://paperswithcode.com/paper/fine-tuning-language-models-from-human
Repo	https://github.com/openai/lm-human-preferences
Framework	tf
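
To make "building a model of reward" from human comparisons concrete, here is a minimal sketch of one common formulation: treat the reward model's scores for the candidate continuations as logits, and minimize the cross-entropy against the option the human picked. The function name `preference_loss` and the example numbers are assumptions for illustration; the abstract does not specify the loss, so this should not be read as the paper's exact training objective.

```python
import math


def preference_loss(rewards, chosen):
    """Negative log-probability of the human-chosen option.

    `rewards` holds one scalar reward per candidate; `chosen` is the index
    the labeler preferred. Probabilities come from a softmax over rewards.
    """
    top = max(rewards)  # log-sum-exp with the maximum factored out
    log_z = top + math.log(sum(math.exp(r - top) for r in rewards))
    return log_z - rewards[chosen]


# A reward model that cannot tell four candidates apart.
uninformed = preference_loss([0.0, 0.0, 0.0, 0.0], chosen=2)

# A reward model that scores the chosen candidate higher.
informed = preference_loss([0.0, 0.0, 3.0, 0.0], chosen=2)
```

The loss falls as the reward model assigns a higher score to the option the labeler preferred, which is the signal that shapes the learned reward.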

Implicit Regularization in Deep Matrix Factorization


Title	Implicit Regularization in Deep Matrix Factorization
Authors	Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
Abstract	Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low “complexity.” We study the implicit regularization of gradient descent over deep linear neural networks for matrix completion and sensing, a model referred to as deep matrix factorization. Our first finding, supported by theory and experiments, is that adding depth to a matrix factorization enhances an implicit tendency towards low-rank solutions, oftentimes leading to more accurate recovery. Secondly, we present theoretical and empirical arguments questioning a nascent view by which implicit regularization in matrix factorization can be captured using simple mathematical norms. Our results point to the possibility that the language of standard regularizers may not be rich enough to fully encompass the implicit regularization brought forth by gradient-based optimization.
Tasks	Matrix Completion
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13655v3
PDF	https://arxiv.org/pdf/1905.13655v3.pdf
PWC	https://paperswithcode.com/paper/implicit-regularization-in-deep-matrix
Repo	https://github.com/roosephu/deep_matrix_factorization
Framework	none
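
The deep matrix factorization setup described in the abstract above can be written out as follows. This is a sketch under assumed notation: the symbols $W_j$, $N$, $\Omega$, and $M$ are labels chosen here, not taken from the listing. A depth-$N$ linear network parameterizes the recovered matrix as a product of factors, and gradient descent minimizes the squared error on the observed entries $\Omega$ of the target matrix $M$.

```latex
W = W_N W_{N-1} \cdots W_1, \qquad
\min_{W_1, \dots, W_N} \; \frac{1}{2} \sum_{(i,j) \in \Omega} \bigl( W_{ij} - M_{ij} \bigr)^2
```

With $N = 2$ this is ordinary matrix factorization. No rank constraint or norm penalty appears in the objective, so any preference for low-rank solutions has to come from the optimization itself.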

Water Preservation in Soan River Basin using Deep Learning Techniques


Title	Water Preservation in Soan River Basin using Deep Learning Techniques
Authors	Sadaqat ur Rehman, Zhongliang Yang, Muhammad Shahid, Nan Wei, Yongfeng Huang, Muhammad Waqas, Shanshan Tu, Obaid ur Rehman
Abstract	Water supplies are crucial for the development of living beings. However, changes in the hydrological process, i.e., climate and land usage, are the key issues. Sustaining water levels and accurately estimating them under dynamic conditions is a critical job for hydrologists, but predicting hydrological extremes remains an open issue. In this paper, we propose two deep learning techniques and three machine learning algorithms to predict stream flow, given the present climate conditions. The results show that the Recurrent Neural Network (RNN) or Long Short-term Memory (LSTM), an artificial neural network based method, outperforms other conventional and machine-learning algorithms for predicting stream flow. Furthermore, our analysis shows that stream flow is directly affected by precipitation, land usage, and temperature. These indexes are critical and can be used by hydrologists to identify the potential for stream flow. We make the dataset publicly available (https://github.com/sadaqat007/Dataset) so that others can replicate and build upon the published results.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.10852v1
PDF	https://arxiv.org/pdf/1906.10852v1.pdf
PWC	https://paperswithcode.com/paper/water-preservation-in-soan-river-basin-using
Repo	https://github.com/sadaqat007/Dataset
Framework	none

Multi-objective Evolutionary Algorithms are Still Good: Maximizing Monotone Approximately Submodular Minus Modular Functions


Title	Multi-objective Evolutionary Algorithms are Still Good: Maximizing Monotone Approximately Submodular Minus Modular Functions
Authors	Chao Qian
Abstract	As evolutionary algorithms (EAs) are general-purpose optimization algorithms, recent theoretical studies have tried to analyze their performance for solving general problem classes, with the goal of providing a general theoretical explanation of the behavior of EAs. Particularly, a simple multi-objective EA, i.e., GSEMO, has been shown to be able to achieve good polynomial-time approximation guarantees for submodular optimization, where the objective function is only required to satisfy some properties but without explicit formulation. Submodular optimization has wide applications in diverse areas, and previous studies have considered the cases where the objective functions are monotone submodular, monotone non-submodular, or non-monotone submodular. To complement this line of research, this paper studies the problem class of maximizing monotone approximately submodular minus modular functions (i.e., $f=g-c$) with a size constraint, where $g$ is a non-negative monotone approximately submodular function and $c$ is a non-negative modular function, resulting in the objective function $f$ being non-monotone non-submodular. We prove that the GSEMO can achieve the best-known polynomial-time approximation guarantee. Empirical studies on the applications of Bayesian experimental design and directed vertex cover show the excellent performance of the GSEMO.
Tasks
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05492v1
PDF	https://arxiv.org/pdf/1910.05492v1.pdf
PWC	https://paperswithcode.com/paper/multi-objective-evolutionary-algorithms-are
Repo	https://github.com/paper2019/ApproxSub-Minus-Modular
Framework	none
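
For readers unfamiliar with GSEMO, the following is a minimal sketch of the algorithm on a toy instance of $f=g-c$ with a size constraint. Several choices are assumptions made for illustration and may differ from the paper: the second objective is the negated subset size, oversized subsets are assigned a value of negative infinity, and the coverage function `covers` and cost vector `costs` are made-up data.

```python
import random


def gsemo(f, n, k, iterations, seed=0):
    """Sketch of GSEMO maximizing f over subsets of n items with at most k items."""
    rng = random.Random(seed)

    def objectives(x):
        size = sum(x)
        # Assumption: subsets larger than k are treated as infeasible.
        value = f(x) if size <= k else float("-inf")
        return (value, -size)

    def weakly_dominates(a, b):
        return a[0] >= b[0] and a[1] >= b[1]

    start = tuple([0] * n)
    population = {start: objectives(start)}
    for _ in range(iterations):
        parent = rng.choice(sorted(population))
        # Bit-wise mutation: flip each bit independently with probability 1/n.
        child = tuple(bit ^ (rng.random() < 1.0 / n) for bit in parent)
        child_obj = objectives(child)
        # Discard the child if some archived solution strictly dominates it.
        if any(
            weakly_dominates(obj, child_obj) and obj != child_obj
            for obj in population.values()
        ):
            continue
        # Otherwise drop everything the child weakly dominates and keep the child.
        population = {
            x: obj
            for x, obj in population.items()
            if not weakly_dominates(child_obj, obj)
        }
        population[child] = child_obj
    feasible = [x for x in population if sum(x) <= k]
    return max(feasible, key=f)


# Toy instance: g counts covered elements (monotone submodular), c sums item costs.
covers = [{0, 1, 2}, {2, 3}, {4}, {0, 4, 5}]
costs = [0.5, 0.2, 0.9, 0.4]


def f(x):
    chosen = [i for i, bit in enumerate(x) if bit]
    covered = set().union(*(covers[i] for i in chosen))
    return len(covered) - sum(costs[i] for i in chosen)


best = gsemo(f, n=4, k=2, iterations=2000)
```

Because the empty set starts in the population and can only be replaced by a solution that weakly dominates it, the returned subset is always feasible and never worse than selecting nothing.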

Robustness of 3D Deep Learning in an Adversarial Setting


Title	Robustness of 3D Deep Learning in an Adversarial Setting
Authors	Matthew Wicker, Marta Kwiatkowska
Abstract	Understanding the spatial arrangement and nature of real-world objects is of paramount importance to many complex engineering tasks, including autonomous navigation. Deep learning has revolutionized state-of-the-art performance for tasks in 3D environments; however, relatively little is known about the robustness of these approaches in an adversarial setting. The lack of comprehensive analysis makes it difficult to justify deployment of 3D deep learning models in real-world, safety-critical applications. In this work, we develop an algorithm for analysis of pointwise robustness of neural networks that operate on 3D data. We show that current approaches presented for understanding the resilience of state-of-the-art models vastly overestimate their robustness. We then use our algorithm to evaluate an array of state-of-the-art models in order to demonstrate their vulnerability to occlusion attacks. We show that, in the worst case, these networks can be reduced to 0% classification accuracy after the occlusion of at most 6.5% of the occupied input space.
Tasks	Autonomous Navigation
Published	2019-04-01
URL	http://arxiv.org/abs/1904.00923v1
PDF	http://arxiv.org/pdf/1904.00923v1.pdf
PWC	https://paperswithcode.com/paper/robustness-of-3d-deep-learning-in-an
Repo	https://github.com/matthewwicker/IterativeSalienceOcclusion
Framework	tf

Finding Task-Relevant Features for Few-Shot Learning by Category Traversal


Title	Finding Task-Relevant Features for Few-Shot Learning by Category Traversal
Authors	Hongyang Li, David Eigen, Samuel Dodge, Matthew Zeiler, Xiaogang Wang
Abstract	Few-shot learning is an important area of research. Conceptually, humans are readily able to understand new concepts given just a few examples, while in more pragmatic terms, limited-example training situations are common in practice. Recent effective approaches to few-shot learning employ a metric-learning framework to learn a feature similarity comparison between a query (test) example, and the few support (training) examples. However, these approaches treat each support class independently from one another, never looking at the entire task as a whole. Because of this, they are constrained to use a single set of features for all possible test-time tasks, which hinders the ability to distinguish the most relevant dimensions for the task at hand. In this work, we introduce a Category Traversal Module that can be inserted as a plug-and-play module into most metric-learning based few-shot learners. This component traverses across the entire support set at once, identifying task-relevant features based on both intra-class commonality and inter-class uniqueness in the feature space. Incorporating our module improves performance considerably (5%-10% relative) over baseline systems on both mini-ImageNet and tieredImageNet benchmarks, with overall performance competitive with recent state-of-the-art systems.
Tasks	Few-Shot Learning, Metric Learning
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11116v1
PDF	https://arxiv.org/pdf/1905.11116v1.pdf
PWC	https://paperswithcode.com/paper/finding-task-relevant-features-for-few-shot-1
Repo	https://github.com/Clarifai/few-shot-ctm
Framework	pytorch

Multi-channel Reverse Dictionary Model


Title	Multi-channel Reverse Dictionary Model
Authors	Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun
Abstract	A reverse dictionary takes the description of a target word as input and outputs the target word together with other words that match the description. Existing reverse dictionary methods cannot deal with highly variable input queries and low-frequency target words successfully. Inspired by the description-to-word inference process of humans, we propose the multi-channel reverse dictionary model, which can mitigate the two problems simultaneously. Our model comprises a sentence encoder and multiple predictors. The predictors are expected to identify different characteristics of the target word from the input query. We evaluate our model on English and Chinese datasets including both dictionary definitions and human-written descriptions. Experimental results show that our model achieves the state-of-the-art performance, and even outperforms the most popular commercial reverse dictionary system on the human-written description dataset. We also conduct quantitative analyses and a case study to demonstrate the effectiveness and robustness of our model. All the code and data of this work can be obtained on https://github.com/thunlp/MultiRD.
Tasks
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08441v2
PDF	https://arxiv.org/pdf/1912.08441v2.pdf
PWC	https://paperswithcode.com/paper/multi-channel-reverse-dictionary-model
Repo	https://github.com/thunlp/MultiRD
Framework	pytorch

D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation


Title	D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation
Authors	Yongjin Zhou, Weijian Huang, Pei Dong, Yong Xia, Shanshan Wang
Abstract	Assessing the location and extent of lesions caused by chronic stroke is critical for medical diagnosis, surgical planning, and prognosis. In recent years, with the rapid development of 2D and 3D convolutional neural networks (CNN), the encoder-decoder structure has shown great potential in the field of medical image segmentation. However, the 2D CNN ignores the 3D information of medical images, while the 3D CNN suffers from high computational resource demands. This paper proposes a new architecture called dimension-fusion-UNet (D-UNet), which combines 2D and 3D convolution innovatively in the encoding stage. The proposed architecture achieves a better segmentation performance than 2D networks, while requiring significantly less computation time in comparison to 3D networks. Furthermore, to alleviate the data imbalance issue between positive and negative samples for the network training, we propose a new loss function called Enhance Mixing Loss (EML). This function adds a weighted focal coefficient and combines two traditional loss functions. The proposed method has been tested on the ATLAS dataset and compared to three state-of-the-art methods. The results demonstrate that the proposed method achieves the best quality performance in terms of DSC = 0.5349±0.2763 and precision = 0.6331±0.295.
Tasks	Lesion Segmentation, Medical Diagnosis, Medical Image Segmentation, Semantic Segmentation
Published	2019-08-14
URL	https://arxiv.org/abs/1908.05104v1
PDF	https://arxiv.org/pdf/1908.05104v1.pdf
PWC	https://paperswithcode.com/paper/d-unet-a-dimension-fusion-u-shape-network-for
Repo	https://github.com/SZUHvern/D-UNet
Framework	none
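
The two metrics reported in the abstract above, the Dice similarity coefficient (DSC) and precision, can be computed from binary masks as in the sketch below. The function names and the tiny flattened masks are illustrative assumptions; this is the standard definition of each metric, not code from the D-UNet repository.

```python
def dice_coefficient(pred, truth):
    """DSC = 2 * |pred AND truth| / (|pred| + |truth|) for flat binary masks."""
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def precision(pred, truth):
    """Fraction of predicted lesion voxels that are truly lesion."""
    true_pos = sum(1 for p, t in zip(pred, truth) if p and t)
    predicted = sum(1 for p in pred if p)
    # Convention assumed here: no predicted positives gives precision 0.
    return 0.0 if predicted == 0 else true_pos / predicted


pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
dsc = dice_coefficient(pred_mask, true_mask)   # 2 * 1 / (2 + 1)
prec = precision(pred_mask, true_mask)         # 1 / 2
```

Lesions usually occupy a small fraction of a scan, so both metrics ignore the large number of correctly labeled background voxels.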

Photo-Realistic Facial Details Synthesis from Single Image


Title	Photo-Realistic Facial Details Synthesis from Single Image
Authors	Anpei Chen, Zhang Chen, Guli Zhang, Ziheng Zhang, Kenny Mitchell, Jingyi Yu
Abstract	We present a single-image 3D face synthesis technique that can handle challenging facial expressions while recovering fine geometric details. Our technique employs expression analysis for proxy face geometry generation and combines supervised and unsupervised learning for facial detail synthesis. On proxy generation, we conduct emotion prediction to determine a new expression-informed proxy. On detail synthesis, we present a Deep Facial Detail Net (DFDN) based on Conditional Generative Adversarial Net (CGAN) that employs both geometry and appearance loss functions. For geometry, we capture 366 high-quality 3D scans from 122 different subjects under 3 facial expressions. For appearance, we use additional 20K in-the-wild face images and apply image-based rendering to accommodate lighting variations. Comprehensive experiments demonstrate that our framework can produce high-quality 3D faces with realistic details under challenging facial expressions.
Tasks	Face Generation
Published	2019-03-26
URL	https://arxiv.org/abs/1903.10873v5
PDF	https://arxiv.org/pdf/1903.10873v5.pdf
PWC	https://paperswithcode.com/paper/photo-realistic-facial-details-synthesis-from
Repo	https://github.com/apchenstu/Facial_Details_Synthesis
Framework	pytorch

Learning Discrete and Continuous Factors of Data via Alternating Disentanglement


Title	Learning Discrete and Continuous Factors of Data via Alternating Disentanglement
Authors	Yeonwoo Jeong, Hyun Oh Song
Abstract	We address the problem of unsupervised disentanglement of discrete and continuous explanatory factors of data. We first show a simple procedure for minimizing the total correlation of the continuous latent variables without having to use a discriminator network or perform importance sampling, via cascading the information flow in the $\beta$-vae framework. Furthermore, we propose a method which avoids offloading the entire burden of jointly modeling the continuous and discrete factors to the variational encoder by employing a separate discrete inference procedure. This leads to an interesting alternating minimization problem which switches between finding the most likely discrete configuration given the continuous factors and updating the variational encoder based on the computed discrete factors. Experiments show that the proposed method clearly disentangles discrete factors and significantly outperforms current disentanglement methods based on the disentanglement score and inference network classification score. The source code is available at https://github.com/snu-mllab/DisentanglementICML19.
Tasks
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09432v1
PDF	https://arxiv.org/pdf/1905.09432v1.pdf
PWC	https://paperswithcode.com/paper/learning-discrete-and-continuous-factors-of
Repo	https://github.com/snu-mllab/DisentanglementICML19
Framework	tf

OpenKiwi: An Open Source Framework for Quality Estimation


Title	OpenKiwi: An Open Source Framework for Quality Estimation
Authors	Fábio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins
Abstract	We introduce OpenKiwi, a PyTorch-based open source framework for translation quality estimation. OpenKiwi supports training and testing of word-level and sentence-level quality estimation systems, implementing the winning systems of the WMT 2015-18 quality estimation campaigns. We benchmark OpenKiwi on two datasets from WMT 2018 (English-German SMT and NMT), yielding state-of-the-art performance on the word-level tasks and near state-of-the-art in the sentence-level tasks.
Tasks
Published	2019-02-22
URL	https://arxiv.org/abs/1902.08646v2
PDF	https://arxiv.org/pdf/1902.08646v2.pdf
PWC	https://paperswithcode.com/paper/openkiwi-an-open-source-framework-for-quality
Repo	https://github.com/Unbabel/OpenKiwi
Framework	pytorch

Rethinking Kernel Methods for Node Representation Learning on Graphs


Title	Rethinking Kernel Methods for Node Representation Learning on Graphs
Authors	Yu Tian, Long Zhao, Xi Peng, Dimitris N. Metaxas
Abstract	Graph kernels are kernel methods measuring graph similarity and serve as a standard tool for graph classification. However, the use of kernel methods for node classification, which is a related problem to graph representation learning, is still ill-posed and the state-of-the-art methods are heavily based on heuristics. Here, we present a novel theoretical kernel-based framework for node classification that can bridge the gap between these two representation learning problems on graphs. Our approach is motivated by graph kernel methodology but extended to learn the node representations capturing the structural information in a graph. We theoretically show that our formulation is as powerful as any positive semidefinite kernels. To efficiently learn the kernel, we propose a novel mechanism for node feature aggregation and a data-driven similarity metric employed during the training phase. More importantly, our framework is flexible and complementary to other graph-based deep learning models, e.g., Graph Convolutional Networks (GCNs). We empirically evaluate our approach on a number of standard node classification benchmarks, and demonstrate that our model sets the new state of the art.
Tasks	Graph Classification, Graph Representation Learning, Graph Similarity, Node Classification, Representation Learning
Published	2019-10-06
URL	https://arxiv.org/abs/1910.02548v1
PDF	https://arxiv.org/pdf/1910.02548v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-kernel-methods-for-node
Repo	https://github.com/bluer555/KernelGCN
Framework	pytorch

Graph-Based Parallel Large Scale Structure from Motion


Title	Graph-Based Parallel Large Scale Structure from Motion
Authors	Yu Chen, Shuhan Shen, Yisong Chen, Guoping Wang
Abstract	While Structure from Motion (SfM) achieves great success in 3D reconstruction, it still meets challenges on large scale scenes. In this work, large scale SfM is treated as a graph problem, and we tackle it in a divide-and-conquer manner. First, an image clustering algorithm divides images into clusters with strong connectivity, leading to robust local reconstructions. Then, in an image expansion step, the connection and completeness of scenes are enhanced by expanding along a maximum spanning tree. After local reconstructions, we construct a minimum spanning tree (MinST) to find accurate similarity transformations. Then the MinST is transformed into a Minimum Height Tree (MHT) to find a proper anchor node and is further utilized to prevent error accumulation. When evaluated on different kinds of datasets, our approach shows superiority over the state-of-the-art in accuracy and efficiency. Our algorithm is open-sourced at https://github.com/AIBluefisher/GraphSfM.
Tasks	3D Reconstruction
Published	2019-12-23
URL	https://arxiv.org/abs/1912.10659v1
PDF	https://arxiv.org/pdf/1912.10659v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-parallel-large-scale-structure
Repo	https://github.com/AIBluefisher/GraphSfM
Framework	none
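
The step of turning a spanning tree into a Minimum Height Tree amounts to choosing a root that minimizes the tree's height. A minimal sketch of one standard way to find such a root, by repeatedly peeling off leaves until the center remains, is shown below. The function name `min_height_roots` and the example trees are assumptions for illustration, and the GraphSfM code may select its anchor node differently.

```python
def min_height_roots(n, edges):
    """Return the node(s) that minimize tree height when used as the root.

    Nodes are labeled 0..n-1 and `edges` must describe a tree.
    """
    if n <= 2:
        return list(range(n))
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    leaves = [v for v in adj if len(adj[v]) == 1]
    remaining = n
    # Strip one layer of leaves at a time; the last one or two nodes are the center.
    while remaining > 2:
        remaining -= len(leaves)
        new_leaves = []
        for leaf in leaves:
            neighbour = adj[leaf].pop()
            adj[neighbour].remove(leaf)
            if len(adj[neighbour]) == 1:
                new_leaves.append(neighbour)
        leaves = new_leaves
    return sorted(leaves)


path_roots = min_height_roots(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
even_path_roots = min_height_roots(4, [(0, 1), (1, 2), (2, 3)])
star_roots = min_height_roots(4, [(0, 1), (0, 2), (0, 3)])
```

Rooting at a center node keeps every cluster within the fewest possible hops of the anchor, which limits how many similarity transformations are chained together.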

Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation


Title	Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
Authors	Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
Abstract	In this paper, we propose an effective and efficient pyramid multi-view stereo (MVS) net for accurate and complete dense point cloud reconstruction. Different from existing deep-learning based MVS methods, our VA-MVSNet incorporates the cost variance between different views by introducing two novel self-adaptive view aggregations: pixel-wise view aggregation and voxel-wise view aggregation. Moreover, to enhance the point cloud reconstruction on texture-less regions, we extend VA-MVSNet with pyramid multi-scale image input as PVA-MVSNet, where multi-metric constraints are leveraged to aggregate the reliable depth estimation at the coarser scale to fill in the mismatched regions at the finer scale. Experimental results show that our approach establishes a new state-of-the-art on the DTU dataset with significant improvements in the completeness and overall quality of 3D reconstruction, and ranks 1st on the Tanks and Temples benchmark among all published deep-learning based methods. Our codebase is available at https://github.com/yhw-yhw/PVAMVSNet.
Tasks	3D Reconstruction, Depth Estimation
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03001v1
PDF	https://arxiv.org/pdf/1912.03001v1.pdf
PWC	https://paperswithcode.com/paper/pyramid-multi-view-stereo-net-with-self
Repo	https://github.com/yhw-yhw/PVAMVSNet
Framework	pytorch