Paper Group AWR 327
GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud
Title | GAPNet: Graph Attention based Point Neural Network for Exploiting Local Feature of Point Cloud |
Authors | Can Chen, Luca Zanotti Fragonara, Antonios Tsourdos |
Abstract | Exploiting fine-grained semantic features on point cloud is still challenging due to its irregular and sparse structure in a non-Euclidean space. Among existing studies, PointNet provides an efficient and promising approach to learn shape features directly on unordered 3D point cloud and has achieved competitive performance. However, local feature that is helpful towards better contextual learning is not considered. Meanwhile, attention mechanism shows efficiency in capturing node representation on graph-based data by attending over neighboring nodes. In this paper, we propose a novel neural network for point cloud, dubbed GAPNet, to learn local geometric representations by embedding graph attention mechanism within stacked Multi-Layer-Perceptron (MLP) layers. Firstly, we introduce a GAPLayer to learn attention features for each point by highlighting different attention weights on neighborhood. Secondly, in order to exploit sufficient features, a multi-head mechanism is employed to allow GAPLayer to aggregate different features from independent heads. Thirdly, we propose an attention pooling layer over neighbors to capture local signature aimed at enhancing network robustness. Finally, GAPNet applies stacked MLP layers to attention features and local signature to fully extract local geometric structures. The proposed GAPNet architecture is tested on the ModelNet40 and ShapeNet part datasets, and achieves state-of-the-art performance in both shape classification and part segmentation tasks. |
Tasks | |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08705v1 |
https://arxiv.org/pdf/1905.08705v1.pdf | |
PWC | https://paperswithcode.com/paper/gapnet-graph-attention-based-point-neural |
Repo | https://github.com/FrankCAN/GAPNet |
Framework | tf |
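The abstract above describes attending over a point's neighbors and pooling their features. A minimal sketch of that idea follows, assuming dot-product scoring in place of the learned MLP scoring used by the actual GAPLayer; the function name `attention_pool` and the toy features are hypothetical and are not taken from the GAPNet repository.

```python
import math

def attention_pool(center, neighbors):
    """Softmax-weighted average of neighbor features.

    Scores are plain dot products between the center point's feature and
    each neighbor's feature (an assumption made for illustration).
    """
    scores = [sum(c * n for c, n in zip(center, nb)) for nb in neighbors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [
        sum(w * nb[d] for w, nb in zip(weights, neighbors))
        for d in range(len(center))
    ]
    return weights, pooled

# Toy example: one point with three 2-D neighbor features.
weights, pooled = attention_pool([1.0, 0.0], [[2.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
```

The neighbor most aligned with the center feature receives the largest weight, so the pooled vector leans toward it.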
Fine-Tuning Language Models from Human Preferences
Title | Fine-Tuning Language Models from Human Preferences |
Authors | Daniel M. Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, Geoffrey Irving |
Abstract | Reward learning enables the application of reinforcement learning (RL) to tasks where reward is defined by human judgment, building a model of reward by asking humans questions. Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and safe for real-world tasks. In this paper, we build on advances in generative pretraining of language models to apply reward learning to four natural language tasks: continuing text with positive sentiment or physically descriptive language, and summarization tasks on the TL;DR and CNN/Daily Mail datasets. For stylistic continuation we achieve good results with only 5,000 comparisons evaluated by humans. For summarization, models trained with 60,000 comparisons copy whole sentences from the input but skip irrelevant preamble; this leads to reasonable ROUGE scores and very good performance according to our human labelers, but may be exploiting the fact that labelers rely on simple heuristics. |
Tasks | Language Modelling |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08593v2 |
https://arxiv.org/pdf/1909.08593v2.pdf | |
PWC | https://paperswithcode.com/paper/fine-tuning-language-models-from-human |
Repo | https://github.com/openai/lm-human-preferences |
Framework | tf |
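The reward-learning step described in the abstract above fits a reward model to human comparisons. A minimal sketch of one common formulation follows, assuming the labeler picks one option from a small set of candidates and the loss is the negative log-probability of that choice under a softmax over reward scores. The function name `preference_loss` and the toy scores are hypothetical and are not taken from the lm-human-preferences repository.

```python
import math

def preference_loss(rewards, chosen):
    """Negative log-probability of the human-chosen candidate.

    rewards: reward-model scores, one per candidate.
    chosen:  index of the candidate the labeler preferred.
    """
    top = max(rewards)  # log-sum-exp with max subtraction for stability
    log_z = top + math.log(sum(math.exp(r - top) for r in rewards))
    return log_z - rewards[chosen]

# With four equally scored candidates the loss equals log(4).
uniform_loss = preference_loss([0.0, 0.0, 0.0, 0.0], chosen=0)
# Scoring the chosen candidate higher lowers the loss.
confident_loss = preference_loss([3.0, 0.0, 0.0, 0.0], chosen=0)
```

Minimizing this loss pushes the score of the preferred candidate above the others, which is what lets the learned reward stand in for human judgment during RL fine-tuning.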
Implicit Regularization in Deep Matrix Factorization
Title | Implicit Regularization in Deep Matrix Factorization |
Authors | Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo |
Abstract | Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low “complexity.” We study the implicit regularization of gradient descent over deep linear neural networks for matrix completion and sensing, a model referred to as deep matrix factorization. Our first finding, supported by theory and experiments, is that adding depth to a matrix factorization enhances an implicit tendency towards low-rank solutions, oftentimes leading to more accurate recovery. Secondly, we present theoretical and empirical arguments questioning a nascent view by which implicit regularization in matrix factorization can be captured using simple mathematical norms. Our results point to the possibility that the language of standard regularizers may not be rich enough to fully encompass the implicit regularization brought forth by gradient-based optimization. |
Tasks | Matrix Completion |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13655v3 |
https://arxiv.org/pdf/1905.13655v3.pdf | |
PWC | https://paperswithcode.com/paper/implicit-regularization-in-deep-matrix |
Repo | https://github.com/roosephu/deep_matrix_factorization |
Framework | none |
Water Preservation in Soan River Basin using Deep Learning Techniques
Title | Water Preservation in Soan River Basin using Deep Learning Techniques |
Authors | Sadaqat ur Rehman, Zhongliang Yang, Muhammad Shahid, Nan Wei, Yongfeng Huang, Muhammad Waqas, Shanshan Tu, Obaid ur Rehman |
Abstract | Water supplies are crucial for the development of living beings. However, changes in the hydrological process, i.e., climate and land usage, are the key issues. Sustaining water levels and accurately estimating them under dynamic conditions is a critical job for hydrologists, but predicting hydrological extremes is an open issue. In this paper, we propose two deep learning techniques and three machine learning algorithms to predict stream flow, given the present climate conditions. The results show that the Recurrent Neural Network (RNN) or Long Short-term Memory (LSTM), an artificial neural network based method, outperforms other conventional and machine-learning algorithms for predicting stream flow. Furthermore, our analysis shows that stream flow is directly affected by precipitation, land usage, and temperature. These indices are critical and can be used by hydrologists to identify the potential for stream flow. We make the dataset publicly available (https://github.com/sadaqat007/Dataset) so that others can replicate and build upon the published results. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10852v1 |
https://arxiv.org/pdf/1906.10852v1.pdf | |
PWC | https://paperswithcode.com/paper/water-preservation-in-soan-river-basin-using |
Repo | https://github.com/sadaqat007/Dataset |
Framework | none |
Multi-objective Evolutionary Algorithms are Still Good: Maximizing Monotone Approximately Submodular Minus Modular Functions
Title | Multi-objective Evolutionary Algorithms are Still Good: Maximizing Monotone Approximately Submodular Minus Modular Functions |
Authors | Chao Qian |
Abstract | As evolutionary algorithms (EAs) are general-purpose optimization algorithms, recent theoretical studies have tried to analyze their performance for solving general problem classes, with the goal of providing a general theoretical explanation of the behavior of EAs. Particularly, a simple multi-objective EA, i.e., GSEMO, has been shown to be able to achieve good polynomial-time approximation guarantees for submodular optimization, where the objective function is only required to satisfy some properties but without explicit formulation. Submodular optimization has wide applications in diverse areas, and previous studies have considered the cases where the objective functions are monotone submodular, monotone non-submodular, or non-monotone submodular. To complement this line of research, this paper studies the problem class of maximizing monotone approximately submodular minus modular functions (i.e., $f=g-c$) with a size constraint, where $g$ is a non-negative monotone approximately submodular function and $c$ is a non-negative modular function, resulting in the objective function $f$ being non-monotone non-submodular. We prove that the GSEMO can achieve the best-known polynomial-time approximation guarantee. Empirical studies on the applications of Bayesian experimental design and directed vertex cover show the excellent performance of the GSEMO. |
Tasks | |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05492v1 |
https://arxiv.org/pdf/1910.05492v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-objective-evolutionary-algorithms-are |
Repo | https://github.com/paper2019/ApproxSub-Minus-Modular |
Framework | none |
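The abstract above analyzes GSEMO on objectives of the form f = g - c under a size constraint. A minimal sketch of a generic GSEMO loop follows, assuming the bi-objective pair (f(x), -|x|) with oversized solutions scored as negative infinity; the paper's exact bi-objective reformulation may differ. The coverage-minus-cost objective, the data in `SETS` and `COSTS`, and all function names are hypothetical illustrations.

```python
import random

def gsemo(f, n, k, iterations, seed=0):
    """Generic GSEMO sketch: keep a population of non-dominated bit strings."""
    rng = random.Random(seed)

    def objectives(x):
        size = sum(x)
        # Solutions larger than k are treated as infeasible.
        value = f(x) if size <= k else float("-inf")
        return (value, -size)

    def weakly_dominates(a, b):
        return a[0] >= b[0] and a[1] >= b[1]

    start = tuple([0] * n)
    population = {start: objectives(start)}
    for _ in range(iterations):
        parent = rng.choice(sorted(population))
        # Flip each bit independently with probability 1/n.
        child = tuple(bit ^ (rng.random() < 1.0 / n) for bit in parent)
        child_obj = objectives(child)
        # Discard the child if an existing member strictly dominates it.
        if any(weakly_dominates(o, child_obj) and o != child_obj
               for o in population.values()):
            continue
        # Drop members the child weakly dominates, then add the child.
        population = {x: o for x, o in population.items()
                      if not weakly_dominates(child_obj, o)}
        population[child] = child_obj
    feasible = [x for x in population if sum(x) <= k]
    return max(feasible, key=lambda x: population[x][0])

# Toy objective: g is set coverage (monotone submodular), c is a modular cost.
SETS = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}, {7}]
COSTS = [1.0, 0.5, 1.0, 0.5, 2.0]

def coverage_minus_cost(x):
    covered = set().union(*(s for s, bit in zip(SETS, x) if bit))
    cost = sum(c for c, bit in zip(COSTS, x) if bit)
    return len(covered) - cost

best = gsemo(coverage_minus_cost, n=5, k=2, iterations=2000)
```

Because the empty set is never dominated under this pair of objectives, the returned solution is always feasible and never worse than selecting nothing.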
Robustness of 3D Deep Learning in an Adversarial Setting
Title | Robustness of 3D Deep Learning in an Adversarial Setting |
Authors | Matthew Wicker, Marta Kwiatkowska |
Abstract | Understanding the spatial arrangement and nature of real-world objects is of paramount importance to many complex engineering tasks, including autonomous navigation. Deep learning has revolutionized state-of-the-art performance for tasks in 3D environments; however, relatively little is known about the robustness of these approaches in an adversarial setting. The lack of comprehensive analysis makes it difficult to justify deployment of 3D deep learning models in real-world, safety-critical applications. In this work, we develop an algorithm for analysis of pointwise robustness of neural networks that operate on 3D data. We show that current approaches presented for understanding the resilience of state-of-the-art models vastly overestimate their robustness. We then use our algorithm to evaluate an array of state-of-the-art models in order to demonstrate their vulnerability to occlusion attacks. We show that, in the worst case, these networks can be reduced to 0% classification accuracy after the occlusion of at most 6.5% of the occupied input space. |
Tasks | Autonomous Navigation |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00923v1 |
http://arxiv.org/pdf/1904.00923v1.pdf | |
PWC | https://paperswithcode.com/paper/robustness-of-3d-deep-learning-in-an |
Repo | https://github.com/matthewwicker/IterativeSalienceOcclusion |
Framework | tf |
Finding Task-Relevant Features for Few-Shot Learning by Category Traversal
Title | Finding Task-Relevant Features for Few-Shot Learning by Category Traversal |
Authors | Hongyang Li, David Eigen, Samuel Dodge, Matthew Zeiler, Xiaogang Wang |
Abstract | Few-shot learning is an important area of research. Conceptually, humans are readily able to understand new concepts given just a few examples, while in more pragmatic terms, limited-example training situations are common in practice. Recent effective approaches to few-shot learning employ a metric-learning framework to learn a feature similarity comparison between a query (test) example, and the few support (training) examples. However, these approaches treat each support class independently from one another, never looking at the entire task as a whole. Because of this, they are constrained to use a single set of features for all possible test-time tasks, which hinders the ability to distinguish the most relevant dimensions for the task at hand. In this work, we introduce a Category Traversal Module that can be inserted as a plug-and-play module into most metric-learning based few-shot learners. This component traverses across the entire support set at once, identifying task-relevant features based on both intra-class commonality and inter-class uniqueness in the feature space. Incorporating our module improves performance considerably (5%-10% relative) over baseline systems on both mini-ImageNet and tieredImageNet benchmarks, with overall performance competitive with recent state-of-the-art systems. |
Tasks | Few-Shot Learning, Metric Learning |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11116v1 |
https://arxiv.org/pdf/1905.11116v1.pdf | |
PWC | https://paperswithcode.com/paper/finding-task-relevant-features-for-few-shot-1 |
Repo | https://github.com/Clarifai/few-shot-ctm |
Framework | pytorch |
Multi-channel Reverse Dictionary Model
Title | Multi-channel Reverse Dictionary Model |
Authors | Lei Zhang, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Qun Liu, Maosong Sun |
Abstract | A reverse dictionary takes the description of a target word as input and outputs the target word together with other words that match the description. Existing reverse dictionary methods cannot deal with highly variable input queries and low-frequency target words successfully. Inspired by the description-to-word inference process of humans, we propose the multi-channel reverse dictionary model, which can mitigate the two problems simultaneously. Our model comprises a sentence encoder and multiple predictors. The predictors are expected to identify different characteristics of the target word from the input query. We evaluate our model on English and Chinese datasets including both dictionary definitions and human-written descriptions. Experimental results show that our model achieves the state-of-the-art performance, and even outperforms the most popular commercial reverse dictionary system on the human-written description dataset. We also conduct quantitative analyses and a case study to demonstrate the effectiveness and robustness of our model. All the code and data of this work can be obtained on https://github.com/thunlp/MultiRD. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08441v2 |
https://arxiv.org/pdf/1912.08441v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-channel-reverse-dictionary-model |
Repo | https://github.com/thunlp/MultiRD |
Framework | pytorch |
D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation
Title | D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation |
Authors | Yongjin Zhou, Weijian Huang, Pei Dong, Yong Xia, Shanshan Wang |
Abstract | Assessing the location and extent of lesions caused by chronic stroke is critical for medical diagnosis, surgical planning, and prognosis. In recent years, with the rapid development of 2D and 3D convolutional neural networks (CNN), the encoder-decoder structure has shown great potential in the field of medical image segmentation. However, the 2D CNN ignores the 3D information of medical images, while the 3D CNN suffers from high computational resource demands. This paper proposes a new architecture called dimension-fusion-UNet (D-UNet), which combines 2D and 3D convolution innovatively in the encoding stage. The proposed architecture achieves a better segmentation performance than 2D networks, while requiring significantly less computation time in comparison to 3D networks. Furthermore, to alleviate the data imbalance issue between positive and negative samples for the network training, we propose a new loss function called Enhance Mixing Loss (EML). This function adds a weighted focal coefficient and combines two traditional loss functions. The proposed method has been tested on the ATLAS dataset and compared to three state-of-the-art methods. The results demonstrate that the proposed method achieves the best quality performance in terms of DSC = 0.5349±0.2763 and precision = 0.6331±0.295. |
Tasks | Lesion Segmentation, Medical Diagnosis, Medical Image Segmentation, Semantic Segmentation |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05104v1 |
https://arxiv.org/pdf/1908.05104v1.pdf | |
PWC | https://paperswithcode.com/paper/d-unet-a-dimension-fusion-u-shape-network-for |
Repo | https://github.com/SZUHvern/D-UNet |
Framework | none |
Photo-Realistic Facial Details Synthesis from Single Image
Title | Photo-Realistic Facial Details Synthesis from Single Image |
Authors | Anpei Chen, Zhang Chen, Guli Zhang, Ziheng Zhang, Kenny Mitchell, Jingyi Yu |
Abstract | We present a single-image 3D face synthesis technique that can handle challenging facial expressions while recovering fine geometric details. Our technique employs expression analysis for proxy face geometry generation and combines supervised and unsupervised learning for facial detail synthesis. On proxy generation, we conduct emotion prediction to determine a new expression-informed proxy. On detail synthesis, we present a Deep Facial Detail Net (DFDN) based on Conditional Generative Adversarial Net (CGAN) that employs both geometry and appearance loss functions. For geometry, we capture 366 high-quality 3D scans from 122 different subjects under 3 facial expressions. For appearance, we use additional 20K in-the-wild face images and apply image-based rendering to accommodate lighting variations. Comprehensive experiments demonstrate that our framework can produce high-quality 3D faces with realistic details under challenging facial expressions. |
Tasks | Face Generation |
Published | 2019-03-26 |
URL | https://arxiv.org/abs/1903.10873v5 |
https://arxiv.org/pdf/1903.10873v5.pdf | |
PWC | https://paperswithcode.com/paper/photo-realistic-facial-details-synthesis-from |
Repo | https://github.com/apchenstu/Facial_Details_Synthesis |
Framework | pytorch |
Learning Discrete and Continuous Factors of Data via Alternating Disentanglement
Title | Learning Discrete and Continuous Factors of Data via Alternating Disentanglement |
Authors | Yeonwoo Jeong, Hyun Oh Song |
Abstract | We address the problem of unsupervised disentanglement of discrete and continuous explanatory factors of data. We first show a simple procedure for minimizing the total correlation of the continuous latent variables without having to use a discriminator network or perform importance sampling, via cascading the information flow in the $\beta$-vae framework. Furthermore, we propose a method which avoids offloading the entire burden of jointly modeling the continuous and discrete factors to the variational encoder by employing a separate discrete inference procedure. This leads to an interesting alternating minimization problem which switches between finding the most likely discrete configuration given the continuous factors and updating the variational encoder based on the computed discrete factors. Experiments show that the proposed method clearly disentangles discrete factors and significantly outperforms current disentanglement methods based on the disentanglement score and inference network classification score. The source code is available at https://github.com/snu-mllab/DisentanglementICML19. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09432v1 |
https://arxiv.org/pdf/1905.09432v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-discrete-and-continuous-factors-of |
Repo | https://github.com/snu-mllab/DisentanglementICML19 |
Framework | tf |
OpenKiwi: An Open Source Framework for Quality Estimation
Title | OpenKiwi: An Open Source Framework for Quality Estimation |
Authors | Fábio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, André F. T. Martins |
Abstract | We introduce OpenKiwi, a PyTorch-based open source framework for translation quality estimation. OpenKiwi supports training and testing of word-level and sentence-level quality estimation systems, implementing the winning systems of the WMT 2015-18 quality estimation campaigns. We benchmark OpenKiwi on two datasets from WMT 2018 (English-German SMT and NMT), yielding state-of-the-art performance on the word-level tasks and near state-of-the-art in the sentence-level tasks. |
Tasks | |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08646v2 |
https://arxiv.org/pdf/1902.08646v2.pdf | |
PWC | https://paperswithcode.com/paper/openkiwi-an-open-source-framework-for-quality |
Repo | https://github.com/Unbabel/OpenKiwi |
Framework | pytorch |
Rethinking Kernel Methods for Node Representation Learning on Graphs
Title | Rethinking Kernel Methods for Node Representation Learning on Graphs |
Authors | Yu Tian, Long Zhao, Xi Peng, Dimitris N. Metaxas |
Abstract | Graph kernels are kernel methods measuring graph similarity and serve as a standard tool for graph classification. However, the use of kernel methods for node classification, which is a related problem to graph representation learning, is still ill-posed and the state-of-the-art methods are heavily based on heuristics. Here, we present a novel theoretical kernel-based framework for node classification that can bridge the gap between these two representation learning problems on graphs. Our approach is motivated by graph kernel methodology but extended to learn the node representations capturing the structural information in a graph. We theoretically show that our formulation is as powerful as any positive semidefinite kernels. To efficiently learn the kernel, we propose a novel mechanism for node feature aggregation and a data-driven similarity metric employed during the training phase. More importantly, our framework is flexible and complementary to other graph-based deep learning models, e.g., Graph Convolutional Networks (GCNs). We empirically evaluate our approach on a number of standard node classification benchmarks, and demonstrate that our model sets the new state of the art. |
Tasks | Graph Classification, Graph Representation Learning, Graph Similarity, Node Classification, Representation Learning |
Published | 2019-10-06 |
URL | https://arxiv.org/abs/1910.02548v1 |
https://arxiv.org/pdf/1910.02548v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-kernel-methods-for-node |
Repo | https://github.com/bluer555/KernelGCN |
Framework | pytorch |
Graph-Based Parallel Large Scale Structure from Motion
Title | Graph-Based Parallel Large Scale Structure from Motion |
Authors | Yu Chen, Shuhan Shen, Yisong Chen, Guoping Wang |
Abstract | While Structure from Motion (SfM) achieves great success in 3D reconstruction, it still faces challenges on large-scale scenes. In this work, large-scale SfM is treated as a graph problem, and we tackle it in a divide-and-conquer manner. First, an image clustering algorithm divides images into clusters with strong connectivity, leading to robust local reconstructions. Then, in an image expansion step, the connectivity and completeness of scenes are enhanced by expanding along a maximum spanning tree. After local reconstructions, we construct a minimum spanning tree (MinST) to find accurate similarity transformations. The MinST is then transformed into a Minimum Height Tree (MHT) to find a proper anchor node and is further utilized to prevent error accumulation. When evaluated on different kinds of datasets, our approach shows superiority over the state-of-the-art in accuracy and efficiency. Our algorithm is open-sourced at https://github.com/AIBluefisher/GraphSfM. |
Tasks | 3D Reconstruction |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10659v1 |
https://arxiv.org/pdf/1912.10659v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-parallel-large-scale-structure |
Repo | https://github.com/AIBluefisher/GraphSfM |
Framework | none |
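The abstract above turns a spanning tree into a Minimum Height Tree to pick an anchor node. A minimal sketch of one standard way to find such a root follows, assuming an unweighted tree and repeated leaf peeling until at most two center nodes remain; the GraphSfM implementation may differ. The function name `minimum_height_roots` and the toy edges are hypothetical.

```python
from collections import defaultdict

def minimum_height_roots(num_nodes, edges):
    """Return the node(s) that minimize tree height when used as the root."""
    if num_nodes <= 2:
        return list(range(num_nodes))
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    leaves = [node for node in range(num_nodes) if len(neighbors[node]) == 1]
    remaining = num_nodes
    # Peel off the current layer of leaves until only the center is left.
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            parent = neighbors[leaf].pop()
            neighbors[parent].remove(leaf)
            if len(neighbors[parent]) == 1:
                next_leaves.append(parent)
        leaves = next_leaves
    return sorted(leaves)

# A chain of five local reconstructions: the middle one is the best anchor.
anchor = minimum_height_roots(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
```

Rooting at a center node keeps the longest chain of similarity transformations as short as possible, which is one way to limit error accumulation.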
Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
Title | Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation |
Authors | Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai |
Abstract | In this paper, we propose an effective and efficient pyramid multi-view stereo (MVS) net for accurate and complete dense point cloud reconstruction. Different from existing deep-learning based MVS methods, our VA-MVSNet incorporates the cost variance between different views by introducing two novel self-adaptive view aggregations: pixel-wise view aggregation and voxel-wise view aggregation. Moreover, to enhance the point cloud reconstruction on the texture-less regions, we extend VA-MVSNet with pyramid multi-scale image input as PVA-MVSNet, where multi-metric constraints are leveraged to aggregate the reliable depth estimation at the coarser scale to fill in the mismatched regions at the finer scale. Experimental results show that our approach establishes a new state-of-the-art on the DTU dataset with significant improvements in the completeness and overall quality of 3D reconstruction, and ranks 1st on the Tanks and Temples benchmark among all published deep-learning based methods. Our codebase is available at https://github.com/yhw-yhw/PVAMVSNet. |
Tasks | 3D Reconstruction, Depth Estimation |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03001v1 |
https://arxiv.org/pdf/1912.03001v1.pdf | |
PWC | https://paperswithcode.com/paper/pyramid-multi-view-stereo-net-with-self |
Repo | https://github.com/yhw-yhw/PVAMVSNet |
Framework | pytorch |