April 3, 2020

Paper Group AWR 26

Complexity Guarantees for Polyak Steps with Momentum

Title Complexity Guarantees for Polyak Steps with Momentum
Authors Mathieu Barré, Adrien Taylor, Alexandre d’Aspremont
Abstract In smooth strongly convex optimization, or in the presence of Hölderian error bounds, knowledge of the curvature parameter is critical for obtaining simple methods with accelerated rates. In this work, we study a class of methods, based on Polyak steps, where this knowledge is substituted by that of the optimal value, $f_*$. We first show slightly improved convergence bounds than previously known for the classical case of simple gradient descent with Polyak steps, we then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees.
Published 2020-02-03
URL https://arxiv.org/abs/2002.00915v1
PDF https://arxiv.org/pdf/2002.00915v1.pdf
PWC https://paperswithcode.com/paper/complexity-guarantees-for-polyak-steps-with
Repo https://github.com/mathbarre/PerformanceEstimationPolyakSteps
Framework none
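
To make the abstract's "classical case of simple gradient descent with Polyak steps" concrete, here is a minimal sketch in plain Python. It assumes the textbook Polyak step size, (f(x) - f_*) / ||grad f(x)||^2, and uses a hypothetical quadratic test function; it does not reproduce the paper's accelerated method with momentum, and the function names are illustrative only.

```python
def polyak_gradient_descent(f, grad, x0, f_star, iters=100):
    """Gradient descent whose step size is set from the known optimal value f_star."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        gap = f(x) - f_star
        # Stop once the optimum is reached (no gap left, or a zero gradient).
        if g_norm_sq == 0.0 or gap <= 0.0:
            break
        # Polyak step: needs f_star, but no smoothness or curvature constant.
        step = gap / g_norm_sq
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: an ill-conditioned quadratic with minimum value 0.
def quadratic(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def quadratic_grad(x):
    return [x[0], 10.0 * x[1]]


x_final = polyak_gradient_descent(
    quadratic, quadratic_grad, [1.0, 1.0], f_star=0.0, iters=500
)
```

Note that the step size is computed without any knowledge of the curvature of `quadratic`; only its optimal value is supplied, which is the trade described in the abstract.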

Supervised Learning on Relational Databases with Graph Neural Networks

Title Supervised Learning on Relational Databases with Graph Neural Networks
Authors Milan Cvitkovic
Abstract The majority of data scientists and machine learning practitioners use relational data in their work [State of ML and Data Science 2017, Kaggle, Inc.]. But training machine learning models on data stored in relational databases requires significant data extraction and feature engineering efforts. These efforts are not only costly, but they also destroy potentially important relational structure in the data. We introduce a method that uses Graph Neural Networks to overcome these challenges. Our proposed method outperforms state-of-the-art automatic feature engineering methods on two out of three datasets.
Tasks Feature Engineering
Published 2020-02-06
URL https://arxiv.org/abs/2002.02046v1
PDF https://arxiv.org/pdf/2002.02046v1.pdf
PWC https://paperswithcode.com/paper/supervised-learning-on-relational-databases
Repo https://github.com/mwcvitkovic/Supervised-Learning-on-Relational-Databases-with-GNNs
Framework none

Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study

Title Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study
Authors Ke Li, Zilin Xiang, Tao Chen, Shuo Wang, Kay Chen Tan
Abstract Data-driven defect prediction has become increasingly important in the software engineering process. Since it is not uncommon that data from a software project is insufficient for training a reliable defect prediction model, transfer learning that borrows data/knowledge from other projects to facilitate the model building at the current project, namely cross-project defect prediction (CPDP), is naturally plausible. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned to achieve their optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding about what are the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 ones have not been explored before. We build defect prediction models over 20 real-world software projects that are of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77% of CPDP techniques with a manageable computational cost. Thus more efforts on this aspect are required in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP. Given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of transfer learning algorithms. (3) The research on CPDP is far from mature where it is “not difficult” to find a better alternative by making a combination of existing transfer learning and classification techniques. This finding provides important insights about the future design of CPDP techniques.
Tasks Transfer Learning
Published 2020-02-08
URL https://arxiv.org/abs/2002.03148v1
PDF https://arxiv.org/pdf/2002.03148v1.pdf
PWC https://paperswithcode.com/paper/understanding-the-automated-parameter
Repo https://github.com/COLA-Laboratory/icse2020
Framework none

Synthesizing human-like sketches from natural images using a conditional convolutional decoder

Title Synthesizing human-like sketches from natural images using a conditional convolutional decoder
Authors Moritz Kampelmühler, Axel Pinz
Abstract Humans are able to precisely communicate diverse concepts by employing sketches, a highly reduced and abstract shape based representation of visual content. We propose, for the first time, a fully convolutional end-to-end architecture that is able to synthesize human-like sketches of objects in natural images with potentially cluttered background. To enable an architecture to learn this highly abstract mapping, we employ the following key components: (1) a fully convolutional encoder-decoder structure, (2) a perceptual similarity loss function operating in an abstract feature space and (3) conditioning of the decoder on the label of the object that shall be sketched. Given the combination of these architectural concepts, we can train our structure in an end-to-end supervised fashion on a collection of sketch-image pairs. The generated sketches of our architecture can be classified with 85.6% Top-5 accuracy and we verify their visual quality via a user study. We find that deep features as a perceptual similarity metric enable image translation with large domain gaps and our findings further show that convolutional neural networks trained on image classification tasks implicitly learn to encode shape information. Code is available under https://github.com/kampelmuehler/synthesizing_human_like_sketches
Tasks Image Classification
Published 2020-03-16
URL https://arxiv.org/abs/2003.07101v1
PDF https://arxiv.org/pdf/2003.07101v1.pdf
PWC https://paperswithcode.com/paper/synthesizing-human-like-sketches-from-natural
Repo https://github.com/kampelmuehler/synthesizing_human_like_sketches
Framework pytorch

TaskNorm: Rethinking Batch Normalization for Meta-Learning

Title TaskNorm: Rethinking Batch Normalization for Meta-Learning
Authors John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner
Abstract Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.
Tasks Image Classification, Meta-Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.03284v1
PDF https://arxiv.org/pdf/2003.03284v1.pdf
PWC https://paperswithcode.com/paper/tasknorm-rethinking-batch-normalization-for
Repo https://github.com/cambridge-mlg/cnaps
Framework pytorch
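
As background for the "conventional batch normalization" that the abstract contrasts against, the sketch below shows the standard training-time computation in plain Python: each feature is standardized using the mean and variance of the current batch. It is not TaskNorm, whose details the abstract does not give; the learnable scale and shift are omitted, and `batch_normalize` and `eps` are illustrative names.

```python
import math


def batch_normalize(batch, eps=1e-5):
    """Standardize each feature column using statistics of the current batch."""
    n = len(batch)
    num_features = len(batch[0])
    # Per-feature mean and (biased) variance over the batch dimension.
    means = [sum(row[j] for row in batch) / n for j in range(num_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(num_features)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(num_features)]
        for row in batch
    ]


# Two examples with two features on very different scales.
normalized = batch_normalize([[1.0, 10.0], [3.0, 30.0]])
```

Because the statistics come from whichever examples share a batch, the output for one example depends on the others, which is one reason the choice of normalization matters when batches are built from small, per-task sets.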

Rethinking Image Mixture for Unsupervised Visual Representation Learning

Title Rethinking Image Mixture for Unsupervised Visual Representation Learning
Authors Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell
Abstract In supervised learning, smoothing label/prediction distribution in neural network training has been proven useful in preventing the model from being over-confident, and is crucial for learning more robust visual representations. This observation motivates us to explore the way to make predictions flattened in unsupervised learning. Considering that human annotated labels are not adopted in unsupervised learning, we introduce a straightforward approach to perturb input image space in order to soften the output prediction space indirectly. Despite its conceptual simplicity, we show empirically that with the simple solution – image mixture, we can learn more robust visual representations from the transformed input, and the benefits of representations learned from this space can be inherited by the linear classification and downstream tasks.
Tasks Representation Learning
Published 2020-03-11
URL https://arxiv.org/abs/2003.05438v1
PDF https://arxiv.org/pdf/2003.05438v1.pdf
PWC https://paperswithcode.com/paper/rethinking-image-mixture-for-unsupervised
Repo https://github.com/szq0214/Rethinking-Image-Mixture-for-Unsupervised-Learning
Framework pytorch
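
The abstract does not say which form of image mixture is used. As one common possibility, the sketch below blends two images pixel by pixel with a mixing weight, in the style of mixup. Treat the blending rule, the function name `mix_images`, and the weight `lam` as assumptions made for illustration rather than the paper's procedure.

```python
def mix_images(image_a, image_b, lam):
    """Blend two equally sized grayscale images: lam * a + (1 - lam) * b per pixel."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have the same shape")
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


# A black and a white 2x2 image mixed with weight 0.25 on the black one.
black = [[0.0, 0.0], [0.0, 0.0]]
white = [[1.0, 1.0], [1.0, 1.0]]
mixed = mix_images(black, white, lam=0.25)
```

A mixed input of this kind no longer corresponds cleanly to a single source image, which is the sense in which perturbing the input space can soften the output prediction indirectly.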

Audio inpainting with generative adversarial network

Title Audio inpainting with generative adversarial network
Authors P. P. Ebner, A. Eltelt
Abstract We study the ability of Wasserstein Generative Adversarial Network (WGAN) to generate missing audio content which is, in context, (statistically similar) to the sound and the neighboring borders. We deal with the challenge of audio inpainting long range gaps (500 ms) using WGAN models. We improved the quality of the inpainting part using a new proposed WGAN architecture that uses both short-range and long-range neighboring borders compared to the classical WGAN model. The performance was compared with two different audio instruments (piano and guitar) and on virtuoso pianists together with a string orchestra. The objective difference grading (ODG) was used to evaluate the performance of both architectures. The proposed model outperforms the classical WGAN model and improves the reconstruction of high-frequency content. Further, we got better results for instruments where the frequency spectrum is mainly in the lower range where small noises are less annoying to the human ear and the inpainting part is more perceptible. Finally, we could show that better test results for audio dataset were reached where a particular instrument is accompanied by other instruments if we train the network only on this particular instrument neglecting the other instruments.
Published 2020-03-13
URL https://arxiv.org/abs/2003.07704v1
PDF https://arxiv.org/pdf/2003.07704v1.pdf
PWC https://paperswithcode.com/paper/audio-inpainting-with-generative-adversarial
Repo https://github.com/nperraud/gan_audio_inpainting
Framework tf

Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation

Title Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
Authors Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu
Abstract Online semantic 3D segmentation in company with real-time RGB-D reconstruction poses special challenges such as how to perform 3D convolution directly over the progressively fused 3D geometric data, and how to smartly fuse information from frame to frame. We propose a novel fusion-aware 3D point convolution which operates directly on the geometric surface being reconstructed and exploits effectively the inter-frame correlation for high quality 3D feature learning. This is enabled by a dedicated dynamic data structure which organizes the online acquired point cloud with global-local trees. Globally, we compile the online reconstructed 3D points into an incrementally growing coordinate interval tree, enabling fast point insertion and neighborhood query. Locally, we maintain the neighborhood information for each point using an octree whose construction benefits from the fast query of the global tree. Both levels of trees update dynamically and help the 3D convolution effectively exploit the temporal coherence for effective information fusion across RGB-D frames.
Tasks Scene Segmentation
Published 2020-03-13
URL https://arxiv.org/abs/2003.06233v2
PDF https://arxiv.org/pdf/2003.06233v2.pdf
PWC https://paperswithcode.com/paper/fusion-aware-point-convolution-for-online
Repo https://github.com/jzhzhang/FusionAwareConv
Framework pytorch

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

Title Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
Authors Abduallah Mohamed, Kun Qian, Mohamed Elhoseiny, Christian Claudel
Abstract Better machine understanding of pedestrian behaviors enables faster progress in modeling interactions between agents such as autonomous vehicles and humans. Pedestrian trajectories are not only influenced by the pedestrian itself but also by interaction with surrounding objects. Previous methods modeled these interactions by using a variety of aggregation methods that integrate different learned pedestrians states. We propose the Social Spatio-Temporal Graph Convolutional Neural Network (Social-STGCNN), which substitutes the need of aggregation methods by modeling the interactions as a graph. Our results show an improvement over the state of art by 20% on the Final Displacement Error (FDE) and an improvement on the Average Displacement Error (ADE) with 8.5 times less parameters and up to 48 times faster inference speed than previously reported methods. In addition, our model is data efficient, and exceeds previous state of the art on the ADE metric with only 20% of the training data. We propose a kernel function to embed the social interactions between pedestrians within the adjacency matrix. Through qualitative analysis, we show that our model inherited social behaviors that can be expected between pedestrians trajectories. Code is available at https://github.com/abduallahmohamed/Social-STGCNN.
Tasks Autonomous Vehicles, Trajectory Prediction
Published 2020-02-27
URL https://arxiv.org/abs/2002.11927v3
PDF https://arxiv.org/pdf/2002.11927v3.pdf
PWC https://paperswithcode.com/paper/social-stgcnn-a-social-spatio-temporal-graph
Repo https://github.com/abduallahmohamed/Social-STGCNN
Framework pytorch
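
The abstract reports results in terms of ADE and FDE without defining them. The sketch below uses the definitions common in the trajectory prediction literature: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final step. The function name and the 2D point format are assumptions for illustration, and the paper's evaluation protocol (for example, how multiple sampled trajectories are scored) is not covered here.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given as a list of (x, y) points."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each time step.
    distances = [
        math.hypot(p[0] - g[0], p[1] - g[1])
        for p, g in zip(predicted, ground_truth)
    ]
    ade = sum(distances) / len(distances)  # average over all time steps
    fde = distances[-1]                    # error at the final time step only
    return ade, fde


# A prediction that drifts away from the true path by one unit per step.
predicted_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
true_path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted_path, true_path)
```

In this toy case the per-step errors are 0, 1, and 2, so FDE exceeds ADE, which is typical when prediction error accumulates over the horizon.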

Knowledge-aware Attention Network for Protein-Protein Interaction Extraction

Title Knowledge-aware Attention Network for Protein-Protein Interaction Extraction
Authors Huiwei Zhou, Zhuang Liu, Shixian Ning, Chengkun Lang, Yingyu Lin, Lei Du
Abstract Protein-protein interaction (PPI) extraction from published scientific literature provides additional support for precision medicine efforts. However, many of the current PPI extraction methods need extensive feature engineering and cannot make full use of the prior knowledge in knowledge bases (KB). KBs contain huge amounts of structured information about entities and relationships, therefore plays a pivotal role in PPI extraction. This paper proposes a knowledge-aware attention network (KAN) to fuse prior knowledge about protein-protein pairs and context information for PPI extraction. The proposed model first adopts a diagonal-disabled multi-head attention mechanism to encode context sequence along with knowledge representations learned from KB. Then a novel multi-dimensional attention mechanism is used to select the features that can best describe the encoded context. Experiment results on the BioCreative VI PPI dataset show that the proposed approach could acquire knowledge-aware dependencies between different words in a sequence and lead to a new state-of-the-art performance.
Tasks Feature Engineering
Published 2020-01-07
URL https://arxiv.org/abs/2001.02091v1
PDF https://arxiv.org/pdf/2001.02091v1.pdf
PWC https://paperswithcode.com/paper/knowledge-aware-attention-network-for-protein
Repo https://github.com/zhuango/KAN
Framework pytorch

Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry

Title Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry
Authors Guangming Wang, Chi Zhang, Hesheng Wang, Jingchuan Wang, Yong Wang, Xinlei Wang
Abstract In autonomous driving, monocular sequences contain lots of information. Monocular depth estimation, camera ego-motion estimation and optical flow estimation in consecutive frames are high-profile concerns recently. By analyzing tasks above, pixels in the first frame are modeled into three parts: the rigid region, the non-rigid region, and the occluded region. In joint unsupervised training of depth and pose, we can segment the occluded region explicitly. The occlusion information is used in unsupervised learning of depth, pose and optical flow, as the image reconstructed by depth, pose and flow will be invalid in occluded regions. A less-than-mean mask is designed to further exclude the mismatched pixels which are interfered with motion or illumination change in the training of depth and pose networks. This method is also used to exclude some trivial mismatched pixels in the training of the flow net. Maximum normalization is proposed for smoothness term of depth-pose networks to restrain degradation in textureless regions. In the occluded region, as depth and camera motion can provide more reliable motion estimation, they can be used to instruct unsupervised learning of flow. Our experiments in KITTI dataset demonstrate that the model based on three regions, full and explicit segmentation of occlusion, rigid region and non-rigid region with corresponding unsupervised losses can improve performance on three tasks significantly.
Tasks Autonomous Driving, Depth And Camera Motion, Depth Estimation, Monocular Depth Estimation, Motion Estimation, Optical Flow Estimation
Published 2020-03-02
URL https://arxiv.org/abs/2003.00766v1
PDF https://arxiv.org/pdf/2003.00766v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-of-depth-optical-flow-1
Repo https://github.com/guangmingw/DOPlearning
Framework pytorch

Image Matching across Wide Baselines: From Paper to Practice

Title Image Matching across Wide Baselines: From Paper to Practice
Authors Yuhe Jin, Dmytro Mishkin, Anastasiia Mishchuk, Jiri Matas, Pascal Fua, Kwang Moo Yi, Eduard Trulls
Abstract We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the downstream task – the accuracy of the reconstructed camera pose – as our primary metric. Our pipeline’s modular structure allows us to easily integrate, configure, and combine methods and heuristics. We demonstrate this by embedding dozens of popular algorithms and evaluating them, from seminal works to the cutting edge of machine learning research. We show that with proper settings, classical solutions may still outperform the perceived state of the art. Besides establishing the actual state of the art, the experiments conducted in this paper reveal unexpected properties of SfM pipelines that can be exploited to help improve their performance, for both algorithmic and learned methods. Data and code are online https://github.com/vcg-uvic/image-matching-benchmark, providing an easy-to-use and flexible framework for the benchmarking of local feature and robust estimation methods, both alongside and against top-performing methods. This work provides the basis for an open challenge on wide-baseline image matching https://vision.uvic.ca/image-matching-challenge .
Published 2020-03-03
URL https://arxiv.org/abs/2003.01587v1
PDF https://arxiv.org/pdf/2003.01587v1.pdf
PWC https://paperswithcode.com/paper/image-matching-across-wide-baselines-from
Repo https://github.com/vcg-uvic/image-matching-benchmark
Framework none

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

Title The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Authors Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao
Abstract We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
Tasks Multi-Task Learning, Structured Prediction
Published 2020-02-19
URL https://arxiv.org/abs/2002.07972v1
PDF https://arxiv.org/pdf/2002.07972v1.pdf
PWC https://paperswithcode.com/paper/the-microsoft-toolkit-of-multi-task-deep
Repo https://github.com/namisan/mt-dnn
Framework pytorch

Fully Differentiable Procedural Content Generation through Generative Playing Networks

Title Fully Differentiable Procedural Content Generation through Generative Playing Networks
Authors Philip Bontrager, Julian Togelius
Abstract To procedurally create interactive content such as environments or game levels, we need agents that can evaluate the content; but to train such agents, we need content they can train on. Generative Playing Networks is a framework that learns agent policies and generates environments in tandem through a symbiotic process. Policies are learned using an actor-critic reinforcement learning algorithm so as to master the environment, and environments are created by a generator network which tries to provide an appropriate level of challenge for the agent. This is accomplished by the generator learning to make content based on estimates by the critic. Thus, this process provides an implicit curriculum for the agent, creating more complex environments over time. Unlike previous approaches to procedural content generation, Generative Playing Networks is end-to-end differentiable and does not require human-designed examples or domain knowledge. We demonstrate the capability of this framework by training an agent and level generator for a 2D dungeon crawler game.
Published 2020-02-12
URL https://arxiv.org/abs/2002.05259v1
PDF https://arxiv.org/pdf/2002.05259v1.pdf
PWC https://paperswithcode.com/paper/fully-differentiable-procedural-content
Repo https://github.com/pbontrager/GenerativePlayingNetworks
Framework pytorch

Operationally meaningful representations of physical systems in neural networks

Title Operationally meaningful representations of physical systems in neural networks
Authors Hendrik Poulsen Nautrup, Tony Metger, Raban Iten, Sofiene Jerbi, Lea M. Trenkwalder, Henrik Wilming, Hans J. Briegel, Renato Renner
Abstract To make progress in science, we often build abstract representations of physical systems that meaningfully encode information about the systems. The representations learnt by most current machine learning techniques reflect statistical structure present in the training data; however, these methods do not allow us to specify explicit and operationally meaningful requirements on the representation. Here, we present a neural network architecture based on the notion that agents dealing with different aspects of a physical system should be able to communicate relevant information as efficiently as possible to one another. This produces representations that separate different parameters which are useful for making statements about the physical system in different experimental settings. We present examples involving both classical and quantum physics. For instance, our architecture finds a compact representation of an arbitrary two-qubit system that separates local parameters from parameters describing quantum correlations. We further show that this method can be combined with reinforcement learning to enable representation learning within interactive scenarios where agents need to explore experimental settings to identify relevant variables.
Tasks Representation Learning
Published 2020-01-02
URL https://arxiv.org/abs/2001.00593v1
PDF https://arxiv.org/pdf/2001.00593v1.pdf
PWC https://paperswithcode.com/paper/operationally-meaningful-representations-of
Repo https://github.com/tonymetger/communicating_scinet
Framework tf