Paper Group AWR 26
Complexity Guarantees for Polyak Steps with Momentum. Supervised Learning on Relational Databases with Graph Neural Networks. Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study. Synthesizing human-like sketches from natural images using a conditional convolutional decoder. TaskNorm: Rethinking Batch Normalization for Meta-Learning. Rethinking Image Mixture for Unsupervised Visual Representation Learning. Audio inpainting with generative adversarial network. Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation. Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction. Knowledge-aware Attention Network for Protein-Protein Interaction Extraction. Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry. Image Matching across Wide Baselines: From Paper to Practice. The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding. Fully Differentiable Procedural Content Generation through Generative Playing Networks. Operationally meaningful representations of physical systems in neural networks.
Complexity Guarantees for Polyak Steps with Momentum
Title | Complexity Guarantees for Polyak Steps with Momentum |
Authors | Mathieu Barré, Adrien Taylor, Alexandre d’Aspremont |
Abstract | In smooth strongly convex optimization, or in the presence of Hölderian error bounds, knowledge of the curvature parameter is critical for obtaining simple methods with accelerated rates. In this work, we study a class of methods, based on Polyak steps, where this knowledge is substituted by that of the optimal value, $f_*$. We first show slightly improved convergence bounds compared to those previously known for the classical case of simple gradient descent with Polyak steps; we then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00915v1 |
PDF | https://arxiv.org/pdf/2002.00915v1.pdf |
PWC | https://paperswithcode.com/paper/complexity-guarantees-for-polyak-steps-with |
Repo | https://github.com/mathbarre/PerformanceEstimationPolyakSteps |
Framework | none |
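The classical Polyak step size that this paper builds on replaces the curvature parameter with knowledge of the optimal value $f_*$: $\gamma_k = (f(x_k) - f_*)/\|\nabla f(x_k)\|^2$. Below is a minimal NumPy sketch of plain gradient descent with this step rule; the paper's accelerated variant additionally adds a momentum term, which is omitted here.

```python
# Gradient descent with the classical Polyak step size, assuming f_star
# is known. A minimal sketch, not the paper's accelerated method.
import numpy as np

def polyak_gradient_descent(f, grad, x0, f_star, n_iters=100):
    """Gradient descent with step gamma_k = (f(x_k) - f_star) / ||g_k||^2."""
    x = x0.astype(float)
    for _ in range(n_iters):
        g = grad(x)
        gnorm2 = np.dot(g, g)
        if gnorm2 == 0.0:  # already stationary
            break
        gamma = (f(x) - f_star) / gnorm2  # Polyak step
        x = x - gamma * g
    return x

# Toy usage on a strongly convex quadratic with known f_star = 0.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
x_opt = polyak_gradient_descent(f, grad, np.array([3.0, -2.0]), f_star=0.0)
```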
Supervised Learning on Relational Databases with Graph Neural Networks
Title | Supervised Learning on Relational Databases with Graph Neural Networks |
Authors | Milan Cvitkovic |
Abstract | The majority of data scientists and machine learning practitioners use relational data in their work [State of ML and Data Science 2017, Kaggle, Inc.]. But training machine learning models on data stored in relational databases requires significant data extraction and feature engineering efforts. These efforts are not only costly, but they also destroy potentially important relational structure in the data. We introduce a method that uses Graph Neural Networks to overcome these challenges. Our proposed method outperforms state-of-the-art automatic feature engineering methods on two out of three datasets. |
Tasks | Feature Engineering |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02046v1 |
PDF | https://arxiv.org/pdf/2002.02046v1.pdf |
PWC | https://paperswithcode.com/paper/supervised-learning-on-relational-databases |
Repo | https://github.com/mwcvitkovic/Supervised-Learning-on-Relational-Databases-with-GNNs |
Framework | none |
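A hedged sketch of the data-preparation idea the abstract describes: rows become nodes and foreign-key references become edges, so a GNN can consume the relational structure directly instead of hand-engineered features. The table and column names below are hypothetical.

```python
# Turn relational rows into a graph: nodes carry row attributes, edges
# follow foreign keys. Hypothetical "orders"/"customers" schema.
import networkx as nx

def build_row_graph(orders, customers):
    g = nx.Graph()
    for c in customers:                      # one node per customer row
        g.add_node(("customer", c["id"]), **c)
    for o in orders:                         # one node per order row
        g.add_node(("order", o["id"]), **o)
        # foreign key order.customer_id -> customer.id becomes an edge
        g.add_edge(("order", o["id"]), ("customer", o["customer_id"]))
    return g

graph = build_row_graph(
    orders=[{"id": 1, "customer_id": 7, "amount": 12.5}],
    customers=[{"id": 7, "region": "EU"}],
)
```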
Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study
Title | Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study |
Authors | Ke Li, Zilin Xiang, Tao Chen, Shuo Wang, Kay Chen Tan |
Abstract | Data-driven defect prediction has become increasingly important in the software engineering process. Since it is not uncommon that the data from a software project are insufficient for training a reliable defect prediction model, transfer learning, which borrows data/knowledge from other projects to facilitate model building for the current project, namely cross-project defect prediction (CPDP), is a natural fit. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned for optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding of the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 have not been explored before. We build defect prediction models over 20 real-world software projects of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77% of the CPDP techniques at a manageable computational cost, so more effort on this aspect is warranted in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP: given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of the transfer learning algorithms. (3) The research on CPDP is far from mature, as it is “not difficult” to find a better alternative by combining existing transfer learning and classification techniques. This finding provides important insights for the future design of CPDP techniques. |
Tasks | Transfer Learning |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03148v1 |
PDF | https://arxiv.org/pdf/2002.03148v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-the-automated-parameter |
Repo | https://github.com/COLA-Laboratory/icse2020 |
Framework | none |
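To make the study's setup concrete, here is a hedged sketch of joint parameter optimization over both a transfer-learning step and a classifier, using simple random search. The parameter spaces and the `evaluate` function are hypothetical stand-ins; the paper evaluates far richer configuration spaces with dedicated optimizers.

```python
# Random search over the joint (transfer learning, classifier) parameter
# space, as opposed to tuning the classifier alone. Spaces are illustrative.
import random

transfer_space = {"n_neighbors": [5, 10, 20]}   # e.g. instance selection
classifier_space = {"C": [0.1, 1.0, 10.0]}      # e.g. SVM regularization

def random_search(evaluate, n_trials=30, seed=0):
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {
            "transfer": {k: rng.choice(v) for k, v in transfer_space.items()},
            "classifier": {k: rng.choice(v) for k, v in classifier_space.items()},
        }
        score = evaluate(cfg)  # e.g. AUC of the defect predictor under cfg
        if score > best_score:
            best, best_score = cfg, score
    return best, best_score

# Toy usage with a dummy evaluation function.
best_cfg, best_auc = random_search(lambda cfg: -abs(cfg["classifier"]["C"] - 1.0))
```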
Synthesizing human-like sketches from natural images using a conditional convolutional decoder
Title | Synthesizing human-like sketches from natural images using a conditional convolutional decoder |
Authors | Moritz Kampelmühler, Axel Pinz |
Abstract | Humans are able to precisely communicate diverse concepts by employing sketches, a highly reduced and abstract, shape-based representation of visual content. We propose, for the first time, a fully convolutional end-to-end architecture that is able to synthesize human-like sketches of objects in natural images with potentially cluttered backgrounds. To enable the architecture to learn this highly abstract mapping, we employ the following key components: (1) a fully convolutional encoder-decoder structure, (2) a perceptual similarity loss function operating in an abstract feature space, and (3) conditioning of the decoder on the label of the object to be sketched. Given the combination of these architectural concepts, we can train our architecture end-to-end in a supervised fashion on a collection of sketch-image pairs. The sketches generated by our architecture can be classified with 85.6% Top-5 accuracy, and we verify their visual quality via a user study. We find that deep features as a perceptual similarity metric enable image translation across large domain gaps, and our findings further show that convolutional neural networks trained on image classification tasks implicitly learn to encode shape information. Code is available at https://github.com/kampelmuehler/synthesizing_human_like_sketches |
Tasks | Image Classification |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07101v1 |
PDF | https://arxiv.org/pdf/2003.07101v1.pdf |
PWC | https://paperswithcode.com/paper/synthesizing-human-like-sketches-from-natural |
Repo | https://github.com/kampelmuehler/synthesizing_human_like_sketches |
Framework | pytorch |
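A hedged PyTorch sketch of key component (2), the perceptual similarity loss: generated and ground-truth sketches are compared in the feature space of a network pretrained on image classification rather than in pixel space. The choice of VGG16 and the layer cutoff are assumptions here, not necessarily the paper's exact configuration.

```python
# Perceptual loss: MSE between frozen pretrained-CNN features of the
# generated and target images. ImageNet input normalization is omitted
# for brevity; VGG16 and the layer cutoff are assumptions.
import torch
import torchvision

class PerceptualLoss(torch.nn.Module):
    def __init__(self, n_layers=16):
        super().__init__()
        vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1")
        self.features = vgg.features[:n_layers].eval()
        for p in self.features.parameters():  # frozen feature extractor
            p.requires_grad_(False)

    def forward(self, generated, target):
        return torch.nn.functional.mse_loss(
            self.features(generated), self.features(target)
        )

loss_fn = PerceptualLoss()
loss = loss_fn(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224))
```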
TaskNorm: Rethinking Batch Normalization for Meta-Learning
Title | TaskNorm: Rethinking Batch Normalization for Meta-Learning |
Authors | John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner |
Abstract | Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines. However, the hierarchical nature of the meta-learning setting presents several challenges that can render conventional batch normalization ineffective, giving rise to the need to rethink normalization in this setting. We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm. Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time for both gradient-based and gradient-free meta-learning approaches. Importantly, TaskNorm is found to consistently improve performance. Finally, we provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms. |
Tasks | Image Classification, Meta-Learning |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03284v1 |
PDF | https://arxiv.org/pdf/2003.03284v1.pdf |
PWC | https://paperswithcode.com/paper/tasknorm-rethinking-batch-normalization-for |
Repo | https://github.com/cambridge-mlg/cnaps |
Framework | pytorch |
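A simplified sketch of the task-normalization idea: normalization moments come from the task's context (support) set, blended with per-instance moments, so behavior at test time does not depend on the query batch. The blending rule below (a single learned sigmoid weight) is a hedged simplification of the paper's scheme.

```python
# Task-level normalization sketch: blend context-set moments with
# instance-norm moments via a learned weight. A simplification, not
# the paper's exact TaskNorm formulation.
import torch

class SimpleTaskNorm(torch.nn.Module):
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.alpha_logit = torch.nn.Parameter(torch.zeros(1))
        self.gamma = torch.nn.Parameter(torch.ones(1, num_features, 1, 1))
        self.beta = torch.nn.Parameter(torch.zeros(1, num_features, 1, 1))
        self.eps = eps

    def forward(self, x, context):
        # context: the task's support images, shape (Nc, C, H, W)
        alpha = torch.sigmoid(self.alpha_logit)
        ctx_mean = context.mean(dim=(0, 2, 3), keepdim=True)
        ctx_var = context.var(dim=(0, 2, 3), keepdim=True, unbiased=False)
        inst_mean = x.mean(dim=(2, 3), keepdim=True)  # instance-norm moments
        inst_var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        mean = alpha * ctx_mean + (1 - alpha) * inst_mean
        var = alpha * ctx_var + (1 - alpha) * inst_var
        return self.gamma * (x - mean) / torch.sqrt(var + self.eps) + self.beta

norm = SimpleTaskNorm(num_features=8)
out = norm(torch.randn(4, 8, 5, 5), context=torch.randn(10, 8, 5, 5))
```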
Rethinking Image Mixture for Unsupervised Visual Representation Learning
Title | Rethinking Image Mixture for Unsupervised Visual Representation Learning |
Authors | Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell |
Abstract | In supervised learning, smoothing the label/prediction distribution during neural network training has been proven useful in preventing the model from becoming over-confident, and is crucial for learning more robust visual representations. This observation motivates us to explore ways to flatten predictions in unsupervised learning. Since human-annotated labels are not available in unsupervised learning, we introduce a straightforward approach that perturbs the input image space in order to indirectly soften the output prediction space. Despite its conceptual simplicity, we show empirically that with this simple solution, image mixture, we can learn more robust visual representations from the transformed input, and the benefits of the representations learned in this space carry over to linear classification and downstream tasks. |
Tasks | Representation Learning |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05438v1 |
PDF | https://arxiv.org/pdf/2003.05438v1.pdf |
PWC | https://paperswithcode.com/paper/rethinking-image-mixture-for-unsupervised |
Repo | https://github.com/szq0214/Rethinking-Image-Mixture-for-Unsupervised-Learning |
Framework | pytorch |
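The input-space perturbation described above can be sketched in a few lines: two training images are mixed convexly (mixup-style), which indirectly softens the prediction space. Sampling the mixing weight from a Beta distribution is a common convention and an assumption here.

```python
# Convex image mixture (mixup-style): returns the blended batch and the
# mixing coefficient. Beta-distributed weights are an assumption.
import torch

def image_mixture(x1, x2, alpha=1.0):
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * x1 + (1.0 - lam) * x2, lam

mixed, lam = image_mixture(torch.rand(8, 3, 32, 32), torch.rand(8, 3, 32, 32))
```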
Audio inpainting with generative adversarial network
Title | Audio inpainting with generative adversarial network |
Authors | P. P. Ebner, A. Eltelt |
Abstract | We study the ability of Wasserstein Generative Adversarial Networks (WGANs) to generate missing audio content that is statistically consistent with the sound at the neighboring borders. We address the challenge of inpainting long-range gaps (500 ms) in audio using WGAN models. We improved the quality of the inpainted segment with a newly proposed WGAN architecture that uses both short-range and long-range neighboring borders, compared to the classical WGAN model. Performance was compared on two different instruments (piano and guitar) and on recordings of virtuoso pianists accompanied by a string orchestra. The objective difference grade (ODG) was used to evaluate the performance of both architectures. The proposed model outperforms the classical WGAN model and improves the reconstruction of high-frequency content. Further, we obtained better results for instruments whose frequency spectrum lies mainly in the lower range, where small noises are less annoying to the human ear and the inpainted segment is more perceptually acceptable. Finally, we show that for recordings in which a particular instrument is accompanied by other instruments, better test results are reached by training the network only on that particular instrument and neglecting the others. |
Tasks | |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.07704v1 |
PDF | https://arxiv.org/pdf/2003.07704v1.pdf |
PWC | https://paperswithcode.com/paper/audio-inpainting-with-generative-adversarial |
Repo | https://github.com/nperraud/gan_audio_inpainting |
Framework | tf |
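A hedged NumPy sketch of the conditioning described above: for a long gap (e.g. 500 ms), the model sees short-range borders immediately around the gap plus longer-range context further out. The window lengths are illustrative assumptions.

```python
# Extract short-range and long-range border context around a gap in a
# waveform. Window lengths are illustrative, not the paper's values.
import numpy as np

def gap_contexts(signal, gap_start, gap_len, short_len, long_len):
    """Return (left_short, right_short, left_long, right_long) borders."""
    gap_end = gap_start + gap_len
    left_short = signal[gap_start - short_len:gap_start]
    right_short = signal[gap_end:gap_end + short_len]
    left_long = signal[gap_start - long_len:gap_start]
    right_long = signal[gap_end:gap_end + long_len]
    return left_short, right_short, left_long, right_long

sr = 16000
audio = np.random.randn(10 * sr)                    # 10 s of toy audio
ctx = gap_contexts(audio, gap_start=5 * sr, gap_len=sr // 2,  # 500 ms gap
                   short_len=sr // 4, long_len=sr)
```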
Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
Title | Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation |
Authors | Jiazhao Zhang, Chenyang Zhu, Lintao Zheng, Kai Xu |
Abstract | Online semantic 3D segmentation in tandem with real-time RGB-D reconstruction poses special challenges, such as how to perform 3D convolution directly over progressively fused 3D geometric data and how to smartly fuse information from frame to frame. We propose a novel fusion-aware 3D point convolution that operates directly on the geometric surface being reconstructed and effectively exploits inter-frame correlation for high-quality 3D feature learning. This is enabled by a dedicated dynamic data structure that organizes the online-acquired point cloud with global-local trees. Globally, we compile the online-reconstructed 3D points into an incrementally growing coordinate interval tree, enabling fast point insertion and neighborhood queries. Locally, we maintain the neighborhood information for each point using an octree whose construction benefits from the fast queries of the global tree. Both levels of trees update dynamically and help the 3D convolution effectively exploit temporal coherence for information fusion across RGB-D frames. |
Tasks | Scene Segmentation |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.06233v2 |
PDF | https://arxiv.org/pdf/2003.06233v2.pdf |
PWC | https://paperswithcode.com/paper/fusion-aware-point-convolution-for-online |
Repo | https://github.com/jzhzhang/FusionAwareConv |
Framework | pytorch |
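A heavily simplified sketch of the global structure's role: support fast incremental point insertion and neighborhood queries as the reconstruction grows. A sorted list keyed on one coordinate stands in for the paper's coordinate interval tree, and the per-point octrees are omitted entirely.

```python
# Incremental point index with fast insertion and range queries, as a
# drastic stand-in for the paper's coordinate interval tree.
import bisect

class IncrementalPointIndex:
    def __init__(self):
        self.keys = []    # x-coordinates, kept sorted
        self.points = []  # (x, y, z) tuples, aligned with self.keys

    def insert(self, p):
        i = bisect.bisect_left(self.keys, p[0])
        self.keys.insert(i, p[0])
        self.points.insert(i, p)

    def query_slab(self, x_min, x_max):
        """All points whose x-coordinate lies in [x_min, x_max]."""
        lo = bisect.bisect_left(self.keys, x_min)
        hi = bisect.bisect_right(self.keys, x_max)
        return self.points[lo:hi]

index = IncrementalPointIndex()
for p in [(0.1, 0.0, 0.2), (0.5, 1.0, 0.3), (0.9, 0.2, 0.1)]:
    index.insert(p)
near = index.query_slab(0.0, 0.6)  # candidate neighbors by x-range
```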
Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
Title | Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction |
Authors | Abduallah Mohamed, Kun Qian, Mohamed Elhoseiny, Christian Claudel |
Abstract | Better machine understanding of pedestrian behaviors enables faster progress in modeling interactions between agents such as autonomous vehicles and humans. Pedestrian trajectories are influenced not only by the pedestrian itself but also by interactions with surrounding objects. Previous methods modeled these interactions using a variety of aggregation schemes that integrate different learned pedestrian states. We propose the Social Spatio-Temporal Graph Convolutional Neural Network (Social-STGCNN), which eliminates the need for aggregation methods by modeling the interactions as a graph. Our results show an improvement over the state of the art of 20% on the Final Displacement Error (FDE) and an improvement on the Average Displacement Error (ADE), with 8.5 times fewer parameters and up to 48 times faster inference than previously reported methods. In addition, our model is data-efficient and exceeds the previous state of the art on the ADE metric with only 20% of the training data. We propose a kernel function to embed the social interactions between pedestrians within the adjacency matrix. Through qualitative analysis, we show that our model captures the social behaviors expected between pedestrian trajectories. Code is available at https://github.com/abduallahmohamed/Social-STGCNN. |
Tasks | Autonomous Vehicles, Trajectory Prediction |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11927v3 |
PDF | https://arxiv.org/pdf/2002.11927v3.pdf |
PWC | https://paperswithcode.com/paper/social-stgcnn-a-social-spatio-temporal-graph |
Repo | https://github.com/abduallahmohamed/Social-STGCNN |
Framework | pytorch |
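A NumPy sketch of the kernel idea mentioned above: pairwise social influence is embedded directly in a weighted adjacency matrix. The inverse Euclidean distance kernel below is illustrative; treat the exact kernel as the paper's design choice.

```python
# Weighted adjacency from pedestrian positions: closer pedestrians get
# larger influence. Inverse-distance kernel is an illustrative choice.
import numpy as np

def social_adjacency(positions, eps=1e-8):
    """positions: (N, 2) pedestrian locations at one time step."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    a = 1.0 / (dist + eps)
    np.fill_diagonal(a, 0.0)  # no self-influence
    return a

A = social_adjacency(np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]))
```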
Knowledge-aware Attention Network for Protein-Protein Interaction Extraction
Title | Knowledge-aware Attention Network for Protein-Protein Interaction Extraction |
Authors | Huiwei Zhou, Zhuang Liu, Shixian Ning, Chengkun Lang, Yingyu Lin, Lei Du |
Abstract | Protein-protein interaction (PPI) extraction from the published scientific literature provides additional support for precision medicine efforts. However, many current PPI extraction methods require extensive feature engineering and cannot make full use of the prior knowledge in knowledge bases (KBs). KBs contain huge amounts of structured information about entities and relationships and therefore play a pivotal role in PPI extraction. This paper proposes a knowledge-aware attention network (KAN) that fuses prior knowledge about protein-protein pairs with context information for PPI extraction. The proposed model first adopts a diagonal-disabled multi-head attention mechanism to encode the context sequence along with knowledge representations learned from the KB. Then a novel multi-dimensional attention mechanism is used to select the features that best describe the encoded context. Experimental results on the BioCreative VI PPI dataset show that the proposed approach acquires knowledge-aware dependencies between different words in a sequence and achieves a new state-of-the-art performance. |
Tasks | Feature Engineering |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.02091v1 |
PDF | https://arxiv.org/pdf/2001.02091v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-aware-attention-network-for-protein |
Repo | https://github.com/zhuango/KAN |
Framework | pytorch |
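A hedged PyTorch sketch of the diagonal-disabled attention mentioned above: each token attends to every position except itself, so its representation must be built from context (and, in the paper, from knowledge embeddings) rather than from its own embedding.

```python
# Multi-head self-attention with the diagonal disabled: a boolean mask
# forbids each position from attending to itself.
import torch

seq_len, dim, heads = 6, 32, 4
attn = torch.nn.MultiheadAttention(dim, heads, batch_first=True)

x = torch.randn(2, seq_len, dim)
diag_mask = torch.eye(seq_len, dtype=torch.bool)  # True = disallowed position
out, weights = attn(x, x, x, attn_mask=diag_mask)

# Attention weights on the diagonal are exactly zero.
assert torch.allclose(weights.diagonal(dim1=-2, dim2=-1),
                      torch.zeros(2, seq_len), atol=1e-6)
```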
Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry
Title | Unsupervised Learning of Depth, Optical Flow and Pose with Occlusion from 3D Geometry |
Authors | Guangming Wang, Chi Zhang, Hesheng Wang, Jingchuan Wang, Yong Wang, Xinlei Wang |
Abstract | In autonomous driving, monocular sequences contain a wealth of information. Monocular depth estimation, camera ego-motion estimation, and optical flow estimation across consecutive frames have recently attracted considerable attention. By analyzing these tasks, pixels in the first frame are modeled as three parts: the rigid region, the non-rigid region, and the occluded region. In joint unsupervised training of depth and pose, the occluded region can be segmented explicitly. The occlusion information is used in unsupervised learning of depth, pose, and optical flow, as the image reconstructed from depth, pose, and flow is invalid in occluded regions. A less-than-mean mask is designed to further exclude mismatched pixels caused by motion or illumination changes in the training of the depth and pose networks. This method is also used to exclude some trivially mismatched pixels in the training of the flow net. Maximum normalization is proposed for the smoothness term of the depth-pose networks to restrain degradation in textureless regions. In the occluded region, as depth and camera motion can provide more reliable motion estimates, they are used to guide the unsupervised learning of flow. Our experiments on the KITTI dataset demonstrate that the model based on three regions, with full and explicit segmentation of the occluded, rigid, and non-rigid regions and corresponding unsupervised losses, significantly improves performance on all three tasks. |
Tasks | Autonomous Driving, Depth And Camera Motion, Depth Estimation, Monocular Depth Estimation, Motion Estimation, Optical Flow Estimation |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00766v1 |
PDF | https://arxiv.org/pdf/2003.00766v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-depth-optical-flow-1 |
Repo | https://github.com/guangmingw/DOPlearning |
Framework | pytorch |
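A minimal sketch of the less-than-mean mask described above: pixels whose photometric reconstruction error exceeds the mean error are treated as mismatched (e.g. moving objects or illumination changes) and excluded from the loss. Computing the mean per image is an assumption about the granularity.

```python
# Less-than-mean mask: keep only pixels whose reconstruction error is
# below the per-image mean error. Granularity is an assumption.
import torch

def less_than_mean_mask(error):
    """error: (B, 1, H, W) per-pixel reconstruction error."""
    mean = error.mean(dim=(2, 3), keepdim=True)
    return (error < mean).float()  # 1 = keep pixel, 0 = exclude

err = torch.rand(2, 1, 4, 4)
mask = less_than_mean_mask(err)
masked_loss = (err * mask).sum() / mask.sum()
```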
Image Matching across Wide Baselines: From Paper to Practice
Title | Image Matching across Wide Baselines: From Paper to Practice |
Authors | Yuhe Jin, Dmytro Mishkin, Anastasiia Mishchuk, Jiri Matas, Pascal Fua, Kwang Moo Yi, Eduard Trulls |
Abstract | We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the accuracy of the reconstructed camera pose, the downstream task, as our primary metric. Our pipeline’s modular structure allows us to easily integrate, configure, and combine methods and heuristics. We demonstrate this by embedding dozens of popular algorithms and evaluating them, from seminal works to the cutting edge of machine learning research. We show that with proper settings, classical solutions may still outperform the perceived state of the art. Besides establishing the actual state of the art, the experiments conducted in this paper reveal unexpected properties of SfM pipelines that can be exploited to help improve their performance, for both algorithmic and learned methods. Data and code are available online at https://github.com/vcg-uvic/image-matching-benchmark, providing an easy-to-use and flexible framework for benchmarking local feature and robust estimation methods, both alongside and against top-performing methods. This work provides the basis for an open challenge on wide-baseline image matching: https://vision.uvic.ca/image-matching-challenge |
Tasks | |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01587v1 |
PDF | https://arxiv.org/pdf/2003.01587v1.pdf |
PWC | https://paperswithcode.com/paper/image-matching-across-wide-baselines-from |
Repo | https://github.com/vcg-uvic/image-matching-benchmark |
Framework | none |
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Title | The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding |
Authors | Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao |
Abstract | We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn. |
Tasks | Multi-Task Learning, Structured Prediction |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.07972v1 |
PDF | https://arxiv.org/pdf/2002.07972v1.pdf |
PWC | https://paperswithcode.com/paper/the-microsoft-toolkit-of-multi-task-deep |
Repo | https://github.com/namisan/mt-dnn |
Framework | pytorch |
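A hedged sketch of the multi-task knowledge distillation that MT-DNN supports for compression: a compact student is trained against a larger teacher's softened outputs, mixed with the hard-label loss. This is the generic distillation formulation, not MT-DNN's actual API.

```python
# Generic knowledge-distillation loss: temperature-softened KL term from
# the teacher plus the usual hard-label cross-entropy. T and w are
# illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, w=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return w * soft + (1.0 - w) * hard

loss = distillation_loss(torch.randn(8, 3), torch.randn(8, 3),
                         torch.randint(0, 3, (8,)))
```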
Fully Differentiable Procedural Content Generation through Generative Playing Networks
Title | Fully Differentiable Procedural Content Generation through Generative Playing Networks |
Authors | Philip Bontrager, Julian Togelius |
Abstract | To procedurally create interactive content such as environments or game levels, we need agents that can evaluate the content; but to train such agents, we need content they can train on. Generative Playing Networks is a framework that learns agent policies and generates environments in tandem through a symbiotic process. Policies are learned using an actor-critic reinforcement learning algorithm so as to master the environment, and environments are created by a generator network which tries to provide an appropriate level of challenge for the agent. This is accomplished by the generator learning to make content based on estimates by the critic. Thus, this process provides an implicit curriculum for the agent, creating more complex environments over time. Unlike previous approaches to procedural content generation, Generative Playing Networks is end-to-end differentiable and does not require human-designed examples or domain knowledge. We demonstrate the capability of this framework by training an agent and level generator for a 2D dungeon crawler game. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05259v1 |
PDF | https://arxiv.org/pdf/2002.05259v1.pdf |
PWC | https://paperswithcode.com/paper/fully-differentiable-procedural-content |
Repo | https://github.com/pbontrager/GenerativePlayingNetworks |
Framework | pytorch |
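A heavily simplified sketch of the symbiotic loop described above: the generator is updated through the agent critic's value estimate of the levels it produces, steering content toward a target challenge level. Network shapes, the target value, and the squared-error objective are illustrative assumptions, and the agent's own actor-critic update is omitted.

```python
# Generator update driven by the critic's (differentiable) value estimate
# of generated levels. Toy networks; the RL agent's training is omitted.
import torch

latent_dim, level_dim = 16, 64
generator = torch.nn.Sequential(torch.nn.Linear(latent_dim, 128),
                                torch.nn.ReLU(),
                                torch.nn.Linear(128, level_dim))
critic = torch.nn.Sequential(torch.nn.Linear(level_dim, 128),
                             torch.nn.ReLU(),
                             torch.nn.Linear(128, 1))
opt = torch.optim.Adam(generator.parameters(), lr=1e-4)

target_value = 0.5  # "appropriately challenging" value estimate
for step in range(100):
    z = torch.randn(32, latent_dim)
    levels = generator(z)
    values = critic(levels)  # gradient flows through the generated level
    loss = ((values - target_value) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```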
Operationally meaningful representations of physical systems in neural networks
Title | Operationally meaningful representations of physical systems in neural networks |
Authors | Hendrik Poulsen Nautrup, Tony Metger, Raban Iten, Sofiene Jerbi, Lea M. Trenkwalder, Henrik Wilming, Hans J. Briegel, Renato Renner |
Abstract | To make progress in science, we often build abstract representations of physical systems that meaningfully encode information about the systems. The representations learnt by most current machine learning techniques reflect statistical structure present in the training data; however, these methods do not allow us to specify explicit and operationally meaningful requirements on the representation. Here, we present a neural network architecture based on the notion that agents dealing with different aspects of a physical system should be able to communicate relevant information as efficiently as possible to one another. This produces representations that separate different parameters which are useful for making statements about the physical system in different experimental settings. We present examples involving both classical and quantum physics. For instance, our architecture finds a compact representation of an arbitrary two-qubit system that separates local parameters from parameters describing quantum correlations. We further show that this method can be combined with reinforcement learning to enable representation learning within interactive scenarios where agents need to explore experimental settings to identify relevant variables. |
Tasks | Representation Learning |
Published | 2020-01-02 |
URL | https://arxiv.org/abs/2001.00593v1 |
PDF | https://arxiv.org/pdf/2001.00593v1.pdf |
PWC | https://paperswithcode.com/paper/operationally-meaningful-representations-of |
Repo | https://github.com/tonymetger/communicating_scinet |
Framework | tf |