Paper Group ANR 541
Papers in this group:
- Which Surrogate Works for Empirical Performance Modelling? A Case Study with Differential Evolution
- Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning
- Assessing The Factual Accuracy of Generated Text
- Convolutional Neural Network for Convective Storm Nowcasting Using 3D Doppler Weather Radar Data
- Transfer Learning From Sound Representations For Anger Detection in Speech
- Weakly Supervised Localization Using Background Images
- Deep Reinforcement Learning for Multi-objective Optimization
- Volumetric Isosurface Rendering with Deep Learning-Based Super-Resolution
- Zero-shot Text-to-SQL Learning with Auxiliary Task
- FairSight: Visual Analytics for Fairness in Decision Making
- Getting Topology and Point Cloud Generation to Mesh
- Asynchronous Methods for Model-Based Reinforcement Learning
- A support vector regression-based multi-fidelity surrogate model
- A Hierarchy of Graph Neural Networks Based on Learnable Local Features
- A Multigrid Method for Efficiently Training Video Models
Which Surrogate Works for Empirical Performance Modelling? A Case Study with Differential Evolution
Title | Which Surrogate Works for Empirical Performance Modelling? A Case Study with Differential Evolution |
Authors | Ke Li, Zilin Xiang, Kay Chen Tan |
Abstract | It is not uncommon for meta-heuristic algorithms to contain intrinsic parameters whose optimal configuration is crucial for achieving peak performance. However, evaluating the effectiveness of a configuration is expensive, as it involves many costly runs of the target algorithm. Perhaps surprisingly, it is possible to build a cheap-to-evaluate surrogate that models the algorithm’s empirical performance as a function of its parameters. Such surrogates constitute an important building block for understanding algorithm performance, algorithm portfolio/selection, and automatic algorithm configuration. In principle, many off-the-shelf machine learning techniques can be used to build surrogates. In this paper, we take differential evolution (DE) as the baseline algorithm for a proof-of-concept study. Regression models are trained to model DE’s empirical performance given a parameter configuration. In particular, we evaluate and compare four popular regression algorithms, both in terms of how well they predict the empirical performance of a particular parameter configuration and how well they approximate the parameter-versus-performance landscape. |
Tasks | |
Published | 2019-01-30 |
URL | http://arxiv.org/abs/1901.11120v1 |
PDF | http://arxiv.org/pdf/1901.11120v1.pdf |
PWC | https://paperswithcode.com/paper/which-surrogate-works-for-empirical |
Repo | |
Framework | |
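
The paper above fits regression surrogates mapping a DE parameter configuration to measured performance. Below is a minimal sketch of that workflow; the four regressor choices, the synthetic data, and the helper `evaluate_de` are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: compare off-the-shelf regressors as surrogates of DE performance.
# The regressor choices and synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def evaluate_de(config):
    """Hypothetical stand-in for running DE with (F, CR) and returning its mean error."""
    F, CR = config
    return (F - 0.5) ** 2 + (CR - 0.9) ** 2 + rng.normal(0, 0.01)

# Sample parameter configurations (F, CR) and measure empirical performance.
X = rng.uniform([0.0, 0.0], [1.0, 1.0], size=(200, 2))
y = np.array([evaluate_de(c) for c in X])

surrogates = {
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "gaussian_process": GaussianProcessRegressor(),
    "svr": SVR(kernel="rbf"),
    "knn": KNeighborsRegressor(n_neighbors=5),
}
for name, model in surrogates.items():
    score = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: cross-validated R^2 = {score:.3f}")
```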
Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning
Title | Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning |
Authors | Mohammad Yaghini, Bogdan Kulynych, Carmela Troncoso |
Abstract | A membership inference attack (MIA) against a machine learning model enables an attacker to determine whether a given data record was part of the model’s training dataset or not. Such attacks have been shown to be practical both in centralized and federated settings, and pose a threat in many privacy-sensitive domains such as medicine or law enforcement. In the literature, the effectiveness of these attacks is invariably reported using metrics computed across the whole population. In this paper, we take a closer look at the attack’s performance across different subgroups present in the data distributions. We introduce a framework that enables us to efficiently analyze the vulnerability of machine learning models to MIA. We discover that even if the accuracy of MIA looks no better than random guessing over the whole population, subgroups are subject to disparate vulnerability, i.e., certain subgroups can be significantly more vulnerable than others. We provide a theoretical definition for MIA vulnerability which we validate empirically both on synthetic and real data. |
Tasks | Inference Attack |
Published | 2019-06-02 |
URL | https://arxiv.org/abs/1906.00389v1 |
PDF | https://arxiv.org/pdf/1906.00389v1.pdf |
PWC | https://paperswithcode.com/paper/190600389 |
Repo | |
Framework | |
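
A minimal sketch of the kind of per-subgroup analysis the abstract describes: comparing membership-inference accuracy over the whole population against accuracy within subgroups. The attack guesses and subgroup labels here are synthetic placeholders, not the paper's attack or data.

```python
# Sketch: population-level vs subgroup-level membership-inference accuracy.
# The attack guesses and subgroup labels are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
is_member = rng.integers(0, 2, size=n)           # ground truth: in training set or not
subgroup = rng.integers(0, 3, size=n)            # e.g., a sensitive attribute with 3 values

# A toy attack that is only informative for subgroup 2.
attack_guess = np.where(subgroup == 2,
                        is_member ^ (rng.random(n) < 0.2),   # ~80% accurate
                        rng.integers(0, 2, size=n))          # random guessing

overall = (attack_guess == is_member).mean()
print(f"overall MIA accuracy: {overall:.3f}")    # looks close to random (0.5)
for g in np.unique(subgroup):
    mask = subgroup == g
    acc = (attack_guess[mask] == is_member[mask]).mean()
    print(f"subgroup {g}: accuracy {acc:.3f}, vulnerability (acc - 0.5) = {acc - 0.5:+.3f}")
```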
Assessing The Factual Accuracy of Generated Text
Title | Assessing The Factual Accuracy of Generated Text |
Authors | Ben Goodrich, Vinay Rao, Mohammad Saleh, Peter J Liu |
Abstract | We propose a model-based metric to estimate the factual accuracy of generated text that is complementary to typical scoring schemes like ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and BLEU (Bilingual Evaluation Understudy). We introduce and release a new large-scale dataset based on Wikipedia and Wikidata to train relation classifiers and end-to-end fact extraction models. The end-to-end models are shown to be able to extract complete sets of facts from datasets with full pages of text. We then analyse multiple models that estimate factual accuracy on a Wikipedia text summarization task, and show their efficacy compared to ROUGE and other model-free variants by conducting a human evaluation study. |
Tasks | Text Summarization |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13322v1 |
PDF | https://arxiv.org/pdf/1905.13322v1.pdf |
PWC | https://paperswithcode.com/paper/assessing-the-factual-accuracy-of-generated |
Repo | |
Framework | |
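
The abstract frames factual accuracy in terms of facts extracted from text. Below is a small, hedged sketch of one natural scoring rule: precision of (subject, relation, object) tuples extracted from a generated summary against tuples from the source. The extraction step is a stub; the paper trains relation classifiers and end-to-end extractors rather than using hand-written facts.

```python
# Sketch: score generated text by comparing extracted (subject, relation, object)
# tuples against those from the source document.
from typing import Set, Tuple

Fact = Tuple[str, str, str]

def extract_facts(text: str) -> Set[Fact]:
    """Hypothetical placeholder for a learned fact-extraction model."""
    raise NotImplementedError

def factual_precision(source_facts: Set[Fact], generated_facts: Set[Fact]) -> float:
    """Fraction of facts asserted by the generated text that are supported by the source."""
    if not generated_facts:
        return 1.0  # vacuously accurate: nothing asserted
    return len(generated_facts & source_facts) / len(generated_facts)

# Toy example with hand-written facts instead of a trained extractor.
source = {("Marie Curie", "award", "Nobel Prize in Physics"),
          ("Marie Curie", "field", "physics")}
generated = {("Marie Curie", "award", "Nobel Prize in Physics"),
             ("Marie Curie", "birthplace", "Paris")}  # unsupported fact
print(factual_precision(source, generated))  # 0.5
```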
Convolutional Neural Network for Convective Storm Nowcasting Using 3D Doppler Weather Radar Data
Title | Convolutional Neural Network for Convective Storm Nowcasting Using 3D Doppler Weather Radar Data |
Authors | Lei Han, Juanzhen Sun, Wei Zhang |
Abstract | Convective storms are one of the severe weather hazards found during the warm season. Doppler weather radar is the only operational instrument that can frequently sample the detailed structure of convective storms, which have small spatial scales and short lifetimes. For the challenging task of short-term convective storm forecasting, 3D radar images contain information about the processes in convective storms. However, effectively extracting such information from multi-source raw data has been problematic due to a lack of methodology and computational limitations. Recent advancements in deep learning techniques and graphics processing units now make it possible. This article investigates the feasibility and performance of an end-to-end deep learning nowcasting method. The nowcasting problem was first transformed into a classification problem, and then a deep learning method based on a convolutional neural network (CNN) was presented to make predictions. On the first layer of the CNN, a cross-channel 3D convolution was proposed to fuse the 3D raw data. The CNN method eliminates handcrafted feature engineering, i.e., the process of using domain knowledge of the data to manually design features. Operationally produced historical data from the Beijing-Tianjin-Hebei region of China were used to train the nowcasting system and evaluate its performance; 3,737,332 samples were collected in the training data set. The experimental results show that the deep learning method improves nowcasting skill compared with traditional machine learning methods. |
Tasks | Feature Engineering |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06185v2 |
PDF | https://arxiv.org/pdf/1911.06185v2.pdf |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-for-convective |
Repo | |
Framework | |
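
As a rough illustration of the architecture described above, the PyTorch sketch below fuses multi-source 3D radar inputs with a first-layer 3D convolution and ends in a "convective storm / no storm" classifier. All channel counts, depths, and layer sizes are assumptions for illustration, not the paper's configuration.

```python
# Sketch (PyTorch): first-layer 3D convolution fusing multi-source radar volumes,
# followed by a small classifier for the nowcasting-as-classification formulation.
# All channel counts and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class NowcastCNN(nn.Module):
    def __init__(self, in_sources: int = 2, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            # Cross-channel 3D convolution: mixes the radar sources ("channels")
            # while convolving over the (elevation, height, width) volume.
            nn.Conv3d(in_sources, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):                      # x: (batch, sources, D, H, W)
        h = self.features(x).flatten(1)
        return self.classifier(h)

model = NowcastCNN()
dummy = torch.randn(4, 2, 16, 64, 64)          # 4 samples of 3D radar data
print(model(dummy).shape)                      # torch.Size([4, 2])
```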
Transfer Learning From Sound Representations For Anger Detection in Speech
Title | Transfer Learning From Sound Representations For Anger Detection in Speech |
Authors | Mohamed Ezzeldin A. ElShaer, Scott Wisdom, Taniya Mishra |
Abstract | In this work, we train fully convolutional networks to detect anger in speech. Since training these deep architectures requires large amounts of data and the size of emotion datasets is relatively small, we use transfer learning. However, unlike previous approaches that use speech or emotion-based tasks for the source model, we instead use SoundNet, a fully convolutional neural network trained multimodally on a massive video dataset to classify audio, with ground-truth labels provided by vision-based classifiers. As a result of transfer learning from SoundNet, our trained anger detection model improves performance and generalizes well on a variety of acted, elicited, and natural emotional speech datasets. We also test the cross-lingual effectiveness of our model by evaluating our English-trained model on Mandarin Chinese speech emotion data. Furthermore, our proposed system has low latency suitable for real-time applications, only requiring 1.2 seconds of audio to make a reliable classification. |
Tasks | Transfer Learning |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02120v1 |
PDF | http://arxiv.org/pdf/1902.02120v1.pdf |
PWC | https://paperswithcode.com/paper/transfer-learning-from-sound-representations |
Repo | |
Framework | |
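
A minimal sketch of the transfer-learning pattern described above: freeze a pretrained convolutional audio encoder and train only a small anger/not-anger head on top. The `pretrained_encoder` here is a generic placeholder; loading the actual SoundNet weights and architecture is outside the scope of this sketch.

```python
# Sketch (PyTorch): reuse a frozen pretrained audio encoder (stand-in for SoundNet)
# and train only a small binary anger-detection head on top of it.
import torch
import torch.nn as nn

# Placeholder encoder: a real setup would load pretrained SoundNet weights here.
pretrained_encoder = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=64, stride=2), nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=32, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
)
for p in pretrained_encoder.parameters():
    p.requires_grad = False                    # freeze the transferred representation

anger_head = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(anger_head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

waveform = torch.randn(8, 1, 19200)            # ~1.2 s of 16 kHz audio per clip
labels = torch.randint(0, 2, (8,))
with torch.no_grad():
    features = pretrained_encoder(waveform)    # transferred sound representation
loss = criterion(anger_head(features), labels)
loss.backward()
optimizer.step()
```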
Weakly Supervised Localization Using Background Images
Title | Weakly Supervised Localization Using Background Images |
Authors | Ziyi Kou, Wentian Zhao, Guofeng Cui, Shaojie Wang |
Abstract | Weakly Supervised Object Localization (WSOL) methods usually rely on fully convolutional networks to obtain class activation maps (CAMs) of targeted labels. However, these networks always highlight the most discriminative parts to perform the task, so the located areas are much smaller than the entire targeted objects. In this work, we propose a novel end-to-end model to enlarge CAMs generated from classification models, which can localize targeted objects more precisely. In detail, we add an additional module to traditional classification networks to extract foreground object proposals from images without classifying them into specific categories. Then we set these normalized regions as unrestricted pixel-level mask supervision for the following classification task. We collect a set of images, defined as the Background Image Set, from the Internet. Although much smaller than the targeted dataset, it surprisingly well supports the method in extracting foreground regions from different pictures. The region extraction is independent of the classification task, and the extracted region in each image covers almost the entire object rather than just a significant part. Therefore, these regions can serve as masks to supervise the response map generated from classification models to become larger and more precise. The method achieves state-of-the-art results on CUB-200-2011 in terms of Top-1 and Top-5 localization error while achieving a competitive result on ILSVRC2016 compared with other approaches. |
Tasks | Object Localization, Weakly-Supervised Object Localization |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03619v3 |
PDF | https://arxiv.org/pdf/1909.03619v3.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-localization-using |
Repo | |
Framework | |
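
The method above is built around class activation maps (CAMs). The sketch below shows the standard CAM computation, the quantity the paper's additional module supervises and enlarges; it is a generic illustration with a toy backbone, not the paper's full pipeline.

```python
# Sketch (PyTorch): standard class activation map (CAM) from a classification CNN.
# The paper supervises/enlarges such maps with foreground proposals; this only
# shows the generic CAM step with a toy backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F

conv_backbone = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
)
classifier = nn.Linear(128, 200)                 # e.g., 200 classes as in CUB-200-2011

def class_activation_map(image, class_idx):
    feats = conv_backbone(image)                 # (1, 128, H, W)
    logits = classifier(feats.mean(dim=(2, 3)))  # global average pooling + linear
    w = classifier.weight[class_idx]             # (128,) weights for the chosen class
    cam = torch.einsum("c,chw->hw", w, feats[0])
    cam = F.relu(cam)
    return cam / (cam.max() + 1e-8), logits

image = torch.randn(1, 3, 224, 224)
cam, logits = class_activation_map(image, class_idx=42)
print(cam.shape)                                 # torch.Size([224, 224])
```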
Deep Reinforcement Learning for Multi-objective Optimization
Title | Deep Reinforcement Learning for Multi-objective Optimization |
Authors | Kaiwen Li, Tao Zhang, Rui Wang |
Abstract | This study proposes an end-to-end framework, termed DRL-MOA, for solving multi-objective optimization problems (MOPs) using deep reinforcement learning (DRL). The idea of decomposition is adopted to decompose a MOP into a set of scalar optimization subproblems. The subproblems are then optimized cooperatively by a neighbourhood-based parameter-transfer strategy, which significantly accelerates the training procedure and makes the realization of DRL-MOA possible. The subproblems are modelled as neural networks and the RL method is used to optimize them. Specifically, the multi-objective travelling salesman problem (MOTSP) is solved in this work using the DRL-MOA framework by modelling each subproblem as a Pointer Network. It is found that, once the trained model is available, it can scale to MOTSPs with any number of cities, e.g., the 70-city, 100-city, and even the 200-city MOTSP, without re-training. The Pareto front can be obtained directly by a simple feed-forward pass of the network; thereby, no iteration is required and the MOP can always be solved in a reasonable time. Experimental results indicate a strong convergence ability of DRL-MOA, especially for large-scale MOTSPs, e.g., the 200-city MOTSP, for which evolutionary algorithms such as NSGA-II and MOEA/D struggle to converge even when run for a large number of iterations. DRL-MOA also obtains a much wider spread of the PF than the two competitors. Moreover, DRL-MOA has a high level of modularity and can easily be generalized to other MOPs by replacing the modelling of the subproblem. |
Tasks | |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02386v1 |
PDF | https://arxiv.org/pdf/1906.02386v1.pdf |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-multi-1 |
Repo | |
Framework | |
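
A minimal sketch of the decomposition idea underlying DRL-MOA: a bi-objective problem is split into scalar subproblems via evenly spread weight vectors, and parameters are transferred between neighbouring subproblems during training. The subproblem "trainer" below is a stub standing in for training the Pointer Network.

```python
# Sketch: weighted-sum decomposition of a bi-objective problem into scalar
# subproblems, trained in sequence with neighbourhood-based parameter transfer.
# `train_subproblem` is a stub standing in for training the Pointer Network.
import numpy as np

def train_subproblem(weight, init_params):
    """Hypothetical stand-in: fine-tune a policy for the scalarized objective
    w[0]*f1 + w[1]*f2, warm-started from the neighbouring subproblem's params."""
    return {"weight": weight, "params": init_params}   # placeholder

n_subproblems = 5
weights = [np.array([i / (n_subproblems - 1), 1 - i / (n_subproblems - 1)])
           for i in range(n_subproblems)]

models, params = [], None                      # params=None -> train from scratch
for w in weights:
    model = train_subproblem(w, init_params=params)
    params = model["params"]                   # transfer to the neighbouring subproblem
    models.append(model)

# At inference, each trained model contributes one point of the Pareto front
# via a single feed-forward pass (no iterative search).
print([m["weight"].round(2).tolist() for m in models])
```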
Volumetric Isosurface Rendering with Deep Learning-Based Super-Resolution
Title | Volumetric Isosurface Rendering with Deep Learning-Based Super-Resolution |
Authors | Sebastian Weiss, Mengyu Chu, Nils Thuerey, Rüdiger Westermann |
Abstract | Rendering an accurate image of an isosurface in a volumetric field typically requires large numbers of data samples. Reducing the number of required samples lies at the core of research in volume rendering. With the advent of deep learning networks, a number of architectures have been proposed recently to infer missing samples in multi-dimensional fields, for applications such as image super-resolution and scan completion. In this paper, we investigate the use of such architectures for learning the upscaling of a low-resolution sampling of an isosurface to a higher resolution, with high-fidelity reconstruction of spatial detail and shading. We introduce a fully convolutional neural network that learns a latent representation generating a smooth, edge-aware normal field and ambient occlusions from a low-resolution normal and depth field. By adding a frame-to-frame motion loss to the learning stage, the upscaling can account for temporal variations and achieves improved frame-to-frame coherence. We demonstrate the quality of the network for isosurfaces that were never seen during training, and discuss remote and in-situ visualization as well as focus+context visualization as potential applications. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-06-15 |
URL | https://arxiv.org/abs/1906.06520v1 |
PDF | https://arxiv.org/pdf/1906.06520v1.pdf |
PWC | https://paperswithcode.com/paper/volumetric-isosurface-rendering-with-deep |
Repo | |
Framework | |
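
A rough PyTorch sketch of the kind of fully convolutional upscaler described above: low-resolution normal + depth maps in, 4x-upscaled normal + ambient occlusion out. The channel layout, scale factor, and architecture are illustrative assumptions, and the paper's frame-to-frame motion loss is omitted.

```python
# Sketch (PyTorch): fully convolutional 4x upscaler from low-res normal+depth
# (3+1 channels) to high-res normal + ambient occlusion (3+1 channels).
# Channel layout and architecture are illustrative assumptions.
import torch
import torch.nn as nn

class IsoSuperRes(nn.Module):
    def __init__(self, scale: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=scale, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 4, 3, padding=1),    # 3 normal channels + 1 AO channel
        )

    def forward(self, low_res):                # (B, 4, H, W): normal (3) + depth (1)
        return self.net(low_res)

model = IsoSuperRes()
out = model(torch.randn(2, 4, 64, 64))
print(out.shape)                               # torch.Size([2, 4, 256, 256])
```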
Zero-shot Text-to-SQL Learning with Auxiliary Task
Title | Zero-shot Text-to-SQL Learning with Auxiliary Task |
Authors | Shuaichen Chang, Pengfei Liu, Yun Tang, Jing Huang, Xiaodong He, Bowen Zhou |
Abstract | Recent years have seen great success in the use of neural seq2seq models on the text-to-SQL task. However, little work has paid attention to how these models generalize to realistic unseen data, which naturally raises a question: does this impressive performance signify a perfect generalization model, or are there still some limitations? In this paper, we first diagnose the bottleneck of the text-to-SQL task by providing a new testbed, in which we observe that existing models show poor generalization ability on rarely seen data. This analysis encourages us to design a simple but effective auxiliary task, which serves as a supportive model as well as a regularization term for the generation task to increase the model's generalization. Experimentally, we evaluate our models on the large text-to-SQL dataset WikiSQL. Compared to a strong coarse-to-fine baseline model, our models improve over the baseline by more than 3% absolute accuracy on the whole dataset. More interestingly, on a zero-shot subset of WikiSQL, our models achieve a 5% absolute accuracy gain over the baseline, clearly demonstrating their superior generalizability. |
Tasks | Text-To-Sql |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11052v1 |
PDF | https://arxiv.org/pdf/1908.11052v1.pdf |
PWC | https://paperswithcode.com/paper/zero-shot-text-to-sql-learning-with-auxiliary |
Repo | |
Framework | |
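
The key idea above is an auxiliary task whose loss regularizes the main text-to-SQL generation objective. The sketch below shows only that loss-combination pattern; the concrete auxiliary objective, the coarse-to-fine generator, and the weighting factor are illustrative assumptions rather than the paper's definitions.

```python
# Sketch (PyTorch): combining the main text-to-SQL generation loss with an
# auxiliary-task loss that acts as a regularizer. The auxiliary objective and
# the weighting factor are illustrative assumptions.
import torch
import torch.nn as nn

main_criterion = nn.CrossEntropyLoss()         # e.g., SQL token / slot prediction
aux_criterion = nn.BCEWithLogitsLoss()         # e.g., "is this column mentioned?"

def total_loss(main_logits, main_targets, aux_logits, aux_targets, aux_weight=0.5):
    l_main = main_criterion(main_logits, main_targets)
    l_aux = aux_criterion(aux_logits, aux_targets)
    return l_main + aux_weight * l_aux

# Dummy tensors standing in for model outputs on a batch of 8 questions.
main_logits = torch.randn(8, 100)              # scores over 100 SQL decisions
main_targets = torch.randint(0, 100, (8,))
aux_logits = torch.randn(8, 12)                # 12 candidate columns per table
aux_targets = torch.randint(0, 2, (8, 12)).float()
print(total_loss(main_logits, main_targets, aux_logits, aux_targets))
```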
FairSight: Visual Analytics for Fairness in Decision Making
Title | FairSight: Visual Analytics for Fairness in Decision Making |
Authors | Yongsu Ahn, Yu-Ru Lin |
Abstract | Data-driven decision making about individuals has become increasingly pervasive, but recent studies have raised concerns about potential discrimination. In response, researchers have made efforts to propose and implement fairness measures and algorithms, but those efforts have not translated into the real-world practice of data-driven decision making. As such, there is still an urgent need for a viable tool that facilitates fair decision making. We propose FairSight, a visual analytic system designed to address this need; it aims to achieve different notions of fairness in ranking decisions through identifying the required actions – understanding, measuring, diagnosing and mitigating biases – that together lead to fairer decision making. Through a case study and a user study, we demonstrate that the proposed visual analytic and diagnostic modules are effective in understanding the fairness-aware decision pipeline and obtaining fairer outcomes. |
Tasks | Decision Making |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00176v2 |
PDF | https://arxiv.org/pdf/1908.00176v2.pdf |
PWC | https://paperswithcode.com/paper/fairsight-visual-analytics-for-fairness-in |
Repo | |
Framework | |
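
FairSight's pipeline includes measuring fairness of ranking decisions. As a small, hedged illustration of that "measuring" step, the sketch below computes one simple group-fairness statistic for a ranking: the protected group's share of the top-k relative to its share of the whole candidate pool. This particular metric is an example, not necessarily the one the system uses.

```python
# Sketch: one simple group-fairness measure for a ranking decision -
# the protected group's representation in the top-k relative to the pool.
# This specific metric is illustrative, not necessarily FairSight's.
from typing import Sequence

def topk_representation_ratio(protected: Sequence[bool], ranking: Sequence[int], k: int) -> float:
    """protected[i] says whether candidate i belongs to the protected group;
    ranking lists candidate indices from best to worst."""
    pool_rate = sum(protected) / len(protected)
    topk_rate = sum(protected[i] for i in ranking[:k]) / k
    return topk_rate / pool_rate if pool_rate > 0 else float("nan")

protected = [True, False, False, True, False, True, False, False]
ranking = [1, 2, 4, 0, 6, 3, 7, 5]             # e.g., candidates sorted by score
print(topk_representation_ratio(protected, ranking, k=4))   # ~0.67: under-represented in top-4
```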
Getting Topology and Point Cloud Generation to Mesh
Title | Getting Topology and Point Cloud Generation to Mesh |
Authors | Austin Dill, Chun-Liang Li, Songwei Ge, Eunsu Kang |
Abstract | In this work, we explore the idea that effective generative models for point clouds under the autoencoding framework must acknowledge the relationship between a continuous surface, a discretized mesh, and a set of points sampled from the surface. This view motivates a generative model that works by progressively deforming a uniform sphere until it approximates the goal point cloud. We review the underlying concepts leading to this conclusion from computer graphics and topology in differential geometry, and model the generation process as deformation via deep neural network parameterization. Finally, we show that this view of the problem produces a model that can generate quality meshes efficiently. |
Tasks | Point Cloud Generation |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.03787v1 |
PDF | https://arxiv.org/pdf/1912.03787v1.pdf |
PWC | https://paperswithcode.com/paper/getting-topology-and-point-cloud-generation |
Repo | |
Framework | |
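
A compact sketch of the generation-as-deformation idea: sample points on a unit sphere, push them through a small network that predicts per-point offsets, and fit the result to a target point cloud with a Chamfer distance. The architecture, loss, and training loop are illustrative assumptions rather than the paper's model.

```python
# Sketch (PyTorch): deform a uniform sphere point cloud toward a target cloud
# with a small offset-predicting network and a Chamfer distance loss.
import torch
import torch.nn as nn

def sample_sphere(n):
    p = torch.randn(n, 3)
    return p / p.norm(dim=1, keepdim=True)     # points on the unit sphere

def chamfer(a, b):
    d = torch.cdist(a, b)                      # pairwise distances (n, m)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

deform_net = nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                           nn.Linear(128, 128), nn.ReLU(),
                           nn.Linear(128, 3))  # per-point displacement

target = torch.randn(1024, 3)                  # stand-in for a real shape
sphere = sample_sphere(1024)
opt = torch.optim.Adam(deform_net.parameters(), lr=1e-3)
for step in range(200):
    deformed = sphere + deform_net(sphere)     # deform the sphere toward the target
    loss = chamfer(deformed, target)
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```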
Asynchronous Methods for Model-Based Reinforcement Learning
Title | Asynchronous Methods for Model-Based Reinforcement Learning |
Authors | Yunzhi Zhang, Ignasi Clavera, Boren Tsai, Pieter Abbeel |
Abstract | Significant progress has been made in the area of model-based reinforcement learning. State-of-the-art algorithms are now able to match the asymptotic performance of model-free methods while being significantly more data efficient. However, this success has come at a price: state-of-the-art model-based methods require significant computation interleaved with data collection, resulting in run times that take days, even if the amount of agent interaction might be just hours or even minutes. When considering the goal of learning in real time on real robots, this means these state-of-the-art model-based algorithms still remain impractical. In this work, we propose an asynchronous framework for model-based reinforcement learning methods that brings the run time of these algorithms down to just the data collection time. We evaluate our asynchronous framework on a range of standard MuJoCo benchmarks as well as on three real-world robotic manipulation tasks. We show that asynchronous learning not only speeds up learning with respect to wall-clock time through parallelization, but also further reduces the sample complexity of model-based approaches by improving exploration and by effectively avoiding policy overfitting to the deficiencies of learned dynamics models. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12453v1 |
PDF | https://arxiv.org/pdf/1910.12453v1.pdf |
PWC | https://paperswithcode.com/paper/asynchronous-methods-for-model-based |
Repo | |
Framework | |
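
A minimal sketch of the asynchronous structure described above: one worker collects environment data while another trains the dynamics model and policy concurrently, so training no longer blocks data collection. The RL components are stubs; only the asynchrony pattern is shown.

```python
# Sketch: asynchronous data collection and model/policy training via threads
# sharing a buffer. The RL components are stubs; only the asynchrony is shown.
import threading
import queue
import time

data_buffer = queue.Queue()
stop = threading.Event()

def collect_data():
    """Stand-in for rolling out the current policy in the environment."""
    while not stop.is_set():
        transition = ("obs", "action", "next_obs", "reward")   # placeholder
        data_buffer.put(transition)
        time.sleep(0.01)                        # pretend environment step time

def train_model_and_policy():
    """Stand-in for fitting the dynamics model and improving the policy."""
    seen = 0
    while not stop.is_set():
        try:
            data_buffer.get(timeout=0.1)
            seen += 1
        except queue.Empty:
            continue
    print(f"trained on {seen} transitions")

workers = [threading.Thread(target=collect_data),
           threading.Thread(target=train_model_and_policy)]
for w in workers:
    w.start()
time.sleep(1.0)                                 # let both loops run for ~1 second
stop.set()
for w in workers:
    w.join()
```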
A support vector regression-based multi-fidelity surrogate model
Title | A support vector regression-based multi-fidelity surrogate model |
Authors | Maolin Shi, Shuo Wang, Wei Sun, Liye Lv, Xueguan Song |
Abstract | Computational simulations with different fidelities have been widely used in engineering design. A high-fidelity (HF) model is generally more accurate but also more time-consuming than a low-fidelity (LF) model. To take advantage of both HF and LF models, multi-fidelity surrogate models that aim to integrate information from both have gained increasing popularity. In this paper, a multi-fidelity surrogate model based on support vector regression, named Co_SVR, is developed by combining HF and LF models. In Co_SVR, a kernel function is used to map the difference between the HF and LF models. Besides, a heuristic algorithm is used to obtain the optimal parameters of Co_SVR. The proposed Co_SVR is compared with two popular multi-fidelity surrogate models, the Co_Kriging and Co_RBF models, and their single-fidelity surrogates through several numerical cases and a pressure vessel design problem. The results show that Co_SVR provides competitive prediction accuracy on the numerical cases and performs better than the Co_Kriging and Co_RBF models and the single-fidelity surrogate models. |
Tasks | |
Published | 2019-06-22 |
URL | https://arxiv.org/abs/1906.09439v1 |
PDF | https://arxiv.org/pdf/1906.09439v1.pdf |
PWC | https://paperswithcode.com/paper/a-support-vector-regression-based-multi |
Repo | |
Framework | |
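
A hedged sketch in the spirit of the multi-fidelity surrogate above: fit an SVR to cheap low-fidelity data, fit a second SVR to the HF-LF discrepancy at the few high-fidelity points, and sum the two at prediction time. This additive-discrepancy scheme is a common formulation used here for illustration; Co_SVR's exact construction and its heuristic parameter tuning differ.

```python
# Sketch: a common additive-discrepancy multi-fidelity scheme with SVR.
# An LF surrogate is corrected by an SVR fitted to the HF-LF difference.
# This is an illustration in the spirit of Co_SVR, not its exact formulation.
import numpy as np
from sklearn.svm import SVR

def f_high(x):                                  # expensive high-fidelity model (toy)
    return np.sin(8 * x) * x

def f_low(x):                                   # cheap, biased low-fidelity model (toy)
    return 0.8 * np.sin(8 * x) * x + 0.1 * x

x_lf = np.linspace(0, 1, 80)[:, None]           # many cheap LF samples
x_hf = np.linspace(0, 1, 10)[:, None]           # few expensive HF samples

lf_model = SVR(kernel="rbf", C=100).fit(x_lf, f_low(x_lf).ravel())
residual = f_high(x_hf).ravel() - lf_model.predict(x_hf)
diff_model = SVR(kernel="rbf", C=100).fit(x_hf, residual)

def mf_predict(x):
    return lf_model.predict(x) + diff_model.predict(x)

x_test = np.linspace(0, 1, 200)[:, None]
rmse = np.sqrt(np.mean((mf_predict(x_test) - f_high(x_test).ravel()) ** 2))
print(f"multi-fidelity RMSE: {rmse:.4f}")
```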
A Hierarchy of Graph Neural Networks Based on Learnable Local Features
Title | A Hierarchy of Graph Neural Networks Based on Learnable Local Features |
Authors | Michael Lingzhi Li, Meng Dong, Jiawei Zhou, Alexander M. Rush |
Abstract | Graph neural networks (GNNs) are a powerful tool to learn representations on graphs by iteratively aggregating features from node neighbourhoods. Many variant models have been proposed, but there is limited understanding on both how to compare different architectures and how to construct GNNs systematically. Here, we propose a hierarchy of GNNs based on their aggregation regions. We derive theoretical results about the discriminative power and feature representation capabilities of each class. Then, we show how this framework can be utilized to systematically construct arbitrarily powerful GNNs. As an example, we construct a simple architecture that exceeds the expressiveness of the Weisfeiler-Lehman graph isomorphism test. We empirically validate our theory on both synthetic and real-world benchmarks, and demonstrate our example’s theoretical power translates to strong results on node classification, graph classification, and graph regression tasks. |
Tasks | Graph Classification, Graph Regression, Node Classification |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05256v1 |
PDF | https://arxiv.org/pdf/1911.05256v1.pdf |
PWC | https://paperswithcode.com/paper/a-hierarchy-of-graph-neural-networks-based-on-1 |
Repo | |
Framework | |
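
The hierarchy above is organised around aggregation regions. As a small illustration of the basic building block, the sketch below implements plain one-hop mean-neighbourhood aggregation with a learnable transform; larger aggregation regions (multi-hop neighbourhoods) can be reached by stacking layers or powering the adjacency, which is only hinted at here.

```python
# Sketch: one-hop mean-neighbourhood aggregation, the basic GNN building block
# that the paper's hierarchy generalises via larger aggregation regions.
import numpy as np

rng = np.random.default_rng(0)

def normalized_adjacency(adj):
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)
    return a_hat / deg                          # row-normalised (mean aggregation)

def gnn_layer(features, adj_norm, weight):
    return np.maximum(adj_norm @ features @ weight, 0.0)   # aggregate, transform, ReLU

# Toy 5-node path graph with 4-dimensional node features.
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1
x = rng.normal(size=(5, 4))
w1, w2 = rng.normal(size=(4, 8)), rng.normal(size=(8, 8))

a_norm = normalized_adjacency(adj)
h = gnn_layer(gnn_layer(x, a_norm, w1), a_norm, w2)   # stacking widens the region
print(h.shape)                                  # (5, 8)
```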
A Multigrid Method for Efficiently Training Video Models
Title | A Multigrid Method for Efficiently Training Video Models |
Authors | Chao-Yuan Wu, Ross Girshick, Kaiming He, Christoph Feichtenhofer, Philipp Krähenbühl |
Abstract | Training competitive deep video models is an order of magnitude slower than training their counterpart image models. Slow training causes long research cycles, which hinders progress in video understanding research. Following standard practice for training image models, video model training assumes a fixed mini-batch shape: a specific number of clips, frames, and spatial size. However, what is the optimal shape? High resolution models perform well, but train slowly. Low resolution models train faster, but they are inaccurate. Inspired by multigrid methods in numerical optimization, we propose to use variable mini-batch shapes with different spatial-temporal resolutions that are varied according to a schedule. The different shapes arise from resampling the training data on multiple sampling grids. Training is accelerated by scaling up the mini-batch size and learning rate when shrinking the other dimensions. We empirically demonstrate a general and robust grid schedule that yields a significant out-of-the-box training speedup without a loss in accuracy for different models (I3D, non-local, SlowFast), datasets (Kinetics, Something-Something, Charades), and training settings (with and without pre-training, 128 GPUs or 1 GPU). As an illustrative example, the proposed multigrid method trains a ResNet-50 SlowFast network 4.5x faster (wall-clock time, same hardware) while also improving accuracy (+0.8% absolute) on Kinetics-400 compared to the baseline training method. |
Tasks | Video Understanding |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00998v1 |
PDF | https://arxiv.org/pdf/1912.00998v1.pdf |
PWC | https://paperswithcode.com/paper/a-multigrid-method-for-efficiently-training |
Repo | |
Framework | |
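
A minimal sketch of the scheduling idea above: cycle through mini-batch shapes with different temporal and spatial resolutions, keeping the per-iteration cost roughly constant by scaling the batch size, and scaling the learning rate with the batch size. The specific shapes and scaling rule are illustrative assumptions, not the paper's exact grid schedule.

```python
# Sketch: a multigrid-style schedule of variable mini-batch shapes. Coarser
# space-time resolution -> proportionally larger batch (roughly constant cost)
# and linearly scaled learning rate. Shapes and rules are illustrative assumptions.
BASE = {"clips": 8, "frames": 32, "size": 224, "lr": 0.1}

# (frame factor, spatial factor) grids, coarse to fine.
GRIDS = [(1/4, 1/2), (1/2, 1/2), (1/2, 1), (1, 1)]

def batch_shape(frame_f, space_f):
    cost_ratio = frame_f * space_f ** 2                  # relative per-clip cost
    clips = int(BASE["clips"] / cost_ratio)              # keep mini-batch cost ~constant
    return {
        "clips": clips,
        "frames": int(BASE["frames"] * frame_f),
        "size": int(BASE["size"] * space_f),
        "lr": BASE["lr"] * clips / BASE["clips"],        # linear learning-rate scaling
    }

for epoch in range(8):
    grid = GRIDS[min(epoch * len(GRIDS) // 8, len(GRIDS) - 1)]
    print(epoch, batch_shape(*grid))
```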