Paper Group AWR 27
Discrete Action On-Policy Learning with Action-Value Critic
Title | Discrete Action On-Policy Learning with Action-Value Critic |
Authors | Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou |
Abstract | Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension, making it challenging to apply existing on-policy gradient based deep RL algorithms efficiently. To effectively operate in multidimensional discrete action spaces, we construct a critic to estimate action-value functions, apply it on correlated actions, and combine these critic estimated action values to control the variance of gradient estimation. We follow rigorous statistical analysis to design how to generate and combine these correlated actions, and how to sparsify the gradients by shutting down the contributions from certain dimensions. These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques. We demonstrate these properties on OpenAI Gym benchmark tasks, and illustrate how discretizing the action space could benefit the exploration phase and hence facilitate convergence to a better local optimal solution thanks to the flexibility of discrete policy. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03534v2, https://arxiv.org/pdf/2002.03534v2.pdf |
PWC | https://paperswithcode.com/paper/discrete-action-on-policy-learning-with |
Repo | https://github.com/yuguangyue/CARSM |
Framework | tf |
TAdam: A Robust Stochastic Gradient Optimizer
Title | TAdam: A Robust Stochastic Gradient Optimizer |
Authors | Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto |
Abstract | Machine learning algorithms aim to find patterns from observations, which may include some noise, especially in the robotics domain. To perform well even with such noise, we expect them to be able to detect outliers and discard them when needed. We therefore propose a new stochastic gradient optimization method, whose robustness is directly built into the algorithm, using the robust Student-t distribution as its core idea. Adam, the popular optimization method, is modified with our method, and the resultant optimizer, so-called TAdam, is shown to effectively outperform Adam in terms of robustness against noise on diverse tasks, ranging from regression and classification to reinforcement learning problems. The implementation of our algorithm can be found at https://github.com/Mahoumaru/TAdam.git |
Tasks | |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00179v2, https://arxiv.org/pdf/2003.00179v2.pdf |
PWC | https://paperswithcode.com/paper/tadam-a-robust-stochastic-gradient-optimizer |
Repo | https://github.com/Mahoumaru/TAdam |
Framework | pytorch |
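The abstract does not spell out TAdam's update rule, so the following is only a minimal sketch of the underlying idea: a Student-t location estimate down-weights outliers where a plain average does not. The function name `student_t_mean`, the fixed `scale`, the degrees of freedom `nu`, and the sample values are illustrative assumptions, not taken from the paper or its repository.

```python
def student_t_mean(samples, nu=1.0, scale=1.0, iters=50):
    """Iteratively reweighted location estimate under a Student-t model.

    Assumption: the scale is fixed and known. This is NOT TAdam's update rule,
    only an illustration of how Student-t weights suppress outliers.
    """
    mu = sum(samples) / len(samples)  # start from the ordinary mean
    for _ in range(iters):
        # Samples far from mu (relative to scale) receive small weights.
        weights = [(nu + 1.0) / (nu + ((x - mu) / scale) ** 2) for x in samples]
        mu = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return mu


gradients = [1.0, 1.1, 0.9, 1.0, 50.0]  # hypothetical noisy samples with one outlier
plain_mean = sum(gradients) / len(gradients)
robust_mean = student_t_mean(gradients)
```

Here the plain mean is pulled far toward the outlier, while the reweighted estimate stays close to the cluster of typical values.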
Random Smoothing Might be Unable to Certify $\ell_\infty$ Robustness for High-Dimensional Images
Title | Random Smoothing Might be Unable to Certify $\ell_\infty$ Robustness for High-Dimensional Images |
Authors | Avrim Blum, Travis Dick, Naren Manoj, Hongyang Zhang |
Abstract | We show a hardness result for random smoothing to achieve certified adversarial robustness against attacks in the $\ell_p$ ball of radius $\epsilon$ when $p>2$. Although random smoothing has been well understood for the $\ell_2$ case using the Gaussian distribution, much remains unknown concerning the existence of a noise distribution that works for the case of $p>2$. This has been posed as an open problem by Cohen et al. (2019) and includes many significant paradigms such as the $\ell_\infty$ threat model. In this work, we show that any noise distribution $\mathcal{D}$ over $\mathbb{R}^d$ that provides $\ell_p$ robustness for all base classifiers with $p>2$ must satisfy $\mathbb{E}\eta_i^2=\Omega(d^{1-2/p}\epsilon^2(1-\delta)/\delta^2)$ for 99% of the features (pixels) of vector $\eta\sim\mathcal{D}$, where $\epsilon$ is the robust radius and $\delta$ is the score gap between the highest-scored class and the runner-up. Therefore, for high-dimensional images with pixel values bounded in $[0,255]$, the required noise will eventually dominate the useful information in the images, leading to trivial smoothed classifiers. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03517v3, https://arxiv.org/pdf/2002.03517v3.pdf |
PWC | https://paperswithcode.com/paper/random-smoothing-might-be-unable-to-certify |
Repo | https://github.com/hongyanz/TRADES-smoothing |
Framework | pytorch |
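To get a feel for how the bound in the abstract scales with dimension, the short sketch below evaluates the expression $d^{1-2/p}\epsilon^2(1-\delta)/\delta^2$ for $p=\infty$. The constant hidden by the $\Omega$ notation is not given in the abstract and is assumed to be 1 here; the function name, the image sizes, and the values of `eps` and `delta` are likewise illustrative assumptions.

```python
import math


def noise_second_moment_scale(d, p, eps, delta):
    """Return d^(1 - 2/p) * eps^2 * (1 - delta) / delta^2.

    Assumption: the constant hidden by the Omega notation is taken to be 1.
    """
    exponent = 1.0 - 2.0 / p  # equals 1.0 when p is math.inf
    return d ** exponent * eps ** 2 * (1.0 - delta) / delta ** 2


d_small = 3 * 32 * 32    # hypothetical small RGB image
d_large = 3 * 224 * 224  # hypothetical large RGB image
scale_small = noise_second_moment_scale(d_small, math.inf, eps=1.0, delta=0.5)
scale_large = noise_second_moment_scale(d_large, math.inf, eps=1.0, delta=0.5)
```

For $p=\infty$ the exponent on $d$ is 1, so under these assumptions the required per-pixel second moment grows linearly with the number of pixels.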
To Share or Not To Share: A Comprehensive Appraisal of Weight-Sharing
Title | To Share or Not To Share: A Comprehensive Appraisal of Weight-Sharing |
Authors | Aloïs Pourchot, Alexis Ducarouge, Olivier Sigaud |
Abstract | Weight-sharing (WS) has recently emerged as a paradigm to accelerate the automated search for efficient neural architectures, a process dubbed Neural Architecture Search (NAS). Although very appealing, this framework is not without drawbacks and several works have started to question its capabilities on small hand-crafted benchmarks. In this paper, we take advantage of the NASBench-101 dataset to challenge the efficiency of WS on a representative search space. By comparing a SOTA WS approach to a plain random search we show that, despite decent correlations between evaluations using weight-sharing and standalone ones, WS is only rarely helpful to NAS. We highlight in particular the reliance of the benefits on the search space itself. |
Tasks | Neural Architecture Search |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04289v1, https://arxiv.org/pdf/2002.04289v1.pdf |
PWC | https://paperswithcode.com/paper/to-share-or-not-to-share-a-comprehensive |
Repo | https://github.com/apourchot/to_share_or_not_to_share |
Framework | pytorch |
Negative Margin Matters: Understanding Margin in Few-shot Classification
Title | Negative Margin Matters: Understanding Margin in Few-shot Classification |
Authors | Bin Liu, Yue Cao, Yutong Lin, Qi Li, Zheng Zhang, Mingsheng Long, Han Hu |
Abstract | This paper introduces a negative margin loss to metric-learning-based few-shot learning methods. The negative margin loss significantly outperforms regular softmax loss, and achieves state-of-the-art accuracy on three standard few-shot classification benchmarks with few bells and whistles. These results are contrary to the common practice in the metric learning field, where the margin is zero or positive. To understand why the negative margin loss performs well for few-shot classification, we analyze the discriminability of learned features w.r.t. different margins for training and novel classes, both empirically and theoretically. We find that although negative margin reduces the feature discriminability for training classes, it may also avoid falsely mapping samples of the same novel class to multiple peaks or clusters, and thus benefit the discrimination of novel classes. Code is available at https://github.com/bl0/negative-margin.few-shot. |
Tasks | Few-Shot Image Classification, Few-Shot Learning, Metric Learning |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.12060v1, https://arxiv.org/pdf/2003.12060v1.pdf |
PWC | https://paperswithcode.com/paper/negative-margin-matters-understanding-margin |
Repo | https://github.com/bl0/negative-margin.few-shot |
Framework | pytorch |
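The abstract does not give the loss formula, so the sketch below assumes a common margin-softmax form in which the margin is subtracted from the true-class similarity before scaling. It only shows how the sign of the margin changes the training loss for one sample; the function name, the `temperature` value, and the similarity values are hypothetical.

```python
import math


def margin_softmax_loss(similarities, label, margin=0.0, temperature=10.0):
    """Cross-entropy over scaled similarities, with `margin` subtracted from the true class.

    Assumption: this margin-softmax form is chosen for illustration and may
    differ from the exact loss used in the paper.
    """
    logits = [
        temperature * (s - margin if j == label else s)
        for j, s in enumerate(similarities)
    ]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]


sims = [0.6, 0.5, 0.1]  # hypothetical cosine similarities to three class prototypes
loss_zero = margin_softmax_loss(sims, label=0, margin=0.0)
loss_positive = margin_softmax_loss(sims, label=0, margin=0.3)
loss_negative = margin_softmax_loss(sims, label=0, margin=-0.3)
```

With a zero margin this reduces to ordinary softmax cross-entropy. A positive margin raises the loss for the same sample and a negative margin lowers it, which relaxes the pressure to separate training classes.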
Monocular Depth Prediction Through Continuous 3D Loss
Title | Monocular Depth Prediction Through Continuous 3D Loss |
Authors | Minghan Zhu, Maani Ghaffari, Yuanxin Zhong, Pingping Lu, Zhong Cao, Ryan M. Eustice, Huei Peng |
Abstract | This paper reports a new continuous 3D loss function for learning depth from monocular images. The dense depth prediction from a monocular image is supervised using sparse LIDAR points, exploiting available data from camera-LIDAR sensor suites during training. Currently, accurate and affordable range sensors are not available. Stereo cameras and LIDARs measure depth either inaccurately or sparsely/costly. In contrast to the current point-to-point loss evaluation approach, the proposed 3D loss treats point clouds as continuous objects; therefore, it overcomes the lack of dense ground truth depth due to the sparsity of LIDAR measurements. Experimental evaluations show that the proposed method achieves accurate depth measurement with consistent 3D geometric structures through a monocular camera. |
Tasks | Depth Estimation |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09763v1, https://arxiv.org/pdf/2003.09763v1.pdf |
PWC | https://paperswithcode.com/paper/monocular-depth-prediction-through-continuous |
Repo | https://github.com/minghanz/DepthC3D |
Framework | pytorch |
Identification of Chimera using Machine Learning
Title | Identification of Chimera using Machine Learning |
Authors | M. A. Ganaie, Saptarshi Ghosh, Naveen Mendola, M Tanveer, Sarika Jalan |
Abstract | Coupled dynamics on network models have been tremendously helpful in gaining insight into complex spatiotemporal dynamical patterns of a wide variety of large-scale real-world complex systems. Chimera, a state of coexistence of incoherence and coherence, is one such pattern arising in identically coupled oscillators, which has recently drawn tremendous attention due to its peculiar nature and wide applicability, especially in neuroscience. The identification of chimera is a challenging problem due to ambiguity in its appearance. We present a distinctive approach to identify and characterize the chimera state using machine learning techniques, namely random forest, oblique random forests via multi-surface proximal support vector machines (MPRaF-T, P, N) and sparse pre-trained / auto-encoder based random vector functional link neural network (RVFL-AE). We demonstrate high accuracy in identifying the coherent, incoherent and chimera states from given spatial profiles. We validate this approach for different time-continuous and time-discrete coupled dynamics on networks. This work provides a direction for employing machine learning techniques to identify dynamical patterns arising from the interaction among non-linear units at large scale, and for characterizing complex spatio-temporal phenomena in real-world systems for various applications. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.08985v1, https://arxiv.org/pdf/2001.08985v1.pdf |
PWC | https://paperswithcode.com/paper/identification-of-chimera-using-machine |
Repo | https://github.com/complex-systems-lab/Project_Chimera_ML |
Framework | none |
Exposing Backdoors in Robust Machine Learning Models
Title | Exposing Backdoors in Robust Machine Learning Models |
Authors | Ezekiel Soremekun, Sakshi Udeshi, Sudipta Chattopadhyay, Andreas Zeller |
Abstract | The introduction of robust optimisation has pushed the state-of-the-art in defending against adversarial attacks. However, the behaviour of such optimisation has not been studied in the light of a fundamentally different class of attacks called backdoors. In this paper, we demonstrate that adversarially robust models are susceptible to backdoor attacks. Subsequently, we observe that backdoors are reflected in the feature representation of such models. Then, this is leveraged to detect backdoor-infected models. Specifically, we use feature clustering to effectively detect backdoor-infected robust Deep Neural Networks (DNNs). In our evaluation of major classification tasks, our approach effectively detects robust DNNs infected with backdoors. Our investigation reveals that salient features of adversarially robust DNNs break the stealthy nature of backdoor attacks. |
Tasks | |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2003.00865v1, https://arxiv.org/pdf/2003.00865v1.pdf |
PWC | https://paperswithcode.com/paper/exposing-backdoors-in-robust-machine-learning |
Repo | https://github.com/sakshiudeshi/Expose-Robust-Backdoors |
Framework | pytorch |
Learning distributed representations of graphs with Geo2DR
Title | Learning distributed representations of graphs with Geo2DR |
Authors | Paul Scherer, Pietro Lio |
Abstract | We present Geo2DR, a Python library for unsupervised learning on graph-structured data using discrete substructure patterns and neural language models. It contains efficient implementations of popular graph decomposition algorithms and neural language models in PyTorch which are combined to learn representations using the distributive hypothesis. Furthermore, Geo2DR comes with general data processing and loading methods which can bring substantial speed-up in the training of the neural language models. Through this we provide a unified set of tools and design methodology to quickly construct systems capable of learning distributed representations of graphs. This is useful for replication of existing methods, modification, or even creation of novel systems. This work serves to present the Geo2DR library and perform a comprehensive comparative analysis of existing methods re-implemented using Geo2DR across several widely used graph classification benchmarks. We show a high reproducibility of results in published methods and interoperability with other libraries useful for distributive language modelling. |
Tasks | Graph Classification, Language Modelling |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.05926v2, https://arxiv.org/pdf/2003.05926v2.pdf |
PWC | https://paperswithcode.com/paper/learning-distributed-representations-of-3 |
Repo | https://github.com/paulmorio/geo2dr |
Framework | pytorch |
Hierarchical Conditional Relation Networks for Video Question Answering
Title | Hierarchical Conditional Relation Networks for Video Question Answering |
Authors | Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran |
Abstract | Video question answering (VideoQA) is challenging as it requires modeling capacity to distill dynamic visual artifacts and distant relations and to associate them with linguistic concepts. We introduce a general-purpose reusable neural unit called Conditional Relation Network (CRN) that serves as a building block to construct more sophisticated structures for representation and reasoning over video. CRN takes as input an array of tensorial objects and a conditioning feature, and computes an array of encoded output objects. Model building becomes a simple exercise of replication, rearrangement and stacking of these reusable units for diverse modalities and contextual information. This design thus supports high-order relational and multi-step reasoning. The resulting architecture for VideoQA is a CRN hierarchy whose branches represent sub-videos or clips, all sharing the same question as the contextual condition. Our evaluations on well-known datasets achieved new SoTA results, demonstrating the impact of building a general-purpose reasoning unit on complex domains such as VideoQA. |
Tasks | Question Answering, Video Question Answering |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10698v3, https://arxiv.org/pdf/2002.10698v3.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-conditional-relation-networks |
Repo | https://github.com/thaolmk54/hcrn-videoqa |
Framework | pytorch |
FusionLane: Multi-Sensor Fusion for Lane Marking Semantic Segmentation Using Deep Neural Networks
Title | FusionLane: Multi-Sensor Fusion for Lane Marking Semantic Segmentation Using Deep Neural Networks |
Authors | Ruochen Yin, Biao Yu, Huapeng Wu, Yutao Song, Runxin Niu |
Abstract | Effective semantic segmentation of lane markings is a crucial step in constructing lane-level high-precision maps. In recent years, many image semantic segmentation methods have been proposed. These methods mainly focus on camera images; because of the limitations of the sensor itself, the accurate three-dimensional spatial position of the lane marking cannot be obtained, so the demands of lane-level high-precision map construction cannot be met. This paper proposes a lane marking semantic segmentation method based on a LIDAR and camera fusion deep neural network. Unlike other methods, in order to obtain accurate position information for the segmentation results, the semantic segmentation object of this paper is a bird’s eye view converted from a LIDAR point cloud instead of an image captured by a camera. The method first uses the DeepLabv3+ network to segment the image captured by the camera, and the segmentation result is merged with the point clouds collected by the LIDAR as the input of the proposed network. In this neural network, we also add a long short-term memory (LSTM) structure to assist the semantic segmentation of lane markings by using time series information. Experiments on a dataset of more than 14,000 images, which we manually labeled and expanded, show that the proposed method has better performance on the semantic segmentation of the point cloud bird’s eye view. Therefore, the automation of high-precision map construction can be significantly improved. Our code is available at https://github.com/rolandying/FusionLane. |
Tasks | Semantic Segmentation, Sensor Fusion, Time Series |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04404v1, https://arxiv.org/pdf/2003.04404v1.pdf |
PWC | https://paperswithcode.com/paper/fusionlane-multi-sensor-fusion-for-lane |
Repo | https://github.com/rolandying/FusionLane |
Framework | tf |
Causal structure learning from time series: Large regression coefficients may predict causal links better in practice than small p-values
Title | Causal structure learning from time series: Large regression coefficients may predict causal links better in practice than small p-values |
Authors | Sebastian Weichwald, Martin E Jakobsen, Phillip B Mogensen, Lasse Petersen, Nikolaj Thams, Gherardo Varando |
Abstract | In this article, we describe the algorithms for causal structure learning from time series data that won the Causality 4 Climate competition at the Conference on Neural Information Processing Systems 2019 (NeurIPS). We examine how our combination of established ideas achieves competitive performance on semi-realistic and realistic time series data exhibiting common challenges in real-world Earth sciences data. In particular, we discuss a) a rationale for leveraging linear methods to identify causal links in non-linear systems, b) a simulation-backed explanation as to why large regression coefficients may predict causal links better in practice than small p-values and thus why normalising the data may sometimes hinder causal structure learning. For benchmark usage, we provide implementations at https://github.com/sweichwald/tidybench and detail the algorithms here. We propose the presented competition-proven methods for baseline benchmark comparisons to guide the development of novel algorithms for structure learning from time series. |
Tasks | Time Series |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09573v1, https://arxiv.org/pdf/2002.09573v1.pdf |
PWC | https://paperswithcode.com/paper/causal-structure-learning-from-time-series |
Repo | https://github.com/sweichwald/tidybench |
Framework | none |
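The idea of scoring candidate causal links by the size of lagged regression coefficients can be sketched on a toy system. This is not the tidybench implementation: the two-variable simulation, its coefficients, the seed, and the helper `fit_two_lagged_regressors` are all hypothetical choices made for illustration.

```python
import random


def fit_two_lagged_regressors(target, lag_a, lag_b):
    """Least-squares fit of target ~ a * lag_a + b * lag_b (no intercept)."""
    saa = sum(u * u for u in lag_a)
    sbb = sum(v * v for v in lag_b)
    sab = sum(u * v for u, v in zip(lag_a, lag_b))
    sat = sum(u * t for u, t in zip(lag_a, target))
    sbt = sum(v * t for v, t in zip(lag_b, target))
    det = saa * sbb - sab * sab  # determinant of the 2x2 normal equations
    return (sat * sbb - sbt * sab) / det, (sbt * saa - sat * sab) / det


# Assumed toy system: x drives y with a one-step lag, y does not drive x.
rng = random.Random(0)
x, y = [0.0], [0.0]
for _ in range(500):
    x_prev, y_prev = x[-1], y[-1]
    x.append(0.5 * x_prev + rng.gauss(0.0, 1.0))
    y.append(0.8 * x_prev + 0.2 * y_prev + rng.gauss(0.0, 1.0))

lag_x, lag_y = x[:-1], y[:-1]
coef_x_to_x, coef_y_to_x = fit_two_lagged_regressors(x[1:], lag_x, lag_y)
coef_x_to_y, coef_y_to_y = fit_two_lagged_regressors(y[1:], lag_x, lag_y)

# Score each candidate cross link by the absolute lagged coefficient.
link_scores = {"x->y": abs(coef_x_to_y), "y->x": abs(coef_y_to_x)}
```

In this toy setting the true link `x->y` receives a much larger score than the absent link `y->x`. The data are deliberately left unnormalised, since the abstract notes that normalising may sometimes hinder structure learning.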
Speech2Phone: A Multilingual and Text Independent Speaker Identification Model
Title | Speech2Phone: A Multilingual and Text Independent Speaker Identification Model |
Authors | Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Hamilton Pereira da Silva, Pedro Luiz de Paula Filho, Alessandro Ferreira Cordeiro, Victor de Oliveira Guedes, Sandra Maria Aluisio |
Abstract | Voice recognition is an area with wide application potential. Speaker identification is useful in several voice recognition tasks, as seen in voice-based authentication, transcription systems and intelligent personal assistants. Some tasks benefit from open-set models, which can handle new speakers without the need for retraining. Audio embeddings for speaker identification are one proposal to solve this issue. However, choosing a suitable model is a difficult task, especially when the training resources are scarce. Besides, it is not always clear whether embeddings are as good as more traditional methods. In this work, we propose Speech2Phone and compare several embedding models for open-set speaker identification, as well as traditional closed-set models. The models were investigated in the scenario of small datasets, which makes them more applicable to languages in which data scarcity is an issue. The results show that embeddings generated by artificial neural networks are competitive when compared to classical approaches for the task. Considering a testing dataset composed of 20 speakers, the best models reach accuracies of 100% and 76.96% for closed- and open-set scenarios, respectively. Results suggest that the models can perform language-independent speaker identification. Among the tested models, a fully connected one, here presented as Speech2Phone, led to the highest accuracy. Furthermore, the models were tested on different languages, showing that the knowledge learned was successfully transferred to languages both close to and distant from Portuguese (in terms of vocabulary). Finally, the models can scale and can handle more speakers than they were trained for, identifying 150% more speakers while still maintaining 55% accuracy. |
Tasks | Speaker Identification |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.11213v1, https://arxiv.org/pdf/2002.11213v1.pdf |
PWC | https://paperswithcode.com/paper/speech2phone-a-multilingual-and-text |
Repo | https://github.com/Edresson/Speech2Phone |
Framework | none |
pymoo: Multi-objective Optimization in Python
Title | pymoo: Multi-objective Optimization in Python |
Authors | Julian Blank, Kalyanmoy Deb |
Abstract | Python has become the programming language of choice for research and industry projects related to data science, machine learning, and deep learning. Since optimization is an inherent part of these research fields, more optimization related frameworks have arisen in the past few years. Only a few of them support optimization of multiple conflicting objectives at a time, but do not provide comprehensive tools for a complete multi-objective optimization task. To address this issue, we have developed pymoo, a multi-objective optimization framework in Python. We provide a guide to getting started with our framework by demonstrating the implementation of an exemplary constrained multi-objective optimization scenario. Moreover, we give a high-level overview of the architecture of pymoo to show its capabilities followed by an explanation of each module and its corresponding sub-modules. The implementations in our framework are customizable and algorithms can be modified/extended by supplying custom operators. Moreover, a variety of single, multi and many-objective test problems are provided and gradients can be retrieved by automatic differentiation out of the box. Also, pymoo addresses practical needs, such as the parallelization of function evaluations, methods to visualize low and high-dimensional spaces, and tools for multi-criteria decision making. For more information about pymoo, readers are encouraged to visit: https://pymoo.org |
Tasks | Decision Making |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2002.04504v1, https://arxiv.org/pdf/2002.04504v1.pdf |
PWC | https://paperswithcode.com/paper/pymoo-multi-objective-optimization-in-python |
Repo | https://github.com/msu-coinlab/pymoo |
Framework | none |
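As a minimal illustration of what optimizing multiple conflicting objectives means, the sketch below filters a handful of candidate solutions down to the non-dominated (Pareto-optimal) ones. It deliberately does not use the pymoo API; the function names and the candidate values are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one (minimization)."""
    return all(u <= v for u, v in zip(a, b)) and any(u < v for u, v in zip(a, b))


def non_dominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (f1, f2) objective values for five candidate solutions.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = non_dominated(candidates)
```

The surviving points trade one objective against the other, which is why a multi-objective optimizer typically returns a set of solutions plus decision-making tools rather than a single optimum.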
Benchmarking Graph Neural Networks
Title | Benchmarking Graph Neural Networks |
Authors | Vijay Prakash Dwivedi, Chaitanya K. Joshi, Thomas Laurent, Yoshua Bengio, Xavier Bresson |
Abstract | Graph neural networks (GNNs) have become the standard toolkit for analyzing and learning from data on graphs. They have been successfully applied to a myriad of domains including chemistry, physics, social sciences, knowledge graphs, recommendation, and neuroscience. As the field grows, it becomes critical to identify the architectures and key mechanisms which generalize across graph sizes, enabling us to tackle larger, more complex datasets and domains. Unfortunately, it has been increasingly difficult to gauge the effectiveness of new GNNs and compare models in the absence of a standardized benchmark with consistent experimental settings and large datasets. In this paper, we propose a reproducible GNN benchmarking framework, with the facility for researchers to add new datasets and models conveniently. We apply this benchmarking framework to novel medium-scale graph datasets from mathematical modeling, computer vision, chemistry and combinatorial problems to establish key operations when designing effective GNNs. Precisely, graph convolutions, anisotropic diffusion, residual connections and normalization layers are universal building blocks for developing robust and scalable GNNs. |
Tasks | Knowledge Graphs |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00982v1, https://arxiv.org/pdf/2003.00982v1.pdf |
PWC | https://paperswithcode.com/paper/benchmarking-graph-neural-networks |
Repo | https://github.com/dmlc/dgl |
Framework | pytorch |