Paper Group AWR 51
Semantic-based End-to-End Learning for Typhoon Intensity Prediction. Deep Multi-attributed Graph Translation with Node-Edge Co-evolution. Invariant Rationalization. Non-Adversarial Video Synthesis with Learned Priors. StageNet: Stage-Aware Neural Networks for Health Risk Prediction. Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible …
Semantic-based End-to-End Learning for Typhoon Intensity Prediction
Title | Semantic-based End-to-End Learning for Typhoon Intensity Prediction |
Authors | Hamada M. Zahera, Mohamed Ahmed Sherif, Axel Ngonga |
Abstract | Disaster prediction is one of the most critical tasks towards disaster surveillance and preparedness. Existing technologies employ different machine learning approaches to predict incoming disasters from historical environmental data. However, for short-term disasters (e.g., earthquakes), historical data alone has a limited prediction capability. Therefore, additional sources of warnings are required for accurate prediction. We consider social media as a supplementary source of knowledge in addition to historical environmental data. However, social media posts (e.g., tweets) are very informal and contain only limited content. To alleviate these limitations, we propose the combination of semantically-enriched word embedding models to represent entities in tweets with their semantic representations computed with the traditional word2vec. Moreover, we study the correlation between social media posts and typhoon magnitudes (also called intensities) in terms of volume and sentiments of tweets. Based on these insights, we propose an end-to-end framework that learns from disaster-related tweets and environmental data to improve typhoon intensity prediction. This paper is an extension of our work originally published in K-CAP 2019 [32]. We extended this paper by building our framework with state-of-the-art deep neural models, updating our dataset with new typhoons and their tweets to date, and benchmarking our approach against recent baselines in disaster prediction. Our experimental results show that our approach outperforms the state-of-the-art baselines in terms of F1-score, with an improvement (CNN by 12.1% and BiLSTM by 3.1%) compared with the last experiments. |
Tasks | |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.13779v1 |
https://arxiv.org/pdf/2003.13779v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-based-end-to-end-learning-for |
Repo | https://github.com/dice-group/joint-model-disaster-prediction |
Framework | none |
Deep Multi-attributed Graph Translation with Node-Edge Co-evolution
Title | Deep Multi-attributed Graph Translation with Node-Edge Co-evolution |
Authors | Xiaojie Guo, Liang Zhao, Cameron Nowzari, Setareh Rafatirad, Houman Homayoun, Sai Manoj Pudukotai Dinakarrao |
Abstract | Generalized from image and language translation, graph translation aims to generate a graph in the target domain by conditioning an input graph in the source domain. This promising topic has attracted fast-increasing attention recently. Existing works are limited to either merely predicting the node attributes of graphs with fixed topology or predicting only the graph topology without considering node attributes, but cannot simultaneously predict both of them, due to substantial challenges: 1) difficulty in characterizing the interactive, iterative, and asynchronous translation process of both nodes and edges and 2) difficulty in discovering and maintaining the inherent consistency between the node and edge in predicted graphs. These challenges prevent a generic, end-to-end framework for joint node and edge attributes prediction, which is a need for real-world applications such as malware confinement in IoT networks and structural-to-functional network translation. These real-world applications highly depend on hand-crafting and ad-hoc heuristic models, but cannot sufficiently utilize massive historical data. In this paper, we termed this generic problem “multi-attributed graph translation” and developed a novel framework integrating both node and edge translations seamlessly. The novel edge translation path is generic, which is proven to be a generalization of the existing topology translation models. Then, a spectral graph regularization based on our non-parametric graph Laplacian is proposed in order to learn and maintain the consistency of the predicted nodes and edges. Finally, extensive experiments on both synthetic and real-world application data demonstrated the effectiveness of the proposed method. |
Tasks | |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09945v1 |
https://arxiv.org/pdf/2003.09945v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-multi-attributed-graph-translation-with |
Repo | https://github.com/xguo7/NEC-DGT |
Framework | tf |
Invariant Rationalization
Title | Invariant Rationalization |
Authors | Shiyu Chang, Yang Zhang, Mo Yu, Tommi S. Jaakkola |
Abstract | Selective rationalization improves neural network interpretability by identifying a small subset of input features – the rationale – that best explains or supports the prediction. A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale. However, MMI can be problematic because it picks up spurious correlations between the input features and the output. Instead, we introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments. We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments. Our data and code are available. |
Tasks | |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09772v1 |
https://arxiv.org/pdf/2003.09772v1.pdf | |
PWC | https://paperswithcode.com/paper/invariant-rationalization |
Repo | https://github.com/code-terminator/invariant_rationalization |
Framework | tf |
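The two criteria contrasted in the abstract above can be written out as a short sketch. The symbols are assumptions introduced here for illustration and do not appear in the listing: $X$ is the input, $Y$ the label, $m$ a binary selection mask from a feasible set $\mathcal{S}$, $Z$ the selected rationale, and $E$ an environment variable. The conditional-independence constraint is one common way to formalize "the same predictor is optimal across environments"; the paper itself should be consulted for the exact statement.

```latex
% MMI criterion: pick the mask whose rationale Z is most informative about Y
\max_{m \in \mathcal{S}} \; I(Y; Z) \quad \text{s.t.} \quad Z = m \odot X

% Invariance-constrained variant (assumed formalization):
% given the rationale Z, the label Y should not depend on the environment E
\max_{m \in \mathcal{S}} \; I(Y; Z) \quad \text{s.t.} \quad Z = m \odot X, \quad Y \perp E \mid Z
```

Under this reading, a feature that predicts $Y$ well in one environment but not in another violates the constraint and is excluded from the rationale, which is how spurious correlations would be ruled out.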
Non-Adversarial Video Synthesis with Learned Priors
Title | Non-Adversarial Video Synthesis with Learned Priors |
Authors | Abhishek Aich, Akash Gupta, Rameswar Panda, Rakib Hyder, M. Salman Asif, Amit K. Roy-Chowdhury |
Abstract | Most of the existing works in video synthesis focus on generating videos using adversarial learning. Despite their success, these methods often require input reference frame or fail to generate diverse videos from the given data distribution, with little to no uniformity in the quality of videos that can be generated. Different from these methods, we focus on the problem of generating videos from latent noise vectors, without any reference input frames. To this end, we develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network and a generator through non-adversarial learning. Optimizing for the input latent space along with the network weights allows us to generate videos in a controlled environment, i.e., we can faithfully generate all videos the model has seen during the learning process as well as new unseen videos. Extensive experiments on three challenging and diverse datasets well demonstrate that our approach generates superior quality videos compared to the existing state-of-the-art methods. |
Tasks | |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09565v2 |
https://arxiv.org/pdf/2003.09565v2.pdf | |
PWC | https://paperswithcode.com/paper/non-adversarial-video-synthesis-with-learned |
Repo | https://github.com/abhishekaich27/Navsynth |
Framework | pytorch |
StageNet: Stage-Aware Neural Networks for Health Risk Prediction
Title | StageNet: Stage-Aware Neural Networks for Health Risk Prediction |
Authors | Junyi Gao, Cao Xiao, Yasha Wang, Wen Tang, Lucas M. Glass, Jimeng Sun |
Abstract | Deep learning has demonstrated success in health risk prediction especially for patients with chronic and progressing conditions. Most existing works focus on learning disease […] Network (StageNet) model to extract disease stage information from patient data and integrate it into risk prediction. StageNet is enabled by (1) a stage-aware long short-term memory (LSTM) module that extracts health stage variations unsupervisedly; (2) a stage-adaptive convolutional module that incorporates stage-related progression patterns into risk prediction. We evaluate StageNet on two real-world datasets and show that StageNet outperforms state-of-the-art models in risk prediction task and patient subtyping task. Compared to the best baseline model, StageNet achieves up to 12% higher AUPRC for risk prediction task on two real-world patient datasets. StageNet also achieves over 58% higher Calinski-Harabasz score (a cluster quality metric) for a patient subtyping task. |
Tasks | |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.10054v1 |
https://arxiv.org/pdf/2001.10054v1.pdf | |
PWC | https://paperswithcode.com/paper/stagenet-stage-aware-neural-networks-for |
Repo | https://github.com/v1xerunt/StageNet |
Framework | pytorch |
Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises
Title | Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises |
Authors | Bin Yan, Dong Wang, Huchuan Lu, Xiaoyun Yang |
Abstract | Adversarial attacks on CNNs aim at deceiving models into misbehaving by adding imperceptible perturbations to images. Such attacks help to understand neural networks more deeply and to improve the robustness of deep learning models. Although several works have focused on attacking image classifiers and object detectors, an effective and efficient method for attacking single object trackers of any target in a model-free way remains lacking. In this paper, a cooling-shrinking attack method is proposed to deceive state-of-the-art SiameseRPN-based trackers. An effective and efficient perturbation generator is trained with a carefully designed adversarial loss, which can simultaneously cool hot regions where the target exists on the heatmaps and force the predicted bounding box to shrink, making the tracked target invisible to trackers. Numerous experiments on OTB100, VOT2018, and LaSOT datasets show that our method can effectively fool the state-of-the-art SiameseRPN++ tracker by adding small perturbations to the template or the search regions. Besides, our method has good transferability and is able to deceive other top-performance trackers such as DaSiamRPN, DaSiamRPN-UpdateNet, and DiMP. The source codes are available at https://github.com/MasterBin-IIAU/CSA. |
Tasks | Adversarial Attack |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09595v1 |
https://arxiv.org/pdf/2003.09595v1.pdf | |
PWC | https://paperswithcode.com/paper/cooling-shrinking-attack-blinding-the-tracker |
Repo | https://github.com/MasterBin-IIAU/CSA |
Framework | pytorch |
Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning
Title | Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning |
Authors | Massimo Caccia, Pau Rodriguez, Oleksiy Ostapenko, Fabrice Normandin, Min Lin, Lucas Caccia, Issam Laradji, Irina Rish, Alexandre Lacoste, David Vazquez, Laurent Charlin |
Abstract | Learning from non-stationary data remains a great challenge for machine learning. Continual learning addresses this problem in scenarios where the learning agent faces a stream of changing tasks. In these scenarios, the agent is expected to retain its highest performance on previous tasks without revisiting them while adapting well to the new tasks. Two new recent continual-learning scenarios have been proposed. In meta-continual learning, the model is pre-trained to minimize catastrophic forgetting when trained on a sequence of tasks. In continual-meta learning, the goal is faster remembering, i.e., focusing on how quickly the agent recovers performance rather than measuring the agent’s performance without any adaptation. Both scenarios have the potential to propel the field forward. Yet in their original formulations, they each have limitations. As a remedy, we propose a more general scenario where an agent must quickly solve (new) out-of-distribution tasks, while also requiring fast remembering. We show that current continual learning, meta learning, meta-continual learning, and continual-meta learning techniques fail in this new scenario. Accordingly, we propose a strong baseline: Continual-MAML, an online extension of the popular MAML algorithm. In our empirical experiments, we show that our method is better suited to the new scenario than the methodologies mentioned above, as well as standard continual learning and meta learning approaches. |
Tasks | Continual Learning, Meta-Learning |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.05856v1 |
https://arxiv.org/pdf/2003.05856v1.pdf | |
PWC | https://paperswithcode.com/paper/online-fast-adaptation-and-knowledge |
Repo | https://github.com/ElementAI/osaka |
Framework | pytorch |
One Neuron to Fool Them All
Title | One Neuron to Fool Them All |
Authors | Anshuman Suri, David Evans |
Abstract | Despite vast research in adversarial examples, the root causes of model susceptibility are not well understood. Instead of looking at attack-specific robustness, we propose a notion that evaluates the sensitivity of individual neurons in terms of how robust the model’s output is to direct perturbations of that neuron’s output. Analyzing models from this perspective reveals distinctive characteristics of standard as well as adversarially-trained robust models, and leads to several curious results. In our experiments on CIFAR-10 and ImageNet, we find that attacks using a loss function that targets just a single sensitive neuron find adversarial examples nearly as effectively as ones that target the full model. We analyze the properties of these sensitive neurons to propose a regularization term that can help a model achieve robustness to a variety of different perturbation constraints while maintaining accuracy on natural data distributions. Code for all our experiments is available at https://github.com/iamgroot42/sauron . |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09372v1 |
https://arxiv.org/pdf/2003.09372v1.pdf | |
PWC | https://paperswithcode.com/paper/one-neuron-to-fool-them-all |
Repo | https://github.com/iamgroot42/sauron |
Framework | pytorch |
Learning to adapt class-specific features across domains for semantic segmentation
Title | Learning to adapt class-specific features across domains for semantic segmentation |
Authors | Mikel Menta, Adriana Romero, Joost van de Weijer |
Abstract | Recent advances in unsupervised domain adaptation have shown the effectiveness of adversarial training to adapt features across domains, endowing neural networks with the capability of being tested on a target domain without requiring any training annotations in this domain. The great majority of existing domain adaptation models rely on image translation networks, which often contain a huge amount of domain-specific parameters. Additionally, the feature adaptation step often happens globally, at a coarse level, hindering its applicability to tasks such as semantic segmentation, where details are of crucial importance to provide sharp results. In this thesis, we present a novel architecture, which learns to adapt features across domains by taking into account per class information. To that aim, we design a conditional pixel-wise discriminator network, whose output is conditioned on the segmentation masks. Moreover, following recent advances in image translation, we adopt the recently introduced StarGAN architecture as image translation backbone, since it is able to perform translations across multiple domains by means of a single generator network. Preliminary results on a segmentation task designed to assess the effectiveness of the proposed approach highlight the potential of the model, improving upon strong baselines and alternative designs. |
Tasks | Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08311v1 |
https://arxiv.org/pdf/2001.08311v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-adapt-class-specific-features |
Repo | https://github.com/mkmenta/domain_adapt_segm |
Framework | pytorch |
FPConv: Learning Local Flattening for Point Convolution
Title | FPConv: Learning Local Flattening for Point Convolution |
Authors | Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han |
Abstract | We introduce FPConv, a novel surface-style convolution operator designed for 3D point cloud analysis. Unlike previous methods, FPConv doesn’t require transforming to an intermediate representation like a 3D grid or graph and works directly on the surface geometry of the point cloud. To be more specific, for each point, FPConv performs a local flattening by automatically learning a weight map to softly project surrounding points onto a 2D grid. Regular 2D convolution can thus be applied for efficient feature learning. FPConv can be easily integrated into various network architectures for tasks like 3D object classification and 3D scene segmentation, and achieves comparable performance with existing volumetric-type convolutions. More importantly, our experiments also show that FPConv can be complementary to volumetric convolutions, and jointly training them can further boost overall performance into state-of-the-art results. |
Tasks | 3D Object Classification, Object Classification, Scene Segmentation |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10701v3 |
https://arxiv.org/pdf/2002.10701v3.pdf | |
PWC | https://paperswithcode.com/paper/fpconv-learning-local-flattening-for-point |
Repo | https://github.com/lyqun/FPConv |
Framework | pytorch |
Randomized Smoothing of All Shapes and Sizes
Title | Randomized Smoothing of All Shapes and Sizes |
Authors | Greg Yang, Tony Duan, J. Edward Hu, Hadi Salman, Ilya Razenshteyn, Jerry Li |
Abstract | Randomized smoothing is a recently proposed defense against adversarial attacks that has achieved state-of-the-art provable robustness against $\ell_2$ perturbations. Soon after, a number of works devised new randomized smoothing schemes for other metrics, such as $\ell_1$ or $\ell_\infty$; however, for each geometry, substantial effort was needed to derive new robustness guarantees. This begs the question: can we find a general theory for randomized smoothing? In this work we propose a novel framework for devising and analyzing randomized smoothing schemes, and validate its effectiveness in practice. Our theoretical contributions are as follows: (1) We show that for an appropriate notion of “optimal”, the optimal smoothing distributions for any “nice” norm have level sets given by the *Wulff Crystal* of that norm. (2) We propose two novel and complementary methods for deriving provably robust radii for any smoothing distribution. Finally, (3) we show fundamental limits to current randomized smoothing techniques via the theory of *Banach space cotypes*. By combining (1) and (2), we significantly improve the state-of-the-art certified accuracy in $\ell_1$ on standard datasets. On the other hand, using (3), we show that, without more information than label statistics under random input perturbations, randomized smoothing cannot achieve nontrivial certified accuracy against perturbations of $\ell_p$-norm $\Omega(\min(1, d^{\frac{1}{p}-\frac{1}{2}}))$, when the input dimension $d$ is large. We provide code in github.com/tonyduan/rs4a. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08118v2 |
https://arxiv.org/pdf/2002.08118v2.pdf | |
PWC | https://paperswithcode.com/paper/randomized-smoothing-of-all-shapes-and-sizes |
Repo | https://github.com/tonyduan/rs4a |
Framework | pytorch |
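As a rough illustration of the randomized smoothing idea described in the abstract above, the sketch below predicts with a smoothed classifier by majority vote over noisy copies of the input. It is not the rs4a implementation: the base classifier, the Gaussian noise choice, and the names `base_classifier` and `smoothed_predict` are all assumptions made for this example, and no certified radius is computed.

```python
import random
from collections import Counter


def base_classifier(x):
    """Hypothetical base classifier: label 1 if the coordinates sum to a positive value."""
    return 1 if sum(x) > 0 else 0


def smoothed_predict(classifier, x, sigma=0.25, n_samples=1000, seed=0):
    """Monte Carlo estimate of a smoothed classifier.

    Adds isotropic Gaussian noise (an assumed smoothing distribution) to x,
    classifies each noisy copy, and returns the majority label together with
    the fraction of votes it received.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples


if __name__ == "__main__":
    label, vote_fraction = smoothed_predict(base_classifier, [1.0, 1.0])
    print(label, vote_fraction)
```

The vote fraction is the "label statistics under random input perturbations" that the abstract refers to. Swapping the `rng.gauss` call for another sampler is where a different smoothing distribution would enter.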
Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning
Title | Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning |
Authors | Zhongling Huang, Corneliu Octavian Dumitru, Zongxu Pan, Bin Lei, Mihai Datcu |
Abstract | The classification of large-scale high-resolution SAR land cover images acquired by satellites is a challenging task, facing several difficulties such as semantic annotation with expertise, changing data characteristics due to varying imaging parameters or regional target area differences, and complex scattering mechanisms being different from optical imaging. Given a large-scale SAR land cover dataset collected from TerraSAR-X images with a hierarchical three-level annotation of 150 categories and comprising more than 100,000 patches, three main challenges in automatically interpreting SAR images of highly imbalanced classes, geographic diversity, and label noise are addressed. In this letter, a deep transfer learning method is proposed based on a similarly annotated optical land cover dataset (NWPU-RESISC45). Besides, a top-2 smooth loss function with cost-sensitive parameters is introduced to tackle the label noise and class imbalance problems. The proposed method shows high efficiency in transferring information from a similarly annotated remote sensing dataset and robust performance on highly imbalanced classes, and it alleviates the over-fitting problem caused by label noise. What’s more, the learned deep model generalizes well to other SAR-specific tasks, such as MSTAR target recognition with a state-of-the-art classification accuracy of 99.46%. |
Tasks | Transfer Learning |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01425v1 |
https://arxiv.org/pdf/2001.01425v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-large-scale-high-resolution |
Repo | https://github.com/Alien9427/SAR_specific_models |
Framework | pytorch |
REST: Robust and Efficient Neural Networks for Sleep Monitoring in the Wild
Title | REST: Robust and Efficient Neural Networks for Sleep Monitoring in the Wild |
Authors | Rahul Duggal, Scott Freitas, Cao Xiao, Duen Horng Chau, Jimeng Sun |
Abstract | In recent years, significant attention has been devoted towards integrating deep learning technologies in the healthcare domain. However, to safely and practically deploy deep learning models for home health monitoring, two significant challenges must be addressed: the models should be (1) robust against noise; and (2) compact and energy-efficient. We propose REST, a new method that simultaneously tackles both issues via 1) adversarial training and controlling the Lipschitz constant of the neural network through spectral regularization while 2) enabling neural network compression through sparsity regularization. We demonstrate that REST produces highly-robust and efficient models that substantially outperform the original full-sized models in the presence of noise. For the sleep staging task over single-channel electroencephalogram (EEG), the REST model achieves a macro-F1 score of 0.67 vs. 0.39 achieved by a state-of-the-art model in the presence of Gaussian noise while obtaining 19x parameter reduction and 15x MFLOPS reduction on two large, real-world EEG datasets. By deploying these models to an Android application on a smartphone, we quantitatively observe that REST allows models to achieve up to 17x energy reduction and 9x faster inference. We open-source the code repository with this paper: https://github.com/duggalrahul/REST. |
Tasks | EEG, Neural Network Compression, Sleep Stage Detection |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.11363v1 |
https://arxiv.org/pdf/2001.11363v1.pdf | |
PWC | https://paperswithcode.com/paper/rest-robust-and-efficient-neural-networks-for |
Repo | https://github.com/duggalrahul/REST |
Framework | tf |
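The abstract above reports results as macro-F1 scores. For readers unfamiliar with the metric, here is a minimal sketch of how it is computed: per-class F1 averaged with equal weight per class. The sleep-stage labels in the usage example are made up for illustration and are not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Hypothetical sleep-stage annotations and predictions
    y_true = ["W", "W", "N1", "N2", "N2", "REM"]
    y_pred = ["W", "N1", "N1", "N2", "N2", "N2"]
    print(round(macro_f1(y_true, y_pred), 3))  # 8/15, about 0.533
```

Because every class counts equally, a stage that is rare but always misclassified (REM in this toy example) pulls the score down sharply, which is why macro-F1 is a common choice for imbalanced sleep-staging data.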
AnomalyDAE: Dual autoencoder for anomaly detection on attributed networks
Title | AnomalyDAE: Dual autoencoder for anomaly detection on attributed networks |
Authors | Haoyi Fan, Fengbin Zhang, Zuoyong Li |
Abstract | Anomaly detection on attributed networks aims at finding nodes whose patterns deviate significantly from the majority of reference nodes, which is pervasive in many applications such as network intrusion detection and social spammer detection. However, most existing methods neglect the complex cross-modality interactions between network structure and node attribute. In this paper, we propose a deep joint representation learning framework for anomaly detection through a dual autoencoder (AnomalyDAE), which captures the complex interactions between network structure and node attribute for high-quality embeddings. Specifically, AnomalyDAE consists of a structure autoencoder and an attribute autoencoder to learn both node embedding and attribute embedding jointly in latent space. Moreover, attention mechanism is employed in structure encoder to learn the importance between a node and its neighbors for an effective capturing of structure pattern, which is important to anomaly detection. Besides, by taking both the node embedding and attribute embedding as inputs of attribute decoder, the cross-modality interactions between network structure and node attribute are learned during the reconstruction of node attribute. Finally, anomalies can be detected by measuring the reconstruction errors of nodes from both the structure and attribute perspectives. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed method. |
Tasks | Anomaly Detection, Intrusion Detection, Network Intrusion Detection, Representation Learning |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03665v2 |
https://arxiv.org/pdf/2002.03665v2.pdf | |
PWC | https://paperswithcode.com/paper/anomalydae-dual-autoencoder-for-anomaly |
Repo | https://github.com/haoyfan/AnomalyDAE |
Framework | tf |
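The final detection step in the abstract above, scoring nodes by reconstruction error from both the structure and attribute views, can be sketched as follows. This is an illustrative assumption rather than the repository's code: the weighting parameter `alpha`, the Euclidean row error, and the function names are hypothetical, and the reconstructions are hand-written stand-ins for autoencoder outputs.

```python
import math


def row_error(original, reconstructed):
    """Euclidean distance between one node's original and reconstructed row."""
    return math.sqrt(sum((o - r) ** 2 for o, r in zip(original, reconstructed)))


def anomaly_scores(adj, adj_hat, attrs, attrs_hat, alpha=0.5):
    """Per-node score: alpha * structure error + (1 - alpha) * attribute error.

    alpha is an assumed balancing weight between the two views.
    """
    return [
        alpha * row_error(a, a_hat) + (1 - alpha) * row_error(x, x_hat)
        for a, a_hat, x, x_hat in zip(adj, adj_hat, attrs, attrs_hat)
    ]


if __name__ == "__main__":
    # Toy graph with 3 nodes; node 2 is reconstructed poorly in both views
    adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    adj_hat = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.8, 0.7, 0.1]]
    attrs = [[1.0, 0.0], [1.0, 0.0], [0.0, 5.0]]
    attrs_hat = [[0.9, 0.1], [1.0, 0.1], [1.0, 0.5]]
    scores = anomaly_scores(adj, adj_hat, attrs, attrs_hat)
    print(max(range(len(scores)), key=scores.__getitem__))  # index of the most anomalous node
```

Nodes are then ranked by score, and the highest-scoring ones are flagged as anomalies.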
BATS: Binary ArchitecTure Search
Title | BATS: Binary ArchitecTure Search |
Authors | Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos |
Abstract | This paper proposes Binary ArchitecTure Search (BATS), a framework that drastically reduces the accuracy gap between binary neural networks and their real-valued counterparts by means of Neural Architecture Search (NAS). We show that directly applying NAS to the binary domain provides very poor results. To alleviate this, we describe, to our knowledge, for the first time, the 3 key ingredients for successfully applying NAS to the binary domain. Specifically, we (1) introduce and design a novel binary-oriented search space, (2) propose a new mechanism for controlling and stabilising the resulting searched topologies, (3) propose and validate a series of new search strategies for binary networks that lead to faster convergence and lower search times. Experimental results demonstrate the effectiveness of the proposed approach and the necessity of searching in the binary space directly. Moreover, (4) we set a new state-of-the-art for binary neural networks on CIFAR10, CIFAR100 and ImageNet datasets. Code will be made available at https://github.com/1adrianb/binary-nas |
Tasks | Neural Architecture Search |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01711v1 |
https://arxiv.org/pdf/2003.01711v1.pdf | |
PWC | https://paperswithcode.com/paper/bats-binary-architecture-search |
Repo | https://github.com/1adrianb/binary-nas |
Framework | none |