Paper Group AWR 276
M3Fusion: A Deep Learning Architecture for Multi-{Scale/Modal/Temporal} satellite data fusion
Title | M3Fusion: A Deep Learning Architecture for Multi-{Scale/Modal/Temporal} satellite data fusion |
Authors | P. Benedetti, D. Ienco, R. Gaetano, K. Osé, R. Pensa, S. Dupuy |
Abstract | Modern Earth Observation systems provide sensing data at different temporal and spatial resolutions. Among optical sensors, the Sentinel-2 program today supplies images at high temporal resolution (every 5 days) and high spatial resolution (10m) that can be useful to monitor land cover dynamics. On the other hand, Very High Spatial Resolution (VHSR) images remain an essential tool for land cover mapping characterized by fine spatial patterns. Understanding how to efficiently leverage these complementary sources of information together for land cover mapping is still challenging. To tackle land cover mapping through the fusion of multi-temporal High Spatial Resolution and Very High Spatial Resolution satellite images, we propose an end-to-end deep learning framework, named M3Fusion, able to simultaneously leverage the temporal knowledge contained in time series data and the fine spatial information available in VHSR images. Experiments carried out on the Reunion Island study area assess the quality of our proposal considering both quantitative and qualitative aspects. |
Tasks | Time Series |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01945v1 |
PDF | http://arxiv.org/pdf/1803.01945v1.pdf |
PWC | https://paperswithcode.com/paper/m3fusion-a-deep-learning-architecture-for |
Repo | https://github.com/remicres/otbtf |
Framework | tf |
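
The fusion idea above lends itself to a compact sketch: one branch summarizes the Sentinel-2 time series with a recurrent net, another summarizes the VHSR patch with a small CNN, and the two feature vectors are concatenated for classification. This is a minimal PyTorch illustration of the two-branch pattern, not the authors' exact M3Fusion architecture; all layer sizes and input shapes are assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """A GRU summarizes the Sentinel-2 time series, a small CNN summarizes
    the VHSR patch, and the concatenated features feed a land-cover head."""
    def __init__(self, n_bands=10, n_classes=13, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(n_bands, hidden, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(hidden + 32, n_classes)

    def forward(self, ts, patch):
        _, h = self.rnn(ts)                      # ts: (B, T, n_bands)
        return self.head(torch.cat([h[-1], self.cnn(patch)], dim=1))

logits = TwoBranchFusion()(torch.randn(2, 34, 10),    # 34-date time series
                           torch.randn(2, 4, 25, 25)) # 4-band VHSR patch
```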
Grassmannian Learning: Embedding Geometry Awareness in Shallow and Deep Learning
Title | Grassmannian Learning: Embedding Geometry Awareness in Shallow and Deep Learning |
Authors | Jiayao Zhang, Guangxu Zhu, Robert W. Heath Jr., Kaibin Huang |
Abstract | Modern machine learning algorithms have been adopted in a range of signal-processing applications spanning computer vision, natural language processing, and artificial intelligence. Many relevant problems involve subspace-structured features, orthogonality-constrained or low-rank-constrained objective functions, or subspace distances. These mathematical characteristics are expressed naturally using the Grassmann manifold. Unfortunately, this structure is not yet exploited by many traditional learning algorithms. In the last few years, there has been growing interest in studying the Grassmann manifold to tackle new learning problems. Such attempts have been rewarded with substantial performance improvements both in classic learning and in learning using deep neural networks. We term the former shallow Grassmannian learning and the latter deep Grassmannian learning. The aim of this paper is to introduce the emerging area of Grassmannian learning by surveying common mathematical problems and primary solution approaches, and by giving an overview of various applications. We hope to inspire practitioners in different fields to adopt the powerful tool of Grassmannian learning in their research. |
Tasks | |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02229v2 |
PDF | http://arxiv.org/pdf/1808.02229v2.pdf |
PWC | https://paperswithcode.com/paper/grassmannian-learning-embedding-geometry |
Repo | https://github.com/matthew-mcateer/Keras_pruning |
Framework | tf |
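
A worked example makes "subspace distance" concrete: the geodesic distance between two subspaces on the Grassmann manifold is the 2-norm of their principal angles, which come from an SVD of the product of their orthonormal bases. A minimal NumPy sketch (dimensions are arbitrary):

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance on the Grassmann manifold between the column
    spans of A and B, via principal angles (SVD of Qa^T Qb)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))   # principal angles
    return np.linalg.norm(theta)

rng = np.random.default_rng(0)
d = grassmann_distance(rng.normal(size=(10, 3)), rng.normal(size=(10, 3)))
```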
Adversarial Decomposition of Text Representation
Title | Adversarial Decomposition of Text Representation |
Authors | Alexey Romanov, Anna Rumshisky, Anna Rogers, David Donahue |
Abstract | In this paper, we present a method for adversarial decomposition of text representation. This method can be used to decompose a representation of an input sentence into several independent vectors, each of them responsible for a specific aspect of the input sentence. We evaluate the proposed method on two case studies: the conversion between different social registers and diachronic language change. We show that the proposed method is capable of fine-grained controlled change of these aspects of the input sentence. It also learns a continuous (rather than categorical) representation of the style of the sentence, which is more linguistically realistic. The model uses adversarial-motivational training and includes a special motivational loss, which acts opposite to the discriminator and encourages a better decomposition. Furthermore, we evaluate the obtained meaning embeddings on a downstream task of paraphrase detection and show that they significantly outperform the embeddings of a regular autoencoder. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09042v2 |
PDF | http://arxiv.org/pdf/1808.09042v2.pdf |
PWC | https://paperswithcode.com/paper/adversarial-decomposition-of-text |
Repo | https://github.com/text-machine-lab/adversarial_decomposition |
Framework | pytorch |
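
The adversarial-motivational setup can be summarized in a few lines: a discriminator tries to recover the style from the meaning vector (and the encoder is trained to defeat it), while a motivational head should recover the style from the form vector easily. A toy PyTorch sketch of the loss wiring; the shapes, the linear heads, and the encoder split are all hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical setup: an encoder (not shown) splits a sentence embedding
# into a "meaning" vector and a "form" (style) vector.
meaning, form = torch.randn(8, 64), torch.randn(8, 16)
style_labels = torch.randint(0, 2, (8,))

discriminator = nn.Linear(64, 2)  # tries to read style from the meaning vector
motivator = nn.Linear(16, 2)      # should read style from the form vector easily

d_loss = F.cross_entropy(discriminator(meaning), style_labels)
m_loss = F.cross_entropy(motivator(form), style_labels)
# The encoder is trained to fool the discriminator (pushing style out of
# "meaning") while helping the motivator (pulling style into "form"):
encoder_objective = -d_loss + m_loss
```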
Removing Confounding Factors Associated Weights in Deep Neural Networks Improves the Prediction Accuracy for Healthcare Applications
Title | Removing Confounding Factors Associated Weights in Deep Neural Networks Improves the Prediction Accuracy for Healthcare Applications |
Authors | Haohan Wang, Zhenglin Wu, Eric P. Xing |
Abstract | The proliferation of healthcare data has brought the opportunity to apply data-driven approaches, such as machine learning methods, to assist diagnosis. Recently, many deep learning methods have shown impressive success in predicting disease status from raw input data. However, the “black-box” nature of deep learning and the high-reliability requirements of biomedical applications have created new challenges regarding the existence of confounding factors. In this paper, with a brief argument that inappropriate handling of confounding factors will lead to models’ sub-optimal performance in real-world applications, we present an efficient method that can remove the influence of confounding factors such as age or gender to improve the across-cohort prediction accuracy of neural networks. One distinct advantage of our method is that it requires only minimal changes to the baseline model’s architecture, so it can be plugged into most existing neural networks. We conduct experiments on CT-scan, MRA, and EEG brain-wave data with convolutional neural networks and LSTMs to verify the effectiveness of our method. |
Tasks | EEG |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07276v3 |
PDF | http://arxiv.org/pdf/1803.07276v3.pdf |
PWC | https://paperswithcode.com/paper/removing-confounding-factors-associated |
Repo | https://github.com/HaohanWang/CF |
Framework | tf |
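
One way to picture "removing confounding-factor-associated weights" is as a filtering step on hidden units. The toy NumPy sketch below zeroes units whose activations correlate strongly with a confounder such as age; it is a deliberately simplified stand-in for the paper's method, and the correlation threshold is arbitrary:

```python
import numpy as np

def filter_confounded_units(H, c, threshold=0.5):
    """Zero the hidden units whose activations correlate strongly with a
    confounder. H: (n_samples, n_units) activations; c: (n_samples,)."""
    corr = np.array([abs(np.corrcoef(H[:, j], c)[0, 1])
                     for j in range(H.shape[1])])
    mask = corr < threshold        # keep only weakly confounded units
    return H * mask, mask

rng = np.random.default_rng(1)
H, age = rng.normal(size=(100, 32)), rng.normal(size=100)
H_clean, kept = filter_confounded_units(H, age)
```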
Conditional Affordance Learning for Driving in Urban Environments
Title | Conditional Affordance Learning for Driving in Urban Environments |
Authors | Axel Sauer, Nikolay Savinov, Andreas Geiger |
Abstract | Most existing approaches to autonomous driving fall into one of two categories: modular pipelines, which build an extensive model of the environment, and imitation learning approaches, which map images directly to control outputs. A recently proposed third paradigm, direct perception, aims to combine the advantages of both by using a neural network to learn appropriate low-dimensional intermediate representations. However, existing direct perception approaches are restricted to simple highway situations, lacking the ability to navigate intersections, stop at traffic lights or respect speed limits. In this work, we propose a direct perception approach which maps video input to intermediate representations suitable for autonomous navigation in complex urban environments given high-level directional inputs. Compared to state-of-the-art reinforcement and conditional imitation learning approaches, we achieve an improvement of up to 68% in goal-directed navigation on the challenging CARLA simulation benchmark. In addition, our approach is the first to handle traffic lights and speed signs by using image-level labels only, as well as smooth car-following, resulting in a significant reduction of traffic accidents in simulation. |
Tasks | Autonomous Driving, Autonomous Navigation, Imitation Learning |
Published | 2018-06-18 |
URL | http://arxiv.org/abs/1806.06498v3 |
PDF | http://arxiv.org/pdf/1806.06498v3.pdf |
PWC | https://paperswithcode.com/paper/conditional-affordance-learning-for-driving |
Repo | https://github.com/xl-sr/CAL |
Framework | none |
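
The direct-perception idea is that a network predicts a handful of interpretable affordances and a classical controller consumes them. A toy longitudinal controller illustrating that split; the affordance keys and thresholds are invented for illustration, not the paper's interface:

```python
def longitudinal_control(affordances):
    """Toy controller on top of predicted affordances: perception outputs
    a few interpretable scalars, and control stays classical."""
    if affordances["red_light"] or affordances["dist_to_vehicle_m"] < 5.0:
        return {"throttle": 0.0, "brake": 1.0}
    speed_err = affordances["speed_limit_kmh"] - affordances["speed_kmh"]
    return {"throttle": max(0.0, min(1.0, 0.05 * speed_err)), "brake": 0.0}

act = longitudinal_control({"red_light": False, "dist_to_vehicle_m": 30.0,
                            "speed_limit_kmh": 30.0, "speed_kmh": 25.0})
```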
Let Me Not Lie: Learning MultiNomial Logit
Title | Let Me Not Lie: Learning MultiNomial Logit |
Authors | Brian Sifringer, Virginie Lurkin, Alexandre Alahi |
Abstract | In discrete choice modeling (DCM), model misspecifications may lead to limited predictability and biased parameter estimates. In this paper, we propose a new approach for estimating choice models in which we divide the systematic part of the utility specification into (i) a knowledge-driven part, and (ii) a data-driven one, which learns a new representation from available explanatory variables. Our formulation increases the predictive power of standard DCM without sacrificing interpretability. We show the effectiveness of our formulation by augmenting the utility specification of the Multinomial Logit (MNL) and the Nested Logit (NL) models with a new non-linear representation arising from a Neural Network (NN), leading to new choice models referred to as the Learning Multinomial Logit (L-MNL) and Learning Nested Logit (L-NL) models. Using multiple publicly available datasets based on revealed and stated preferences, we show that our models outperform the traditional ones, both in terms of predictive performance and accuracy in parameter estimation. All source code for the models is shared to promote open science. |
Tasks | |
Published | 2018-12-23 |
URL | https://arxiv.org/abs/1812.09747v2 |
PDF | https://arxiv.org/pdf/1812.09747v2.pdf |
PWC | https://paperswithcode.com/paper/let-me-not-lie-learning-multinomial-logit |
Repo | https://github.com/BSifringer/EnhancedDCM |
Framework | tf |
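
The L-MNL utility splits into an interpretable linear term over knowledge-driven variables and a learned term over the remaining ones, with choice probabilities given by a softmax over alternatives. A minimal PyTorch sketch under assumed input shapes and layer sizes:

```python
import torch
import torch.nn as nn

class LMNL(nn.Module):
    """Sketch of an L-MNL utility: an interpretable linear term over
    knowledge-driven variables X plus a learned term over the remaining
    variables Q; choice probabilities via a softmax over alternatives."""
    def __init__(self, n_x, n_q, hidden=16):
        super().__init__()
        self.beta = nn.Linear(n_x, 1, bias=False)  # interpretable taste parameters
        self.nn_term = nn.Sequential(
            nn.Linear(n_q, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, X, Q):
        # X, Q: (batch, n_alternatives, n_features)
        utility = self.beta(X).squeeze(-1) + self.nn_term(Q).squeeze(-1)
        return torch.log_softmax(utility, dim=-1)  # log choice probabilities

log_probs = LMNL(n_x=3, n_q=5)(torch.randn(2, 4, 3), torch.randn(2, 4, 5))
```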
Towards Anticipation of Architectural Smells using Link Prediction Techniques
Title | Towards Anticipation of Architectural Smells using Link Prediction Techniques |
Authors | J. Andrés Díaz-Pace, Antonela Tommasel, Daniela Godoy |
Abstract | Software systems naturally evolve, and this evolution often brings design problems that cause system degradation. Architectural smells are typical symptoms of such problems, and several of these smells are related to undesired dependencies among modules. The early detection of these smells is important for developers, because they can plan ahead for maintenance or refactoring efforts, thus preventing system degradation. Existing tools for identifying architectural smells can detect the smells once they exist in the source code, which means that their undesired dependencies have already been created. In this work, we explore a forward-looking approach that infers groups of likely module dependencies that can anticipate architectural smells in a future system version. Our approach considers the current module structure as a network, along with information from previous versions, and applies link prediction techniques (from the field of social network analysis). In particular, we focus on dependency-related smells, such as Cyclic Dependency and Hub-Like Dependency, which fit well with the link prediction model. An initial evaluation with two open-source projects shows that, under certain considerations, the predictions of our approach are satisfactory. Furthermore, the approach can be extended to other types of dependency-based smells or metrics. |
Tasks | Link Prediction |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.06362v1 |
PDF | http://arxiv.org/pdf/1808.06362v1.pdf |
PWC | https://paperswithcode.com/paper/towards-anticipation-of-architectural-smells |
Repo | https://github.com/tommantonela/scam-2018 |
Framework | none |
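
Casting smell anticipation as link prediction means scoring module pairs that are not yet connected and flagging high-scoring pairs as likely future dependencies. A small networkx sketch using a classic score (Adamic-Adar); the graph and the choice of score are illustrative, not the paper's pipeline:

```python
import networkx as nx

# Toy module-dependency graph; an edge means "depends on".
G = nx.Graph([("ui", "core"), ("core", "db"), ("ui", "util"),
              ("core", "util"), ("db", "util")])

# Score currently-absent dependencies; high-scoring pairs are candidate
# future links that could close a cycle or grow a hub.
for u, v, score in sorted(nx.adamic_adar_index(G), key=lambda t: -t[2]):
    print(f"{u} -> {v}: {score:.2f}")
```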
Notes on Deep Learning for NLP
Title | Notes on Deep Learning for NLP |
Authors | Antoine J. -P. Tixier |
Abstract | My notes on Deep Learning for NLP. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09772v2 |
PDF | http://arxiv.org/pdf/1808.09772v2.pdf |
PWC | https://paperswithcode.com/paper/notes-on-deep-learning-for-nlp |
Repo | https://github.com/NovaisGabriel/CNN_RNN_for_NLP |
Framework | tf |
A Stein variational Newton method
Title | A Stein variational Newton method |
Authors | Gianluca Detommaso, Tiangang Cui, Alessio Spantini, Youssef Marzouk, Robert Scheichl |
Abstract | Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03085v2 |
PDF | http://arxiv.org/pdf/1806.03085v2.pdf |
PWC | https://paperswithcode.com/paper/a-stein-variational-newton-method |
Repo | https://github.com/gianlucadetommaso/Stein-variational-samplers |
Framework | none |
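
For reference, the first-order SVGD update that the paper accelerates moves each particle by a kernel-smoothed average of log-density gradients plus a repulsive kernel-gradient term. A NumPy sketch with an RBF kernel, sampling a standard Gaussian; the bandwidth and step size are arbitrary, and the Newton variant would additionally use curvature:

```python
import numpy as np

def svgd_step(X, grad_logp, h=1.0, eps=0.1):
    """One first-order SVGD update with an RBF kernel.
    X: (n, d) particles; grad_logp: callable X -> (n, d)."""
    diff = X[:, None, :] - X[None, :, :]               # (n, n, d)
    K = np.exp(-(diff ** 2).sum(-1) / (2 * h ** 2))    # k(x_i, x_j)
    G = grad_logp(X)
    # phi_i = mean_j [ K_ji grad_logp(x_j) + K_ji (x_i - x_j) / h^2 ]
    attract = K.T @ G                                  # smoothed gradient
    repulse = (X * K.sum(0)[:, None] - K.T @ X) / h ** 2
    return X + eps * (attract + repulse) / len(X)

X = np.random.default_rng(2).normal(size=(50, 2)) * 3.0
for _ in range(200):                       # drive particles toward N(0, I)
    X = svgd_step(X, lambda Z: -Z)         # grad log N(0, I) is -x
```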
Adaptive Sampling Towards Fast Graph Representation Learning
Title | Adaptive Sampling Towards Fast Graph Representation Learning |
Authors | Wenbing Huang, Tong Zhang, Yu Rong, Junzhou Huang |
Abstract | Graph Convolutional Networks (GCNs) have become a crucial tool for learning representations of graph vertices. The main challenge in adapting GCNs to large-scale graphs is scalability: they incur a heavy cost in both computation and memory due to the uncontrollable neighborhood expansion across layers. In this paper, we accelerate the training of GCNs by developing an adaptive layer-wise sampling method. By constructing the network layer by layer in a top-down pathway, we sample the lower layer conditioned on the top one, where the sampled neighborhoods are shared by different parent nodes and over-expansion is avoided owing to the fixed-size sampling. More importantly, the proposed sampler is adaptive and applicable for explicit variance reduction, which in turn enhances the training of our method. Furthermore, we propose a novel and economical approach to promote message passing over distant nodes by applying skip connections. Intensive experiments on several benchmarks verify the effectiveness of our method in terms of classification accuracy, while enjoying faster convergence. |
Tasks | Graph Representation Learning, Node Classification, Representation Learning |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05343v3 |
PDF | http://arxiv.org/pdf/1809.05343v3.pdf |
PWC | https://paperswithcode.com/paper/adaptive-sampling-towards-fast-graph |
Repo | https://github.com/huangwb/AS-GCN |
Framework | tf |
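
The top-down, layer-wise sampling idea: starting from the nodes a batch needs at the output layer, draw one fixed-size set of lower-layer nodes shared by all parents, then repeat downward. The NumPy sketch below uses uniform sampling for clarity, whereas the paper learns an adaptive, variance-reducing distribution:

```python
import numpy as np

def sample_lower_layer(adj, upper_nodes, n_samples, rng):
    """Draw one fixed-size set of lower-layer nodes shared by all the
    upper-layer nodes (uniform here; the paper learns an adaptive,
    variance-reducing proposal)."""
    candidates = np.unique(np.nonzero(adj[upper_nodes])[1])
    take = min(n_samples, len(candidates))
    return rng.choice(candidates, size=take, replace=False)

rng = np.random.default_rng(3)
adj = (rng.random((100, 100)) < 0.05).astype(int)  # toy adjacency matrix
batch = np.array([0, 1, 2])                        # nodes needed at the top
mid = sample_lower_layer(adj, batch, 16, rng)      # shared by the whole batch
bottom = sample_lower_layer(adj, mid, 16, rng)     # fixed size, no blow-up
```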
Injecting and removing malignant features in mammography with CycleGAN: Investigation of an automated adversarial attack using neural networks
Title | Injecting and removing malignant features in mammography with CycleGAN: Investigation of an automated adversarial attack using neural networks |
Authors | Anton S. Becker, Lukas Jendele, Ondrej Skopek, Nicole Berger, Soleen Ghafoor, Magda Marcon, Ender Konukoglu |
Abstract | $\textbf{Purpose}$ To train a cycle-consistent generative adversarial network (CycleGAN) on mammographic data to inject or remove features of malignancy, and to determine whether these AI-mediated attacks can be detected by radiologists. $\textbf{Material and Methods}$ From the two publicly available datasets, BCDR and INbreast, we selected images from cancer patients and healthy controls. An internal dataset served as test data, withheld during training. We ran two experiments training CycleGAN on low- and higher-resolution images ($256 \times 256$ px and $512 \times 408$ px). Three radiologists read the images and rated the likelihood of malignancy on a scale from 1 to 5 and the likelihood of the image being manipulated. The readout was evaluated by ROC analysis (area under the ROC curve = AUC). $\textbf{Results}$ At the lower resolution, only one radiologist exhibited markedly lower detection of cancer (AUC=0.85 vs. 0.63, p=0.06), while the other two were unaffected (0.67 vs. 0.69 and 0.75 vs. 0.77, p=0.55). Only one radiologist could discriminate between original and modified images slightly better than chance (0.66, p=0.008). At the higher resolution, all radiologists showed a significantly lower detection rate of cancer in the modified images (0.77-0.84 vs. 0.59-0.69, p=0.008); however, they were now able to reliably detect modified images due to better visibility of artifacts (0.92, 0.92 and 0.97). $\textbf{Conclusion}$ A CycleGAN can implicitly learn malignant features and inject or remove them so that a substantial proportion of small mammographic images would consequently be misdiagnosed. At higher resolutions, however, the method is currently limited and has a clear trade-off between manipulation of images and introduction of artifacts. |
Tasks | Adversarial Attack |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07767v1 |
PDF | http://arxiv.org/pdf/1811.07767v1.pdf |
PWC | https://paperswithcode.com/paper/injecting-and-removing-malignant-features-in |
Repo | https://github.com/BreastGAN/experiment1 |
Framework | tf |
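
The mechanism underlying the attack is ordinary CycleGAN training, whose core constraint is cycle consistency: translating an image to the other domain and back should reproduce it. A toy PyTorch illustration of that loss with stand-in generators (real generators would be full image-to-image networks):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for the two generators between domains H ("healthy") and
# C ("cancer"); real generators would be full image-to-image networks.
G_HC, G_CH = nn.Conv2d(1, 1, 3, padding=1), nn.Conv2d(1, 1, 3, padding=1)

x_h = torch.randn(4, 1, 256, 256)   # a batch of "healthy" images
fake_c = G_HC(x_h)                  # inject malignant features
rec_h = G_CH(fake_c)                # translate back to the healthy domain

# Cycle consistency: there-and-back should be (near) the identity.
cycle_loss = F.l1_loss(rec_h, x_h)
```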
Improving the Generalization of Adversarial Training with Domain Adaptation
Title | Improving the Generalization of Adversarial Training with Domain Adaptation |
Authors | Chuanbiao Song, Kun He, Liwei Wang, John E. Hopcroft |
Abstract | By injecting adversarial examples into training data, adversarial training is promising for improving the robustness of deep learning models. However, most existing adversarial training approaches are based on a specific type of adversarial attack. It may not provide sufficiently representative samples from the adversarial domain, leading to a weak generalization ability on adversarial examples from other attacks. Moreover, during the adversarial training, adversarial perturbations on inputs are usually crafted by fast single-step adversaries so as to scale to large datasets. This work focuses on adversarial training with the efficient single-step FGSM adversary. In this scenario, it is difficult to train a model that generalizes well due to the lack of representative adversarial samples, i.e., the samples are unable to accurately reflect the adversarial domain. To alleviate this problem, we propose a novel Adversarial Training with Domain Adaptation (ATDA) method. Our intuition is to regard the adversarial training on the FGSM adversary as a domain adaptation task with a limited number of target-domain samples. The main idea is to learn a representation that is semantically meaningful and domain-invariant on the clean domain as well as the adversarial domain. Empirical evaluations on Fashion-MNIST, SVHN, CIFAR-10 and CIFAR-100 demonstrate that ATDA can greatly improve the generalization of adversarial training and the smoothness of the learned models, and outperforms state-of-the-art methods on standard benchmark datasets. To show the transferability of our method, we also extend ATDA to adversarial training on iterative attacks such as PGD-Adversarial Training (PAT), and the defense performance is improved considerably. |
Tasks | Adversarial Attack, Domain Adaptation |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00740v7 |
PDF | http://arxiv.org/pdf/1810.00740v7.pdf |
PWC | https://paperswithcode.com/paper/improving-the-generalization-of-adversarial |
Repo | https://github.com/cxmscb/ATDA |
Framework | tf |
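
Two ingredients of the recipe are easy to sketch: the single-step FGSM adversary that generates training-time perturbations, and a domain-alignment term pulling clean and adversarial feature statistics together. The alignment below matches first and second moments only, a simplification of the paper's losses; the model, data, and epsilon are toy choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """Single-step FGSM adversary, as used during adversarial training."""
    x = x.clone().requires_grad_(True)
    grad, = torch.autograd.grad(F.cross_entropy(model(x), y), x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def moment_alignment(feat_clean, feat_adv):
    """Toy stand-in for the domain-adaptation losses: pull the first two
    moments of clean and adversarial features together."""
    return ((feat_clean.mean(0) - feat_adv.mean(0)) ** 2).sum() + \
           ((feat_clean.var(0) - feat_adv.var(0)) ** 2).sum()

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy model
x, y = torch.rand(16, 1, 28, 28), torch.randint(0, 10, (16,))
x_adv = fgsm(model, x, y)
loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y) \
       + moment_alignment(model(x), model(x_adv))
```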
Inverse Cooking: Recipe Generation from Food Images
Title | Inverse Cooking: Recipe Generation from Food Images |
Authors | Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero |
Abstract | People enjoy food photography because they appreciate food. Behind each meal there is a story described in a complex recipe and, unfortunately, by simply looking at a food image we do not have access to its preparation process. Therefore, in this paper we introduce an inverse cooking system that recreates cooking recipes given food images. Our system predicts ingredients as sets by means of a novel architecture, modeling their dependencies without imposing any order, and then generates cooking instructions by attending to both the image and its inferred ingredients simultaneously. We extensively evaluate the whole system on the large-scale Recipe1M dataset and show that (1) we improve performance w.r.t. previous baselines for ingredient prediction; (2) we are able to obtain high-quality recipes by leveraging both image and ingredients; (3) our system is able to produce more compelling recipes than retrieval-based approaches according to human judgment. We make code and models publicly available. |
Tasks | Recipe Generation |
Published | 2018-12-14 |
URL | https://arxiv.org/abs/1812.06164v2 |
PDF | https://arxiv.org/pdf/1812.06164v2.pdf |
PWC | https://paperswithcode.com/paper/inverse-cooking-recipe-generation-from-food |
Repo | https://github.com/krutikabapat/Inverse_Cooking_recipe_Generation_from_food_images |
Framework | tf |
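
The "ingredients as sets" idea can be illustrated by scoring every ingredient independently and keeping those above a threshold, so no ordering is imposed. A minimal PyTorch sketch; the vocabulary size, feature dimension, and threshold are assumptions, and the paper uses a dedicated set decoder rather than a single linear scorer:

```python
import torch
import torch.nn as nn

n_ingredients, feat_dim = 1488, 512          # assumed sizes
scorer = nn.Linear(feat_dim, n_ingredients)  # stand-in for the set decoder

img_feat = torch.randn(1, feat_dim)          # from some image encoder
probs = torch.sigmoid(scorer(img_feat))
predicted_set = (probs > 0.5).nonzero()[:, 1].tolist()  # unordered ids
```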
The Trajectron: Probabilistic Multi-Agent Trajectory Modeling With Dynamic Spatiotemporal Graphs
Title | The Trajectron: Probabilistic Multi-Agent Trajectory Modeling With Dynamic Spatiotemporal Graphs |
Authors | Boris Ivanovic, Marco Pavone |
Abstract | Developing safe human-robot interaction systems is a necessary step towards the widespread integration of autonomous agents in society. A key component of such systems is the ability to reason about the many potential futures (e.g. trajectories) of other agents in the scene. Towards this end, we present the Trajectron, a graph-structured model that predicts many potential future trajectories of multiple agents simultaneously in both highly dynamic and multimodal scenarios (i.e. where the number of agents in the scene is time-varying and there are many possible highly-distinct futures for each agent). It combines tools from recurrent sequence modeling and variational deep generative modeling to produce a distribution of future trajectories for each agent in a scene. We demonstrate the performance of our model on several datasets, obtaining state-of-the-art results on standard trajectory prediction metrics as well as introducing a new metric for comparing models that output distributions. |
Tasks | Decision Making, Trajectory Prediction |
Published | 2018-10-14 |
URL | https://arxiv.org/abs/1810.05993v3 |
PDF | https://arxiv.org/pdf/1810.05993v3.pdf |
PWC | https://paperswithcode.com/paper/modeling-multimodal-dynamic-spatiotemporal |
Repo | https://github.com/StanfordASL/Trajectron |
Framework | pytorch |
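
The "output a distribution, not a point estimate" idea can be shown with a head that decodes an RNN state into a bivariate Gaussian per future step and samples many candidate trajectories. This sketch omits the CVAE and the graph structure that the Trajectron adds; all sizes are arbitrary:

```python
import torch
import torch.nn as nn

hidden, horizon = 64, 12
decode = nn.Linear(hidden, horizon * 4)   # mu_x, mu_y, log sx, log sy per step

h = torch.randn(1, hidden)                # e.g. the final RNN state
params = decode(h).view(horizon, 4)
mu, log_sigma = params[:, :2], params[:, 2:]
samples = mu + log_sigma.exp() * torch.randn(20, horizon, 2)  # 20 futures
```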
Evaluating and Understanding the Robustness of Adversarial Logit Pairing
Title | Evaluating and Understanding the Robustness of Adversarial Logit Pairing |
Authors | Logan Engstrom, Andrew Ilyas, Anish Athalye |
Abstract | We evaluate the robustness of Adversarial Logit Pairing, a recently proposed defense against adversarial examples. We find that a network trained with Adversarial Logit Pairing achieves 0.6% accuracy in the threat model in which the defense is considered. We provide a brief overview of the defense and the threat models/claims considered, as well as a discussion of the methodology and results of our attack, which may offer insights into the reasons underlying the vulnerability of ALP to adversarial attack. |
Tasks | Adversarial Attack |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10272v2 |
PDF | http://arxiv.org/pdf/1807.10272v2.pdf |
PWC | https://paperswithcode.com/paper/evaluating-and-understanding-the-robustness |
Repo | https://github.com/labsix/adversarial-logit-pairing-analysis |
Framework | tf |
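
Evaluations like this one typically rely on a multi-step PGD adversary rather than the single-step attacks a defense may have been tuned against. A generic PyTorch PGD sketch; the hyperparameters and toy model are illustrative, not the exact attack configuration from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=16 / 255, alpha=2 / 255, steps=40):
    """Multi-step L-infinity PGD with a random start."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad, = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        # project back into the epsilon-ball and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # toy model
x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
x_adv = pgd_attack(model, x, y)
```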