Paper Group ANR 87
Adaptive Period Embedding for Representing Oriented Objects in Aerial Images
Title | Adaptive Period Embedding for Representing Oriented Objects in Aerial Images |
Authors | Yixing Zhu, Xueqing Wu, Jun Du |
Abstract | We propose a novel method for representing oriented objects in aerial images named Adaptive Period Embedding (APE). While traditional object detection methods represent objects with horizontal bounding boxes, the objects in aerial images are oriented. Calculating the angle of an object remains a challenging task. While almost all previous object detectors for aerial images directly regress the angle of objects, they use complex rules to calculate the angle, and their performance is limited by the rule design. In contrast, our method is based on the angular periodicity of oriented objects. The angle is represented by two two-dimensional periodic vectors whose periods are different; the vectors are continuous as the shape changes. The label generation rule is simpler and more reasonable compared with previous methods. The proposed method is general and can be applied to other oriented detectors. Besides, we propose a novel IoU calculation method for long objects named length independent IoU (LIIoU). We intercept part of the long side of the target box to get the maximum IoU between the proposed box and the intercepted target box. Thereby, some long boxes will have corresponding positive samples. Our method took 1st place in the DOAI2019 competition Task 1 (oriented object), held at the workshop on Detecting Objects in Aerial Images in conjunction with IEEE CVPR 2019. |
Tasks | Object Detection |
Published | 2019-06-22 |
URL | https://arxiv.org/abs/1906.09447v1 |
https://arxiv.org/pdf/1906.09447v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-period-embedding-for-representing |
Repo | |
Framework | |
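The abstract above describes the angle as two two-dimensional periodic vectors with different periods. A minimal sketch of that idea follows. The periods of 180° and 90° are an assumption (the abstract does not state them), and the function names `period_embedding`, `embed_angle`, and `decode_angle` are hypothetical, not the authors' API.

```python
import math

def period_embedding(theta, period):
    """Map an angle (radians) to a 2-D unit vector that repeats every `period` radians."""
    phase = 2.0 * math.pi * theta / period
    return (math.cos(phase), math.sin(phase))

def embed_angle(theta):
    # Assumed periods: pi (180 degrees) and pi/2 (90 degrees); not taken from the paper.
    return period_embedding(theta, math.pi), period_embedding(theta, math.pi / 2)

def decode_angle(vec, period):
    """Recover the angle modulo `period` from its 2-D embedding."""
    phase = math.atan2(vec[1], vec[0]) % (2.0 * math.pi)
    return phase * period / (2.0 * math.pi)
```

Because each vector moves smoothly around the unit circle, the regression target has no jump where the angle wraps around, which is the continuity property the abstract refers to.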
Universal Adversarial Perturbations to Understand Robustness of Texture vs. Shape-biased Training
Title | Universal Adversarial Perturbations to Understand Robustness of Texture vs. Shape-biased Training |
Authors | Kenneth T. Co, Luis Muñoz-González, Leslie Kanthan, Ben Glocker, Emil C. Lupu |
Abstract | Convolutional Neural Networks (CNNs) used on image classification tasks such as ImageNet have been shown to be biased towards recognizing textures rather than shapes. Recent work has attempted to alleviate this by augmenting the training dataset with shape-based examples to create Stylized-ImageNet. However, in this paper we show that models trained on this modified dataset remain as vulnerable to Universal Adversarial Perturbations (UAPs) as those trained on ImageNet. We use UAPs to evaluate, compare, and understand the robustness of CNN models with varying degrees of shape-based training. We also find that a posteriori fine-tuning on ImageNet negates features learned from training on Stylized-ImageNet. This study reveals an important current limitation and highlights the need for further research into robustness of CNNs for visual recognition. |
Tasks | Image Classification |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10364v2 |
https://arxiv.org/pdf/1911.10364v2.pdf | |
PWC | https://paperswithcode.com/paper/universal-adversarial-perturbations-to |
Repo | |
Framework | |
Automatic Intracranial Brain Segmentation from Computed Tomography Head Images
Title | Automatic Intracranial Brain Segmentation from Computed Tomography Head Images |
Authors | Bhavya Ajani |
Abstract | A fast, automatic algorithm to segment the brain (intracranial region) from computed tomography (CT) head images using a combination of HU thresholding, identification of intracranial voxels through ray intersection with the cranium, special binary erosion, and connected components per slice. |
Tasks | Brain Segmentation, Computed Tomography (CT) |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09726v1 |
https://arxiv.org/pdf/1906.09726v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-intracranial-brain-segmentation |
Repo | |
Framework | |
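Two of the steps named in the abstract above, HU thresholding and connected components per slice, can be sketched on a single 2-D slice. This is an illustration only: the HU window of 0 to 100 is an assumed soft-tissue range rather than the paper's value, the helper names are hypothetical, and the ray-intersection and erosion steps are omitted.

```python
from collections import deque

def threshold_hu(slice_hu, lo=0, hi=100):
    """Binary mask of pixels whose HU value lies in [lo, hi] (assumed window)."""
    return [[lo <= v <= hi for v in row] for row in slice_hu]

def largest_component(mask):
    """Return the pixel set of the largest 4-connected component in a 2-D mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    return set(best)
```

Keeping only the largest component discards isolated soft-tissue pixels outside the skull, which thresholding alone cannot separate from brain tissue.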
A study on the Interpretability of Neural Retrieval Models using DeepSHAP
Title | A study on the Interpretability of Neural Retrieval Models using DeepSHAP |
Authors | Zeon Trevor Fernando, Jaspreet Singh, Avishek Anand |
Abstract | A recent trend in IR has been the usage of neural networks to learn retrieval models for text-based ad hoc search. While various approaches and architectures have yielded significantly better performance than traditional retrieval models such as BM25, it is still difficult to understand exactly why a document is relevant to a query. In the ML community several approaches for explaining decisions made by deep neural networks have been proposed, including DeepSHAP, which modifies the DeepLift algorithm to estimate the relative importance (Shapley values) of input features for a given decision by comparing the activations in the network for a given image against the activations caused by a reference input. In image classification, the reference input tends to be a plain black image. While DeepSHAP has been well studied for image classification tasks, it remains to be seen how we can adapt it to explain the output of Neural Retrieval Models (NRMs). In particular, what is a good “black” image in the context of IR? In this paper we explored various reference input document construction techniques. Additionally, we compared the explanations generated by DeepSHAP to LIME (a model agnostic approach) and found that the explanations differ considerably. Our study raises concerns regarding the robustness and accuracy of explanations produced for NRMs. With this paper we aim to shed light on interesting problems surrounding interpretability in NRMs and highlight areas of future work. |
Tasks | Image Classification |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06484v1 |
https://arxiv.org/pdf/1907.06484v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-on-the-interpretability-of-neural |
Repo | |
Framework | |
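The role of the reference input discussed in the abstract above can be illustrated with exact Shapley values on a toy scorer. This sketch is not DeepSHAP itself, which approximates Shapley values through DeepLift; it enumerates all orderings, which is only feasible for very short inputs. The scorer `toy_score` and the `<pad>` reference document are hypothetical stand-ins.

```python
from itertools import permutations

def toy_score(tokens, query=("neural", "retrieval")):
    """Hypothetical relevance scorer: counts document tokens that appear in the query."""
    return float(sum(tok in query for tok in tokens))

def shapley_values(score, doc, reference):
    """Exact Shapley value of each token position, relative to a reference document."""
    n = len(doc)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(reference)
        prev = score(current)
        for i in order:
            # Switch position i from the reference token to the real token
            # and credit the resulting change in score to that position.
            current[i] = doc[i]
            new = score(current)
            totals[i] += new - prev
            prev = new
    return [t / len(orderings) for t in totals]
```

For example, `shapley_values(toy_score, ["neural", "models", "retrieval"], ["<pad>"] * 3)` attributes the score to the two query-matching tokens. Changing the reference document changes the attributions, which is why the choice of reference matters.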
Active Learning for Risk-Sensitive Inverse Reinforcement Learning
Title | Active Learning for Risk-Sensitive Inverse Reinforcement Learning |
Authors | Rui Chen, Wenshuo Wang, Zirui Zhao, Ding Zhao |
Abstract | One typical assumption in inverse reinforcement learning (IRL) is that human experts act to optimize the expected utility of a stochastic cost with a fixed distribution. This assumption deviates from actual human behaviors under ambiguity. Risk-sensitive inverse reinforcement learning (RS-IRL) bridges this gap by assuming that humans act according to a random cost with respect to a set of subjectively distorted distributions instead of a fixed one. This assumption provides the additional flexibility to model humans’ risk preferences, represented by a risk envelope, in safety-critical tasks. However, like other learning from demonstration techniques, RS-IRL could also suffer from inefficient learning due to redundant demonstrations. Inspired by the concept of active learning, this research derives a probabilistic disturbance sampling scheme to enable an RS-IRL agent to query expert support that is likely to expose unrevealed boundaries of the expert’s risk envelope. Experimental results confirm that our approach accelerates the convergence of RS-IRL algorithms with lower variance while still guaranteeing unbiased convergence. |
Tasks | Active Learning |
Published | 2019-09-14 |
URL | https://arxiv.org/abs/1909.07843v2 |
https://arxiv.org/pdf/1909.07843v2.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-for-risk-sensitive-inverse |
Repo | |
Framework | |
A Review of Keyphrase Extraction
Title | A Review of Keyphrase Extraction |
Authors | Eirini Papagiannopoulou, Grigorios Tsoumakas |
Abstract | Keyphrase extraction is a textual information processing task concerned with the automatic extraction of representative and characteristic phrases from a document that express all the key aspects of its content. Keyphrases constitute a succinct conceptual summary of a document, which is very useful in digital information management systems for semantic indexing, faceted search, document clustering and classification. This article introduces keyphrase extraction, provides a well-structured review of the existing work, offers interesting insights on the different evaluation approaches, highlights open issues and presents a comparative experimental study of popular unsupervised techniques on five datasets. |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05044v2 |
https://arxiv.org/pdf/1905.05044v2.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-keyphrase-extraction |
Repo | |
Framework | |
Cross-domain Network Representations
Title | Cross-domain Network Representations |
Authors | Shan Xue, Jie Lu, Guangquan Zhang |
Abstract | The purpose of network representation is to learn a set of latent features by obtaining community information from network structures to provide knowledge for machine learning tasks. Recent research has driven significant progress in network representation by employing random walks as the network sampling strategy. Nevertheless, existing approaches rely on domain-specifically rich community structures and fail on networks that lack topological information in their own domain. In this paper, we propose a novel algorithm for cross-domain network representation, named CDNR. By generating the random walks from a structurally rich domain and transferring the knowledge on the random walks across domains, it enables a network representation for the structurally scarce domain as well. To be specific, CDNR is realized by a cross-domain two-layer node-scale balance algorithm and a cross-domain two-layer knowledge transfer algorithm in the framework of cross-domain two-layer random walk learning. Experiments on various real-world datasets demonstrate the effectiveness of CDNR for universal networks in an unsupervised way. |
Tasks | Transfer Learning |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00205v1 |
https://arxiv.org/pdf/1908.00205v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-network-representations |
Repo | |
Framework | |
Learning Improvement Heuristics for Solving the Travelling Salesman Problem
Title | Learning Improvement Heuristics for Solving the Travelling Salesman Problem |
Authors | Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang, Andrew Lim |
Abstract | Recent studies in using deep learning to solve the Travelling Salesman Problem (TSP) focus on construction heuristics, the solution of which may still be far from optimality. To improve solution quality, additional procedures such as sampling or beam search are required. However, they are still based on the same construction policy, which is less effective in refining a solution. In this paper, we propose to directly learn the improvement heuristics for solving TSP based on deep reinforcement learning. We first present a reinforcement learning formulation for the improvement heuristic, where the policy guides selection of the next solution. Then, we propose a deep architecture as the policy network based on self-attention. Extensive experiments show that improvement policies learned by our approach yield better results than state-of-the-art methods, even from random initial solutions. Moreover, the learned policies are more effective than the traditional hand-crafted ones, and robust to different initial solutions with either high or poor quality. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.05784v1 |
https://arxiv.org/pdf/1912.05784v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-improvement-heuristics-for-solving |
Repo | |
Framework | |
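For contrast with the learned policies described in the abstract above, here is a sketch of a traditional hand-crafted improvement heuristic: 2-opt with greedy best-move selection. Using 2-opt as the move operator is an assumption for illustration, since the abstract does not name one. In the paper's setting a learned policy, not the greedy `min` below, would choose the next solution.

```python
import math

def tour_length(tour, coords):
    """Length of the closed tour visiting `coords` in the order given by `tour`."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def two_opt_move(tour, i, j):
    """Reverse the segment tour[i..j], which replaces two edges of the tour."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def improve(tour, coords, max_steps=100):
    """Repeatedly apply the best 2-opt move until no move shortens the tour."""
    best = list(tour)
    best_len = tour_length(best, coords)
    for _ in range(max_steps):
        candidates = [
            two_opt_move(best, i, j)
            for i in range(1, len(best) - 1)
            for j in range(i + 1, len(best))
        ]
        if not candidates:
            break
        cand = min(candidates, key=lambda t: tour_length(t, coords))
        cand_len = tour_length(cand, coords)
        if cand_len >= best_len - 1e-12:
            break  # local optimum: no 2-opt move improves the tour
        best, best_len = cand, cand_len
    return best, best_len
```

Starting from a self-crossing tour on the corners of a unit square, `improve` removes the crossing and returns a tour of length 4.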
A mathematical theory of cooperative communication
Title | A mathematical theory of cooperative communication |
Authors | Pei Wang, Junqi Wang, Pushpi Paranamana, Patrick Shafto |
Abstract | Cooperative communication plays a central role in theories of human cognition, language, development, and culture, and is increasingly relevant in human-algorithm and robot interaction. Existing models are algorithmic in nature and do not shed light on the statistical problem solved in cooperation or on constraints imposed by violations of common ground. We present a mathematical theory of cooperative communication that unifies three broad classes of algorithmic models as approximations of Optimal Transport (OT). We derive a statistical interpretation for the problem approximated by existing models in terms of entropy minimization, or likelihood maximizing, plans. We show that some models are provably robust to violations of common ground, even supporting online, approximate recovery from discovered violations, and derive conditions under which other models are provably not robust. We do so using gradient-based methods which introduce novel algorithmic-level perspectives on cooperative communication. Our mathematical approach complements and extends empirical research, providing strong theoretical tools for the derivation of a priori constraints on models and implications for cooperative communication in theory and practice. |
Tasks | |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02822v1 |
https://arxiv.org/pdf/1910.02822v1.pdf | |
PWC | https://paperswithcode.com/paper/a-mathematical-theory-of-cooperative |
Repo | |
Framework | |
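The abstract above frames cooperative communication models as approximations of Optimal Transport. As background, entropy-regularized OT is commonly solved with Sinkhorn scaling, sketched below. Whether the paper uses exactly this procedure is an assumption here, and the cost matrix, marginals, and `reg` value in the usage note are illustrative only.

```python
import math

def sinkhorn(cost, row_marginal, col_marginal, reg=0.1, iters=200):
    """Approximate an entropy-regularized transport plan by alternating row/column scaling."""
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low-cost pairs receive high weight.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows, then columns, to match the target marginals.
        u = [row_marginal[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_marginal[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
```

Calling `sinkhorn([[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5], [0.5, 0.5])` returns a plan whose rows and columns each sum to 0.5 and which places most of its mass on the zero-cost diagonal.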
Situated GAIL: Multitask imitation using task-conditioned adversarial inverse reinforcement learning
Title | Situated GAIL: Multitask imitation using task-conditioned adversarial inverse reinforcement learning |
Authors | Kyoichiro Kobayashi, Takato Horii, Ryo Iwaki, Yukie Nagai, Minoru Asada |
Abstract | Generative adversarial imitation learning (GAIL) has attracted increasing attention in the field of robot learning. It enables robots to learn a policy to achieve a task demonstrated by an expert while simultaneously estimating the reward function behind the expert’s behaviors. However, this framework is limited to learning a single task with a single reward function. This study proposes an extended framework called situated GAIL (S-GAIL), in which a task variable is introduced to both the discriminator and generator of the GAIL framework. The task variable has the roles of discriminating different contexts and making the framework learn different reward functions and policies for multiple tasks. To achieve the early convergence of learning and robustness during reward estimation, we introduce a term to adjust the entropy regularization coefficient in the generator’s objective function. Our experiments using two setups (navigation in a discrete grid world and arm reaching in a continuous space) demonstrate that the proposed framework can acquire multiple reward functions and policies more effectively than existing frameworks. The task variable enables our framework to differentiate contexts while sharing common knowledge among multiple tasks. |
Tasks | Imitation Learning |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00238v1 |
https://arxiv.org/pdf/1911.00238v1.pdf | |
PWC | https://paperswithcode.com/paper/situated-gail-multitask-imitation-using-task |
Repo | |
Framework | |
Bayesian Heatmaps: Probabilistic Classification with Multiple Unreliable Information Sources
Title | Bayesian Heatmaps: Probabilistic Classification with Multiple Unreliable Information Sources |
Authors | Edwin Simpson, Steven Reece, Stephen J. Roberts |
Abstract | Unstructured data from diverse sources, such as social media and aerial imagery, can provide valuable up-to-date information for intelligent situation assessment. Mining these different information sources could bring major benefits to applications such as situation awareness in disaster zones and mapping the spread of diseases. Such applications depend on classifying the situation across a region of interest, which can be depicted as a spatial “heatmap”. Annotating unstructured data using crowdsourcing or automated classifiers produces individual classifications at sparse locations that typically contain many errors. We propose a novel Bayesian approach that models the relevance, error rates and bias of each information source, enabling us to learn a spatial Gaussian Process classifier by aggregating data from multiple sources with varying reliability and relevance. Our method does not require gold-labelled data and can make predictions at any location in an area of interest given only sparse observations. We show empirically that our approach can handle noisy and biased data sources, and that simultaneously inferring reliability and transferring information between neighbouring reports leads to more accurate predictions. We demonstrate our method on two real-world problems from disaster response, showing how our approach reduces the amount of crowdsourced data required and can be used to generate valuable heatmap visualisations from SMS messages and satellite images. |
Tasks | |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.03063v1 |
http://arxiv.org/pdf/1904.03063v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-heatmaps-probabilistic |
Repo | |
Framework | |
Improving Machine Hearing on Limited Data Sets
Title | Improving Machine Hearing on Limited Data Sets |
Authors | Pavol Harar, Roswitha Bammer, Anna Breger, Monika Dörfler, Zdenek Smekal |
Abstract | Convolutional neural network (CNN) architectures have originated and revolutionized machine learning for images. In order to take advantage of CNNs in predictive modeling with audio data, standard FFT-based signal processing methods are often applied to convert the raw audio waveforms into image-like representations (e.g. spectrograms). Even though conventional images and spectrograms differ in their feature properties, this kind of pre-processing reduces the amount of training data necessary for successful training. In this contribution we investigate how input and target representations interplay with the amount of available training data in a music information retrieval setting. We compare the standard mel-spectrogram inputs with a newly proposed representation, called Mel scattering. Furthermore, we investigate the impact of additional target data representations by using an augmented target loss function which incorporates unused available information. We observe that all proposed methods outperform the standard mel-transform representation when using a limited data set and discuss their strengths and limitations. The source code for reproducibility of our experiments as well as intermediate results and model checkpoints are available in an online repository. |
Tasks | Information Retrieval, Music Information Retrieval |
Published | 2019-03-21 |
URL | https://arxiv.org/abs/1903.08950v3 |
https://arxiv.org/pdf/1903.08950v3.pdf | |
PWC | https://paperswithcode.com/paper/machines-listening-to-music-the-role-of |
Repo | |
Framework | |
Deep Saliency Models : The Quest For The Loss Function
Title | Deep Saliency Models : The Quest For The Loss Function |
Authors | Alexandre Bruckert, Hamed R. Tavakoli, Zhi Liu, Marc Christie, Olivier Le Meur |
Abstract | Recent advances in deep learning have pushed the performance of visual saliency models far beyond previous levels. Numerous models in the literature present new ways to design neural networks, to arrange gaze pattern data, or to extract as many high- and low-level image features as possible in order to create the best saliency representation. However, one key part of a typical deep learning model is often neglected: the choice of the loss function. In this work, we explore some of the most popular loss functions that are used in deep saliency models. We demonstrate that on a fixed network architecture, modifying the loss function can significantly improve (or degrade) the results, hence emphasizing the importance of the choice of the loss function when designing a model. We also introduce new loss functions that, to our knowledge, have never been used for saliency prediction. Finally, we show that a linear combination of several well-chosen loss functions leads to significant improvements in performance on different datasets as well as on a different network architecture, hence demonstrating the robustness of a combined metric. |
Tasks | Saliency Prediction |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02336v1 |
https://arxiv.org/pdf/1907.02336v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-saliency-models-the-quest-for-the-loss |
Repo | |
Framework | |
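The linear combination of loss functions mentioned in the abstract above can be sketched with two measures often used for saliency maps: Kullback-Leibler divergence and Pearson correlation. The choice of these two terms and the weights `w_kl` and `w_cc` are assumptions for illustration. The abstract does not list the losses or coefficients the authors combine.

```python
import math

def normalize(values, eps=1e-12):
    """Scale a flattened non-negative saliency map so that it sums to one."""
    total = sum(values) + eps
    return [v / total for v in values]

def kl_divergence(pred, target, eps=1e-12):
    """KL divergence from the predicted map to the target map, both as distributions."""
    p, t = normalize(pred), normalize(target)
    return sum(ti * math.log((ti + eps) / (pi + eps)) for pi, ti in zip(p, t))

def correlation(pred, target):
    """Pearson correlation coefficient between two flattened maps."""
    n = len(pred)
    mp, mt = sum(pred) / n, sum(target) / n
    cov = sum((a - mp) * (b - mt) for a, b in zip(pred, target))
    sp = math.sqrt(sum((a - mp) ** 2 for a in pred))
    st = math.sqrt(sum((b - mt) ** 2 for b in target))
    return cov / (sp * st) if sp > 0 and st > 0 else 0.0

def combined_loss(pred, target, w_kl=1.0, w_cc=1.0):
    """Weighted sum of KL divergence and (1 - correlation); weights are hypothetical."""
    return w_kl * kl_divergence(pred, target) + w_cc * (1.0 - correlation(pred, target))
```

The correlation term is written as `1 - correlation` so that both terms decrease as the prediction approaches the target and the sum can be minimized directly.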
Coupling Rendering and Generative Adversarial Networks for Artificial SAS Image Generation
Title | Coupling Rendering and Generative Adversarial Networks for Artificial SAS Image Generation |
Authors | Albert Reed, Isaac Gerg, John McKay, Daniel Brown, David Williams, Suren Jayasuriya |
Abstract | Acquisition of Synthetic Aperture Sonar (SAS) datasets is bottlenecked by the costly deployment of SAS imaging systems, and even when data acquisition is possible, the data is often skewed towards containing barren seafloor rather than objects of interest. We present a novel pipeline, called SAS GAN, which couples an optical renderer with a generative adversarial network (GAN) to synthesize realistic SAS images of targets on the seafloor. This coupling enables high levels of SAS image realism while enabling control over image geometry and parameters. We demonstrate qualitative results by presenting examples of images created with our pipeline. We also present quantitative results through the use of t-SNE and the Fréchet Inception Distance to argue that our generated SAS imagery potentially augments SAS datasets more effectively than an off-the-shelf GAN. |
Tasks | Image Generation |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06436v2 |
https://arxiv.org/pdf/1909.06436v2.pdf | |
PWC | https://paperswithcode.com/paper/coupling-rendering-and-generative-adversarial |
Repo | |
Framework | |
Bayesian Active Learning for Structured Output Design
Title | Bayesian Active Learning for Structured Output Design |
Authors | Kota Matsui, Shunya Kusakawa, Keisuke Ando, Kentaro Kutsukake, Toru Ujihara, Ichiro Takeuchi |
Abstract | In this paper, we propose an active learning method for an inverse problem that aims to find an input that achieves a desired structured-output. The proposed method provides new acquisition functions for minimizing the error between the desired structured-output and the prediction of a Gaussian process model, by effectively incorporating the correlation between multiple outputs of the underlying multi-valued black box output functions. The effectiveness of the proposed method is verified by applying it to two synthetic shape search problems and real data. In the real data experiment, we tackle the input parameter search which achieves the desired crystal growth rate in silicon carbide (SiC) crystal growth modeling, which is a materials informatics problem. |
Tasks | Active Learning |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.03671v1 |
https://arxiv.org/pdf/1911.03671v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-active-learning-for-structured |
Repo | |
Framework | |