Paper Group ANR 354
Streaming Active Deep Forest for Evolving Data Stream Classification
Title | Streaming Active Deep Forest for Evolving Data Stream Classification |
Authors | Anh Vu Luong, Tien Thanh Nguyen, Alan Wee-Chung Liew |
Abstract | In recent years, Deep Neural Networks (DNNs) have gained progressive momentum in many areas of machine learning. The layer-by-layer process of DNNs has inspired the development of many deep models, including deep ensembles. The most notable deep ensemble-based model is Deep Forest, which can achieve highly competitive performance while having far fewer hyper-parameters compared to DNNs. Despite its success in the batch learning setting, no effort has been made to adapt Deep Forest to the context of evolving data streams. In this work, we introduce the Streaming Deep Forest (SDF) algorithm, a high-performance deep ensemble method specially adapted to stream classification. We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context. We compare the proposed methods to state-of-the-art streaming algorithms on a wide range of datasets. The results show that by following the AVU active learning strategy, SDF with only 70% of the labeling budget significantly outperforms other methods trained with all instances. |
Tasks | Active Learning |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11816v1 |
PDF | https://arxiv.org/pdf/2002.11816v1.pdf |
PWC | https://paperswithcode.com/paper/streaming-active-deep-forest-for-evolving |
Repo | |
Framework | |
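The budget-constrained querying idea behind AVU can be pictured with a generic variable-uncertainty loop. The sketch below is not the authors' AVU or SDF code: the adaptive threshold, budget accounting, and base learner are illustrative assumptions.

```python
# Hypothetical sketch of a budget-constrained uncertainty strategy for stream
# classification. NOT the authors' AVU/SDF implementation; the adaptive
# threshold, budget accounting, and base learner are illustrative choices.
import numpy as np
from sklearn.linear_model import SGDClassifier

def stream_active_learning(stream, classes, budget=0.7, theta=0.9, step=0.01):
    """stream yields (x, y) pairs; y is only used when we decide to pay for it."""
    model = SGDClassifier(loss="log_loss")
    labeled, seen = 0, 0
    rng = np.random.default_rng(0)
    for x, y in stream:
        seen += 1
        x = x.reshape(1, -1)
        if labeled == 0:
            model.partial_fit(x, [y], classes=classes)  # bootstrap with first label
            labeled += 1
            continue
        max_posterior = model.predict_proba(x).max()
        spent = labeled / seen
        # Query if uncertain (low max posterior) and the budget is not exhausted.
        if spent < budget and max_posterior < theta * (1 + rng.normal(0, 0.05)):
            model.partial_fit(x, [y])          # pay for the label
            labeled += 1
            theta *= (1 - step)                # got a label: demand more uncertainty
        else:
            theta *= (1 + step)                # no label: loosen the threshold
    return model
```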
Towards Robust and Reproducible Active Learning Using Neural Networks
Title | Towards Robust and Reproducible Active Learning Using Neural Networks |
Authors | Prateek Munjal, Nasir Hayat, Munawar Hayat, Jamshid Sourati, Shadab Khan |
Abstract | Active learning (AL) is a promising ML paradigm that has the potential to parse through large unlabeled data and help reduce annotation cost in domains where labeling the entire data can be prohibitive. Recently proposed neural network based AL methods use different heuristics to accomplish this goal. In this study, we show that recent AL methods offer a gain over the random-sampling baseline only under a brittle combination of experimental conditions. We demonstrate that such marginal gains vanish when experimental factors are changed, leading to reproducibility issues and suggesting that AL methods lack robustness. We also observe that with a properly tuned model, which employs recently proposed regularization techniques, the performance significantly improves for all AL methods including the random sampling baseline, and performance differences among the AL methods become negligible. Based on these observations, we suggest a set of experiments that are critical to assess the true effectiveness of an AL method. To facilitate these experiments, we also present an open-source toolkit. We believe our findings and recommendations will help advance reproducible research in robust AL using neural networks. |
Tasks | Active Learning |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09564v1 |
PDF | https://arxiv.org/pdf/2002.09564v1.pdf |
PWC | https://paperswithcode.com/paper/towards-robust-and-reproducible-active |
Repo | |
Framework | |
Adaptive Region-Based Active Learning
Title | Adaptive Region-Based Active Learning |
Authors | Corinna Cortes, Giulia DeSalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang |
Abstract | We present a new active learning algorithm that adaptively partitions the input space into a finite number of regions, and subsequently seeks a distinct predictor for each region, both phases actively requesting labels. We prove theoretical guarantees for both the generalization error and the label complexity of our algorithm, and analyze the number of regions defined by the algorithm under some mild assumptions. We also report the results of an extensive suite of experiments on several real-world datasets demonstrating substantial empirical benefits over existing single-region and non-adaptive region-based active learning baselines. |
Tasks | Active Learning |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07348v1 |
PDF | https://arxiv.org/pdf/2002.07348v1.pdf |
PWC | https://paperswithcode.com/paper/adaptive-region-based-active-learning |
Repo | |
Framework | |
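A crude way to picture the region-based idea: partition the input space, fit one predictor per region, and spend label queries where the local predictor is least confident. The sketch below is purely illustrative; the paper's adaptive partitioning, per-region predictors, and querying rule differ and come with formal guarantees.

```python
# Illustrative sketch only: a shallow tree partitions the space, a separate
# logistic model is fit per region, and new labels are requested in the region
# whose local model is least confident. A simplification, not the algorithm
# analyzed in the paper. `oracle(i)` (a hypothetical label source) is assumed.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

def region_based_al(X_seed, y_seed, X_pool, oracle, n_queries=50, max_leaves=8):
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaves).fit(X_seed, y_seed)
    X_lab, y_lab = list(X_seed), list(y_seed)
    pool = list(range(len(X_pool)))
    for _ in range(n_queries):
        regions = tree.apply(np.array(X_lab))
        models = {}
        for r in np.unique(regions):
            mask = regions == r
            if len(np.unique(np.array(y_lab)[mask])) > 1:
                models[r] = LogisticRegression().fit(np.array(X_lab)[mask],
                                                     np.array(y_lab)[mask])
        # Pick the pool point whose region-local model is least confident.
        best, best_conf = None, np.inf
        for i in pool:
            r = tree.apply(X_pool[i:i + 1])[0]
            if r not in models:
                conf = 0.0                      # degenerate region: query it
            else:
                conf = models[r].predict_proba(X_pool[i:i + 1]).max()
            if conf < best_conf:
                best, best_conf = i, conf
        X_lab.append(X_pool[best]); y_lab.append(oracle(best)); pool.remove(best)
    return tree, models
```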
Active Learning for Sound Event Detection
Title | Active Learning for Sound Event Detection |
Authors | Shuyang Zhao, Toni Heittola, Tuomas Virtanen |
Abstract | This paper proposes an active learning system for sound event detection (SED). It aims at maximizing the accuracy of a learned SED model with limited annotation effort. The proposed system analyzes an initially unlabeled audio dataset, from which it selects sound segments for manual annotation. The candidate segments are generated based on a proposed change point detection approach, and the selection is based on the principle of mismatch-first farthest-traversal. During the training of SED models, recordings are used as training inputs, preserving the long-term context for annotated segments. The proposed system clearly outperforms reference methods on the two datasets used for evaluation (TUT Rare Sound 2017 and TAU Spatial Sound 2019). Training with recordings as context outperforms training with only annotated segments. Mismatch-first farthest-traversal outperforms reference sample selection methods based on random sampling and uncertainty sampling. Remarkably, the required annotation effort can be greatly reduced on the dataset where target sound events are rare: by annotating only 2% of the training data, the achieved SED performance is similar to that obtained by annotating all the training data. |
Tasks | Active Learning, Change Point Detection, Sound Event Detection |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05033v1 |
PDF | https://arxiv.org/pdf/2002.05033v1.pdf |
PWC | https://paperswithcode.com/paper/active-learning-for-sound-event-detection |
Repo | |
Framework | |
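The "mismatch-first farthest-traversal" selection order can be sketched roughly as follows: segments whose model/committee predictions disagree are ranked first, and within each priority level segments are visited in farthest-first order so that annotation spreads across the feature space. This is an interpretation for illustration, not the authors' code; the features, the mismatch signal, and the distance metric are placeholders.

```python
# Rough sketch of mismatch-first farthest-traversal segment selection.
# Feature extraction, the mismatch signal, and distances are placeholders.
import numpy as np

def mismatch_first_farthest_traversal(feats, mismatch, k):
    """feats: (n, d) segment embeddings; mismatch: boolean array marking segments
    whose predictions disagree; returns indices of k segments to annotate."""
    selected = []
    # Distance to the closest already-selected segment (inf = none selected yet).
    d_min = np.full(len(feats), np.inf)
    order = np.argsort(~mismatch, kind="stable")   # mismatched segments first
    for _ in range(k):
        candidates = [i for i in order if i not in selected]
        pri = mismatch[candidates]
        # Among the highest-priority unselected segments, take the farthest one.
        top = [i for i, p in zip(candidates, pri) if p == pri.max()]
        nxt = max(top, key=lambda i: d_min[i])
        selected.append(nxt)
        d_min = np.minimum(d_min, np.linalg.norm(feats - feats[nxt], axis=1))
    return selected
```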
Efficient active learning of sparse halfspaces with arbitrary bounded noise
Title | Efficient active learning of sparse halfspaces with arbitrary bounded noise |
Authors | Chicheng Zhang, Jie Shen, Pranjal Awasthi |
Abstract | In this work we study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under label noise. Even in the absence of label noise this is a challenging problem and only recently have label complexity bounds of the form $\tilde{O} \left(s \cdot \mathrm{polylog}(d, \frac{1}{\epsilon}) \right)$ been established in \citet{zhang2018efficient} for computationally efficient algorithms under the broad class of isotropic log-concave distributions. In contrast, under high levels of label noise, the label complexity bounds achieved by computationally efficient algorithms are much worse. When the label noise satisfies the {\em Massart} condition~\citep{massart2006risk}, i.e., each label is flipped with probability at most $\eta$ for a parameter $\eta \in [0,\frac{1}{2})$, the work of \citet{awasthi2016learning} provides a computationally efficient active learning algorithm under isotropic log-concave distributions with label complexity $\tilde{O} \left(s^{\mathrm{poly}(1/(1-2\eta))} \mathrm{poly}(\log d, \frac{1}{\epsilon}) \right)$. Hence the algorithm is label-efficient only when the noise rate $\eta$ is a constant. In this work, we substantially improve on the state of the art by designing a polynomial time algorithm for active learning of $s$-sparse halfspaces under bounded noise and isotropic log-concave distributions, with a label complexity of $\tilde{O} \left(\frac{s}{(1-2\eta)^4} \mathrm{polylog} (d, \frac{1}{\epsilon}) \right)$. Hence, our new algorithm is label-efficient even for noise rates close to $\frac{1}{2}$. Prior to our work, such a result was not known even for the random classification noise model. Our algorithm builds upon the existing margin-based algorithmic framework and at each iteration performs a sequence of online mirror descent updates on a carefully chosen loss sequence, and uses a novel gradient update rule that accounts for the bounded noise. |
Tasks | Active Learning |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.04840v1 |
PDF | https://arxiv.org/pdf/2002.04840v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-active-learning-of-sparse-1 |
Repo | |
Framework | |
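For context on the optimization primitive the abstract mentions, a generic online mirror descent step over a convex set $\mathcal{K}$ reads as follows; the mirror map $\Phi$, step size $\beta_t$, and loss sequence $\ell_t$ here are generic placeholders, not the paper's specific choices (which also involve a modified gradient rule for the bounded-noise setting).

```latex
% Generic online mirror descent update with mirror map \Phi; the step size
% \beta_t and losses \ell_t are placeholders for this sketch.
w_{t+1} = \arg\min_{w \in \mathcal{K}}
  \Big\{ \beta_t \, \langle \nabla \ell_t(w_t),\, w \rangle + D_{\Phi}(w, w_t) \Big\},
\qquad
D_{\Phi}(w, w') = \Phi(w) - \Phi(w') - \langle \nabla \Phi(w'),\, w - w' \rangle .
```

A $p$-norm mirror map such as $\Phi(w) = \tfrac{1}{2}\|w\|_p^2$ with $p$ on the order of $\log d$ is the kind of choice used in this line of work to keep the iterates effectively sparse.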
Deep Domain-Adversarial Image Generation for Domain Generalisation
Title | Deep Domain-Adversarial Image Generation for Domain Generalisation |
Authors | Kaiyang Zhou, Yongxin Yang, Timothy Hospedales, Tao Xiang |
Abstract | Machine learning models typically suffer from the domain shift problem when trained on a source dataset and evaluated on a target dataset of different distribution. To overcome this problem, domain generalisation (DG) methods aim to leverage data from multiple source domains so that a trained model can generalise to unseen domains. In this paper, we propose a novel DG approach based on \emph{Deep Domain-Adversarial Image Generation} (DDAIG). Specifically, DDAIG consists of three components, namely a label classifier, a domain classifier and a domain transformation network (DoTNet). The goal for DoTNet is to map the source training data to unseen domains. This is achieved by having a learning objective formulated to ensure that the generated data can be correctly classified by the label classifier while fooling the domain classifier. By augmenting the source training data with the generated unseen domain data, we can make the label classifier more robust to unknown domain changes. Extensive experiments on four DG datasets demonstrate the effectiveness of our approach. |
Tasks | Image Generation |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.06054v1 |
PDF | https://arxiv.org/pdf/2003.06054v1.pdf |
PWC | https://paperswithcode.com/paper/deep-domain-adversarial-image-generation-for |
Repo | |
Framework | |
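The training signal described for DoTNet (transformed images should still be classified correctly by the label classifier while fooling the domain classifier) can be sketched with a generic combined loss. The module architectures, perturbation form, and the weight `lmda` below are placeholders rather than the paper's design.

```python
# Hypothetical sketch of a DDAIG-style objective: a transformation network is
# updated so that transformed images keep their class label but confuse the
# domain classifier. Architectures and the weight `lmda` are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

label_clf  = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 7))    # 7 classes
domain_clf = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 4))    # 4 source domains
dotnet     = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())   # perturbation net
opt_t = torch.optim.SGD(dotnet.parameters(), lr=0.01)

def dotnet_step(x, y, d, lmda=0.3, eps=0.1):
    """x: images, y: class labels, d: domain labels."""
    x_new = torch.clamp(x + eps * dotnet(x), 0.0, 1.0)   # additive perturbation
    loss = F.cross_entropy(label_clf(x_new), y) \
         - lmda * F.cross_entropy(domain_clf(x_new), d)  # fool the domain classifier
    opt_t.zero_grad()
    loss.backward()
    opt_t.step()
    return x_new.detach()   # augmented batch for training the label classifier

# Example batch (random data just to show shapes).
x = torch.rand(8, 3, 32, 32); y = torch.randint(0, 7, (8,)); d = torch.randint(0, 4, (8,))
x_aug = dotnet_step(x, y, d)
```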
Interpretable Companions for Black-Box Models
Title | Interpretable Companions for Black-Box Models |
Authors | Danqing Pan, Tong Wang, Satoshi Hara |
Abstract | We present an interpretable companion model for any pre-trained black-box classifier. The idea is that for any input, a user can decide to either receive a prediction from the black-box model, with high accuracy but no explanations, or employ a companion rule to obtain an interpretable prediction with slightly lower accuracy. The companion model is trained from data and the predictions of the black-box model, with the objective combining area under the transparency–accuracy curve and model complexity. Our model provides flexible choices for practitioners who face the dilemma of choosing between always using interpretable models and always using black-box models for a predictive task, so users can, for any given input, take a step back to resort to an interpretable prediction if they find the predictive performance satisfying, or stick to the black-box model if the rules are unsatisfying. To show the value of companion models, we design a human evaluation on more than a hundred people to investigate the tolerable accuracy loss to gain interpretability for humans. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03494v2 |
PDF | https://arxiv.org/pdf/2002.03494v2.pdf |
PWC | https://paperswithcode.com/paper/interpretable-companions-for-black-box-models |
Repo | |
Framework | |
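The prediction-time behaviour described in the abstract (a per-input choice between the black box and an interpretable rule) amounts to a simple dispatch. The rule representation and names below are illustrative assumptions; the paper learns the rules jointly with a transparency-accuracy objective.

```python
# Illustrative dispatch between a companion rule list and a black-box model.
# The rule format and ordering are assumptions made for this sketch.
def companion_predict(x, rules, black_box, use_rules=True):
    """rules: ordered list of (condition, label) pairs; condition is a predicate
    on the input. Returns (prediction, explanation-or-None)."""
    if use_rules:
        for condition, label in rules:
            if condition(x):
                return label, condition.__doc__   # interpretable path
    return black_box(x), None                     # opaque but (usually) more accurate

# Toy usage with hypothetical names.
def young_low_income(x):
    """age < 30 and income < 30k -> deny"""
    return x["age"] < 30 and x["income"] < 30_000

rules = [(young_low_income, "deny")]
pred, why = companion_predict({"age": 25, "income": 20_000}, rules,
                              black_box=lambda x: "approve")
```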
Mixup Regularization for Region Proposal based Object Detectors
Title | Mixup Regularization for Region Proposal based Object Detectors |
Authors | Shahine Bouabid, Vincent Delaitre |
Abstract | Mixup - a neural network regularization technique based on linear interpolation of labeled sample pairs - has stood out for its capacity to improve a model's robustness and generalizability through a surprisingly simple formalism. However, its extension to the field of object detection remains unclear, as the interpolation of bounding boxes cannot be naively defined. In this paper, we propose to leverage the inherent region mapping structure of anchors to introduce a mixup-driven training regularization for region proposal based object detectors. The proposed method is benchmarked on standard datasets with challenging detection settings. Our experiments show an enhanced robustness to image alterations along with an ability to decontextualize detections, resulting in an improved generalization power. |
Tasks | Object Detection |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02065v1 |
PDF | https://arxiv.org/pdf/2003.02065v1.pdf |
PWC | https://paperswithcode.com/paper/mixup-regularization-for-region-proposal |
Repo | |
Framework | |
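The tension the abstract points at (pixels interpolate cleanly, bounding boxes do not) is easy to see in code. The convention below of keeping both images' boxes tagged with their mixing weight is one simple choice made for illustration, not necessarily the paper's scheme.

```python
# Minimal sketch of mixing two detection samples. Pixel values are linearly
# interpolated as in standard mixup; since bounding boxes cannot be averaged
# meaningfully, both box sets are kept and tagged with the mixing weight.
import numpy as np

def detection_mixup(img_a, boxes_a, img_b, boxes_b, alpha=1.5, rng=None):
    """img_*: HxWx3 float arrays (same size); boxes_*: (n, 5) arrays of
    [x1, y1, x2, y2, class_id]. Returns mixed image and weighted boxes."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    img = lam * img_a + (1.0 - lam) * img_b
    boxes = np.concatenate([
        np.hstack([boxes_a, np.full((len(boxes_a), 1), lam)]),
        np.hstack([boxes_b, np.full((len(boxes_b), 1), 1.0 - lam)]),
    ])                                   # each box carries its image's weight
    return img, boxes
```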
Learning reduced systems via deep neural networks with memory
Title | Learning reduced systems via deep neural networks with memory |
Authors | Xiaohan Fu, Lo-bin Chang, Dongbin Xiu |
Abstract | We present a general numerical approach for constructing governing equations for unknown dynamical systems when only data on a subset of the state variables are available. The unknown equations for these observed variables are thus a reduced system of the complete set of state variables. Reduced systems possess memory integrals, based on the well-known Mori-Zwanzig (MZ) formalism. Our numerical strategy to recover the reduced system starts by formulating a discrete approximation of the memory integral in the MZ formulation. The resulting unknown approximate MZ equations are finite dimensional, in the sense that only a finite number of past history data are involved. We then present a deep neural network structure that directly incorporates the history terms to produce memory in the network. The approach is suitable for any practical system with finite memory length. We then use a set of numerical examples to demonstrate the effectiveness of our method. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09451v1 |
PDF | https://arxiv.org/pdf/2003.09451v1.pdf |
PWC | https://paperswithcode.com/paper/learning-reduced-systems-via-deep-neural |
Repo | |
Framework | |
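The "memory in the network" ingredient can be pictured as a one-step predictor whose input is a fixed window of past observed states, a simplification of the discretized MZ memory term. The window width, architecture, and update form below are illustrative assumptions, not the paper's network.

```python
# Illustrative one-step predictor with explicit memory: the next observed state
# is predicted from the current state plus n_mem past states, mimicking a
# discretized memory term. Network size and data preparation are placeholders.
import torch
import torch.nn as nn

class MemoryNet(nn.Module):
    def __init__(self, dim, n_mem, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim * (n_mem + 1), hidden), nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, history):          # history: (batch, n_mem + 1, dim)
        flat = history.flatten(1)        # concatenate current + past states
        return history[:, -1] + self.net(flat)   # residual (Euler-like) update

# Build training pairs from an observed trajectory z of shape (T, dim).
def make_windows(z, n_mem):
    X = torch.stack([z[t - n_mem:t + 1] for t in range(n_mem, len(z) - 1)])
    Y = z[n_mem + 1:len(z)]              # targets: the next observed state
    return X, Y
```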
Generalised Lipschitz Regularisation Equals Distributional Robustness
Title | Generalised Lipschitz Regularisation Equals Distributional Robustness |
Authors | Zac Cranko, Zhan Shi, Xinhua Zhang, Richard Nock, Simon Kornblith |
Abstract | The problem of adversarial examples has highlighted the need for a theory of regularisation that is general enough to apply to exotic function classes, such as universal approximators. In response, we give a very general equality result regarding the relationship between distributional robustness and regularisation, as defined with a transportation cost uncertainty set. The theory allows us to (tightly) certify the robustness properties of a Lipschitz-regularised model with very mild assumptions. As a theoretical application we show a new result explicating the connection between adversarial learning and distributional robustness. We then give new results for how to achieve Lipschitz regularisation of kernel classifiers, which are demonstrated experimentally. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04197v1 |
PDF | https://arxiv.org/pdf/2002.04197v1.pdf |
PWC | https://paperswithcode.com/paper/generalised-lipschitz-regularisation-equals |
Repo | |
Framework | |
Knowledge Graphs on the Web – an Overview
Title | Knowledge Graphs on the Web – an Overview |
Authors | Nicolas Heist, Sven Hertling, Daniel Ringler, Heiko Paulheim |
Abstract | Knowledge Graphs are an emerging form of knowledge representation. While Google was the first to coin the term Knowledge Graph and promoted it as a means to improve its search results, knowledge graphs are used in many applications today. In a knowledge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chapter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap. |
Tasks | Knowledge Graphs |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00719v3 |
PDF | https://arxiv.org/pdf/2003.00719v3.pdf |
PWC | https://paperswithcode.com/paper/knowledge-graphs-on-the-web-an-overview |
Repo | |
Framework | |
Evolutionary Bin Packing for Memory-Efficient Dataflow Inference Acceleration on FPGA
Title | Evolutionary Bin Packing for Memory-Efficient Dataflow Inference Acceleration on FPGA |
Authors | Mairin Kroes, Lucian Petrica, Sorin Cotofana, Michaela Blott |
Abstract | Convolutional neural network (CNN) dataflow inference accelerators implemented in Field Programmable Gate Arrays (FPGAs) have demonstrated increased energy efficiency and lower latency compared to CNN execution on CPUs or GPUs. However, the complex shapes of CNN parameter memories do not typically map well to FPGA on-chip memories (OCM), which results in poor OCM utilization and ultimately limits the size and types of CNNs which can be effectively accelerated on FPGAs. In this work, we present a design methodology that improves the mapping efficiency of CNN parameters to FPGA OCM. We frame the mapping as a bin packing problem and determine that traditional bin packing algorithms are not well suited to solve the problem within FPGA- and CNN-specific constraints. We hybridize genetic algorithms and simulated annealing with traditional bin packing heuristics to create flexible mappers capable of grouping parameter memories such that each group optimally fits FPGA on-chip memories. We evaluate these algorithms on a variety of FPGA inference accelerators. Our hybrid mappers converge to optimal solutions in a matter of seconds for all CNN use-cases, achieve an increase of up to 65% in OCM utilization efficiency for deep CNNs, and are up to 200$\times$ faster than current state-of-the-art simulated annealing approaches. |
Tasks | |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.12449v1 |
PDF | https://arxiv.org/pdf/2003.12449v1.pdf |
PWC | https://paperswithcode.com/paper/evolutionary-bin-packing-for-memory-efficient |
Repo | |
Framework | |
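The hybridization idea (a metaheuristic searching over item orderings that a classical packing heuristic then decodes into bins) can be sketched generically. The encoding, operators, and fitness below are simplifications and ignore the FPGA-specific OCM geometry the paper's mappers handle.

```python
# Generic sketch of hybridizing a genetic algorithm with a first-fit packing
# heuristic: chromosomes are item orderings, decoded greedily into bins.
# Operators, fitness, and constraints are simplified placeholders.
import random

def first_fit(order, sizes, capacity):
    bins = []
    for i in order:
        for b in bins:
            if sum(sizes[j] for j in b) + sizes[i] <= capacity:
                b.append(i); break
        else:
            bins.append([i])
    return bins

def ga_pack(sizes, capacity, pop=30, gens=200, seed=0):
    random.seed(seed)
    n = len(sizes)
    population = [random.sample(range(n), n) for _ in range(pop)]
    fitness = lambda order: len(first_fit(order, sizes, capacity))  # fewer bins = better
    for _ in range(gens):
        population.sort(key=fitness)
        survivors = population[:pop // 2]
        children = []
        while len(survivors) + len(children) < pop:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n)
            child = a[:cut] + [g for g in b if g not in a[:cut]]  # order crossover
            i, j = random.sample(range(n), 2)
            child[i], child[j] = child[j], child[i]               # swap mutation
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return first_fit(best, sizes, capacity)
```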
Optimized Generic Feature Learning for Few-shot Classification across Domains
Title | Optimized Generic Feature Learning for Few-shot Classification across Domains |
Authors | Tonmoy Saikia, Thomas Brox, Cordelia Schmid |
Abstract | To learn models or features that generalize across tasks and domains is one of the grand goals of machine learning. In this paper, we propose to use cross-domain, cross-task data as the validation objective for hyper-parameter optimization (HPO) to improve on this goal. Given a rich enough search space, optimization of hyper-parameters learns features that maximize validation performance and, due to the objective, generalize across tasks and domains. We demonstrate the effectiveness of this strategy on few-shot image classification within and across domains. The learned features outperform all previous few-shot and meta-learning approaches. |
Tasks | Few-Shot Image Classification, Image Classification, Meta-Learning |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.07926v1 |
PDF | https://arxiv.org/pdf/2001.07926v1.pdf |
PWC | https://paperswithcode.com/paper/optimized-generic-feature-learning-for-few |
Repo | |
Framework | |
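The core recipe (score each hyper-parameter configuration on held-out data from a different domain or task rather than in-domain) can be sketched with plain random search. The model, search space, and data arguments below are placeholder assumptions; the paper applies the idea to deep few-shot classifiers with a more sophisticated HPO procedure.

```python
# Sketch of hyper-parameter search scored on a *cross-domain* validation set,
# so that the selected configuration favours features that transfer.
import random
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def sample_config(rng):
    return {
        "n_estimators": rng.choice([50, 100, 200]),
        "max_depth": rng.choice([4, 8, 16, None]),
        "min_samples_leaf": rng.choice([1, 2, 5]),
    }

def cross_domain_hpo(X_src, y_src, X_val_other_domain, y_val_other_domain, trials=20):
    rng = random.Random(0)
    best_cfg, best_acc = None, -1.0
    for _ in range(trials):
        cfg = sample_config(rng)
        model = RandomForestClassifier(random_state=0, **cfg).fit(X_src, y_src)
        # Key point: validation data come from a *different* domain than training.
        acc = accuracy_score(y_val_other_domain, model.predict(X_val_other_domain))
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```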
Adversarial Attacks on Convolutional Neural Networks in Facial Recognition Domain
Title | Adversarial Attacks on Convolutional Neural Networks in Facial Recognition Domain |
Authors | Yigit Alparslan, Jeremy Keim-Shenk, Shweta Khade, Rachel Greenstadt |
Abstract | Numerous recent studies have demonstrated how Deep Neural Network (DNN) classifiers can be fooled by adversarial examples, in which an attacker adds perturbations to an original sample, causing the classifier to misclassify the sample. Adversarial attacks that render DNNs vulnerable in real life represent a serious threat, given the consequences of improperly functioning autonomous vehicles, malware filters, or biometric authentication systems. In this paper, we apply Fast Gradient Sign Method to introduce perturbations to a facial image dataset and then test the output on a different classifier that we trained ourselves, to analyze transferability of this method. Next, we craft a variety of different attack algorithms on a facial image dataset, with the intention of developing untargeted black-box approaches assuming minimal adversarial knowledge, to further assess the robustness of DNNs in the facial recognition realm. We explore modifying single optimal pixels by a large amount, or modifying all pixels by a smaller amount, or combining these two attack approaches. While our single-pixel attacks achieved about a 15% average decrease in classifier confidence level for the actual class, the all-pixel attacks were more successful and achieved up to an 84% average decrease in confidence, along with an 81.6% misclassification rate, in the case of the attack that we tested with the highest levels of perturbation. Even with these high levels of perturbation, the face images remained fairly clearly identifiable to a human. We hope our research may help to advance the study of adversarial attacks on DNNs and defensive mechanisms to counteract them, particularly in the facial recognition domain. |
Tasks | Autonomous Vehicles |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11137v1 |
PDF | https://arxiv.org/pdf/2001.11137v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-attacks-on-convolutional-neural |
Repo | |
Framework | |
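The Fast Gradient Sign Method itself is standard; a minimal untargeted version for a generic classifier looks like the sketch below. The stand-in model, preprocessing, and epsilon value are assumptions and do not reproduce the paper's facial-recognition setup.

```python
# Minimal untargeted FGSM: perturb the input in the direction of the sign of
# the loss gradient. The model here is a stand-in, not the paper's classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    x_adv = x_adv + eps * x_adv.grad.sign()       # one signed-gradient step
    return x_adv.clamp(0.0, 1.0).detach()         # keep pixels in a valid range

# Toy usage with a random image batch and a tiny stand-in model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 10))
x = torch.rand(4, 3, 64, 64)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y, eps=0.05)
```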
Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning
Title | Challenges and Countermeasures for Adversarial Attacks on Deep Reinforcement Learning |
Authors | Inaam Ilahi, Muhammad Usama, Junaid Qadir, Muhammad Umar Janjua, Ala Al-Fuqaha, Dinh Thai Hoang, Dusit Niyato |
Abstract | Deep Reinforcement Learning (DRL) has numerous applications in the real world thanks to its outstanding ability to adapt quickly to the surrounding environment. Despite its great advantages, DRL is susceptible to adversarial attacks, which precludes its use in real-life critical systems and applications (e.g., smart grids, traffic control, and autonomous vehicles) unless its vulnerabilities are addressed and mitigated. Thus, this paper provides a comprehensive survey that discusses emerging attacks on DRL-based systems and the potential countermeasures to defend against these attacks. We first cover some fundamental background on DRL and present emerging adversarial attacks on machine learning techniques. We then investigate in more detail the vulnerabilities that the adversary can exploit to attack DRL, along with the state-of-the-art countermeasures to prevent such attacks. Finally, we highlight open issues and research challenges for developing solutions to deal with attacks on DRL-based intelligent systems. |
Tasks | Autonomous Vehicles |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09684v1 |
PDF | https://arxiv.org/pdf/2001.09684v1.pdf |
PWC | https://paperswithcode.com/paper/challenges-and-countermeasures-for |
Repo | |
Framework | |