Paper Group ANR 375
Adversarial Attacks on Monocular Depth Estimation. Fair Correlation Clustering. Runtime Performances of Randomized Search Heuristics for the Dynamic Weighted Vertex Cover Problem. ScaIL: Classifier Weights Scaling for Class Incremental Learning. Realtime Index-Free Single Source SimRank Processing on Web-Scale Graphs. Transfer Learning for HVAC Sys …
Adversarial Attacks on Monocular Depth Estimation
Title | Adversarial Attacks on Monocular Depth Estimation |
Authors | Ziqi Zhang, Xinge Zhu, Yingwei Li, Xiangqun Chen, Yao Guo |
Abstract | Recent advances of deep learning have brought exceptional performance on many computer vision tasks such as semantic segmentation and depth estimation. However, the vulnerability of deep neural networks towards adversarial examples have caused grave concerns for real-world deployment. In this paper, we present to the best of our knowledge the first systematic study of adversarial attacks on monocular depth estimation, an important task of 3D scene understanding in scenarios such as autonomous driving and robot navigation. In order to understand the impact of adversarial attacks on depth estimation, we first define a taxonomy of different attack scenarios for depth estimation, including non-targeted attacks, targeted attacks and universal attacks. We then adapt several state-of-the-art attack methods for classification on the field of depth estimation. Besides, multi-task attacks are introduced to further improve the attack performance for universal attacks. Experimental results show that it is possible to generate significant errors on depth estimation. In particular, we demonstrate that our methods can conduct targeted attacks on given objects (such as a car), resulting in depth estimation 3-4x away from the ground truth (e.g., from 20m to 80m). |
Tasks | Autonomous Driving, Depth Estimation, Monocular Depth Estimation, Robot Navigation, Scene Understanding, Semantic Segmentation |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10315v1 |
https://arxiv.org/pdf/2003.10315v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-attacks-on-monocular-depth |
Repo | |
Framework | |
Fair Correlation Clustering
Title | Fair Correlation Clustering |
Authors | Sara Ahmadian, Alessandro Epasto, Ravi Kumar, Mohammad Mahdian |
Abstract | In this paper, we study correlation clustering under fairness constraints. Fair variants of $k$-median and $k$-center clustering have been studied recently, and approximation algorithms using a notion called fairlet decomposition have been proposed. We obtain approximation algorithms for fair correlation clustering under several important types of fairness constraints. Our results hinge on obtaining a fairlet decomposition for correlation clustering by introducing a novel combinatorial optimization problem. We define a fairlet decomposition with cost similar to the $k$-median cost and this allows us to obtain approximation algorithms for a wide range of fairness constraints. We complement our theoretical results with an in-depth analysis of our algorithms on real graphs where we show that fair solutions to correlation clustering can be obtained with limited increase in cost compared to the state-of-the-art (unfair) algorithms. |
Tasks | Combinatorial Optimization |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02274v2 |
https://arxiv.org/pdf/2002.02274v2.pdf | |
PWC | https://paperswithcode.com/paper/fair-correlation-clustering |
Repo | |
Framework | |
Runtime Performances of Randomized Search Heuristics for the Dynamic Weighted Vertex Cover Problem
Title | Runtime Performances of Randomized Search Heuristics for the Dynamic Weighted Vertex Cover Problem |
Authors | Feng Shi, Frank Neumann, Jianxin Wang |
Abstract | Randomized search heuristics such as evolutionary algorithms are frequently applied to dynamic combinatorial optimization problems. Within this paper, we present a dynamic model of the classic Weighted Vertex Cover problem and analyze the runtime performances of the well-studied algorithms Randomized Local Search and (1+1) EA adapted to it, to contribute to the theoretical understanding of evolutionary computing for problems with dynamic changes. In our investigations, we use an edge-based representation based on the dual form of the Linear Programming formulation for the problem and study the expected runtime that the adapted algorithms require to maintain a 2-approximate solution when the given weighted graph is modified by an edge-editing or weight-editing operation. Considering the weights on the vertices may be exponentially large with respect to the size of the graph, the step size adaption strategy is incorporated, with or without the 1/5-th rule that is employed to control the increasing/decreasing rate of the step size. Our results show that three of the four algorithms presented in the paper can recompute 2-approximate solutions for the studied dynamic changes in polynomial expected runtime, but the (1+1) EA with 1/5-th Rule requires pseudo-polynomial expected runtime. |
Tasks | Combinatorial Optimization |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.08903v1 |
https://arxiv.org/pdf/2001.08903v1.pdf | |
PWC | https://paperswithcode.com/paper/runtime-performances-of-randomized-search |
Repo | |
Framework | |
ScaIL: Classifier Weights Scaling for Class Incremental Learning
Title | ScaIL: Classifier Weights Scaling for Class Incremental Learning |
Authors | Eden Belouadah, Adrian Popescu |
Abstract | Incremental learning is useful if an AI agent needs to integrate data from a stream. The problem is non trivial if the agent runs on a limited computational budget and has a bounded memory of past data. In a deep learning approach, the constant computational budget requires the use of a fixed architecture for all incremental states. The bounded memory generates data imbalance in favor of new classes and a prediction bias toward them appears. This bias is commonly countered by introducing a data balancing step in addition to the basic network training. We depart from this approach and propose simple but efficient scaling of past class classifier weights to make them more comparable to those of new classes. Scaling exploits incremental state level statistics and is applied to the classifiers learned in the initial state of classes in order to profit from all their available data. We also question the utility of the widely used distillation loss component of incremental learning algorithms by comparing it to vanilla fine tuning in presence of a bounded memory. Evaluation is done against competitive baselines using four public datasets. Results show that the classifier weights scaling and the removal of the distillation are both beneficial. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05755v1 |
https://arxiv.org/pdf/2001.05755v1.pdf | |
PWC | https://paperswithcode.com/paper/scail-classifier-weights-scaling-for-class |
Repo | |
Framework | |
Realtime Index-Free Single Source SimRank Processing on Web-Scale Graphs
Title | Realtime Index-Free Single Source SimRank Processing on Web-Scale Graphs |
Authors | Jieming Shi, Tianyuan Jin, Renchi Yang, Xiaokui Xiao, Yin Yang |
Abstract | Given a graph G and a node u in G, a single source SimRank query evaluates the similarity between u and every node v in G. Existing approaches to single source SimRank computation incur either long query response time, or expensive pre-computation, which needs to be performed again whenever the graph G changes. Consequently, to our knowledge none of them is ideal for scenarios in which (i) query processing must be done in realtime, and (ii) the underlying graph G is massive, with frequent updates. Motivated by this, we propose SimPush, a novel algorithm that answers single source SimRank queries without any pre-computation, and at the same time achieves significantly higher query processing speed than even the fastest known index-based solutions. Further, SimPush provides rigorous result quality guarantees, and its high performance does not rely on any strong assumption of the underlying graph. Specifically, compared to existing methods, SimPush employs a radically different algorithmic design that focuses on (i) identifying a small number of nodes relevant to the query, and subsequently (ii) computing statistics and performing residue push from these nodes only. We prove the correctness of SimPush, analyze its time complexity, and compare its asymptotic performance with that of existing methods. Meanwhile, we evaluate the practical performance of SimPush through extensive experiments on 8 real datasets. The results demonstrate that SimPush consistently outperforms all existing solutions, often by over an order of magnitude. In particular, on a commodity machine, SimPush answers a single source SimRank query on a web graph containing over 133 million nodes and 5.4 billion edges in under 62 milliseconds, with 0.00035 empirical error, while the fastest index-based competitor needs 1.18 seconds. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08082v1 |
https://arxiv.org/pdf/2002.08082v1.pdf | |
PWC | https://paperswithcode.com/paper/realtime-index-free-single-source-simrank |
Repo | |
Framework | |
Transfer Learning for HVAC System Fault Detection
Title | Transfer Learning for HVAC System Fault Detection |
Authors | Chase P. Dowling, Baosen Zhang |
Abstract | Faults in HVAC systems degrade thermal comfort and energy efficiency in buildings and have received significant attention from the research community, with data driven methods gaining in popularity. Yet the lack of labeled data, such as normal versus faulty operational status, has slowed the application of machine learning to HVAC systems. In addition, for any particular building, there may be an insufficient number of observed faults over a reasonable amount of time for training. To overcome these challenges, we present a transfer methodology for a novel Bayesian classifier designed to distinguish between normal operations and faulty operations. The key is to train this classifier on a building with a large amount of sensor and fault data (for example, via simulation or standard test data) then transfer the classifier to a new building using a small amount of normal operations data from the new building. We demonstrate a proof-of-concept for transferring a classifier between architecturally similar buildings in different climates and show few samples are required to maintain classification precision and recall. |
Tasks | Fault Detection, Transfer Learning |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01060v1 |
https://arxiv.org/pdf/2002.01060v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-for-hvac-system-fault |
Repo | |
Framework | |
Near real-time map building with multi-class image set labelling and classification of road conditions using convolutional neural networks
Title | Near real-time map building with multi-class image set labelling and classification of road conditions using convolutional neural networks |
Authors | Sheela Ramanna, Cenker Sengoz, Scott Kehler, Dat Pham |
Abstract | Weather is an important factor affecting transportation and road safety. In this paper, we leverage state-of-the-art convolutional neural networks in labelling images taken by street and highway cameras located across across North America. Road camera snapshots were used in experiments with multiple deep learning frameworks to classify images by road condition. The training data for these experiments used images labelled as dry, wet, snow/ice, poor, and offline. The experiments tested different configurations of six convolutional neural networks (VGG-16, ResNet50, Xception, InceptionResNetV2, EfficientNet-B0 and EfficientNet-B4) to assess their suitability to this problem. The precision, accuracy, and recall were measured for each framework configuration. In addition, the training sets were varied both in overall size and by size of individual classes. The final training set included 47,000 images labelled using the five aforementioned classes. The EfficientNet-B4 framework was found to be most suitable to this problem, achieving validation accuracy of 90.6%, although EfficientNet-B0 achieved an accuracy of 90.3% with half the execution time. It was observed that VGG-16 with transfer learning proved to be very useful for data acquisition and pseudo-labelling with limited hardware resources, throughout this project. The EfficientNet-B4 framework was then placed into a real-time production environment, where images could be classified in real-time on an ongoing basis. The classified images were then used to construct a map showing real-time road conditions at various camera locations across North America. The choice of these frameworks and our analysis take into account unique requirements of real-time map building functions. A detailed analysis of the process of semi-automated dataset labelling using these frameworks is also presented in this paper. |
Tasks | Transfer Learning |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09947v1 |
https://arxiv.org/pdf/2001.09947v1.pdf | |
PWC | https://paperswithcode.com/paper/near-real-time-map-building-with-multi-class |
Repo | |
Framework | |
Knowledge Graph Embedding for Link Prediction: A Comparative Analysis
Title | Knowledge Graph Embedding for Link Prediction: A Comparative Analysis |
Authors | Andrea Rossi, Donatella Firmani, Antonio Matinata, Paolo Merialdo, Denilson Barbosa |
Abstract | Knowledge Graphs (KGs) have found many applications in industry and academic settings, which in turn, have motivated considerable research efforts towards large-scale information extraction from a variety of sources. Despite such efforts, it is well known that even state-of-the-art KGs suffer from incompleteness. Link Prediction (LP), the task of predicting missing facts among entities already a KG, is a promising and widely studied task aimed at addressing KG incompleteness. Among the recent LP techniques, those based on KG embeddings have achieved very promising performances in some benchmarks. Despite the fast growing literature in the subject, insufficient attention has been paid to the effect of the various design choices in those methods. Moreover, the standard practice in this area is to report accuracy by aggregating over a large number of test facts in which some entities are over-represented; this allows LP methods to exhibit good performance by just attending to structural properties that include such entities, while ignoring the remaining majority of the KG. This analysis provides a comprehensive comparison of embedding-based LP methods, extending the dimensions of analysis beyond what is commonly available in the literature. We experimentally compare effectiveness and efficiency of 16 state-of-the-art methods, consider a rule-based baseline, and report detailed analysis over the most popular benchmarks in the literature. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00819v2 |
https://arxiv.org/pdf/2002.00819v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-graph-embedding-for-link-prediction |
Repo | |
Framework | |
Dynamic Multi-objective Optimization of the Travelling Thief Problem
Title | Dynamic Multi-objective Optimization of the Travelling Thief Problem |
Authors | Daniel Herring, Michael Kirley, Xin Yao |
Abstract | Investigation of detailed and complex optimisation problem formulations that reflect realistic scenarios is a burgeoning field of research. A growing body of work exists for the Travelling Thief Problem, including multi-objective formulations and comparisons of exact and approximate methods to solve it. However, as many realistic scenarios are non-static in time, dynamic formulations have yet to be considered for the TTP. Definition of dynamics within three areas of the TTP problem are addressed; in the city locations, availability map and item values. Based on the elucidation of solution conservation between initial sets and obtained non-dominated sets, we define a range of initialisation mechanisms using solutions generated via solvers, greedily and randomly. These are then deployed to seed the population after a change and the performance in terms of hypervolume and spread is presented for comparison. Across a range of problems with varying TSP-component and KP-component sizes, we observe interesting trends in line with existing conclusions; there is little benefit to using randomisation as a strategy for initialisation of solution populations when the optimal TSP and KP component solutions can be exploited. Whilst these separate optima don’t guarantee good TTP solutions, when combined, provide better initial performance and therefore in some examined instances, provides the best response to dynamic changes. A combined approach that mixes solution generation methods to provide a composite population in response to dynamic changes provides improved performance in some instances for the different dynamic TTP formulations. Potential for further development of a more cooperative combined method are realised to more cohesively exploit known information about the problems. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02636v1 |
https://arxiv.org/pdf/2002.02636v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-multi-objective-optimization-of-the |
Repo | |
Framework | |
Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings
Title | Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings |
Authors | Luke Melas-Kyriazi, George Han, Celine Liang |
Abstract | Over the past year, the emergence of transfer learning with large-scale language models (LM) has led to dramatic performance improvements across a broad range of natural language understanding tasks. However, the size and memory footprint of these large LMs makes them difficult to deploy in many scenarios (e.g. on mobile phones). Recent research points to knowledge distillation as a potential solution, showing that when training data for a given task is abundant, it is possible to distill a large (teacher) LM into a small task-specific (student) network with minimal loss of performance. However, when such data is scarce, there remains a significant performance gap between large pretrained LMs and smaller task-specific models, even when training via distillation. In this paper, we bridge this gap with a novel training approach, called generation-distillation, that leverages large finetuned LMs in two ways: (1) to generate new (unlabeled) training examples, and (2) to distill their knowledge into a small network using these examples. Across three low-resource text classification datsets, we achieve comparable performance to BERT while using 300x fewer parameters, and we outperform prior approaches to distillation for text classification while using 3x fewer parameters. |
Tasks | Text Classification, Transfer Learning |
Published | 2020-01-25 |
URL | https://arxiv.org/abs/2002.00733v1 |
https://arxiv.org/pdf/2002.00733v1.pdf | |
PWC | https://paperswithcode.com/paper/generation-distillation-for-efficient-natural-1 |
Repo | |
Framework | |
Health State Estimation
Title | Health State Estimation |
Authors | Nitish Nag |
Abstract | Life’s most valuable asset is health. Continuously understanding the state of our health and modeling how it evolves is essential if we wish to improve it. Given the opportunity that people live with more data about their life today than any other time in history, the challenge rests in interweaving this data with the growing body of knowledge to compute and model the health state of an individual continually. This dissertation presents an approach to build a personal model and dynamically estimate the health state of an individual by fusing multi-modal data and domain knowledge. The system is stitched together from four essential abstraction elements: 1. the events in our life, 2. the layers of our biological systems (from molecular to an organism), 3. the functional utilities that arise from biological underpinnings, and 4. how we interact with these utilities in the reality of daily life. Connecting these four elements via graph network blocks forms the backbone by which we instantiate a digital twin of an individual. Edges and nodes in this graph structure are then regularly updated with learning techniques as data is continuously digested. Experiments demonstrate the use of dense and heterogeneous real-world data from a variety of personal and environmental sensors to monitor individual cardiovascular health state. State estimation and individual modeling is the fundamental basis to depart from disease-oriented approaches to a total health continuum paradigm. Precision in predicting health requires understanding state trajectory. By encasing this estimation within a navigational approach, a systematic guidance framework can plan actions to transition a current state towards a desired one. This work concludes by presenting this framework of combining the health state and personal graph model to perpetually plan and assist us in living life towards our goals. |
Tasks | |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.09312v1 |
https://arxiv.org/pdf/2003.09312v1.pdf | |
PWC | https://paperswithcode.com/paper/health-state-estimation |
Repo | |
Framework | |
Implicit Regularization of Random Feature Models
Title | Implicit Regularization of Random Feature Models |
Authors | Arthur Jacot, Berfin Şimşek, Francesco Spadaro, Clément Hongler, Franck Gabriel |
Abstract | Random Feature (RF) models are used as efficient parametric approximations of kernel methods. We investigate, by means of random matrix theory, the connection between Gaussian RF models and Kernel Ridge Regression (KRR). For a Gaussian RF model with $P$ features, $N$ data points, and a ridge $\lambda$, we show that the average (i.e. expected) RF predictor is close to a KRR predictor with an effective ridge $\tilde{\lambda}$. We show that $\tilde{\lambda} > \lambda$ and $\tilde{\lambda} \searrow \lambda$ monotonically as $P$ grows, thus revealing the implicit regularization effect of finite RF sampling. We then compare the risk (i.e. test error) of the $\tilde{\lambda}$-KRR predictor with the average risk of the $\lambda$-RF predictor and obtain a precise and explicit bound on their difference. Finally, we empirically find an extremely good agreement between the test errors of the average $\lambda$-RF predictor and $\tilde{\lambda}$-KRR predictor. |
Tasks | |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08404v1 |
https://arxiv.org/pdf/2002.08404v1.pdf | |
PWC | https://paperswithcode.com/paper/implicit-regularization-of-random-feature |
Repo | |
Framework | |
Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning
Title | Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning |
Authors | Adithya M. Devraj, Sean P. Meyn |
Abstract | It has been a trend in the Reinforcement Learning literature to derive sample complexity bounds: a bound on how many experiences with the environment are required to obtain an $\varepsilon$-optimal policy. In the discounted cost, infinite horizon setting, all of the known bounds have a factor that is a polynomial in $1/(1-\beta)$, where $\beta < 1$ is the discount factor. For a large discount factor, these bounds seem to imply that a very large number of samples is required to achieve an $\varepsilon$-optimal policy. The objective of the present work is to introduce a new class of algorithms that have sample complexity uniformly bounded for all $\beta < 1$. One may argue that this is impossible, due to a recent min-max lower bound. The explanation is that this previous lower bound is for a specific problem, which we modify, without compromising the ultimate objective of obtaining an $\varepsilon$-optimal policy. Specifically, we show that the asymptotic variance of the Q-learning algorithm, with an optimized step-size sequence, is a quadratic function of $1/(1-\beta)$; an expected, and essentially known result. The new relative Q-learning algorithm proposed here is shown to have asymptotic variance that is a quadratic in $1/(1- \rho \beta)$, where $1 - \rho > 0$ is the spectral gap of an optimal transition matrix. |
Tasks | Q-Learning |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10301v1 |
https://arxiv.org/pdf/2002.10301v1.pdf | |
PWC | https://paperswithcode.com/paper/q-learning-with-uniformly-bounded-variance |
Repo | |
Framework | |
Neural Network Compression Framework for fast model inference
Title | Neural Network Compression Framework for fast model inference |
Authors | Alexander Kozlov, Ivan Lazarevich, Vasily Shamporov, Nikolay Lyalyushkin, Yury Gorbachev |
Abstract | In this work we present a new framework for neural networks compression with fine-tuning, which we called Neural Network Compression Framework (NNCF). It leverages recent advances of various network compression methods and implements some of them, such as sparsity, quantization, and binarization. These methods allow getting more hardware-friendly models which can be efficiently run on general-purpose hardware computation units (CPU, GPU) or special Deep Learning accelerators. We show that the developed methods can be successfully applied to a wide range of models to accelerate the inference time while keeping the original accuracy. The framework can be used within the training samples, which are supplied with it, or as a standalone package that can be seamlessly integrated into the existing training code with minimal adaptations. Currently, a PyTorch version of NNCF is available as a part of OpenVINO Training Extensions at https://github.com/opencv/openvino_training_extensions/tree/develop/pytorch_toolkit/nncf |
Tasks | Neural Network Compression, Quantization |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08679v3 |
https://arxiv.org/pdf/2002.08679v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-compression-framework-for-fast |
Repo | |
Framework | |
Parametric Probabilistic Quantum Memory
Title | Parametric Probabilistic Quantum Memory |
Authors | Rodrigo S. Sousa, Priscila G. M. dos Santos, Tiago M. L. Veras, Wilson R. de Oliveira, Adenilton J. da Silva |
Abstract | Probabilistic Quantum Memory (PQM) is a data structure that computes the distance from a binary input to all binary patterns stored in superposition on the memory. This data structure allows the development of heuristics to speed up artificial neural networks architecture selection. In this work, we propose an improved parametric version of the PQM to perform pattern classification, and we also present a PQM quantum circuit suitable for Noisy Intermediate Scale Quantum (NISQ) computers. We present a classical evaluation of a parametric PQM network classifier on public benchmark datasets. We also perform experiments to verify the viability of PQM on a 5-qubit quantum computer. |
Tasks | |
Published | 2020-01-11 |
URL | https://arxiv.org/abs/2001.04798v1 |
https://arxiv.org/pdf/2001.04798v1.pdf | |
PWC | https://paperswithcode.com/paper/parametric-probabilistic-quantum-memory |
Repo | |
Framework | |