Paper Group ANR 1321
Attention Guided Anomaly Localization in Images
Title | Attention Guided Anomaly Localization in Images |
Authors | Shashanka Venkataramanan, Kuan-Chuan Peng, Rajat Vikram Singh, Abhijit Mahalanobis |
Abstract | Anomaly localization is an important problem in computer vision which involves localizing anomalous regions within images with applications in industrial inspection, surveillance, and medical imaging. This task is challenging due to the small sample size and pixel coverage of the anomaly in real-world scenarios. Most prior works need to use anomalous training images to compute a class-specific threshold to localize anomalies. Without the need of anomalous training images, we propose Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA), which localizes the anomaly with a convolutional latent variable to preserve the spatial information. In the unsupervised setting, we propose an attention expansion loss where we encourage CAVGA to focus on all normal regions in the image. Furthermore, in the weakly-supervised setting we propose a complementary guided attention loss, where we encourage the attention map to focus on all normal regions while minimizing the attention map corresponding to anomalous regions in the image. CAVGA outperforms the state-of-the-art (SOTA) anomaly localization methods on MVTec Anomaly Detection (MVTAD), modified ShanghaiTech Campus (mSTC) and Large-scale Attention based Glaucoma (LAG) datasets in the unsupervised setting and when using only 2% anomalous images in the weakly-supervised setting. CAVGA also outperforms SOTA anomaly detection methods on the MNIST, CIFAR-10, Fashion-MNIST, MVTAD, mSTC and LAG datasets. |
Tasks | Anomaly Detection |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08616v2 |
https://arxiv.org/pdf/1911.08616v2.pdf | |
PWC | https://paperswithcode.com/paper/attention-guided-anomaly-detection-and |
Repo | |
Framework | |
Adversarial FDI Attack against AC State Estimation with ANN
Title | Adversarial FDI Attack against AC State Estimation with ANN |
Authors | Tian Liu, Tao Shu |
Abstract | Artificial neural networks (ANNs) provide superior accuracy for nonlinear alternating current (AC) state estimation (SE) in the smart grid compared with traditional methods. However, research has discovered that ANNs can be easily fooled by adversarial examples. In this paper, we initiate a new study of adversarial false data injection (FDI) attacks against AC SE with ANN: by injecting a deliberate attack vector into measurements, the attacker can degrade the accuracy of ANN SE while remaining undetected. We propose a population-based algorithm, differential evolution (DE), and a gradient-based algorithm, sequential least squares programming (SLSQP), to generate attack vectors. The performance of these algorithms is evaluated through simulations on IEEE 9-bus, 14-bus and 30-bus systems under various attack scenarios. Simulation results show that DE is more effective than SLSQP in all simulation cases. The attack examples generated by the DE algorithm successfully degrade the ANN SE accuracy with high probability. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11328v1 |
https://arxiv.org/pdf/1906.11328v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-fdi-attack-against-ac-state |
Repo | |
Framework | |
Object-Centric Task and Motion Planning in Dynamic Environments
Title | Object-Centric Task and Motion Planning in Dynamic Environments |
Authors | Toki Migimatsu, Jeannette Bohg |
Abstract | We address the problem of applying Task and Motion Planning (TAMP) in real world environments. TAMP combines symbolic and geometric reasoning to produce sequential manipulation plans, typically specified as joint-space trajectories, which are valid only as long as the environment is static and perception and control are highly accurate. In case of any changes in the environment, slow re-planning is required. We propose a TAMP algorithm that optimizes over Cartesian frames defined relative to target objects. The resulting plan then remains valid even if the objects are moving and can be executed by reactive controllers that adapt to these changes in real time. We apply our TAMP framework to a torque-controlled robot in a pick and place setting and demonstrate its ability to adapt to changing environments, inaccurate perception, and imprecise control, both in simulation and the real world. |
Tasks | Motion Planning |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04679v2 |
https://arxiv.org/pdf/1911.04679v2.pdf | |
PWC | https://paperswithcode.com/paper/object-centric-task-and-motion-planning-in |
Repo | |
Framework | |
Texture Underfitting for Domain Adaptation
Title | Texture Underfitting for Domain Adaptation |
Authors | Jan-Nico Zaech, Dengxin Dai, Martin Hahner, Luc Van Gool |
Abstract | Comprehensive semantic segmentation is one of the key components for robust scene understanding and a requirement to enable autonomous driving. Driven by large scale datasets, convolutional neural networks show impressive results on this task. However, a segmentation algorithm generalizing to various scenes and conditions would require an enormously diverse dataset, making the labour intensive data acquisition and labeling process prohibitively expensive. Under the assumption of structural similarities between segmentation maps, domain adaptation promises to resolve this challenge by transferring knowledge from existing, potentially simulated datasets to new environments where no supervision exists. While the performance of this approach is contingent on the concept that neural networks learn a high level understanding of scene structure, recent work suggests that neural networks are biased towards overfitting to texture instead of learning structural and shape information. Considering the ideas underlying semantic segmentation, we employ random image stylization to augment the training dataset and propose a training procedure that facilitates texture underfitting to improve the performance of domain adaptation. In experiments with supervised as well as unsupervised methods for the task of synthetic-to-real domain adaptation, we show that our approach outperforms conventional training methods. |
Tasks | Autonomous Driving, Domain Adaptation, Image Stylization, Scene Understanding, Semantic Segmentation |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11215v1 |
https://arxiv.org/pdf/1908.11215v1.pdf | |
PWC | https://paperswithcode.com/paper/texture-underfitting-for-domain-adaptation |
Repo | |
Framework | |
Towards White-box Benchmarks for Algorithm Control
Title | Towards White-box Benchmarks for Algorithm Control |
Authors | André Biedenkapp, H. Furkan Bozkurt, Frank Hutter, Marius Lindauer |
Abstract | The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on tuned hyperparameter configurations. Automated methods have been proposed to relieve users of the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential in adjusting an algorithm’s hyperparameters online, since different hyperparameters are potentially optimal at different stages of the algorithm. We formulate the problem of adjusting an algorithm’s hyperparameters for a given instance on the fly as a contextual MDP, making reinforcement learning (RL) the prime candidate to solve the resulting algorithm control problem in a data-driven way. Furthermore, inspired by applications of algorithm configuration, we introduce new white-box benchmarks suitable to study algorithm control. We show that on short sequences, algorithm configuration is a valid choice, but that with increasing sequence length a black-box view on the problem quickly becomes infeasible and RL performs better. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07644v2 |
https://arxiv.org/pdf/1906.07644v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-white-box-benchmarks-for-algorithm |
Repo | |
Framework | |
Experience Replay Optimization
Title | Experience Replay Optimization |
Authors | Daochen Zha, Kwei-Herng Lai, Kaixiong Zhou, Xia Hu |
Abstract | Experience replay enables reinforcement learning agents to memorize and reuse past experiences, just as humans replay memories for the situation at hand. Contemporary off-policy algorithms either replay past experiences uniformly or utilize a rule-based replay strategy, which may be sub-optimal. In this work, we consider learning a replay policy to optimize the cumulative reward. Replay learning is challenging because the replay memory is noisy and large, and the cumulative reward is unstable. To address these issues, we propose a novel experience replay optimization (ERO) framework which alternately updates two policies: the agent policy, and the replay policy. The agent is updated to maximize the cumulative reward based on the replayed data, while the replay policy is updated to provide the agent with the most useful experiences. The conducted experiments on various continuous control tasks demonstrate the effectiveness of ERO, empirically showing promise in experience replay learning to improve the performance of off-policy reinforcement learning algorithms. |
Tasks | Continuous Control |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08387v1 |
https://arxiv.org/pdf/1906.08387v1.pdf | |
PWC | https://paperswithcode.com/paper/experience-replay-optimization |
Repo | |
Framework | |
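The uniform replay baseline that the abstract above contrasts with a learned replay policy can be sketched as follows. This is a minimal illustration of uniform experience replay only, not the ERO method; the class name `UniformReplayBuffer` and the transition layout `(state, action, reward, next_state, done)` are assumptions chosen for the example.

```python
import random
from collections import deque


class UniformReplayBuffer:
    """Fixed-capacity memory that replays stored transitions uniformly at random.

    Hypothetical helper for illustration; it does not learn a replay policy.
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently evicts the oldest transition when full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Store one transition."""
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw up to batch_size distinct transitions with equal probability."""
        k = min(batch_size, len(self.memory))
        return self.rng.sample(list(self.memory), k)

    def __len__(self):
        return len(self.memory)
```

A learned replay policy, as the abstract describes, would replace the equal-probability draw in `sample` with a selection that is itself updated during training.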
Deep Parametric Indoor Lighting Estimation
Title | Deep Parametric Indoor Lighting Estimation |
Authors | Marc-André Gardner, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Christian Gagné, Jean-François Lalonde |
Abstract | We present a method to estimate lighting from a single image of an indoor scene. Previous work has used an environment map representation that does not account for the localized nature of indoor lighting. Instead, we represent lighting as a set of discrete 3D lights with geometric and photometric parameters. We train a deep neural network to regress these parameters from a single image, on a dataset of environment maps annotated with depth. We propose a differentiable layer to convert these parameters to an environment map to compute our loss; this bypasses the challenge of establishing correspondences between estimated and ground truth lights. We demonstrate, via quantitative and qualitative evaluations, that our representation and training scheme lead to more accurate results compared to previous work, while allowing for more realistic 3D object compositing with spatially-varying lighting. |
Tasks | |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08812v1 |
https://arxiv.org/pdf/1910.08812v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-parametric-indoor-lighting-estimation |
Repo | |
Framework | |
Who wants accurate models? Arguing for a different metrics to take classification models seriously
Title | Who wants accurate models? Arguing for a different metrics to take classification models seriously |
Authors | Federico Cabitza, Andrea Campagner |
Abstract | With the increasing availability of AI-based decision support, there is an increasing need for their certification by both AI manufacturers and notified bodies, as well as the pragmatic (real-world) validation of these systems. Therefore, there is the need for meaningful and informative ways to assess the performance of AI systems in clinical practice. Common metrics (like accuracy scores and areas under the ROC curve) have known problems and they do not take into account important information about the preferences of clinicians and the needs of their specialist practice, like the likelihood and impact of errors and the complexity of cases. In this paper, we present a new accuracy measure, the H-accuracy (Ha), which we claim is more informative in the medical domain (and others of similar needs) for the elements it encompasses. We also provide proof that the H-accuracy is a generalization of the balanced accuracy and establish a relation between the H-accuracy and the Net Benefit. Finally, we illustrate an experimentation in two user studies to show the descriptive power of the Ha score and how complementary and differently informative measures can be derived from its formulation (a Python script to compute Ha is also made available). |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09246v2 |
https://arxiv.org/pdf/1910.09246v2.pdf | |
PWC | https://paperswithcode.com/paper/who-wants-accurate-models-arguing-for-a |
Repo | |
Framework | |
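The abstract above states that the H-accuracy generalizes the balanced accuracy. The sketch below computes only the standard balanced accuracy (the mean of per-class recall) to show what is being generalized; it is not the H-accuracy, whose formula is not given here, and the function name is an assumption for the example.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = defaultdict(int)  # per-class count of correct predictions
    total = defaultdict(int)    # per-class count of true samples
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# A classifier that always predicts the majority class on an 8:2 split
# is right on 80% of samples, yet its balanced accuracy is only 0.5.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
score = balanced_accuracy(y_true, y_pred)
```

The example illustrates one of the known problems with plain accuracy that the abstract alludes to: class imbalance can hide a model that never detects the minority class.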
The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation
Title | The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation |
Authors | Hermann Blum, Paul-Edouard Sarlin, Juan Nieto, Roland Siegwart, Cesar Cadena |
Abstract | Deep learning has enabled impressive progress in the accuracy of semantic segmentation. Yet, the ability to estimate uncertainty and detect failure is key for safety-critical applications like autonomous driving. Existing uncertainty estimates have mostly been evaluated on simple tasks, and it is unclear whether these methods generalize to more complex scenarios. We present Fishyscapes, the first public benchmark for uncertainty estimation in a real-world task of semantic segmentation for urban driving. It evaluates pixel-wise uncertainty estimates towards the detection of anomalous objects in front of the vehicle. We adapt state-of-the-art methods to recent semantic segmentation models and compare approaches based on softmax confidence, Bayesian learning, and embedding density. Our results show that anomaly detection is far from solved even for ordinary situations, while our benchmark allows measuring advancements beyond the state-of-the-art. |
Tasks | Anomaly Detection, Autonomous Driving, Semantic Segmentation |
Published | 2019-04-05 |
URL | https://arxiv.org/abs/1904.03215v3 |
https://arxiv.org/pdf/1904.03215v3.pdf | |
PWC | https://paperswithcode.com/paper/the-fishyscapes-benchmark-measuring-blind |
Repo | |
Framework | |
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers
Title | Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers |
Authors | Yao Ma, Alex Olshevsky, Venkatesh Saligrama, Csaba Szepesvari |
Abstract | We consider worker skill estimation for the single-coin Dawid-Skene crowdsourcing model. In practice, skill estimation is challenging because worker assignments are sparse and irregular due to the arbitrary and uncontrolled availability of workers. We formulate skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlations between workers. We show that the correlation matrix can be successfully recovered and skills are identifiable if and only if the sampling matrix (observed components) does not have a bipartite connected component. We then propose a projected gradient descent scheme and show that skill estimates converge to the desired global optima for such sampling matrices. Our proof is original and the results are surprising in light of the fact that even the weighted rank-one matrix factorization problem is NP-hard in general. Next, we derive sample complexity bounds in terms of spectral properties of the signless Laplacian of the sampling matrix. Our proposed scheme achieves state-of-the-art performance on a number of real-world datasets. |
Tasks | Matrix Completion |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11608v1 |
http://arxiv.org/pdf/1904.11608v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-descent-for-sparse-rank-one-matrix-1 |
Repo | |
Framework | |
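The rank-one completion idea in the abstract above can be illustrated with a small sketch. It assumes the usual single-coin setting in which the correlation between workers i and j equals the product of their skills, and it fits a skill vector to the observed pairs by gradient descent with a projection onto [-1, 1]. The function name, step size, iteration count, and initialization are assumptions for the example, not the paper's scheme.

```python
def complete_rank_one(observed, n, step=0.1, iters=5000, init=0.5):
    """Fit x in [-1, 1]^n so that x[i] * x[j] matches each observed correlation.

    observed maps a worker pair (i, j) with i != j to its observed correlation.
    Hypothetical helper: minimizes the sum of squared residuals over observed pairs.
    """
    x = [init] * n
    for _ in range(iters):
        grad = [0.0] * n
        for (i, j), corr in observed.items():
            residual = x[i] * x[j] - corr
            grad[i] += 2.0 * residual * x[j]
            grad[j] += 2.0 * residual * x[i]
        # Gradient step followed by projection (clipping) onto the box [-1, 1].
        x = [max(-1.0, min(1.0, xi - step * gi)) for xi, gi in zip(x, grad)]
    return x


# Three workers observed pairwise: the sampling graph is a triangle,
# an odd cycle, so it has no bipartite connected component.
skills = [0.8, 0.6, 0.4]
observed = {
    (0, 1): skills[0] * skills[1],
    (0, 2): skills[0] * skills[2],
    (1, 2): skills[1] * skills[2],
}
estimate = complete_rank_one(observed, n=3)
```

Both `x` and `-x` fit the observed products equally well; the positive initialization is what steers this toy run toward the positive skill vector.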
Adversarial-Based Knowledge Distillation for Multi-Model Ensemble and Noisy Data Refinement
Title | Adversarial-Based Knowledge Distillation for Multi-Model Ensemble and Noisy Data Refinement |
Authors | Zhiqiang Shen, Zhankui He, Wanyun Cui, Jiahui Yu, Yutong Zheng, Chenchen Zhu, Marios Savvides |
Abstract | Generic image recognition is a fundamental and fairly important visual problem in computer vision. One of the major challenges of this task is that a single image usually contains multiple objects while the labels are still one-hot; another is that labels annotated by humans are noisy and sometimes missing. In this paper, we focus on tackling these challenges, which accompany two different image recognition problems, multi-model ensemble and noisy data recognition, with a unified framework. As is well known, the best performing deep neural models are usually ensembles of multiple base-level networks, as they can mitigate the variation or noise contained in the dataset. Unfortunately, the space required to store these many networks, and the time required to execute them at runtime, prohibit their use in applications where test sets are large (e.g., ImageNet). In this paper, we present a method for compressing large, complex trained ensembles into a single network, where the knowledge from a variety of trained deep neural networks (DNNs) is distilled and transferred to a single DNN. In order to distill diverse knowledge from different trained (teacher) models, we propose an adversarial-based learning strategy in which we define a block-wise training loss to guide and optimize the predefined student network to recover the knowledge in teacher models, and simultaneously to promote the discriminator network to distinguish teacher vs. student features. Extensive experiments on CIFAR-10/100, SVHN, ImageNet and iMaterialist Challenge Dataset demonstrate the effectiveness of our MEAL method. On ImageNet, our ResNet-50 based MEAL achieves top-1/5 21.79%/5.99% val error, which outperforms the original model by 2.06%/1.14%. On iMaterialist Challenge Dataset, our MEAL obtains a remarkable improvement of top-3 1.15% (official evaluation metric) on a strong baseline model of ResNet-101. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08520v1 |
https://arxiv.org/pdf/1908.08520v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-based-knowledge-distillation-for |
Repo | |
Framework | |
Motion Planning through Demonstration to Deal with Complex Motions in Assembly Process
Title | Motion Planning through Demonstration to Deal with Complex Motions in Assembly Process |
Authors | Yan Wang, Kensuke Harada, Weiwei Wan |
Abstract | Complex and skillful motions in an actual assembly process are challenging for a robot to generate with existing motion planning approaches, because some key poses during human assembly can be too skillful for the robot to realize automatically. To deal with this problem, this paper develops a motion planning method using skillful motions from demonstration, which can be applied to complete a robotic assembly process that includes complex and skillful motions. In order to demonstrate conveniently without redundant third-party devices, we attach augmented reality (AR) markers to the manipulated object to track and capture poses of the object during the human assembly process, which are employed as key poses to execute motion planning by the planner. The derivative of every key pose serves as a criterion to determine the priority of use of key poses in order to accelerate the motion planning. The effectiveness of the presented method is verified through some numerical examples and actual robot experiments. |
Tasks | Motion Planning |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01821v1 |
https://arxiv.org/pdf/1910.01821v1.pdf | |
PWC | https://paperswithcode.com/paper/motion-planning-through-demonstration-to-deal |
Repo | |
Framework | |
S-DOD-CNN: Doubly Injecting Spatially-Preserved Object Information for Event Recognition
Title | S-DOD-CNN: Doubly Injecting Spatially-Preserved Object Information for Event Recognition |
Authors | Hyungtae Lee, Sungmin Eum, Heesung Kwon |
Abstract | We present a novel event recognition approach called Spatially-preserved Doubly-injected Object Detection CNN (S-DOD-CNN), which incorporates the spatially preserved object detection information in both a direct and an indirect way. Indirect injection is carried out by simply sharing the weights between the object detection modules and the event recognition module. Meanwhile, our novelty lies in the fact that we have preserved the spatial information for the direct injection. Once multiple regions-of-interest (RoIs) are acquired, their feature maps are computed and then projected onto a spatially-preserving combined feature map using one of the four RoI Projection approaches we present. In our architecture, combined feature maps are generated for object detection which are directly injected to the event recognition module. Our method provides the state-of-the-art accuracy for malicious event recognition. |
Tasks | Object Detection |
Published | 2019-02-11 |
URL | https://arxiv.org/abs/1902.04051v2 |
https://arxiv.org/pdf/1902.04051v2.pdf | |
PWC | https://paperswithcode.com/paper/s-dod-cnn-doubly-injecting-spatially |
Repo | |
Framework | |
The importance of space and time in neuromorphic cognitive agents
Title | The importance of space and time in neuromorphic cognitive agents |
Authors | Giacomo Indiveri, Yulia Sandamirskaya |
Abstract | Artificial neural networks and computational neuroscience models have made tremendous progress, allowing computers to achieve impressive results in artificial intelligence (AI) applications, such as image recognition, natural language processing, or autonomous driving. Despite this remarkable progress, biological neural systems consume orders of magnitude less energy than today’s artificial neural networks and are much more agile and adaptive. This efficiency and adaptivity gap is partially explained by the computing substrate of biological neural processing systems that is fundamentally different from the way today’s computers are built. Biological systems use in-memory computing elements operating in a massively parallel way rather than time-multiplexed computing units that are reused in a sequential fashion. Moreover, activity of biological neurons follows continuous-time dynamics in real, physical time, instead of operating on discrete temporal cycles abstracted away from real-time. Here, we present neuromorphic processing devices that emulate the biological style of processing by using parallel instances of mixed-signal analog/digital circuits that operate in real time. We argue that this approach brings significant advantages in efficiency of computation. We show examples of embodied neuromorphic agents that use such devices to interact with the environment and exhibit autonomous learning. |
Tasks | Autonomous Driving |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09791v1 |
http://arxiv.org/pdf/1902.09791v1.pdf | |
PWC | https://paperswithcode.com/paper/the-importance-of-space-and-time-in |
Repo | |
Framework | |
Entity Extraction with Knowledge from Web Scale Corpora
Title | Entity Extraction with Knowledge from Web Scale Corpora |
Authors | Zeyi Wen, Zeyu Huang, Rui Zhang |
Abstract | Entity extraction is an important task in text mining and natural language processing. A popular method for entity extraction is to compare substrings from free text against a dictionary of entities. In this paper, we present several techniques as a post-processing step for improving the effectiveness of this existing entity extraction technique. These techniques utilise models trained with web-scale corpora, which makes our techniques robust and versatile. Experiments show that our techniques bring a notable improvement in efficiency and effectiveness. |
Tasks | Entity Extraction |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09373v1 |
https://arxiv.org/pdf/1911.09373v1.pdf | |
PWC | https://paperswithcode.com/paper/entity-extraction-with-knowledge-from-web |
Repo | |
Framework | |