January 27, 2020


Paper Group ANR 1321

Attention Guided Anomaly Localization in Images. Adversarial FDI Attack against AC State Estimation with ANN. Object-Centric Task and Motion Planning in Dynamic Environments. Texture Underfitting for Domain Adaptation. Towards White-box Benchmarks for Algorithm Control. Experience Replay Optimization. Deep Parametric Indoor Lighting Estimation. Who …

Attention Guided Anomaly Localization in Images


Title	Attention Guided Anomaly Localization in Images
Authors	Shashanka Venkataramanan, Kuan-Chuan Peng, Rajat Vikram Singh, Abhijit Mahalanobis
Abstract	Anomaly localization is an important problem in computer vision which involves localizing anomalous regions within images with applications in industrial inspection, surveillance, and medical imaging. This task is challenging due to the small sample size and pixel coverage of the anomaly in real-world scenarios. Most prior works need to use anomalous training images to compute a class-specific threshold to localize anomalies. Without the need of anomalous training images, we propose Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA), which localizes the anomaly with a convolutional latent variable to preserve the spatial information. In the unsupervised setting, we propose an attention expansion loss where we encourage CAVGA to focus on all normal regions in the image. Furthermore, in the weakly-supervised setting we propose a complementary guided attention loss, where we encourage the attention map to focus on all normal regions while minimizing the attention map corresponding to anomalous regions in the image. CAVGA outperforms the state-of-the-art (SOTA) anomaly localization methods on MVTec Anomaly Detection (MVTAD), modified ShanghaiTech Campus (mSTC) and Large-scale Attention based Glaucoma (LAG) datasets in the unsupervised setting and when using only 2% anomalous images in the weakly-supervised setting. CAVGA also outperforms SOTA anomaly detection methods on the MNIST, CIFAR-10, Fashion-MNIST, MVTAD, mSTC and LAG datasets.
Tasks	Anomaly Detection
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08616v2
PDF	https://arxiv.org/pdf/1911.08616v2.pdf
PWC	https://paperswithcode.com/paper/attention-guided-anomaly-detection-and
Repo
Framework

Adversarial FDI Attack against AC State Estimation with ANN


Title	Adversarial FDI Attack against AC State Estimation with ANN
Authors	Tian Liu, Tao Shu
Abstract	Artificial neural network (ANN) provides superior accuracy for nonlinear alternating current (AC) state estimation (SE) in smart grid over traditional methods. However, research has discovered that ANN could be easily fooled by adversarial examples. In this paper, we initiate a new study of adversarial false data injection (FDI) attack against AC SE with ANN: by injecting a deliberate attack vector into measurements, the attacker can degrade the accuracy of ANN SE while remaining undetected. We propose a population-based algorithm and a gradient-based algorithm to generate attack vectors. The performance of these algorithms is evaluated through simulations on IEEE 9-bus, 14-bus and 30-bus systems under various attack scenarios. Simulation results show that the population-based algorithm, differential evolution (DE), is more effective than the gradient-based algorithm, sequential least squares programming (SLSQP), on all simulation cases. The attack examples generated by the DE algorithm successfully degrade the ANN SE accuracy with high probability.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11328v1
PDF	https://arxiv.org/pdf/1906.11328v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-fdi-attack-against-ac-state
Repo
Framework

Object-Centric Task and Motion Planning in Dynamic Environments


Title	Object-Centric Task and Motion Planning in Dynamic Environments
Authors	Toki Migimatsu, Jeannette Bohg
Abstract	We address the problem of applying Task and Motion Planning (TAMP) in real world environments. TAMP combines symbolic and geometric reasoning to produce sequential manipulation plans, typically specified as joint-space trajectories, which are valid only as long as the environment is static and perception and control are highly accurate. In case of any changes in the environment, slow re-planning is required. We propose a TAMP algorithm that optimizes over Cartesian frames defined relative to target objects. The resulting plan then remains valid even if the objects are moving and can be executed by reactive controllers that adapt to these changes in real time. We apply our TAMP framework to a torque-controlled robot in a pick and place setting and demonstrate its ability to adapt to changing environments, inaccurate perception, and imprecise control, both in simulation and the real world.
Tasks	Motion Planning
Published	2019-11-12
URL	https://arxiv.org/abs/1911.04679v2
PDF	https://arxiv.org/pdf/1911.04679v2.pdf
PWC	https://paperswithcode.com/paper/object-centric-task-and-motion-planning-in
Repo
Framework

Texture Underfitting for Domain Adaptation


Title	Texture Underfitting for Domain Adaptation
Authors	Jan-Nico Zaech, Dengxin Dai, Martin Hahner, Luc Van Gool
Abstract	Comprehensive semantic segmentation is one of the key components for robust scene understanding and a requirement to enable autonomous driving. Driven by large scale datasets, convolutional neural networks show impressive results on this task. However, a segmentation algorithm generalizing to various scenes and conditions would require an enormously diverse dataset, making the labour intensive data acquisition and labeling process prohibitively expensive. Under the assumption of structural similarities between segmentation maps, domain adaptation promises to resolve this challenge by transferring knowledge from existing, potentially simulated datasets to new environments where no supervision exists. While the performance of this approach is contingent on the concept that neural networks learn a high level understanding of scene structure, recent work suggests that neural networks are biased towards overfitting to texture instead of learning structural and shape information. Considering the ideas underlying semantic segmentation, we employ random image stylization to augment the training dataset and propose a training procedure that facilitates texture underfitting to improve the performance of domain adaptation. In experiments with supervised as well as unsupervised methods for the task of synthetic-to-real domain adaptation, we show that our approach outperforms conventional training methods.
Tasks	Autonomous Driving, Domain Adaptation, Image Stylization, Scene Understanding, Semantic Segmentation
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11215v1
PDF	https://arxiv.org/pdf/1908.11215v1.pdf
PWC	https://paperswithcode.com/paper/texture-underfitting-for-domain-adaptation
Repo
Framework

Towards White-box Benchmarks for Algorithm Control


Title	Towards White-box Benchmarks for Algorithm Control
Authors	André Biedenkapp, H. Furkan Bozkurt, Frank Hutter, Marius Lindauer
Abstract	The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on tuned hyperparameter configurations. Automated methods have been proposed to relieve users of the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential in adjusting an algorithm’s hyperparameters online, since different hyperparameters are potentially optimal at different stages of the algorithm. We formulate the problem of adjusting an algorithm’s hyperparameters for a given instance on the fly as a contextual MDP, making reinforcement learning (RL) the prime candidate to solve the resulting algorithm control problem in a data-driven way. Furthermore, inspired by applications of algorithm configuration, we introduce new white-box benchmarks suitable to study algorithm control. We show that on short sequences, algorithm configuration is a valid choice, but that with increasing sequence length a black-box view on the problem quickly becomes infeasible and RL performs better.
Tasks
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07644v2
PDF	https://arxiv.org/pdf/1906.07644v2.pdf
PWC	https://paperswithcode.com/paper/towards-white-box-benchmarks-for-algorithm
Repo
Framework

Experience Replay Optimization


Title	Experience Replay Optimization
Authors	Daochen Zha, Kwei-Herng Lai, Kaixiong Zhou, Xia Hu
Abstract	Experience replay enables reinforcement learning agents to memorize and reuse past experiences, just as humans replay memories for the situation at hand. Contemporary off-policy algorithms either replay past experiences uniformly or utilize a rule-based replay strategy, which may be sub-optimal. In this work, we consider learning a replay policy to optimize the cumulative reward. Replay learning is challenging because the replay memory is noisy and large, and the cumulative reward is unstable. To address these issues, we propose a novel experience replay optimization (ERO) framework which alternately updates two policies: the agent policy, and the replay policy. The agent is updated to maximize the cumulative reward based on the replayed data, while the replay policy is updated to provide the agent with the most useful experiences. The conducted experiments on various continuous control tasks demonstrate the effectiveness of ERO, empirically showing promise in experience replay learning to improve the performance of off-policy reinforcement learning algorithms.
Tasks	Continuous Control
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08387v1
PDF	https://arxiv.org/pdf/1906.08387v1.pdf
PWC	https://paperswithcode.com/paper/experience-replay-optimization
Repo
Framework
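
The contrast drawn in the abstract above, between replaying past experiences uniformly and replaying them according to some preference, can be sketched in a few lines. This is a minimal illustration and not the ERO method itself: the `ReplayBuffer` class and the per-transition `score` (a stand-in for whatever a learned replay policy would output) are assumptions introduced here.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity buffer of (transition, score) pairs.

    `score` is a hypothetical stand-in for the output of a learned replay
    policy; the paper's actual policy and update rule are not reproduced here.
    """

    def __init__(self, capacity, seed=0):
        self.items = deque(maxlen=capacity)  # oldest entries are evicted first
        self.rng = random.Random(seed)

    def add(self, transition, score=1.0):
        self.items.append((transition, score))

    def sample_uniform(self, k):
        # Baseline: every stored transition is equally likely.
        return [t for t, _ in self.rng.choices(list(self.items), k=k)]

    def sample_scored(self, k):
        # Transitions are drawn with probability proportional to their score.
        transitions = [t for t, _ in self.items]
        weights = [s for _, s in self.items]
        return self.rng.choices(transitions, weights=weights, k=k)


buf = ReplayBuffer(capacity=3)
for step in range(5):
    # Toy transitions; only the latest three survive eviction.
    buf.add(("state", step), score=0.0 if step < 4 else 1.0)

uniform_batch = buf.sample_uniform(4)
scored_batch = buf.sample_scored(4)
```

In this toy run only the newest transition has a non-zero score, so `sample_scored` returns it every time, while `sample_uniform` draws from all three stored transitions.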

Deep Parametric Indoor Lighting Estimation


Title	Deep Parametric Indoor Lighting Estimation
Authors	Marc-André Gardner, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Christian Gagné, Jean-François Lalonde
Abstract	We present a method to estimate lighting from a single image of an indoor scene. Previous work has used an environment map representation that does not account for the localized nature of indoor lighting. Instead, we represent lighting as a set of discrete 3D lights with geometric and photometric parameters. We train a deep neural network to regress these parameters from a single image, on a dataset of environment maps annotated with depth. We propose a differentiable layer to convert these parameters to an environment map to compute our loss; this bypasses the challenge of establishing correspondences between estimated and ground truth lights. We demonstrate, via quantitative and qualitative evaluations, that our representation and training scheme lead to more accurate results compared to previous work, while allowing for more realistic 3D object compositing with spatially-varying lighting.
Tasks
Published	2019-10-19
URL	https://arxiv.org/abs/1910.08812v1
PDF	https://arxiv.org/pdf/1910.08812v1.pdf
PWC	https://paperswithcode.com/paper/deep-parametric-indoor-lighting-estimation
Repo
Framework

Who wants accurate models? Arguing for a different metrics to take classification models seriously


Title	Who wants accurate models? Arguing for a different metrics to take classification models seriously
Authors	Federico Cabitza, Andrea Campagner
Abstract	With the increasing availability of AI-based decision support, there is an increasing need for their certification by both AI manufacturers and notified bodies, as well as the pragmatic (real-world) validation of these systems. Therefore, there is the need for meaningful and informative ways to assess the performance of AI systems in clinical practice. Common metrics (like accuracy scores and areas under the ROC curve) have known problems and they do not take into account important information about the preferences of clinicians and the needs of their specialist practice, like the likelihood and impact of errors and the complexity of cases. In this paper, we present a new accuracy measure, the H-accuracy (Ha), which we claim is more informative in the medical domain (and others of similar needs) for the elements it encompasses. We also provide proof that the H-accuracy is a generalization of the balanced accuracy and establish a relation between the H-accuracy and the Net Benefit. Finally, we illustrate an experimentation in two user studies to show the descriptive power of the Ha score and how complementary and differently informative measures can be derived from its formulation (a Python script to compute Ha is also made available).
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09246v2
PDF	https://arxiv.org/pdf/1910.09246v2.pdf
PWC	https://paperswithcode.com/paper/who-wants-accurate-models-arguing-for-a
Repo
Framework
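
The abstract above relates the H-accuracy to balanced accuracy. As a reference point only, the sketch below compares plain accuracy with balanced accuracy (the mean of per-class recall) on toy, imbalanced labels. It does not implement Ha; the function names and the data are assumptions made for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(y_pred[i] == label for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy imbalanced data: 8 negatives, 2 positives; the classifier always says 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)            # 8 of 10 correct
bal = balanced_accuracy(y_true, y_pred)   # recall 1.0 on class 0, 0.0 on class 1
```

Here plain accuracy is 0.8 even though every positive case is missed, while balanced accuracy drops to 0.5, which is one of the known problems with common metrics that the abstract alludes to.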

The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation


Title	The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation
Authors	Hermann Blum, Paul-Edouard Sarlin, Juan Nieto, Roland Siegwart, Cesar Cadena
Abstract	Deep learning has enabled impressive progress in the accuracy of semantic segmentation. Yet, the ability to estimate uncertainty and detect failure is key for safety-critical applications like autonomous driving. Existing uncertainty estimates have mostly been evaluated on simple tasks, and it is unclear whether these methods generalize to more complex scenarios. We present Fishyscapes, the first public benchmark for uncertainty estimation in a real-world task of semantic segmentation for urban driving. It evaluates pixel-wise uncertainty estimates towards the detection of anomalous objects in front of the vehicle. We adapt state-of-the-art methods to recent semantic segmentation models and compare approaches based on softmax confidence, Bayesian learning, and embedding density. Our results show that anomaly detection is far from solved even for ordinary situations, while our benchmark allows measuring advancements beyond the state-of-the-art.
Tasks	Anomaly Detection, Autonomous Driving, Semantic Segmentation
Published	2019-04-05
URL	https://arxiv.org/abs/1904.03215v3
PDF	https://arxiv.org/pdf/1904.03215v3.pdf
PWC	https://paperswithcode.com/paper/the-fishyscapes-benchmark-measuring-blind
Repo
Framework

Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers


Title	Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers
Authors	Yao Ma, Alex Olshevsky, Venkatesh Saligrama, Csaba Szepesvari
Abstract	We consider worker skill estimation for the single-coin Dawid-Skene crowdsourcing model. In practice, skill estimation is challenging because worker assignments are sparse and irregular due to the arbitrary and uncontrolled availability of workers. We formulate skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlations between workers. We show that the correlation matrix can be successfully recovered and skills are identifiable if and only if the sampling matrix (observed components) does not have a bipartite connected component. We then propose a projected gradient descent scheme and show that skill estimates converge to the desired global optima for such sampling matrices. Our proof is original and the results are surprising in light of the fact that even the weighted rank-one matrix factorization problem is NP-hard in general. Next, we derive sample complexity bounds in terms of spectral properties of the signless Laplacian of the sampling matrix. Our proposed scheme achieves state-of-the-art performance on a number of real-world datasets.
Tasks	Matrix Completion
Published	2019-04-25
URL	http://arxiv.org/abs/1904.11608v1
PDF	http://arxiv.org/pdf/1904.11608v1.pdf
PWC	https://paperswithcode.com/paper/gradient-descent-for-sparse-rank-one-matrix-1
Repo
Framework
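
The rank-one completion idea in the abstract above can be illustrated on a tiny noiseless example. The sketch assumes that each worker has a skill in [-1, 1] and that the observed correlation for a pair of workers is the product of their skills; the squared-error loss, step size, iteration count, initial guess, and the function name `complete_rank_one` are all choices made here, not details taken from the paper.

```python
def complete_rank_one(observed, n, step=0.05, iters=20000):
    """Fit x so that x[i] * x[j] matches each observed correlation.

    observed: dict mapping worker pairs (i, j), i != j, to s_i * s_j.
    Each iterate is clipped to [-1, 1] (the projection step).
    """
    x = [0.5] * n  # hypothetical positive initial guess
    for _ in range(iters):
        grad = [0.0] * n
        for (i, j), c in observed.items():
            r = x[i] * x[j] - c          # residual on this observed entry
            grad[i] += 2.0 * r * x[j]
            grad[j] += 2.0 * r * x[i]
        # Gradient step followed by projection onto the box [-1, 1]^n.
        x = [max(-1.0, min(1.0, xi - step * g)) for xi, g in zip(x, grad)]
    return x


true_skill = [0.8, 0.6, 0.7, 0.5]
# Workers 0, 1, 2 form a triangle (an odd cycle), so the observation graph is
# connected and not bipartite; worker 3 is observed only together with worker 2.
pairs = [(0, 1), (1, 2), (0, 2), (2, 3)]
observed = {(i, j): true_skill[i] * true_skill[j] for i, j in pairs}

estimate = complete_rank_one(observed, n=4)
```

Only four of the six worker pairs are observed, yet the skills are pinned down because the observation graph contains an odd cycle. A product model can only ever determine skills up to a global sign, and the positive starting point selects the positive solution here.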

Adversarial-Based Knowledge Distillation for Multi-Model Ensemble and Noisy Data Refinement


Title	Adversarial-Based Knowledge Distillation for Multi-Model Ensemble and Noisy Data Refinement
Authors	Zhiqiang Shen, Zhankui He, Wanyun Cui, Jiahui Yu, Yutong Zheng, Chenchen Zhu, Marios Savvides
Abstract	Generic image recognition is a fundamental and important problem in computer vision. One major challenge of this task is that a single image usually contains multiple objects while its label is still one-hot; another is that labels annotated by humans are noisy and sometimes missing. In this paper, we tackle these challenges, which accompany two different image recognition problems, multi-model ensemble and noisy data recognition, with a unified framework. As is well known, the best-performing deep neural models are usually ensembles of multiple base-level networks, as ensembling can mitigate the variation or noise contained in the dataset. Unfortunately, the space required to store these many networks, and the time required to execute them at runtime, prohibit their use in applications where test sets are large (e.g., ImageNet). In this paper, we present a method for compressing large, complex trained ensembles into a single network, where the knowledge from a variety of trained deep neural networks (DNNs) is distilled and transferred to a single DNN. In order to distill diverse knowledge from different trained (teacher) models, we propose an adversarial-based learning strategy in which we define a block-wise training loss to guide and optimize the predefined student network to recover the knowledge in the teacher models, and simultaneously to promote the discriminator network to distinguish teacher from student features. Extensive experiments on CIFAR-10/100, SVHN, ImageNet and iMaterialist Challenge Dataset demonstrate the effectiveness of our MEAL method. On ImageNet, our ResNet-50 based MEAL achieves top-1/5 21.79%/5.99% val error, which outperforms the original model by 2.06%/1.14%. On iMaterialist Challenge Dataset, our MEAL obtains a remarkable improvement of top-3 1.15% (official evaluation metric) on a strong baseline model of ResNet-101.
Tasks
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08520v1
PDF	https://arxiv.org/pdf/1908.08520v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-based-knowledge-distillation-for
Repo
Framework

Motion Planning through Demonstration to Deal with Complex Motions in Assembly Process


Title	Motion Planning through Demonstration to Deal with Complex Motions in Assembly Process
Authors	Yan Wang, Kensuke Harada, Weiwei Wan
Abstract	Complex and skillful motions in an actual assembly process are challenging for a robot to generate with existing motion planning approaches, because some key poses during human assembly can be too skillful for the robot to realize automatically. To deal with this problem, this paper develops a motion planning method using skillful motions from demonstration, which can be applied to complete a robotic assembly process that includes complex and skillful motions. To demonstrate conveniently without redundant third-party devices, we attach augmented reality (AR) markers to the manipulated object to track and capture its poses during the human assembly process; these poses are employed as key poses by the planner. The derivative of every key pose serves as the criterion for prioritizing key poses in order to accelerate motion planning. The effectiveness of the presented method is verified through numerical examples and actual robot experiments.
Tasks	Motion Planning
Published	2019-10-04
URL	https://arxiv.org/abs/1910.01821v1
PDF	https://arxiv.org/pdf/1910.01821v1.pdf
PWC	https://paperswithcode.com/paper/motion-planning-through-demonstration-to-deal
Repo
Framework

S-DOD-CNN: Doubly Injecting Spatially-Preserved Object Information for Event Recognition


Title	S-DOD-CNN: Doubly Injecting Spatially-Preserved Object Information for Event Recognition
Authors	Hyungtae Lee, Sungmin Eum, Heesung Kwon
Abstract	We present a novel event recognition approach called Spatially-preserved Doubly-injected Object Detection CNN (S-DOD-CNN), which incorporates the spatially preserved object detection information in both a direct and an indirect way. Indirect injection is carried out by simply sharing the weights between the object detection modules and the event recognition module. Meanwhile, our novelty lies in the fact that we have preserved the spatial information for the direct injection. Once multiple regions-of-interest (RoIs) are acquired, their feature maps are computed and then projected onto a spatially-preserving combined feature map using one of the four RoI Projection approaches we present. In our architecture, combined feature maps are generated for object detection and are directly injected into the event recognition module. Our method provides state-of-the-art accuracy for malicious event recognition.
Tasks	Object Detection
Published	2019-02-11
URL	https://arxiv.org/abs/1902.04051v2
PDF	https://arxiv.org/pdf/1902.04051v2.pdf
PWC	https://paperswithcode.com/paper/s-dod-cnn-doubly-injecting-spatially
Repo
Framework

The importance of space and time in neuromorphic cognitive agents


Title	The importance of space and time in neuromorphic cognitive agents
Authors	Giacomo Indiveri, Yulia Sandamirskaya
Abstract	Artificial neural networks and computational neuroscience models have made tremendous progress, allowing computers to achieve impressive results in artificial intelligence (AI) applications, such as image recognition, natural language processing, or autonomous driving. Despite this remarkable progress, biological neural systems consume orders of magnitude less energy than today’s artificial neural networks and are much more agile and adaptive. This efficiency and adaptivity gap is partially explained by the computing substrate of biological neural processing systems that is fundamentally different from the way today’s computers are built. Biological systems use in-memory computing elements operating in a massively parallel way rather than time-multiplexed computing units that are reused in a sequential fashion. Moreover, activity of biological neurons follows continuous-time dynamics in real, physical time, instead of operating on discrete temporal cycles abstracted away from real-time. Here, we present neuromorphic processing devices that emulate the biological style of processing by using parallel instances of mixed-signal analog/digital circuits that operate in real time. We argue that this approach brings significant advantages in efficiency of computation. We show examples of embodied neuromorphic agents that use such devices to interact with the environment and exhibit autonomous learning.
Tasks	Autonomous Driving
Published	2019-02-26
URL	http://arxiv.org/abs/1902.09791v1
PDF	http://arxiv.org/pdf/1902.09791v1.pdf
PWC	https://paperswithcode.com/paper/the-importance-of-space-and-time-in
Repo
Framework

Entity Extraction with Knowledge from Web Scale Corpora


Title	Entity Extraction with Knowledge from Web Scale Corpora
Authors	Zeyi Wen, Zeyu Huang, Rui Zhang
Abstract	Entity extraction is an important task in text mining and natural language processing. A popular method for entity extraction is by comparing substrings from free text against a dictionary of entities. In this paper, we present several techniques as a post-processing step for improving the effectiveness of the existing entity extraction technique. These techniques utilise models trained with the web-scale corpora which makes our techniques robust and versatile. Experiments show that our techniques bring a notable improvement on efficiency and effectiveness.
Tasks	Entity Extraction
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09373v1
PDF	https://arxiv.org/pdf/1911.09373v1.pdf
PWC	https://paperswithcode.com/paper/entity-extraction-with-knowledge-from-web
Repo
Framework
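
The dictionary-based approach that the abstract above describes as the popular baseline, comparing substrings of free text against a dictionary of entities, can be sketched as follows. The greedy longest-match strategy, the whitespace tokenization, and the name `extract_entities` are assumptions made for illustration; the paper's post-processing techniques are not shown.

```python
def extract_entities(text, dictionary, max_len=3):
    """Greedy longest-match lookup of token n-grams in an entity dictionary."""
    tokens = text.lower().split()
    entities = {e.lower() for e in dictionary}
    found, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first so "new york city" beats "new york".
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            if candidate in entities:
                found.append((i, candidate))
                i += size
                break
        else:
            i += 1  # no entity starts at this token
    return found


dictionary = ["New York", "New York City", "Paris"]
matches = extract_entities("flights from paris to new york city today", dictionary)
```

Each match is returned as a pair of token offset and matched entity, here `(2, "paris")` and `(4, "new york city")`. Exact lookup of this kind misses misspellings and ambiguous mentions, which is the sort of gap a post-processing step could address.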