Paper Group ANR 45
Context Spaces as the Cornerstone of a Near-Transparent & Self-Reorganizing Semantic Desktop. Semantic Image Retrieval by Uniting Deep Neural Networks and Cognitive Architectures. Semi-parametric Image Synthesis. Transparency and Explanation in Deep Reinforcement Learning Neural Networks. Oblique Stripe Removal in Remote Sensing Images via Oriented …
Context Spaces as the Cornerstone of a Near-Transparent & Self-Reorganizing Semantic Desktop
Title | Context Spaces as the Cornerstone of a Near-Transparent & Self-Reorganizing Semantic Desktop |
Authors | Christian Jilek, Markus Schröder, Sven Schwarz, Heiko Maus, Andreas Dengel |
Abstract | Existing Semantic Desktops are still reproached for being too complicated to use or not scaling well. Moreover, a real “killer app” is still missing. In this paper, we present a new prototype inspired by NEPOMUK and its successors, with a semantic graph and ontologies as its basis. In addition, we introduce the idea of context spaces that users can directly interact with and work on. To make them available in all applications without additional effort, the system is transparently integrated using mostly standard protocols, complemented by a sidebar for advanced features. By exploiting collected context information and applying Managed Forgetting features (like hiding, condensation or deletion), the system is able to dynamically reorganize itself, which also includes a kind of tidy-up-itself functionality. We therefore expect it to be more scalable while providing new levels of user support. An early prototype has been implemented and is presented in this demo. |
Tasks | |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.02181v1 |
PDF | http://arxiv.org/pdf/1805.02181v1.pdf |
PWC | https://paperswithcode.com/paper/context-spaces-as-the-cornerstone-of-a-near |
Repo | |
Framework | |
Semantic Image Retrieval by Uniting Deep Neural Networks and Cognitive Architectures
Title | Semantic Image Retrieval by Uniting Deep Neural Networks and Cognitive Architectures |
Authors | Alexey Potapov, Innokentii Zhdanov, Oleg Scherbakov, Nikolai Skorobogatko, Hugo Latapie, Enzo Fenoglio |
Abstract | Image and video retrieval by their semantic content has been an important and challenging task for years, because it ultimately requires bridging the symbolic/subsymbolic gap. Recent successes in deep learning enabled detection of objects belonging to many classes greatly outperforming traditional computer vision techniques. However, deep learning solutions capable of executing retrieval queries are still not available. We propose a hybrid solution consisting of a deep neural network for object detection and a cognitive architecture for query execution. Specifically, we use YOLOv2 and OpenCog. Queries allowing the retrieval of video frames containing objects of specified classes and specified spatial arrangement are implemented. |
Tasks | Image Retrieval, Object Detection, Video Retrieval |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.06946v1 |
PDF | http://arxiv.org/pdf/1806.06946v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-image-retrieval-by-uniting-deep |
Repo | |
Framework | |
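The paper couples YOLOv2 detections with OpenCog for query execution. The sketch below illustrates only the query side on plain Python data structures: detections are hypothetical dicts of the kind a YOLOv2 wrapper might emit, and the spatial predicate stands in for the Atomspace query the real system would run.

```python
# Hypothetical sketch: retrieve video frames containing objects of given
# classes in a given spatial arrangement (e.g. "a person left of a dog").
# Detections are assumed dicts; the real system executes such queries in OpenCog.

def left_of(a, b):
    """True if box a lies entirely to the left of box b (x ranges don't overlap)."""
    return a["x"] + a["w"] <= b["x"]

def frame_matches(detections, cls_a, cls_b, relation):
    """Check whether any pair (object of cls_a, object of cls_b) satisfies the relation."""
    return any(
        relation(da, db)
        for da in detections if da["label"] == cls_a
        for db in detections if db["label"] == cls_b
    )

def retrieve(frames, cls_a, cls_b, relation=left_of):
    """Return indices of frames satisfying the query."""
    return [i for i, dets in enumerate(frames) if frame_matches(dets, cls_a, cls_b, relation)]

if __name__ == "__main__":
    frames = [
        [{"label": "person", "x": 10, "y": 5, "w": 40, "h": 80},
         {"label": "dog", "x": 70, "y": 50, "w": 30, "h": 25}],
        [{"label": "dog", "x": 5, "y": 50, "w": 30, "h": 25}],
    ]
    print(retrieve(frames, "person", "dog"))  # -> [0]
```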
Semi-parametric Image Synthesis
Title | Semi-parametric Image Synthesis |
Authors | Xiaojuan Qi, Qifeng Chen, Jiaya Jia, Vladlen Koltun |
Abstract | We present a semi-parametric approach to photographic image synthesis from semantic layouts. The approach combines the complementary strengths of parametric and nonparametric techniques. The nonparametric component is a memory bank of image segments constructed from a training set of images. Given a novel semantic layout at test time, the memory bank is used to retrieve photographic references that are provided as source material to a deep network. The synthesis is performed by a deep network that draws on the provided photographic material. Experiments on multiple semantic segmentation datasets show that the presented approach yields considerably more realistic images than recent purely parametric techniques. The results are shown in the supplementary video at https://youtu.be/U4Q98lenGLQ |
Tasks | Image Generation, Image-to-Image Translation, Semantic Segmentation |
Published | 2018-04-29 |
URL | http://arxiv.org/abs/1804.10992v1 |
PDF | http://arxiv.org/pdf/1804.10992v1.pdf |
PWC | https://paperswithcode.com/paper/semi-parametric-image-synthesis |
Repo | |
Framework | |
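A minimal sketch of the nonparametric retrieval step described in the abstract: given a novel semantic layout, image segments are retrieved from a memory bank by comparing label masks. The IoU scoring and the data layout here are illustrative assumptions, not the authors' implementation, and the subsequent synthesis network is omitted.

```python
import numpy as np

# Hypothetical memory bank: each entry pairs a binary label mask with an RGB crop.
# The real system indexes segments from a training set and feeds the retrieved
# crops to a deep synthesis network; only the retrieval step is shown here.

def iou(mask_a, mask_b):
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 0.0

def retrieve_reference(layout_mask, memory_bank, k=1):
    """Return the k memory-bank crops whose masks best match the query region."""
    scored = sorted(memory_bank, key=lambda e: iou(layout_mask, e["mask"]), reverse=True)
    return [e["crop"] for e in scored[:k]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bank = [{"mask": rng.random((64, 64)) > 0.5,
             "crop": rng.random((64, 64, 3))} for _ in range(10)]
    query = rng.random((64, 64)) > 0.5
    refs = retrieve_reference(query, bank, k=3)
    print(len(refs), refs[0].shape)  # 3 (64, 64, 3)
```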
Transparency and Explanation in Deep Reinforcement Learning Neural Networks
Title | Transparency and Explanation in Deep Reinforcement Learning Neural Networks |
Authors | Rahul Iyer, Yuezhang Li, Huao Li, Michael Lewis, Ramitha Sundar, Katia Sycara |
Abstract | Autonomous AI systems will be entering human society in the near future to provide services and work alongside humans. For those systems to be accepted and trusted, the users should be able to understand the reasoning process of the system, i.e. the system should be transparent. System transparency enables humans to form coherent explanations of the system’s decisions and actions. Transparency is important not only for user trust, but also for software debugging and certification. In recent years, Deep Neural Networks have made great advances in multiple application areas. However, deep neural networks are opaque. In this paper, we report on work in transparency in Deep Reinforcement Learning Networks (DRLN). Such networks have been extremely successful in accurately learning action control in image input domains, such as Atari games. In this paper, we propose a novel and general method that (a) incorporates explicit object recognition processing into deep reinforcement learning models, (b) forms the basis for the development of “object saliency maps”, to provide visualization of internal states of DRLNs, thus enabling the formation of explanations and (c) can be incorporated in any existing deep reinforcement learning framework. We present computational results and human experiments to evaluate our approach. |
Tasks | Atari Games, Object Recognition |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06061v1 |
PDF | http://arxiv.org/pdf/1809.06061v1.pdf |
PWC | https://paperswithcode.com/paper/transparency-and-explanation-in-deep |
Repo | |
Framework | |
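One way to read the "object saliency map" idea: for each recognized object, measure how much the agent's value estimate changes when that object is masked out of the input. The sketch below assumes a generic `q_values` callable and synthetic detections; it is an illustration of the principle, not the authors' code.

```python
import numpy as np

def object_saliency(frame, detections, q_values, action):
    """For each detected object, the Q-value drop when the object region is masked.

    frame:      HxWxC image array
    detections: list of dicts with 'label' and a bounding box (x, y, w, h)
    q_values:   callable mapping a frame to a vector of Q-values (assumed interface)
    action:     index of the action whose value we want to explain
    """
    baseline = q_values(frame)[action]
    saliencies = {}
    for det in detections:
        x, y, w, h = det["x"], det["y"], det["w"], det["h"]
        masked = frame.copy()
        masked[y:y + h, x:x + w] = frame.mean()  # blank the object with the mean pixel value
        saliencies[det["label"]] = baseline - q_values(masked)[action]
    return saliencies

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.random((84, 84, 3))
    dets = [{"label": "ball", "x": 10, "y": 20, "w": 8, "h": 8}]
    fake_q = lambda f: np.array([f[:40].mean(), f[40:].mean()])  # stand-in Q-function
    print(object_saliency(frame, dets, fake_q, action=0))
```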
Oblique Stripe Removal in Remote Sensing Images via Oriented Variation
Title | Oblique Stripe Removal in Remote Sensing Images via Oriented Variation |
Authors | Xinxin Liu, Xiliang Lu, Huanfeng Shen, Qiangqiang Yuan, Liangpei Zhang |
Abstract | Destriping is a classical problem in remote sensing image processing. Although considerable effort has been made to remove stripes, few of the existing methods can eliminate stripe noise with arbitrary orientations. This makes the removal of oblique stripes in higher-level remote sensing products an unresolved and urgent issue. To overcome this challenging problem, we propose a novel destriping model that is self-adjusted to different orientations of stripe noise. First, the oriented variation model is designed to accomplish the stripe orientation approximation. In this model, the stripe direction is automatically estimated and then embedded into the constraint term to depict the along-stripe smoothness of the stripe component. Building on the oriented variation model, a complete destriping framework is proposed by jointly employing an L1-norm constraint and a TV regularization to separately capture the global distribution property of the stripe component and the piecewise smoothness of the clean image. The qualitative and quantitative experimental results, for both orientation estimation and destriping, confirm the effectiveness and stability of the proposed method. |
Tasks | |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02043v1 |
PDF | http://arxiv.org/pdf/1809.02043v1.pdf |
PWC | https://paperswithcode.com/paper/oblique-stripe-removal-in-remote-sensing |
Repo | |
Framework | |
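A plausible reading of the model described above, written as a single optimization problem. The exact operators and weights are assumptions, since the abstract only names the ingredients: an oriented along-stripe smoothness term, an L1 sparsity term on the stripe component, and TV regularization on the clean image.

```latex
% f: observed image, s: stripe component, f - s: clean image,
% \nabla_{\theta}: derivative along the estimated stripe direction \theta,
% \mathrm{TV}: total variation; weights \lambda_1, \lambda_2 are illustrative.
\min_{s} \;\; \| \nabla_{\theta}\, s \|_1
\;+\; \lambda_1 \| s \|_1
\;+\; \lambda_2 \,\mathrm{TV}(f - s)
```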
Producing radiologist-quality reports for interpretable artificial intelligence
Title | Producing radiologist-quality reports for interpretable artificial intelligence |
Authors | William Gale, Luke Oakden-Rayner, Gustavo Carneiro, Andrew P Bradley, Lyle J Palmer |
Abstract | Current approaches to explaining the decisions of deep learning systems for medical tasks have focused on visualising the elements that have contributed to each decision. We argue that such approaches are not enough to “open the black box” of medical decision making systems because they are missing a key component that has been used as a standard communication tool between doctors for centuries: language. We propose a model-agnostic interpretability method that involves training a simple recurrent neural network model to produce descriptive sentences to clarify the decision of deep learning classifiers. We test our method on the task of detecting hip fractures from frontal pelvic x-rays. This process requires minimal additional labelling despite producing text containing elements that the original deep learning classification model was not specifically trained to detect. The experimental results show that: 1) the sentences produced by our method consistently contain the desired information, 2) the generated sentences are preferred by doctors compared to current tools that create saliency maps, and 3) the combination of visualisations and generated text is better than either alone. |
Tasks | Decision Making |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00340v1 |
PDF | http://arxiv.org/pdf/1806.00340v1.pdf |
PWC | https://paperswithcode.com/paper/producing-radiologist-quality-reports-for |
Repo | |
Framework | |
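A minimal PyTorch sketch of the model-agnostic idea: a small recurrent decoder, conditioned on the frozen classifier's feature vector, emits a descriptive sentence. The vocabulary size, layer sizes, and the way features initialize the decoder state are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ReportDecoder(nn.Module):
    """GRU decoder that turns a classifier's feature vector into a short sentence."""
    def __init__(self, feat_dim=512, vocab_size=1000, emb_dim=128, hid_dim=256):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hid_dim)   # classifier features -> initial hidden state
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, features, tokens):
        # features: (B, feat_dim) from the frozen deep classifier (assumed interface)
        # tokens:   (B, T) teacher-forcing word indices
        h0 = torch.tanh(self.init_h(features)).unsqueeze(0)   # (1, B, hid_dim)
        emb = self.embed(tokens)                              # (B, T, emb_dim)
        out, _ = self.gru(emb, h0)
        return self.out(out)                                  # (B, T, vocab_size) logits

if __name__ == "__main__":
    dec = ReportDecoder()
    feats = torch.randn(2, 512)
    toks = torch.randint(0, 1000, (2, 12))
    print(dec(feats, toks).shape)  # torch.Size([2, 12, 1000])
```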
Direct Sparse Odometry with Rolling Shutter
Title | Direct Sparse Odometry with Rolling Shutter |
Authors | David Schubert, Nikolaus Demmel, Vladyslav Usenko, Jörg Stückler, Daniel Cremers |
Abstract | Neglecting the effects of rolling-shutter cameras for visual odometry (VO) severely degrades accuracy and robustness. In this paper, we propose a novel direct monocular VO method that incorporates a rolling-shutter model. Our approach extends direct sparse odometry which performs direct bundle adjustment of a set of recent keyframe poses and the depths of a sparse set of image points. We estimate the velocity at each keyframe and impose a constant-velocity prior for the optimization. In this way, we obtain a near real-time, accurate direct VO method. Our approach achieves improved results on challenging rolling-shutter sequences over state-of-the-art global-shutter VO. |
Tasks | Visual Odometry |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00558v1 |
PDF | http://arxiv.org/pdf/1808.00558v1.pdf |
PWC | https://paperswithcode.com/paper/direct-sparse-odometry-with-rolling-shutter |
Repo | |
Framework | |
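One way to make the constant-velocity idea concrete (the notation is an assumption, not the paper's exact formulation): within a keyframe, the pose of the row read out at time t is obtained by propagating the keyframe pose with an estimated velocity, and the velocity itself is softly constrained by a prior term in the bundle adjustment.

```latex
% T_i: SE(3) pose of keyframe i at its reference row time t_i,
% v_i \in \mathfrak{se}(3): estimated velocity (twist) of keyframe i;
% a pixel row read out at time t gets the interpolated pose
T_i(t) \;=\; \exp\!\big( (t - t_i)\, \widehat{v_i} \big)\; T_i ,
\qquad
E_{\text{prior}} \;=\; \sum_i \big\| v_i - v_i^{\text{prior}} \big\|^{2} .
```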
Estimation of Individual Treatment Effect in Latent Confounder Models via Adversarial Learning
Title | Estimation of Individual Treatment Effect in Latent Confounder Models via Adversarial Learning |
Authors | Changhee Lee, Nicholas Mastronarde, Mihaela van der Schaar |
Abstract | Estimating the individual treatment effect (ITE) from observational data is essential in medicine. A central challenge in estimating the ITE is handling confounders, which are factors that affect both an intervention and its outcome. Most previous work relies on the unconfoundedness assumption, which posits that all the confounders are measured in the observational data. However, if there are unmeasurable (latent) confounders, then confounding bias is introduced. Fortunately, noisy proxies for the latent confounders are often available and can be used to make an unbiased estimate of the ITE. In this paper, we develop a novel adversarial learning framework to make unbiased estimates of the ITE using noisy proxies. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08943v1 |
PDF | http://arxiv.org/pdf/1811.08943v1.pdf |
PWC | https://paperswithcode.com/paper/estimation-of-individual-treatment-effect-in |
Repo | |
Framework | |
Adversarial Learning-Based On-Line Anomaly Monitoring for Assured Autonomy
Title | Adversarial Learning-Based On-Line Anomaly Monitoring for Assured Autonomy |
Authors | Naman Patel, Apoorva Nandini Saridena, Anna Choromanska, Prashanth Krishnamurthy, Farshad Khorrami |
Abstract | The paper proposes an on-line monitoring framework for continuous real-time safety/security in learning-based control systems, specifically applied to an unmanned ground vehicle. We monitor the validity of mappings from sensor inputs to actuator commands (controller-focused anomaly detection, CFAM) and from actuator commands to sensor inputs (system-focused anomaly detection, SFAM). CFAM is an image-conditioned energy-based generative adversarial network (EBGAN) in which the energy-based discriminator distinguishes between proper and anomalous actuator commands. SFAM is based on an action-conditioned video prediction framework to detect anomalies between the predicted and observed temporal evolution of sensor data. We demonstrate the effectiveness of the approach on our autonomous ground vehicle for indoor environments and on the Udacity dataset for outdoor environments. |
Tasks | Anomaly Detection, Video Prediction |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04539v1 |
PDF | http://arxiv.org/pdf/1811.04539v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-learning-based-on-line-anomaly |
Repo | |
Framework | |
Improving Subseasonal Forecasting in the Western U.S. with Machine Learning
Title | Improving Subseasonal Forecasting in the Western U.S. with Machine Learning |
Authors | Jessica Hwang, Paulo Orenstein, Judah Cohen, Karl Pfeiffer, Lester Mackey |
Abstract | Water managers in the western United States (U.S.) rely on long-term forecasts of temperature and precipitation to prepare for droughts and other wet weather extremes. To improve the accuracy of these long-term forecasts, the U.S. Bureau of Reclamation and the National Oceanic and Atmospheric Administration (NOAA) launched the Subseasonal Climate Forecast Rodeo, a year-long real-time forecasting challenge in which participants aimed to skillfully predict temperature and precipitation in the western U.S. two to four weeks and four to six weeks in advance. Here we present and evaluate our machine learning approach to the Rodeo and release our SubseasonalRodeo dataset, collected to train and evaluate our forecasting system. Our system is an ensemble of two regression models. The first integrates the diverse collection of meteorological measurements and dynamic model forecasts in the SubseasonalRodeo dataset and prunes irrelevant predictors using a customized multitask model selection procedure. The second uses only historical measurements of the target variable (temperature or precipitation) and introduces multitask nearest neighbor features into a weighted local linear regression. Each model alone is significantly more accurate than the debiased operational U.S. Climate Forecasting System (CFSv2), and our ensemble skill exceeds that of the top Rodeo competitor for each target variable and forecast horizon. Moreover, over 2011-2018, an ensemble of our regression models and debiased CFSv2 improves debiased CFSv2 skill by 40-50% for temperature and 129-169% for precipitation. We hope that both our dataset and our methods will help to advance the state of the art in subseasonal forecasting. |
Tasks | Model Selection |
Published | 2018-09-19 |
URL | https://arxiv.org/abs/1809.07394v3 |
PDF | https://arxiv.org/pdf/1809.07394v3.pdf |
PWC | https://paperswithcode.com/paper/improving-subseasonal-forecasting-in-the |
Repo | |
Framework | |
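A minimal sketch of one ingredient named above, a weighted local linear regression: each prediction comes from a least-squares fit in which training samples are weighted by their similarity to the query point. The Gaussian kernel and the stand-in predictors are illustrative assumptions, not the SubseasonalRodeo pipeline or its multitask nearest-neighbor features.

```python
import numpy as np

def local_linear_predict(X_train, y_train, x_query, bandwidth=1.0):
    """Kernel-weighted least-squares fit around a single query point."""
    # Similarity weights: Gaussian kernel on distance to the query point.
    d2 = ((X_train - x_query) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    # Weighted least squares with an intercept column (rows scaled by sqrt of weights).
    A = np.hstack([np.ones((len(X_train), 1)), X_train])
    sw = np.sqrt(w)[:, None]
    beta, *_ = np.linalg.lstsq(A * sw, y_train * sw.ravel(), rcond=None)
    return np.concatenate([[1.0], x_query]) @ beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))          # stand-in predictors (e.g. lagged measurements)
    y = X @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.normal(size=200)
    x0 = np.array([0.2, -0.3, 0.5])
    print(local_linear_predict(X, y, x0, bandwidth=2.0))
```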
Large-scale Speaker Retrieval on Random Speaker Variability Subspace
Title | Large-scale Speaker Retrieval on Random Speaker Variability Subspace |
Authors | Suwon Shon, Younggun Lee, Taesu Kim |
Abstract | This paper describes a fast speaker search system to retrieve segments of the same voice identity in large-scale data. A recent study shows that Locality Sensitive Hashing (LSH), in conjunction with i-vectors, enables quick retrieval of a relevant voice in large-scale data while maintaining accuracy. In this paper, we propose Random Speaker-variability Subspace (RSS) projection to map data into LSH-based hash tables. We hypothesize that, rather than projecting onto a completely random subspace without considering the data, projecting onto a randomly generated speaker-variability subspace gives a higher chance of placing representations of the same speaker into the same hash bins, so that fewer hash tables are needed. Multiple RSS projections can be generated by randomly selecting subsets of speakers from a large speaker cohort. Experimental results show that the proposed approach is 100 times faster than linear search and 7 times faster than standard LSH. |
Tasks | |
Published | 2018-11-27 |
URL | https://arxiv.org/abs/1811.10812v2 |
PDF | https://arxiv.org/pdf/1811.10812v2.pdf |
PWC | https://paperswithcode.com/paper/large-scale-speaker-retrieval-on-random |
Repo | |
Framework | |
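A sketch of the hashing idea, assuming i-vectors as 1-D numpy arrays: instead of purely random LSH hyperplanes, each hash table's projection is built from the i-vectors of a random subset of cohort speakers, and a segment's hash key is the sign pattern of its projection. Dimensions, normalization, and the bucket layout are illustrative assumptions.

```python
import numpy as np
from collections import defaultdict

def make_rss(cohort_ivectors, n_bits, rng):
    """Random Speaker-variability Subspace: projection rows drawn from random cohort speakers."""
    idx = rng.choice(len(cohort_ivectors), size=n_bits, replace=False)
    basis = cohort_ivectors[idx]
    return basis / np.linalg.norm(basis, axis=1, keepdims=True)

def hash_key(ivector, basis):
    """Sign pattern of the projection onto the subspace, packed into a hashable tuple."""
    return tuple((basis @ ivector > 0).astype(int))

def build_tables(ivectors, cohort, n_tables=4, n_bits=16, seed=0):
    rng = np.random.default_rng(seed)
    tables = []
    for _ in range(n_tables):
        basis = make_rss(cohort, n_bits, rng)
        buckets = defaultdict(list)
        for i, iv in enumerate(ivectors):
            buckets[hash_key(iv, basis)].append(i)
        tables.append((basis, buckets))
    return tables

def query(ivector, tables):
    """Union of candidates that collide with the query in any table."""
    cands = set()
    for basis, buckets in tables:
        cands.update(buckets.get(hash_key(ivector, basis), []))
    return cands

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cohort = rng.normal(size=(500, 400))    # cohort speaker i-vectors (assumed dim 400)
    db = rng.normal(size=(10000, 400))      # segments to index
    tables = build_tables(db, cohort)
    print(len(query(db[0], tables)))        # db[0] collides with itself at least
```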
Sounderfeit: Cloning a Physical Model with Conditional Adversarial Autoencoders
Title | Sounderfeit: Cloning a Physical Model with Conditional Adversarial Autoencoders |
Authors | Stephen Sinclair |
Abstract | An adversarial autoencoder conditioned on known parameters of a physical-modeling bowed-string synthesizer is evaluated for use in parameter estimation and resynthesis tasks. Latent dimensions are provided to capture variance not explained by the conditional parameters. Results are compared with and without the adversarial training, and a system capable of “copying” a given bidirectional parameter-signal relationship is examined. A real-time synthesis system built on a generative, conditioned and regularized neural network is presented, making it possible to construct engaging sound synthesizers based purely on recorded data. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.08008v1 |
PDF | http://arxiv.org/pdf/1802.08008v1.pdf |
PWC | https://paperswithcode.com/paper/sounderfeit-cloning-a-physical-model-with |
Repo | |
Framework | |
GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
Title | GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training |
Authors | Mingchao Yu, Zhifeng Lin, Krishna Narra, Songze Li, Youjie Li, Nam Sung Kim, Alexander Schwing, Murali Annavaram, Salman Avestimehr |
Abstract | Data parallelism can boost the training speed of convolutional neural networks (CNN), but could suffer from significant communication costs caused by gradient aggregation. To alleviate this problem, several scalar quantization techniques have been developed to compress the gradients. But these techniques could perform poorly when used together with decentralized aggregation protocols like ring all-reduce (RAR), mainly due to their inability to directly aggregate compressed gradients. In this paper, we empirically demonstrate the strong linear correlations between CNN gradients, and propose a gradient vector quantization technique, named GradiVeQ, to exploit these correlations through principal component analysis (PCA) for substantial gradient dimension reduction. GradiVeQ enables direct aggregation of compressed gradients, hence allows us to build a distributed learning system that parallelizes GradiVeQ gradient compression and RAR communications. Extensive experiments on popular CNNs demonstrate that applying GradiVeQ slashes the wall-clock gradient aggregation time of the original RAR by more than 5X without noticeable accuracy loss, and reduces the end-to-end training time by almost 50%. The results also show that GradiVeQ is compatible with scalar quantization techniques such as QSGD (Quantized SGD), and achieves a much higher speed-up gain under the same compression ratio. |
Tasks | Dimensionality Reduction, Quantization |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03617v2 |
PDF | http://arxiv.org/pdf/1811.03617v2.pdf |
PWC | https://paperswithcode.com/paper/gradiveq-vector-quantization-for-bandwidth |
Repo | |
Framework | |
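The key property claimed above, that compressed gradients can be aggregated directly, follows from the projection being linear: summing each worker's PCA coefficients equals the coefficients of the summed gradient. The sketch below demonstrates this with plain numpy; the basis estimation from a gradient history and the synthetic low-rank gradients are simplified assumptions, not the paper's distributed implementation.

```python
import numpy as np

def fit_pca_basis(gradient_samples, k):
    """Top-k principal directions of previously observed flattened gradients."""
    G = gradient_samples - gradient_samples.mean(axis=0)
    _, _, vt = np.linalg.svd(G, full_matrices=False)
    return vt[:k].T                        # (dim, k) orthonormal columns

def compress(gradient, basis):
    return basis.T @ gradient              # (k,) coefficients

def decompress(coeffs, basis):
    return basis @ coeffs                  # (dim,) reconstruction

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, k, workers = 256, 32, 4
    # Correlated gradients: low-rank structure plus noise, mimicking the empirical observation.
    mix = rng.normal(size=(k, dim))
    history = rng.normal(size=(500, k)) @ mix + 0.01 * rng.normal(size=(500, dim))
    basis = fit_pca_basis(history, k)

    grads = [(rng.normal(size=(1, k)) @ mix + 0.01 * rng.normal(size=(1, dim))).ravel()
             for _ in range(workers)]

    # Direct aggregation in the compressed domain (what ring all-reduce would sum) ...
    agg_compressed = sum(compress(g, basis) for g in grads)
    # ... equals compressing the aggregated gradient, because the projection is linear.
    assert np.allclose(agg_compressed, compress(sum(grads), basis))
    recon = decompress(agg_compressed, basis) / workers
    print(np.linalg.norm(recon - sum(grads) / workers))   # small reconstruction error
```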
Matheuristics to optimize maintenance scheduling and refueling of nuclear power plants
Title | Matheuristics to optimize maintenance scheduling and refueling of nuclear power plants |
Authors | Nicolas Dupin, El-Ghazali Talbi |
Abstract | Scheduling the maintenances of nuclear power plants is a complex optimization problem, formulated as a 2-stage stochastic program for the EURO/ROADEF 2010 challenge. The first stage optimizes the maintenance dates and refueling decisions. The second stage optimizes production to fulfill the power demands and to assess the feasibility and cost of the first-stage decisions. This paper solves a deterministic version of the problem, studying Mixed Integer Programming (MIP) formulations and matheuristics. Relaxing only two sets of constraints of the ROADEF challenge, a MIP formulation can be written using only binary variables for the maintenance dates. The MIP formulations are used to design constructive matheuristics and a Variable Neighborhood Descent (VND) local search. These matheuristics produce very high quality solutions. Some intermediate results explain outcomes of the Challenge: the relaxation of the CT6 constraints is justified, and neighborhood analyses with MIP-VND justify the choice of neighborhoods to implement for the problem. Lastly, an extension with stability costs for monthly reoptimization is considered, with efficient bi-objective matheuristics. |
Tasks | |
Published | 2018-12-13 |
URL | https://arxiv.org/abs/1812.08598v2 |
PDF | https://arxiv.org/pdf/1812.08598v2.pdf |
PWC | https://paperswithcode.com/paper/matheuristics-to-optimize-maintenance |
Repo | |
Framework | |
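A toy sketch, using PuLP, of the kind of formulation described above: binary variables select each plant's maintenance week under a simple resource constraint. The cost data and constraints are invented for illustration; the real ROADEF 2010 formulation also covers refueling decisions, stochastic demand scenarios, and the full constraint families.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, PULP_CBC_CMD

plants = ["P1", "P2", "P3"]
weeks = range(8)
# Illustrative cost of starting plant p's maintenance in week w (assumed data).
cost = {(p, w): (i + 1) * (w + 1) for i, p in enumerate(plants) for w in weeks}

prob = LpProblem("maintenance_scheduling", LpMinimize)
x = {(p, w): LpVariable(f"x_{p}_{w}", cat=LpBinary) for p in plants for w in weeks}

# Objective: total cost of the chosen maintenance dates.
prob += lpSum(cost[p, w] * x[p, w] for p in plants for w in weeks)
# Each plant is maintained exactly once over the horizon.
for p in plants:
    prob += lpSum(x[p, w] for w in weeks) == 1
# At most one plant in maintenance per week (stand-in for resource constraints).
for w in weeks:
    prob += lpSum(x[p, w] for p in plants) <= 1

prob.solve(PULP_CBC_CMD(msg=False))
schedule = {p: next(w for w in weeks if x[p, w].value() > 0.5) for p in plants}
print(schedule)
```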
Task-Oriented Hand Motion Retargeting for Dexterous Manipulation Imitation
Title | Task-Oriented Hand Motion Retargeting for Dexterous Manipulation Imitation |
Authors | Dafni Antotsiou, Guillermo Garcia-Hernando, Tae-Kyun Kim |
Abstract | Human hand actions are quite complex, especially when they involve object manipulation, mainly due to the high dimensionality of the hand and the vast action space it entails. Imitating those actions with dexterous hand models involves several important and challenging steps: acquiring human hand information, retargeting it to a hand model, and learning a policy from the acquired data. In this work, we capture the hand information by using a state-of-the-art hand pose estimator. We tackle the retargeting problem from the hand pose to a 29-DoF hand model by combining inverse kinematics and PSO with a task objective optimisation. This objective encourages the virtual hand to accomplish the manipulation task, mitigating the effect of the estimator’s noise and the domain gap. Our approach leads to a better success rate in the grasping task compared to our inverse kinematics baseline, allowing us to record successful human demonstrations. Furthermore, we used these demonstrations to learn a policy network using generative adversarial imitation learning (GAIL) that is able to autonomously grasp an object in the virtual space. |
Tasks | Imitation Learning |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01845v1 |
PDF | http://arxiv.org/pdf/1810.01845v1.pdf |
PWC | https://paperswithcode.com/paper/task-oriented-hand-motion-retargeting-for |
Repo | |
Framework | |
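A minimal PSO sketch of the retargeting objective described above: particles are candidate joint-angle vectors for the virtual hand, scored by how well a (hypothetical) forward-kinematics function reproduces the estimated fingertip positions plus a task term. The 29-DoF model, the forward kinematics, and the task term here are all stand-in assumptions, not the paper's pipeline.

```python
import numpy as np

def pso_minimize(objective, dim, n_particles=30, iters=100, bounds=(-np.pi, np.pi), seed=0):
    """Plain global-best particle swarm optimisation over a box-constrained vector."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

def retarget_cost(angles, target_tips, forward_kinematics, task_term, w_task=0.1):
    """Pose error against the estimated hand plus a task objective (e.g. closeness to the object)."""
    tips = forward_kinematics(angles)
    return np.linalg.norm(tips - target_tips) + w_task * task_term(tips)

if __name__ == "__main__":
    # Stand-in forward kinematics: a fixed random linear map from 29 joint angles to 5 fingertips.
    rng = np.random.default_rng(1)
    M = rng.normal(size=(15, 29)) * 0.1
    fk = lambda q: (M @ q).reshape(5, 3)
    target = fk(rng.uniform(-1, 1, 29))
    task = lambda tips: np.linalg.norm(tips.mean(axis=0))  # pull the grasp centroid to the origin
    q_star = pso_minimize(lambda q: retarget_cost(q, target, fk, task), dim=29)
    print(retarget_cost(q_star, target, fk, task))
```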