October 18, 2019


Paper Group ANR 658

Neural Network Interpretation via Fine Grained Textual Summarization

Title Neural Network Interpretation via Fine Grained Textual Summarization
Authors Pei Guo, Connor Anderson, Kolten Pearson, Ryan Farrell
Abstract Current visualization-based network interpretation methods suffer from a lack of semantic-level information. In this paper, we introduce the novel task of interpreting classification models using fine-grained textual summarization. Along with the label prediction, the network generates a sentence explaining its decision. Constructing a fully annotated dataset of filter-text pairs is unrealistic because of the complexity of the image-to-filter response function. We instead propose a weakly supervised learning algorithm leveraging off-the-shelf image caption annotations. Central to our algorithm is the filter-level attribute probability density function (p.d.f.), learned as a conditional probability through Bayesian inference with the input image and its feature map as latent variables. We show that our algorithm faithfully reflects the features learned by the model through rigorous applications such as attribute-based image retrieval and unsupervised text grounding. We further show that the textual summarization process can help in understanding network failure patterns and can provide clues for further improvements.
Tasks Bayesian Inference, Image Retrieval
Published 2018-05-23
URL http://arxiv.org/abs/1805.08969v2
PDF http://arxiv.org/pdf/1805.08969v2.pdf
PWC https://paperswithcode.com/paper/neural-network-interpretation-via-fine
Repo
Framework
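The filter-level attribute p.d.f. is the core quantity here. As a rough illustration of the idea (not the paper's actual estimator), one can marginalize caption-derived attribute probabilities over images, weighting by normalized filter activations; all array values below are hypothetical toys:

```python
import numpy as np

# Toy setup: 4 images, 2 filters, 3 attributes (all values hypothetical).
# p_attr_img[i, a] : p(attribute a | image i), from caption annotations
# filt_resp[i, f]  : filter f's activation on image i (non-negative)
p_attr_img = np.array([[0.8, 0.1, 0.1],
                       [0.7, 0.2, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
filt_resp = np.array([[0.9, 0.0],
                      [0.8, 0.1],
                      [0.1, 0.9],
                      [0.0, 0.7]])

def filter_attribute_pdf(p_attr_img, filt_resp):
    """p(a | f): marginalize over images with activation-weighted evidence."""
    # Treat normalized activations as p(image | filter) under a uniform prior.
    p_img_given_f = filt_resp / filt_resp.sum(axis=0, keepdims=True)
    return p_img_given_f.T @ p_attr_img      # shape: (filters, attributes)

pdf = filter_attribute_pdf(p_attr_img, filt_resp)
print(pdf.shape)   # (2, 3); each row is a distribution over attributes
```

Because each row of `p_attr_img` and each normalized activation column sum to one, each filter's attribute row is itself a valid probability distribution.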

DeepMag: Source Specific Motion Magnification Using Gradient Ascent

Title DeepMag: Source Specific Motion Magnification Using Gradient Ascent
Authors Weixuan Chen, Daniel McDuff
Abstract Many important physical phenomena involve subtle signals that are difficult to observe with the unaided eye, yet visualizing them can be very informative. Current motion magnification techniques can reveal these small temporal variations in video, but they require precise prior knowledge about the target signal and cannot deal with interfering motions at a similar frequency. We present DeepMag, an end-to-end deep neural video-processing framework based on gradient ascent that enables automated magnification of subtle color and motion signals from a specific source, even in the presence of large motions of various velocities. While the approach is generalizable, the advantages of DeepMag are highlighted via the task of video-based physiological visualization. Through systematic quantitative and qualitative evaluation of the approach on videos with different levels of head motion, we compare the magnification of pulse and respiration to existing state-of-the-art methods. Our method produces magnified videos with substantially fewer artifacts and less blurring, whilst magnifying the physiological changes by a similar degree.
Tasks
Published 2018-08-09
URL http://arxiv.org/abs/1808.03338v1
PDF http://arxiv.org/pdf/1808.03338v1.pdf
PWC https://paperswithcode.com/paper/deepmag-source-specific-motion-magnification
Repo
Framework
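The gradient-ascent ingredient can be sketched in miniature: ascend on the input so that a chosen source signal grows while other components stay put. This toy uses a numeric gradient and a hypothetical linear "signal of interest" in place of the paper's video network:

```python
import numpy as np

def magnify(x, signal_fn, steps=50, lr=0.1):
    """Amplify the component of x that drives signal_fn via gradient ascent,
    using a central-difference numeric gradient (toy stand-in for backprop)."""
    x = x.astype(float).copy()
    eps = 1e-5
    for _ in range(steps):
        grad = np.zeros_like(x)
        for i in range(x.size):
            d = np.zeros_like(x)
            d[i] = eps
            grad[i] = (signal_fn(x + d) - signal_fn(x - d)) / (2 * eps)
        x += lr * grad           # ascend: increase the targeted signal
    return x

# Hypothetical "source" signal: projection onto a pulse-like direction.
direction = np.array([1.0, -1.0, 0.5])
signal = lambda v: v @ direction
x0 = np.array([0.2, 0.1, 0.0])
x1 = magnify(x0, signal)
print(signal(x1) > signal(x0))   # True: the targeted signal grew
```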

Image Correction via Deep Reciprocating HDR Transformation

Title Image Correction via Deep Reciprocating HDR Transformation
Authors Xin Yang, Ke Xu, Yibing Song, Qiang Zhang, Xiaopeng Wei, Rynson Lau
Abstract Image correction aims to adjust an input image into a visually pleasing one. Existing approaches are proposed mainly from the perspective of image pixel manipulation, and they are not effective at recovering the details in under/over-exposed regions. In this paper, we revisit the image formation procedure and notice that the missing details in these regions exist in the corresponding high dynamic range (HDR) data. These details are well perceived by the human eye but diminished in the low dynamic range (LDR) domain because of the tone mapping process. Therefore, we formulate the image correction task as an HDR transformation process and propose a novel approach called Deep Reciprocating HDR Transformation (DRHT). Given an input LDR image, we first reconstruct the missing details in the HDR domain. We then perform tone mapping on the predicted HDR data to generate the output LDR image with the recovered details. To this end, we propose a unified framework consisting of two CNNs for HDR reconstruction and tone mapping, integrated end-to-end for joint training and prediction. Experiments on standard benchmarks demonstrate that the proposed method performs favorably against state-of-the-art image correction methods.
Tasks
Published 2018-04-12
URL http://arxiv.org/abs/1804.04371v1
PDF http://arxiv.org/pdf/1804.04371v1.pdf
PWC https://paperswithcode.com/paper/image-correction-via-deep-reciprocating-hdr
Repo
Framework
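The second stage, tone mapping the predicted HDR data back to LDR, is learned by a CNN in the paper; a classic global operator such as Reinhard's illustrates what that mapping does, compressing unbounded radiance into displayable range:

```python
import numpy as np

def tone_map(hdr, exposure=1.0):
    """Global Reinhard operator: compresses HDR radiance into [0, 1)."""
    scaled = exposure * hdr
    return scaled / (1.0 + scaled)

hdr = np.array([0.01, 0.5, 10.0, 1000.0])   # hypothetical radiance values
ldr = tone_map(hdr)
print(ldr)   # monotone increasing, every value strictly below 1
```

Dark values pass through nearly unchanged while very bright ones saturate smoothly toward 1, which is why details lost by such a mapping must be recovered in the HDR domain first.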

The Interplay between Lexical Resources and Natural Language Processing

Title The Interplay between Lexical Resources and Natural Language Processing
Authors Jose Camacho-Collados, Luis Espinosa-Anke, Mohammad Taher Pilehvar
Abstract Incorporating linguistic, world and common sense knowledge into AI/NLP systems is currently an important research area, with several open problems and challenges. At the same time, processing and storing this knowledge in lexical resources is not a straightforward task. This tutorial proposes to address these complementary goals from two methodological perspectives: the use of NLP methods to help the process of constructing and enriching lexical resources, and the use of lexical resources for improving NLP applications. Two main types of audience can benefit from this tutorial: those working on language resources who are interested in becoming acquainted with automatic NLP techniques, with the end goal of speeding up and/or easing the process of resource curation; and researchers in NLP who would like to benefit from the knowledge in lexical resources to improve their systems and models. The slides of the tutorial are available at https://bitbucket.org/luisespinosa/lr-nlp/
Tasks Common Sense Reasoning
Published 2018-07-02
URL http://arxiv.org/abs/1807.00571v1
PDF http://arxiv.org/pdf/1807.00571v1.pdf
PWC https://paperswithcode.com/paper/the-interplay-between-lexical-resources-and
Repo
Framework

Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation

Title Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation
Authors Yunchao Wei, Huaxin Xiao, Honghui Shi, Zequn Jie, Jiashi Feng, Thomas S. Huang
Abstract Despite remarkable progress, weakly supervised segmentation approaches are still inferior to their fully supervised counterparts. We observe that the performance gap mainly comes from their limited ability to produce high-quality dense object localization maps from image-level supervision. To mitigate this gap, we revisit the dilated convolution [1] and reveal how it can be utilized in a novel way to effectively overcome this critical limitation of weakly supervised segmentation approaches. Specifically, we find that varying dilation rates can effectively enlarge the receptive fields of convolutional kernels and, more importantly, transfer the surrounding discriminative information to non-discriminative object regions, promoting the emergence of these regions in the object localization maps. We then design a generic classification network equipped with convolutional blocks of different dilation rates. It can produce dense and reliable object localization maps and effectively benefit both weakly- and semi-supervised semantic segmentation. Despite its apparent simplicity, our proposed approach obtains superior performance over the state of the art. In particular, it achieves 60.8% and 67.6% mIoU on the Pascal VOC 2012 test set in the weakly supervised (only image-level labels available) and semi-supervised (1,464 segmentation masks available) settings, which are new state-of-the-art results.
Tasks Object Localization, Semantic Segmentation, Semi-Supervised Semantic Segmentation
Published 2018-05-11
URL http://arxiv.org/abs/1805.04574v2
PDF http://arxiv.org/pdf/1805.04574v2.pdf
PWC https://paperswithcode.com/paper/revisiting-dilated-convolution-a-simple-1
Repo
Framework
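The mechanism the paper exploits, enlarging receptive fields by varying dilation rates without adding parameters, is easy to see in one dimension. This minimal NumPy sketch (not the paper's classification network) applies the same 3-tap kernel at two dilation rates:

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """Valid 1-D convolution with a dilated kernel (cross-correlation form)."""
    k = len(kernel)
    span = (k - 1) * dilation + 1        # receptive field of the dilated kernel
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * dilation] for j in range(k))
    return out

x = np.arange(10.0)
k = np.array([1.0, 1.0, 1.0])
print(dilated_conv1d(x, k, dilation=1))   # 3-sample window
print(dilated_conv1d(x, k, dilation=3))   # same 3 weights, 7-sample span
```

Both calls use exactly three weights, but at dilation 3 each output aggregates evidence from a window more than twice as wide, which is how surrounding discriminative information can reach non-discriminative regions.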

Geometry-Aware Face Completion and Editing

Title Geometry-Aware Face Completion and Editing
Authors Linsen Song, Jie Cao, Linxiao Song, Yibo Hu, Ran He
Abstract Face completion is a challenging generation task because it requires generating visually pleasing new pixels that are semantically consistent with the unmasked face region. This paper proposes a geometry-aware Face Completion and Editing NETwork (FCENet) that systematically studies facial geometry from the unmasked region. First, a facial geometry estimator is learned to estimate facial landmark heatmaps and parsing maps from the unmasked face image. Then, an encoder-decoder generator serves to complete a face image and disentangle its mask areas conditioned on both the masked face image and the estimated facial geometry images. In addition, since a low-rank property exists in manually labeled masks, a low-rank regularization term is imposed on the disentangled masks, enforcing our completion network to handle occlusion areas of various shapes and sizes. Furthermore, our network can generate diverse results from the same masked input by modifying the estimated facial geometry, which provides a flexible means to edit the completed face appearance. Extensive experimental results qualitatively and quantitatively demonstrate that our network is able to generate visually pleasing face completion results and edit face attributes as well.
Tasks Facial Inpainting
Published 2018-09-09
URL http://arxiv.org/abs/1809.02967v2
PDF http://arxiv.org/pdf/1809.02967v2.pdf
PWC https://paperswithcode.com/paper/geometry-aware-face-completion-and-editing
Repo
Framework
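Low-rank regularization terms like the one imposed on the disentangled masks are commonly realized as a nuclear-norm penalty (the sum of singular values, a convex surrogate for rank); the abstract does not spell out the exact form, so the following is only an illustrative sketch comparing a rank-1 mask with a full-rank matrix of equal Frobenius norm:

```python
import numpy as np

def nuclear_norm(m):
    """Sum of singular values; convex surrogate for rank, used as a loss term."""
    return np.linalg.svd(m, compute_uv=False).sum()

low_rank = np.outer(np.ones(4), np.ones(4))   # rank 1, Frobenius norm 4
full_rank = 2.0 * np.eye(4)                   # rank 4, Frobenius norm 4
print(nuclear_norm(low_rank), nuclear_norm(full_rank))   # 4.0 vs 8.0
```

Among matrices with the same energy, the low-rank one incurs the smaller penalty, so minimizing this term pushes predicted masks toward the block-like structure of manually labeled occlusions.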

Making Efficient Use of a Domain Expert’s Time in Relation Extraction

Title Making Efficient Use of a Domain Expert’s Time in Relation Extraction
Authors Linara Adilova, Sven Giesselbach, Stefan Rüping
Abstract Scarcity of labeled data is one of the most frequent problems faced in machine learning. This is particularly true for relation extraction in text mining, where large corpora of text exist in many application domains, while labeling text data requires an expert to invest much time reading the documents. Overall, state-of-the-art models, like the convolutional neural network used in this paper, achieve great results when trained on large enough amounts of labeled data. However, from a practical point of view the question arises whether this is the most efficient approach when one takes the manual effort of the expert into account. In this paper, we report on an alternative approach where we first construct a relation extraction model using distant supervision and only later make use of a domain expert to refine the results. Distant supervision provides a means of labeling data given known relations in a knowledge base, but it suffers from noisy labeling. We introduce an active-learning-based extension that allows our neural network to incorporate expert feedback, and report first results on a complex data set.
Tasks Active Learning, Relation Extraction
Published 2018-07-12
URL http://arxiv.org/abs/1807.04687v1
PDF http://arxiv.org/pdf/1807.04687v1.pdf
PWC https://paperswithcode.com/paper/making-efficient-use-of-a-domain-experts-time
Repo
Framework
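An active-learning extension like this typically routes the examples the distantly supervised model is least sure about to the expert. A standard uncertainty-sampling heuristic (an assumption for illustration, not necessarily the paper's exact criterion) looks like this:

```python
import numpy as np

def most_uncertain(probs, budget):
    """Pick the `budget` examples whose positive-class probability is closest
    to 0.5, i.e. where the distantly supervised model is least confident."""
    uncertainty = -np.abs(probs - 0.5)     # higher = less confident
    return np.argsort(uncertainty)[-budget:]

# Hypothetical model confidences on unlabeled candidate relation mentions.
probs = np.array([0.95, 0.52, 0.10, 0.49, 0.85])
queries = most_uncertain(probs, budget=2)
print(sorted(queries.tolist()))   # [1, 3]: the two most ambiguous mentions
```

The expert's limited time then goes to exactly the mentions where a label changes the model most, rather than to documents the model already classifies confidently.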

Scalable Manifold Learning for Big Data with Apache Spark

Title Scalable Manifold Learning for Big Data with Apache Spark
Authors Frank Schoeneman, Jaroslaw Zola
Abstract Non-linear spectral dimensionality reduction methods, such as Isomap, remain important techniques for learning manifolds. However, due to computational complexity, exact manifold learning using Isomap is currently infeasible for large-scale data. In this paper, we propose a distributed-memory framework implementing end-to-end exact Isomap under the Apache Spark model. We show how each critical step of the Isomap algorithm can be efficiently realized using the basic Spark model, without the need to provision data in secondary storage. We show how the entire method can be implemented using PySpark, offloading compute-intensive linear algebra routines to BLAS. Through experimental results, we demonstrate excellent scalability of our method, and we show that it can process datasets orders of magnitude larger than what is currently possible, using a 25-node parallel cluster.
Tasks Dimensionality Reduction
Published 2018-08-31
URL http://arxiv.org/abs/1808.10776v1
PDF http://arxiv.org/pdf/1808.10776v1.pdf
PWC https://paperswithcode.com/paper/scalable-manifold-learning-for-big-data-with
Repo
Framework
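The steps the paper distributes over Spark, neighborhood-graph construction, geodesic (shortest-path) distances, and classical MDS, can be written compactly in plain NumPy for small data. This minimal exact Isomap is a sketch of the pipeline, not the PySpark implementation:

```python
import numpy as np

def isomap(points, n_neighbors=2, n_components=1):
    """Minimal exact Isomap: kNN graph -> geodesic distances (Floyd-Warshall)
    -> classical MDS. The Spark version distributes exactly these steps."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    geo = np.full((n, n), np.inf)
    np.fill_diagonal(geo, 0.0)
    for i in range(n):                       # connect each point to its kNN
        for j in np.argsort(d[i])[1:n_neighbors + 1]:
            geo[i, j] = geo[j, i] = d[i, j]
    for k in range(n):                       # Floyd-Warshall shortest paths
        geo = np.minimum(geo, geo[:, [k]] + geo[[k], :])
    sq = geo ** 2                            # classical MDS on geodesics
    j_mat = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j_mat @ sq @ j_mat
    vals, vecs = np.linalg.eigh(b)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Points along a bent curve; Isomap recovers their 1-D ordering.
t = np.linspace(0, np.pi, 8)
curve = np.c_[np.cos(t), np.sin(t)]
emb = isomap(curve).ravel()
print(np.all(np.diff(emb) > 0) or np.all(np.diff(emb) < 0))   # monotone
```

The all-pairs shortest-path step is the quadratic-memory bottleneck that motivates the distributed formulation in the first place.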

Performance Evaluation of Deep Learning Networks for Semantic Segmentation of Traffic Stereo-Pair Images

Title Performance Evaluation of Deep Learning Networks for Semantic Segmentation of Traffic Stereo-Pair Images
Authors Vlad Taran, Nikita Gordienko, Yuriy Kochura, Yuri Gordienko, Alexandr Rokovyi, Oleg Alienin, Sergii Stirenko
Abstract Semantic image segmentation is one of the most demanding tasks, especially for the analysis of traffic conditions for self-driving cars. Here, the results of applying several deep learning architectures (PSPNet and ICNet) to semantic image segmentation of traffic stereo-pair images are presented. Images from the Cityscapes dataset and custom urban images were analyzed for segmentation accuracy and image inference time. For the models pre-trained on the Cityscapes dataset, the inference times were equal within the limits of standard deviation, but the segmentation accuracy differed across cities and even across stereo channels. The distributions of accuracy (mean intersection over union, mIoU) values for each city and channel are asymmetric, long-tailed, and have many extreme outliers, especially for the PSPNet network in comparison to the ICNet network. Some statistical properties of these distributions (skewness, kurtosis) allow us to distinguish the two networks and open the question of the relation between the architecture of deep learning networks and the statistical distribution of the predicted results (mIoU here). The results obtained demonstrate the different sensitivity of these networks to: (1) the local street-view peculiarities of different cities, which should be taken into account during targeted fine-tuning of the models before their practical application, and (2) the right and left data channels in stereo pairs. For both networks, the difference in the predicted results (mIoU here) between the right and left data channels in stereo pairs lies outside the limits of statistical error relative to the mIoU values. This means that traffic stereo pairs can be used effectively not only for depth calculations (as they usually are), but also as an additional data channel that can provide much more information about scene objects than simple duplication of the same street-view images.
Tasks Self-Driving Cars, Semantic Segmentation
Published 2018-06-05
URL http://arxiv.org/abs/1806.01896v1
PDF http://arxiv.org/pdf/1806.01896v1.pdf
PWC https://paperswithcode.com/paper/performance-evaluation-of-deep-learning-1
Repo
Framework
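The mIoU metric analyzed throughout the paper is computed from a per-class confusion matrix; a minimal implementation (toy labels, not the Cityscapes evaluation code) follows:

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """mIoU from a confusion matrix: IoU_c = TP / (TP + FP + FN), averaged."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for p, t in zip(pred.ravel(), target.ravel()):
        cm[t, p] += 1
    tp = np.diag(cm).astype(float)
    union = cm.sum(0) + cm.sum(1) - tp        # TP + FP + FN per class
    iou = tp / np.maximum(union, 1)           # guard against empty classes
    return iou.mean()

pred   = np.array([0, 0, 1, 1, 2, 2])
target = np.array([0, 0, 1, 2, 2, 2])
print(round(mean_iou(pred, target, 3), 3))    # 0.722
```

Because the score is averaged per class rather than per pixel, a single poorly segmented class drags mIoU down even when overall pixel accuracy is high, which is why per-city mIoU distributions can be so long-tailed.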

Inseparability and Conservative Extensions of Description Logic Ontologies: A Survey

Title Inseparability and Conservative Extensions of Description Logic Ontologies: A Survey
Authors Elena Botoeva, Boris Konev, Carsten Lutz, Vladislav Ryzhikov, Frank Wolter, Michael Zakharyaschev
Abstract The question whether an ontology can safely be replaced by another, possibly simpler, one is fundamental for many ontology engineering and maintenance tasks. It underpins, for example, ontology versioning, ontology modularization, forgetting, and knowledge exchange. What safe replacement means depends on the intended application of the ontology. If, for example, it is used to query data, then the answers to any relevant ontology-mediated query should be the same over any relevant data set; if, in contrast, the ontology is used for conceptual reasoning, then the entailed subsumptions between concept expressions should coincide. This gives rise to different notions of ontology inseparability such as query inseparability and concept inseparability, which generalize corresponding notions of conservative extensions. We survey results on various notions of inseparability in the context of description logic ontologies, discussing their applications, useful model-theoretic characterizations, algorithms for determining whether two ontologies are inseparable (and, sometimes, for computing the difference between them if they are not), and the computational complexity of this problem.
Tasks
Published 2018-04-20
URL http://arxiv.org/abs/1804.07805v1
PDF http://arxiv.org/pdf/1804.07805v1.pdf
PWC https://paperswithcode.com/paper/inseparability-and-conservative-extensions-of
Repo
Framework

Identity Preserving Face Completion for Large Ocular Region Occlusion

Title Identity Preserving Face Completion for Large Ocular Region Occlusion
Authors Yajie Zhao, Weikai Chen, Jun Xing, Xiaoming Li, Zach Bessinger, Fuchang Liu, Wangmeng Zuo, Ruigang Yang
Abstract We present a novel deep learning approach to synthesize complete face images in the presence of large ocular region occlusions. This is motivated by the recent surge of VR/AR displays that hinder face-to-face communications. Different from state-of-the-art face inpainting methods, which have no control over the synthesized content and can only handle frontal face poses, our approach can faithfully recover the missing content under various head poses while preserving the identity. At the core of our method is a novel generative network with dedicated constraints to regularize the synthesis process. To preserve the identity, our network takes an arbitrary occlusion-free image of the target identity to infer the missing content, and uses its high-level CNN features as an identity prior to regularize the search space of the generator. Since the input reference image may have a different pose, a pose map and a novel pose discriminator are further adopted to supervise the learning of implicit pose transformations. Our method is capable of generating coherent facial inpainting with consistent identity over videos with large variations of head motion. Experiments on both synthesized and real data demonstrate that our method greatly outperforms state-of-the-art methods in terms of both synthesis quality and robustness.
Tasks Facial Inpainting
Published 2018-07-23
URL http://arxiv.org/abs/1807.08772v1
PDF http://arxiv.org/pdf/1807.08772v1.pdf
PWC https://paperswithcode.com/paper/identity-preserving-face-completion-for-large
Repo
Framework

On the k-Boundedness for Existential Rules

Title On the k-Boundedness for Existential Rules
Authors Stathis Delivorias, Michel Leclere, Marie-Laure Mugnier, Federico Ulliana
Abstract The chase is a fundamental tool for existential rules. Several chase variants are known, which differ in how they handle redundancies possibly caused by the introduction of nulls. Given a chase variant, the halting problem takes as input a set of existential rules and asks if this set of rules ensures the termination of the chase for any factbase. It is well known that this problem is undecidable for all known chase variants. The related problem of boundedness asks if a given set of existential rules is bounded, i.e., whether there is a predefined upper bound on the number of (breadth-first) steps of the chase, independently of any factbase. This problem is already undecidable in the specific case of datalog rules. However, knowing that a set of rules is bounded for some chase variant does not help much in practice if the bound is unknown. Hence, in this paper, we investigate the decidability of the k-boundedness problem, which asks whether a given set of rules is bounded by an integer k. We prove that k-boundedness is decidable for three chase variants, namely the oblivious, semi-oblivious and restricted chase.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.09304v1
PDF http://arxiv.org/pdf/1810.09304v1.pdf
PWC https://paperswithcode.com/paper/on-the-k-boundedness-for-existential-rules
Repo
Framework
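For datalog rules (no existential variables) the chase reduces to breadth-first forward chaining, which makes the "number of breadth-first steps" concrete. This toy counts productive steps for a transitivity rule; since the count grows with the chain length in the data, such a rule set is not k-bounded for any fixed k, which illustrates why the property depends on the rules rather than on any single factbase:

```python
def chase_steps(facts, rules, max_steps=100):
    """Breadth-first chase for datalog-style rules given as functions
    mapping the current fact set to derivable facts."""
    facts = set(facts)
    for step in range(1, max_steps + 1):
        new = set().union(*(rule(facts) for rule in rules)) - facts
        if not new:
            return step - 1        # saturated: number of productive steps
        facts |= new
    return max_steps

# Toy rule: edge(x,y), edge(y,z) -> edge(x,z) (transitive closure).
def transitivity(facts):
    return {(x, w) for (x, y) in facts for (z, w) in facts if y == z}

steps = chase_steps({("a", "b"), ("b", "c"), ("c", "d")}, [transitivity])
print(steps)   # 2: a chain of 3 edges saturates in 2 breadth-first steps
```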

RGBD2lux: Dense light intensity estimation with an RGBD sensor

Title RGBD2lux: Dense light intensity estimation with an RGBD sensor
Authors Theodore Tsesmelis, Irtiza Hasan, Marco Cristani, Fabio Galasso, Alessio Del Bue
Abstract Lighting design and modelling for industrial applications like luminaire planning and commissioning rely heavily on time-consuming manual measurements or on physically coherent computational simulations. Regarding the latter, standard approaches are based on CAD modeling simulations and offline rendering, with long processing times and therefore inflexible workflows. Thus, in this paper we propose a computer-vision-based system to measure lighting with just a single RGBD camera. The proposed method uses both depth data and images from the sensor to provide a dense measure of light intensity in the field of view of the camera. We evaluate our system on novel ground-truth data and compare it to state-of-the-art commercial light-planning software. Our system provides improved performance while being completely automated, given that the CAD model is extracted from the depth data and the albedo is estimated with the support of the RGB images. To the best of our knowledge, this is the first automatic framework for the estimation of lighting in general indoor scenarios from RGBD input.
Tasks
Published 2018-09-20
URL http://arxiv.org/abs/1809.07558v3
PDF http://arxiv.org/pdf/1809.07558v3.pdf
PWC https://paperswithcode.com/paper/rgbd2lux-dense-light-intensity-estimation
Repo
Framework

Controlling the Charging of Electric Vehicles with Neural Networks

Title Controlling the Charging of Electric Vehicles with Neural Networks
Authors Martin Pilát
Abstract We propose and evaluate controllers for the coordination of the charging of electric vehicles. The controllers are based on neural networks and are completely decentralized, in the sense that the charging current is decided entirely by the controller itself. One version of the controllers does not require any outside communication at all. We test controllers based on two different neural network architectures - feed-forward networks and echo state networks. The networks are optimized either by an evolutionary algorithm (CMA-ES) or by a gradient-based method. The results of the different architectures and optimization algorithms are compared in a realistic scenario. We show that the controllers are able to charge the cars while keeping peak consumption almost the same as when no charging is performed. Moreover, the controllers fill the valleys of the consumption curve, thus reducing the difference between the maximum and minimum consumption in the grid.
Tasks
Published 2018-04-16
URL http://arxiv.org/abs/1804.05978v1
PDF http://arxiv.org/pdf/1804.05978v1.pdf
PWC https://paperswithcode.com/paper/controlling-the-charging-of-electric-vehicles
Repo
Framework
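A simple baseline for the behavior the controllers learn, charging in the consumption valleys so the peak stays put, is greedy valley-filling. This sketch is a hand-written heuristic on hypothetical load values, not the paper's neural controller:

```python
import numpy as np

def valley_fill(base_load, energy_needed, max_rate):
    """Greedy valley-filling: push EV charging into the lowest-load slots,
    flattening the total consumption profile."""
    charge = np.zeros_like(base_load, dtype=float)
    remaining = energy_needed
    while remaining > 1e-9:
        total = base_load + charge
        slot = int(np.argmin(total))              # currently lowest-load slot
        amount = min(remaining, max_rate - charge[slot])
        if amount <= 0:
            break                                 # every valley slot is maxed
        step = min(amount, 0.1)                   # charge in small increments
        charge[slot] += step
        remaining -= step
    return charge

base = np.array([5.0, 2.0, 1.0, 3.0])            # hypothetical grid load
charge = valley_fill(base, energy_needed=3.0, max_rate=4.0)
total = base + charge
print(total.max() - total.min() < base.max() - base.min())   # True: flatter
```

The learned controllers must approximate this kind of schedule without a central optimizer, using only the information available locally to each car.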

Detection of Unknown Anomalies in Streaming Videos with Generative Energy-based Boltzmann Models

Title Detection of Unknown Anomalies in Streaming Videos with Generative Energy-based Boltzmann Models
Authors Hung Vu, Tu Dinh Nguyen, Dinh Phung
Abstract Abnormal event detection is one of the important objectives in research and practical applications of video surveillance. However, three challenging problems remain for most anomaly detection systems in practical settings: limited labeled data, the ambiguous definition of “abnormal”, and expensive feature engineering steps. This paper introduces a unified detection framework that handles these challenges using energy-based models, which are powerful tools for unsupervised representation learning. Our proposed models are first trained on unlabeled raw pixels of image frames from an input video, rather than on hand-crafted visual features, and then identify the locations of abnormal objects based on the errors between the input video and its reconstruction produced by the models. To handle video streams, we develop an online version of our framework, wherein the model parameters are updated incrementally with the image frames arriving on the fly. Our experiments show that our detectors, using Restricted Boltzmann Machines (RBMs) and Deep Boltzmann Machines (DBMs) as core modules, achieve superior anomaly detection performance to unsupervised baselines and obtain accuracy comparable with state-of-the-art approaches when evaluated at the pixel level. More importantly, we discover that our system trained with DBMs is able to simultaneously perform scene clustering and scene reconstruction. This capacity not only distinguishes our method from other existing detectors but also offers a unique tool for investigating and understanding how the model works.
Tasks Anomaly Detection, Feature Engineering, Representation Learning, Unsupervised Representation Learning
Published 2018-05-03
URL http://arxiv.org/abs/1805.01090v2
PDF http://arxiv.org/pdf/1805.01090v2.pdf
PWC https://paperswithcode.com/paper/detection-of-unknown-anomalies-in-streaming
Repo
Framework
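The detection principle, score each pixel by the error between the input and the model's reconstruction, can be shown with a deterministic stand-in for the trained RBM/DBM (here a one-component PCA on hypothetical toy frames):

```python
import numpy as np

def fit_pca(frames, n_components=1):
    """Fit a linear 'normal appearance' model (stand-in for the trained RBM)."""
    mean = frames.mean(axis=0)
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]

def anomaly_score(frame, mean, components):
    """Per-pixel squared reconstruction error; large values flag anomalies."""
    centered = frame - mean
    recon = (centered @ components.T) @ components + mean
    return (frame - recon) ** 2

pattern = np.linspace(0.0, 1.0, 8)                   # 'normal' frame structure
frames = np.outer(np.linspace(0.5, 1.5, 20), pattern)
mean, comps = fit_pca(frames)

test = frames[0].copy()
test[3] += 0.8                                       # inject an abnormal pixel
scores = anomaly_score(test, mean, comps)
print(int(np.argmax(scores)))   # 3: the perturbed pixel stands out
```

Normal variation lies in the model's learned subspace and reconstructs almost perfectly, while the injected anomaly does not, so its reconstruction error dominates.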