January 31, 2020

3284 words · 16 min read

Paper Group ANR 51

SpliceRadar: A Learned Method For Blind Image Forensics. Explaining Classifiers with Causal Concept Effect (CaCE). Object Reachability via Swaps under Strict and Weak Preferences. Designing Style Matching Conversational Agents. Convolutional Dictionary Pair Learning Network for Image Representation Learning. Physics-Informed Machine Learning Models …

SpliceRadar: A Learned Method For Blind Image Forensics

Title SpliceRadar: A Learned Method For Blind Image Forensics
Authors Aurobrata Ghosh, Zheng Zhong, Terrance E Boult, Maneesh Singh
Abstract Detection and localization of image manipulations such as splices are gaining importance with the easy accessibility of image editing software. While detection generates a verdict for an image, it provides no insight into the manipulation. Localization helps explain a positive detection by identifying the pixels of the image that have been tampered with. We propose a deep learning based method for splice localization without prior knowledge of a test image’s camera model. It comprises a novel approach for learning rich filters and for suppressing image edges. Additionally, we train our model on a surrogate task of camera model identification, which allows us to leverage large and widely available, unmanipulated, camera-tagged image databases. During inference, we assume that the spliced and host regions come from different camera models, and we segment these regions using a Gaussian mixture model. Experiments on three test databases demonstrate results on par with and above the state of the art, as well as good generalization ability to unknown datasets.
Tasks Image Manipulation Detection
Published 2019-06-27
URL https://arxiv.org/abs/1906.11663v1
PDF https://arxiv.org/pdf/1906.11663v1.pdf
PWC https://paperswithcode.com/paper/spliceradar-a-learned-method-for-blind-image
Repo
Framework
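
The segmentation step described above lends itself to a compact sketch: given per-pixel features from a camera-model-identification network, a two-component Gaussian mixture separates host from spliced pixels. The `features` array and its dimensions are hypothetical stand-ins for the paper's learned rich-filter features.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def segment_splice(features: np.ndarray) -> np.ndarray:
    """Cluster per-pixel features into two camera-model groups (host vs. spliced)."""
    h, w, c = features.shape
    gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
    labels = gmm.fit_predict(features.reshape(-1, c))
    return labels.reshape(h, w)          # binary localization map

# toy usage with random stand-in features
mask = segment_splice(np.random.rand(64, 64, 8))
print(mask.shape, np.unique(mask))
```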

Explaining Classifiers with Causal Concept Effect (CaCE)

Title Explaining Classifiers with Causal Concept Effect (CaCE)
Authors Yash Goyal, Amir Feder, Uri Shalit, Been Kim
Abstract How can we understand classification decisions made by deep neural networks? Many existing explainability methods rely solely on correlations and fail to account for confounding, which may result in potentially misleading explanations. To overcome this problem, we define the Causal Concept Effect (CaCE) as the causal effect of (the presence or absence of) a human-interpretable concept on a deep neural net’s predictions. We show that the CaCE measure can avoid errors stemming from confounding. Estimating CaCE is difficult in situations where we cannot easily simulate the do-operator. To mitigate this problem, we use a generative model, specifically a Variational AutoEncoder (VAE), to compute an estimate called VAE-CaCE. In an extensive experimental analysis, we show that VAE-CaCE estimates the true concept causal effect more accurately than baselines on a number of datasets, including high-dimensional images.
Tasks Causal Inference
Published 2019-07-16
URL https://arxiv.org/abs/1907.07165v2
PDF https://arxiv.org/pdf/1907.07165v2.pdf
PWC https://paperswithcode.com/paper/explaining-classifiers-with-causal-concept
Repo
Framework
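
As a rough illustration of the CaCE definition, the causal effect of a binary concept is the difference in expected classifier output under do(concept present) versus do(concept absent), estimated by sampling from a concept-conditional generative model. Both `classifier` and `generate` below are hypothetical placeholders (the paper uses a conditional VAE).

```python
import numpy as np

def estimate_cace(classifier, generate, latent_dim=32, n_samples=1000, seed=0):
    """Monte Carlo estimate of E[f(x) | do(C=1)] - E[f(x) | do(C=0)]."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, latent_dim))   # shared latents isolate the concept
    x_with = generate(concept=1, z=z)                  # intervention: concept present
    x_without = generate(concept=0, z=z)               # intervention: concept absent
    return classifier(x_with).mean() - classifier(x_without).mean()
```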

Object Reachability via Swaps under Strict and Weak Preferences

Title Object Reachability via Swaps under Strict and Weak Preferences
Authors Sen Huang, Mingyu Xiao
Abstract The \textsc{Housing Market} problem is a widely studied resource allocation problem. In this problem, each agent can only receive a single object and has preferences over all objects. Starting from an initial endowment, we want to reach a certain assignment via a sequence of rational trades. We first consider whether an object is reachable for a given agent under a social network, where a trade between two agents is allowed if they are neighbors in the network and no participant has a deficit from the trade. Assume that the preferences of the agents are strict (no tie among objects is allowed). This problem is polynomially solvable in a star-network and NP-complete in a tree-network. It is left as a challenging open problem whether the problem is polynomially solvable when the network is a path. We answer this open problem positively by giving a polynomial-time algorithm. Then we show that when the preferences of the agents are weak (ties among objects are allowed), the problem becomes NP-hard even when the network is a path. In addition, we consider the computational complexity of finding different optimal assignments for the problem under the network being a path or a star.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07557v1
PDF https://arxiv.org/pdf/1909.07557v1.pdf
PWC https://paperswithcode.com/paper/object-reachability-via-swaps-under-strict
Repo
Framework
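
For intuition, a brute-force breadth-first search over assignments illustrates swap reachability on a small network; this is emphatically not the paper's polynomial-time path algorithm, just a sketch of the trading dynamics with rational (no-deficit) swaps between neighbors.

```python
from collections import deque

def reachable_objects(edges, prefs, start, agent):
    """Return every object `agent` can hold in some assignment reachable by rational swaps."""
    rank = [{obj: i for i, obj in enumerate(p)} for p in prefs]  # lower rank = more preferred
    seen, found = {tuple(start)}, {start[agent]}
    queue = deque([tuple(start)])
    while queue:
        assign = queue.popleft()
        for u, v in edges:                         # neighbors in the social network
            ou, ov = assign[u], assign[v]
            # the trade is rational if each participant weakly prefers what they receive
            if rank[u][ov] <= rank[u][ou] and rank[v][ou] <= rank[v][ov]:
                nxt = list(assign)
                nxt[u], nxt[v] = ov, ou
                nxt = tuple(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    found.add(nxt[agent])
                    queue.append(nxt)
    return found

# path network 0-1-2, strict preferences over objects 0..2, initial endowment i -> i
print(reachable_objects([(0, 1), (1, 2)],
                        [[2, 1, 0], [0, 2, 1], [1, 0, 2]],
                        start=[0, 1, 2], agent=0))      # {0, 1, 2}
```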

Designing Style Matching Conversational Agents

Title Designing Style Matching Conversational Agents
Authors Deepali Aneja, Rens Hoegen, Daniel McDuff, Mary Czerwinski
Abstract Advances in machine intelligence have enabled conversational interfaces that have the potential to radically change the way humans interact with machines. However, even with the progress in the abilities of these agents, there remain critical gaps in their capacity for natural interactions. One limitation is that the agents are often monotonic in behavior and do not adapt to their partner. We built two end-to-end conversational agents: a voice-based agent that can engage in naturalistic, multi-turn dialogue and align with the interlocutor’s conversational style, and a second, expressive, embodied conversational agent (ECA) that can recognize human behavior during open-ended conversations and automatically align its responses to the visual and conversational style of the other party. The embodied conversational agent leverages multimodal inputs to produce rich and perceptually valid vocal and facial responses (e.g., lip syncing and expressions) during the conversation. Based on empirical results from a set of user studies, we highlight several significant challenges in building such systems and provide design guidelines for multi-turn dialogue interactions using style adaptation for future research.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.07514v1
PDF https://arxiv.org/pdf/1910.07514v1.pdf
PWC https://paperswithcode.com/paper/designing-style-matching-conversational
Repo
Framework

Convolutional Dictionary Pair Learning Network for Image Representation Learning

Title Convolutional Dictionary Pair Learning Network for Image Representation Learning
Authors Zhao Zhang, Yulin Sun, Yang Wang, Zhengjun Zha, Shuicheng Yan, Meng Wang
Abstract Both Dictionary Learning (DL) and Convolutional Neural Networks (CNN) are powerful image representation learning systems based on different mechanisms and principles; however, whether they can be seamlessly integrated to improve performance is worth exploring. To address this issue, we propose a novel generalized end-to-end representation learning architecture, dubbed Convolutional Dictionary Pair Learning Network (CDPL-Net), which integrates the learning schemes of CNNs and dictionary pair learning into a unified framework. The architecture of CDPL-Net includes two convolutional/pooling layers and two dictionary pair learning (DPL) layers in the representation learning module. Besides, it uses two fully-connected layers as the multi-layer perceptron in the nonlinear classification module. In particular, the DPL layer can jointly formulate the discriminative synthesis and analysis representations by minimizing a batch-based reconstruction error over the flattened feature maps from the convolution/pooling layer. Moreover, the DPL layer applies an l1-norm to the analysis dictionary so that sparse representations can be delivered and the embedding process is robust to noise. To speed up the training of the DPL layer, efficient stochastic gradient descent is used. Extensive simulations on real databases show that our CDPL-Net can deliver enhanced performance over other state-of-the-art methods.
Tasks Dictionary Learning, Representation Learning
Published 2019-12-17
URL https://arxiv.org/abs/1912.12138v3
PDF https://arxiv.org/pdf/1912.12138v3.pdf
PWC https://paperswithcode.com/paper/convolutional-dictionary-pair-learning
Repo
Framework
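
A coarse PyTorch sketch of the layout described above may help: two convolution/pooling stages, a dictionary-pair style layer approximated by a linear analysis map P (with an l1 penalty) and a synthesis map D (with a reconstruction term), followed by a two-layer classifier. Layer sizes and the exact DPL formulation are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CDPLSketch(nn.Module):
    def __init__(self, n_classes=10, n_atoms=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_dim = 64 * 8 * 8                               # assumes 32x32 inputs
        self.P = nn.Linear(feat_dim, n_atoms, bias=False)   # analysis dictionary
        self.D = nn.Linear(n_atoms, feat_dim, bias=False)   # synthesis dictionary
        self.classifier = nn.Sequential(
            nn.Linear(n_atoms, 256), nn.ReLU(), nn.Linear(256, n_classes))

    def forward(self, x):
        f = self.features(x).flatten(1)            # flattened feature maps
        a = self.P(f)                              # analysis codes
        recon = self.D(a)                          # synthesis reconstruction
        # reconstruction term plus l1 sparsity on the analysis dictionary
        aux = F.mse_loss(recon, f) + 1e-4 * self.P.weight.abs().sum()
        return self.classifier(a), aux             # add `aux` to the classification loss

logits, aux = CDPLSketch()(torch.randn(2, 3, 32, 32))
```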

Physics-Informed Machine Learning Models for Predicting the Progress of Reactive-Mixing

Title Physics-Informed Machine Learning Models for Predicting the Progress of Reactive-Mixing
Authors M. K. Mudunuru, S. Karra
Abstract This paper presents a physics-informed machine learning (ML) framework to construct reduced-order models (ROMs) for reactive-transport quantities of interest (QoIs) based on high-fidelity numerical simulations. QoIs include species decay, product yield, and degree of mixing. The ROMs for QoIs are applied to quantify and understand how the chemical species evolve over time. First, high-resolution datasets for constructing ROMs are generated by solving anisotropic reaction-diffusion equations using a non-negative finite element formulation for different input parameters. The non-negative finite element formulation ensures that the species concentration is non-negative (which is needed for computing QoIs) on coarse computational grids even under high anisotropy. The reactive-mixing model input parameters are a time-scale associated with flipping of velocity, a spatial-scale controlling small/large vortex structures of velocity, a perturbation parameter of the vortex-based velocity, anisotropic dispersion strength/contrast, and molecular diffusion. Second, random forests, the F-test, and the mutual information criterion are used to evaluate the importance of model inputs/features with respect to QoIs. Third, Support Vector Machines (SVM) and Support Vector Regression (SVR) are used to construct ROMs based on the model inputs. Then, SVR-ROMs are used to predict the scaling of QoIs. Qualitatively, SVR-ROMs are able to describe the trends observed in the scaling law associated with QoIs. Fourth, the dependence of the scaling law’s exponent on model inputs/features is evaluated using $k$-means clustering. Finally, in terms of computational cost, the proposed SVM-ROMs and SVR-ROMs are $\mathcal{O}(10^7)$ times faster than running a high-fidelity numerical simulation for evaluating QoIs.
Tasks
Published 2019-08-28
URL https://arxiv.org/abs/1908.10929v1
PDF https://arxiv.org/pdf/1908.10929v1.pdf
PWC https://paperswithcode.com/paper/physics-informed-machine-learning-models-for
Repo
Framework
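
The ML stages of the pipeline map directly onto standard scikit-learn calls. The sketch below assumes the simulation inputs X and one QoI y are already tabulated; the data here is synthetic, whereas the real QoIs come from the non-negative finite element simulations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import f_regression, mutual_info_regression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((500, 5))           # e.g. flipping time-scale, vortex scale, perturbation, ...
y = X[:, 0] ** 2 + 0.5 * X[:, 3] + 0.05 * rng.standard_normal(500)   # stand-in QoI

# feature importance via random forest, F-test, and mutual information
rf_imp = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y).feature_importances_
f_scores, _ = f_regression(X, y)
mi_scores = mutual_info_regression(X, y, random_state=0)

# SVR-based reduced-order model for the QoI
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rom = SVR(kernel="rbf", C=10.0).fit(X_tr, y_tr)
print(rf_imp.round(2), f_scores.round(1), mi_scores.round(2), rom.score(X_te, y_te))
```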

Majorization Minimization Technique for Optimally Solving Deep Dictionary Learning

Title Majorization Minimization Technique for Optimally Solving Deep Dictionary Learning
Authors Vanika Singhal, Angshul Majumdar
Abstract The concept of deep dictionary learning has been proposed recently. Unlike shallow dictionary learning, which learns a single level of dictionary to represent the data, it uses multiple layers of dictionaries. So far, the problem could only be solved in a greedy fashion: a single layer of dictionary was learned in each stage, where the coefficients from the previous layer acted as inputs to the subsequent layer (only the first layer used the training samples as inputs). This was not optimal; there was feedback from shallower to deeper layers but not the other way. This work proposes an optimal solution to deep dictionary learning whereby all the layers of dictionaries are solved simultaneously. We employ the Majorization Minimization approach. Experiments have been carried out on benchmark datasets; the results show that optimal learning indeed improves over greedy piecemeal learning. Comparisons with other unsupervised deep learning tools (stacked denoising autoencoder, deep belief network, contractive autoencoder and K-sparse autoencoder) show that our method surpasses them in both accuracy and speed.
Tasks Denoising, Dictionary Learning
Published 2019-12-11
URL https://arxiv.org/abs/1912.10801v1
PDF https://arxiv.org/pdf/1912.10801v1.pdf
PWC https://paperswithcode.com/paper/majorization-minimization-technique-for
Repo
Framework
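
As a toy illustration of the joint (non-greedy) idea, the following sketch applies a standard quadratic-surrogate (majorization-minimization) step to each block of a two-layer linear factorization X ≈ D1 D2 Z in turn; it is not the authors' exact derivation and omits sparsity and nonlinearities.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 200))                         # data: 64-dim, 200 samples
D1 = 0.1 * rng.standard_normal((64, 48))
D2 = 0.1 * rng.standard_normal((48, 32))
Z = 0.1 * rng.standard_normal((32, 200))

def mm_step(A, W, B, X):
    """One majorize-minimize (quadratic surrogate) step on W for ||X - A @ W @ B||_F^2."""
    L = np.linalg.norm(A, 2) ** 2 * np.linalg.norm(B, 2) ** 2 + 1e-8   # Lipschitz bound
    return W + (A.T @ (X - A @ W @ B) @ B.T) / L

for _ in range(200):                                       # all layers updated jointly
    D1 = mm_step(np.eye(64), D1, D2 @ Z, X)
    D2 = mm_step(D1, D2, Z, X)
    Z = mm_step(D1 @ D2, Z, np.eye(200), X)
print("reconstruction error:", np.linalg.norm(X - D1 @ D2 @ Z))
```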

On the Transformation of Latent Space in Autoencoders

Title On the Transformation of Latent Space in Autoencoders
Authors Jaehoon Cha, Kyeong Soo Kim, Sanghyuk Lee
Abstract Noting the importance of the latent variables in inference and learning, we propose a novel framework for autoencoders based on a homeomorphic transformation of the latent variables, which can reduce the distance between vectors in the transformed space while preserving the topological properties of the original space. We investigate the effect of this latent space transformation on learning generative models and denoising corrupted data. The experimental results demonstrate that our generative and denoising models based on the proposed framework can provide better performance than conventional variational and denoising autoencoders thanks to the transformation. We evaluate the performance of the generative and denoising models in terms of the Hausdorff distance between the sets of training and processed (i.e., either generated or denoised) images, which objectively measures their differences, as well as through direct comparison of the visual characteristics of the processed images.
Tasks Denoising
Published 2019-01-24
URL https://arxiv.org/abs/1901.08479v2
PDF https://arxiv.org/pdf/1901.08479v2.pdf
PWC https://paperswithcode.com/paper/on-the-transformation-of-latent-space-in
Repo
Framework
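
The evaluation metric mentioned above is easy to reproduce in outline: a symmetric Hausdorff distance between the set of training images and the set of generated or denoised images, with each image flattened to a vector. The image sets below are random stand-ins.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(set_a: np.ndarray, set_b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between two sets of images treated as point sets."""
    a = set_a.reshape(len(set_a), -1)          # (n_images, n_pixels)
    b = set_b.reshape(len(set_b), -1)
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

train = np.random.rand(100, 28, 28)
generated = np.random.rand(100, 28, 28)
print(hausdorff(train, generated))
```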

Graph Representation for Face Analysis in Image Collections

Title Graph Representation for Face Analysis in Image Collections
Authors Domingo Mery, Florencia Valdes
Abstract Given an image collection of a social event with a huge number of pictures, it is very useful to have tools that can analyze how the individuals present in the collection interact with each other. In this paper, we propose an optimal graph representation based on the ‘connectivity’ of the individuals. The connectivity of a pair of subjects is a score that represents how ‘connected’ they are. It is estimated based on co-occurrence, closeness, facial expressions, and the orientation of the head when they are looking at each other. In our proposed graph, the nodes represent the subjects of the collection, and the edges correspond to their connectivities. The location of the nodes is estimated according to their connectivity (the closer the nodes, the more connected the subjects). Finally, we developed a graphical user interface in which we can click on a node (or an edge) to display the corresponding images of the collection in which the subject of the node (or the connected subjects) appears. We present relevant results by analyzing a wedding celebration, a sitcom video, a volleyball game and images extracted from Twitter given a hashtag. We believe that this tool can be very helpful for detecting the existing social relations in an image collection.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.11970v1
PDF https://arxiv.org/pdf/1911.11970v1.pdf
PWC https://paperswithcode.com/paper/graph-representation-for-face-analysis-in
Repo
Framework
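
Assuming pairwise connectivity scores have already been computed from co-occurrence, closeness, expressions and head orientation, the proposed graph can be sketched with networkx: subjects become nodes, scores become weighted edges, and a force-directed layout places strongly connected subjects closer together. The names and scores are illustrative.

```python
import networkx as nx
import matplotlib.pyplot as plt

connectivity = {("Alice", "Bob"): 0.9, ("Alice", "Carol"): 0.2, ("Bob", "Carol"): 0.5}

G = nx.Graph()
for (a, b), score in connectivity.items():
    G.add_edge(a, b, weight=score)

pos = nx.spring_layout(G, weight="weight", seed=0)   # higher weight pulls nodes closer
nx.draw(G, pos, with_labels=True,
        width=[3 * G[u][v]["weight"] for u, v in G.edges()])
plt.show()
```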

M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning

Title M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning
Authors Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis
Abstract Incremental learning aims at achieving good performance on new categories without forgetting old ones. Knowledge distillation has been shown critical in preserving the performance on old classes. Conventional methods, however, sequentially distill knowledge only from the last model, leading to performance degradation on the old classes in later incremental learning steps. In this paper, we propose a multi-model and multi-level knowledge distillation strategy. Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots. In addition, we incorporate an auxiliary distillation to further preserve knowledge encoded at the intermediate feature levels. To make the model more memory efficient, we adapt mask based pruning to reconstruct all previous models with a small memory footprint. Experiments on standard incremental learning benchmarks show that our method preserves the knowledge on old classes better and improves the overall performance over standard distillation techniques.
Tasks
Published 2019-04-03
URL http://arxiv.org/abs/1904.01769v1
PDF http://arxiv.org/pdf/1904.01769v1.pdf
PWC https://paperswithcode.com/paper/m2kd-multi-model-and-multi-level-knowledge
Repo
Framework
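
A condensed PyTorch sketch of the loss structure is given below: cross-entropy on the current task plus a KL distillation term against every stored snapshot on its own old classes and an L2 term on intermediate features. The temperature, weights, and the way snapshots expose their logits and features are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def m2kd_loss(logits, feats, labels, snapshots, T=2.0, alpha=1.0, beta=0.1):
    """Cross-entropy plus multi-model (per-snapshot) and multi-level (feature) distillation."""
    loss = F.cross_entropy(logits, labels)
    for snap in snapshots:                    # all previous model snapshots
        old_logits, old_feats, cls = snap["logits"], snap["feats"], snap["classes"]
        kd = F.kl_div(F.log_softmax(logits[:, cls] / T, dim=1),
                      F.softmax(old_logits / T, dim=1),
                      reduction="batchmean") * T * T
        feat = F.mse_loss(feats, old_feats)   # auxiliary feature-level distillation
        loss = loss + alpha * kd + beta * feat
    return loss
```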

Community Detection and Growth Potential Prediction Using the Stochastic Block Model and the Long Short-term Memory from Patent Citation Networks

Title Community Detection and Growth Potential Prediction Using the Stochastic Block Model and the Long Short-term Memory from Patent Citation Networks
Authors Kensei Nakai, Hirofumi Nonaka, Asahi Hentona, Yuki Kanai, Takeshi Sakumoto, Shotaro Kataoka, Elisa Claire Alemán Carreón, Toru Hiraoka
Abstract Scoring patent documents is very useful for technology management. However, conventional methods are based on static models and thus do not reflect the growth potential of a patent’s technology cluster: even if the cluster of a patent has no hope of growing, the patent is still recognized as important when its PageRank or another ranking score is high. Therefore, there is a need for citation network clustering together with the prediction of future citations. In our research, clustering of patent citation networks with the Stochastic Block Model (SBM) was performed with the aim of enabling corporate managers and investors to evaluate the scale and life cycle of a technology. As a result, we confirmed that the nested SBM is appropriate for graph clustering of patent citation networks. Also, when predicting the growth potential of each cluster with an LSTM, a high MAPE value was obtained and the direction accuracy exceeded 50%.
Tasks Community Detection, Graph Clustering
Published 2019-04-23
URL http://arxiv.org/abs/1904.12986v1
PDF http://arxiv.org/pdf/1904.12986v1.pdf
PWC https://paperswithcode.com/paper/190412986
Repo
Framework
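
The growth-prediction half can be outlined with a small PyTorch LSTM trained on a per-cluster citation time series (obtained beforehand with the nested SBM) to forecast the next value. The series, window size, and model width below are arbitrary stand-ins.

```python
import torch
import torch.nn as nn

series = torch.cumsum(torch.rand(60), dim=0)      # stand-in yearly citation counts of one cluster
window = 8
X = torch.stack([series[i:i + window] for i in range(len(series) - window)]).unsqueeze(-1)
y = series[window:].unsqueeze(-1)

class Forecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])              # predict the next point in the series

model = Forecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()
print("final training MSE:", loss.item())
```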

High efficiency rl agent

Title High efficiency rl agent
Authors Jingbin Liu, Xinyang Gu, Dexiang Zhang, Shuai Liu
Abstract Nowadays, model-free algorithms achieve state-of-the-art performance on many RL problems, but their low sample efficiency limits their practical usage. We combine model-based RL, the soft actor-critic framework, and curiosity, and propose an agent called RMC, offering a promising way to achieve good performance while maintaining data efficiency. We surpass the performance of SAC and achieve state-of-the-art performance, both in efficiency and stability. Meanwhile, we can solve POMDP problems and achieve strong generalization from MDP to POMDP.
Tasks
Published 2019-08-30
URL https://arxiv.org/abs/1908.11494v3
PDF https://arxiv.org/pdf/1908.11494v3.pdf
PWC https://paperswithcode.com/paper/high-efficiency-rl-agent
Repo
Framework
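
One ingredient named above, curiosity, is commonly implemented as an intrinsic reward equal to the prediction error of a learned forward dynamics model; the sketch below shows that piece only. How RMC combines it with SAC and model-based rollouts is not specified by the abstract, so the sizes and scaling are assumptions.

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, obs_dim))
    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))   # predicted next observation

def curiosity_bonus(model, obs, act, next_obs, scale=0.1):
    """Intrinsic reward: scaled prediction error of the learned dynamics model."""
    with torch.no_grad():
        err = (model(obs, act) - next_obs).pow(2).mean(dim=-1)
    return scale * err                                   # added to the environment reward
```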

Minimal Solutions for Relative Pose with a Single Affine Correspondence

Title Minimal Solutions for Relative Pose with a Single Affine Correspondence
Authors Banglei Guan, Ji Zhao, Zhang Li, Fang Sun, Friedrich Fraundorfer
Abstract In this paper we present four cases of minimal solutions for two-view relative pose estimation by exploiting the affine transformation between feature points and we demonstrate efficient solvers for these cases. It is shown, that under the planar motion assumption or with knowledge of a vertical direction, a single affine correspondence is sufficient to recover the relative camera pose. The four cases considered are two-view planar relative motion for calibrated cameras as a closed-form and a least-squares solution, a closed-form solution for unknown focal length and the case of a known vertical direction. These algorithms can be used efficiently for outlier detection within a RANSAC loop and for initial motion estimation. All the methods are evaluated on both synthetic data and real-world datasets from the KITTI benchmark. The experimental results demonstrate that our methods outperform comparable state-of-the-art methods in accuracy with the benefit of a reduced number of needed RANSAC iterations.
Tasks Motion Estimation, Outlier Detection, Pose Estimation
Published 2019-12-23
URL https://arxiv.org/abs/1912.10776v1
PDF https://arxiv.org/pdf/1912.10776v1.pdf
PWC https://paperswithcode.com/paper/minimal-solutions-for-relative-pose-with-a
Repo
Framework

Kernel-Free Image Deblurring with a Pair of Blurred/Noisy Images

Title Kernel-Free Image Deblurring with a Pair of Blurred/Noisy Images
Authors Chunzhi Gu, Xuequan Lu, Ying He, Chao Zhang
Abstract Complex blur, such as a mixture of space-variant and space-invariant blur, is hard to model mathematically and widely exists in real images. In the real world, a common type of blur occurs when capturing images in low-light environments. In this paper, we propose a novel image deblurring method that does not need to estimate blur kernels. We utilize a pair of images that can be easily acquired in low-light situations: (1) a blurred image taken with low shutter speed and low ISO noise, and (2) a noisy image captured with high shutter speed and high ISO noise. Specifically, the blurred image is first sliced into patches, and we extend the Gaussian mixture model (GMM) to model the underlying intensity distribution of each patch using the corresponding patches in the noisy image. We compute patch correspondences by analyzing the optical flow between the two images. The Expectation-Maximization (EM) algorithm is utilized to estimate the involved parameters in the GMM. To preserve sharp features, we add an additional bilateral term to the objective function in the M-step. We eventually add a detail layer to the deblurred image for refinement. Extensive experiments on both synthetic and real-world data demonstrate that our method outperforms state-of-the-art techniques in terms of robustness, visual quality and quantitative metrics. We will make our dataset and source code publicly available.
Tasks Deblurring, Optical Flow Estimation
Published 2019-03-26
URL https://arxiv.org/abs/1903.10667v3
PDF https://arxiv.org/pdf/1903.10667v3.pdf
PWC https://paperswithcode.com/paper/kernel-free-image-deblurring-with-a-pair-of
Repo
Framework
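
A rough sketch of the correspondence and GMM-modelling steps: dense optical flow between the blurred and noisy images aligns patch locations, and a small Gaussian mixture is fit to the intensities of each corresponding noisy patch. The deblurring itself (EM with the bilateral term and the detail layer) is omitted, and the patch size and component count are arbitrary.

```python
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

def patch_gmms(blurred_gray, noisy_gray, patch=16, components=3):
    """Fit a per-patch intensity GMM from the noisy image, aligned via dense optical flow.
    Both inputs are uint8 grayscale images of the same size."""
    flow = cv2.calcOpticalFlowFarneback(blurred_gray, noisy_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = blurred_gray.shape
    gmms = {}
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            dx, dy = flow[y:y + patch, x:x + patch].mean(axis=(0, 1))
            ny = int(np.clip(y + dy, 0, h - patch))
            nx = int(np.clip(x + dx, 0, w - patch))
            pixels = noisy_gray[ny:ny + patch, nx:nx + patch].reshape(-1, 1)
            gmms[(y, x)] = GaussianMixture(components, random_state=0).fit(pixels)
    return gmms   # per-patch intensity priors for a subsequent EM-based deblurring step
```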

Down-Scaling with Learned Kernels in Multi-Scale Deep Neural Networks for Non-Uniform Single Image Deblurring

Title Down-Scaling with Learned Kernels in Multi-Scale Deep Neural Networks for Non-Uniform Single Image Deblurring
Authors Dongwon Park, Jisoo Kim, Se Young Chun
Abstract The multi-scale approach has been used for blind image/video deblurring problems to yield excellent performance for both conventional and recent deep-learning-based state-of-the-art methods. Bicubic down-sampling is a typical choice for the multi-scale approach to reduce the spatial dimension after filtering with a fixed kernel. However, this fixed kernel may be sub-optimal, since it may destroy information important for reliable deblurring, such as strong edges. We propose convolutional neural network (CNN)-based down-scale methods for multi-scale, deep-learning-based, non-uniform single image deblurring. We argue that our CNN-based down-scaling effectively reduces the spatial dimension of the original image, while learned kernels with multiple channels may well preserve the details necessary for deblurring. For each scale, we adopt RCAN (Residual Channel Attention Networks) as a backbone network to further improve performance. Our proposed method yielded state-of-the-art performance on the GoPro dataset by a large margin, achieving 2.59dB higher PSNR than the current state-of-the-art method by Tao et al. Our proposed CNN-based down-scaling was the key factor in this excellent performance, since the performance of our network without it was 1.98dB lower. The same networks trained on the GoPro set were also evaluated on the large-scale Su dataset, where our proposed method yielded 1.15dB better PSNR than Tao et al.’s method. Qualitative comparisons on the Lai dataset also confirmed the superior performance of our proposed method over other state-of-the-art methods.
Tasks Deblurring
Published 2019-03-25
URL http://arxiv.org/abs/1903.10157v1
PDF http://arxiv.org/pdf/1903.10157v1.pdf
PWC https://paperswithcode.com/paper/down-scaling-with-learned-kernels-in-multi
Repo
Framework
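
A short PyTorch sketch contrasts the fixed bicubic down-sampling of a typical multi-scale pipeline with a learned strided-convolution down-scaling layer whose multiple output channels can preserve edge detail; the channel counts are illustrative and the full multi-scale RCAN-based network is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedDownscale(nn.Module):
    def __init__(self, in_ch=3, out_ch=16):
        super().__init__()
        # stride-2 convolution halves the spatial size with learned kernels
        self.down = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1)
    def forward(self, x):
        return self.down(x)

x = torch.randn(1, 3, 256, 256)
bicubic = F.interpolate(x, scale_factor=0.5, mode="bicubic", align_corners=False)
learned = LearnedDownscale()(x)
print(bicubic.shape, learned.shape)   # both half-resolution; the learned one has 16 channels
```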