Paper Group ANR 810
Semantic Segmentation for Compound figures
Title | Semantic Segmentation for Compound figures |
Authors | Weixin Jiang, Eric Schwenker, Maria Chan, Oliver Cossairt |
Abstract | Scientific literature contains large volumes of unstructured data, with over 30% of figures constructed as a combination of multiple images; these compound figures cannot be analyzed directly with existing information retrieval tools. In this paper, we propose a semantic segmentation approach for compound figure separation, decomposing the compound figures into “master images”. Each master image is one part of a compound figure governed by a subfigure label (typically “(a), (b), (c), etc.”). In this way, the separated subfigures can be easily associated with the description information in the caption. In particular, we propose an anchor-based master image detection algorithm, which leverages the correlation between master images and subfigure labels and locates the master images in a two-step manner. First, a subfigure label detector is built to extract the global layout information of the compound figure. Second, the layout information is combined with local features to locate the master images. We validate the effectiveness of the proposed method on our labeled testing dataset both quantitatively and qualitatively. |
Tasks | Information Retrieval, Semantic Segmentation |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07142v1 |
https://arxiv.org/pdf/1912.07142v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-segmentation-for-compound-figures |
Repo | |
Framework | |
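The two-step detection described above can be illustrated with a toy grouping step: given detected subfigure-label boxes and candidate image regions, each region is attached to its nearest label, and the union of a label's regions becomes a "master image" box. This is only a geometric sketch of the second step; the paper combines the label-derived layout with learned local features, and all coordinates below are made up.

```python
import numpy as np

def assign_regions_to_labels(label_boxes, region_boxes):
    """Toy second step of a two-step compound-figure separation:
    group candidate image regions around detected subfigure labels.

    Boxes are (x1, y1, x2, y2) arrays; box centers drive the assignment.
    """
    label_centers = (label_boxes[:, :2] + label_boxes[:, 2:]) / 2.0
    region_centers = (region_boxes[:, :2] + region_boxes[:, 2:]) / 2.0

    groups = {i: [] for i in range(len(label_boxes))}
    for r, c in enumerate(region_centers):
        # Nearest subfigure label wins; the paper fuses richer local
        # features, this uses plain Euclidean distance for illustration.
        nearest = int(np.argmin(np.linalg.norm(label_centers - c, axis=1)))
        groups[nearest].append(r)

    # A "master image" box is the union of the regions tied to one label.
    master_boxes = []
    for members in groups.values():
        if not members:
            continue
        boxes = region_boxes[members]
        master_boxes.append([boxes[:, 0].min(), boxes[:, 1].min(),
                             boxes[:, 2].max(), boxes[:, 3].max()])
    return np.array(master_boxes)

# Example: two labels "(a)", "(b)" and three candidate image regions.
labels = np.array([[10, 10, 40, 30], [300, 10, 330, 30]], dtype=float)
regions = np.array([[0, 40, 250, 300], [260, 40, 500, 300], [0, 310, 250, 500]], dtype=float)
print(assign_regions_to_labels(labels, regions))
```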
Automating concept-drift detection by self-evaluating predictive model degradation
Title | Automating concept-drift detection by self-evaluating predictive model degradation |
Authors | Tania Cerquitelli, Stefano Proto, Francesco Ventura, Daniele Apiletti, Elena Baralis |
Abstract | A key aspect of automating predictive machine learning is the capability of properly triggering the update of the trained model. To this aim, suitable automatic solutions to self-assess the prediction quality and the data distribution drift between the original training set and the new data have to be devised. In this paper, we propose a novel methodology to automatically detect prediction-quality degradation of machine learning models due to class-based concept drift, i.e., when new data contains samples that do not fit the set of class labels known by the currently-trained predictive model. Experiments on synthetic and real-world public datasets show the effectiveness of the proposed methodology in automatically detecting and describing concept drift caused by changes in the class-label data distributions. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08120v1 |
https://arxiv.org/pdf/1907.08120v1.pdf | |
PWC | https://paperswithcode.com/paper/automating-concept-drift-detection-by-self |
Repo | |
Framework | |
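As a hedged illustration of self-assessing degradation on unlabeled new data (not the authors' algorithm), one can monitor the distribution of the model's top-class probabilities and flag drift when it departs from a reference window collected at training time; the two-sample KS test and the alpha threshold below are illustrative choices.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_confidence_drift(reference_conf, new_conf, alpha=0.01):
    """Flag class-based concept drift when the distribution of top-class
    probabilities on new data departs from a reference distribution
    collected on held-out training data.

    Generic drift monitor, not the paper's method: the two-sample KS test
    and the alpha threshold are illustrative choices.
    """
    result = ks_2samp(reference_conf, new_conf)
    return result.pvalue < alpha, result.statistic, result.pvalue

# A well-fit model is usually confident on in-distribution data;
# samples from unseen classes tend to flatten the confidence distribution.
rng = np.random.default_rng(0)
reference = rng.beta(8, 2, size=2000)   # confidences clustered near 1
new_batch = rng.beta(2, 2, size=500)    # new classes -> flatter confidences
print(detect_confidence_drift(reference, new_batch))
```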
Multi-owner Secure Encrypted Search Using Searching Adversarial Networks
Title | Multi-owner Secure Encrypted Search Using Searching Adversarial Networks |
Authors | Kai Chen, Zhongrui Lin, Jian Wan, Lei Xu, Chungen Xu |
Abstract | Searchable symmetric encryption (SSE) for the multi-owner model draws much attention as it enables data users to perform searches over encrypted cloud data outsourced by data owners. However, implementing secure and precise queries, efficient search and flexible dynamic system maintenance at the same time in SSE remains a challenge. To address this, this paper proposes secure and efficient multi-keyword ranked search over encrypted cloud data for the multi-owner model based on searching adversarial networks. We exploit searching adversarial networks to achieve optimal pseudo-keyword padding, and obtain the optimal game equilibrium for query precision and privacy protection strength. A maximum-likelihood search balanced tree is generated by probabilistic learning, which achieves efficient search and brings the computational complexity close to $\mathcal{O}(\log N)$. In addition, we enable flexible dynamic system maintenance with a balanced index forest that makes full use of distributed computing. Compared with previous works, our solution maintains query precision above 95% while ensuring adequate privacy protection, and introduces low overhead on computation, communication and storage. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02784v2 |
https://arxiv.org/pdf/1908.02784v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-client-secure-encrypted-search-using |
Repo | |
Framework | |
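The pseudo-keyword padding trade-off mentioned above can be sketched without any of the cryptographic machinery: adding more dummy keywords to a query hides its intent better but admits more non-matching documents into the ranked result. The snippet below only illustrates that trade-off; choosing the padding optimally via searching adversarial networks is the paper's contribution and is not shown, and the vocabulary is invented.

```python
import random

def pad_query(real_keywords, vocabulary, n_dummy, seed=None):
    """Pad a search query with pseudo-keywords.

    More dummies hide the query intent better (privacy) but admit more
    non-matching documents into the ranked result (precision); n_dummy
    is simply a parameter here rather than a learned equilibrium.
    """
    rng = random.Random(seed)
    candidates = [w for w in vocabulary if w not in real_keywords]
    dummies = rng.sample(candidates, min(n_dummy, len(candidates)))
    padded = list(real_keywords) + dummies
    rng.shuffle(padded)
    return padded

vocab = ["cloud", "storage", "privacy", "index", "tree", "audit", "cache"]
print(pad_query(["privacy", "index"], vocab, n_dummy=2, seed=7))
```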
Speech-driven facial animation using polynomial fusion of features
Title | Speech-driven facial animation using polynomial fusion of features |
Authors | Triantafyllos Kefalas, Konstantinos Vougioukas, Yannis Panagakis, Stavros Petridis, Jean Kossaifi, Maja Pantic |
Abstract | Speech-driven facial animation involves using a speech signal to generate realistic videos of talking faces. Recent deep learning approaches to facial synthesis rely on extracting low-dimensional representations and concatenating them, followed by a decoding step of the concatenated vector. This accounts for only first-order interactions of the features and ignores higher-order interactions. In this paper we propose a polynomial fusion layer that models the joint representation of the encodings by a higher-order polynomial, with the parameters modelled by a tensor decomposition. We demonstrate the suitability of this approach through experiments on generated videos evaluated on a range of metrics on video quality, audiovisual synchronisation and generation of blinks. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.05833v2 |
https://arxiv.org/pdf/1912.05833v2.pdf | |
PWC | https://paperswithcode.com/paper/speech-driven-facial-animation-using |
Repo | |
Framework | |
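A polynomial fusion layer of the kind described above can be sketched as first-order projections plus a rank-constrained second-order interaction whose full tensor is never materialized. The paper fuses more encodings (audio, identity, noise) and its exact parameterization may differ; the sizes and the two-stream form below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_a, d_b, d_out, rank = 64, 32, 128, 16   # illustrative sizes

# First-order weights plus low-rank factors for the second-order term.
W_a = rng.normal(size=(d_out, d_a)) * 0.1
W_b = rng.normal(size=(d_out, d_b)) * 0.1
U_a = rng.normal(size=(rank, d_a)) * 0.1   # factor acting on encoding a
U_b = rng.normal(size=(rank, d_b)) * 0.1   # factor acting on encoding b
U_o = rng.normal(size=(d_out, rank)) * 0.1 # maps interactions to the output

def polynomial_fusion(a, b):
    """z = W_a a + W_b b + U_o ((U_a a) * (U_b b)).

    The elementwise product of the projected encodings realises a
    second-order (bilinear) interaction without ever forming the full
    interaction tensor; the rank bounds the parameter count.
    """
    first_order = W_a @ a + W_b @ b
    second_order = U_o @ ((U_a @ a) * (U_b @ b))
    return first_order + second_order

a = rng.normal(size=d_a)   # e.g. an audio-frame encoding
b = rng.normal(size=d_b)   # e.g. an identity encoding
print(polynomial_fusion(a, b).shape)   # (128,)
```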
Visual Appearance Based Person Retrieval in Unconstrained Environment Videos
Title | Visual Appearance Based Person Retrieval in Unconstrained Environment Videos |
Authors | Hiren Galiyawala, Mehul S Raval, Shivansh Dave |
Abstract | Visual appearance-based person retrieval is a challenging problem in surveillance. It uses attributes like height, cloth color, cloth type and gender to describe a human. Such attributes are known as soft biometrics. This paper proposes person retrieval from surveillance video using height, torso cloth type, torso cloth color and gender. The approach introduces adaptive torso patch extraction and bounding box regression to improve the retrieval. The algorithm uses a fine-tuned Mask R-CNN and DenseNet-169 for person detection and attribute classification, respectively. The performance is analyzed on the AVSS 2018 challenge II dataset, and it achieves an 11.35% improvement over the state of the art based on the average Intersection over Union (IoU) measure. |
Tasks | Human Detection, Person Retrieval |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14565v1 |
https://arxiv.org/pdf/1910.14565v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-appearance-based-person-retrieval-in |
Repo | |
Framework | |
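A hedged sketch of the retrieval pipeline follows: detect people, classify their soft biometrics, and keep detections that match the query. `detect_persons`, `classify_attributes` and `estimate_height` are hypothetical stand-ins for the paper's fine-tuned Mask R-CNN, DenseNet-169 classifier and height estimation; the adaptive torso patch extraction is assumed to live inside the attribute classifier.

```python
from dataclasses import dataclass

@dataclass
class Query:
    gender: str          # e.g. "male"
    torso_type: str      # e.g. "long-sleeve"
    torso_color: str     # e.g. "red"
    min_height_cm: float
    max_height_cm: float

def retrieve(frame, query, detect_persons, classify_attributes, estimate_height):
    """Return detections whose predicted soft biometrics match the query.

    detect_persons / classify_attributes / estimate_height are hypothetical
    callables standing in for the paper's fine-tuned Mask R-CNN, DenseNet-169
    attribute classifier and camera-calibrated height estimation.
    """
    matches = []
    for box, mask in detect_persons(frame):
        attrs = classify_attributes(frame, box, mask)   # torso patch handled inside
        height = estimate_height(box)
        if (attrs["gender"] == query.gender
                and attrs["torso_type"] == query.torso_type
                and attrs["torso_color"] == query.torso_color
                and query.min_height_cm <= height <= query.max_height_cm):
            matches.append(box)
    return matches
```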
Causal Mediation Analysis Leveraging Multiple Types of Summary Statistics Data
Title | Causal Mediation Analysis Leveraging Multiple Types of Summary Statistics Data |
Authors | Yongjin Park, Abhishek Sarkar, Khoi Nguyen, Manolis Kellis |
Abstract | Summary statistics of genome-wide association studies (GWAS) reveal causal relationships between millions of genetic markers and tens of thousands of phenotypes. However, the underlying biological mechanisms are yet to be elucidated. We can achieve the necessary interpretation of GWAS in a causal mediation framework, looking to establish a sparse set of mediators between genetic and downstream variables, but there are several challenges. Unlike existing methods, which rely on strong and unrealistic assumptions, we tackle practical challenges within a principled summary-based causal inference framework. We analyzed the proposed methods in extensive simulations generated from real-world genetic data. We demonstrated that only our approach can accurately recover causal genes, even without knowing actual individual-level data, despite the presence of competing non-causal trails. |
Tasks | Causal Inference |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08540v1 |
http://arxiv.org/pdf/1901.08540v1.pdf | |
PWC | https://paperswithcode.com/paper/causal-mediation-analysis-leveraging-multiple |
Repo | |
Framework | |
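A generic two-layer mediation model of the kind the abstract alludes to (not the paper's exact formulation) can be written as

$$\mathbf{m}_k = \mathbf{X}\boldsymbol{\alpha}_k + \boldsymbol{\epsilon}_k, \qquad \mathbf{y} = \sum_{k=1}^{K} \mathbf{m}_k \beta_k + \mathbf{X}\boldsymbol{\theta} + \boldsymbol{\epsilon},$$

where $\mathbf{X}$ holds genotypes, the $\mathbf{m}_k$ are candidate mediators (e.g., gene expression), the sparse $\beta_k$ are mediation effects, and $\boldsymbol{\theta}$ captures direct, non-mediated ("competing") effects. Summary-based inference then works from published GWAS and QTL statistics, which are proportional to $\mathbf{X}^\top\mathbf{y}$ and $\mathbf{X}^\top\mathbf{m}_k$, instead of individual-level data.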
Gradient Regularized Budgeted Boosting
Title | Gradient Regularized Budgeted Boosting |
Authors | Zhixiang Eddie Xu, Matt J. Kusner, Kilian Q. Weinberger, Alice X. Zheng |
Abstract | As machine learning transitions increasingly towards real-world applications, controlling the test-time cost of algorithms becomes more and more crucial. Recent work, such as the Greedy Miser and Speedboost, incorporates test-time budget constraints into the training procedure and learns classifiers that provably stay within budget (in expectation). However, so far, these algorithms are limited to the supervised learning scenario where sufficient amounts of labeled data are available. In this paper we investigate the common scenario where labeled data is scarce but unlabeled data is available in abundance. We propose an algorithm that leverages the unlabeled data (through Laplace smoothing) and learns classifiers with budget constraints. Our model, based on gradient boosted regression trees (GBRT), is, to our knowledge, the first algorithm for semi-supervised budgeted learning. |
Tasks | |
Published | 2019-01-13 |
URL | http://arxiv.org/abs/1901.04065v3 |
http://arxiv.org/pdf/1901.04065v3.pdf | |
PWC | https://paperswithcode.com/paper/gradient-regularized-budgeted-boosting |
Repo | |
Framework | |
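In the spirit of Greedy Miser-style budgeted boosting, a stage-selection rule can penalize a candidate tree's fit to the current gradients by the test-time cost of any features it would newly acquire. The sketch below shows only that selection step with hypothetical inputs; the paper's Laplace smoothing over unlabeled data is not shown.

```python
from collections import namedtuple

def select_next_tree(candidate_trees, gradient_fit, feature_cost, used_features, lam):
    """Pick the boosting stage that best trades gradient fit against the extra
    test-time feature-acquisition cost it would incur (hypothetical inputs).

    candidate_trees : trees, each exposing .features (a frozenset of feature names)
    gradient_fit    : dict tree -> how well the tree fits the current residuals
    feature_cost    : dict feature -> acquisition cost at test time
    lam             : budget trade-off weight (larger = cheaper models)
    """
    def score(tree):
        new_features = tree.features - used_features
        return gradient_fit[tree] - lam * sum(feature_cost[f] for f in new_features)

    best = max(candidate_trees, key=score)
    used_features |= best.features   # features paid for once are free for later stages
    return best

Tree = namedtuple("Tree", ["name", "features"])
t1, t2 = Tree("t1", frozenset({"x1"})), Tree("t2", frozenset({"x2", "x3"}))
fit = {t1: 0.30, t2: 0.45}
cost = {"x1": 1.0, "x2": 5.0, "x3": 5.0}
print(select_next_tree([t1, t2], fit, cost, used_features=set(), lam=0.05).name)  # t1
```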
Augmenting learning using symmetry in a biologically-inspired domain
Title | Augmenting learning using symmetry in a biologically-inspired domain |
Authors | Shruti Mishra, Abbas Abdolmaleki, Arthur Guez, Piotr Trochim, Doina Precup |
Abstract | Invariances to translation, rotation and other spatial transformations are a hallmark of the laws of motion, and have widespread use in the natural sciences to reduce the dimensionality of systems of equations. In supervised learning, such as in image classification tasks, rotation, translation and scale invariances are used to augment training datasets. In this work, we use data augmentation in a similar way, exploiting symmetry in the quadruped domain of the DeepMind control suite (Tassa et al. 2018) to add to the trajectories experienced by the actor in the actor-critic algorithm of Abdolmaleki et al. (2018). In a data-limited regime, the agent using a set of experiences augmented through symmetry is able to learn faster. Our approach can be used to inject knowledge of invariances in the domain and task to augment learning in robots, and more generally, to speed up learning in realistic robotics applications. |
Tasks | Data Augmentation, Image Classification |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00528v1 |
https://arxiv.org/pdf/1910.00528v1.pdf | |
PWC | https://paperswithcode.com/paper/augmenting-learning-using-symmetry-in-a |
Repo | |
Framework | |
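A minimal sketch of symmetry-based augmentation for a left-right symmetric walker: every stored transition gets a mirrored twin obtained by permuting left/right entries and flipping lateral signs. The index maps and sign vectors are domain-specific and hypothetical here; the real ones depend on the quadruped's observation and action layout.

```python
import numpy as np

def mirror_transition(state, action, reward, next_state,
                      state_perm, state_sign, action_perm, action_sign):
    """Build the mirrored copy of one transition.

    state_perm / action_perm permute left<->right entries and
    state_sign / action_sign flip lateral components; both index maps are
    domain-specific and hypothetical here.
    """
    mirror = lambda x, perm, sign: np.asarray(x)[perm] * sign
    return (mirror(state, state_perm, state_sign),
            mirror(action, action_perm, action_sign),
            reward,                                   # symmetric task: reward unchanged
            mirror(next_state, state_perm, state_sign))

def augment_with_symmetry(transitions, **index_maps):
    """Return the original transitions plus one mirrored copy of each."""
    return transitions + [mirror_transition(*t, **index_maps) for t in transitions]
```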
A Capsule-unified Framework of Deep Neural Networks for Graphical Programming
Title | A Capsule-unified Framework of Deep Neural Networks for Graphical Programming |
Authors | Yujian Li, Chuanhui Shan |
Abstract | Recently, the growth of deep learning has produced a large number of deep neural networks. How to describe these networks in a unified way is becoming an important issue. We first formalize neural networks in a mathematical definition, give their directed graph representations, and prove a generation theorem about the induced networks of connected directed acyclic graphs. Then, using the concept of a capsule to extend neural networks, we set up a capsule-unified framework for deep learning, including a mathematical definition of capsules, an induced model for capsule networks and a universal backpropagation algorithm for training them. Finally, we discuss potential applications of the framework to graphical programming with standard graphical symbols of capsules, neurons, and connections. |
Tasks | |
Published | 2019-03-07 |
URL | http://arxiv.org/abs/1903.04982v2 |
http://arxiv.org/pdf/1903.04982v2.pdf | |
PWC | https://paperswithcode.com/paper/a-capsule-unified-framework-of-deep-neural |
Repo | |
Framework | |
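The directed-graph view of networks described above can be sketched as a DAG whose nodes carry transformations (capsules in the broad sense used by the paper) and whose forward pass evaluates nodes in topological order. This is a generic induced-network evaluation, not the paper's formal definitions.

```python
import numpy as np
from graphlib import TopologicalSorter

def forward(graph, ops, inputs):
    """Evaluate a network given as a DAG.

    graph  : dict node -> set of predecessor nodes
    ops    : dict node -> callable taking the list of predecessor outputs
    inputs : dict source node -> tensor
    """
    values = dict(inputs)
    for node in TopologicalSorter(graph).static_order():
        if node in values:            # source nodes already hold their input
            continue
        values[node] = ops[node]([values[p] for p in graph[node]])
    return values

# Tiny example: two inputs, a sum "capsule", then a ReLU "capsule".
graph = {"x": set(), "y": set(), "add": {"x", "y"}, "relu": {"add"}}
ops = {"add": lambda xs: xs[0] + xs[1], "relu": lambda xs: np.maximum(xs[0], 0)}
print(forward(graph, ops, {"x": np.array([1., -3.]), "y": np.array([0.5, 1.0])}))
```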
Reward-Based Deception with Cognitive Bias
Title | Reward-Based Deception with Cognitive Bias |
Authors | Bo Wu, Murat Cubuktepe, Suda Bharadwaj, Ufuk Topcu |
Abstract | Deception plays a key role in adversarial or strategic interactions for the purpose of self-defence and survival. This paper introduces a general framework and solution to address deception. Most existing approaches for deception consider obfuscating crucial information to rational adversaries with abundant memory and computation resources. In this paper, we consider deceiving adversaries with bounded rationality and in terms of expected rewards. This problem is commonly encountered in many applications, especially those involving human adversaries. Leveraging the cognitive bias of humans in reward evaluation under stochastic outcomes, we introduce a framework to optimally assign a limited quantity of resources to defend against human adversaries. Modeling such cognitive biases follows the so-called prospect theory from the behavioral psychology literature. Then, we formulate the resource allocation problem as a signomial program to minimize the defender’s cost in an environment modeled as a Markov decision process. We use police patrol hour assignment as an illustrative example and provide detailed simulation results based on real-world data. |
Tasks | |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11454v1 |
http://arxiv.org/pdf/1904.11454v1.pdf | |
PWC | https://paperswithcode.com/paper/reward-based-deception-with-cognitive-bias |
Repo | |
Framework | |
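The cognitive-bias model the abstract relies on is prospect theory; the standard Tversky–Kahneman value and probability-weighting functions are sketched below. The parameter values are the commonly cited empirical estimates, not necessarily those used in the paper.

```python
import numpy as np

def pt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theoretic value: concave for gains, convex and steeper for losses."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, np.abs(x) ** alpha, -lam * np.abs(x) ** beta)

def pt_weight(p, gamma=0.61):
    """Inverse-S probability weighting: small probabilities are overweighted."""
    p = np.asarray(p, dtype=float)
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# A biased adversary's perceived value of a lottery (outcomes, probabilities):
outcomes, probs = np.array([10.0, -5.0]), np.array([0.1, 0.9])
print(float(np.sum(pt_weight(probs) * pt_value(outcomes))))
```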
Logic and the $2$-Simplicial Transformer
Title | Logic and the $2$-Simplicial Transformer |
Authors | James Clift, Dmitry Doryn, Daniel Murfet, James Wallbridge |
Abstract | We introduce the $2$-simplicial Transformer, an extension of the Transformer which includes a form of higher-dimensional attention generalising the dot-product attention, and uses this attention to update entity representations with tensor products of value vectors. We show that this architecture is a useful inductive bias for logical reasoning in the context of deep reinforcement learning. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00668v1 |
https://arxiv.org/pdf/1909.00668v1.pdf | |
PWC | https://paperswithcode.com/paper/logic-and-the-2-simplicial-transformer |
Repo | |
Framework | |
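A heavily simplified sketch of attention over 2-simplices: each query scores a pair of keys with a trilinear form and aggregates a product of the two corresponding values. The paper's layer uses a particular (scalar-triple-product-style) trilinear form and tensor-product values, so treat this as an illustration of the idea rather than the published architecture.

```python
import numpy as np

def two_simplicial_attention(Q, K1, K2, V1, V2):
    """Attention over pairs (j, k) of entities for each query i.

    score(i, j, k) = sum_d Q[i,d] * K1[j,d] * K2[k,d]   (a simple trilinear form)
    update(i)      = sum_{j,k} softmax(score)[i,j,k] * (V1[j] * V2[k])
    """
    scores = np.einsum("id,jd,kd->ijk", Q, K1, K2)
    flat = scores.reshape(len(Q), -1)
    weights = np.exp(flat - flat.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    weights = weights.reshape(len(Q), len(K1), len(K2))
    return np.einsum("ijk,jd,kd->id", weights, V1, V2)

rng = np.random.default_rng(0)
n, d = 5, 8
out = two_simplicial_attention(*(rng.normal(size=(n, d)) for _ in range(5)))
print(out.shape)   # (5, 8)
```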
A Statistical Model for Dynamic Networks with Neural Variational Inference
Title | A Statistical Model for Dynamic Networks with Neural Variational Inference |
Authors | Shubham Gupta, Rui M. Castro, Ambedkar Dukkipati |
Abstract | In this paper we propose a statistical model for dynamically evolving networks, together with a variational inference approach. Our model, which we call Dynamic Latent Attribute Interaction Model (DLAIM), encodes edge dependencies across different time snapshots. It represents nodes via latent attributes and uses attribute interaction matrices to model the presence of edges. Both are allowed to evolve with time, thus allowing us to capture the dynamics of the network. We develop a neural network based variational inference procedure that provides a suitable way to learn the model parameters. The main strengths of DLAIM are: (i) it is flexible as it does not impose strict assumptions on network evolution unlike existing approaches, (ii) it applies to both directed as well as undirected networks, and more importantly, (iii) learned node attributes and interaction matrices may be interpretable and therefore provide insights on the mechanisms behind network evolution. Experiments on real-world networks for the task of link forecasting demonstrate the superior performance of our model as compared to existing approaches. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11455v1 |
https://arxiv.org/pdf/1911.11455v1.pdf | |
PWC | https://paperswithcode.com/paper/a-statistical-model-for-dynamic-networks-with |
Repo | |
Framework | |
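A hedged sketch of the kind of edge model the abstract describes: time-varying latent attributes and an attribute-interaction matrix yield per-snapshot edge probabilities. The exact parameterization, priors and the neural variational training procedure are not reproduced.

```python
import numpy as np

def edge_probabilities(Z_t, W_t, directed=True):
    """P(edge i -> j at time t) = sigmoid(z_i(t)^T W(t) z_j(t)).

    Z_t : (n_nodes, k) latent attributes at snapshot t
    W_t : (k, k) attribute-interaction matrix at snapshot t
    """
    logits = Z_t @ W_t @ Z_t.T
    if not directed:
        logits = (logits + logits.T) / 2.0   # symmetrise for undirected graphs
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(1)
Z, W = rng.normal(size=(4, 3)), rng.normal(size=(3, 3))
print(edge_probabilities(Z, W).round(2))
```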
KLDivNet: An unsupervised neural network for multi-modality image registration
Title | KLDivNet: An unsupervised neural network for multi-modality image registration |
Authors | Yechong Huang, Tao Song, Jiahang Xu, Yinan Chen, Xiahai Zhuang |
Abstract | Multi-modality image registration is one of the most important processes in medical image analysis. Recently, convolutional neural networks (CNNs) have shown significant potential in deformable registration. However, the lack of voxel-wise ground truth challenges the training of CNNs for accurate registration. In this work, we propose a cross-modality similarity metric, based on the KL-divergence of image variables, and implement an efficient estimation method using a CNN. This estimation network, referred to as KLDivNet, can be trained in an unsupervised manner. We then embed the KLDivNet into a registration network to achieve unsupervised deformable registration for multi-modality images. We employed three datasets, i.e., AAL Brain, LiTS Liver and Hospital Liver, with both intra- and inter-modality image registration tasks for validation. Results showed that our similarity metric was effective, and the proposed registration network delivered superior performance compared to the state-of-the-art methods. |
Tasks | Image Registration, Medical Image Registration |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08767v2 |
https://arxiv.org/pdf/1908.08767v2.pdf | |
PWC | https://paperswithcode.com/paper/mutual-information-neural-estimation-in-cnn |
Repo | |
Framework | |
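A hedged sketch of the overall unsupervised objective: predict a deformation, warp the moving image, score the pair with a learned cross-modality similarity, and regularize the deformation. `registration_net`, `similarity_net` and `warp` are placeholders, and the smoothness term and its weight are assumptions.

```python
def registration_loss(fixed, moving, registration_net, similarity_net, warp, lam=0.01):
    """Unsupervised multi-modality registration objective (sketch).

    registration_net : predicts a dense deformation field phi from (fixed, moving)
    similarity_net   : learned cross-modality similarity (e.g. a KL-divergence
                       estimator trained without supervision); higher = more similar
    warp             : resamples the moving image through phi
    lam              : weight of the deformation-smoothness penalty (assumed)
    """
    phi = registration_net(fixed, moving)
    warped = warp(moving, phi)
    similarity = similarity_net(fixed, warped)
    # Penalize spatial gradients of the deformation so it stays smooth.
    smoothness = (((phi[..., 1:, :] - phi[..., :-1, :]) ** 2).mean()
                  + ((phi[..., :, 1:] - phi[..., :, :-1]) ** 2).mean())
    return -similarity + lam * smoothness
```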
Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems
Title | Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems |
Authors | Mark D. McDonnell, Hesham Mostafa, Runchun Wang, Andre van Schaik |
Abstract | Batch-normalization (BN) layers are thought to be an integrally important layer type in today’s state-of-the-art deep convolutional neural networks for computer vision tasks such as classification and detection. However, BN layers introduce complexity and computational overheads that are highly undesirable for training and/or inference on low-power custom hardware implementations of real-time embedded vision systems such as UAVs, robots and Internet of Things (IoT) devices. They are also problematic when batch sizes need to be very small during training, and innovations such as residual connections introduced more recently than BN layers could potentially have lessened their impact. In this paper we aim to quantify the benefits BN layers offer in image classification networks, in comparison with alternative choices. In particular, we study networks that use shifted-ReLU layers instead of BN layers. We found, following experiments with wide residual networks applied to the ImageNet, CIFAR-10 and CIFAR-100 image classification datasets, that BN layers do not consistently offer a significant advantage. We found that the accuracy margin offered by BN layers depends on the dataset, the network size, and the bit-depth of weights. We conclude that in situations where BN layers are undesirable due to speed, memory or complexity costs, using shifted-ReLU layers instead should be considered; we found they can offer advantages in all these areas, and often do not impose a significant accuracy cost. |
Tasks | Image Classification |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06916v2 |
https://arxiv.org/pdf/1907.06916v2.pdf | |
PWC | https://paperswithcode.com/paper/single-bit-per-weight-deep-convolutional |
Repo | |
Framework | |
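A minimal sketch, assuming one common form of a shifted ReLU (subtract a constant so post-activation outputs are roughly zero-mean; the paper's exact shift may differ) and sign-plus-scale single-bit weights:

```python
import numpy as np

SHIFT = 1.0 / np.sqrt(2.0 * np.pi)   # E[max(z, 0)] for z ~ N(0, 1); assumed shift

def shifted_relu(x):
    """ReLU whose output is shifted towards zero mean, standing in for the
    re-centering that batch normalization would otherwise provide."""
    return np.maximum(x, 0.0) - SHIFT

def binarize_weights(w):
    """Single-bit-per-weight approximation: the sign of each weight times one
    per-tensor scale chosen to preserve the mean magnitude."""
    return np.abs(w).mean() * np.sign(w)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
x = rng.normal(size=4)
print(shifted_relu(binarize_weights(w) @ x))
```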
Illumination Normalization via Merging Locally Enhanced Textures for Robust Face Recognition
Title | Illumination Normalization via Merging Locally Enhanced Textures for Robust Face Recognition |
Authors | Chaobing Zheng, Shiqian Wu, Wangming Xu, Shoulie Xie |
Abstract | In order to improve the accuracy of face recognition under varying illumination conditions, a local texture enhanced illumination normalization method based on fusion of differential filtering images (FDFI-LTEIN) is proposed to weaken the influence caused by illumination changes. Firstly, the dynamic range of the face image in dark or shadowed regions is expanded by a logarithmic transformation. Then, the globally contrast-enhanced face image is convolved with difference of Gaussian filters and difference of bilateral filters, and the filtered images are weighted and merged using a coefficient selection rule based on the standard deviation (SD) of the image, which can enhance image texture information while filtering out most noise. Finally, local contrast equalization (LCE) is performed on the fused face image to reduce the influence caused by over- or under-saturated pixel values in highlight or dark regions. Experimental results on the Extended Yale B face database and the CMU PIE face database demonstrate that the proposed method is more robust to illumination changes and achieves higher recognition accuracy when compared with other illumination normalization methods and a deep CNN-based illumination-invariant face recognition method. |
Tasks | Face Recognition, Robust Face Recognition |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.03904v1 |
https://arxiv.org/pdf/1905.03904v1.pdf | |
PWC | https://paperswithcode.com/paper/illumination-normalization-via-merging |
Repo | |
Framework | |
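A rough sketch of the pipeline the abstract outlines, using OpenCV filters: log transform, difference-of-Gaussian and difference-of-bilateral filtering, SD-weighted fusion, and a local-standardization stand-in for LCE. All parameters and the fusion rule below are illustrative, not the paper's.

```python
import cv2
import numpy as np

def normalize_illumination(gray, s1=1.0, s2=2.0):
    """Rough sketch of FDFI-LTEIN-style normalization on a grayscale image.

    Filter parameters and the SD-based weighting are illustrative; the paper's
    coefficient-selection rule and LCE step are more involved.
    """
    img = np.log1p(gray.astype(np.float32))                  # expand dark regions

    dog = cv2.GaussianBlur(img, (0, 0), s1) - cv2.GaussianBlur(img, (0, 0), s2)
    dob = (cv2.bilateralFilter(img, -1, 0.3, 5)
           - cv2.bilateralFilter(img, -1, 1.0, 15))          # difference of bilateral filters

    # Weight each band by its standard deviation (more texture -> more weight).
    w_dog, w_dob = np.std(dog), np.std(dob)
    fused = (w_dog * dog + w_dob * dob) / (w_dog + w_dob + 1e-8)

    # Local contrast equalization approximated by local standardization.
    mean = cv2.GaussianBlur(fused, (0, 0), 3)
    var = cv2.GaussianBlur((fused - mean) ** 2, (0, 0), 3)
    return (fused - mean) / np.sqrt(var + 1e-6)
```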