October 18, 2019

Paper Group ANR 530

Identifiability of Generalized Hypergeometric Distribution (GHD) Directed Acyclic Graphical Models. Human Semantic Parsing for Person Re-identification. Polyphonic Sound Event Detection by using Capsule Neural Network. Adaptive Ranking Based Constraint Handling for Explicitly Constrained Black-Box Optimization. Momentum-Space Renormalization Group …

Identifiability of Generalized Hypergeometric Distribution (GHD) Directed Acyclic Graphical Models

Title Identifiability of Generalized Hypergeometric Distribution (GHD) Directed Acyclic Graphical Models
Authors Gunwoong Park, Hyewon Park
Abstract We introduce a new class of identifiable DAG models where the conditional distribution of each node given its parents belongs to a family of generalized hypergeometric distributions (GHD). This family includes many discrete distributions, such as the binomial, Beta-binomial, negative binomial, Poisson, and hyper-Poisson. We prove that if the data are drawn from the new class of DAG models, one can fully identify the graph structure. We further present a reliable and polynomial-time algorithm that recovers the graph from finitely many data. We show through theoretical results and numerical experiments that our algorithm is statistically consistent in high-dimensional settings (p>n) if the indegree of the graph is bounded, and outperforms state-of-the-art DAG learning algorithms.
Tasks
Published 2018-05-08
URL https://arxiv.org/abs/1805.02848v3
PDF https://arxiv.org/pdf/1805.02848v3.pdf
PWC https://paperswithcode.com/paper/learning-large-scale-generalized
Repo
Framework

Human Semantic Parsing for Person Re-identification

Title Human Semantic Parsing for Person Re-identification
Authors Mahdi M. Kalayeh, Emrah Basaran, Muhittin Gokmen, Mustafa E. Kamasak, Mubarak Shah
Abstract Person re-identification is a challenging task mainly due to factors such as background clutter, pose, illumination and camera point of view variations. These elements hinder the process of extracting robust and discriminative representations, hence preventing different identities from being successfully distinguished. To improve the representation learning, usually, local features from human body parts are extracted. However, the common practice for such a process has been based on bounding box part detection. In this paper, we propose to adopt human semantic parsing which, due to its pixel-level accuracy and capability of modeling arbitrary contours, is naturally a better alternative. Our proposed SPReID integrates human semantic parsing in person re-identification and not only considerably outperforms its counter baseline, but achieves state-of-the-art performance. We also show that by employing a simple yet effective training strategy, standard popular deep convolutional architectures such as Inception-V3 and ResNet-152, with no modification, while operating solely on the full image, can dramatically outperform the current state of the art. Our proposed methods improve state-of-the-art person re-identification on: Market-1501 by ~17% in mAP and ~6% in rank-1, CUHK03 by ~4% in rank-1 and DukeMTMC-reID by ~24% in mAP and ~10% in rank-1.
Tasks Person Re-Identification, Representation Learning, Semantic Parsing
Published 2018-03-31
URL http://arxiv.org/abs/1804.00216v1
PDF http://arxiv.org/pdf/1804.00216v1.pdf
PWC https://paperswithcode.com/paper/human-semantic-parsing-for-person-re
Repo
Framework

Polyphonic Sound Event Detection by using Capsule Neural Network

Title Polyphonic Sound Event Detection by using Capsule Neural Network
Authors Fabio Vesperini, Leonardo Gabrielli, Emanuele Principi, Stefano Squartini
Abstract Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Nowadays, Deep Learning offers valuable techniques for this goal such as Convolutional Neural Networks (CNNs). The Capsule Neural Network (CapsNet) architecture has been recently introduced in the image processing field with the intent to overcome some of the known limitations of CNNs, specifically regarding the scarce robustness to affine transformations (i.e., perspective, size, orientation) and the detection of overlapped images. This motivated the authors to employ CapsNets to deal with the polyphonic-SED task, in which multiple sound events occur simultaneously. Specifically, we propose to exploit the capsule units to represent a set of distinctive properties for each individual sound event. Capsule units are connected through a so-called dynamic routing that encourages learning part-whole relationships and improves the detection performance in a polyphonic context. This paper reports extensive evaluations carried out on three publicly available datasets, showing how the CapsNet-based algorithm not only outperforms standard CNNs but also achieves the best results with respect to state-of-the-art algorithms.
Tasks Sound Event Detection
Published 2018-10-15
URL http://arxiv.org/abs/1810.06325v1
PDF http://arxiv.org/pdf/1810.06325v1.pdf
PWC https://paperswithcode.com/paper/polyphonic-sound-event-detection-by-using
Repo
Framework

Adaptive Ranking Based Constraint Handling for Explicitly Constrained Black-Box Optimization

Title Adaptive Ranking Based Constraint Handling for Explicitly Constrained Black-Box Optimization
Authors Naoki Sakamoto, Youhei Akimoto
Abstract A novel explicit constraint handling technique for the covariance matrix adaptation evolution strategy (CMA-ES) is proposed. The proposed constraint handling exhibits two invariance properties. One is the invariance to arbitrary element-wise increasing transformation of the objective and constraint functions. The other is the invariance to arbitrary affine transformation of the search space. The proposed technique virtually transforms a constrained optimization problem into an unconstrained optimization problem by considering an adaptive weighted sum of the ranking of the objective function values and the ranking of the constraint violations that are measured by the Mahalanobis distance between each candidate solution and its projection onto the boundary of the constraints. Simulation results are presented and show that the CMA-ES with the proposed constraint handling exhibits the affine invariance and performs similarly to the CMA-ES on unconstrained counterparts.
Tasks
Published 2018-11-02
URL https://arxiv.org/abs/1811.00764v2
PDF https://arxiv.org/pdf/1811.00764v2.pdf
PWC https://paperswithcode.com/paper/ranking-based-linear-constraint-handling
Repo
Framework
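
The ranking idea in the abstract above can be illustrated with a minimal sketch: rank the candidates by objective value and by constraint violation separately, then combine the two ranks with a weight. The function names, the fixed weight `alpha`, and the plain violation values are assumptions made for illustration. The paper adapts the weight and measures violation with a Mahalanobis distance, and neither is reproduced here.

```python
def ranks(values):
    """Return the 0-based rank of each value (0 = smallest); ties broken by index."""
    order = sorted(range(len(values)), key=lambda i: (values[i], i))
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def total_ranking(objective, violation, alpha=1.0):
    """Combine objective ranks and violation ranks as rank_f + alpha * rank_v.

    alpha is a fixed, hypothetical weight here; the paper adapts it.
    """
    rank_f = ranks(objective)
    rank_v = ranks(violation)
    return [rf + alpha * rv for rf, rv in zip(rank_f, rank_v)]


# Three candidates: the second has the best objective but the largest violation.
objective = [3.0, 1.0, 2.0]
violation = [0.0, 5.0, 0.5]
scores = total_ranking(objective, violation, alpha=2.0)  # [2.0, 4.0, 3.0]
best = min(range(len(scores)), key=lambda i: scores[i])  # index 0
```

Because only ranks enter the score, replacing `objective` with any strictly increasing transformation of it (for example, cubing each value) leaves `scores` unchanged, which is the first invariance property the abstract mentions.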

Momentum-Space Renormalization Group Transformation in Bayesian Image Modeling by Gaussian Graphical Model

Title Momentum-Space Renormalization Group Transformation in Bayesian Image Modeling by Gaussian Graphical Model
Authors Kazuyuki Tanaka, Masamichi Nakamura, Shun Kataoka, Masayuki Ohzeki, Muneki Yasuda
Abstract A new Bayesian modeling method is proposed by combining the maximization of the marginal likelihood with a momentum-space renormalization group transformation for Gaussian graphical models. Moreover, we present a scheme for computing the statistical averages of hyperparameters and mean square errors in our proposed method based on a momentum-space renormalization transformation.
Tasks
Published 2018-03-20
URL http://arxiv.org/abs/1804.00727v1
PDF http://arxiv.org/pdf/1804.00727v1.pdf
PWC https://paperswithcode.com/paper/momentum-space-renormalization-group
Repo
Framework

Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour

Title Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour
Authors Emilia Gómez, Carlos Castillo, Vicky Charisi, Verónica Dahl, Gustavo Deco, Blagoj Delipetrev, Nicole Dewandre, Miguel Ángel González-Ballester, Fabien Gouyon, José Hernández-Orallo, Perfecto Herrera, Anders Jonsson, Ansgar Koene, Martha Larson, Ramón López de Mántaras, Bertin Martens, Marius Miron, Rubén Moreno-Bote, Nuria Oliver, Antonio Puertas Gallardo, Heike Schweitzer, Nuria Sebastian, Xavier Serra, Joan Serrà, Songül Tolan, Karina Vold
Abstract This document contains the outcome of the first Human behaviour and machine intelligence (HUMAINT) workshop that took place 5-6 March 2018 in Barcelona, Spain. The workshop was organized in the context of a new research programme at the Centre for Advanced Studies, Joint Research Centre of the European Commission, which focuses on studying the potential impact of artificial intelligence on human behaviour. The workshop gathered an interdisciplinary group of experts to establish the state of the art research in the field and a list of future research challenges to be addressed on the topic of human and machine intelligence, algorithms’ potential impact on human cognitive capabilities and decision making, and evaluation and regulation needs. The document is made of short position statements and identification of challenges provided by each expert, and incorporates the result of the discussions carried out during the workshop. In the conclusion section, we provide a list of emerging research topics and strategies to be addressed in the near future.
Tasks Decision Making
Published 2018-06-07
URL http://arxiv.org/abs/1806.03192v1
PDF http://arxiv.org/pdf/1806.03192v1.pdf
PWC https://paperswithcode.com/paper/assessing-the-impact-of-machine-intelligence
Repo
Framework

An Adaptive Pruning Algorithm for Spoofing Localisation Based on Tropical Geometry

Title An Adaptive Pruning Algorithm for Spoofing Localisation Based on Tropical Geometry
Authors Emmanouil Theodosis, Petros Maragos
Abstract The problem of spoofing attacks is increasingly relevant as digital systems are becoming more ubiquitous. Thus the detection of such attacks and the localisation of attackers have been objects of recent study. After an attack has been detected, various algorithms have been proposed in order to localise the attacker. In this work we propose a new adaptive pruning algorithm inspired by the tropical and geometrical analysis of the traditional Viterbi pruning algorithm to solve the localisation problem. In particular, the proposed algorithm tries to localise the attacker by adapting the leniency parameter based on estimates about the state of the solution space. These estimates stem from the enclosed volume and the entropy of the solution space, as they were introduced in our previous works.
Tasks
Published 2018-11-01
URL http://arxiv.org/abs/1811.01017v1
PDF http://arxiv.org/pdf/1811.01017v1.pdf
PWC https://paperswithcode.com/paper/an-adaptive-pruning-algorithm-for-spoofing
Repo
Framework

Leveraging Class Similarity to Improve Deep Neural Network Robustness

Title Leveraging Class Similarity to Improve Deep Neural Network Robustness
Authors Pooran Singh Negi, David Chan, Mohammad Mahoor
Abstract Traditionally, artificial neural networks (ANNs) are trained by minimizing the cross-entropy between a provided groundtruth delta distribution (encoded as one-hot vector) and the ANN’s predictive softmax distribution. It seems, however, unacceptable to penalize networks equally for misclassification between classes. Confusing the class “Automobile” with the class “Truck” should be penalized less than confusing the class “Automobile” with the class “Donkey”. To avoid such representation issues and learn cleaner classification boundaries in the network, this paper presents a variation of cross-entropy loss which depends not only on the sample class but also on a data-driven prior “class-similarity distribution” across the classes encoded in a matrix form. We explore learning the class-similarity distribution using a data-driven method and then show that by training with our modified similarity-driven loss, we obtain slightly better generalization performance over multiple architectures and datasets as well as improved performance on noisy testing scenarios.
Tasks
Published 2018-12-23
URL http://arxiv.org/abs/1812.09744v2
PDF http://arxiv.org/pdf/1812.09744v2.pdf
PWC https://paperswithcode.com/paper/leveraging-class-similarity-to-improve-deep
Repo
Framework
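
A minimal sketch of the idea described in the abstract above: replace the one-hot target with a row of a class-similarity matrix, so that confusing similar classes costs less than confusing dissimilar ones. The matrix values, the function names, and the soft-target form of the loss are assumptions for illustration, not the paper's exact loss or its learned similarity distribution.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Hypothetical class-similarity matrix; each row sums to 1.
# Classes: 0 = automobile, 1 = truck, 2 = donkey.
SIMILARITY = [
    [0.90, 0.09, 0.01],
    [0.09, 0.90, 0.01],
    [0.01, 0.01, 0.98],
]


def similarity_loss(logits, label):
    """Use the label's similarity row as a soft target instead of a one-hot vector."""
    return cross_entropy(SIMILARITY[label], softmax(logits))


# True class is "automobile"; the network wrongly favours "truck" or "donkey".
loss_truck = similarity_loss([1.0, 3.0, 0.0], label=0)
loss_donkey = similarity_loss([1.0, 0.0, 3.0], label=0)
```

With a one-hot target the two mistakes would receive the same loss; with the similarity row as target, `loss_truck` comes out smaller than `loss_donkey`.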

PVRNet: Point-View Relation Neural Network for 3D Shape Recognition

Title PVRNet: Point-View Relation Neural Network for 3D Shape Recognition
Authors Haoxuan You, Yifan Feng, Xibin Zhao, Changqing Zou, Rongrong Ji, Yue Gao
Abstract Three-dimensional (3D) shape recognition has drawn much research attention in the field of computer vision. The advances of deep learning encourage various deep models for 3D feature representation. For point cloud and multi-view data, two popular 3D data modalities, different models are proposed with remarkable performance. However, the relation between point cloud and views has been rarely investigated. In this paper, we introduce Point-View Relation Network (PVRNet), an effective network designed to fuse the view features and the point cloud feature with a proposed relation score module. More specifically, based on the relation score module, the point-single-view fusion feature is first extracted by fusing the point cloud feature and each single view feature with point-single-view relation, then the point-multi-view fusion feature is extracted by fusing the point cloud feature and the features of different numbers of views with point-multi-view relation. Finally, the point-single-view fusion feature and point-multi-view fusion feature are further combined together to achieve a unified representation for a 3D shape. Our proposed PVRNet has been evaluated on ModelNet40 dataset for 3D shape classification and retrieval. Experimental results indicate our model can achieve significant performance improvement compared with the state-of-the-art models.
Tasks 3D Shape Recognition
Published 2018-12-02
URL http://arxiv.org/abs/1812.00333v1
PDF http://arxiv.org/pdf/1812.00333v1.pdf
PWC https://paperswithcode.com/paper/pvrnet-point-view-relation-neural-network-for
Repo
Framework

Improved Explainability of Capsule Networks: Relevance Path by Agreement

Title Improved Explainability of Capsule Networks: Relevance Path by Agreement
Authors Atefeh Shahroudnejad, Arash Mohammadi, Konstantinos N. Plataniotis
Abstract Recent advancements in signal processing and machine learning domains have resulted in an extensive surge of interest in deep learning models due to their unprecedented performance and high accuracy for different and challenging problems of significant engineering importance. However, when such deep learning architectures are utilized for making critical decisions such as the ones that involve human lives (e.g., in medical applications), it is of paramount importance to understand, trust, and in one word “explain” the rationale behind deep models’ decisions. Currently, deep learning models are typically considered as black-box systems, which do not provide any clue on their internal processing actions. Although some recent efforts have been initiated to explain behavior and decisions of deep networks, the explainable artificial intelligence (XAI) domain is still in its infancy. In this regard, we consider capsule networks (referred to as CapsNets), which are novel deep structures, recently proposed as an alternative counterpart to convolutional neural networks (CNNs), and poised to change the future of machine intelligence. In this paper, we investigate and analyze structures and behaviors of the CapsNets and illustrate potential explainability properties of such networks. Furthermore, we show the possibility of transforming deep learning architectures into transparent networks via incorporation of capsules in different layers instead of convolution layers of the CNNs.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.10204v1
PDF http://arxiv.org/pdf/1802.10204v1.pdf
PWC https://paperswithcode.com/paper/improved-explainability-of-capsule-networks
Repo
Framework

Efficient Semantic Segmentation using Gradual Grouping

Title Efficient Semantic Segmentation using Gradual Grouping
Authors Nikitha Vallurupalli, Sriharsha Annamaneni, Girish Varma, C V Jawahar, Manu Mathew, Soyeb Nagori
Abstract Deep CNNs for semantic segmentation have high memory and run time requirements. Various approaches have been proposed to make CNNs efficient, such as grouped, shuffled, and depth-wise separable convolutions. We study the effectiveness of these techniques on a real-time semantic segmentation architecture like ERFNet for improving run time by over 5X. We apply these techniques to CNN layers partially or fully and evaluate the testing accuracies on Cityscapes dataset. We obtain accuracy vs parameters/FLOPs trade-offs, giving accuracy scores for models that can run under specified runtime budgets. We further propose a novel training procedure which starts out with a dense convolution but gradually evolves towards a grouped convolution. We show that our proposed training method and efficient architecture design can improve accuracies by over 8% with depth-wise separable convolutions applied on the encoder of ERFNet and attaching a lightweight decoder. This results in a model which has a 5X improvement in FLOPs while only suffering a 4% degradation in accuracy with respect to ERFNet.
Tasks Real-Time Semantic Segmentation, Semantic Segmentation
Published 2018-06-22
URL http://arxiv.org/abs/1806.08522v1
PDF http://arxiv.org/pdf/1806.08522v1.pdf
PWC https://paperswithcode.com/paper/efficient-semantic-segmentation-using-gradual
Repo
Framework
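
To make the efficiency argument in the abstract above concrete, the sketch below counts the weights of a dense, a grouped, and a depth-wise separable convolution. The layer shape (128 input and output channels, 3 x 3 kernel, 4 groups) and the function names are assumptions chosen for illustration, not ERFNet's actual configuration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given number of groups (no bias)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


dense = conv_params(128, 128, 3)                     # 147456
grouped = conv_params(128, 128, 3, groups=4)         # 36864
separable = depthwise_separable_params(128, 128, 3)  # 17536
```

For this hypothetical layer, grouping with 4 groups cuts the weights by a factor of 4, and the depth-wise separable variant by roughly 8.4. FLOPs scale the same way for a fixed output resolution.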

Multi-pseudo Regularized Label for Generated Data in Person Re-Identification

Title Multi-pseudo Regularized Label for Generated Data in Person Re-Identification
Authors Yan Huang, Jinsong Xu, Qiang Wu, Zhedong Zheng, Zhaoxiang Zhang, Jian Zhang
Abstract Sufficient training data normally is required to train deeply learned models. However, due to the expensive manual process for labelling a large number of images, the amount of available training data is always limited. To produce more data for training a deep network, a Generative Adversarial Network (GAN) can be used to generate artificial sample data. However, the generated data usually does not have annotation labels. To solve this problem, in this paper, we propose a virtual label called Multi-pseudo Regularized Label (MpRL) and assign it to the generated data. With MpRL, the generated data will be used as a supplement to the real training data to train a deep neural network in a semi-supervised learning fashion. To build the corresponding relationship between the real data and generated data, MpRL assigns each generated data a proper virtual label which reflects the likelihood of the affiliation of the generated data to pre-defined training classes in the real data domain. Unlike the traditional label which usually is a single integral number, the virtual label proposed in this work is a set of weight-based values, each of which is a number in (0,1] called a multi-pseudo label and reflects the degree of relation between each generated data and every pre-defined class of real data. A comprehensive evaluation is carried out by adopting two state-of-the-art convolutional neural networks (CNNs) in our experiments to verify the effectiveness of MpRL. Experiments demonstrate that by assigning MpRL to generated data, we can further improve the person re-ID performance on five re-ID datasets, i.e., Market-1501, DukeMTMC-reID, CUHK03, VIPeR, and CUHK01. The proposed method obtains +6.29%, +6.30%, +5.58%, +5.84%, and +3.48% improvements in rank-1 accuracy over a strong CNN baseline on the five datasets respectively, and outperforms state-of-the-art methods.
Tasks Person Re-Identification
Published 2018-01-21
URL http://arxiv.org/abs/1801.06742v3
PDF http://arxiv.org/pdf/1801.06742v3.pdf
PWC https://paperswithcode.com/paper/multi-pseudo-regularized-label-for-generated
Repo
Framework

Sparse Representation and Non-Negative Matrix Factorization for image denoise

Title Sparse Representation and Non-Negative Matrix Factorization for image denoise
Authors R. M. Farouk, M. E. Abd El-aziz, A. M. Adam
Abstract Recently, the problem of blind image separation has been widely investigated, especially medical image denoising, which is the main step in medical diagnosis. Removing the noise without affecting relevant features of the image is the main goal. Sparse decomposition over redundant dictionaries has become one of the most used approaches to solve this problem. NMF codes naturally favor sparse, parts-based representations. In sparse representation, signals are represented as a linear combination of atoms from a redundant dictionary. In this paper, we propose an algorithm based on sparse representation over the redundant dictionary and Non-Negative Matrix Factorization (N-NMF). The algorithm initializes a dictionary based on training samples constructed from the noised image, then it searches for the best representation for the source by using the approximate matching pursuit (AMP). The proposed N-NMF gives a better reconstruction of an image from a denoised one. We have compared our numerical results with different image denoising techniques and we have found the performance of the proposed technique is promising. Keywords: Image denoising, sparse representation, dictionary learning, matching pursuit, non-negative matrix factorization.
Tasks Denoising, Dictionary Learning, Image Denoising
Published 2018-07-05
URL http://arxiv.org/abs/1807.03694v1
PDF http://arxiv.org/pdf/1807.03694v1.pdf
PWC https://paperswithcode.com/paper/sparse-representation-and-non-negative-matrix
Repo
Framework
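
The sparse-representation step mentioned in the abstract above can be illustrated with plain greedy matching pursuit, which is swapped in here for the approximate matching pursuit (AMP) variant the authors use. The toy dictionary, the signal, and the function names are assumptions for illustration only.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit over unit-norm atoms.

    Returns the coefficient accumulated for each atom and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Pick the atom most correlated with the current residual.
        corr = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[best] += corr[best]
        # Subtract that atom's contribution from the residual.
        residual = [r - corr[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Hypothetical redundant dictionary: three basis vectors plus one diagonal atom.
s = 2 ** -0.5
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
signal = [3.0, 3.0, 0.5]
coeffs, residual = matching_pursuit(signal, atoms, n_iter=2)
```

In this toy case two iterations suffice: the diagonal atom absorbs the first two components and the third basis vector the remainder, so the signal is represented with two atoms instead of three.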

A Desirability-Based Axiomatisation for Coherent Choice Functions

Title A Desirability-Based Axiomatisation for Coherent Choice Functions
Authors Jasper De Bock, Gert de Cooman
Abstract Choice functions constitute a simple, direct and very general mathematical framework for modelling choice under uncertainty. In particular, they are able to represent the set-valued choices that typically arise from applying decision rules to imprecise-probabilistic uncertainty models. We provide them with a clear interpretation in terms of attitudes towards gambling, borrowing ideas from the theory of sets of desirable gambles, and we use this interpretation to derive a set of basic axioms. We show that these axioms lead to a full-fledged theory of coherent choice functions, which includes a representation in terms of sets of desirable gambles, and a conservative inference method.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.01044v1
PDF http://arxiv.org/pdf/1806.01044v1.pdf
PWC https://paperswithcode.com/paper/a-desirability-based-axiomatisation-for
Repo
Framework

PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition

Title PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition
Authors Haoxuan You, Yifan Feng, Rongrong Ji, Yue Gao
Abstract 3D object recognition has attracted wide research attention in the field of multimedia and computer vision. With the recent proliferation of deep learning, various deep models with different representations have achieved state-of-the-art performance. Among them, point cloud and multi-view based 3D shape representations have recently shown promise, and their corresponding deep models have shown significant performance on 3D shape recognition. However, there has been little effort to combine point cloud data and multi-view data for 3D shape representation, although the two are, in our view, beneficial and complementary to each other. In this paper, we propose the Point-View Network (PVNet), the first framework integrating both the point cloud and the multi-view data towards joint 3D shape recognition. More specifically, an embedding attention fusion scheme is proposed that could employ high-level features from the multi-view data to model the intrinsic correlation and discriminability of different structure features from the point cloud data. In particular, the discriminative descriptions are quantified and leveraged as the soft attention mask to further refine the structure feature of the 3D shape. We have evaluated the proposed method on the ModelNet40 dataset for 3D shape classification and retrieval tasks. Experimental results and comparisons with state-of-the-art methods demonstrate that our framework can achieve superior performance.
Tasks 3D Object Recognition, 3D Shape Recognition, 3D Shape Representation, Object Recognition
Published 2018-08-23
URL http://arxiv.org/abs/1808.07659v1
PDF http://arxiv.org/pdf/1808.07659v1.pdf
PWC https://paperswithcode.com/paper/pvnet-a-joint-convolutional-network-of-point
Repo
Framework