Paper Group ANR 1151
Appearance invariant Entry-Exit matching using visual soft biometric traits
Title | Appearance invariant Entry-Exit matching using visual soft biometric traits |
Authors | Vinay Kumar V, P Nagabhushan |
Abstract | The problem of appearance invariant subject recognition for Entry-Exit surveillance applications is addressed. This paper proposes a novel Semantic Entry-Exit matching model that uses ancillary information about subjects, such as height, build, complexion and clothing color, to endorse the exit of every subject who had entered a private area. The proposed method is robust to variations in clothing. Each describing attribute is given equal weight while computing the matching score, and hence the proposed model achieves high rank-k accuracy on benchmark datasets. Although the combination of soft biometric traits cannot achieve high rank-1 accuracy, it helps to narrow down the search before matching with reliable biometric traits such as gait and face, whose learning and matching are costlier than those of the visual soft biometrics. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1909.05145v1 |
https://arxiv.org/pdf/1909.05145v1.pdf | |
PWC | https://paperswithcode.com/paper/appearance-invariant-entry-exit-matching |
Repo | |
Framework | |
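The equal-weight matching score and rank-k shortlist described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names come from the abstract, but the numeric encoding (each attribute pre-normalised to [0, 1]), the absolute-difference similarity, and the function names `match_score` and `rank_k_candidates` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Attribute names follow the abstract; encoding each one as a value in
# [0, 1] is an assumption made for this sketch.
ATTRIBUTES = ("height", "build", "complexion", "clothing_color")


def match_score(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Equal-weight similarity in [0, 1]; 1.0 means identical attributes."""
    diffs = [abs(a[name] - b[name]) for name in ATTRIBUTES]
    return 1.0 - sum(diffs) / len(ATTRIBUTES)


def rank_k_candidates(
    probe: Dict[str, float],
    gallery: Dict[str, Dict[str, float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k entry records most similar to the exit observation."""
    scored = [(sid, match_score(probe, attrs)) for sid, attrs in gallery.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical entry records and one exit observation.
    entries = {
        "s1": {"height": 0.80, "build": 0.50, "complexion": 0.30, "clothing_color": 0.90},
        "s2": {"height": 0.60, "build": 0.70, "complexion": 0.60, "clothing_color": 0.20},
        "s3": {"height": 0.78, "build": 0.45, "complexion": 0.35, "clothing_color": 0.10},
    }
    exit_obs = {"height": 0.79, "build": 0.50, "complexion": 0.32, "clothing_color": 0.85}
    for subject_id, score in rank_k_candidates(exit_obs, entries, k=2):
        print(subject_id, round(score, 3))
```

The shortlist returned by `rank_k_candidates` is what a costlier matcher (gait or face, per the abstract) would then search.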
Block Randomized Optimization for Adaptive Hypergraph Learning
Title | Block Randomized Optimization for Adaptive Hypergraph Learning |
Authors | Georgios Karantaidis, Ioannis Sarridis, Constantine Kotropoulos |
Abstract | The high-order relations between the content in social media sharing platforms are frequently modeled by a hypergraph. Whether the hypergraph Laplacian matrix or the adjacency matrix is used, the matrix is large. Randomized algorithms are used for low-rank factorizations in order to approximately decompose and eventually invert such large matrices fast. As a first approach, block randomized Singular Value Decomposition (SVD) via subspace iteration is integrated within adaptive hypergraph weight estimation for image tagging. Specifically, creating low-rank submatrices along the main diagonal by tessellation permits fast matrix inversions via randomized SVD. Moreover, a second approach is proposed for solving the linear system in the optimization problem of hypergraph learning by employing the conjugate gradient method. Both proposed approaches achieve high accuracy in image tagging, measured by F1 score, and succeed in reducing the computational requirements of adaptive hypergraph weight estimation. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08281v1 |
https://arxiv.org/pdf/1908.08281v1.pdf | |
PWC | https://paperswithcode.com/paper/block-randomized-optimization-for-adaptive |
Repo | |
Framework | |
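The second approach in the abstract above solves a linear system with the conjugate gradient method instead of inverting the matrix. The sketch below shows the textbook conjugate gradient iteration for a symmetric positive-definite system in plain Python. It is not the paper's code: the 2x2 system is a made-up example, and the helper names (`matvec`, `dot`, `conjugate_gradient`) are chosen here for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u: Vector, v: Vector) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(a: Matrix, b: Vector, tol: float = 1e-10,
                       max_iter: int = 1000) -> Vector:
    """Solve A x = b for symmetric positive-definite A without inverting A."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        ap = matvec(a, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # Small made-up SPD system, used only to exercise the solver.
    a = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(a, b))
```

Each iteration needs only one matrix-vector product, which is why the method suits large systems where forming an explicit inverse is too expensive.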
On the Robustness of the Backdoor-based Watermarking in Deep Neural Networks
Title | On the Robustness of the Backdoor-based Watermarking in Deep Neural Networks |
Authors | Masoumeh Shafieinejad, Jiaqi Wang, Nils Lukas, Xinda Li, Florian Kerschbaum |
Abstract | Obtaining state-of-the-art performance from deep learning models imposes a high cost on model generators, due to the tedious data preparation and the substantial processing requirements. To protect the model from unauthorized re-distribution, watermarking approaches have been introduced in the past couple of years. We investigate the robustness and reliability of state-of-the-art deep neural network watermarking schemes. We focus on backdoor-based watermarking and propose two attacks, one black-box and one white-box, that remove the watermark. Our black-box attack steals the model and removes the watermark with minimum requirements; it relies only on public unlabeled data and black-box access to the classification label. It does not need classification confidences or access to the model’s sensitive information such as the training data set, the trigger set or the model parameters. The white-box attack provides efficient watermark removal when the parameters of the marked model are available; it does not require access to the labeled data or the trigger set and improves the runtime of the black-box attack by up to seventeen times. We also demonstrate the inadequacy of backdoor-based watermarking in keeping the watermark undetectable by proposing an attack that detects whether a model contains a watermark. Our attacks show that a recipient of a marked model can remove a backdoor-based watermark with significantly less effort than training a new model, and that other techniques are needed to protect against re-distribution by a motivated attacker. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07745v2 |
https://arxiv.org/pdf/1906.07745v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-robustness-of-the-backdoor-based |
Repo | |
Framework | |
Toward Best Practices for Explainable B2B Machine Learning
Title | Toward Best Practices for Explainable B2B Machine Learning |
Authors | Kit Kuksenok |
Abstract | To design tools and data pipelines for explainable B2B machine learning (ML) systems, we need to recognize not only the immediate audience of such tools and data, but also (1) their organizational context and (2) secondary audiences. Our learnings are based on building custom ML-based chatbots for recruitment. We believe that in the B2B context, “explainable” ML means not only a system that can “explain itself” through tools and data pipelines, but also enables its domain-expert users to explain it to other stakeholders. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04837v1 |
https://arxiv.org/pdf/1906.04837v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-best-practices-for-explainable-b2b |
Repo | |
Framework | |
AdvFaces: Adversarial Face Synthesis
Title | AdvFaces: Adversarial Face Synthesis |
Authors | Debayan Deb, Jianbang Zhang, Anil K. Jain |
Abstract | Face recognition systems have been shown to be vulnerable to adversarial examples resulting from adding small perturbations to probe images. Such adversarial images can lead state-of-the-art face recognition systems to falsely reject a genuine subject (obfuscation attack) or falsely match to an impostor (impersonation attack). Current approaches to crafting adversarial face images lack perceptual quality and take an unreasonable amount of time to generate them. We propose AdvFaces, an automated adversarial face synthesis method that learns to generate minimal perturbations in the salient facial regions via Generative Adversarial Networks. Once AdvFaces is trained, it can automatically generate imperceptible perturbations that can evade state-of-the-art face matchers with attack success rates as high as 97.22% and 24.30% for obfuscation and impersonation attacks, respectively. |
Tasks | Face Generation, Face Recognition |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05008v1 |
https://arxiv.org/pdf/1908.05008v1.pdf | |
PWC | https://paperswithcode.com/paper/advfaces-adversarial-face-synthesis |
Repo | |
Framework | |
Evaluating Semantic Representations of Source Code
Title | Evaluating Semantic Representations of Source Code |
Authors | Yaza Wainakh, Moiz Rauf, Michael Pradel |
Abstract | Learned representations of source code enable various software developer tools, e.g., to detect bugs or to predict program properties. At the core of code representations often are word embeddings of identifier names in source code, because identifiers account for the majority of source code vocabulary and convey important semantic information. Unfortunately, there currently is no generally accepted way of evaluating the quality of word embeddings of identifiers, and current evaluations are biased toward specific downstream tasks. This paper presents IdBench, the first benchmark for evaluating to what extent word embeddings of identifiers represent semantic relatedness and similarity. The benchmark is based on thousands of ratings gathered by surveying 500 software developers. We use IdBench to evaluate state-of-the-art embedding techniques proposed for natural language, an embedding technique specifically designed for source code, and lexical string distance functions, as these are often used in current developer tools. Our results show that the effectiveness of embeddings varies significantly across different embedding techniques and that the best available embeddings successfully represent semantic relatedness. On the downside, no existing embedding provides a satisfactory representation of semantic similarities, e.g., because embeddings consider identifiers with opposing meanings as similar, which may lead to fatal mistakes in downstream developer tools. IdBench provides a gold standard to guide the development of novel embeddings that address the current limitations. |
Tasks | Word Embeddings |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05177v1 |
https://arxiv.org/pdf/1910.05177v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-semantic-representations-of-source |
Repo | |
Framework | |
Satellite Pose Estimation with Deep Landmark Regression and Nonlinear Pose Refinement
Title | Satellite Pose Estimation with Deep Landmark Regression and Nonlinear Pose Refinement |
Authors | Bo Chen, Jiewei Cao, Alvaro Parra, Tat-Jun Chin |
Abstract | We propose an approach to estimate the 6DOF pose of a satellite, relative to a canonical pose, from a single image. Such a problem is crucial in many space proximity operations, such as docking, debris removal, and inter-spacecraft communications. Our approach combines machine learning and geometric optimisation by predicting the coordinates of a set of landmarks in the input image, associating the landmarks with their corresponding 3D points on an a priori reconstructed 3D model, and then solving for the object pose using non-linear optimisation. Our approach is not only novel for this specific pose estimation task, which helps to further open up a relatively new domain for machine learning and computer vision, but it also demonstrates superior accuracy and won first place in the recent Kelvins Pose Estimation Challenge organised by the European Space Agency (ESA). |
Tasks | Pose Estimation |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11542v1 |
https://arxiv.org/pdf/1908.11542v1.pdf | |
PWC | https://paperswithcode.com/paper/satellite-pose-estimation-with-deep-landmark |
Repo | |
Framework | |
Predicting the Mumble of Wireless Channel with Sequence-to-Sequence Models
Title | Predicting the Mumble of Wireless Channel with Sequence-to-Sequence Models |
Authors | Yourui Huangfu, Jian Wang, Rong Li, Chen Xu, Xianbin Wang, Huazi Zhang, Jun Wang |
Abstract | Accurate prediction of the future fading channel is essential to realize adaptive transmission and other methods that can save power and provide gains. In practice, a wireless channel model can be regarded as a new language model, and the time-varying channel can be seen as mumbling in this language, which is too complex to understand, to say nothing of predicting it. Fortunately, neural networks have recently proved efficient in learning language models; moreover, sequence-to-sequence (seq2seq) models provide state-of-the-art performance in various tasks such as machine translation, image caption generation, and text summarization. Predicting the channel with neural networks seems promising; however, vanilla neural networks cannot deal with complex-valued inputs, while channel state information (CSI) lies in the complex domain. In this paper, we present a powerful method to understand and predict the complex-valued channel by utilizing seq2seq models. The results show that seq2seq models are also expert in time series prediction, and that realistic channel prediction with comparable or superior performance relative to channel estimation is attainable. |
Tasks | Language Modelling, Machine Translation, Text Summarization, Time Series, Time Series Prediction |
Published | 2019-01-14 |
URL | http://arxiv.org/abs/1901.04119v1 |
http://arxiv.org/pdf/1901.04119v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-the-mumble-of-wireless-channel |
Repo | |
Framework | |
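The abstract above notes that vanilla neural networks cannot consume complex-valued CSI directly, but it does not say how the paper handles this. One common convention, assumed here purely for illustration, is to split each complex sample into a (real, imaginary) pair and cut the series into input/target windows for a seq2seq model. The function names and window lengths below are hypothetical.

```python
from typing import List, Tuple

RealPair = Tuple[float, float]


def complex_to_real(seq: List[complex]) -> List[RealPair]:
    """Represent each complex CSI sample as a (real, imaginary) pair."""
    return [(z.real, z.imag) for z in seq]


def real_to_complex(pairs: List[RealPair]) -> List[complex]:
    """Inverse mapping, e.g. for turning model outputs back into CSI."""
    return [complex(re, im) for re, im in pairs]


def make_windows(
    seq: List[complex], n_in: int, n_out: int
) -> List[Tuple[List[RealPair], List[RealPair]]]:
    """Slide over the series: n_in past samples predict n_out future ones."""
    pairs = complex_to_real(seq)
    samples = []
    for start in range(len(pairs) - n_in - n_out + 1):
        source = pairs[start:start + n_in]
        target = pairs[start + n_in:start + n_in + n_out]
        samples.append((source, target))
    return samples


if __name__ == "__main__":
    # Made-up channel samples, used only to show the data layout.
    csi = [1 + 2j, 0.5 - 1j, -0.3 + 0.8j, 0.1 + 0.1j, -1 - 0.5j, 0.7 + 0.2j]
    for source, target in make_windows(csi, n_in=3, n_out=2):
        print(source, "->", target)
```

The windows produced this way are plain real-valued sequences, so they can be fed to any encoder-decoder model; the choice of window lengths is a tuning decision that the abstract does not specify.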
SAERMA: Stacked Autoencoder Rule Mining Algorithm for the Interpretation of Epistatic Interactions in GWAS for Extreme Obesity
Title | SAERMA: Stacked Autoencoder Rule Mining Algorithm for the Interpretation of Epistatic Interactions in GWAS for Extreme Obesity |
Authors | Casimiro Aday Curbelo Montañez, Paul Fergus, Carl Chalmers, Nurul Ahamed Hassain Malim, Basma Abdulaimma, Denis Reilly, Francesco Falciani |
Abstract | One of the most important challenges in the analysis of high-throughput genetic data is the development of efficient computational methods to identify statistically significant Single Nucleotide Polymorphisms (SNPs). Genome-wide association studies (GWAS) use single-locus analysis where each SNP is independently tested for association with phenotypes. The limitation with this approach, however, is its inability to explain genetic variation in complex diseases. Alternative approaches are required to model the intricate relationships between SNPs. Our proposed approach extends GWAS by combining deep learning stacked autoencoders (SAEs) and association rule mining (ARM) to identify epistatic interactions between SNPs. Following traditional GWAS quality control and association analysis, the most significant SNPs are selected and used in the subsequent analysis to investigate epistasis. SAERMA controls the classification results produced in the final fully connected multi-layer feedforward artificial neural network (MLP) by manipulating the interestingness measures, support and confidence, in the rule generation process. The best classification results were achieved with 204 SNPs compressed to 100 units (77% AUC, 77% SE, 68% SP, 53% Gini, logloss=0.58, and MSE=0.20), although it was possible to achieve 73% AUC (77% SE, 63% SP, 45% Gini, logloss=0.62, and MSE=0.21) with 50 hidden units - both supported by close model interpretation. |
Tasks | |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10166v1 |
https://arxiv.org/pdf/1908.10166v1.pdf | |
PWC | https://paperswithcode.com/paper/saerma-stacked-autoencoder-rule-mining |
Repo | |
Framework | |
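The abstract above says that SAERMA steers rule generation through two interestingness measures, support and confidence. The sketch below computes both in their standard association-rule sense. It is not the SAERMA pipeline: the transactions, the SNP identifiers (`rs1`, `rs2`) and the genotype labels are invented for the example.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[Transaction],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(transactions, antecedent | consequent) / base


if __name__ == "__main__":
    # Hypothetical samples: genotype items plus a phenotype label.
    samples = [
        frozenset({"rs1=AA", "rs2=AG", "case"}),
        frozenset({"rs1=AA", "rs2=AG", "case"}),
        frozenset({"rs1=AA", "rs2=GG", "control"}),
        frozenset({"rs1=AG", "rs2=AG", "control"}),
    ]
    rule_lhs = frozenset({"rs1=AA", "rs2=AG"})
    rule_rhs = frozenset({"case"})
    print("support:", support(samples, rule_lhs | rule_rhs))
    print("confidence:", confidence(samples, rule_lhs, rule_rhs))
```

Raising the minimum support or confidence threshold keeps fewer, stronger rules, which is the kind of control over rule generation the abstract refers to.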
Tripartite Vector Representations for Better Job Recommendation
Title | Tripartite Vector Representations for Better Job Recommendation |
Authors | Mengshu Liu, Jingya Wang, Kareem Abdelfatah, Mohammed Korayem |
Abstract | Job recommendation is a crucial part of the online job recruitment business. To match the right person with the right job, a good representation of job postings is required. Such representations should ideally recommend jobs with fitting titles, aligned skill sets, and a reasonable commute. To address these aspects, we utilize three information graphs (job-job, skill-skill, job-skill) from historical job data to learn a joint representation for both job titles and skills in a shared latent space. This allows us to obtain a representation of job postings/resumes using both elements, which can subsequently be combined with location. In this paper, we first present how the representation of each component is obtained, and then we discuss how these different representations are combined into a single space to acquire the final representation. Comparing the proposed methodology against different baseline methods shows significant improvement in terms of relevancy. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.12379v1 |
https://arxiv.org/pdf/1907.12379v1.pdf | |
PWC | https://paperswithcode.com/paper/tripartite-vector-representations-for-better |
Repo | |
Framework | |
SR-GAN: Semantic Rectifying Generative Adversarial Network for Zero-shot Learning
Title | SR-GAN: Semantic Rectifying Generative Adversarial Network for Zero-shot Learning |
Authors | Zihan Ye, Fan Lyu, Linyan Li, Qiming Fu, Jinchang Ren, Fuyuan Hu |
Abstract | Existing zero-shot learning (ZSL) methods may suffer from vague class attributes that overlap heavily across different classes. Unlike these methods, which ignore the discrimination among classes, in this paper we propose to classify unseen images by rectifying the semantic space guided by the visual space. First, we pre-train a Semantic Rectifying Network (SRN) to rectify the semantic space with a semantic loss and a rectifying loss. Then, a Semantic Rectifying Generative Adversarial Network (SR-GAN) is built to generate plausible visual features of unseen classes from both the semantic feature and the rectified semantic feature. To guarantee the effectiveness of the rectified semantic features and synthetic visual features, a pre-reconstruction network and a post-reconstruction network are proposed, which keep the visual and semantic features consistent. Experimental results demonstrate that our approach significantly outperforms the state of the art on four benchmark datasets. |
Tasks | Zero-Shot Learning |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.06996v1 |
http://arxiv.org/pdf/1904.06996v1.pdf | |
PWC | https://paperswithcode.com/paper/sr-gan-semantic-rectifying-generative |
Repo | |
Framework | |
Facial Emotion Recognition Using Deep Learning
Title | Facial Emotion Recognition Using Deep Learning |
Authors | Ching-Da Wu, Li-Heng Chen |
Abstract | We aim to construct a system that captures real-world facial images through the front camera on a laptop. The system is capable of processing and recognizing the captured image and predicting a result in real time. In this system, we exploit the power of deep learning to learn a facial emotion recognition (FER) model based on a set of labeled facial images. Finally, experiments are conducted to evaluate our model using a widely used public database. |
Tasks | Emotion Recognition |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.11113v1 |
https://arxiv.org/pdf/1910.11113v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-emotion-recognition-using-deep |
Repo | |
Framework | |
Deep Transfer Learning For Whole-Brain fMRI Analyses
Title | Deep Transfer Learning For Whole-Brain fMRI Analyses |
Authors | Armin W. Thomas, Klaus-Robert Müller, Wojciech Samek |
Abstract | The application of deep learning (DL) models to the decoding of cognitive states from whole-brain functional Magnetic Resonance Imaging (fMRI) data is often hindered by the small sample size and high dimensionality of these datasets, especially in clinical settings, where patient data are scarce. In this work, we demonstrate that transfer learning represents a solution to this problem. In particular, we show that a DL model that has previously been trained on a large, openly available fMRI dataset of the Human Connectome Project outperforms a model variant with the same architecture that is trained from scratch, when both are applied to the data of a new, unrelated fMRI task. Furthermore, the pre-trained DL model variant is already able to correctly decode 67.51% of the cognitive states from a test dataset with 100 individuals when fine-tuned on a dataset of only three subjects. |
Tasks | Transfer Learning |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01953v1 |
https://arxiv.org/pdf/1907.01953v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-transfer-learning-for-whole-brain-fmri |
Repo | |
Framework | |
Deep Cross Networks with Aesthetic Preference for Cross-domain Recommendation
Title | Deep Cross Networks with Aesthetic Preference for Cross-domain Recommendation |
Authors | Jian Liu, Pengpeng Zhao, Yanchi Liu, Victor S. Sheng, Fuzheng Zhuang, Jiajie Xu, Xiaofang Zhou, Hui Xiong |
Abstract | When purchasing appearance-first products, e.g., clothes, product appearance aesthetics plays an important role in the decision process. Moreover, a user’s aesthetic preference, which can be regarded as a personality trait and a basic requirement, is domain independent and could be used as a bridge between domains for knowledge transfer. However, existing work has rarely considered the aesthetic information in product photos for cross-domain recommendation. To this end, we propose a new deep Aesthetic preference Cross-Domain Network (ACDN), in which parameters characterizing personal aesthetic preferences are shared across networks to transfer knowledge between domains. Specifically, we first leverage an aesthetic network to extract relevant features. Then, we integrate the aesthetic features into a cross-domain network to transfer users’ domain-independent aesthetic preferences. Moreover, network cross-connections are introduced to enable dual knowledge transfer across domains. Finally, the experimental results on real-world data show that our proposed ACDN outperforms other benchmark methods in terms of recommendation accuracy. The results also show that users’ aesthetic preferences are effective in alleviating the data sparsity issue in cross-domain recommendation. |
Tasks | Transfer Learning |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.13030v1 |
https://arxiv.org/pdf/1905.13030v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-cross-networks-with-aesthetic-preference |
Repo | |
Framework | |
Multi-path Learning for Object Pose Estimation Across Domains
Title | Multi-path Learning for Object Pose Estimation Across Domains |
Authors | Martin Sundermeyer, Maximilian Durner, En Yen Puang, Zoltan-Csaba Marton, Rudolph Triebel |
Abstract | We introduce a scalable approach for object pose estimation trained on simulated RGB views of multiple 3D models together. We learn an encoding of object views that not only describes the orientation of all objects seen during training, but can also relate views of untrained objects. Our single-encoder-multi-decoder network is trained using a technique we denote “multi-path learning”: while the encoder is shared by all objects, each decoder only reconstructs views of a single object. Consequently, views of different instances do not need to be separated in the latent space and can share common features. The resulting encoder generalizes well from synthetic to real data and across various instances, categories, model types and datasets. We systematically investigate the learned encodings, their generalization capabilities and iterative refinement strategies on the ModelNet40 and T-LESS datasets. On T-LESS, we achieve state-of-the-art results with our 6D Object Detection pipeline, both in the RGB and depth domain, outperforming learning-free pipelines at much lower runtimes. |
Tasks | Object Detection, Pose Estimation |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00151v1 |
https://arxiv.org/pdf/1908.00151v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-path-learning-for-object-pose |
Repo | |
Framework | |