July 29, 2019

Paper Group AWR 94

Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge


Title	Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge
Authors	Dinh-Luan Nguyen, Kai Cao, Anil K. Jain
Abstract	We propose a fully automatic minutiae extractor, called MinutiaeNet, based on deep neural networks with compact feature representation for fast comparison of minutiae sets. Specifically, first a network, called CoarseNet, estimates the minutiae score map and minutiae orientation based on convolutional neural network and fingerprint domain knowledge (enhanced image, orientation field, and segmentation map). Subsequently, another network, called FineNet, refines the candidate minutiae locations based on score map. We demonstrate the effectiveness of using the fingerprint domain knowledge together with the deep networks. Experimental results on both latent (NIST SD27) and plain (FVC 2004) public domain fingerprint datasets provide comprehensive empirical support for the merits of our method. Further, our method finds minutiae sets that are better in terms of precision and recall in comparison with state-of-the-art on these two datasets. Given the lack of annotated fingerprint datasets with minutiae ground truth, the proposed approach to robust minutiae detection will be useful to train network-based fingerprint matching algorithms as well as for evaluating fingerprint individuality at scale. MinutiaeNet is implemented in Tensorflow: https://github.com/luannd/MinutiaeNet
Tasks
Published	2017-12-26
URL	http://arxiv.org/abs/1712.09401v1
PDF	http://arxiv.org/pdf/1712.09401v1.pdf
PWC	https://paperswithcode.com/paper/robust-minutiae-extractor-integrating-deep
Repo	https://github.com/luannd/MinutiaeNet
Framework	tf

Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction


Title	Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction
Authors	Savas Ozkan, Gozde Bozdagi Akar
Abstract	Frame-level visual features are generally aggregated in time with techniques such as LSTM, Fisher Vectors, NetVLAD, etc. to produce a robust video-level representation. We here introduce a learnable aggregation technique whose primary objective is to retain short-time temporal structure between frame-level features and their spatial interdependencies in the representation. Also, it can be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a score of 65% on the test dataset in the official MAP evaluation, differing by only one misclassified decision from the best reported result in the Chalearn Challenge (i.e., 66.7%). Lastly, we believe that this method can be extended to different problems such as action/event recognition in the future.
Tasks
Published	2017-08-24
URL	http://arxiv.org/abs/1708.07335v1
PDF	http://arxiv.org/pdf/1708.07335v1.pdf
PWC	https://paperswithcode.com/paper/relaxed-spatio-temporal-deep-feature
Repo	https://github.com/savasozkan/real-fake-emotions
Framework	tf

SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes


Title	SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes
Authors	Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O’Connor
Abstract	We introduce SaltiNet, a deep neural network for scanpath prediction trained on 360-degree images. The model is based on a novel temporal-aware representation of saliency information named the saliency volume. The first part of the network consists of a model trained to generate saliency volumes, whose parameters are fit by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency volumes. Sampling strategies over these volumes are used to generate scanpaths over the 360-degree images. Our experiments show the advantages of using saliency volumes, and how they can be used for related tasks. Our source code and trained models are available at https://github.com/massens/saliency-360salient-2017.
Tasks
Published	2017-07-11
URL	http://arxiv.org/abs/1707.03123v5
PDF	http://arxiv.org/pdf/1707.03123v5.pdf
PWC	https://paperswithcode.com/paper/saltinet-scan-path-prediction-on-360-degree
Repo	https://github.com/massens/saliency-360salient-2017
Framework	none

Sparsity Invariant CNNs


Title	Sparsity Invariant CNNs
Authors	Jonas Uhrig, Nick Schneider, Lukas Schneider, Uwe Franke, Thomas Brox, Andreas Geiger
Abstract	In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings and will be made available upon publication.
Tasks	Depth Estimation
Published	2017-08-22
URL	http://arxiv.org/abs/1708.06500v2
PDF	http://arxiv.org/pdf/1708.06500v2.pdf
PWC	https://paperswithcode.com/paper/sparsity-invariant-cnns
Repo	https://github.com/PeterTor/sparse_convolution
Framework	tf
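
The abstract describes a convolution layer that explicitly accounts for where data is missing. One common way to realize that idea is a mask-normalized convolution: only observed pixels contribute, and the sum is divided by the number of observed pixels under the kernel. The sketch below is a minimal pure-Python illustration under that assumption; the function name `sparse_conv2d`, the `eps` term, and the rule for propagating the mask are choices made for this example and are not taken from the listing above.

```python
def sparse_conv2d(x, mask, weights, bias=0.0, eps=1e-8):
    """Mask-normalized 2D convolution (valid padding, stride 1).

    x, mask : 2D lists of equal shape; mask[i][j] is 1 where x is observed, else 0.
    weights : square 2D list (k x k kernel).
    Returns (output, new_mask), where new_mask marks outputs that saw any data.
    """
    k = len(weights)
    h, w = len(x), len(x[0])
    out, new_mask = [], []
    for i in range(h - k + 1):
        out_row, mask_row = [], []
        for j in range(w - k + 1):
            acc = 0.0
            observed = 0
            for di in range(k):
                for dj in range(k):
                    m = mask[i + di][j + dj]
                    # Missing pixels (m == 0) contribute nothing to the sum.
                    acc += m * x[i + di][j + dj] * weights[di][dj]
                    observed += m
            # Normalize by the number of observed pixels, not the kernel size.
            out_row.append(acc / (observed + eps) + bias)
            # An output is valid if at least one input under the kernel was observed.
            mask_row.append(1 if observed > 0 else 0)
        out.append(out_row)
        new_mask.append(mask_row)
    return out, new_mask
```

With a constant input and an all-ones kernel, this layer returns approximately the same value whether one pixel or all pixels under the kernel are observed, which is the kind of insensitivity to sparsity level the abstract refers to.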

Oriented Response Networks


Title	Oriented Response Networks
Authors	Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao
Abstract	Deep Convolution Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability in handling significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a virtual filter bank containing the filter itself and its multiple unmaterialised rotated versions. During back-propagation, an ARF is collectively updated using errors from all its rotated versions. DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks. The oriented response produced by ORNs can also be used for image and object orientation estimation tasks. Over multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we consistently observe that replacing regular filters with the proposed ARFs leads to significant reduction in the number of network parameters and improvement in classification performance. We report the best results on several commonly used benchmarks.
Tasks	Image Classification
Published	2017-01-07
URL	http://arxiv.org/abs/1701.01833v2
PDF	http://arxiv.org/pdf/1701.01833v2.pdf
PWC	https://paperswithcode.com/paper/oriented-response-networks
Repo	https://github.com/ZhouYanzhao/ORN
Framework	torch
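
To make the "virtual filter bank" idea concrete, the following sketch builds rotated copies of a single master kernel and then folds the per-copy gradients back into one update for that master kernel. It is deliberately simplified: it assumes exactly four orientations obtained by 90-degree rotations, and the helper names (`rot90`, `rotated_bank`, `aggregate_gradients`) are hypothetical, not the API of the linked repository.

```python
def rot90(kernel):
    """Rotate a square 2D list by 90 degrees counter-clockwise."""
    n = len(kernel)
    return [[kernel[j][n - 1 - i] for j in range(n)] for i in range(n)]


def rotated_bank(kernel):
    """Return the master kernel plus its 90, 180 and 270 degree rotations."""
    bank = [kernel]
    for _ in range(3):
        bank.append(rot90(bank[-1]))
    return bank


def aggregate_gradients(grads):
    """Combine gradients from all rotated copies into one master-kernel gradient.

    grads[r] is the gradient with respect to the copy rotated by r quarter-turns.
    Each gradient is rotated back into the master kernel's frame, then summed.
    """
    n = len(grads[0])
    total = [[0.0] * n for _ in range(n)]
    for r, grad in enumerate(grads):
        aligned = grad
        # (4 - r) more quarter-turns complete a full rotation, undoing the r turns.
        for _ in range((4 - r) % 4):
            aligned = rot90(aligned)
        for i in range(n):
            for j in range(n):
                total[i][j] += aligned[i][j]
    return total
```

Only the master kernel is stored, so the rotated copies add orientation channels without adding parameters, which is consistent with the parameter reduction the abstract reports.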

Discriminating Traces with Time


Title	Discriminating Traces with Time
Authors	Saeid Tizpaz-Niari, Pavol Cerny, Bor-Yuh Evan Chang, Sriram Sankaranarayanan, Ashutosh Trivedi
Abstract	What properties about the internals of a program explain the possible differences in its overall running time for different inputs? In this paper, we propose a formal framework for considering this question we dub trace-set discrimination. We show that even though the algorithmic problem of computing maximum likelihood discriminants is NP-hard, approaches based on integer linear programming (ILP) and decision tree learning can be useful in zeroing-in on the program internals. On a set of Java benchmarks, we find that compactly-represented decision trees scalably discriminate with high accuracy—more scalably than maximum likelihood discriminants and with comparable accuracy. We demonstrate on three larger case studies how decision-tree discriminants produced by our tool are useful for debugging timing side-channel vulnerabilities (i.e., where a malicious observer infers secrets simply from passively watching execution times) and availability vulnerabilities.
Tasks
Published	2017-02-23
URL	http://arxiv.org/abs/1702.07103v1
PDF	http://arxiv.org/pdf/1702.07103v1.pdf
PWC	https://paperswithcode.com/paper/discriminating-traces-with-time
Repo	https://github.com/cuplv/Discriminer
Framework	none

Reducing Complexity of HEVC: A Deep Learning Approach


Title	Reducing Complexity of HEVC: A Deep Learning Approach
Authors	Mai Xu, Tianyi Li, Zulin Wang, Xin Deng, Ren Yang, Zhenyu Guan
Abstract	High Efficiency Video Coding (HEVC) significantly reduces bit-rates over the preceding H.264 standard but at the expense of extremely high encoding complexity. In HEVC, the quad-tree partition of coding unit (CU) consumes a large proportion of the HEVC encoding complexity, due to the brute-force search for rate-distortion optimization (RDO). Therefore, this paper proposes a deep learning approach to predict the CU partition for reducing the HEVC complexity at both intra- and inter-modes, which is based on convolutional neural network (CNN) and long short-term memory (LSTM) network. First, we establish a large-scale database including substantial CU partition data for HEVC intra- and inter-modes. This enables deep learning on the CU partition. Second, we represent the CU partition of an entire coding tree unit (CTU) in the form of a hierarchical CU partition map (HCPM). Then, we propose an early-terminated hierarchical CNN (ETH-CNN) for learning to predict the HCPM. Consequently, the encoding complexity of intra-mode HEVC can be drastically reduced by replacing the brute-force search with ETH-CNN to decide the CU partition. Third, an early-terminated hierarchical LSTM (ETH-LSTM) is proposed to learn the temporal correlation of the CU partition. Then, we combine ETH-LSTM and ETH-CNN to predict the CU partition for reducing the HEVC complexity for inter-mode. Finally, experimental results show that our approach outperforms other state-of-the-art approaches in reducing the HEVC complexity at both intra- and inter-modes.
Tasks
Published	2017-09-19
URL	http://arxiv.org/abs/1710.01218v3
PDF	http://arxiv.org/pdf/1710.01218v3.pdf
PWC	https://paperswithcode.com/paper/reducing-complexity-of-hevc-a-deep-learning
Repo	https://github.com/HEVC-Projects/CPH
Framework	none

An improved Ant Colony System for the Sequential Ordering Problem


Title	An improved Ant Colony System for the Sequential Ordering Problem
Authors	Rafał Skinderowicz
Abstract	It is not rare that the performance of one metaheuristic algorithm can be improved by incorporating ideas taken from another. In this article we present how Simulated Annealing (SA) can be used to improve the efficiency of the Ant Colony System (ACS) and Enhanced ACS when solving the Sequential Ordering Problem (SOP). Moreover, we show how the very same ideas can be applied to improve the convergence of a dedicated local search, i.e. the SOP-3-exchange algorithm. A statistical analysis of the proposed algorithms both in terms of finding suitable parameter values and the quality of the generated solutions is presented based on a series of computational experiments conducted on SOP instances from the well-known TSPLIB and SOPLIB2006 repositories. The proposed ACS-SA and EACS-SA algorithms often generate solutions of better quality than the ACS and EACS, respectively. Moreover, the EACS-SA algorithm combined with the proposed SOP-3-exchange-SA local search was able to find 10 new best solutions for the SOP instances from the SOPLIB2006 repository, thus improving the state-of-the-art results as known from the literature. Overall, the best known or improved solutions were found in 41 out of 48 cases.
Tasks
Published	2017-05-02
URL	http://arxiv.org/abs/1705.01076v1
PDF	http://arxiv.org/pdf/1705.01076v1.pdf
PWC	https://paperswithcode.com/paper/an-improved-ant-colony-system-for-the
Repo	https://github.com/RSkinderowicz/AntColonySystemSA
Framework	none
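
The abstract does not spell out how Simulated Annealing is wired into the Ant Colony System, so the sketch below shows only the generic SA ingredient that such a hybrid would borrow: the Metropolis acceptance rule, which sometimes accepts a worse solution with a probability that shrinks as the temperature drops. The function names, the geometric cooling schedule, and the `alpha` value are assumptions made for illustration.

```python
import math
import random


def sa_accept(current_cost, candidate_cost, temperature, rng=random):
    """Metropolis rule for a minimization problem.

    Always accept an equal or better candidate; accept a worse one with
    probability exp(-(candidate_cost - current_cost) / temperature).
    """
    if candidate_cost <= current_cost:
        return True
    if temperature <= 0:
        return False
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)


def cool(temperature, alpha=0.95):
    """Geometric cooling schedule (alpha is an illustrative value)."""
    return temperature * alpha
```

In a hybrid of this kind, the rule could decide whether a newly constructed ordering replaces the solution that guides later search, letting the search leave local optima early on and become greedier as the temperature falls.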

Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings


Title	Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings
Authors	Abhishek, Ashish Anand, Amit Awekar
Abstract	Fine-grained entity type classification (FETC) is the task of classifying an entity mention to a broad set of types. The distant supervision paradigm is extensively used to generate training data for this task. However, the generated training data assigns the same set of labels to every mention of an entity without considering its local context. Existing FETC systems have two major drawbacks: assuming training data to be noise-free and using hand-crafted features. Our work overcomes both drawbacks. We propose a neural network model that jointly learns entity mentions and their context representation to eliminate the use of hand-crafted features. Our model treats training data as noisy and uses a non-parametric variant of the hinge loss function. Experiments show that the proposed model outperforms previous state-of-the-art methods on two publicly available datasets, namely FIGER (GOLD) and BBN, with an average relative improvement of 2.69% in micro-F1 score. Knowledge learnt by our model on one dataset can be transferred to other datasets while using the same model or other FETC systems. These approaches of transferring knowledge further improve the performance of the respective models.
Tasks
Published	2017-02-22
URL	http://arxiv.org/abs/1702.06709v1
PDF	http://arxiv.org/pdf/1702.06709v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-entity-type-classification-by
Repo	https://github.com/abhipec/fnet
Framework	tf

Mandolin: A Knowledge Discovery Framework for the Web of Data


Title	Mandolin: A Knowledge Discovery Framework for the Web of Data
Authors	Tommaso Soru, Diego Esteves, Edgard Marx, Axel-Cyrille Ngonga Ngomo
Abstract	Markov Logic Networks join probabilistic modeling with first-order logic and have been shown to integrate well with the Semantic Web foundations. While several approaches have been devised to tackle the subproblems of rule mining, grounding, and inference, no comprehensive workflow has been proposed so far. In this paper, we fill this gap by introducing a framework called Mandolin, which implements a workflow for knowledge discovery specifically on RDF datasets. Our framework imports knowledge from referenced graphs, creates similarity relationships among similar literals, and relies on state-of-the-art techniques for rule mining, grounding, and inference computation. We show that our best configuration scales well and achieves at least comparable results with respect to other statistical-relational-learning algorithms on link prediction.
Tasks	Link Prediction, Relational Reasoning
Published	2017-11-03
URL	http://arxiv.org/abs/1711.01283v1
PDF	http://arxiv.org/pdf/1711.01283v1.pdf
PWC	https://paperswithcode.com/paper/mandolin-a-knowledge-discovery-framework-for
Repo	https://github.com/AKSW/Mandolin
Framework	none

Vehicle Traffic Driven Camera Placement for Better Metropolis Security Surveillance


Title	Vehicle Traffic Driven Camera Placement for Better Metropolis Security Surveillance
Authors	Yihui He, Xiaobo Ma, Xiapu Luo, Jianfeng Li, Mengchen Zhao, Bo An, Xiaohong Guan
Abstract	Security surveillance is one of the most important issues in smart cities, especially in an era of terrorism. Deploying a number of (video) cameras is a common surveillance approach. Given the never-ending power offered by vehicles to metropolises, exploiting vehicle traffic to design camera placement strategies could potentially facilitate security surveillance. This article constitutes the first effort toward building the linkage between vehicle traffic and security surveillance, which is a critical problem for smart cities. We expect our study could influence the decision making of surveillance camera placement, and foster more research of principled ways of security surveillance beneficial to our physical-world life. Code has been made publicly available.
Tasks	Decision Making
Published	2017-04-01
URL	http://arxiv.org/abs/1705.08508v4
PDF	http://arxiv.org/pdf/1705.08508v4.pdf
PWC	https://paperswithcode.com/paper/vehicle-traffic-driven-camera-placement-for
Repo	https://github.com/yihui-he/Vehicle-Traffic-Driven-Camera-Placement
Framework	none

Interleaved Group Convolutions for Deep Neural Networks


Title	Interleaved Group Convolutions for Deep Neural Networks
Authors	Ting Zhang, Guo-Jun Qi, Bin Xiao, Jingdong Wang
Abstract	In this paper, we present a simple and modularized neural network architecture, named interleaved group convolutional neural networks (IGCNets). The main point lies in a novel building block, a pair of two successive interleaved group convolutions: primary group convolution and secondary group convolution. The two group convolutions are complementary: (i) the convolution on each partition in primary group convolution is a spatial convolution, while on each partition in secondary group convolution, the convolution is a point-wise convolution; (ii) the channels in the same secondary partition come from different primary partitions. We discuss one representative advantage: wider than a regular convolution with the number of parameters and the computation complexity preserved. We also show that regular convolutions, group convolution with summation fusion, and the Xception block are special cases of interleaved group convolutions. Empirical results over standard benchmarks, CIFAR-10, CIFAR-100, SVHN and ImageNet demonstrate that our networks are more efficient in using parameters and computation complexity with similar or higher accuracy.
Tasks
Published	2017-07-10
URL	http://arxiv.org/abs/1707.02725v2
PDF	http://arxiv.org/pdf/1707.02725v2.pdf
PWC	https://paperswithcode.com/paper/interleaved-group-convolutions-for-deep
Repo	https://github.com/homles11/IGCV3
Framework	tf
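
Property (ii) in the abstract, that channels in one secondary partition come from different primary partitions, amounts to a channel permutation between the two group convolutions. The sketch below shows one such permutation on a plain list of channel indices. It assumes equally sized primary partitions laid out contiguously; the function names are hypothetical and the convolutions themselves are omitted.

```python
def interleave(channels, num_primary):
    """Reorder channels so each secondary partition draws one channel
    from every primary partition.

    channels    : flat list, primary partitions stored contiguously.
    num_primary : number of primary partitions (L).
    """
    n = len(channels)
    if n % num_primary != 0:
        raise ValueError("channel count must be divisible by num_primary")
    per_partition = n // num_primary  # channels per primary partition (M)
    # Channel m of primary partition l moves to slot l of secondary partition m.
    return [
        channels[l * per_partition + m]
        for m in range(per_partition)
        for l in range(num_primary)
    ]


def split(channels, size):
    """Cut a flat channel list into consecutive partitions of the given size."""
    return [channels[i:i + size] for i in range(0, len(channels), size)]
```

For six channels in two primary partitions, `interleave(list(range(6)), 2)` gives `[0, 3, 1, 4, 2, 5]`, and `split(..., 2)` then yields the secondary partitions `[0, 3]`, `[1, 4]`, `[2, 5]`, each mixing both primary partitions.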

Utilizing Domain Knowledge in End-to-End Audio Processing


Title	Utilizing Domain Knowledge in End-to-End Audio Processing
Authors	Tycho Max Sylvester Tax, Jose Luis Diez Antich, Hendrik Purwins, Lars Maaløe
Abstract	End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations. In this paper we present preliminary work that shows the feasibility of training the first layers of a deep convolutional neural network (CNN) model to learn the commonly-used log-scaled mel-spectrogram transformation. Secondly, we demonstrate that upon initializing the first layers of an end-to-end CNN classifier with the learned transformation, convergence and performance on the ESC-50 environmental sound classification dataset are similar to a CNN-based model trained on the highly pre-processed log-scaled mel-spectrogram features.
Tasks	Environmental Sound Classification
Published	2017-12-01
URL	http://arxiv.org/abs/1712.00254v1
PDF	http://arxiv.org/pdf/1712.00254v1.pdf
PWC	https://paperswithcode.com/paper/utilizing-domain-knowledge-in-end-to-end
Repo	https://github.com/corticph/MSTmodel
Framework	tf

EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples


Title	EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
Authors	Pin-Yu Chen, Yash Sharma, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh
Abstract	Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples - a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on $L_2$ and $L_\infty$ distortion metrics. However, despite the fact that $L_1$ distortion accounts for the total variation and encourages sparsity in the perturbation, little has been developed for crafting $L_1$-based adversarial examples. In this paper, we formulate the process of attacking DNNs via adversarial examples as an elastic-net regularized optimization problem. Our elastic-net attacks to DNNs (EAD) feature $L_1$-oriented adversarial examples and include the state-of-the-art $L_2$ attack as a special case. Experimental results on MNIST, CIFAR10 and ImageNet show that EAD can yield a distinct set of adversarial examples with small $L_1$ distortion and attains similar attack performance to the state-of-the-art methods in different attack scenarios. More importantly, EAD leads to improved attack transferability and complements adversarial training for DNNs, suggesting novel insights on leveraging $L_1$ distortion in adversarial machine learning and security implications of DNNs.
Tasks
Published	2017-09-13
URL	http://arxiv.org/abs/1709.04114v3
PDF	http://arxiv.org/pdf/1709.04114v3.pdf
PWC	https://paperswithcode.com/paper/ead-elastic-net-attacks-to-deep-neural
Repo	https://github.com/ysharma1126/EAD-Attack
Framework	tf
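
The elastic-net regularizer mentioned in the abstract combines an $L_1$ and a squared $L_2$ distortion term. The sketch below computes that combined distortion and a shrinkage-thresholding step of the kind typically used to optimize $L_1$-penalized objectives. The weight name `beta`, the `[0, 1]` pixel range, and the exact form of the shrinkage step are assumptions for this illustration; the attack loss and the full optimization loop are omitted.

```python
def elastic_net_distortion(x, x0, beta):
    """beta * ||x - x0||_1 + ||x - x0||_2^2 for flat lists of pixel values."""
    l1 = sum(abs(a - b) for a, b in zip(x, x0))
    l2_squared = sum((a - b) ** 2 for a, b in zip(x, x0))
    return beta * l1 + l2_squared


def shrink(z, x0, beta):
    """Projected shrinkage-thresholding step.

    Perturbations smaller than beta are reset to the original pixel, which is
    what produces sparse changes; larger ones are pulled toward x0 by beta and
    clipped to the assumed valid pixel range [0, 1].
    """
    out = []
    for zi, x0i in zip(z, x0):
        diff = zi - x0i
        if diff > beta:
            out.append(min(zi - beta, 1.0))
        elif diff < -beta:
            out.append(max(zi + beta, 0.0))
        else:
            out.append(x0i)
    return out
```

Setting `beta` to zero leaves only the squared $L_2$ term, which matches the abstract's remark that the $L_2$ attack is a special case.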

Diverse Weighted Bipartite b-Matching


Title	Diverse Weighted Bipartite b-Matching
Authors	Faez Ahmed, John P. Dickerson, Mark Fuge
Abstract	Bipartite matching, where agents on one side of a market are matched to agents or items on the other, is a classical problem in computer science and economics, with widespread application in healthcare, education, advertising, and general resource allocation. A practitioner’s goal is typically to maximize a matching market’s economic efficiency, possibly subject to some fairness requirements that promote equal access to resources. A natural balancing act exists between fairness and efficiency in matching markets, and has been the subject of much research. In this paper, we study a complementary goal—balancing diversity and efficiency—in a generalization of bipartite matching where agents on one side of the market can be matched to sets of agents on the other. Adapting a classical definition of the diversity of a set, we propose a quadratic programming-based approach to solving a supermodular minimization problem that balances diversity and total weight of the solution. We also provide a scalable greedy algorithm with theoretical performance bounds. We then define the price of diversity, a measure of the efficiency loss due to enforcing diversity, and give a worst-case theoretical bound. Finally, we demonstrate the efficacy of our methods on three real-world datasets, and show that the price of diversity is not bad in practice.
Tasks
Published	2017-02-23
URL	http://arxiv.org/abs/1702.07134v2
PDF	http://arxiv.org/pdf/1702.07134v2.pdf
PWC	https://paperswithcode.com/paper/diverse-weighted-bipartite-b-matching
Repo	https://github.com/faezahmed/diverse_matching
Framework	none