Paper Group AWR 94
Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge
Title | Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge |
Authors | Dinh-Luan Nguyen, Kai Cao, Anil K. Jain |
Abstract | We propose a fully automatic minutiae extractor, called MinutiaeNet, based on deep neural networks with compact feature representation for fast comparison of minutiae sets. Specifically, first a network, called CoarseNet, estimates the minutiae score map and minutiae orientation based on a convolutional neural network and fingerprint domain knowledge (enhanced image, orientation field, and segmentation map). Subsequently, another network, called FineNet, refines the candidate minutiae locations based on the score map. We demonstrate the effectiveness of using fingerprint domain knowledge together with the deep networks. Experimental results on both latent (NIST SD27) and plain (FVC 2004) public domain fingerprint datasets provide comprehensive empirical support for the merits of our method. Further, our method finds minutiae sets that are better in terms of precision and recall in comparison with the state-of-the-art on these two datasets. Given the lack of annotated fingerprint datasets with minutiae ground truth, the proposed approach to robust minutiae detection will be useful for training network-based fingerprint matching algorithms as well as for evaluating fingerprint individuality at scale. MinutiaeNet is implemented in TensorFlow: https://github.com/luannd/MinutiaeNet |
Tasks | |
Published | 2017-12-26 |
URL | http://arxiv.org/abs/1712.09401v1 |
http://arxiv.org/pdf/1712.09401v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-minutiae-extractor-integrating-deep |
Repo | https://github.com/luannd/MinutiaeNet |
Framework | tf |
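A rough way to picture the two-stage design: CoarseNet yields a dense minutiae score map plus orientations, and candidate locations are read off as thresholded local maxima before FineNet accepts or rejects them. The sketch below shows only that candidate-extraction step in NumPy, with an illustrative threshold and suppression window; it is not the authors' implementation.

```python
import numpy as np

def extract_candidates(score_map, orientation_map, thresh=0.5, nms_radius=8):
    """Pick thresholded local maxima of a minutiae score map as candidates.

    score_map:       (H, W) array in [0, 1] from a coarse detector.
    orientation_map: (H, W) array of minutiae angles in radians.
    Returns a list of (row, col, angle, score) tuples.
    """
    h, w = score_map.shape
    candidates = []
    for r, c in zip(*np.nonzero(score_map >= thresh)):
        r0, r1 = max(0, r - nms_radius), min(h, r + nms_radius + 1)
        c0, c1 = max(0, c - nms_radius), min(w, c + nms_radius + 1)
        # keep a point only if it is the maximum inside its suppression window
        if score_map[r, c] >= score_map[r0:r1, c0:c1].max():
            candidates.append((int(r), int(c),
                               float(orientation_map[r, c]),
                               float(score_map[r, c])))
    return candidates

# toy usage: random maps stand in for CoarseNet outputs
rng = np.random.default_rng(0)
scores = rng.random((64, 64))
orients = rng.uniform(0, 2 * np.pi, (64, 64))
print(len(extract_candidates(scores, orients, thresh=0.95)))
```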
Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction
Title | Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction |
Authors | Savas Ozkan, Gozde Bozdagi Akar |
Abstract | Frame-level visual features are generally aggregated in time with techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust video-level representation. We introduce a learnable aggregation technique whose primary objective is to retain the short-time temporal structure between frame-level features and their spatial interdependencies in the representation. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a 65% score on the test dataset in the official MAP evaluation, only one misclassified decision away from the best reported result in the ChaLearn Challenge (i.e., 66.7%). Lastly, we believe that this method can be extended to different problems such as action/event recognition in the future. |
Tasks | |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07335v1 |
http://arxiv.org/pdf/1708.07335v1.pdf | |
PWC | https://paperswithcode.com/paper/relaxed-spatio-temporal-deep-feature |
Repo | https://github.com/savasozkan/real-fake-emotions |
Framework | tf |
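One way to read "retaining short-time temporal structure" is to aggregate features over short overlapping windows and keep the window order, instead of pooling the whole clip into a single vector. The NumPy sketch below illustrates that idea only; the window length, stride, and mean pooling are assumptions, not the learnable aggregation the paper proposes.

```python
import numpy as np

def windowed_aggregate(frame_feats, win=8, stride=4):
    """Aggregate frame-level features over short temporal windows.

    frame_feats: (T, D) array of per-frame descriptors.
    Returns a (num_windows * D,) video-level descriptor that keeps
    short-time temporal ordering, unlike a single global average.
    """
    T, D = frame_feats.shape
    chunks = []
    for start in range(0, max(T - win + 1, 1), stride):
        chunk = frame_feats[start:start + win]
        chunks.append(chunk.mean(axis=0))        # pool within the window
    return np.concatenate(chunks)                # keep window order

feats = np.random.randn(40, 128)                 # 40 frames, 128-D features
video_vec = windowed_aggregate(feats)
print(video_vec.shape)
```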
SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes
Title | SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes |
Authors | Marc Assens, Kevin McGuinness, Xavier Giro-i-Nieto, Noel E. O’Connor |
Abstract | We introduce SaltiNet, a deep neural network for scanpath prediction trained on 360-degree images. The model is based on a novel temporal-aware representation of saliency information named the saliency volume. The first part of the network consists of a model trained to generate saliency volumes, whose parameters are fit by back-propagation of a binary cross entropy (BCE) loss computed over downsampled versions of the saliency volumes. Sampling strategies over these volumes are used to generate scanpaths over the 360-degree images. Our experiments show the advantages of using saliency volumes and how they can be used for related tasks. Our source code and trained models are available at https://github.com/massens/saliency-360salient-2017. |
Tasks | |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03123v5 |
http://arxiv.org/pdf/1707.03123v5.pdf | |
PWC | https://paperswithcode.com/paper/saltinet-scan-path-prediction-on-360-degree |
Repo | https://github.com/massens/saliency-360salient-2017 |
Framework | none |
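The scan-path generation step can be illustrated directly: treat each temporal slice of the saliency volume as an (unnormalised) probability map and draw one fixation per slice proportionally to saliency. This is a minimal sketch of such a sampling strategy, not the exact procedure used by SaltiNet.

```python
import numpy as np

def sample_scanpath(saliency_volume, rng=None):
    """Draw one fixation per temporal slice of a saliency volume.

    saliency_volume: (T, H, W) array of non-negative saliency values.
    Returns an array of (row, col) fixation coordinates, one per slice.
    """
    rng = rng or np.random.default_rng()
    T, H, W = saliency_volume.shape
    path = []
    for t in range(T):
        probs = saliency_volume[t].ravel()
        probs = probs / probs.sum()               # normalise to a distribution
        idx = rng.choice(H * W, p=probs)          # sample proportionally to saliency
        path.append(divmod(idx, W))               # flat index back to (row, col)
    return np.array(path)

volume = np.random.rand(5, 32, 64)                # toy 5-slice saliency volume
print(sample_scanpath(volume))
```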
Sparsity Invariant CNNs
Title | Sparsity Invariant CNNs |
Authors | Jonas Uhrig, Nick Schneider, Lukas Schneider, Uwe Franke, Thomas Brox, Andreas Geiger |
Abstract | In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings and will be made available upon publication. |
Tasks | Depth Estimation |
Published | 2017-08-22 |
URL | http://arxiv.org/abs/1708.06500v2 |
http://arxiv.org/pdf/1708.06500v2.pdf | |
PWC | https://paperswithcode.com/paper/sparsity-invariant-cnns |
Repo | https://github.com/PeterTor/sparse_convolution |
Framework | tf |
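The sparse convolution layer can be written in a few lines for a single 2-D channel: observed values are masked before the convolution, the response is renormalised by the number of valid pixels in each window, and the validity mask is propagated with a max operation. The NumPy/SciPy sketch below follows that recipe; it is an illustration of the normalisation, not the authors' TensorFlow layer.

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import maximum_filter

def sparse_conv2d(x, mask, kernel, eps=1e-8):
    """Sparsity-invariant convolution for one 2-D channel.

    x:      (H, W) input, arbitrary values at invalid pixels.
    mask:   (H, W) binary map, 1 where x is observed.
    kernel: (k, k) convolution weights.
    Returns the normalised response and the propagated mask.
    """
    num = convolve2d(x * mask, kernel, mode="same")        # convolve observed values only
    den = convolve2d(mask, np.ones_like(kernel), mode="same")
    out = num / (den + eps)                                # normalise by count of valid pixels
    new_mask = maximum_filter(mask, size=kernel.shape)     # a pixel becomes valid if any valid
    return out, new_mask                                   # input falls inside its window

depth = np.random.rand(48, 64)
mask = (np.random.rand(48, 64) > 0.95).astype(float)       # ~5% valid measurements
k = np.ones((5, 5))                                        # averaging kernel for illustration
out, new_mask = sparse_conv2d(depth, mask, k)
print(out.shape, new_mask.mean())
```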
Oriented Response Networks
Title | Oriented Response Networks |
Authors | Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao |
Abstract | Deep Convolution Neural Networks (DCNNs) are capable of learning unprecedentedly effective image representations. However, their ability in handling significant local and global image rotations remains limited. In this paper, we propose Active Rotating Filters (ARFs) that actively rotate during convolution and produce feature maps with location and orientation explicitly encoded. An ARF acts as a virtual filter bank containing the filter itself and its multiple unmaterialised rotated versions. During back-propagation, an ARF is collectively updated using errors from all its rotated versions. DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks. The oriented response produced by ORNs can also be used for image and object orientation estimation tasks. Over multiple state-of-the-art DCNN architectures, such as VGG, ResNet, and STN, we consistently observe that replacing regular filters with the proposed ARFs leads to significant reduction in the number of network parameters and improvement in classification performance. We report the best results on several commonly used benchmarks. |
Tasks | Image Classification |
Published | 2017-01-07 |
URL | http://arxiv.org/abs/1701.01833v2 |
http://arxiv.org/pdf/1701.01833v2.pdf | |
PWC | https://paperswithcode.com/paper/oriented-response-networks |
Repo | https://github.com/ZhouYanzhao/ORN |
Framework | torch |
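The mechanism behind an Active Rotating Filter can be mimicked outside a network: rotate one kernel into N orientations, convolve with each copy, and keep the oriented responses (taking the maximum over orientations gives a rotation-insensitive map). The SciPy sketch below uses plain image rotation on a single channel and ignores the coordinate/orientation interpolation the real ARFs use.

```python
import numpy as np
from scipy.ndimage import rotate
from scipy.signal import convolve2d

def oriented_responses(image, kernel, n_orientations=8):
    """Convolve an image with N rotated copies of one kernel.

    Returns an (N, H, W) stack of oriented response maps; taking the
    maximum over the first axis gives a rotation-insensitive response.
    """
    responses = []
    for i in range(n_orientations):
        angle = 360.0 * i / n_orientations
        rk = rotate(kernel, angle, reshape=False, order=1)   # rotated filter copy
        responses.append(convolve2d(image, rk, mode="same"))
    return np.stack(responses)

edge = np.array([[1., 0., -1.],
                 [2., 0., -2.],
                 [1., 0., -1.]])                             # Sobel-like kernel
img = np.random.rand(32, 32)
resp = oriented_responses(img, edge)
rotation_insensitive = resp.max(axis=0)                      # pool over orientations
print(resp.shape, rotation_insensitive.shape)
```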
Discriminating Traces with Time
Title | Discriminating Traces with Time |
Authors | Saeid Tizpaz-Niari, Pavol Cerny, Bor-Yuh Evan Chang, Sriram Sankaranarayanan, Ashutosh Trivedi |
Abstract | What properties about the internals of a program explain the possible differences in its overall running time for different inputs? In this paper, we propose a formal framework for considering this question, which we dub trace-set discrimination. We show that even though the algorithmic problem of computing maximum likelihood discriminants is NP-hard, approaches based on integer linear programming (ILP) and decision tree learning can be useful in zeroing in on the program internals. On a set of Java benchmarks, we find that compactly represented decision trees discriminate with high accuracy, scaling better than maximum likelihood discriminants while achieving comparable accuracy. We demonstrate on three larger case studies how decision-tree discriminants produced by our tool are useful for debugging timing side-channel vulnerabilities (i.e., where a malicious observer infers secrets simply from passively watching execution times) and availability vulnerabilities. |
Tasks | |
Published | 2017-02-23 |
URL | http://arxiv.org/abs/1702.07103v1 |
http://arxiv.org/pdf/1702.07103v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminating-traces-with-time |
Repo | https://github.com/cuplv/Discriminer |
Framework | none |
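The decision-tree discriminant idea is easy to reproduce with scikit-learn: summarise each execution trace as a feature vector, label it with its timing cluster, and fit a small tree whose paths explain which internals separate fast runs from slow ones. The features and data below are synthetic stand-ins; the paper's tool derives them from instrumented Java traces.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)

# synthetic traces: columns are call counts of three hypothetical methods
n = 200
calls = rng.integers(0, 20, size=(n, 3))
# pretend the second method dominates running time
running_time = 0.5 * calls[:, 1] + rng.normal(0, 0.5, n)
timing_cluster = (running_time > np.median(running_time)).astype(int)  # slow vs. fast

tree = DecisionTreeClassifier(max_depth=2).fit(calls, timing_cluster)
# the printed paths act as a human-readable discriminant of the timing clusters
print(export_text(tree, feature_names=["parse_calls", "crypto_calls", "log_calls"]))
```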
Reducing Complexity of HEVC: A Deep Learning Approach
Title | Reducing Complexity of HEVC: A Deep Learning Approach |
Authors | Mai Xu, Tianyi Li, Zulin Wang, Xin Deng, Ren Yang, Zhenyu Guan |
Abstract | High Efficiency Video Coding (HEVC) significantly reduces bit-rates over the preceding H.264 standard, but at the expense of extremely high encoding complexity. In HEVC, the quad-tree partition of the coding unit (CU) consumes a large proportion of the HEVC encoding complexity, due to the brute-force search for rate-distortion optimization (RDO). Therefore, this paper proposes a deep learning approach to predict the CU partition for reducing the HEVC complexity at both intra- and inter-modes, based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network. First, we establish a large-scale database including substantial CU partition data for HEVC intra- and inter-modes. This enables deep learning on the CU partition. Second, we represent the CU partition of an entire coding tree unit (CTU) in the form of a hierarchical CU partition map (HCPM). Then, we propose an early-terminated hierarchical CNN (ETH-CNN) for learning to predict the HCPM. Consequently, the encoding complexity of intra-mode HEVC can be drastically reduced by replacing the brute-force search with ETH-CNN to decide the CU partition. Third, an early-terminated hierarchical LSTM (ETH-LSTM) is proposed to learn the temporal correlation of the CU partition. Then, we combine ETH-LSTM and ETH-CNN to predict the CU partition for reducing the HEVC complexity at inter-mode. Finally, experimental results show that our approach outperforms other state-of-the-art approaches in reducing the HEVC complexity at both intra- and inter-modes. |
Tasks | |
Published | 2017-09-19 |
URL | http://arxiv.org/abs/1710.01218v3 |
http://arxiv.org/pdf/1710.01218v3.pdf | |
PWC | https://paperswithcode.com/paper/reducing-complexity-of-hevc-a-deep-learning |
Repo | https://github.com/HEVC-Projects/CPH |
Framework | none |
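The early-termination idea behind ETH-CNN/ETH-LSTM can be shown with plain control flow: split decisions are consumed level by level, and once a CU is predicted not to split, none of its sub-CUs are examined, which is where the saving over the exhaustive RDO search comes from. The predictor below is a made-up stand-in for the network's HCPM output; the real scheme covers a 64x64 CTU with three decision levels.

```python
def partition_ctu(predict_split, x, y, size, min_size=8):
    """Recursively build a CU partition with early termination.

    predict_split(x, y, size) -> probability that the CU at (x, y) of the
    given size should be split (stand-in for an ETH-CNN prediction).
    Returns a list of (x, y, size) leaf CUs.
    """
    if size <= min_size or predict_split(x, y, size) < 0.5:
        return [(x, y, size)]                    # early termination: stop descending
    half = size // 2
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            leaves += partition_ctu(predict_split, x + dx, y + dy, half, min_size)
    return leaves

# toy predictor: always split 64x64, split some 32x32 CUs, never split 16x16
def toy_predictor(x, y, size):
    return {64: 0.9, 32: 0.6 if (x + y) % 64 == 0 else 0.2, 16: 0.1}.get(size, 0.0)

print(partition_ctu(toy_predictor, 0, 0, 64))
```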
An improved Ant Colony System for the Sequential Ordering Problem
Title | An improved Ant Colony System for the Sequential Ordering Problem |
Authors | Rafał Skinderowicz |
Abstract | It is not rare that the performance of one metaheuristic algorithm can be improved by incorporating ideas taken from another. In this article we present how Simulated Annealing (SA) can be used to improve the efficiency of the Ant Colony System (ACS) and Enhanced ACS when solving the Sequential Ordering Problem (SOP). Moreover, we show how the very same ideas can be applied to improve the convergence of a dedicated local search, i.e. the SOP-3-exchange algorithm. A statistical analysis of the proposed algorithms both in terms of finding suitable parameter values and the quality of the generated solutions is presented based on a series of computational experiments conducted on SOP instances from the well-known TSPLIB and SOPLIB2006 repositories. The proposed ACS-SA and EACS-SA algorithms often generate solutions of better quality than the ACS and EACS, respectively. Moreover, the EACS-SA algorithm combined with the proposed SOP-3-exchange-SA local search was able to find 10 new best solutions for the SOP instances from the SOPLIB2006 repository, thus improving the state-of-the-art results as known from the literature. Overall, the best known or improved solutions were found in 41 out of 48 cases. |
Tasks | |
Published | 2017-05-02 |
URL | http://arxiv.org/abs/1705.01076v1 |
http://arxiv.org/pdf/1705.01076v1.pdf | |
PWC | https://paperswithcode.com/paper/an-improved-ant-colony-system-for-the |
Repo | https://github.com/RSkinderowicz/AntColonySystemSA |
Framework | none |
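The ingredient borrowed from Simulated Annealing is the acceptance rule: a newly constructed tour that is worse than the current one may still be accepted with a probability that decays with the cost gap and the temperature, which helps the colony escape the pull of the current best solution. Below is a minimal sketch of such an acceptance test with a geometric cooling schedule; the actual ACS-SA ties its temperature schedule to properties of the instance.

```python
import math
import random

def sa_accept(current_cost, candidate_cost, temperature, rng=random):
    """Metropolis-style acceptance used to diversify the ant colony search."""
    if candidate_cost <= current_cost:
        return True                                      # better tours are always kept
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)  # worse tours kept with decaying prob.

# toy usage: anneal the temperature over iterations
cost, temperature = 1000.0, 50.0
for it in range(5):
    candidate = cost + random.uniform(-30, 30)           # stand-in for a new ant tour
    if sa_accept(cost, candidate, temperature):
        cost = candidate
    temperature *= 0.9                                   # geometric cooling
print(round(cost, 1))
```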
Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings
Title | Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings |
Authors | Abhishek, Ashish Anand, Amit Awekar |
Abstract | Fine-grained entity type classification (FETC) is the task of classifying an entity mention into a broad set of types. The distant supervision paradigm is extensively used to generate training data for this task. However, the generated training data assigns the same set of labels to every mention of an entity without considering its local context. Existing FETC systems have two major drawbacks: assuming the training data to be noise-free and relying on hand-crafted features. Our work overcomes both drawbacks. We propose a neural network model that jointly learns entity mention and context representations, eliminating the use of hand-crafted features. Our model treats the training data as noisy and uses a non-parametric variant of the hinge loss function. Experiments show that the proposed model outperforms previous state-of-the-art methods on two publicly available datasets, namely FIGER (GOLD) and BBN, with an average relative improvement of 2.69% in micro-F1 score. Knowledge learnt by our model on one dataset can be transferred to other datasets while using the same model or other FETC systems. These approaches of transferring knowledge further improve the performance of the respective models. |
Tasks | |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06709v1 |
http://arxiv.org/pdf/1702.06709v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-entity-type-classification-by |
Repo | https://github.com/abhipec/fnet |
Framework | tf |
Mandolin: A Knowledge Discovery Framework for the Web of Data
Title | Mandolin: A Knowledge Discovery Framework for the Web of Data |
Authors | Tommaso Soru, Diego Esteves, Edgard Marx, Axel-Cyrille Ngonga Ngomo |
Abstract | Markov Logic Networks join probabilistic modeling with first-order logic and have been shown to integrate well with the Semantic Web foundations. While several approaches have been devised to tackle the subproblems of rule mining, grounding, and inference, no comprehensive workflow has been proposed so far. In this paper, we fill this gap by introducing a framework called Mandolin, which implements a workflow for knowledge discovery specifically on RDF datasets. Our framework imports knowledge from referenced graphs, creates similarity relationships among similar literals, and relies on state-of-the-art techniques for rule mining, grounding, and inference computation. We show that our best configuration scales well and achieves at least comparable results with respect to other statistical-relational-learning algorithms on link prediction. |
Tasks | Link Prediction, Relational Reasoning |
Published | 2017-11-03 |
URL | http://arxiv.org/abs/1711.01283v1 |
http://arxiv.org/pdf/1711.01283v1.pdf | |
PWC | https://paperswithcode.com/paper/mandolin-a-knowledge-discovery-framework-for |
Repo | https://github.com/AKSW/Mandolin |
Framework | none |
Vehicle Traffic Driven Camera Placement for Better Metropolis Security Surveillance
Title | Vehicle Traffic Driven Camera Placement for Better Metropolis Security Surveillance |
Authors | Yihui He, Xiaobo Ma, Xiapu Luo, Jianfeng Li, Mengchen Zhao, Bo An, Xiaohong Guan |
Abstract | Security surveillance is one of the most important issues in smart cities, especially in an era of terrorism. Deploying a number of (video) cameras is a common surveillance approach. Given the never-ending power offered by vehicles to metropolises, exploiting vehicle traffic to design camera placement strategies could potentially facilitate security surveillance. This article constitutes the first effort toward building the linkage between vehicle traffic and security surveillance, which is a critical problem for smart cities. We expect our study could influence the decision making of surveillance camera placement, and foster more research of principled ways of security surveillance beneficial to our physical-world life. Code has been made publicly available. |
Tasks | Decision Making |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1705.08508v4 |
http://arxiv.org/pdf/1705.08508v4.pdf | |
PWC | https://paperswithcode.com/paper/vehicle-traffic-driven-camera-placement-for |
Repo | https://github.com/yihui-he/Vehicle-Traffic-Driven-Camera-Placement |
Framework | none |
Interleaved Group Convolutions for Deep Neural Networks
Title | Interleaved Group Convolutions for Deep Neural Networks |
Authors | Ting Zhang, Guo-Jun Qi, Bin Xiao, Jingdong Wang |
Abstract | In this paper, we present a simple and modularized neural network architecture, named interleaved group convolutional neural networks (IGCNets). The main point lies in a novel building block, a pair of two successive interleaved group convolutions: primary group convolution and secondary group convolution. The two group convolutions are complementary: (i) the convolution on each partition in primary group convolution is a spatial convolution, while on each partition in secondary group convolution, the convolution is a point-wise convolution; (ii) the channels in the same secondary partition come from different primary partitions. We discuss one representative advantage: Wider than a regular convolution with the number of parameters and the computation complexity preserved. We also show that regular convolutions, group convolution with summation fusion, and the Xception block are special cases of interleaved group convolutions. Empirical results over standard benchmarks, CIFAR-$10$, CIFAR-$100$, SVHN and ImageNet demonstrate that our networks are more efficient in using parameters and computation complexity with similar or higher accuracy. |
Tasks | |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.02725v2 |
http://arxiv.org/pdf/1707.02725v2.pdf | |
PWC | https://paperswithcode.com/paper/interleaved-group-convolutions-for-deep |
Repo | https://github.com/homles11/IGCV3 |
Framework | tf |
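The building block is compact enough to sketch in PyTorch: a grouped 3x3 convolution does the spatial filtering inside each primary partition, the channels are permuted so that every secondary partition receives one channel from each primary partition, and a grouped 1x1 convolution mixes them. The sketch assumes the channel count is divisible by the number of primary partitions; it illustrates the interleaving, not the exact IGCNets block.

```python
import torch
import torch.nn as nn

class InterleavedGroupConv(nn.Module):
    def __init__(self, channels, primary_groups=4):
        super().__init__()
        assert channels % primary_groups == 0
        secondary_groups = channels // primary_groups
        # primary: spatial 3x3 convolution inside each of the L partitions
        self.primary = nn.Conv2d(channels, channels, 3, padding=1, groups=primary_groups)
        # secondary: point-wise 1x1 convolution inside each of the M partitions
        self.secondary = nn.Conv2d(channels, channels, 1, groups=secondary_groups)
        self.primary_groups = primary_groups

    def forward(self, x):
        n, c, h, w = x.shape
        x = self.primary(x)
        # interleave: each secondary partition gets one channel per primary partition
        x = x.view(n, self.primary_groups, c // self.primary_groups, h, w)
        x = x.transpose(1, 2).reshape(n, c, h, w)
        return self.secondary(x)

block = InterleavedGroupConv(channels=16, primary_groups=4)
print(block(torch.randn(2, 16, 8, 8)).shape)   # torch.Size([2, 16, 8, 8])
```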
Utilizing Domain Knowledge in End-to-End Audio Processing
Title | Utilizing Domain Knowledge in End-to-End Audio Processing |
Authors | Tycho Max Sylvester Tax, Jose Luis Diez Antich, Hendrik Purwins, Lars Maaløe |
Abstract | End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations. In this paper we present preliminary work that shows the feasibility of training the first layers of a deep convolutional neural network (CNN) model to learn the commonly-used log-scaled mel-spectrogram transformation. Secondly, we demonstrate that upon initializing the first layers of an end-to-end CNN classifier with the learned transformation, convergence and performance on the ESC-50 environmental sound classification dataset are similar to a CNN-based model trained on the highly pre-processed log-scaled mel-spectrogram features. |
Tasks | Environmental Sound Classification |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00254v1 |
http://arxiv.org/pdf/1712.00254v1.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-domain-knowledge-in-end-to-end |
Repo | https://github.com/corticph/MSTmodel |
Framework | tf |
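The transformation the first layers are trained to reproduce is the standard log-scaled mel-spectrogram, which can be computed with librosa and used as a regression target for a convolutional front-end operating on raw audio. The sketch below only produces that target; the parameter values are illustrative, not the paper's settings.

```python
import numpy as np
import librosa

# one second of a synthetic 440 Hz tone stands in for an ESC-50 clip
sr = 22050
t = np.linspace(0, 1, sr, endpoint=False)
y = 0.5 * np.sin(2 * np.pi * 440 * t)

# the representation an end-to-end front-end would be trained to approximate
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=512, n_mels=64)
log_mel = librosa.power_to_db(mel, ref=np.max)
print(log_mel.shape)   # (64, number_of_frames)
```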
EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples
Title | EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples |
Authors | Pin-Yu Chen, Yash Sharma, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh |
Abstract | Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples - a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on $L_2$ and $L_\infty$ distortion metrics. However, despite the fact that $L_1$ distortion accounts for the total variation and encourages sparsity in the perturbation, little has been developed for crafting $L_1$-based adversarial examples. In this paper, we formulate the process of attacking DNNs via adversarial examples as an elastic-net regularized optimization problem. Our elastic-net attacks to DNNs (EAD) feature $L_1$-oriented adversarial examples and include the state-of-the-art $L_2$ attack as a special case. Experimental results on MNIST, CIFAR10 and ImageNet show that EAD can yield a distinct set of adversarial examples with small $L_1$ distortion and attains similar attack performance to the state-of-the-art methods in different attack scenarios. More importantly, EAD leads to improved attack transferability and complements adversarial training for DNNs, suggesting novel insights on leveraging $L_1$ distortion in adversarial machine learning and security implications of DNNs. |
Tasks | |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.04114v3 |
http://arxiv.org/pdf/1709.04114v3.pdf | |
PWC | https://paperswithcode.com/paper/ead-elastic-net-attacks-to-deep-neural |
Repo | https://github.com/ysharma1126/EAD-Attack |
Framework | tf |
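The elastic-net formulation penalises the perturbation with both L1 and squared L2 terms, and attacks of this kind are typically optimised with iterative shrinkage-thresholding (ISTA)-style updates whose characteristic step is an element-wise soft-thresholding of the perturbation. The NumPy sketch below shows only that shrinkage operator and one proximal-style update with a stand-in gradient; it is not the full EAD attack.

```python
import numpy as np

def soft_threshold(z, beta):
    """Element-wise shrinkage: the proximal operator of beta * ||z||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - beta, 0.0)

def ista_step(x_adv, x_orig, grad, lr=0.01, beta=1e-3, box=(0.0, 1.0)):
    """One ISTA-style update of an adversarial image.

    grad is the gradient of the attack loss w.r.t. x_adv; the L1 penalty on
    (x_adv - x_orig) is handled by the shrinkage step rather than the gradient.
    """
    z = x_adv - lr * grad                        # gradient step on the smooth part
    delta = soft_threshold(z - x_orig, beta)     # shrink the perturbation toward zero
    return np.clip(x_orig + delta, *box)         # keep the image in a valid range

x0 = np.random.rand(8, 8)
x_adv = x0.copy()
fake_grad = np.random.randn(8, 8)                # stands in for a model gradient
x_adv = ista_step(x_adv, x0, fake_grad)
print(np.abs(x_adv - x0).sum())                  # L1 size of the perturbation
```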
Diverse Weighted Bipartite b-Matching
Title | Diverse Weighted Bipartite b-Matching |
Authors | Faez Ahmed, John P. Dickerson, Mark Fuge |
Abstract | Bipartite matching, where agents on one side of a market are matched to agents or items on the other, is a classical problem in computer science and economics, with widespread application in healthcare, education, advertising, and general resource allocation. A practitioner’s goal is typically to maximize a matching market’s economic efficiency, possibly subject to some fairness requirements that promote equal access to resources. A natural balancing act exists between fairness and efficiency in matching markets, and has been the subject of much research. In this paper, we study a complementary goal—balancing diversity and efficiency—in a generalization of bipartite matching where agents on one side of the market can be matched to sets of agents on the other. Adapting a classical definition of the diversity of a set, we propose a quadratic programming-based approach to solving a supermodular minimization problem that balances diversity and total weight of the solution. We also provide a scalable greedy algorithm with theoretical performance bounds. We then define the price of diversity, a measure of the efficiency loss due to enforcing diversity, and give a worst-case theoretical bound. Finally, we demonstrate the efficacy of our methods on three real-world datasets, and show that the price of diversity is not bad in practice. |
Tasks | |
Published | 2017-02-23 |
URL | http://arxiv.org/abs/1702.07134v2 |
http://arxiv.org/pdf/1702.07134v2.pdf | |
PWC | https://paperswithcode.com/paper/diverse-weighted-bipartite-b-matching |
Repo | https://github.com/faezahmed/diverse_matching |
Framework | none |
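The greedy companion to the quadratic program can be sketched directly: items are added to an agent's set one at a time, each time choosing the item whose edge weight minus a similarity penalty against the items already picked is largest, so a single parameter trades total weight against diversity. The similarity measure and trade-off weight below are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def greedy_diverse_match(weights, similarity, b, lam=0.5):
    """Greedily pick b items for one agent, trading weight against diversity.

    weights:    (n,) edge weights between the agent and each item.
    similarity: (n, n) pairwise item similarity (higher = more alike).
    lam:        how strongly similarity to already-chosen items is penalised.
    """
    chosen = []
    remaining = set(range(len(weights)))
    while len(chosen) < b and remaining:
        def gain(i):
            penalty = sum(similarity[i, j] for j in chosen)
            return weights[i] - lam * penalty
        best = max(remaining, key=gain)           # largest marginal gain
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(2)
w = rng.random(10)
sim = rng.random((10, 10))
sim = (sim + sim.T) / 2                           # symmetric similarities
print(greedy_diverse_match(w, sim, b=3))
```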