Paper Group ANR 225
Fusion of Methods Based on Minutiae, Ridges and Pores for Robust Fingerprint Recognition
Title | Fusion of Methods Based on Minutiae, Ridges and Pores for Robust Fingerprint Recognition |
Authors | Lucas Alexandre Ramos, Aparecido Nilceu Marana |
Abstract | The use of physical and behavioral characteristics for human identification is known as biometrics. Among the many biometric traits available, the fingerprint is the most widely used. Fingerprint identification is based on impression patterns, such as the patterns of ridges and minutiae, which are first- and second-level characteristics, respectively. Current identification systems use these two levels of fingerprint features due to the low cost of the sensors. However, due to recent advances in sensor technology, it is possible to use third-level features present within the ridges, such as the perspiration pores. Recent studies have shown that the use of third-level features can increase security and fraud protection in biometric systems, since they are difficult to reproduce. In addition, recent research has also focused on multibiometric recognition due to its many advantages. The goal of this work was to apply fusion techniques for fingerprint recognition in order to combine minutiae-, ridge- and pore-based methods and, thus, provide more robust biometric recognition systems. We evaluated isotropic-based and adaptive-based automatic pore extraction methods and the fusion of the pore-based method with the identification methods based on minutiae and ridges. The experiments were performed on the public database PolyU HRF and showed a reduction of approximately 16% in the Equal Error Rate compared to the best results obtained by the methods individually. |
Tasks | |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10949v1, http://arxiv.org/pdf/1805.10949v1.pdf |
PWC | https://paperswithcode.com/paper/fusion-of-methods-based-on-minutiae-ridges |
Repo | |
Framework | |
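The fingerprint abstract above reports its gain as a reduction in Equal Error Rate after fusing several matchers. The sketch below illustrates both ideas under stated assumptions: the abstract does not say which fusion rule was used, so a weighted sum of match scores is assumed here purely for illustration, and the names `fuse_scores` and `equal_error_rate` are hypothetical. Scores are assumed to be already normalized to [0, 1], with higher meaning a better match.

```python
def fuse_scores(scores_by_method, weights):
    """Weighted-sum fusion of per-method match scores (assumed fusion rule)."""
    total_weight = sum(weights.values())
    return sum(weights[m] * scores_by_method[m] for m in weights) / total_weight


def equal_error_rate(genuine, impostor):
    """Approximate EER: scan thresholds and report where FAR and FRR are closest."""
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False rejection rate: genuine comparisons scoring below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False acceptance rate: impostor comparisons scoring at or above it.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2)
    return best[1]
```

In use, each comparison would be fused first, for example `fuse_scores({"minutiae": 0.8, "ridges": 0.6, "pores": 0.4}, {"minutiae": 1, "ridges": 1, "pores": 1})`, and the fused genuine and impostor score lists would then be passed to `equal_error_rate`.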
Supervised Convolutional Sparse Coding
Title | Supervised Convolutional Sparse Coding |
Authors | Lama Affara, Bernard Ghanem, Peter Wonka |
Abstract | Convolutional Sparse Coding (CSC) is a well-established image representation model especially suited for image restoration tasks. In this work, we extend the applicability of this model by proposing a supervised approach to convolutional sparse coding, which aims at learning discriminative dictionaries instead of purely reconstructive ones. We incorporate a supervised regularization term into the traditional unsupervised CSC objective to encourage the final dictionary elements to be discriminative. Experimental results show that using supervised convolutional learning results in two key advantages. First, we learn more semantically relevant filters in the dictionary and second, we achieve improved image reconstruction on unseen data. |
Tasks | Image Reconstruction, Image Restoration |
Published | 2018-04-08 |
URL | http://arxiv.org/abs/1804.02678v1, http://arxiv.org/pdf/1804.02678v1.pdf |
PWC | https://paperswithcode.com/paper/supervised-convolutional-sparse-coding |
Repo | |
Framework | |
Stochastic Activation Pruning for Robust Adversarial Defense
Title | Stochastic Activation Pruning for Robust Adversarial Defense |
Authors | Guneet S. Dhillon, Kamyar Azizzadenesheli, Zachary C. Lipton, Jeremy Bernstein, Jean Kossaifi, Aran Khanna, Anima Anandkumar |
Abstract | Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples. Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration. |
Tasks | Adversarial Defense, Calibration |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01442v1, http://arxiv.org/pdf/1803.01442v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-activation-pruning-for-robust |
Repo | |
Framework | |
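The SAP abstract above describes pruning a random subset of activations, favoring small magnitudes, and scaling up the survivors. A minimal sketch of that idea for a single layer follows. The details are assumptions not given in the abstract: indices are drawn with replacement with probability proportional to magnitude, and each survivor is divided by its probability of being kept so that the expected activation is unchanged. The function name and the `num_draws` parameter are hypothetical.

```python
import random


def stochastic_activation_pruning(activations, num_draws, rng=None):
    """Randomly prune one layer's activations, preferring small magnitudes."""
    rng = rng or random.Random()
    magnitudes = [abs(a) for a in activations]
    total = sum(magnitudes)
    if total == 0:
        return list(activations)  # nothing to sample from
    probs = [m / total for m in magnitudes]
    # Draw indices with replacement; large activations are likelier to be drawn.
    drawn = set(rng.choices(range(len(activations)), weights=probs, k=num_draws))
    pruned = []
    for i, a in enumerate(activations):
        if i in drawn:
            # Probability that index i is drawn at least once in num_draws tries.
            keep_prob = 1.0 - (1.0 - probs[i]) ** num_draws
            pruned.append(a / keep_prob)  # rescale the survivor
        else:
            pruned.append(0.0)  # pruned activation
    return pruned
```

Because the output is random, repeated forward passes on the same input give different pruned activations, which is the mixed-strategy behavior the abstract refers to.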
An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation
Title | An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation |
Authors | Gongbo Tang, Rico Sennrich, Joakim Nivre |
Abstract | Recent work has shown that the encoder-decoder attention mechanisms in neural machine translation (NMT) are different from the word alignment in statistical machine translation. In this paper, we focus on analyzing encoder-decoder attention mechanisms in the case of word sense disambiguation (WSD) in NMT models. We hypothesize that attention mechanisms pay more attention to context tokens when translating ambiguous words. We explore the attention distribution patterns when translating ambiguous nouns. Counter-intuitively, we find that attention mechanisms are likely to distribute more attention to the ambiguous noun itself rather than to context tokens, in comparison to other nouns. We conclude that the attention mechanism is not the main mechanism used by NMT models to incorporate contextual information for WSD. The experimental results suggest that NMT models learn to encode the contextual information necessary for WSD in the encoder hidden states. For the attention mechanism in Transformer models, we reveal that the first few layers gradually learn to “align” source and target tokens and the last few layers learn to extract features from the related but unaligned context tokens. |
Tasks | Machine Translation, Word Alignment, Word Sense Disambiguation |
Published | 2018-10-17 |
URL | http://arxiv.org/abs/1810.07595v1, http://arxiv.org/pdf/1810.07595v1.pdf |
PWC | https://paperswithcode.com/paper/an-analysis-of-attention-mechanisms-the-case |
Repo | |
Framework | |
Adversarial Sparse-View CBCT Artifact Reduction
Title | Adversarial Sparse-View CBCT Artifact Reduction |
Authors | Haofu Liao, Zhimin Huo, William J. Sehnert, Shaohua Kevin Zhou, Jiebo Luo |
Abstract | We present an effective post-processing method to reduce the artifacts from sparsely reconstructed cone-beam CT (CBCT) images. The proposed method is based on the state-of-the-art, image-to-image generative models with a perceptual loss as regulation. Unlike the traditional CT artifact-reduction approaches, our method is trained in an adversarial fashion that yields more perceptually realistic outputs while preserving the anatomical structures. To address the streak artifacts that are inherently local and appear across various scales, we further propose a novel discriminator architecture based on feature pyramid networks and a differentially modulated focus map to induce the adversarial training. Our experimental results show that the proposed method can greatly correct the cone-beam artifacts from clinical CBCT images reconstructed using 1/3 projections, and outperforms strong baseline methods both quantitatively and qualitatively. |
Tasks | CBCT Artifact Reduction |
Published | 2018-12-09 |
URL | http://arxiv.org/abs/1812.03503v1, http://arxiv.org/pdf/1812.03503v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-sparse-view-cbct-artifact |
Repo | |
Framework | |
An Investigation of Few-Shot Learning in Spoken Term Classification
Title | An Investigation of Few-Shot Learning in Spoken Term Classification |
Authors | Yangbin Chen, Tom Ko, Lifeng Shang, Xiao Chen, Xin Jiang, Qing Li |
Abstract | In this paper, we investigate the feasibility of applying few-shot learning algorithms to a speech task. We formulate a user-defined scenario of spoken term classification as a few-shot learning problem. In most few-shot learning studies, it is assumed that all N classes are new in an N-way problem. We suggest that this assumption can be relaxed and define an N+M-way problem, where N and M are the numbers of new classes and fixed classes, respectively. We propose a modification to the Model-Agnostic Meta-Learning (MAML) algorithm to solve the problem. Experiments on the Google Speech Commands dataset show that our approach outperforms the conventional supervised learning approach and the original MAML. |
Tasks | Few-Shot Learning, Keyword Spotting, Meta-Learning |
Published | 2018-12-26 |
URL | https://arxiv.org/abs/1812.10233v2, https://arxiv.org/pdf/1812.10233v2.pdf |
PWC | https://paperswithcode.com/paper/meta-learning-for-few-shot-keyword-spotting |
Repo | |
Framework | |
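The few-shot abstract above defines an N+M-way problem with N new classes and M fixed classes. The sketch below shows one way such an episode could be assembled; the abstract does not describe the sampling procedure, so everything here is an assumption. N classes are sampled from those outside the fixed set, the M fixed classes appear in every episode, and K support examples are drawn per class. The name `sample_episode` and its parameters are hypothetical.

```python
import random


def sample_episode(examples_by_class, fixed_classes, n_new, k_shot, rng=None):
    """Build the support set of one (N+M)-way, K-shot episode."""
    rng = rng or random.Random()
    # Candidate "new" classes are all classes that are not fixed.
    candidates = sorted(c for c in examples_by_class if c not in fixed_classes)
    new_classes = rng.sample(candidates, n_new)
    support = []
    # Labels 0..N-1 go to the sampled new classes, N..N+M-1 to the fixed ones.
    for label, cls in enumerate(new_classes + list(fixed_classes)):
        for example in rng.sample(examples_by_class[cls], k_shot):
            support.append((example, label))
    return new_classes, support
```

With two fixed classes, `n_new=2` and `k_shot=3`, each episode contains 12 labeled support examples, and only the first two labels change meaning from one episode to the next.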
Reinforcement Learning using Augmented Neural Networks
Title | Reinforcement Learning using Augmented Neural Networks |
Authors | Jack Shannon, Marek Grzes |
Abstract | Neural networks allow Q-learning reinforcement learning agents such as deep Q-networks (DQN) to approximate complex mappings from state spaces to value functions. However, this also brings drawbacks when compared to other function approximators such as tile coding or its generalisation, radial basis functions (RBF), because neural networks introduce instability as a side effect of their globalised updates. This instability does not vanish even in neural networks that have no hidden layers. In this paper, we show that simple modifications to the structure of the neural network can improve the stability of DQN learning when a multi-layer perceptron is used for function approximation. |
Tasks | Q-Learning |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07692v1, http://arxiv.org/pdf/1806.07692v1.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-using-augmented-neural |
Repo | |
Framework | |
Conversational AI: The Science Behind the Alexa Prize
Title | Conversational AI: The Science Behind the Alexa Prize |
Authors | Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, Art Pettigrue |
Abstract | Conversational agents are exploding in popularity. However, much work remains in the area of social conversation as well as free-form conversation over a broad range of domains and topics. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million-dollar university competition where sixteen selected university teams were challenged to build conversational agents, known as socialbots, to converse coherently and engagingly with humans on popular topics such as Sports, Politics, Entertainment, Fashion and Technology for 20 minutes. The Alexa Prize offers the academic community a unique opportunity to perform research with a live system used by millions of users. The competition provided university teams with real user conversational data at scale, along with the user-provided ratings and feedback augmented with annotations by the Alexa team. This enabled teams to effectively iterate and make improvements throughout the competition while being evaluated in real-time through live user interactions. To build their socialbots, university teams combined state-of-the-art techniques with novel strategies in the areas of Natural Language Understanding, Context Modeling, Dialog Management, Response Generation, and Knowledge Acquisition. To support the efforts of participating teams, the Alexa Prize team made significant scientific and engineering investments to build and improve Conversational Speech Recognition, Topic Tracking, Dialog Evaluation, Voice User Experience, and tools for traffic management and scalability. This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational AI. |
Tasks | Speech Recognition |
Published | 2018-01-11 |
URL | http://arxiv.org/abs/1801.03604v1, http://arxiv.org/pdf/1801.03604v1.pdf |
PWC | https://paperswithcode.com/paper/conversational-ai-the-science-behind-the |
Repo | |
Framework | |
Generic Probabilistic Interactive Situation Recognition and Prediction: From Virtual to Real
Title | Generic Probabilistic Interactive Situation Recognition and Prediction: From Virtual to Real |
Authors | Jiachen Li, Hengbo Ma, Wei Zhan, Masayoshi Tomizuka |
Abstract | Accurate and robust recognition and prediction of traffic situations play an important role in autonomous driving and are a prerequisite for risk assessment and effective decision making. Although many works deal with modeling the driving behavior of a single object, it remains a challenge to make predictions for multiple highly interactive agents that react to each other simultaneously. In this work, we propose a generic probabilistic hierarchical recognition and prediction framework which employs a two-layer Hidden Markov Model (TLHMM) to obtain the distribution of potential situations and a learning-based dynamic scene evolution model to sample a group of future trajectories. Instead of predicting the motions of a single entity, we propose to get the joint distribution by modeling multiple interactive agents as a whole system. Moreover, due to the decoupling property of the layered structure, our model is suitable for knowledge transfer from simulation to real-world applications as well as among different traffic scenarios, which can reduce the computational effort of training and the demand for large amounts of data. A case study of a highway ramp merging scenario is presented to verify the effectiveness and accuracy of the proposed framework. |
Tasks | Autonomous Driving, Decision Making, Transfer Learning |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.02927v1, http://arxiv.org/pdf/1809.02927v1.pdf |
PWC | https://paperswithcode.com/paper/generic-probabilistic-interactive-situation |
Repo | |
Framework | |
Road Segmentation in SAR Satellite Images with Deep Fully-Convolutional Neural Networks
Title | Road Segmentation in SAR Satellite Images with Deep Fully-Convolutional Neural Networks |
Authors | Corentin Henry, Seyed Majid Azimi, Nina Merkle |
Abstract | Remote sensing is extensively used in cartography. As transportation networks grow and change, extracting roads automatically from satellite images is crucial to keep maps up-to-date. Synthetic Aperture Radar satellites can provide high resolution topographical maps. However, roads are difficult to identify in these data as they look visually similar to targets such as rivers and railways. Most road extraction methods on Synthetic Aperture Radar images still rely on a prior segmentation performed by classical computer vision algorithms. Few works study the potential of deep learning techniques, despite their successful applications to optical imagery. This letter presents an evaluation of Fully-Convolutional Neural Networks for road segmentation in SAR images. We study the relative performance of early and state-of-the-art networks after carefully enhancing their sensitivity towards thin objects by adding spatial tolerance rules. Our models show promising results, successfully extracting most of the roads in our test dataset. This shows that, although Fully-Convolutional Neural Networks natively lack efficiency for road segmentation, they are capable of good results if properly tuned. As the segmentation quality does not scale well with the increasing depth of the networks, the design of specialized architectures for road extraction should yield better performance. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01445v2, http://arxiv.org/pdf/1802.01445v2.pdf |
PWC | https://paperswithcode.com/paper/road-segmentation-in-sar-satellite-images |
Repo | |
Framework | |
Predicting Learning Status in MOOCs using LSTM
Title | Predicting Learning Status in MOOCs using LSTM |
Authors | Zhemin Liu, Feng Xiong, Kaifa Zou, Hongzhi Wang |
Abstract | The real-time and open online course resources of MOOCs have attracted a large number of learners in recent years. However, many new questions have emerged about the high dropout rate of learners. For a MOOC platform, predicting the learning status of MOOC learners in real time with high accuracy is a crucial task, and it also helps improve the quality of MOOC teaching. The prediction task in this paper is inherently a time series prediction problem and can be treated as a time series classification problem; hence, this paper proposes a prediction model based on RNN-LSTMs and optimization techniques that can be used to predict learners’ learning status. Using datasets provided by Chinese University MOOCs as the model inputs, the average accuracy of the model’s outputs was about 90%. |
Tasks | Time Series, Time Series Classification, Time Series Prediction |
Published | 2018-08-05 |
URL | http://arxiv.org/abs/1808.01616v1, http://arxiv.org/pdf/1808.01616v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-learning-status-in-moocs-using |
Repo | |
Framework | |
Detecting Features of Tools, Objects, and Actions from Effects in a Robot using Deep Learning
Title | Detecting Features of Tools, Objects, and Actions from Effects in a Robot using Deep Learning |
Authors | Namiko Saito, Kitae Kim, Shingo Murata, Tetsuya Ogata, Shigeki Sugano |
Abstract | We propose a tool-use model that can detect the features of tools, target objects, and actions from the provided effects of object manipulation. We construct a model that enables robots to manipulate objects with tools, using infant learning as a concept. To realize this, we use deep learning to train on sensory-motor data recorded during a tool-use task performed by a robot. Experiments include four factors: (1) tools, (2) objects, (3) actions, and (4) effects, which the model considers simultaneously. For evaluation, the robot generates predicted images and motions given information about the effects of using unknown tools and objects. We confirm that the robot is capable of detecting features of tools, objects, and actions by learning the effects and executing the task. |
Tasks | |
Published | 2018-09-23 |
URL | http://arxiv.org/abs/1809.08613v1, http://arxiv.org/pdf/1809.08613v1.pdf |
PWC | https://paperswithcode.com/paper/detecting-features-of-tools-objects-and |
Repo | |
Framework | |
Repair-Based Degrees of Database Inconsistency: Computation and Complexity
Title | Repair-Based Degrees of Database Inconsistency: Computation and Complexity |
Authors | Leopoldo Bertossi |
Abstract | We propose a generic numerical measure of the inconsistency of a database with respect to a set of integrity constraints. It is based on an abstract repair semantics. In particular, an inconsistency measure associated with cardinality-repairs is investigated in detail. More specifically, it is shown that it can be computed via answer-set programs, but that its computation can sometimes be intractable in data complexity. However, polynomial-time deterministic and randomized approximations are exhibited. The behavior of this measure under small updates is analyzed, obtaining fixed-parameter tractability results. Furthermore, alternative inconsistency measures are proposed and discussed. |
Tasks | |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10286v3, http://arxiv.org/pdf/1809.10286v3.pdf |
PWC | https://paperswithcode.com/paper/repair-based-degrees-of-database |
Repo | |
Framework | |
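The database abstract above mentions an inconsistency measure based on cardinality-repairs but does not state its formula. One natural reading, assumed here, is the fraction of tuples that must be deleted to reach a largest consistent subinstance. The brute-force sketch below follows that reading. It is exponential in the number of tuples, which matches the abstract's remark that exact computation can be intractable. The names `inconsistency_degree` and `satisfies_fd` are hypothetical, and the example constraint is a single functional dependency.

```python
from itertools import combinations


def inconsistency_degree(tuples, is_consistent):
    """Fraction of tuples deleted by a minimum-cardinality repair (assumed measure)."""
    n = len(tuples)
    if n == 0:
        return 0.0
    # Try deleting 0, 1, 2, ... tuples until some remaining subset is consistent.
    for removed in range(n + 1):
        for kept in combinations(tuples, n - removed):
            if is_consistent(kept):
                return removed / n
    return 1.0  # reached only if even the empty instance is rejected


def satisfies_fd(rows):
    """Example constraint: the first attribute functionally determines the second."""
    seen = {}
    for key, value in rows:
        if seen.setdefault(key, value) != value:
            return False
    return True
```

For the instance `[("a", 1), ("a", 2), ("b", 3), ("c", 4)]`, deleting either of the two conflicting `"a"` tuples restores consistency, so the degree under this reading is 1/4.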
Structural Isomprphism in Mathematical Expressions: A Simple Coding Scheme
Title | Structural Isomprphism in Mathematical Expressions: A Simple Coding Scheme |
Authors | Reza Shahbazi |
Abstract | While there exist many methods in machine learning for comparison of letter string data, most are better equipped to handle strings that represent natural language, and their performance will not hold up when presented with strings that correspond to mathematical expressions. Based on the graphical representation of the expression tree, here I propose a simple method for encoding such expressions that is only sensitive to their structural properties, and invariant to the specifics which can vary between two seemingly different, but semantically similar mathematical expressions. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.12495v1, http://arxiv.org/pdf/1805.12495v1.pdf |
PWC | https://paperswithcode.com/paper/structural-isomprphism-in-mathematical |
Repo | |
Framework | |
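The abstract above proposes encoding mathematical expressions so that only the structure of the expression tree matters. The sketch below is not the paper's coding scheme, which the abstract does not spell out. It is a simple illustration of the same idea: Python's `ast` module parses the expression, every variable or constant becomes the leaf symbol `L`, and each operator node becomes a pair of parentheses around its operands. The name `structure_code` is hypothetical.

```python
import ast


def structure_code(expression):
    """Encode only the shape of an arithmetic expression's parse tree."""

    def encode(node):
        # Keep operand sub-expressions; skip operator and context marker nodes.
        children = [c for c in ast.iter_child_nodes(node) if isinstance(c, ast.expr)]
        if not children:
            return "L"  # a variable or a constant
        return "(" + "".join(encode(c) for c in children) + ")"

    return encode(ast.parse(expression, mode="eval").body)
```

Under this encoding `a*x + b` and `c*y + d` receive the same code, while `a*(x + b)` receives a different one. Operator identity is deliberately discarded, so `x + y` and `x * y` also coincide, which may or may not be desirable for a given application.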
Deep Learning in the Wavelet Domain
Title | Deep Learning in the Wavelet Domain |
Authors | Fergal Cotter, Nick Kingsbury |
Abstract | This paper examines the possibility of, and the possible advantages of, learning the filters of convolutional neural networks (CNNs) for image analysis in the wavelet domain. We are stimulated by both Mallat’s scattering transform and the idea of filtering in the Fourier domain. It is important to explore new spaces in which to learn, as these may provide inherent advantages that are not available in the pixel space. However, the scattering transform is limited by its inability to learn in between scattering orders, and any Fourier domain filtering is limited by the large number of filter parameters needed to get localized filters. Instead, we consider filtering in the wavelet domain with learnable filters. The wavelet space allows us to have local, smooth filters with far fewer parameters, and learnability can give us flexibility. We present a novel layer which takes CNN activations into the wavelet space, learns parameters and returns to the pixel space. This allows it to be easily dropped into any neural network without affecting the structure. As part of this work, we show how to pass gradients through a multirate system and give preliminary results. |
Tasks | |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.06115v1, http://arxiv.org/pdf/1811.06115v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-in-the-wavelet-domain |
Repo | |
Framework | |
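The wavelet abstract above describes a layer that moves activations into the wavelet space, applies learned parameters, and returns to the pixel space. A toy one-dimensional sketch of that round trip follows. The abstract does not name the wavelet or the form of the learned parameters, so this sketch assumes a single-level Haar transform, chosen purely for brevity, and one scalar gain per subband. All function names are hypothetical, and the input length is assumed to be even.

```python
import math


def haar_forward(signal):
    """Single-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for l, h in zip(low, high):
        out.extend([(l + h) / s, (l - h) / s])
    return out


def wavelet_gain_layer(signal, low_gain, high_gain):
    """Transform, scale each subband by a (learnable) gain, transform back."""
    low, high = haar_forward(signal)
    low = [low_gain * v for v in low]
    high = [high_gain * v for v in high]
    return haar_inverse(low, high)
```

With both gains set to 1 the layer reproduces its input, so it can be inserted without changing a network's initial behavior. Setting `high_gain` to 0 replaces each pair of samples with their mean, which shows how a single parameter yields a smooth, local filter.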