October 19, 2019


Paper Group ANR 225


Fusion of Methods Based on Minutiae, Ridges and Pores for Robust Fingerprint Recognition


Title	Fusion of Methods Based on Minutiae, Ridges and Pores for Robust Fingerprint Recognition
Authors	Lucas Alexandre Ramos, Aparecido Nilceu Marana
Abstract	The use of physical and behavioral characteristics for human identification is known as biometrics. Among the many biometric traits available, the fingerprint is the most widely used. Fingerprint identification is based on impression patterns, such as the patterns of ridges and minutiae, which are first- and second-level characteristics respectively. Current identification systems use these two levels of fingerprint features because of the low cost of the sensors. However, due to recent advances in sensor technology, it is possible to use third-level features present within the ridges, such as the perspiration pores. Recent studies have shown that the use of third-level features can increase security and fraud protection in biometric systems, since they are difficult to reproduce. In addition, recent research has also focused on multibiometric recognition due to its many advantages. The goal of this work was to apply fusion techniques for fingerprint recognition in order to combine minutiae-, ridge- and pore-based methods and thus provide more robust biometric recognition systems. We evaluated isotropic-based and adaptive-based automatic pore extraction methods and the fusion of the pore-based method with the identification methods based on minutiae and ridges. The experiments were performed on the public database PolyU HRF and showed a reduction of approximately 16% in the Equal Error Rate compared to the best results obtained by the methods individually.
Tasks
Published	2018-05-28
URL	http://arxiv.org/abs/1805.10949v1
PDF	http://arxiv.org/pdf/1805.10949v1.pdf
PWC	https://paperswithcode.com/paper/fusion-of-methods-based-on-minutiae-ridges
Repo
Framework
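
As a rough illustration of the evaluation described in the abstract above, the following sketch fuses two matchers' similarity scores and estimates the Equal Error Rate (EER). The weighted-sum fusion rule, the function names and the `weight` parameter are assumptions made for illustration; the abstract does not say which fusion rule the authors used.

```python
def fuse_scores(scores_a, scores_b, weight):
    """Score-level fusion by weighted sum (an assumed rule, not taken from the paper)."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A comparison is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false accept rate (FAR) and the false
    reject rate (FRR) are closest, and reported as their average.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        frr = sum(s < threshold for s in genuine) / len(genuine)
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

Fusion helps when the matchers make different mistakes: a genuine pair that one matcher scores low can be rescued by the other, which lowers the EER of the fused scores relative to either matcher alone.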

Supervised Convolutional Sparse Coding


Title	Supervised Convolutional Sparse Coding
Authors	Lama Affara, Bernard Ghanem, Peter Wonka
Abstract	Convolutional Sparse Coding (CSC) is a well-established image representation model especially suited for image restoration tasks. In this work, we extend the applicability of this model by proposing a supervised approach to convolutional sparse coding, which aims at learning discriminative dictionaries instead of purely reconstructive ones. We incorporate a supervised regularization term into the traditional unsupervised CSC objective to encourage the final dictionary elements to be discriminative. Experimental results show that using supervised convolutional learning results in two key advantages. First, we learn more semantically relevant filters in the dictionary and second, we achieve improved image reconstruction on unseen data.
Tasks	Image Reconstruction, Image Restoration
Published	2018-04-08
URL	http://arxiv.org/abs/1804.02678v1
PDF	http://arxiv.org/pdf/1804.02678v1.pdf
PWC	https://paperswithcode.com/paper/supervised-convolutional-sparse-coding
Repo
Framework
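
For orientation, the standard unsupervised CSC objective can be written as below, with a generic supervised term appended. The symbols ($x$ for the image, $d_k$ for the filters, $z_k$ for the sparse maps, $y$ for the labels, and the weights $\lambda$ and $\gamma$) and the placeholder $\mathcal{L}_{\text{sup}}$ are assumptions for illustration; the abstract does not specify the form of the authors' regularization term.

```latex
\min_{\{d_k\},\{z_k\}} \;
  \frac{1}{2}\Bigl\| x - \sum_{k=1}^{K} d_k * z_k \Bigr\|_2^2
  + \lambda \sum_{k=1}^{K} \| z_k \|_1
  + \gamma \, \mathcal{L}_{\text{sup}}\bigl(\{d_k\}, \{z_k\}, y\bigr)
\quad \text{s.t.} \quad \| d_k \|_2^2 \le 1 \;\; \forall k
```

Setting $\gamma = 0$ recovers the purely reconstructive objective, so $\gamma$ controls how strongly the dictionary is pushed toward being discriminative.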

Stochastic Activation Pruning for Robust Adversarial Defense


Title	Stochastic Activation Pruning for Robust Adversarial Defense
Authors	Guneet S. Dhillon, Kamyar Azizzadenesheli, Zachary C. Lipton, Jeremy Bernstein, Jean Kossaifi, Aran Khanna, Anima Anandkumar
Abstract	Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples. Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration.
Tasks	Adversarial Defense, Calibration
Published	2018-03-05
URL	http://arxiv.org/abs/1803.01442v1
PDF	http://arxiv.org/pdf/1803.01442v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-activation-pruning-for-robust
Repo
Framework
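
The pruning step described in the abstract above can be sketched for a single layer as follows. This is a minimal sketch under stated assumptions: activations are sampled with replacement in proportion to their magnitude, anything never drawn is pruned, and survivors are divided by their probability of being kept. The function name, the `num_draws` parameter and the exact rescaling are illustrative choices, not details confirmed by the abstract.

```python
import random


def stochastic_activation_pruning(activations, num_draws, rng):
    """Prune one layer's activations stochastically (illustrative sketch).

    Each draw picks index i with probability |a_i| / sum_j |a_j|, so small
    activations are preferentially pruned. A survivor is divided by its
    probability of being drawn at least once, 1 - (1 - p_i) ** num_draws,
    which scales it up to compensate for the pruned mass.
    """
    magnitudes = [abs(a) for a in activations]
    total = sum(magnitudes)
    if total == 0.0:
        return list(activations)  # nothing to sample from
    probs = [m / total for m in magnitudes]
    drawn = set(rng.choices(range(len(activations)), weights=probs, k=num_draws))
    pruned = []
    for i, a in enumerate(activations):
        if i in drawn and probs[i] > 0.0:
            keep_prob = 1.0 - (1.0 - probs[i]) ** num_draws
            pruned.append(a / keep_prob)
        else:
            pruned.append(0.0)
    return pruned
```

Because the mask is resampled on every forward pass, the defended model's output is itself random, which is what makes the defense a mixed strategy in the game-theoretic sense used by the abstract.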

An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation


Title	An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation
Authors	Gongbo Tang, Rico Sennrich, Joakim Nivre
Abstract	Recent work has shown that the encoder-decoder attention mechanisms in neural machine translation (NMT) are different from the word alignment in statistical machine translation. In this paper, we focus on analyzing encoder-decoder attention mechanisms, in the case of word sense disambiguation (WSD) in NMT models. We hypothesize that attention mechanisms pay more attention to context tokens when translating ambiguous words. We explore the attention distribution patterns when translating ambiguous nouns. Counter-intuitively, we find that attention mechanisms are likely to distribute more attention to the ambiguous noun itself rather than context tokens, in comparison to other nouns. We conclude that the attention mechanism is not the main mechanism used by NMT models to incorporate contextual information for WSD. The experimental results suggest that NMT models learn to encode contextual information necessary for WSD in the encoder hidden states. For the attention mechanism in Transformer models, we reveal that the first few layers gradually learn to “align” source and target tokens and the last few layers learn to extract features from the related but unaligned context tokens.
Tasks	Machine Translation, Word Alignment, Word Sense Disambiguation
Published	2018-10-17
URL	http://arxiv.org/abs/1810.07595v1
PDF	http://arxiv.org/pdf/1810.07595v1.pdf
PWC	https://paperswithcode.com/paper/an-analysis-of-attention-mechanisms-the-case
Repo
Framework

Adversarial Sparse-View CBCT Artifact Reduction


Title	Adversarial Sparse-View CBCT Artifact Reduction
Authors	Haofu Liao, Zhimin Huo, William J. Sehnert, Shaohua Kevin Zhou, Jiebo Luo
Abstract	We present an effective post-processing method to reduce the artifacts from sparsely reconstructed cone-beam CT (CBCT) images. The proposed method is based on the state-of-the-art, image-to-image generative models with a perceptual loss as regulation. Unlike the traditional CT artifact-reduction approaches, our method is trained in an adversarial fashion that yields more perceptually realistic outputs while preserving the anatomical structures. To address the streak artifacts that are inherently local and appear across various scales, we further propose a novel discriminator architecture based on feature pyramid networks and a differentially modulated focus map to induce the adversarial training. Our experimental results show that the proposed method can greatly correct the cone-beam artifacts from clinical CBCT images reconstructed using 1/3 projections, and outperforms strong baseline methods both quantitatively and qualitatively.
Tasks	CBCT Artifact Reduction
Published	2018-12-09
URL	http://arxiv.org/abs/1812.03503v1
PDF	http://arxiv.org/pdf/1812.03503v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-sparse-view-cbct-artifact
Repo
Framework

An Investigation of Few-Shot Learning in Spoken Term Classification


Title	An Investigation of Few-Shot Learning in Spoken Term Classification
Authors	Yangbin Chen, Tom Ko, Lifeng Shang, Xiao Chen, Xin Jiang, Qing Li
Abstract	In this paper, we investigate the feasibility of applying few-shot learning algorithms to a speech task. We formulate a user-defined scenario of spoken term classification as a few-shot learning problem. In most few-shot learning studies, it is assumed that all the N classes are new in an N-way problem. We suggest that this assumption can be relaxed and define an N+M-way problem where N and M are the numbers of new classes and fixed classes respectively. We propose a modification to the Model-Agnostic Meta-Learning (MAML) algorithm to solve the problem. Experiments on the Google Speech Commands dataset show that our approach outperforms the conventional supervised learning approach and the original MAML.
Tasks	Few-Shot Learning, Keyword Spotting, Meta-Learning
Published	2018-12-26
URL	https://arxiv.org/abs/1812.10233v2
PDF	https://arxiv.org/pdf/1812.10233v2.pdf
PWC	https://paperswithcode.com/paper/meta-learning-for-few-shot-keyword-spotting
Repo
Framework
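
To make the N+M-way setup from the abstract above concrete, here is a minimal sketch of how the label set for one episode could be built: N new classes are drawn per episode, while the M fixed classes appear in every episode. The function name, the class names in the usage note and the choice to give fixed classes the first label indices are assumptions for illustration only.

```python
import random


def sample_episode_labels(new_class_pool, fixed_classes, n_new, rng):
    """Build the label map for one N+M-way episode (illustrative sketch).

    The M fixed classes are present in every episode and always receive the
    same indices 0..M-1; the N new classes are drawn fresh for each episode
    and fill the remaining indices.
    """
    new_classes = rng.sample(list(new_class_pool), n_new)
    ordered = list(fixed_classes) + new_classes
    return {name: index for index, name in enumerate(ordered)}
```

For example, with hypothetical fixed classes such as "silence" and "unknown" and a pool of user-defined keywords, each episode would be a classification problem over the two fixed classes plus N freshly sampled keywords.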

Reinforcement Learning using Augmented Neural Networks


Title	Reinforcement Learning using Augmented Neural Networks
Authors	Jack Shannon, Marek Grzes
Abstract	Neural networks allow Q-learning reinforcement learning agents such as deep Q-networks (DQN) to approximate complex mappings from state spaces to value functions. However, this also brings drawbacks compared to other function approximators, such as tile coding or its generalisations, e.g. radial basis functions (RBF), because neural networks introduce instability as a side effect of their globalised updates. This instability does not vanish even in neural networks that have no hidden layers. In this paper, we show that simple modifications to the structure of the neural network can improve the stability of DQN learning when a multi-layer perceptron is used for function approximation.
Tasks	Q-Learning
Published	2018-06-20
URL	http://arxiv.org/abs/1806.07692v1
PDF	http://arxiv.org/pdf/1806.07692v1.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-using-augmented-neural
Repo
Framework

Conversational AI: The Science Behind the Alexa Prize


Title	Conversational AI: The Science Behind the Alexa Prize
Authors	Ashwin Ram, Rohit Prasad, Chandra Khatri, Anu Venkatesh, Raefer Gabriel, Qing Liu, Jeff Nunn, Behnam Hedayatnia, Ming Cheng, Ashish Nagar, Eric King, Kate Bland, Amanda Wartick, Yi Pan, Han Song, Sk Jayadevan, Gene Hwang, Art Pettigrue
Abstract	Conversational agents are exploding in popularity. However, much work remains in the area of social conversation as well as free-form conversation over a broad range of domains and topics. To advance the state of the art in conversational AI, Amazon launched the Alexa Prize, a 2.5-million-dollar university competition where sixteen selected university teams were challenged to build conversational agents, known as socialbots, to converse coherently and engagingly with humans on popular topics such as Sports, Politics, Entertainment, Fashion and Technology for 20 minutes. The Alexa Prize offers the academic community a unique opportunity to perform research with a live system used by millions of users. The competition provided university teams with real user conversational data at scale, along with the user-provided ratings and feedback augmented with annotations by the Alexa team. This enabled teams to effectively iterate and make improvements throughout the competition while being evaluated in real-time through live user interactions. To build their socialbots, university teams combined state-of-the-art techniques with novel strategies in the areas of Natural Language Understanding, Context Modeling, Dialog Management, Response Generation, and Knowledge Acquisition. To support the efforts of participating teams, the Alexa Prize team made significant scientific and engineering investments to build and improve Conversational Speech Recognition, Topic Tracking, Dialog Evaluation, Voice User Experience, and tools for traffic management and scalability. This paper outlines the advances created by the university teams as well as the Alexa Prize team to achieve the common goal of solving the problem of Conversational AI.
Tasks	Speech Recognition
Published	2018-01-11
URL	http://arxiv.org/abs/1801.03604v1
PDF	http://arxiv.org/pdf/1801.03604v1.pdf
PWC	https://paperswithcode.com/paper/conversational-ai-the-science-behind-the
Repo
Framework

Generic Probabilistic Interactive Situation Recognition and Prediction: From Virtual to Real


Title	Generic Probabilistic Interactive Situation Recognition and Prediction: From Virtual to Real
Authors	Jiachen Li, Hengbo Ma, Wei Zhan, Masayoshi Tomizuka
Abstract	Accurate and robust recognition and prediction of traffic situations plays an important role in autonomous driving, as it is a prerequisite for risk assessment and effective decision making. Although many works deal with modeling the driver behavior of a single object, it remains a challenge to make predictions for multiple highly interactive agents that react to each other simultaneously. In this work, we propose a generic probabilistic hierarchical recognition and prediction framework which employs a two-layer Hidden Markov Model (TLHMM) to obtain the distribution of potential situations and a learning-based dynamic scene evolution model to sample a group of future trajectories. Instead of predicting the motions of a single entity, we propose to obtain the joint distribution by modeling multiple interactive agents as a whole system. Moreover, due to the decoupling property of the layered structure, our model is suitable for knowledge transfer from simulation to real-world applications as well as among different traffic scenarios, which can reduce the computational effort of training and the demand for large amounts of data. A case study of a highway ramp merging scenario is presented to verify the effectiveness and accuracy of the proposed framework.
Tasks	Autonomous Driving, Decision Making, Transfer Learning
Published	2018-09-09
URL	http://arxiv.org/abs/1809.02927v1
PDF	http://arxiv.org/pdf/1809.02927v1.pdf
PWC	https://paperswithcode.com/paper/generic-probabilistic-interactive-situation
Repo
Framework

Road Segmentation in SAR Satellite Images with Deep Fully-Convolutional Neural Networks


Title	Road Segmentation in SAR Satellite Images with Deep Fully-Convolutional Neural Networks
Authors	Corentin Henry, Seyed Majid Azimi, Nina Merkle
Abstract	Remote sensing is extensively used in cartography. As transportation networks grow and change, extracting roads automatically from satellite images is crucial to keep maps up-to-date. Synthetic Aperture Radar satellites can provide high resolution topographical maps. However, roads are difficult to identify in these data as they look visually similar to targets such as rivers and railways. Most road extraction methods on Synthetic Aperture Radar images still rely on a prior segmentation performed by classical computer vision algorithms. Few works study the potential of deep learning techniques, despite their successful applications to optical imagery. This letter presents an evaluation of Fully-Convolutional Neural Networks for road segmentation in SAR images. We study the relative performance of early and state-of-the-art networks after carefully enhancing their sensitivity towards thin objects by adding spatial tolerance rules. Our models show promising results, successfully extracting most of the roads in our test dataset. This shows that, although Fully-Convolutional Neural Networks natively lack efficiency for road segmentation, they are capable of good results if properly tuned. As the segmentation quality does not scale well with the increasing depth of the networks, the design of specialized architectures for road extraction should yield better performance.
Tasks
Published	2018-02-05
URL	http://arxiv.org/abs/1802.01445v2
PDF	http://arxiv.org/pdf/1802.01445v2.pdf
PWC	https://paperswithcode.com/paper/road-segmentation-in-sar-satellite-images
Repo
Framework

Predicting Learning Status in MOOCs using LSTM


Title	Predicting Learning Status in MOOCs using LSTM
Authors	Zhemin Liu, Feng Xiong, Kaifa Zou, Hongzhi Wang
Abstract	Real-time and open online course resources of MOOCs have attracted a large number of learners in recent years. However, many new questions have emerged about the high dropout rate of learners. For MOOC platforms, predicting the learning status of MOOC learners in real time with high accuracy is a crucial task, and it also helps improve the quality of MOOC teaching. The prediction task in this paper is inherently a time series prediction problem and can be treated as a time series classification problem; hence this paper proposes a prediction model based on RNNLSTMs and optimization techniques which can be used to predict learners’ learning status. Using datasets provided by Chinese University MOOCs as the inputs of the model, the average accuracy of the model’s outputs was about 90%.
Tasks	Time Series, Time Series Classification, Time Series Prediction
Published	2018-08-05
URL	http://arxiv.org/abs/1808.01616v1
PDF	http://arxiv.org/pdf/1808.01616v1.pdf
PWC	https://paperswithcode.com/paper/predicting-learning-status-in-moocs-using
Repo
Framework

Detecting Features of Tools, Objects, and Actions from Effects in a Robot using Deep Learning


Title	Detecting Features of Tools, Objects, and Actions from Effects in a Robot using Deep Learning
Authors	Namiko Saito, Kitae Kim, Shingo Murata, Tetsuya Ogata, Shigeki Sugano
Abstract	We propose a tool-use model that can detect the features of tools, target objects, and actions from the provided effects of object manipulation. We construct a model that enables robots to manipulate objects with tools, using infant learning as a concept. To realize this, we use deep learning to train on sensory-motor data recorded during a tool-use task performed by a robot. Experiments include four factors: (1) tools, (2) objects, (3) actions, and (4) effects, which the model considers simultaneously. For evaluation, the robot generates predicted images and motions given information on the effects of using unknown tools and objects. We confirm that the robot is capable of detecting features of tools, objects, and actions by learning the effects and executing the task.
Tasks
Published	2018-09-23
URL	http://arxiv.org/abs/1809.08613v1
PDF	http://arxiv.org/pdf/1809.08613v1.pdf
PWC	https://paperswithcode.com/paper/detecting-features-of-tools-objects-and
Repo
Framework

Repair-Based Degrees of Database Inconsistency: Computation and Complexity


Title	Repair-Based Degrees of Database Inconsistency: Computation and Complexity
Authors	Leopoldo Bertossi
Abstract	We propose a generic numerical measure of the inconsistency of a database with respect to a set of integrity constraints. It is based on an abstract repair semantics. In particular, an inconsistency measure associated to cardinality-repairs is investigated in detail. More specifically, it is shown that it can be computed via answer-set programs, but sometimes its computation can be intractable in data complexity. However, polynomial-time deterministic and randomized approximations are exhibited. The behavior of this measure under small updates is analyzed, obtaining fixed-parameter tractability results. Furthermore, alternative inconsistency measures are proposed and discussed.
Tasks
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10286v3
PDF	http://arxiv.org/pdf/1809.10286v3.pdf
PWC	https://paperswithcode.com/paper/repair-based-degrees-of-database
Repo
Framework
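
One way to read the cardinality-repair measure from the abstract above is as the smallest fraction of tuples that must be deleted to restore consistency. That reading, the restriction to deletions, and the key-constraint checker below are assumptions made for illustration; the paper's formal definition may differ. The brute-force search is exponential in the worst case, in line with the abstract's remark that exact computation can be intractable.

```python
from itertools import combinations


def violates_key(rows):
    """Return True if two (key, value) rows share a key but differ in value."""
    seen = {}
    for key, value in rows:
        if key in seen and seen[key] != value:
            return True
        seen[key] = value
    return False


def inconsistency_degree(rows, violates):
    """Smallest fraction of rows whose deletion makes `violates` false.

    Tries deletion sets of increasing size, so the first consistent
    sub-instance found corresponds to a minimum-cardinality repair.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    for k in range(len(rows) + 1):
        for removed in combinations(range(len(rows)), k):
            kept = [r for i, r in enumerate(rows) if i not in removed]
            if not violates(kept):
                return k / len(rows)
```

A consistent instance scores 0, and the score grows with the share of the data that has to be given up to satisfy the constraints.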

Structural Isomorphism in Mathematical Expressions: A Simple Coding Scheme


Title	Structural Isomorphism in Mathematical Expressions: A Simple Coding Scheme
Authors	Reza Shahbazi
Abstract	While there exist many methods in machine learning for comparison of letter string data, most are better equipped to handle strings that represent natural language, and their performance will not hold up when presented with strings that correspond to mathematical expressions. Based on the graphical representation of the expression tree, here I propose a simple method for encoding such expressions that is only sensitive to their structural properties, and invariant to the specifics which can vary between two seemingly different, but semantically similar mathematical expressions.
Tasks
Published	2018-05-29
URL	http://arxiv.org/abs/1805.12495v1
PDF	http://arxiv.org/pdf/1805.12495v1.pdf
PWC	https://paperswithcode.com/paper/structural-isomprphism-in-mathematical
Repo
Framework
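
The idea of an encoding that depends only on the shape of the expression tree can be illustrated with a short sketch. This is not the coding scheme from the paper, which the abstract does not spell out; it is a stand-in that uses Python's `ast` module to parse Python-syntax expressions and emits one pair of parentheses per tree node, ignoring variable names, constants and operator types.

```python
import ast


def shape_code(node):
    """Encode a subtree as nested parentheses, one pair per expression node."""
    # Keep only sub-expressions; operator and context nodes carry no structure.
    children = [c for c in ast.iter_child_nodes(node) if isinstance(c, ast.expr)]
    return "(" + "".join(shape_code(c) for c in children) + ")"


def encode(expression):
    """Return a structure-only code for an expression written in Python syntax."""
    return shape_code(ast.parse(expression, mode="eval").body)
```

Under this encoding, `a + b * c` and `x - y / z` receive the same code because their trees have the same shape, while `a * b + c` receives a different one.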

Deep Learning in the Wavelet Domain


Title	Deep Learning in the Wavelet Domain
Authors	Fergal Cotter, Nick Kingsbury
Abstract	This paper examines the possibility of, and the possible advantages of, learning the filters of convolutional neural networks (CNNs) for image analysis in the wavelet domain. We are stimulated by both Mallat’s scattering transform and the idea of filtering in the Fourier domain. It is important to explore new spaces in which to learn, as these may provide inherent advantages that are not available in the pixel space. However, the scattering transform is limited by its inability to learn in between scattering orders, and any Fourier domain filtering is limited by the large number of filter parameters needed to get localized filters. Instead we consider filtering in the wavelet domain with learnable filters. The wavelet space allows us to have local, smooth filters with far fewer parameters, and learnability can give us flexibility. We present a novel layer which takes CNN activations into the wavelet space, learns parameters and returns to the pixel space. This allows it to be easily dropped into any neural network without affecting the structure. As part of this work, we show how to pass gradients through a multirate system and give preliminary results.
Tasks
Published	2018-11-14
URL	http://arxiv.org/abs/1811.06115v1
PDF	http://arxiv.org/pdf/1811.06115v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-in-the-wavelet-domain
Repo
Framework
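
The layer described in the abstract above (transform to the wavelet space, apply learned parameters, transform back) can be sketched in one dimension. The sketch assumes a single-level Haar transform and one scalar gain per subband; both are stand-ins chosen for brevity, and the abstract does not say which wavelet or parameterization the authors use. The input is assumed to have even length.

```python
import math


def haar_forward(x):
    """Single-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def haar_inverse(low, high):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for l, h in zip(low, high):
        x.extend([(l + h) / s, (l - h) / s])
    return x


def wavelet_gain_layer(x, low_gain, high_gain):
    """Scale each subband by a (learnable) gain, then return to the signal domain."""
    low, high = haar_forward(x)
    return haar_inverse([low_gain * v for v in low], [high_gain * v for v in high])
```

With both gains equal to 1 the layer is the identity, which is why a layer of this kind can be dropped into an existing network without changing its structure; setting the high-pass gain to 0 averages each pair of samples.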