October 19, 2019

Paper Group ANR 124

Edit Probability for Scene Text Recognition. Learning Graph Embeddings from WordNet-based Similarity Measures. Features for Multi-Target Multi-Camera Tracking and Re-Identification. Convolutional neural networks in phase space and inverse problems. Learn and Pick Right Nodes to Offload. Attention-based sequence-to-sequence model for speech recognit …

Edit Probability for Scene Text Recognition

Title Edit Probability for Scene Text Recognition
Authors Fan Bai, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Shuigeng Zhou
Abstract We consider the scene text recognition problem under the attention-based encoder-decoder framework, which is the state of the art. The existing methods usually employ a frame-wise maximal likelihood loss to optimize the models. When we train the model, the misalignment between the ground truth strings and the attention’s output sequences of probability distribution, which is caused by missing or superfluous characters, will confuse and mislead the training process, and consequently make the training costly and degrade the recognition accuracy. To handle this problem, we propose a novel method called edit probability (EP) for scene text recognition. EP tries to effectively estimate the probability of generating a string from the output sequence of probability distribution conditioned on the input image, while considering the possible occurrences of missing/superfluous characters. The advantage lies in that the training process can focus on the missing, superfluous and unrecognized characters, and thus the impact of the misalignment problem can be alleviated or even overcome. We conduct extensive experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets. Experimental results show that the EP can substantially boost scene text recognition performance.
Tasks Scene Text Recognition
Published 2018-05-09
URL http://arxiv.org/abs/1805.03384v1
PDF http://arxiv.org/pdf/1805.03384v1.pdf
PWC https://paperswithcode.com/paper/edit-probability-for-scene-text-recognition
Repo
Framework

Learning Graph Embeddings from WordNet-based Similarity Measures

Title Learning Graph Embeddings from WordNet-based Similarity Measures
Authors Andrey Kutuzov, Mohammad Dorgham, Oleksiy Oliynyk, Chris Biemann, Alexander Panchenko
Abstract We present path2vec, a new approach for learning graph embeddings that relies on structural measures of pairwise node similarities. The model learns representations for nodes in a dense space that approximate a given user-defined graph distance measure, such as the shortest path distance or distance measures that take information beyond the graph structure into account. Evaluation of the proposed model on semantic similarity and word sense disambiguation tasks, using various WordNet-based similarity measures, shows that our approach yields competitive results, outperforming strong graph embedding baselines. The model is computationally efficient, being orders of magnitude faster than the direct computation of graph-based distances.
Tasks Graph Embedding, Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation
Published 2018-08-16
URL http://arxiv.org/abs/1808.05611v4
PDF http://arxiv.org/pdf/1808.05611v4.pdf
PWC https://paperswithcode.com/paper/learning-graph-embeddings-from-wordnet-based
Repo
Framework
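
The idea of fitting node vectors to a user-defined graph measure can be illustrated with a toy sketch. The code below is not the path2vec implementation: it assumes a similarity of 1 / (1 + shortest path length), a dot product as the similarity in embedding space, and plain SGD on squared error. The function names and the toy graph are hypothetical.

```python
import random
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def similarity_pairs(graph):
    """Target similarities 1 / (1 + path length); an assumed choice of measure."""
    pairs = []
    for u in sorted(graph):
        dist = shortest_path_lengths(graph, u)
        for v in sorted(graph):
            if u < v and v in dist:
                pairs.append((u, v, 1.0 / (1.0 + dist[v])))
    return pairs


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_squared_error(emb, pairs):
    return sum((dot(emb[u], emb[v]) - t) ** 2 for u, v, t in pairs) / len(pairs)


def train_embeddings(graph, dim=4, epochs=500, lr=0.05, seed=0):
    """Fit vectors so that dot products approximate the target similarities."""
    rng = random.Random(seed)
    pairs = similarity_pairs(graph)
    emb = {n: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for n in sorted(graph)}
    for _ in range(epochs):
        rng.shuffle(pairs)
        for u, v, target in pairs:
            eu, ev = emb[u], emb[v]
            err = dot(eu, ev) - target
            for k in range(dim):
                grad_u, grad_v = err * ev[k], err * eu[k]
                eu[k] -= lr * grad_u
                ev[k] -= lr * grad_v
    return emb


# Toy path graph a - b - c - d
toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
embeddings = train_embeddings(toy_graph)
```

Once trained, a similarity lookup is a single dot product rather than a graph traversal, which is the kind of speed-up the abstract refers to.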

Features for Multi-Target Multi-Camera Tracking and Re-Identification

Title Features for Multi-Target Multi-Camera Tracking and Re-Identification
Authors Ergys Ristani, Carlo Tomasi
Abstract Multi-Target Multi-Camera Tracking (MTMCT) tracks many people through video taken from several cameras. Person Re-Identification (Re-ID) retrieves from a gallery images of people similar to a person query image. We learn good features for both MTMCT and Re-ID with a convolutional neural network. Our contributions include an adaptive weighted triplet loss for training and a new technique for hard-identity mining. Our method outperforms the state of the art both on the DukeMTMC benchmarks for tracking, and on the Market-1501 and DukeMTMC-ReID benchmarks for Re-ID. We examine the correlation between good Re-ID and good MTMCT scores, and perform ablation studies to elucidate the contributions of the main components of our system. Code is available.
Tasks Person Re-Identification
Published 2018-03-28
URL http://arxiv.org/abs/1803.10859v1
PDF http://arxiv.org/pdf/1803.10859v1.pdf
PWC https://paperswithcode.com/paper/features-for-multi-target-multi-camera
Repo
Framework
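
A weighted triplet loss can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes that, for one anchor, positive distances are weighted by a softmax (so far positives count more) and negative distances by a softmax over negated distances (so close negatives count more). The function names and the margin value are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_triplet_loss(pos_dists, neg_dists, margin=1.0):
    """Hinge loss on weighted anchor-positive and anchor-negative distances."""
    w_pos = softmax(pos_dists)                # hard (far) positives weigh more
    w_neg = softmax([-d for d in neg_dists])  # hard (close) negatives weigh more
    pos_term = sum(w * d for w, d in zip(w_pos, pos_dists))
    neg_term = sum(w * d for w, d in zip(w_neg, neg_dists))
    return max(0.0, margin + pos_term - neg_term)
```

With well-separated identities (small positive distances, large negative distances) the loss is zero, so training effort concentrates on the hard samples.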

Convolutional neural networks in phase space and inverse problems

Title Convolutional neural networks in phase space and inverse problems
Authors Gunther Uhlmann, Yiran Wang
Abstract We study inverse problems consisting of determining medium properties using the responses to probing waves from the machine learning point of view. Based on the understanding of propagation of waves and their nonlinear interactions, we construct a deep convolutional neural network in which the parameters are used to classify and reconstruct the coefficients of nonlinear wave equations that model the medium properties. Furthermore, for a given approximation accuracy, we obtain the depth and number of units of the network and their quantitative dependence on the complexity of the medium.
Tasks
Published 2018-11-09
URL http://arxiv.org/abs/1811.04022v1
PDF http://arxiv.org/pdf/1811.04022v1.pdf
PWC https://paperswithcode.com/paper/convolutional-neural-networks-in-phase-space
Repo
Framework

Learn and Pick Right Nodes to Offload

Title Learn and Pick Right Nodes to Offload
Authors Zhaowei Zhu, Ting Liu, Shengda Jin, Xiliang Luo
Abstract Task offloading is a promising technology to exploit the benefits of fog computing. An effective task offloading strategy is needed to utilize the computational resources efficiently. In this paper, we endeavor to seek an online task offloading strategy to minimize the long-term latency. In particular, we formulate a stochastic programming problem, where the expectations of the system parameters change abruptly at unknown time instants. Meanwhile, we consider the fact that the queried nodes can only feed back the processing results after finishing the tasks. We then put forward an effective algorithm to solve this challenging stochastic programming under the non-stationary bandit model. We further prove that our proposed algorithm is asymptotically optimal in a non-stationary fog-enabled network. Numerical simulations are carried out to corroborate our designs.
Tasks
Published 2018-04-20
URL http://arxiv.org/abs/1804.08416v2
PDF http://arxiv.org/pdf/1804.08416v2.pdf
PWC https://paperswithcode.com/paper/learn-and-pick-right-nodes-to-offload
Repo
Framework
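
To make the non-stationary bandit setting concrete, here is a minimal sliding-window confidence-bound sketch. It is a generic technique swapped in for illustration, not the algorithm proposed in the paper. It assumes latency is observed only after a task finishes on the chosen node, and that forgetting old samples is enough to follow abrupt changes. The class name, window size, and exploration constant are hypothetical.

```python
import math
from collections import deque


class SlidingWindowUCB:
    """Pick the node with the lowest optimistic latency over recent rounds."""

    def __init__(self, n_nodes, window=50, explore=1.0):
        self.n_nodes = n_nodes
        self.explore = explore
        self.history = deque(maxlen=window)  # (node, latency); old entries drop out

    def select(self):
        counts = [0] * self.n_nodes
        totals = [0.0] * self.n_nodes
        for node, latency in self.history:
            counts[node] += 1
            totals[node] += latency
        # Any node missing from the window is re-explored first.
        for node in range(self.n_nodes):
            if counts[node] == 0:
                return node
        rounds = len(self.history)

        def lower_bound(node):
            mean = totals[node] / counts[node]
            bonus = self.explore * math.sqrt(math.log(rounds) / counts[node])
            return mean - bonus  # optimistic estimate, since lower latency is better

        return min(range(self.n_nodes), key=lower_bound)

    def update(self, node, latency):
        """Record the latency fed back once the offloaded task has finished."""
        self.history.append((node, latency))
```

Because only the last `window` observations are kept, a node whose latency changes abruptly is re-evaluated within roughly one window length.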

Attention-based sequence-to-sequence model for speech recognition: development of state-of-the-art system on LibriSpeech and its application to non-native English

Title Attention-based sequence-to-sequence model for speech recognition: development of state-of-the-art system on LibriSpeech and its application to non-native English
Authors Yan Yin, Ramon Prieto, Bin Wang, Jianwei Zhou, Yiwei Gu, Yang Liu, Hui Lin
Abstract Recent research has shown that attention-based sequence-to-sequence models such as Listen, Attend, and Spell (LAS) yield comparable results to state-of-the-art ASR systems on various tasks. In this paper, we describe the development of such a system and demonstrate its performance on two tasks: first we achieve a new state-of-the-art word error rate of 3.43% on the test clean subset of LibriSpeech English data; second on non-native English speech, including both read speech and spontaneous speech, we obtain very competitive results compared to a conventional system built with the most updated Kaldi recipe.
Tasks Speech Recognition
Published 2018-10-31
URL http://arxiv.org/abs/1810.13088v2
PDF http://arxiv.org/pdf/1810.13088v2.pdf
PWC https://paperswithcode.com/paper/attention-based-sequence-to-sequence-model
Repo
Framework
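
The word error rate quoted in the abstract is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below shows that standard computation. It assumes whitespace tokenization and no text normalization, which real evaluation pipelines usually add. The function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, recognizing "the cat sit on mat" against the reference "the cat sat on the mat" yields one substitution and one deletion over six reference words.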

Vulnerability of Deep Learning

Title Vulnerability of Deep Learning
Authors Richard Kenway
Abstract The Renormalisation Group (RG) provides a framework in which it is possible to assess whether a deep-learning network is sensitive to small changes in the input data and hence prone to error, or susceptible to adversarial attack. Distinct classification outputs are associated with different RG fixed points and sensitivity to small changes in the input data is due to the presence of relevant operators at a fixed point. A numerical scheme, based on Monte Carlo RG ideas, is proposed for identifying the existence of relevant operators and the corresponding directions of greatest sensitivity in the input data. Thus, a trained deep-learning network may be tested for its robustness and, if it is vulnerable to attack, dangerous perturbations of the input data identified.
Tasks Adversarial Attack
Published 2018-03-16
URL http://arxiv.org/abs/1803.06111v1
PDF http://arxiv.org/pdf/1803.06111v1.pdf
PWC https://paperswithcode.com/paper/vulnerability-of-deep-learning
Repo
Framework

Online Influence Maximization with Local Observations

Title Online Influence Maximization with Local Observations
Authors Julia Olkhovskaya, Gergely Neu, Gábor Lugosi
Abstract We consider an online influence maximization problem in which a decision maker selects a node among a large number of possibilities and places a piece of information at the node. The node transmits the information to some others that are in the same connected component in a random graph. The goal of the decision maker is to reach as many nodes as possible, with the added complication that feedback is only available about the degree of the selected node. Our main result shows that such local observations can be sufficient for maximizing global influence in two broadly studied families of random graph models: stochastic block models and Chung–Lu models. With this insight, we propose a bandit algorithm that aims at maximizing local (and thus global) influence, and provide its theoretical analysis in both the subcritical and supercritical regimes of both considered models. Notably, our performance guarantees show no explicit dependence on the total number of nodes in the network, making our approach well-suited for large-scale applications.
Tasks
Published 2018-05-28
URL http://arxiv.org/abs/1805.11022v1
PDF http://arxiv.org/pdf/1805.11022v1.pdf
PWC https://paperswithcode.com/paper/online-influence-maximization-with-local
Repo
Framework

Neural Lattice Language Models

Title Neural Lattice Language Models
Authors Jacob Buckman, Graham Neubig
Abstract In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions - including polysemy and existence of multi-word lexical items - into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.
Tasks Language Modelling
Published 2018-03-13
URL http://arxiv.org/abs/1803.05071v1
PDF http://arxiv.org/pdf/1803.05071v1.pdf
PWC https://paperswithcode.com/paper/neural-lattice-language-models
Repo
Framework
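
Marginalizing over a lattice of paths can be illustrated with a forward-style dynamic program. The sketch below is a simplification, not the paper's model: it assumes each token has a fixed, context-independent probability, whereas a neural lattice language model would condition on the preceding context. The function name and the toy probabilities are hypothetical.

```python
def marginal_probability(text, token_prob, max_len=2):
    """Sum the probabilities of all segmentations of text into known tokens."""
    # alpha[i] is the total probability of all paths that cover text[:i]
    alpha = [0.0] * (len(text) + 1)
    alpha[0] = 1.0
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            token = text[start:end]
            alpha[end] += alpha[start] * token_prob.get(token, 0.0)
    return alpha[-1]


# Toy vocabulary with single-character tokens and one multi-character token
toy_probs = {"a": 0.5, "b": 0.4, "ab": 0.1}
total = marginal_probability("ab", toy_probs)
```

Here the sequence "ab" can be read as two tokens or as one, and the result is the sum over both paths (0.5 × 0.4 + 0.1).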

Discovering space - Grounding spatial topology and metric regularity in a naive agent’s sensorimotor experience

Title Discovering space - Grounding spatial topology and metric regularity in a naive agent’s sensorimotor experience
Authors Alban Laflaquière, J. Kevin O’Regan, Bruno Gas, Alexander Terekhov
Abstract In line with the sensorimotor contingency theory, we investigate the problem of the perception of space from a fundamental sensorimotor perspective. Despite its pervasive nature in our perception of the world, the origin of the concept of space remains largely mysterious. For example, in the context of artificial perception, this issue is usually circumvented by having engineers pre-define the spatial structure of the problem the agent has to face. We here show that the structure of space can be autonomously discovered by a naive agent in the form of sensorimotor regularities that correspond to so-called compensable sensory experiences: these are experiences that can be generated either by the agent or its environment. By detecting such compensable experiences, the agent can infer the topological and metric structure of the external space in which its body is moving. We propose a theoretical description of the nature of these regularities and illustrate the approach on a simulated robotic arm that is equipped with an eye-like sensor and interacts with an object. Finally, we show how these regularities can be used to build an internal representation of the sensor’s external spatial configuration.
Tasks
Published 2018-06-07
URL http://arxiv.org/abs/1806.02739v2
PDF http://arxiv.org/pdf/1806.02739v2.pdf
PWC https://paperswithcode.com/paper/discovering-space-grounding-spatial-topology
Repo
Framework

Siamese Neural Networks with Random Forest for detecting duplicate question pairs

Title Siamese Neural Networks with Random Forest for detecting duplicate question pairs
Authors Ameya Godbole, Aman Dalmia, Sunil Kumar Sahu
Abstract Determining whether two given questions are semantically similar is a fairly challenging task given the different structures and forms that the questions can take. In this paper, we use Gated Recurrent Units (GRU) in combination with other widely used machine learning algorithms like Random Forest, AdaBoost and SVM for the similarity prediction task on a dataset released by Quora, consisting of about 400k labeled question pairs. We got the best result by using the Siamese adaptation of a Bidirectional GRU with a Random Forest classifier, which landed us among the top 24% in the competition Quora Question Pairs hosted on Kaggle.
Tasks
Published 2018-01-22
URL http://arxiv.org/abs/1801.07288v3
PDF http://arxiv.org/pdf/1801.07288v3.pdf
PWC https://paperswithcode.com/paper/siamese-neural-networks-with-random-forest
Repo
Framework

Practical optimal registration of terrestrial LiDAR scan pairs

Title Practical optimal registration of terrestrial LiDAR scan pairs
Authors Zhipeng Cai, Tat-Jun Chin, Alvaro Parra Bustos, Konrad Schindler
Abstract Point cloud registration is a fundamental problem in 3D scanning. In this paper, we address the frequent special case of registering terrestrial LiDAR scans (or, more generally, levelled point clouds). Many current solutions still rely on the Iterative Closest Point (ICP) method or other heuristic procedures, which require good initializations to succeed and/or provide no guarantees of success. On the other hand, exact or optimal registration algorithms can compute the best possible solution without requiring initializations; however, they are currently too slow to be practical in realistic applications. Existing optimal approaches ignore the fact that in routine use the relative rotations between scans are constrained to the azimuth, via the built-in level compensation in LiDAR scanners. We propose a novel, optimal and computationally efficient registration method for this 4DOF scenario. Our approach operates on candidate 3D keypoint correspondences, and contains two main steps: (1) a deterministic selection scheme that significantly reduces the candidate correspondence set in a way that is guaranteed to preserve the optimal solution; and (2) a fast branch-and-bound (BnB) algorithm with a novel polynomial-time subroutine for 1D rotation search, that quickly finds the optimal alignment for the reduced set. We demonstrate the practicality of our method on realistic point clouds from multiple LiDAR surveys.
Tasks Point Cloud Registration
Published 2018-11-25
URL http://arxiv.org/abs/1811.09962v3
PDF http://arxiv.org/pdf/1811.09962v3.pdf
PWC https://paperswithcode.com/paper/practical-optimal-registration-of-terrestrial
Repo
Framework
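
The 4DOF scenario means the unknown transform is a rotation about the vertical axis (azimuth) plus a 3D translation. The sketch below only illustrates that parameterization and an inlier count over candidate correspondences. It is not the paper's selection scheme or its branch-and-bound search, and it assumes z is the vertical axis. The function names and the tolerance are hypothetical.

```python
import math


def apply_4dof(points, yaw, translation):
    """Rotate points about the z axis by yaw (radians), then translate."""
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz) for x, y, z in points]


def count_inliers(source, target, yaw, translation, tol=0.05):
    """Count correspondences (source[i], target[i]) aligned to within tol."""
    moved = apply_4dof(source, yaw, translation)
    return sum(1 for p, q in zip(moved, target) if math.dist(p, q) <= tol)
```

An optimal registration method in this setting searches for the yaw and translation that maximize such an inlier count. With only one rotation angle, the rotation search is one-dimensional.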

Predicting Cyber Events by Leveraging Hacker Sentiment

Title Predicting Cyber Events by Leveraging Hacker Sentiment
Authors Ashok Deb, Kristina Lerman, Emilio Ferrara
Abstract Recent high-profile cyber attacks exemplify why organizations need better cyber defenses. Cyber threats are hard to accurately predict because attackers usually try to mask their traces. However, they often discuss exploits and techniques on hacking forums. The community behavior of the hackers may provide insights into groups’ collective malicious activity. We propose a novel approach to predict cyber events using sentiment analysis. We test our approach using cyber attack data from 2 major business organizations. We consider 3 types of events: malicious software installation, malicious destination visits, and malicious emails that surpassed the target organizations’ defenses. We construct predictive signals by applying sentiment analysis on hacker forum posts to better understand hacker behavior. We analyze over 400K posts generated between January 2016 and January 2018 on over 100 hacking forums both on surface and Dark Web. We find that some forums have significantly more predictive power than others. Sentiment-based models that leverage specific forums can outperform state-of-the-art deep learning and time-series models on forecasting cyber attacks weeks ahead of the events.
Tasks Sentiment Analysis, Time Series
Published 2018-04-14
URL http://arxiv.org/abs/1804.05276v1
PDF http://arxiv.org/pdf/1804.05276v1.pdf
PWC https://paperswithcode.com/paper/predicting-cyber-events-by-leveraging-hacker
Repo
Framework

NIMFA: A Python Library for Nonnegative Matrix Factorization

Title NIMFA: A Python Library for Nonnegative Matrix Factorization
Authors Marinka Zitnik, Blaz Zupan
Abstract NIMFA is an open-source Python library that provides a unified interface to nonnegative matrix factorization algorithms. It includes implementations of state-of-the-art factorization methods, initialization approaches, and quality scoring. It supports both dense and sparse matrix representation. NIMFA’s component-based implementation and hierarchical design should help the users to employ already implemented techniques or design and code new strategies for matrix factorization tasks.
Tasks
Published 2018-08-06
URL http://arxiv.org/abs/1808.01743v1
PDF http://arxiv.org/pdf/1808.01743v1.pdf
PWC https://paperswithcode.com/paper/nimfa-a-python-library-for-nonnegative-matrix
Repo
Framework
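
For readers unfamiliar with the task the library addresses, here is a minimal pure-Python sketch of nonnegative matrix factorization using the classic multiplicative update rules for the squared Frobenius error. It does not use NIMFA's API. The function names, the random initialization, and the small `eps` guard against division by zero are illustrative assumptions.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize a nonnegative matrix V (list of rows) as W @ H with W, H >= 0."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + eps for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(rows)]
    return w, h


# A rank-1 nonnegative matrix that can be factorized exactly
toy_matrix = [[1.0, 2.0], [2.0, 4.0]]
w_fit, h_fit = nmf(toy_matrix, rank=1)
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout. Choosing among algorithms, initializations, and quality scores is the kind of decision a library such as NIMFA exposes through a unified interface.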

You Look Twice: GaterNet for Dynamic Filter Selection in CNNs

Title You Look Twice: GaterNet for Dynamic Filter Selection in CNNs
Authors Zhourong Chen, Yang Li, Samy Bengio, Si Si
Abstract The concept of conditional computation for deep nets has been proposed previously to improve model performance by selectively using only parts of the model conditioned on the sample it is processing. In this paper, we investigate input-dependent dynamic filter selection in deep convolutional neural networks (CNNs). The problem is interesting because the idea of forcing different parts of the model to learn from different types of samples may help us acquire better filters in CNNs, improve the model generalization performance and potentially increase the interpretability of model behavior. We propose a novel yet simple framework called GaterNet, which involves a backbone and a gater network. The backbone network is a regular CNN that performs the major computation needed for making a prediction, while a global gater network is introduced to generate binary gates for selectively activating filters in the backbone network based on each input. Extensive experiments on CIFAR and ImageNet datasets show that our models consistently outperform the original models by a large margin. On CIFAR-10, our model also improves upon state-of-the-art results.
Tasks
Published 2018-11-27
URL http://arxiv.org/abs/1811.11205v2
PDF http://arxiv.org/pdf/1811.11205v2.pdf
PWC https://paperswithcode.com/paper/gaternet-dynamic-filter-selection-in
Repo
Framework
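
The effect of binary gates on a backbone's filters can be shown with a small sketch. It assumes the gater produces one real-valued score per filter for each input and that a gate is open when the score exceeds a threshold. In the paper the gates come from a trained gater network, so the thresholding rule and the function name here are illustrative stand-ins.

```python
def gate_filters(feature_maps, gate_scores, threshold=0.0):
    """Zero out the feature maps of filters whose gate is closed.

    feature_maps: one 2D list (rows of floats) per filter.
    gate_scores:  one score per filter, assumed to come from a gater network.
    """
    gates = [1.0 if score > threshold else 0.0 for score in gate_scores]
    gated = [[[gate * value for value in row] for row in fmap]
             for fmap, gate in zip(feature_maps, gates)]
    return gated, gates


# Two filters with 2x2 outputs; the second gate is closed for this input
maps = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
gated_maps, gates = gate_filters(maps, gate_scores=[0.7, -1.2])
```

Since the scores depend on the input, different samples activate different subsets of filters.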